DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Allowable Subject Matter
Claims 8-13 are currently subject to nonstatutory double patenting rejections, but are otherwise not subject to any prior art rejections under either 35 U.S.C. § 102 or 35 U.S.C. § 103. If these rejections were overcome by the timely filing of a terminal disclaimer, these claims would be allowable.
The following is a statement of reasons for the indication of allowable subject matter:  
With regard to independent claim 8, this claim recites the same features as were found allowable in parent application 16/426264. Accordingly, this claim is found allowable for the same reasons as were provided in the parent application.
With regard to claims 9-13, these claims depend, directly or indirectly, from claim 8 and therefore incorporate the features of that claim that were found allowable. These claims are found allowable for the same reasons as were provided with respect to their parent claim(s).
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “search module”, “training data generation module”, “positive sample generation module”, “negative sample generation module”, and “machine-learning training module” in claims 1-7. Although the claims recite that the various modules are “implemented at least partially in hardware”, the term “hardware” does not denote an art-recognized structure sufficient for performing any of the recited functions.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claim 1 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 1 (“Application”) compared with U.S. Patent No. 11,144,784, Claim 1 (“Patent”):
Application: In a digital medium machine learning model training environment, a system comprising:
Patent: In a digital medium machine learning model training environment, a method implemented by a computing device, the method comprising:

Application: a search module implemented at least partially in hardware to generate a plurality of search results having digital images, the plurality of search results resulting from searches performed using a plurality of text queries, respectively;
Patent: receiving, by the computing device, a plurality of text queries used to initiate a plurality of digital image searches; generating, by the computing device, a plurality of filtered text queries by filtering stop words from the plurality of text queries;

Application: a training data generation module implemented at least partially in hardware to generate a training dataset, the training data generation module including:
Patent: generating, by the computing device, a training dataset based on the plurality of filtered text queries and a plurality of digital images generated by a plurality of digital image searches, the training dataset including:

Application: a positive sample generation module configured to select a positive digital image sample from the plurality of digital images, the positive digital image sample selected from a first said search result; and
Patent: a positive digital image sample located using a first respective said filtered text query; and

Application: a negative sample generation module configured to select a negative digital image sample from the plurality of digital image, the negative digital image sample selected from a second said search result other than the first said search result; and
Patent: a negative digital image using a second respective said filtered text query that shares at least one item of text with the first respective said filtered text query and does not share at least one other item of text with the first respective said filtered text query;

Application: a machine-learning training module implemented at least partially in hardware to train a model as part of machine learning based on the training dataset to perform an image search.
Patent: training, by the computing device, a model using machine learning based on a loss function using the training dataset; and generating, by the computing device, a subsequent search result using the model
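For illustration only, the negative-sample condition recited in claim 1 of the reference patent (a second filtered text query that shares at least one item of text with the first and does not share at least one other) can be sketched as a set comparison. The stop-word list, the function names, and the reading of "does not share" as "some item of the first query is absent from the second" are assumptions made for this sketch; none of them comes from the claims or the specification.

```python
# Hypothetical stop-word list; the claims do not specify one.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "with"}


def filter_stop_words(query):
    """Return the lower-cased items of text left after removing stop words."""
    return {word for word in query.lower().split() if word not in STOP_WORDS}


def qualifies_as_negative_query(first_query, second_query):
    """Check the assumed reading of the claimed negative-sample condition."""
    first = filter_stop_words(first_query)
    second = filter_stop_words(second_query)
    shares_an_item = bool(first & second)        # at least one item in common
    lacks_another_item = bool(first - second)    # at least one item not shared
    return shares_an_item and lacks_another_item


print(qualifies_as_negative_query("red car on the road", "blue car"))
print(qualifies_as_negative_query("red car", "the red car"))
```

Under these assumptions the first pair qualifies (it shares "car" but not "red" or "road"), while the second pair does not, because the two queries are identical once stop words are filtered.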


	
Claim 2 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 2 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 2: The system as described in claim 1, wherein the training of the model results in a single unified text-and-digital image embedding space based on the plurality of text queries and the plurality of digital images.
U.S. Patent No. 11,144,784, Claim 2: The method as described in claim 1, wherein the training of the model results in a single unified text-and-digital image embedding space based on the plurality of filtered text queries and the plurality of digital images.


	
Claim 3 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 3: The system as described in claim 1, wherein the negative digital image sample is not included in the first said search result.
U.S. Patent No. 11,144,784, Claim 1: a negative digital image using a second respective said filtered text query that shares at least one item of text with the first respective said filtered text query and does not share at least one other item of text with the first respective said filtered text query


	
Claim 4 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 3 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 4: The system as described in claim 1, wherein the negative sample generation module is configured to select the negative digital image sample based on the positive digital image sample.
U.S. Patent No. 11,144,784, Claim 3: The method as described in claim 1, wherein … generating the negative digital image sample from the plurality of digital images based on the positive digital image sample.


	
Claim 5 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 5: The system as described in claim 4, wherein the negative sample generation module is configured to select the negative digital image sample based at least part on identifying a second said search query used as a basis to generate the second said search result includes at least two items of text also included as part of a first said search query used as a basis to generate the first said search result.
U.S. Patent No. 11,144,784, Claim 1: receiving, by the computing device, a plurality of text queries used to initiate a plurality of digital image searches;
* * *
a negative digital image using a second respective said filtered text query that shares at least one item of text with the first respective said filtered text query and does not share at least one other item of text with the first respective said filtered text query;


	
The present claim 5 lies within the range disclosed by claim 1 of U.S. Patent No. 11,144,784 in that it recites a slightly narrower range (“shares at least two items of text”) than claim 1 of U.S. Patent No. 11,144,784, which recites, “shares at least one item of text.” In the case where the claimed ranges "overlap or lie inside ranges disclosed by the prior art" a prima facie case of obviousness exists. In re Wertheim, 541 F.2d 257, 191 USPQ 90 (CCPA 1976); In re Woodruff, 919 F.2d 1575, 16 USPQ2d 1934 (Fed. Cir. 1990) (The prior art taught carbon monoxide concentrations of "about 1-5%" while the claim was limited to "more than 5%." The court held that "about 1-5%" allowed for concentrations slightly above 5% thus the ranges overlapped.); In re Geisler, 116 F.3d 1465, 1469-71, 43 USPQ2d 1362, 1365-66 (Fed. Cir. 1997) (Claim reciting thickness of a protective layer as falling within a range of "50 to 100 Angstroms" considered prima facie obvious in view of prior art reference teaching that "for suitable protection, the thickness of the protective layer should be not less than about 10 nm [i.e., 100 Angstroms]." The court stated that "by stating that ‘suitable protection’ is provided if the protective layer is ‘about’ 100 Angstroms thick, [the prior art reference] directly teaches the use of a thickness within [applicant’s] claimed range.").
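The range relationship relied on above can be illustrated with a short sketch: any pair of queries that shares at least two items of text necessarily shares at least one. The function names and the sample queries below are hypothetical and serve only to make the subset relationship concrete; they are not drawn from the application or the reference patent.

```python
def shared_item_count(first_query, second_query):
    """Count the items of text that appear in both queries."""
    return len(set(first_query.split()) & set(second_query.split()))


def in_broader_range(first_query, second_query):
    """'Shares at least one item of text' (reference patent, claim 1)."""
    return shared_item_count(first_query, second_query) >= 1


def in_narrower_range(first_query, second_query):
    """'At least two items of text' (present claim 5)."""
    return shared_item_count(first_query, second_query) >= 2


# Hypothetical query pairs for illustration.
query_pairs = [
    ("red sports car", "red sports bike"),  # two shared items
    ("red sports car", "blue car"),         # one shared item
    ("red sports car", "green tree"),       # no shared items
]

# Every pair inside the narrower range is also inside the broader range.
narrower_implies_broader = all(
    in_broader_range(a, b) for a, b in query_pairs if in_narrower_range(a, b)
)
print(narrower_implies_broader)
```

The converse does not hold: the second pair falls within the broader range but outside the narrower one, which is why the narrower range is said to lie inside the broader range.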
Claim 6 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 8 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 6: The system as described in claim 1, wherein the machine-learning training module is configured to generate a positive image embedding from the positive digital image sample, a text embedding from the text query associated with the positive digital image sample, and a negative image embedding from the negative digital image sample.
U.S. Patent No. 11,144,784, Claim 8: The method as described in claim 1, wherein the training includes generating a positive image embedding from the positive digital image sample, a text embedding from the text query associated with the positive digital image sample, and a negative image embedding generating from the negative digital image sample.


	
Claim 7 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 9 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 7: The system as described in claim 6, wherein the machine-learning training module is configured to train the model using a triplet loss function that addresses a loss between the text embedding and the positive image embedding separately from a loss between the text embedding and the negative image embedding.
U.S. Patent No. 11,144,784, Claim 9: The method as described in claim 8, wherein the loss function is a triplet loss function that addresses a loss between the text embedding and the positive image embedding separately from a loss between the text embedding and the negative image embedding.
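As a rough illustration of what it can mean for a triplet loss to address the positive and negative losses separately, the sketch below contrasts a conventional margin-based triplet loss with one possible separated form. Neither formula is taken from the claims or the specification; the Euclidean distance, the margin value, and both function names are assumptions chosen for this sketch.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def conventional_triplet_loss(text, positive, negative, margin=1.0):
    """Single hinge over the difference of the two distances."""
    return max(0.0, distance(text, positive) - distance(text, negative) + margin)


def separated_triplet_loss(text, positive, negative, margin=1.0):
    """Assumed separated form: each distance contributes its own term."""
    positive_term = distance(text, positive)                     # pull positive closer
    negative_term = max(0.0, margin - distance(text, negative))  # push negative away
    return positive_term + negative_term


text_embedding = [0.0, 0.0]
positive_embedding = [3.0, 4.0]   # distance 5 from the text embedding
negative_embedding = [6.0, 8.0]   # distance 10 from the text embedding

print(conventional_triplet_loss(text_embedding, positive_embedding, negative_embedding))
print(separated_triplet_loss(text_embedding, positive_embedding, negative_embedding))
```

In this example the conventional form is already zero because the negative is farther away than the positive by more than the margin, whereas the separated form still reports the remaining text-to-positive distance.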


	
Claim 8 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 15 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 8 (“Application”) compared with U.S. Patent No. 11,144,784, Claim 15 (“Patent”):
Application: In a digital medium machine learning model training environment, a method implemented by a computing device, the method comprising:
Patent: In a digital medium machine learning model training environment, a method implemented by a computing device, the method comprising:

Application: receiving, by the computing device, a plurality of digital images and a plurality of text associated with the plurality of digital images, respectively;
Patent: receiving, by the computing device, a plurality of digital images and a plurality of text associated with the plurality of digital images, respectively;

Application: generating, by the computing device, a training dataset based on the plurality of digital images and the plurality of text, the training dataset including a positive digital image sample, text of the plurality of text associated with the positive digital image sample, and a negative digital image sample;
Patent: generating, by the computing device, a training dataset based on the plurality of digital images and the plurality of text, the training dataset having a first training dataset … including a positive digital image sample, text of the plurality of text associated with the positive digital image sample, and a negative digital image sample;

Application: training, by the computing device, a model using machine learning based on the training dataset to perform image searches, the training using a loss function that defines a loss between the text and the positive image separately from a loss between the text and the negative image.
Patent: training, by the computing device, a model using machine learning based on a loss function using the training dataset… using the loss function, a loss between the text embedding and the positive image embedding for the first training dataset separately from a loss between the text embedding and the negative image embedding of the second training dataset.


	
Claim 9 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 17 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 9: The method as described in claim 8, wherein a distance of the loss between the text and the positive image decreases and a distance of the loss between the text and the negative image increases during the training.
U.S. Patent No. 11,144,784, Claim 17: The method as described in claim 15, wherein a distance of the loss between the text embedding and the positive image embedding decreases and a distance of the loss between the text embedding and the negative image embedding increases during the training.


	
Claim 10 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 16 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 10: The method as described in claim 8, wherein the training trains the model to implement a single unified text-and-digital image embedding space based on the plurality of text and the plurality of digital images.
U.S. Patent No. 11,144,784, Claim 16: The method as described in claim 15, wherein the training trains the model to implement a single unified text-and-digital image embedding space based on the plurality of text and the plurality of digital images.


	
Claim 11 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 18 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 11: The method as described in claim 8, wherein the positive digital image sample is selected from a first search result of a plurality of search results and the negative digital image sample is selected from a second search result of the plurality of search results other than the first search result.
U.S. Patent No. 11,144,784, Claim 18 (incorporated from parent claim 15): a first training dataset that is query based and a second training dataset that is title based and including a positive digital image sample, text of the plurality of text associated with the positive digital image sample, and a negative digital image sample
U.S. Patent No. 11,144,784, Claim 18: The method as described in claim 15, wherein the training dataset wherein: the first training dataset is a query-based training dataset that includes a plurality of text queries used to initiate a plurality of digital image searches and a plurality of digital images that are user selected from search results generated by the plurality of digital image searches; and the second training dataset is a title-based training dataset that includes a corresponding plurality of digital images and titles associated with the corresponding plurality of digital images.

 	
Claim 12 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 18 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 12: The method as described in claim 8, wherein the training dataset includes: a first training dataset that is a query-based training dataset and includes a plurality of text queries used to initiate a plurality of digital image searches and a plurality of digital images that are user selected from search results generated by the plurality of digital image searches; and a second training dataset is a title-based training dataset and includes a corresponding plurality of digital images and titles associated with the corresponding plurality of digital images.
U.S. Patent No. 11,144,784, Claim 18: The method as described in claim 15, wherein the training dataset wherein: the first training dataset is a query-based training dataset that includes a plurality of text queries used to initiate a plurality of digital image searches and a plurality of digital images that are user selected from search results generated by the plurality of digital image searches; and the second training dataset is a title-based training dataset that includes a corresponding plurality of digital images and titles associated with the corresponding plurality of digital images.


	
Claim 13 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 20 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 13: The method as described in claim 12, wherein a loss for the training dataset is calculated by averaging the loss for the query-based training dataset with a loss for the title-based training dataset.
U.S. Patent No. 11,144,784, Claim 20: The method as described in claim 19, wherein a loss for the training dataset is calculated by averaging the loss for the query-based training dataset with the loss for the title-based training dataset.


	
Claim 14 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 14 (“Application”) compared with U.S. Patent No. 11,144,784, Claim 1 (“Patent”):
Application: In a digital medium machine learning model training environment, a method implemented by a computing device, the method comprising:
Patent: In a digital medium machine learning model training environment, a method implemented by a computing device, the method comprising:

Application: receiving, by the computing device, a plurality of digital images and a plurality of text associated with the plurality of digital images, respectively;
Patent: receiving, by the computing device, a plurality of text queries used to initiate a plurality of digital image searches;

Application: generating, by the computing device, a training dataset based on the plurality of digital images and the plurality of text, the training dataset including: a positive digital image sample;
Patent: generating, by the computing device, a training dataset based on the plurality of filtered text queries and a plurality of digital images generated by a plurality of digital image searches, the training dataset including: a positive digital image sample located using a first respective said filtered text query;

Application: at least two terms of text of the plurality of text associated with the positive digital image sample; and
Patent: a negative digital image using a second respective said filtered text query that shares at least one item of text with the first respective said filtered text query …

Application: a negative digital image sample also associated with the at least two terms of text of the plurality of text; and
Patent: a negative digital image using a second respective said filtered text query that shares at least one item of text with the first respective said filtered text query and does not share at least one other item of text with the first respective said filtered text query;

Application: training, by the computing device, a model using machine learning based on the training dataset to perform image searches.
Patent: training, by the computing device, a model using machine learning based on a loss function using the training dataset; and generating, by the computing device, a subsequent search result using the model.


The present claim 14 lies within the range disclosed by claim 1 of U.S. Patent No. 11,144,784 in that it recites a slightly narrower range (“shares at least two items of text”) than claim 1 of U.S. Patent No. 11,144,784, which recites, “shares at least one item of text.” In the case where the claimed ranges "overlap or lie inside ranges disclosed by the prior art" a prima facie case of obviousness exists. In re Wertheim, 541 F.2d 257, 191 USPQ 90 (CCPA 1976); In re Woodruff, 919 F.2d 1575, 16 USPQ2d 1934 (Fed. Cir. 1990) (The prior art taught carbon monoxide concentrations of "about 1-5%" while the claim was limited to "more than 5%." The court held that "about 1-5%" allowed for concentrations slightly above 5% thus the ranges overlapped.); In re Geisler, 116 F.3d 1465, 1469-71, 43 USPQ2d 1362, 1365-66 (Fed. Cir. 1997) (Claim reciting thickness of a protective layer as falling within a range of "50 to 100 Angstroms" considered prima facie obvious in view of prior art reference teaching that "for suitable protection, the thickness of the protective layer should be not less than about 10 nm [i.e., 100 Angstroms]." The court stated that "by stating that ‘suitable protection’ is provided if the protective layer is ‘about’ 100 Angstroms thick, [the prior art reference] directly teaches the use of a thickness within [applicant’s] claimed range.").

Claim 15 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 5 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 15: The method as described in claim 14, wherein the plurality of text is titles associated with respective digital images of the plurality of digital images.
U.S. Patent No. 11,144,784, Claim 5: The method as described in claim 1, wherein the generating of the training dataset includes generating a title-based training dataset having titles associated with a corresponding plurality of digital images.


	
Claim 16 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 8 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 16: The method as described in claim 14, wherein the training includes generating a positive image embedding from the positive digital image sample, a text embedding from the at least two terms of text associated with the positive digital image sample, and a negative image embedding from the negative digital image sample.
U.S. Patent No. 11,144,784, Claim 8: The method as described in claim 1, wherein the training includes generating a positive image embedding from the positive digital image sample, a text embedding from the text query associated with the positive digital image sample, and a negative image embedding generating from the negative digital image sample.


	
The present claim 16 lies within the range disclosed by claim 8 of U.S. Patent No. 11,144,784 in that it recites a slightly narrower range (“at least two terms of text”) than claim 8 of U.S. Patent No. 11,144,784, which recites, “text.” In the case where the claimed ranges "overlap or lie inside ranges disclosed by the prior art" a prima facie case of obviousness exists. In re Wertheim, 541 F.2d 257, 191 USPQ 90 (CCPA 1976); In re Woodruff, 919 F.2d 1575, 16 USPQ2d 1934 (Fed. Cir. 1990) (The prior art taught carbon monoxide concentrations of "about 1-5%" while the claim was limited to "more than 5%." The court held that "about 1-5%" allowed for concentrations slightly above 5% thus the ranges overlapped.); In re Geisler, 116 F.3d 1465, 1469-71, 43 USPQ2d 1362, 1365-66 (Fed. Cir. 1997) (Claim reciting thickness of a protective layer as falling within a range of "50 to 100 Angstroms" considered prima facie obvious in view of prior art reference teaching that "for suitable protection, the thickness of the protective layer should be not less than about 10 nm [i.e., 100 Angstroms]." The court stated that "by stating that ‘suitable protection’ is provided if the protective layer is ‘about’ 100 Angstroms thick, [the prior art reference] directly teaches the use of a thickness within [applicant’s] claimed range.").
Claim 17 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 9 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 17: The method as described in claim 16, wherein the training uses a triplet loss function that addresses a loss between the text embedding and the positive image embedding separately from a loss between the text embedding and the negative image embedding.
U.S. Patent No. 11,144,784, Claim 9: The method as described in claim 8, wherein the loss function is a triplet loss function that addresses a loss between the text embedding and the positive image embedding separately from a loss between the text embedding and the negative image embedding.


	
Claim 18 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 7 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 18: The method as described in claim 14, further comprising: selecting the positive digital image sample from a first search result of a plurality of search results; and selecting the negative digital image sample from a second search result of the plurality of search results other than the first search result.
U.S. Patent No. 11,144,784, Claim 7: The method as described in claim 6, wherein the generating of the negative digital image sample includes: generating filtered titles by filtering stop words from the titles; generating a subset of the corresponding plurality of digital images by excluding a digital image from the corresponding plurality of digital images having each item of text included with the filtered title associated with the positive digital image sample; and selecting the negative digital image sample from the subset.
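For illustration only, the negative-sample selection recited in claim 7 of U.S. Patent No. 11,144,784 may be sketched as follows. The stop-word list, the function names and the sample titles are hypothetical and do not appear in the patent. The sketch assumes whitespace tokenization and case-insensitive matching, neither of which the claim specifies.

```python
# Hypothetical stop-word list; the claim does not enumerate stop words.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "with"}


def filter_title(title):
    """Return the set of title terms that remain after removing stop words."""
    return {word for word in title.lower().split() if word not in STOP_WORDS}


def negative_candidates(titles_by_image, positive_id):
    """Return the images that may serve as negative samples for positive_id.

    An image is excluded when its filtered title contains every term of the
    filtered title associated with the positive sample.
    """
    positive_terms = filter_title(titles_by_image[positive_id])
    return [
        image_id
        for image_id, title in titles_by_image.items()
        if image_id != positive_id
        and not positive_terms <= filter_title(title)
    ]


titles = {
    "a": "A dog in the park",
    "b": "Dog running in a park",
    "c": "Cat on a sofa",
    "d": "Dog on a beach",
}
candidates = negative_candidates(titles, "a")
```

In this sketch image "b" is excluded because its filtered title contains both "dog" and "park", while images "c" and "d" remain in the subset from which a negative sample could be selected.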


	
Claim 19 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 6 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 19: The method as described in claim 14, wherein the training dataset includes: a first training dataset that is a query-based training dataset and includes a plurality of text queries used to initiate a plurality of digital image searches and a plurality of digital images that are user selected from search results generated by the plurality of digital image searches; and a second training dataset is a title-based training dataset and includes a corresponding plurality of digital images and titles associated with the corresponding plurality of digital images.
U.S. Patent No. 11,144,784, Claim 6:
(Incorporated from parent claim 5) The method as described in claim 1, wherein the generating of the training dataset includes generating a title-based training dataset having titles associated with a corresponding plurality of digital images.
The method as described in claim 5, wherein the generating of the title-based training dataset includes: selecting the positive digital image sample from the corresponding plurality of digital images; and generating the negative digital image sample from the corresponding plurality of digital images based on the positive digital image sample.


	
Claim 20 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 7 of U.S. Patent No. 11,144,784. Although the claims at issue are not identical, they are not patentably distinct from each other as shown in the following table:
Present Application, Claim 20: The method as described in claim 14, wherein the negative digital image sample is not included in a search result used to locate the positive digital image sample.
U.S. Patent No. 11,144,784, Claim 7: The method as described in claim 6, wherein the generating of the negative digital image sample includes: generating filtered titles by filtering stop words from the titles; generating a subset of the corresponding plurality of digital images by excluding a digital image from the corresponding plurality of digital images having each item of text included with the filtered title associated with the positive digital image sample; and selecting the negative digital image sample from the subset.


	
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 4 - 5, 14 - 15, and 18 - 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by FU et al (U.S. PG Pub. No. 2017/0330054).

With regards to claim 1, FU discloses generating a plurality of search results having digital images, the plurality of search results resulting from searches performed using a plurality of text queries, respectively at ¶¶ [0003] (“Image search refers to an information retrieval process whereby a user enters a natural language query, for example, a query entered via a text field provided by a search engine; an image collection is searched and a sorted image result according to relevance and other parameters is returned”), [0092], [0093].
FU discloses generating a training dataset by selecting a positive digital image sample from the plurality of digital images, the positive digital image sample selected from a first said search result at ¶¶ [0093]-[0096] (“Therefore, in a exemplified implementation of this embodiment, a positive sample image and a negative sample image corresponding to a query may be automatically determined according to image click logs of users. For example, after a user enters a query to search for an image, an image clicked by the user based on a searching result is used as a positive sample image corresponding to the query, and an image not clicked by the user is used as a negative sample image corresponding to the query.”) 
FU discloses generating a training dataset by selecting a negative digital image sample from the plurality of digital images, the negative digital image sample selected from a second said search result other than the first said search result at ¶¶ [0093]-[0096] (“Therefore, in a exemplified implementation of this embodiment, a positive sample image and a negative sample image corresponding to a query may be automatically determined according to image click logs of users. For example, after a user enters a query to search for an image, an image clicked by the user based on a searching result is used as a positive sample image corresponding to the query, and an image not clicked by the user is used as a negative sample image corresponding to the query.”) 
FU discloses training a model as part of machine learning based on the training dataset to perform an image search at ¶¶ [0086]-[0087], [0093]-[0101], [0105] and FIG. 5. 
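The click-log selection quoted from FU at ¶¶ [0093]-[0096] may be illustrated by the following minimal sketch. The record layout (query, image identifier, clicked flag), the function name and the sample data are assumptions made for illustration and are not taken from FU. The sketch also assumes that an image clicked at least once for a query is never kept as a negative for that query, which FU does not state.

```python
from collections import defaultdict


def build_click_log_samples(click_log):
    """Group (query, image_id, clicked) records into per-query sample sets.

    Clicked images become positive samples for the query; images shown but
    not clicked become negative samples for the same query.
    """
    samples = defaultdict(lambda: {"positive": set(), "negative": set()})
    for query, image_id, clicked in click_log:
        bucket = "positive" if clicked else "negative"
        samples[query][bucket].add(image_id)
    # Assumption: a click anywhere in the log overrides a non-click.
    for sets in samples.values():
        sets["negative"] -= sets["positive"]
    return dict(samples)


# Hypothetical click log.
log = [
    ("birthday card", "img1", True),
    ("birthday card", "img2", False),
    ("birthday card", "img1", False),
    ("tiger", "img81", True),
]
samples = build_click_log_samples(log)
```

Here the query "birthday card" yields "img1" as a positive sample and "img2" as a negative sample, both drawn from results returned for the same query.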
With regards to claim 4, FU discloses selecting the negative digital image sample based on the positive digital image sample at ¶ [0093]; to wit: “an image clicked by the user based on a searching result is used as a positive sample image corresponding to the query, and an image not clicked by the user is used as a negative sample image corresponding to the query.” 
With regards to claim 5, FU discloses selecting the negative digital image sample based at least in part on identifying a second said search query used as a basis to generate the second said search result includes at least two items of text also included as part of a first said search query used as a basis to generate the first said search result at: ¶ [0107] (“FIG. 6 is a flowchart for a method of establishing an image search relevance prediction model… [T]he step of selecting a set number of training samples is specifically optimized by: …, wherein the query sample comprises: … at least two queries meeting a set similarity threshold condition…”); ¶¶ [0113]-[0114] (“Correspondingly, the query sample may further comprise at least two queries meeting a set similarity threshold condition. In a specific example, ‘birthday card’ may be directly selected as a query sample, or ‘birthday card’, ‘birthdate card’ and ‘date-of-birth card’ may be used as query samples in a manner of semantic similarity clustering.”); ¶¶ [0093]-[0096] (“Therefore, in a exemplified implementation of this embodiment, a positive sample image and a negative sample image corresponding to a query may be automatically determined according to image click logs of users. For example, after a user enters a query to search for an image, an image clicked by the user based on a searching result is used as a positive sample image corresponding to the query, and an image not clicked by the user is used as a negative sample image corresponding to the query.”)
With regards to claim 14, FU discloses receiving a plurality of digital images and a plurality of text associated with the plurality of digital images, respectively, at ¶¶ [0003], [0092], [0093].
FU discloses generating, by the computing device, a training dataset based on the plurality of digital images and the plurality of text wherein the training dataset includes a positive digital image sample and the training dataset includes at least two terms of text of the plurality of text associated with the positive digital image sample at: ¶ [0107](“FIG. 6 is a flowchart for a method of establishing an image search relevance prediction model… [T]he step of selecting a set number of training samples is specifically optimized by: …, wherein the query sample comprises: … at least two queries meeting a set similarity threshold condition…”); ¶¶ [0113]-[0114](“Correspondingly, the query sample may further comprise at least two queries meeting a set similarity threshold condition. In a specific example, ‘birthday card’ may be directly selected as a query sample, or ‘birthday card’, ‘birthdate card’ and ‘date-of-birth card’ may be used as query samples in a manner of semantic similarity clustering.”); ¶¶ [0093]-[0096] (“Therefore, in a exemplified implementation of this embodiment, a positive sample image and a negative sample image corresponding to a query may be automatically determined according to image click logs of users. For example, after a user enters a query to search for an image, an image clicked by the user based on a searching result is used as a positive sample image corresponding to the query, and an image not clicked by the user is used as a negative sample image corresponding to the query.”)
FU discloses generating, by the computing device, a training dataset based on the plurality of digital images and the plurality of text wherein the training dataset includes a negative digital image sample also associated with the at least two terms of text of the plurality of text at: ¶ [0107](“FIG. 6 is a flowchart for a method of establishing an image search relevance prediction model… [T]he step of selecting a set number of training samples is specifically optimized by: …, wherein the query sample comprises: … at least two queries meeting a set similarity threshold condition…”); ¶¶ [0113]-[0114](“Correspondingly, the query sample may further comprise at least two queries meeting a set similarity threshold condition. In a specific example, ‘birthday card’ may be directly selected as a query sample, or ‘birthday card’, ‘birthdate card’ and ‘date-of-birth card’ may be used as query samples in a manner of semantic similarity clustering.”); ¶¶ [0093]-[0096] (“Therefore, in a exemplified implementation of this embodiment, a positive sample image and a negative sample image corresponding to a query may be automatically determined according to image click logs of users. For example, after a user enters a query to search for an image, an image clicked by the user based on a searching result is used as a positive sample image corresponding to the query, and an image not clicked by the user is used as a negative sample image corresponding to the query.”)
FU discloses training, by the computing device, a model using machine learning based on the training dataset to perform image searches at ¶¶ [0086]-[0087], [0093]-[0101] and FIG. 5.

With regards to claim 15, FU discloses the plurality of text is titles associated with respective digital images of the plurality of digital images (“image associated text data”) at ¶¶ [0047]-[0048]; to wit: “The image associated text data specifically refers to: text information that is stored corresponding to the image and used to briefly describe the image content. For example, when an image is stored, a title ‘birthday card’ of the image is stored at the same time.”
With regards to claim 18, FU discloses selecting the positive digital image sample from a first search result of a plurality of search results at ¶ [0095] (“Therefore, in a exemplified implementation of this embodiment, a positive sample image and a negative sample image corresponding to a query may be automatically determined according to image click logs of users. For example, after a user enters a query to search for an image, an image clicked by the user based on a searching result is used as a positive sample image corresponding to the query, and an image not clicked by the user is used as a negative sample image corresponding to the query.”)
FU discloses selecting the negative digital image sample from a second search result of the plurality of search results other than the first search result at ¶ [0095] (“Therefore, in a exemplified implementation of this embodiment, a positive sample image and a negative sample image corresponding to a query may be automatically determined according to image click logs of users. For example, after a user enters a query to search for an image, an image clicked by the user based on a searching result is used as a positive sample image corresponding to the query, and an image not clicked by the user is used as a negative sample image corresponding to the query.”)
With regards to claim 19, FU discloses the training dataset includes a first training dataset that is a query-based training dataset and includes a plurality of text queries used to initiate a plurality of digital image searches and a plurality of digital images that are user selected from search results generated by the plurality of digital image searches at ¶¶ [0093]-[0095]; ¶¶ [0003]- [0004] (“Image search refers to an information retrieval process whereby a user enters a natural language query, for example, a query entered via a text field provided by a search engine; an image collection is searched and a sorted image result according to relevance… Currently, relevance characteristics of image search are described mainly using the following three approaches: 1. a text matching characteristic, which is obtained by comparing image surrounding text with a query…”); and, at ¶ [0095] (“Therefore, in a exemplified implementation of this embodiment, a positive sample image and a negative sample image corresponding to a query may be automatically determined according to image click logs of users. For example, after a user enters a query to search for an image, an image clicked by the user based on a searching result is used as a positive sample image corresponding to the query, and an image not clicked by the user is used as a negative sample image corresponding to the query.”) 
FU discloses the training dataset includes a second training dataset is a title-based training dataset and includes a corresponding plurality of digital images and titles associated with the corresponding plurality of digital images at ¶¶ [0047]-[0048]; to wit: “The image associated text data specifically refers to: text information that is stored corresponding to the image and used to briefly describe the image content. For example, when an image is stored, a title ‘birthday card’ of the image is stored at the same time.”
With regards to claim 20, FU discloses the negative digital image sample is not included in a search result used to locate the positive digital image sample at ¶¶ [0127]-[0129]; to wit: “That is, at least one image is acquired from a positive image sample set corresponding to a non-associated query other than the current operation query, to serve as a target negative sample image corresponding to the current operation query. For example, a positive sample image set corresponding to a query ‘tiger’ comprises ‘image 81~image 100’, and ‘image 81~image 100’ may all be used as target negative sample images of ‘birthday card’.”
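The cross-query selection quoted from FU at ¶¶ [0127]-[0129] may be illustrated by the following minimal sketch. The function name, the mapping of queries to positive image sets, and the way associated queries are supplied are assumptions made for illustration; FU does not prescribe this data layout.

```python
def negatives_from_other_queries(positives_by_query, current_query, associated=()):
    """Collect negative samples for current_query from non-associated queries.

    Positive images of every query that is neither the current query nor one
    of its associated queries are gathered as target negative samples.
    """
    excluded = {current_query, *associated}
    negatives = set()
    for query, images in positives_by_query.items():
        if query not in excluded:
            negatives |= set(images)
    # Assumption: an image that is positive for the current query is never
    # returned as one of its negatives.
    return negatives - set(positives_by_query.get(current_query, ()))


# Hypothetical positive sample sets keyed by query.
positives_by_query = {
    "birthday card": {"image 1", "image 2"},
    "birthdate card": {"image 3"},
    "tiger": {"image 81", "image 100"},
}
negatives = negatives_from_other_queries(
    positives_by_query, "birthday card", associated={"birthdate card"}
)
```

In this sketch the negatives for "birthday card" are drawn only from the positive set of "tiger", so none of them appears in a search result used to locate the positive samples for "birthday card".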
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 2, 6, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over FU et al (U.S. PG Pub. No. 2017/0330054) in view of JIN et al (U.S. PG Pub. No. 2017/0206465).
With regards to claim 2, FU discloses training a model using machine learning based on a loss function using the training dataset at ¶¶ [0086]-[0087], [0093]-[0101] and FIG. 5. But, FU does not specify that the training of the model results in a single unified text-and-digital image embedding space based on the plurality of text queries and the plurality of digital images. However, this limitation was known in the art:
JIN discloses training of a model that results in a single unified text-and-digital image embedding space based on the plurality of text queries and the plurality of digital images at ¶¶ [0053]-[0066]; see, also, ¶ [0039] and FIG. 3.
At the time of the filing of the present application, it would have been obvious to a person of ordinary skill in the art to train a model that results in a single unified text-and-digital image embedding space based on the plurality of text queries and the plurality of digital images, as taught by JIN, when training a model using machine learning based on a loss function using the training dataset, according to the method taught by FU.  The motivation for doing so comes from JIN, which discloses, “Once the embedding space is trained, it is usable to discover text labels for images, e.g., for image tagging, for multiple-texts based image search (to identify images as corresponding to searches).”  (JIN, ¶ [0084]; see, also, ¶¶ [0019]-[0020]).  Therefore, it would have been obvious to combine JIN with FU to obtain the invention specified in this claim.


With regards to claim 6, JIN discloses the training includes generating a positive image embedding (e.g., “$\{y_j\}_{j \in \tau^{+}}$”) from a positive digital image sample, a text embedding (e.g., “$f(x_i)$”) from the text query associated with the positive digital image sample, and a negative image embedding (e.g., “$\{y_k\}_{k \in \tau^{-}}$”) generating from a negative digital image sample at ¶¶ [0053], [0058], [0061]. The motivation for this combination is the same as was previously presented.

With regards to claim 16, FU discloses two terms of text associated with the positive digital image sample at: ¶ [0107](“FIG. 6 is a flowchart for a method of establishing an image search relevance prediction model… [T]he step of selecting a set number of training samples is specifically optimized by: …, wherein the query sample comprises: … at least two queries meeting a set similarity threshold condition…”); ¶¶ [0113]-[0114](“Correspondingly, the query sample may further comprise at least two queries meeting a set similarity threshold condition. In a specific example, ‘birthday card’ may be directly selected as a query sample, or ‘birthday card’, ‘birthdate card’ and ‘date-of-birth card’ may be used as query samples in a manner of semantic similarity clustering.”).
JIN discloses the training includes generating a positive image embedding (e.g., “$\{y_j\}_{j \in \tau^{+}}$”) from the positive digital image sample, a text embedding (e.g., “$f(x_i)$”) from the terms of text associated with the positive digital image sample, and a negative image embedding (e.g., “$\{y_k\}_{k \in \tau^{-}}$”) from the negative digital image sample at ¶¶ [0053], [0058], [0061]. The motivation for this combination is the same as was previously presented.

Claims 7 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over FU et al (U.S. PG Pub. No. 2017/0330054) in view of JIN et al (U.S. PG Pub. No. 2017/0206465), in further view of MURALI (U.S. PG Pub. No. 2019/0347357).
With regards to claim 7, FU discloses training a model using machine learning based on a loss function using the training dataset at ¶¶ [0086]-[0087], [0093]-[0101] and FIG. 5, but does not disclose the positive and negative text embeddings recited in parent claim 8. JIN discloses the positive and negative text embeddings recited in parent claim 8 at ¶¶ [0053], [0058], [0061]. JIN further discloses that the single unified text-and-digital image embedding space based on the plurality of text queries and the plurality of digital images is the result of training a model at ¶¶ [0053]-[0066]; see, also, ¶ [0039] and FIG. 3. JIN further addresses a loss (e.g., $\min D(f(x_i^{c}), y_j)$) between the text embedding and the positive image embedding separately from a loss (e.g., $\min D(f(x_i^{c}), y_k)$) between the text embedding and the negative image embedding at ¶¶ [0060]-[0062]; to wit: “[T]he MIE module 114 uses an adjusted multi-instance embedding loss formula that encourages positive text labels (e.g., those associated with the training image) to have smaller min distances than most negative text labels.” The motivation for this combination is the same as was previously presented. But, JIN does not specify that the loss function is a triplet loss function. However, this limitation was known in the art:
MURALI discloses a loss function that is a triplet loss function that addresses a loss between an “anchor” (“query image vector”) and a positive image separately from a loss between the “anchor” (“query image vector”) and the negative image at ¶ [0043]; to wit: “A triplet loss procedure minimizes a distance between an anchor and a positive having the same identity, while maximizing a distance between an anchor and a negative having a different identity. In the present example embodiment, the triplet loss procedure minimizes the distance between the vectors 18 and 20 for cases where they represent substantially a same image/content, and maximizes the distance between the vectors 18 and 20 for cases where they represent different images/content from one another.” See, also, ¶¶ [0032]-[0033], [0038]. At the time of the filing of the present application, it would have been obvious to a person of ordinary skill in the art to use a loss function that is a triplet loss function that addresses a loss between an “anchor” (“query image vector”) and a positive image separately from a loss between the “anchor” (“query image vector”) and the negative image, as taught by MURALI, when training a single unified text-and-digital image embedding space using a loss function that encourages positive text labels (e.g., those associated with the training image) to have smaller min distances than most negative text labels.  The motivation for doing so comes from MURALI, which discloses, “[T]he triplet loss procedure minimizes the distance between the vectors 18 and 20 for cases where they represent substantially a same image/content, and maximizes the distance between the vectors 18 and 20 for cases where they represent different images/content from one another.”  (MURALI, ¶ [0043]).  Therefore, it would have been obvious to combine MURALI with JIN and FU to obtain the invention specified in this claim.
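For illustration only, the triplet loss procedure quoted above from MURALI ¶ [0043] can be sketched as follows. The Euclidean distance, the hinge form, the margin value, and the function names are assumptions made for this sketch; none of them is taken from MURALI, JIN, or FU.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The anchor-positive term and the anchor-negative term are computed
    separately, then combined. The margin of 1.0 is a hypothetical default.
    """
    d_pos = euclidean(anchor, positive)  # loss term pulling the positive closer
    d_neg = euclidean(anchor, negative)  # loss term pushing the negative away
    return max(0.0, d_pos - d_neg + margin)


# The positive is already much closer than the negative: no loss.
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))

# The positive is farther than the negative: positive loss.
print(triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]))
```

In this sketch the loss is zero once the negative is farther from the anchor than the positive by at least the margin, which corresponds to minimizing the anchor-positive distance while maximizing the anchor-negative distance.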
With regard to claim 17, the steps performed by the method of this claim are obvious over the combination of MURALI, JIN, and FU for the same reasons as were presented with respect to claim 7, which recites an apparatus configured to perform these same steps.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID F DUNPHY, whose telephone number is (571) 270-1230. The examiner can normally be reached from 9 am to 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le, can be reached at 571-272-7332. The fax number for the organization to which this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/DAVID F DUNPHY/
Primary Examiner, Art Unit 2668