Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1, 5-8, 14, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Lakhman (U.S. Pub 2019/0179796 A1), in view of Aggarwal (U.S. Pub 2020/0380403 A1).
Claim 1
Lakhman discloses a system comprising (fig. 1):
at least one processor ([0060], line 7-18, “... a processor, processors, CPU...”); and 
memory storing instructions that, when executed by the at least one processor, causes the system to perform a set of operations, the set of operations comprising ([0060], ROM, RAM. Fig. 1 shows storage device 215): 
training a machine learning model based on a training dataset comprising a search query corpus and the document corpus, wherein training the machine learning model comprises ([0085], line 2-3, “... generate training samples for an MLA...” [0088], line 4-13, “... retrieve from the search log database 215... an indication of search queries 301... The indication of search queries 301 may generally include (1) search queries, (2) associated image search results...” [0101], line 1-2, “... The set of training objects 345 may then be used for training a MLA on the training server 230...” <examiner note: the search log database 215 (the training dataset) includes a collection of queries q and a collection of documents d that are collected and used as training data to train an MLA. This application can be applied to general web searches and/or other types of vertical domain searches [0085]>):
generating, using the machine learning model, a set of ranking scores for documents of the document corpus based on a first search query of the search query corpus ([0091], line 4-6, “... for a given query qn, the search query aggregator 310 may retrieve query-document-metric tuples 304 based on a relevance score R(dn) of the document dn...” <examiner note: each document Dn has a relevance score R(dn) for a given query qn>);
refining the training dataset based on the generated set of ranking scores ([0091], line 8-11, “... only query-document-metric tuples 304 with documents having a relevance score R(dn) over a predetermined threshold value may be retrieved...” <examiner note: documents dn whose relevance score R(dn) meets the threshold are retrieved>).
Lakhman discloses “... While the description refers to vertical searches for images and image search results, the present technology may also be applied to general web searches and/or other types of vertical domain searches. Without limiting the generality of the foregoing, the non-limiting embodiments of the present technology can be applied to other types of documents, such as web results, videos, music, news, and other types of searches...” [0085].
However, Lakhman does not explicitly disclose:
determining a first negative document from a set of negative documents for the first search query; and
evaluating a loss function using the first negative document to train the machine learning model; 
obtaining a request comprising a second search query; 
generating, using the trained machine learning model, a set of documents from the document corpus that is responsive to the second search query; and
providing, in response to the request, the set of documents that is responsive to the second search query.
Aggarwal discloses “... employs a training data generation module 126 to generate a training dataset 128...” [0044]; “... Training of the model 120 is performed using digital image/text pairs 202 of a training dataset 128...” [0054], “... digital image sample...that has associated text, e.g., text queries...” <examiner note: the training data include image samples and text/queries>.
determining a first negative document from a set of negative documents for the first search query ([0054], “... generate the negative digital image samples... the training data generation module 126 selects a positive digital image sample from a plurality of digital images that has associated text, e.g., text queries...” [0055], “... The training data generation module 126 then generates a subset... For example, suppose the positive digital image sample has associated text of “man on a motorbike.” Digital images are excluded from the subset having digital images associated with either “man” or “motorbike,” i.e., are “filtered out.” The subset is then used to select a negative digital image sample...” <examiner note: a positive image sample associated with a text/query is selected, and a subset of image samples that are not associated with that text/query is generated. The subset is considered the set of negative image samples for the given query. A negative image sample is then selected from this subset, i.e., from a set of negative samples>); and
evaluating a loss function using the first negative document to train the machine learning model ([0057], “... The machine-learning training module 130 may also implement a loss function 132 as part of training... Continuing with the example above... trains the model 120 using a positive digital image sample, a negative digital image sample, and text associated with the positive digital image sample. A text embedding is generated from the text... A positive image embedding is also generated from the positive digital image sample and a negative image embedding generated from the negative digital image sample...” [0058], line -14, “... The loss function 132 is configured in this example to evaluate a loss between the text embedding and the positive image embedding separately from a loss between the text embedding and the negative image embedding...” <examiner note: loss function is evaluated using negative sample to train model>); 
obtaining a request comprising a second search query ([0077], line 1-2, “... an input 602 is received... that includes a digital image and/or text 606...” [0079], line 1-2, “... the input (e.g., a search query input) includes both text and digital images...”); 
generating, using the trained machine learning model, a set of documents from the document corpus that is responsive to the second search query ([0080], line 1-6, “... A result 616 is generated by... comparing the embedding of the input 602 with embeddings 612 of digital images or text (e.g., maintained in a storage device 614) as part of the visually guided machine-learning embedding space 122 (block 712)...” <examiner note: during training, a visually guided language embedding space 122 is generated based on the training set that includes queries, positive samples, and negative samples. When a second search query is received, the embedding of the second search query is compared with the embeddings in embedding space 122 to obtain a set of images from this embedding space/training dataset>); and
providing, in response to the request, the set of documents that is responsive to the second search query ([0080], line 6-7, “... This is then output (block 714) in real time...”).
Lakhman discloses that training objects are generated for training a machine learning model based on a set of queries and a set of documents/images that are responsive to the queries. As disclosed in [0085], the technology is not limited to “... vertical searches for images and image search results, the present technology may also be applied to general web searches and/or other types of vertical domain searches. Without limiting the generality of the foregoing, the non-limiting embodiments of the present technology can be applied to other types of documents, such as web results, videos, music, news, and other types of searches...” However, Lakhman does not explicitly disclose including negative samples to train the model. Aggarwal discloses that negative samples are utilized to train and evaluate the model so that “... a distance, during training, between the text embedding and the positive image embedding within the embedding space to decrease and a distance between the text embedding and the negative image embedding to increase...” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the negative samples of Aggarwal into the model training of Lakhman because “... accuracy of the model 120 is increased over conventional loss functions that did not support such an ability to separately address these losses...” [0058].
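For illustration only, the threshold-based retrieval that Lakhman describes in [0091], as mapped above to the refining step, can be sketched as follows. This is a minimal sketch and is not code from either reference; the names ScoredPair and refine_training_pairs, the sample scores, and the threshold value are all hypothetical.

```python
from typing import List, NamedTuple


class ScoredPair(NamedTuple):
    """One query-document pair with its score (hypothetical structure)."""
    query: str
    document: str
    score: float  # stands in for the relevance score R(dn)


def refine_training_pairs(pairs: List[ScoredPair], threshold: float) -> List[ScoredPair]:
    """Keep only the pairs whose score is over the threshold."""
    return [p for p in pairs if p.score > threshold]


# Hypothetical sample data for a single query q1.
pairs = [
    ScoredPair("q1", "d1", 0.9),
    ScoredPair("q1", "d2", 0.2),
    ScoredPair("q1", "d3", 0.6),
]
refined = refine_training_pairs(pairs, threshold=0.5)
print([p.document for p in refined])
```

Under these assumed scores, only d1 and d3 remain in the refined set for q1.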
Claim 5
Claim 1 is included, Aggarwal further discloses wherein the first negative document is determined from the set of negative documents for the first search query using noise-contrastive estimation ([0056], “... In another example, even “harder” negative digital image samples may be generated by the training data generation module 126, automatically and without user intervention. To do so in this example, the training data generation module 126 also generates a subset from a plurality of digital images that excludes digital images from the plurality of digital images that have each of the terms, excluding stop words (i.e., are “pivots”), in the text associated with the positive digital image sample. The training data generation module 126 then selects the negative digital image sample from this subset. For example, suppose again that the positive digital image sample has associated text of “man on a motorbike.” Digital images are then filtered from the plurality of digital images that have both “man” or “motorbike.” The subset resulting from the filtering is then used to select a negative digital image sample. This may be performed, for instance, for title-based training data which typically includes significant amounts of text. As a result, the model 120 is further able to discriminate between “good” and bad” examples of digital image and text associations as part of training...”)
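For illustration only, the two filtering examples quoted from Aggarwal ([0055] and [0056]) can be sketched as follows. This is a minimal sketch under stated assumptions, not Aggarwal's implementation: the stop-word list, the corpus contents, and the function names are hypothetical, and whitespace tokenization is assumed.

```python
import random

STOP_WORDS = {"a", "an", "on", "the", "of"}  # hypothetical stop-word list


def pivots(text):
    """Return the non-stop-word terms of a text (assumed whitespace tokens)."""
    return {w for w in text.lower().split() if w not in STOP_WORDS}


def candidate_negatives(positive_text, corpus, hard=False):
    """Return ids of items that survive the filtering for a positive sample.

    hard=False: drop items sharing ANY pivot term with the positive text.
    hard=True:  drop only items containing EVERY pivot term, so the
                remaining candidates may still be partially related.
    """
    terms = pivots(positive_text)
    kept = []
    for doc_id, text in corpus.items():
        doc_terms = pivots(text)
        excluded = terms <= doc_terms if hard else bool(terms & doc_terms)
        if not excluded:
            kept.append(doc_id)
    return kept


def pick_negative(positive_text, corpus, hard=False, rng=None):
    """Select one negative sample from the filtered subset, if any."""
    rng = rng or random.Random(0)
    candidates = candidate_negatives(positive_text, corpus, hard)
    return rng.choice(candidates) if candidates else None


# Hypothetical corpus of image captions.
corpus = {
    "img1": "man on a motorbike",
    "img2": "man walking a dog",
    "img3": "red motorbike",
    "img4": "sunset over the sea",
}
easy = candidate_negatives("man on a motorbike", corpus)
harder = candidate_negatives("man on a motorbike", corpus, hard=True)
print(easy, harder)
```

In this sketch the first mode leaves only unrelated items, while the second mode also keeps items that match some but not all of the terms, which corresponds to the "harder" negatives described in the quoted passage.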
Claim 6
Claim 1 is included, Aggarwal discloses wherein the loss function evaluates a first cosine similarity between a query embedding vector for the first search query and a first document embedding vector for the first negative document ([0130], “... during training a distance between the positive image embedding 1602 and the text embedding 306 reduces over time while a distance between the negative image embedding 1604 and the text embedding 306 increases...”)
Claim 7
Claim 6 is included, Aggarwal discloses wherein the loss function further evaluates a second cosine similarity between the query embedding vector and a second document embedding vector for a positive document associated with the first search query ([0130], “... during training a distance between the positive image embedding 1602 and the text embedding 306 reduces over time while a distance between the negative image embedding 1604 and the text embedding 306 increases...”)
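For illustration only, a loss that evaluates one cosine similarity between the query embedding and a negative document embedding, and a second cosine similarity between the query embedding and a positive document embedding, can be sketched as follows. The margin-based form and the margin value are assumptions chosen for the sketch; neither the claims nor the cited passage of Aggarwal specifies this exact formula.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def triplet_cosine_loss(query, positive, negative, margin=0.2):
    """Hypothetical margin loss over (query, positive, negative).

    The loss is zero once the positive document is more similar to the
    query than the negative document by at least the margin.
    """
    sim_pos = cosine_similarity(query, positive)
    sim_neg = cosine_similarity(query, negative)
    return max(0.0, margin - sim_pos + sim_neg)


# Hypothetical 2-dimensional embeddings.
query = [1.0, 0.0]
aligned = [1.0, 0.0]
orthogonal = [0.0, 1.0]
well_separated = triplet_cosine_loss(query, aligned, orthogonal)
badly_separated = triplet_cosine_loss(query, orthogonal, aligned)
print(well_separated, badly_separated)
```

Minimizing such a loss pulls the positive embedding toward the query embedding and pushes the negative embedding away, which is the behavior described in the quoted passage of [0130].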
Claim 8
Claim 1 is included, Aggarwal further discloses wherein generating the set of documents that is responsive to the second search query comprises: performing an approximate nearest neighbor search using a query embedding vector for the second search query and document embedding vectors for documents of the document corpus to generate the set of documents; and ranking the set of documents according to associated ranking scores ([0080], “... A result 616 is generated by a comparison module 610 of the operation module 314 by comparing the embedding of the input 602 with embeddings 612 of digital images or text (e.g., maintained in a storage device 614) as part of the visually guided machine-learning embedding space 122 (block 712). This is then output (block 714) in real time. The embeddings 610, for instance, may correspond to digital images 110. A search of the digital images 110 is performed by comparing embeddings of the digital images with an embedding generated based on the input, e.g., as part of a nearest neighbor search such as least squared distance. This may also be used to support natural language processing, text summarization, speech recognition, text classification, and other functionality...”)
	Claims 14 and 17-20 are similar to claims 1 and 5-8 and are rejected for similar reasons.


Claim(s) 2 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Lakhman (U.S. Pub 2019/0179796 A1), in view of Aggarwal (U.S. Pub 2020/0380403 A1), as applied to claim 1, and further in view of Zou (U.S. Pub 2020/0226126 A1).
Claim 2
Claim 1 is included, however, Lakhman does not explicitly disclose wherein the set of operations further comprises: generating, for each document of the document corpus, a document embedding vector based on an embedding vector of at least one subpart of the document.
	Zou discloses wherein the set of operations further comprises: generating, for each document of the document corpus, a document embedding vector based on an embedding vector of at least one subpart of the document ([0020], “... The search documents 126 are the documents will ultimately be searched and evaluated for relevance to user queries... the search system produces a search document vector 128... the vectorization of a search document may include vectoring sub-parts of the document, e.g. paragraphs, pages, etc...”)
	Lakhman discloses that document vectors are generated at the document level. However, Lakhman does not disclose that the generated document vector is based on an embedding of a subpart of the document. Zou discloses that the document vector for the document is vectorized based on sub-parts of the document (e.g., paragraphs, pages, words, sentences). The benefit of generating vectors of the sub-parts of documents is that the scores of the sub-parts are included in the search results. By incorporating Zou into Lakhman, the search results include “... graphics that indicate the scores of the document's subparts. This may allow the user to quickly jump to the subparts of the document that are most relevant to the query text 132...” [0036]
	Claim 15 is similar to claim 2 and is rejected for similar reasons.

Claim(s) 3-4 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Lakhman (U.S. Pub 2019/0179796 A1), in view of Aggarwal (U.S. Pub 2020/0380403 A1), as applied to claim 1, and further in view of Wang (U.S. Pub 2021/0374344 A1).
Claim 3
Claim 1 is included, Lakhman discloses wherein refining the training dataset comprises: retaining, for the first search query, a subset of documents of the document corpus in the training dataset based on the set of ranking scores ([0091], line 8-11, “... only query-document-metric tuples 304 with documents having a relevance score R(dn) over a predetermined threshold value may be retrieved...” <examiner note: query q1 includes a subset of documents d1-dn>).
	Aggarwal discloses selecting a negative sample from the set of negative samples. However, Lakhman and Aggarwal do not explicitly disclose determining a second negative document for the first search query from the document corpus, wherein the second negative document is part of the set of negative documents for the first search query.
	Wang discloses determining a second negative document for the first search query from the document corpus, wherein the second negative document is part of the set of negative documents for the first search query ([0059], “... As a piece of sample data, for example, a query and webpage Title1, Title2, Title3 and Title4 corresponding to the query, the sorting is Title1>Title2>Title3>Title4...” [0060], “... for example, a query, positive sample webpages Title2 and Title4 corresponding to the query, and negative sample webpages Title1 and Title3 corresponding to the query...” <examiner note: the query has at least two negative samples>)
Aggarwal discloses that the loss function is calculated based on the triplet (q, d+, d-). Wang discloses that the loss function is calculated based on (q, a set of d+, and a set of d-). This way, the optimization of the training model is performed in a listwise manner. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include additional negative samples (e.g., not clicked data) for the same query, as disclosed by Wang, in the training of Aggarwal to optimize the model through listwise training.
Claim 4
Claim 3 is included, Wang further discloses wherein the second negative document is randomly determined ([0061], “... In 4021, an input sequence is formed in order with the item to be matched and information of at least two sample resources in the same training sample...”)
	Claim 16 is similar to claim 3 and is rejected for similar reasons.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.



The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 9-10 and 12 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Aggarwal (U.S. Pub 2020/0380403 A1).
Claim 9
Aggarwal discloses a method for generating a set of documents responsive to a search query, comprising:
obtaining a request comprising a search query ([0077], line 1-2, “... an input 602 is received... that includes a digital image and/or text 606...” [0079], line 1-2, “... the input (e.g., a search query input) includes both text and digital images...”); 
generating a query embedding vector for the search query ([0078], line 1-2, “... generates an embedding from the input 602...”); 
generating, based on the query embedding vector and document embedding vectors for documents of a document corpus ([0042], line 1-2, “... a visually guided language embedding space 122...” [0046], “... FIG. 2... text embeddings generated from text... are represented as circles and digital image embeddings...” <examiner note: embedding space 122 includes digital image embeddings/documents>), a set of documents responsive to the search query ([0080], line 1-6, “... A result 616 is generated by... comparing the embedding of the input 602 with embeddings 612 of digital images or text (e.g., maintained in a storage device 614) as part of the visually guided machine-learning embedding space 122 (block 712)...” <examiner note: during training, a visually guided language embedding space 122 is generated based on the training set that includes queries, positive samples, and negative samples. When a search query is received, the embedding of the search query is compared with the embeddings in embedding space 122 to obtain a set of images from this embedding space/training dataset>);
ranking the set of documents according to associated ranking scores ([0092], line 8-12, “... search module 114 to form a ranked list of the digital images 110 (e.g., based on “closeness” to the search query embedding), which are used to generate the first search result 814...”); and 
providing, in response to the request, the ranked set of documents that is responsive to the search query ([0080], line 6-7, “... This is then output (block 714) in real time...”)
Claim 10
Claim 9 is included, Aggarwal further discloses wherein generating the set of documents responsive to the search query comprises processing the query embedding vector and the document embedding vectors using an approximate nearest neighbor search ([0092], line 1-7, “... The search query embedding 810 is then compared by the search module 114 with embeddings 812 generated for the digital images 110 or text to perform the search. A nearest neighbor search, for instance, is performed (e.g., based on a least square distance) between the embeddings to determine which embeddings 812 and respective digital images, from which, the embeddings 812 are formed...”)
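For illustration only, a nearest neighbor search based on a least squared distance between a query embedding and document embeddings, followed by ranking on closeness, can be sketched as follows. This sketch performs an exact brute-force search rather than an approximate one, and the function names and sample embeddings are hypothetical; it is not code from Aggarwal.

```python
def squared_distance(u, v):
    """Sum of squared coordinate differences between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_documents(query_embedding, doc_embeddings, k=2):
    """Return the ids of the k documents closest to the query.

    The result is ranked by increasing distance, so the first id is the
    closest document. Every document is compared (exact search).
    """
    scored = [
        (squared_distance(query_embedding, vec), doc_id)
        for doc_id, vec in doc_embeddings.items()
    ]
    scored.sort()
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical 2-dimensional document embeddings.
doc_embeddings = {
    "d1": [0.0, 1.0],
    "d2": [1.0, 0.0],
    "d3": [0.9, 0.1],
}
ranked = nearest_documents([1.0, 0.0], doc_embeddings, k=2)
print(ranked)
```

An approximate search would replace the exhaustive comparison with an index that examines only a portion of the stored embeddings, trading some accuracy for speed.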
Claim 12
Claim 9 is included, Aggarwal discloses wherein a document embedding vector for a document of the document corpus is associated with a body of the document ([0046], “... FIG. 2... text embeddings generated from text... are represented as circles and digital image embeddings...”)

Claim(s) 11 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Aggarwal (U.S. Pub 2020/0380403 A1), as applied to claim 9, and further in view of Zou (U.S. Pub 2020/0226126 A1).
Claim 11
Claim 9 is included, Aggarwal discloses wherein a document embedding vector for a document of the document corpus is a pre-generated document embedding vector ([0042], line 1-2, “... a visually guided language embedding space 122...” [0046], “... FIG. 2... text embeddings generated from text... are represented as circles and digital image embeddings...”)
	However, Aggarwal does not explicitly disclose wherein a document embedding vector for a document based on a plurality of embedding vectors, wherein each embedding vector of the plurality of embedding vectors is associated with a subpart of the document.
Zou discloses wherein a document embedding vector for a document based on a plurality of embedding vectors, wherein each embedding vector of the plurality of embedding vectors is associated with a subpart of the document ([0020], “... The search documents 126 are the documents will ultimately be searched and evaluated for relevance to user queries... the search system produces a search document vector 128... the vectorization of a search document may include vectoring sub-parts of the document, e.g. paragraphs, pages, etc...”)
Aggarwal discloses that document vectors are generated at the document level. However, Aggarwal does not disclose that the generated document vector is based on an embedding of a subpart of the document. Zou discloses that the document vector for the document is vectorized based on sub-parts of the document (e.g., paragraphs, pages, words, sentences). The benefit of generating vectors of the sub-parts of documents is that the scores of the sub-parts are included in the search results. By incorporating Zou into Aggarwal, the search results include “... graphics that indicate the scores of the document's subparts. This may allow the user to quickly jump to the subparts of the document that are most relevant to the query text 132...” [0036]
Claim 13
Claim 9 is included, Zou discloses wherein providing the ranked set of documents comprises providing a subpart of a document in the ranked set of documents ([0036], “... graphics that indicate the scores of the document's subparts. This may allow the user to quickly jump to the subparts of the document that are most relevant to the query text 132...”)
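For illustration only, one way a document vector could be derived from the vectors of its subparts, and one way the most relevant subpart could be identified for a query, is sketched below. Mean pooling and dot-product scoring are assumptions chosen for the sketch; the quoted passages of Zou do not specify how subpart vectors are combined or scored, and all names and values are hypothetical.

```python
def mean_pool(vectors):
    """Average equal-length subpart vectors into one document vector."""
    if not vectors:
        raise ValueError("at least one subpart vector is required")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def dot(u, v):
    """Dot product, used here as a hypothetical relevance score."""
    return sum(a * b for a, b in zip(u, v))


def most_relevant_subpart(query_vector, subpart_vectors):
    """Return the index of the subpart scoring highest against the query."""
    scores = [dot(query_vector, v) for v in subpart_vectors]
    return scores.index(max(scores))


# Hypothetical paragraph vectors for one document.
paragraph_vectors = [[1.0, 0.0], [0.0, 1.0]]
document_vector = mean_pool(paragraph_vectors)
best = most_relevant_subpart([0.0, 1.0], paragraph_vectors)
print(document_vector, best)
```

With per-subpart scores available, a result for a document could point the user to the subpart at the returned index, which is the kind of navigation described in the quoted passage of [0036].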

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HAU HAI HOANG whose telephone number is (571)270-5894. The examiner can normally be reached 1st biwk: Mon-Thurs 7:00 AM-5:00 PM; 2nd biwk: Mon-Thurs 7:00 AM-5:00 PM, Fri 7:00 AM-4:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Robert Beausoliel can be reached on 571 262 3645. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

HAU HAI HOANG
Primary Examiner
Art Unit 2167



/HAU H HOANG/     Primary Examiner, Art Unit 2167