DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
This application claims priority to provisional application 62/981,654, filed 26 February 2020.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 16 October 2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 2, 9-11, 18 and 19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by US Patent No. 10,140,315 to Hohwald et al. (hereafter Hohwald).

Referring to claim 1, Hohwald discloses a method for multimodal content retrieval, comprising: 
receiving a search query corresponding to a request for content (see column 11, lines 23-25 – The processor of the server is configured to receive, from a user of the client, a search query for the collection of media files.); 
retrieving, based on receiving the search query, content features corresponding to a subset of content items from among a plurality of content items (see column 12, lines 43-67 – using vectors from a particular cluster); 
calculating similarity values [similarity score] between the search query [input data vector generated for each visual media input file] and the retrieved content features [vector for each visual portion of each visual media file from the collection] (see column 11, line 38 – column 12, line 17 – A dot product is performed between an input data vector generated for each visual media input and the data vector for each visual portion of each visual media file from the collection to generate a dot product similarity score for each visual media input file and visual pairing portion.); 
determining attention scores [responsiveness scores] for the calculated similarity values (see column 13, lines 1-49 – The dot product similarity score for the pairing of the visual media input file and the visual portion for a visual media file is then weighted based on past responsiveness of the corresponding visual portion to the cluster to which the input data vector is assigned.); and 
selecting a content item from among the subset of content items of the plurality of content items, the selected content item containing a content feature corresponding to a highest attention score of the attention scores (see column 13, lines 39-51 – The processor of the server is configured to provide, in response to the search query, an identifier of at least one responsive visual media item from the collection of media files for display as responsive to the search query. The responsive visual media files can be sorted in descending order according to a responsiveness score, which can be equal to a visual similarity score, with the responsive visual media files having the highest n visual similarity scores being identified to the user as responsive to the visual media input files of the search query.).  
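For illustration only, the mapping above (dot product similarity, weighting by past responsiveness, and selection of the highest-scoring item) can be sketched as follows. This is a minimal sketch under stated assumptions, not Hohwald's implementation: the names `dot`, `select_item`, `portions`, and `responsiveness` are hypothetical, and the multiplicative weighting is an assumed form of the "weighted based on past responsiveness" step.

```python
from typing import Dict, List, Optional, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def select_item(
    query_vec: Vector,
    portions: List[Tuple[str, Vector]],
    responsiveness: Dict[str, float],
) -> Tuple[Optional[str], float]:
    """Return the item whose portion has the highest weighted similarity.

    portions: (item_id, portion_vector) pairs (hypothetical layout).
    responsiveness: assumed per-item weight; items without an entry
    are left unweighted (weight 1.0).
    """
    best_id, best_score = None, float("-inf")
    for item_id, vec in portions:
        similarity = dot(query_vec, vec)  # similarity value
        weighted = similarity * responsiveness.get(item_id, 1.0)  # weighted score
        if weighted > best_score:
            best_id, best_score = item_id, weighted
    return best_id, best_score


portions = [("a", [0.9, 0.1]), ("b", [0.5, 0.5]), ("c", [0.8, 0.0])]
weights = {"a": 0.5, "b": 2.0, "c": 1.0}
print(select_item([1.0, 0.0], portions, weights))
```

In this toy input, item "a" has the highest raw similarity, but item "b" is selected once the weights are applied, which is the distinction the rejection draws between the similarity values and the weighted scores.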
Referring to claim 2, Hohwald discloses the method of claim 1, wherein the similarity values are calculated based on performing a vector distance operation or a vector similarity operation on averaged vectors associated with the retrieved content features (Hohwald: see column 12, lines 5-67).
Referring to claim 9, Hohwald discloses the method of claim 1, wherein the plurality of content items comprise at least one image and at least one video (Hohwald: see column 5, lines 50-54 – The collection of media files includes visual media files such as images and video recordings with or without audio.).
Referring to claim 10, Hohwald discloses a computer system for multimodal content retrieval, the computer system comprising: 
one or more computer-readable non-transitory storage media configured to store computer program code (see column 18, lines 26-53); and
one or more computer processors configured to access said computer program code and operate as instructed by said computer program code (see column 18, lines 26-53), 
receiving a search query corresponding to a request for content (see column 11, lines 23-25 – The processor of the server is configured to receive, from a user of the client, a search query for the collection of media files.); 
retrieving, based on receiving the search query, content features corresponding to a subset of content items from among a plurality of content items (see column 12, lines 43-67 – using vectors from a particular cluster); 
calculating similarity values [similarity score] between the search query [input data vector generated for each visual media input file] and the retrieved content features [vector for each visual portion of each visual media file from the collection] (see column 11, line 38 – column 12, line 17 – A dot product is performed between an input data vector generated for each visual media input and the data vector for each visual portion of each visual media file from the collection to generate a dot product similarity score for each visual media input file and visual pairing portion.); 
determining attention scores [responsiveness scores] for the calculated similarity values (see column 13, lines 1-49 – The dot product similarity score for the pairing of the visual media input file and the visual portion for a visual media file is then weighted based on past responsiveness of the corresponding visual portion to the cluster to which the input data vector is assigned.); and 
selecting a content item from among the subset of content items of the plurality of content items, the selected content item containing a content feature corresponding to a highest attention score of the attention scores (see column 13, lines 39-51 – The processor of the server is configured to provide, in response to the search query, an identifier of at least one responsive visual media item from the collection of media files for display as responsive to the search query. The responsive visual media files can be sorted in descending order according to a responsiveness score, which can be equal to a visual similarity score, with the responsive visual media files having the highest n visual similarity scores being identified to the user as responsive to the visual media input files of the search query.).  
Referring to claim 11, Hohwald discloses the computer system of claim 10, wherein the similarity values are calculated based on performing a vector distance operation or a vector similarity operation on averaged vectors associated with the retrieved content features (Hohwald: see column 12, lines 5-67).
Referring to claim 18, Hohwald discloses the computer system of claim 10, wherein the plurality of content items comprise at least one image and at least one video (Hohwald: see column 5, lines 50-54 – The collection of media files includes visual media files such as images and video recordings with or without audio.).
Referring to claim 19, Hohwald discloses a non-transitory computer readable medium having stored thereon a computer program for multimodal content retrieval (see column 18, lines 26-53), the computer program configured to cause one or more computer processors to: 
receive a search query corresponding to a request for content (see column 11, lines 23-25 – The processor of the server is configured to receive, from a user of the client, a search query for the collection of media files.); 
retrieve, based on receiving the search query, content features corresponding to a subset of content items from among a plurality of content items (see column 12, lines 43-67 – using vectors from a particular cluster); 
calculate similarity values [similarity score] between the search query [input data vector generated for each visual media input file] and the retrieved content features [vector for each visual portion of each visual media file from the collection] (see column 11, line 38 – column 12, line 17 – A dot product is performed between an input data vector generated for each visual media input and the data vector for each visual portion of each visual media file from the collection to generate a dot product similarity score for each visual media input file and visual pairing portion.); 
determine attention scores [responsiveness scores] for the calculated similarity values (see column 13, lines 1-49 – The dot product similarity score for the pairing of the visual media input file and the visual portion for a visual media file is then weighted based on past responsiveness of the corresponding visual portion to the cluster to which the input data vector is assigned.); and 
select a content item from among the subset of content items of the plurality of content items, the selected content item containing a content feature corresponding to a highest attention score of the attention scores (see column 13, lines 39-51 – The processor of the server is configured to provide, in response to the search query, an identifier of at least one responsive visual media item from the collection of media files for display as responsive to the search query. The responsive visual media files can be sorted in descending order according to a responsiveness score, which can be equal to a visual similarity score, with the responsive visual media files having the highest n visual similarity scores being identified to the user as responsive to the visual media input files of the search query.).  

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3-7, 12-16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent No. 10,140,315 to Hohwald et al. (hereafter Hohwald) as applied to claims 1, 10 and 19 above, and further in view of US PGPub 2018/0336241 to Noh et al. (hereafter Noh).

Referring to claims 3, 12 and 20, Hohwald discloses wherein the retrieving the content features corresponding to the subset of content items from among the plurality of content items, comprises: 
retrieving content average values corresponding to the content features of the plurality of content items (see column 12, lines 43-67 – centroid of a cluster); 
calculating a query value associated with contextual representations corresponding to the search query (see column 11, line 62 – column 12, line 67); 
determining overall similarity values associated with the content features based on the retrieved average content values and the calculated query average value (see column 12, lines 43-67 – distance between centroid and input vector); and 
selecting content features corresponding to the subset of content items from among the plurality of content items based on the determined overall similarity values associated with the selected content features being greater than a threshold value (see column 12, lines 43-67 – below a distance threshold).
Hohwald fails to explicitly teach a query average value. Noh teaches the comparison of a query vector and a vector of content in order to locate search results, including the further limitations of:
calculating a query average value associated with contextual representations corresponding to the search query (see [0026] and [0071] – The vector for the given job search query may be generated by using the word-embedding machine-learning model to determine the vector of each word in the given job search query, and then combining the determined vectors together (e.g., averaging the vectors).); and
determining overall similarity values associated with the content features based on the retrieved average content values and the calculated query average value (see [0026] – calculating similarity).
Hohwald and Noh are analogous art since they both relate to generating search results using vector comparison.  It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to calculate and utilize the query average vector of Noh as the query vector of Hohwald.  One would have been motivated to do so in order to provide variations for a word in the form of vectors since valuable information is missed when mapping by only a single word instead of variations of the word (Noh: see [0002]).   
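For illustration only, the proposed combination (an averaged query vector per Noh compared against cluster centroids per Hohwald, keeping clusters within a distance threshold) can be sketched as follows. This is a sketch under stated assumptions, not either reference's implementation: the function names, the use of Euclidean distance, and the dictionary of centroids are all hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def average_vectors(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equal-length vectors (e.g., per-word vectors)."""
    if not vectors:
        raise ValueError("at least one vector is required")
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance; an assumed choice of distance measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def clusters_within(
    query_avg: Vector, centroids: Dict[str, Vector], max_distance: float
) -> List[str]:
    """Return ids of clusters whose centroid lies within max_distance."""
    return [
        cluster_id
        for cluster_id, centroid in centroids.items()
        if euclidean(query_avg, centroid) <= max_distance
    ]


word_vectors = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical per-word vectors
query_avg = average_vectors(word_vectors)
centroids = {"near": [0.5, 0.5], "far": [5.0, 5.0]}
print(clusters_within(query_avg, centroids, max_distance=1.0))
```

A distance below a threshold plays the same role here as a similarity above a threshold: both keep only the content whose stored average is close to the query average.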
Referring to claim 4, the combination of Hohwald and Noh (hereafter Hohwald/Noh) discloses the method of claim 3, wherein the content average values are stored prior to receiving the search query (Hohwald: see column 15, lines 47-62).
Referring to claim 5, Hohwald/Noh discloses the method of claim 3, wherein the content average values are calculated by: detecting salient regions or grid cells in the content items (Hohwald: see column 9, lines 20-60); mapping the detected salient regions or grid cells to a set of vectors (Hohwald: see column 9, lines 20-60); and averaging the set of vectors (Hohwald: see column 9, lines 20-60).
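For illustration only, the three recited steps (dividing a content item into grid cells, mapping each cell to a vector, and averaging the set of vectors) can be sketched as follows. This is a toy sketch of the claim language, not the method of Hohwald or Noh: the image is assumed to be a 2-D list of pixel intensities, and `cell_to_vector` is a hypothetical stand-in (mean and maximum intensity) for whatever learned feature extractor would be used in practice.

```python
from typing import List

Image = List[List[float]]
Vector = List[float]


def grid_cells(image: Image, rows: int, cols: int) -> List[Image]:
    """Split an image into rows x cols cells.

    Assumes rows <= image height and cols <= image width so no cell is empty.
    """
    h, w = len(image), len(image[0])
    cells = []
    for r in range(rows):
        for c in range(cols):
            band = image[r * h // rows:(r + 1) * h // rows]
            cells.append([row[c * w // cols:(c + 1) * w // cols] for row in band])
    return cells


def cell_to_vector(cell: Image) -> Vector:
    """Toy mapping of a cell to a vector: [mean intensity, max intensity]."""
    flat = [p for row in cell for p in row]
    return [sum(flat) / len(flat), max(flat)]


def content_average(image: Image, rows: int, cols: int) -> Vector:
    """Average the per-cell vectors into one content average value."""
    vectors = [cell_to_vector(cell) for cell in grid_cells(image, rows, cols)]
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


image = [[1.0, 1.0, 3.0, 3.0],
         [1.0, 1.0, 3.0, 3.0]]
print(content_average(image, rows=1, cols=2))
```

Because the averaging depends only on the content item, a value computed this way could be stored before any query is received.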
Referring to claim 6, Hohwald/Noh discloses the method of claim 3, wherein the content average values correspond to averages of vectors corresponding to regions of the content items (Hohwald: see column 12, lines 43-67 and Noh: see [0026]).
Referring to claim 7, Hohwald/Noh discloses the method of claim 3, wherein the query average value corresponds to an average of vectors corresponding to contextual representation of words in the search query (Noh: see [0026] and [0071]).
Referring to claim 13, Hohwald/Noh discloses the computer system of claim 12, wherein the content average values are stored prior to receiving the search query (Hohwald: see column 15, lines 47-62).
Referring to claim 14, Hohwald/Noh discloses the computer system of claim 12, wherein the content average values are calculated by: detecting salient regions or grid cells in the content items (Hohwald: see column 9, lines 20-60); mapping the detected salient regions or grid cells to a set of vectors (Hohwald: see column 9, lines 20-60); and averaging the set of vectors (Hohwald: see column 9, lines 20-60).
Referring to claim 15, Hohwald/Noh discloses the computer system of claim 12, wherein the content average values correspond to averages of vectors corresponding to regions of the content items (Hohwald: see column 12, lines 43-67 and Noh: see [0026]).
Referring to claim 16, Hohwald/Noh discloses the computer system of claim 12, wherein the query average value corresponds to an average of vectors corresponding to contextual representation of words in the search query (Noh: see [0026] and [0071]).

Allowable Subject Matter
Claims 8 and 17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
As allowable subject matter has been indicated, applicant's reply must either comply with all formal requirements or specifically traverse each requirement not complied with.  See 37 CFR 1.111(b) and MPEP § 707.07(a).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US Patent No. 11,409,791 to Torabi et al. - Comparison of average video vector and query vectors
US PGPub 2021/0303614
US Patent 11,048,744 to Hohwald et al

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KIMBERLY LOVEL WILSON whose telephone number is (571)272-2750. The examiner can normally be reached 8-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Robert Beausoliel, can be reached at 571-272-3645. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KIMBERLY L WILSON/Primary Examiner, Art Unit 2167