DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status
Claims 1-20 are allowed in this Office action.

Examiner’s Amendment
An Examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to the Applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this Examiner’s amendment was given in a telephonic communication (see attached Interview Summary) with Applicant’s representative, Laura Majerus, on June 29, 2022.





The claims are amended as presented below and will replace all previous version(s):
Claim 1. (Currently Amended) A method, comprising: 
obtaining: 
a first embedding produced by an embedding model from an input string representing an entity; and 
a hierarchy of clusters of embeddings generated by the embedding model from a set of standardized entities; 
searching the hierarchy of clusters for a subset of the embeddings that are within a threshold proximity to the first embedding in a vector space, wherein the searching of the hierarchy of clusters comprises: 
identifying, at a root level of the hierarchy, a first subset of the clusters with centers that are closest to the first embedding; 
ordering the first subset of the clusters in a priority queue by distances between the centers of the first subset of the clusters and the first embedding in the vector space; 
iteratively expanding a first cluster of the first subset at a front of the priority queue into a set of child clusters of the first cluster in the first level of the hierarchy; and 
inserting the set of child clusters into the priority queue according to the distances to the first embedding until a second cluster in a lowest level of the hierarchy is identified to have a center with a shorter distance to the first embedding than other clusters in the priority queue; 
calculating embedding match scores between the input string and a first subset of the standardized entities represented by the subset of the embeddings based on distances between the subset of the embeddings and the first embedding in the vector space, wherein the distances represent semantic similarity; and 
modifying, based on the embedding match scores, content outputted in response to the input string within a user interface of an online system, wherein the modified content outputted comprises standardized entities that are semantically related to the entity.
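For illustration only, the search recited in claim 1 can be sketched in Python as a best-first descent over the cluster hierarchy. This is a minimal sketch under stated assumptions, not the Applicant's implementation: the Cluster class, the find_nearby function, the k_root and threshold parameters, the use of Euclidean distance, and the toy two-dimensional embeddings are all hypothetical and do not appear in the record.

```python
import heapq
import itertools
import math
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical node in the hierarchy of clusters."""
    center: tuple
    children: list = field(default_factory=list)  # empty at the lowest level
    members: dict = field(default_factory=dict)   # entity name -> embedding


def find_nearby(query, roots, k_root=2, threshold=1.0):
    """Return {entity: distance} for embeddings within `threshold` of `query`."""
    tie = itertools.count()  # tie-breaker so the heap never compares Cluster objects
    # Root level: keep the k_root clusters whose centers are closest to the query.
    nearest = sorted(roots, key=lambda c: math.dist(c.center, query))[:k_root]
    # Priority queue ordered by distance from each center to the query.
    heap = [(math.dist(c.center, query), next(tie), c) for c in nearest]
    heapq.heapify(heap)
    while heap:
        _, _, cluster = heapq.heappop(heap)
        if not cluster.children:
            # A lowest-level cluster reached the front of the queue: stop here.
            found = {}
            for name, emb in cluster.members.items():
                d = math.dist(emb, query)
                if d <= threshold:
                    found[name] = d
            return found
        # Expand the front cluster into its child clusters.
        for child in cluster.children:
            heapq.heappush(heap, (math.dist(child.center, query), next(tie), child))
    return {}


# Toy hierarchy with made-up entities and embeddings (assumed data).
a1 = Cluster(center=(0.0, 1.0), members={"software engineer": (0.0, 1.1)})
a2 = Cluster(center=(1.0, 0.0), members={"data engineer": (1.0, 0.1)})
b1 = Cluster(center=(10.0, 10.0), members={"nurse": (10.0, 10.0)})
roots = [Cluster(center=(0.0, 0.0), children=[a1, a2]),
         Cluster(center=(10.0, 10.0), children=[b1])]

result = find_nearby((0.1, 1.0), roots)
```

Because the priority queue is ordered by distance, the first lowest-level cluster to reach the front is the one whose center is closer to the first embedding than the center of every other queued cluster, which corresponds to the stopping condition recited in the claim.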
Claim 2. (Original) 
Claim 3. (Currently Amended) The method of claim 2, wherein the modifying of the content outputted in the user interface of the online system comprises: 
combining the embedding match scores and the similarity match scores into overall match scores between the input string and a third subset of the standardized entities, wherein each standardized entity in the third subset of the standardized entities is obtained from the first or second subsets of the standardized entities; 
applying a machine learning model to the overall match scores and features related to a set of documents containing one or more standardized entities in the third subset of the standardized entities to produce a set of relevance scores between the input string and the set of documents; and 
outputting, in the user interface, at least a portion of a ranking of the set of documents by the set of relevance scores.
Claim 4. (Currently Amended) The method of claim 3, wherein the combining of the embedding match scores and the similarity match scores into the overall match scores comprises: 
generating the overall match scores as linear combinations of the embedding match scores and the similarity match scores.
Claim 5. (Currently Amended) The method of claim 3, wherein the combining of the embedding match scores and the similarity match scores into the overall match scores comprises: 
when an embedding match score between the input string and a standardized entity falls below a threshold, excluding calculation of an overall match score between the input string and the standardized entity.
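For illustration only, the combination recited in claims 4 and 5 can be sketched as a weighted sum with a threshold check. This is a hedged sketch: the function name overall_match_scores, the weights, the min_embedding threshold, and the choice to treat a missing score as 0.0 are assumptions made for the example and are not taken from the record.

```python
def overall_match_scores(embedding_scores, similarity_scores,
                         w_embedding=0.5, w_similarity=0.5, min_embedding=0.2):
    """Linearly combine two score dictionaries keyed by standardized entity."""
    overall = {}
    # The third subset is drawn from the first or second subsets (their union).
    for entity in set(embedding_scores) | set(similarity_scores):
        # Claim 5: skip the calculation when the embedding score is below threshold.
        if entity in embedding_scores and embedding_scores[entity] < min_embedding:
            continue
        e = embedding_scores.get(entity, 0.0)   # assumption: missing score counts as 0.0
        s = similarity_scores.get(entity, 0.0)  # assumption: missing score counts as 0.0
        # Claim 4: overall score as a linear combination of the two scores.
        overall[entity] = w_embedding * e + w_similarity * s
    return overall


# Made-up scores for three hypothetical entities.
scores = overall_match_scores({"a": 0.8, "b": 0.1}, {"a": 0.4, "c": 0.6})
```

In this example, entity "b" is dropped because its embedding match score is below the assumed threshold, while "a" and "c" receive overall match scores.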
Claims 6-7. (Original) 
Claim 8. (Currently Amended) The method of claim 7, wherein the generating of the hierarchy of clusters from the embeddings further comprises: 
creating the hierarchy of clusters to comprise at least one of a first number of clusters at a root level of the hierarchy and or a second number of levels in the hierarchy.
Claims 9-10. (Original)
Claim 11. (Previously Presented) 




Claim 12. (Currently Amended) A system, comprising: 
one or more processors; and memory storing instructions that, when executed by the one or more processors, cause the system to: 
obtain: 
a first embedding produced by an embedding model from an input string representing an entity; and 
a hierarchy of clusters of embeddings generated by the embedding model from a set of standardized entities; 
search the hierarchy of clusters for a subset of the embeddings that are within a threshold proximity to the first embedding in a vector space, wherein the searching of the hierarchy of clusters comprises: 
identifying, at a root level of the hierarchy, a first subset of the clusters with centers that are closest to the first embedding; 
ordering the first subset of the clusters in a priority queue by distances between the centers of the first subset of the clusters and the first embedding in the vector space; 
iteratively expanding a first cluster of the first subset at a front of the priority queue into a set of child clusters of the first cluster in the first level of the hierarchy; and 
inserting the set of child clusters into the priority queue according to the distances to the first embedding until a second cluster in a lowest level of the hierarchy is identified to have a center with a shorter distance to the first embedding than other clusters in the priority queue; 
calculate embedding match scores between the input string and a first subset of the standardized entities represented by the subset of the embeddings based on distances between the subset of the embeddings and the first embedding in the vector space, wherein the distances represent semantic similarity; and 
modify, based on the embedding match scores, content outputted in response to the input string within a user interface of an online system, wherein the modified content outputted comprises standardized entities that are semantically related to the entity.
Claim 13. (Original) 
Claim 14. (Currently Amended) The system of claim 13, wherein the modifying of the content outputted in the user interface of the online system comprises: 
calculating the overall match scores for a third subset of the standardized entities as linear combinations of the embedding match scores and the similarity match scores, wherein each standardized entity in the third subset of the standardized entities is obtained from the first or second subsets of the standardized entities; 
applying a machine learning model to the overall match scores and features related to a set of documents containing one or more standardized entities in the third subset of the standardized entities to produce a set of relevance scores between the input string and the set of documents; and 
outputting, in the user interface, at least a portion of a ranking of the set of documents by the set of relevance scores.
Claim 15. (Original) 


Claim 16. (Currently Amended) The system of claim 15, wherein the generating of the hierarchy of clusters further comprises: 
creating a disjoint subset of the clusters at each level of the hierarchy.
Claim 17. (Original) 
Claim 18. (Currently Amended) The system of claim 12, wherein the modifying of the content outputted in the user interface of the online system comprises: 
applying a machine learning model to the embedding match scores and features related to a set of documents containing one or more standardized entities in the first subset of the standardized entities to produce a set of relevance scores between the input string and the set of documents; and 
outputting, in the user interface, at least a portion of a ranking of the set of documents by the set of relevance scores.
Claim 19. (Currently Amended) The system of claim [[12]] 18, wherein the modifying of the content outputted in the user interface of the online system further comprises: 
when an embedding match score between the input string and a standardized entity falls below a threshold, removing one or more documents containing the standardized entity from the set of documents prior to producing the set of relevance scores.
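For illustration only, the filtering and ranking recited in claims 18 and 19 can be sketched as follows. The sketch is hypothetical: rank_documents, the document dictionaries, the min_embedding threshold, and toy_model (a stand-in for the machine learning model, which the record does not specify) are all assumptions.

```python
def rank_documents(documents, embedding_scores, relevance_model, min_embedding=0.2):
    """Drop documents containing low-scoring entities, then rank the remainder."""
    # Entities whose embedding match score falls below the assumed threshold.
    low = {e for e, s in embedding_scores.items() if s < min_embedding}
    # Claim 19: remove such documents before any relevance score is produced.
    kept = [d for d in documents if not (set(d["entities"]) & low)]
    # Claim 18: apply the model to the match scores and document features.
    scored = [(relevance_model(d, embedding_scores), d["id"]) for d in kept]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]


def toy_model(doc, embedding_scores):
    """Stand-in for a trained model: sums the match scores of the document's entities."""
    return sum(embedding_scores.get(e, 0.0) for e in doc["entities"])


# Made-up documents and scores.
docs = [{"id": "d1", "entities": ["a"]},
        {"id": "d2", "entities": ["b"]},
        {"id": "d3", "entities": ["a", "c"]}]
ranking = rank_documents(docs, {"a": 0.9, "b": 0.1, "c": 0.5}, toy_model)
```

Here document "d2" is removed before scoring because it contains an entity with a low embedding match score, and the remaining documents are ordered by relevance score.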



Claim 20. (Currently Amended) A non-transitory computer-readable storage medium storing instructions that when executed by a computer cause the computer to perform a method, the method comprising: 
obtaining: 
a first embedding produced by an embedding model from an input string representing an entity; and 
a hierarchy of clusters of embeddings generated by the embedding model from a set of standardized entities; 
searching the hierarchy of clusters for a subset of the embeddings that are within a threshold proximity to the first embedding in a vector space, wherein the searching of the hierarchy of clusters comprises: 
identifying, at a root level of the hierarchy, a first subset of the clusters with centers that are closest to the first embedding; 
ordering the first subset of the clusters in a priority queue by distances between the centers of the first subset of the clusters and the first embedding in the vector space; 
iteratively expanding a first cluster of the first subset at a front of the priority queue into a set of child clusters of the first cluster in the first level of the hierarchy; and 
inserting the set of child clusters into the priority queue according to the distances to the first embedding until a second cluster in a lowest level of the hierarchy is identified to have a center with a shorter distance to the first embedding than other clusters in the priority queue; 
calculating embedding match scores between the input string and a first subset of the standardized entities represented by the subset of the embeddings based on distances between the subset of the embeddings and the first embedding in the vector space, wherein the distances represent semantic similarity; and 
modifying, based on the embedding match scores, content outputted in response to the input string within a user interface of an online system, wherein the modified content outputted comprises standardized entities that are semantically related to the entity.

Summary of Prior Art
The available prior art is summarized as follows:
Cordova-Diba et al. (Pub. No. US 2016/0042251) teaches generation of interactive content wherein a representation of candidate object(s) in content of a digital media asset is received. For each of the candidate object(s), feature(s) of the candidate object are compared to corresponding feature(s) of a plurality of reference objects to identify reference object(s) that match the candidate object. For each of the matched candidate object(s), a hotspot package is generated. The hotspot package may comprise a visual overlay which comprises information associated with the reference object(s) matched to the respective candidate object.
Wu et al. (Pub. No. US 2018/0150749) teaches providing a conversation session with an artificial intelligence entity that is associated with a business entity. Input is provided to an artificial intelligence entity advertisement system. The input is analyzed to determine the subject matter of the input. An artificial intelligence entity associated with the subject matter is then selected and provided to the user. The artificial intelligence entity recommends products or services that are provided by the business entity to the user.
Sirin et al. (Pub. No. US 2019/0384863) teaches obtaining templates, related to each graph data model of a graph data model set, for converting non-graph data representations in a non-graph database to graph data representations compatible with a graph database. One or more of the templates and the non-graph data representations may be provided to a neural network so that the neural network can predict additional templates.
Goval et al. (Pub. No. US 2012/0016864) teaches an optimized search engine index. The optimized index is formed by merging small lower level indexes of fresh documents together into a hierarchical cluster of multiple higher level indexes. The optimized index of fresh documents is formed via a single threaded process, while a fresh index serving platform concurrently serves fresh queries. The hierarchy of higher level indexes is formed by merging lower and/or higher level indexes with similar expiration times together. Therefore, as some indexes expire, the remaining un-expired indexes can be re-used and merged with new incoming indexes. The single threaded process provides fast serving of fresh documents, while also providing time to integrate the fresh indexes into a long term primary search engine index, prior to expiring.
Dahl et al. (Pat. No. US 11,030,257) teaches clustering media items in a semantic space to generate theme-based folders that organize media items by content theme. In particular, the disclosed systems can access media items that are stored in an original folder structure. Content-based tags can be generated for each media item in a collection of media items. Based on the generated tags, the collection of media items can be mapped to a semantic space and clustered.
Duffy (Pub. No. US 2015/0331908) teaches user identification of a desired document, in which a database is provided which identifies a collection of documents in an embedding space, the database identifying a distance between each pair of the documents in the embedding space corresponding to a predetermined measure of dissimilarity between the pair of documents. In dependence upon a user query, the system constrains the embedding space geometrically to develop a first candidate space, and identifies toward the user a first set of N1>1 candidate documents from the first candidate space based on calculated discriminativeness of the documents.
He et al. (Pub. No. US 2017/0060844) teaches providing semantically-relevant discovery of solutions. A computing device can receive an input, such as a query, and each word of the input is processed sequentially to determine a semantic representation of the input. A response to the input, such as an answer, is determined based on the semantic representation of the input matching a semantic representation of the response. An output including one or more relevant responses to the input can then be provided to the requestor.
Mansour et al. (Pub. No. US 2016/0232157) teaches retrieving/identifying a document comprising text stored in a document repository. A memory stores a graphical structure comprising a first plurality of nodes each representing a person, and a second plurality of nodes each representing a document in the document repository, the nodes being connected by edges according to automatically observed interactions between the represented people and documents. A node relatedness calculator computes distances between nodes of the graphical structure using the topic annotations. An input receives an identifier of a user who is represented by one of the first plurality of nodes. An identifier/retriever identifies one or more documents from the document repository by using the identifier and using the computed distances between nodes.

Reasons for Allowance
The following is an examiner's statement of reasons for allowance of Claims 1-20:
In interpreting the claims filed on 16 June 2022, in view of the updated search / examination, the interview dated 29 June 2022, and the available prior art, the Examiner finds the claimed invention to be patentably distinct from the prior art of record. Specifically, the prior art of record, individually or in combination, fails to explicitly teach, suggest or render obvious the claimed invention as recited in independent claims 1, 12, and 20.
The dependent claims are also allowed based on their dependencies on claims 1, 12, and 20.
Any comments considered necessary by the Applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”



Contact Information
Any inquiry concerning this communication or earlier communications from the Examiner should be directed to Son Hoang, whose telephone number is (571) 270-1752. The Examiner can normally be reached Monday – Friday (7:00 AM – 4:00 PM).
If attempts to reach the Examiner by telephone are unsuccessful, the Examiner’s supervisor, Usmaan Saeed, can be reached at (571) 272-4046. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SON T HOANG/
Primary Examiner, Art Unit 2169
June 29, 2022