DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Response to Amendment
The amendments submitted with the AFCP 2.0 application filed on 2022-06-15 have been entered. Subject to the Examiner’s Amendment below, Claims 3-4, 6-7, 10-11, 13-14, 17-18, and 20 are allowed.
EXAMINER’S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this Examiner’s Amendment was given by Rebecca Rudolph (#41,539) via telephonic interview on 2022-06-27 and email on 2022-06-30.
The application has been amended as follows for independent claims 1, 8, and 15:
Claim 1:
A text processing method based on ambiguous entity words, comprising:
obtaining a context of a text to be disambiguated and at least two candidate entities represented by the text to be disambiguated, wherein the at least two candidate entities have different semantics;
generating a semantic vector of the context based on a trained word vector model;
generating a first entity vector of each of the at least two candidate entities based on a trained unsupervised neural network model, wherein text semantics of respective entities and a relationship between respective entities have been learned by the unsupervised neural network model; 
determining a similarity between the context and each candidate entity according to the semantic vector of the context and the first entity vector of each of the at least two candidate entities; and
determining a target entity represented by the text to be disambiguated in the context from the at least two candidate entities according to the similarity between the context and each candidate entity,
wherein the unsupervised neural network model in which the text semantics of respective entities and the relationship between respective entities have been learned is obtained by the following steps:
inputting a semantic vector corresponding to a text of each entity in a preset knowledge base to a trained supervised neural network model, to generate a second entity vector of the entity, wherein semantics of respective entities have been learned by the supervised neural network model, and the second entity vector of the entity is an entity semantic vector that has learned the semantic of the entity;
initializing a first entity vector of each entity output by the unsupervised neural network model based on the second entity vector of the entity in the preset knowledge base; and
training the initialized unsupervised neural network model based on an association relationship between respective entities,
wherein, the determining the similarity between the context and each candidate entity according to the semantic vector of the context and the first entity vector of each of the at least two candidate entities comprises:
inputting the semantic vector of the context to the unsupervised neural network model, to obtain the first entity vector corresponding to the context, so as to calculate the similarity between the first entity vector corresponding to the context and the first entity vectors corresponding to the at least two candidate entities in a same vector space,
wherein generating a semantic vector of the context based on a trained word vector model comprises:
inputting the context of the text to be disambiguated to the trained word vector model to obtain a semantic vector corresponding to each word in the context of the text to be disambiguated, 
wherein initializing the first entity vector of each entity output by the unsupervised neural network model based on the second entity vector of the entity in the preset knowledge base comprises:
inputting each entity in the knowledge base to an untrained unsupervised neural network model to generate an initial first entity vector corresponding to the entity, wherein the initial first entity vector is a random number sequence generated randomly; and
replacing the initial first entity vector of each entity output from the unsupervised neural network model by the second entity vector corresponding to the entity output from the supervised neural network model,
wherein training the initialized unsupervised neural network model based on an association relationship between respective entities comprises:
training the initialized unsupervised neural network model based on entities in the knowledge base that have the association relationship, and/or based on entities in a search log that have a co-occurrence relationship; and
determining that training the unsupervised neural network model is finished when a distance between first entity vectors output by the unsupervised neural network model corresponds to a closeness between entities.
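
For illustration only, the similarity and target-selection steps recited in Claim 1 can be sketched as follows. This is a minimal sketch and not the applicant's implementation. It assumes cosine similarity as the similarity measure (the claims do not name one) and assumes the context vector has already been mapped into the same vector space as the first entity vectors. The function names and the sample vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def select_target_entity(context_vector, candidate_vectors):
    """Score every candidate against the context and return the best one.

    candidate_vectors maps a candidate entity name to its first entity vector.
    """
    scores = {
        name: cosine_similarity(context_vector, vector)
        for name, vector in candidate_vectors.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical vectors, chosen only to make the example concrete.
context_vector = [0.9, 0.1, 0.0]
candidates = {
    "Apple (fruit)": [0.1, 0.9, 0.2],
    "Apple (company)": [0.8, 0.2, 0.1],
}
target, scores = select_target_entity(context_vector, candidates)
```

In this sketch the candidate whose vector lies closest in direction to the context vector is returned as the target entity.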

Claim 8:
A text processing device based on ambiguous entity words, comprising:
one or more processors;
a memory;
one or more programs, stored in the memory, when executed by the one or more processors, configured to perform following actions:
obtaining a context of a text to be disambiguated and at least two candidate entities represented by the text to be disambiguated, wherein the at least two candidate entities have different semantics;
generating a semantic vector of the context based on a trained word vector model;
generating a first entity vector of each of the at least two candidate entities based on a trained unsupervised neural network model, wherein text semantics of respective entities and a relationship between respective entities have been learned by the unsupervised neural network model; 
determining a similarity between the context and each candidate entity according to the semantic vector of the context and the first entity vector of each of the at least two candidate entities; and
determining a target entity represented by the text to be disambiguated in the context from the at least two candidate entities according to the similarity between the context and each candidate entity,
wherein the unsupervised neural network model in which the text semantics of respective entities and the relationship between respective entities have been learned is obtained by the following steps:
inputting a semantic vector corresponding to a text of each entity in a preset knowledge base to a trained supervised neural network model, to generate a second entity vector of the entity, wherein semantics of respective entities have been learned by the supervised neural network model, and the second entity vector of the entity is an entity semantic vector that has learned the semantic of the entity;
initializing a first entity vector of each entity output by the unsupervised neural network model based on the second entity vector of the entity in the preset knowledge base; and
training the initialized unsupervised neural network model based on an association relationship between respective entities,
wherein, the determining the similarity between the context and each candidate entity according to the semantic vector of the context and the first entity vector of each of the at least two candidate entities comprises:
inputting the semantic vector of the context to the unsupervised neural network model, to obtain the first entity vector corresponding to the context, so as to calculate the similarity between the first entity vector corresponding to the context and the first entity vectors corresponding to the at least two candidate entities in a same vector space,
wherein generating a semantic vector of the context based on a trained word vector model comprises:
inputting the context of the text to be disambiguated to the trained word vector model to obtain a semantic vector corresponding to each word in the context of the text to be disambiguated, 
wherein initializing the first entity vector of each entity output by the unsupervised neural network model based on the second entity vector of the entity in the preset knowledge base comprises:
inputting each entity in the knowledge base to an untrained unsupervised neural network model to generate an initial first entity vector corresponding to the entity, wherein the initial first entity vector is a random number sequence generated randomly;
replacing the initial first entity vector of each entity output from the unsupervised neural network model by the second entity vector corresponding to the entity output from the supervised neural network model,
wherein training the initialized unsupervised neural network model based on an association relationship between respective entities comprises:
training the initialized unsupervised neural network model based on entities in the knowledge base that have the association relationship, and/or based on entities in a search log that have a co-occurrence relationship; and
determining that training the unsupervised neural network model is finished when a distance between first entity vectors output by the unsupervised neural network model corresponds to a closeness between entities.
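
For illustration only, the training step recited above (training on entities that have an association relationship until the distance between first entity vectors corresponds to the closeness between entities) can be sketched as follows. The claims do not specify an objective function or a stopping threshold. The update rule, the Euclidean distance, the tolerance `tol`, and all names below are assumptions made solely for this sketch.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def train_on_associations(vectors, related_pairs, lr=0.1, tol=0.05, max_epochs=1000):
    """Pull the vectors of associated entities toward each other.

    Stops once every associated pair is closer than `tol`, which stands in
    for "distance corresponds to closeness" in this toy example.
    Returns the trained vectors and the number of epochs run.
    """
    vecs = {name: list(vec) for name, vec in vectors.items()}  # do not mutate input
    for epoch in range(max_epochs):
        for a, b in related_pairs:
            va, vb = vecs[a], vecs[b]
            for i in range(len(va)):
                delta = lr * (vb[i] - va[i])
                va[i] += delta  # move a toward b
                vb[i] -= delta  # move b toward a
        if all(euclidean(vecs[a], vecs[b]) < tol for a, b in related_pairs):
            return vecs, epoch + 1
    return vecs, max_epochs


# Hypothetical initialized vectors; only A and B are associated.
initial = {"A": [0.0, 0.0], "B": [1.0, 0.0], "C": [5.0, 5.0]}
trained, epochs = train_on_associations(initial, [("A", "B")])
```

Associated pairs could equally be drawn from co-occurrences in a search log; the sketch treats both sources as a list of entity pairs.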

Claim 15:
A non-transitory computer readable storage medium, having stored therein computer programs that, when executed by a processor, implements the text processing method based on ambiguous entity words, wherein the method comprising:
obtaining a context of a text to be disambiguated and at least two candidate entities represented by the text to be disambiguated, wherein the at least two candidate entities have different semantics;
generating a semantic vector of the context based on a trained word vector model;
generating a first entity vector of each of the at least two candidate entities based on a trained unsupervised neural network model, wherein text semantics of respective entities and a relationship between respective entities have been learned by the unsupervised neural network model; 
determining a similarity between the context and each candidate entity according to the semantic vector of the context and the first entity vector of each of the at least two candidate entities; and
determining a target entity represented by the text to be disambiguated in the context from the at least two candidate entities according to the similarity between the context and each candidate entity,
wherein the unsupervised neural network model in which the text semantics of respective entities and the relationship between respective entities have been learned is obtained by the following steps:
generating a second entity vector of each entity in a preset knowledge base by using a trained supervised neural network model, wherein semantics of respective entities have been learned by the supervised neural network model;
initializing first entity vectors of respective entities output by the unsupervised neural network model based on the second entity vector of each entity in the preset knowledge base; and
training the initialized unsupervised neural network model based on an association relationship between respective entities,
wherein, the determining the similarity between the context and each candidate entity according to the semantic vector of the context and the first entity vector of each of the at least two candidate entities comprises:
inputting the semantic vector of the context to the unsupervised neural network model, to obtain the first entity vector corresponding to the context, so as to calculate the similarity between the first entity vector corresponding to the context and the first entity vectors corresponding to the at least two candidate entities in a same vector space,
wherein generating a semantic vector of the context based on a trained word vector model comprises:
inputting the context of the text to be disambiguated to the trained word vector model to obtain a semantic vector corresponding to each word in the context of the text to be disambiguated, 
wherein initializing the first entity vector of each entity output by the unsupervised neural network model based on the second entity vector of the entity in the preset knowledge base comprises:
inputting each entity in the knowledge base to an untrained unsupervised neural network model to generate an initial first entity vector corresponding to the entity, wherein the initial first entity vector is a random number sequence generated randomly;
replacing the initial first entity vector of each entity output from the unsupervised neural network model by the second entity vector corresponding to the entity output from the supervised neural network model,
wherein training the initialized unsupervised neural network model based on an association relationship between respective entities comprises:
training the initialized unsupervised neural network model based on entities in the knowledge base that have the association relationship, and/or based on entities in a search log that have a co-occurrence relationship; and
determining that training the unsupervised neural network model is finished when a distance between first entity vectors output by the unsupervised neural network model corresponds to a closeness between entities.






REASONS FOR ALLOWANCE
The following is an examiner’s statement of reasons for allowance.
The prior art of record does not teach or render obvious the following limitation: “initializing a first entity vector of each entity output by the unsupervised neural network model based on the second entity vector of the entity in the preset knowledge base; wherein initializing the first entity vector of each entity output by the unsupervised neural network model based on the second entity vector of the entity in the preset knowledge base comprises: inputting each entity in the knowledge base to an untrained unsupervised neural network model to generate an initial first entity vector corresponding to the entity, wherein the initial first entity vector is a random number sequence generated randomly; replacing the initial first entity vector of each entity output from the unsupervised neural network model by the second entity vector corresponding to the entity output from the supervised neural network model”.
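For illustration only, the two-step initialization recited in the above limitation can be sketched as follows. This is a minimal sketch under stated assumptions: the untrained unsupervised model is represented simply as a table of per-entity vectors, the random values are drawn uniformly from [-1, 1], and the function and variable names are hypothetical. None of these details are taken from the application.

```python
import random


def init_entity_vectors(entity_ids, supervised_vectors, dim, seed=0):
    """Return (random_init, initialized) first entity vectors.

    random_init: the randomly generated initial first entity vectors.
    initialized: the same table after each initial vector has been replaced
                 by the entity's second entity vector from the supervised model.
    """
    rng = random.Random(seed)
    # Step 1: the untrained model yields a random number sequence per entity.
    random_init = {
        entity: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for entity in entity_ids
    }
    # Step 2: replace each random vector with the supervised model's vector.
    initialized = {
        entity: list(supervised_vectors[entity]) for entity in entity_ids
    }
    return random_init, initialized


# Hypothetical second entity vectors output by a supervised model.
supervised = {"e1": [0.2, 0.4], "e2": [0.7, 0.1]}
random_init, first_vectors = init_entity_vectors(["e1", "e2"], supervised, dim=2)
```

After this step, training of the unsupervised model would start from `first_vectors` rather than from the random values.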
The closest pieces of prior art are the following.
Zwicklbauer et al. (“Robust and Collective Entity Disambiguation through Semantic Embeddings”, hereinafter “Zwicklbauer”) teaches a system for entity disambiguation that calculates semantic embeddings using Word2Vec and Doc2Vec and identifies candidate entity meanings for ambiguous “surface forms” by “comput[ing] the cosine similarity between the entity-context embeddings and the Doc2Vec inferred context vector of the surface form.” However, Zwicklbauer does not teach a combination of an unsupervised and a supervised neural network model in which output from the supervised neural network model is used to initialize the output of the unsupervised neural network model during training of the unsupervised neural network model.
Amiri (“Learning Text Pair Similarity with Context-sensitive Autoencoders”) discloses an unsupervised neural network model (an autoencoder) that “encodes input text into context-sensitive representations and uses them to compute similarity between text pairs.” However, Amiri does not teach initializing the unsupervised neural network model with output from a supervised neural network model during training of the unsupervised neural network model.
Zhang et al. (CN 107102989 A) discloses a disambiguation method based on semantic feature vectors constructed from text information in a knowledge base, wherein the candidate with the maximum cosine similarity is chosen as the target entity. However, Zhang also does not teach the above limitation.
Francis-Landau et al. (“Capturing Semantic Similarity for Entity Linking with Convolutional Neural Networks”) discloses inputting Word2Vec vectors into a convolutional neural network to produce vectors that capture semantic correspondence between a context and a target entity. However, Francis-Landau also does not teach the above limitation.
Perianin et al. (“Exploiting Synonymy and Hypernymy to Learn Efficient Meaning Representations”) discloses using an unsupervised neural network (an autoencoder) to learn representations for each meaning of a word for use in word sense disambiguation. However, Perianin also does not teach the above limitation.
Meij et al. (US 2016/0189047 A1) discloses entity linking using Word2Vec vectors with negative sampling and calculating cosine distance between vectors to identify the target entity. However, Meij also does not teach the above limitation.
For at least these reasons, independent claims 1, 8, and 15 are allowable over the prior art of record. Claims 3-4, 6-7, 10-11, 13-14, 17-18, and 20 are allowable by virtue of their dependence from their respective base claims.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEONARD A SIEGER, whose telephone number is (571) 272-9710. The examiner can normally be reached M-F 8:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached at (571) 272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/L.A.S./ Examiner, Art Unit 2126
/NICHOLAS KLICOS/ Primary Examiner, Art Unit 2145