Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This is in response to U.S. Patent Application No. 17/211,669 filed on 05/06/2021.
Claims 1 - 20 are currently pending for consideration.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 11/23/2021 was filed before the mailing date of the non-final office action. The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Title of the Invention
37 C.F.R. 1.72(a) states: "The title of the invention may not exceed 500 characters in length and must be as short and specific as possible" (emphasis added). Thus, the title of the invention “METHOD AND APPARATUS FOR TRAINING NATURAL LANGUAGE PROCESSING MODEL, DEVICE AND STORAGE MEDIUM” is not sufficiently descriptive. A new title is required that is more clearly and more specifically indicative of the invention to which the claims are directed. The examiner suggests “METHOD AND APPARATUS FOR TRAINING NATURAL LANGUAGE PROCESSING MODEL FOR COREFERENCE RESOLUTION TASKS” or others.

Claim Interpretation
During patent examination, pending claims must be “given their broadest reasonable interpretation consistent with the specification.” MPEP 2111; See also, MPEP 2173.02. Limitations appearing in the specification but not recited in the claim are not read into the claim. In re Prater, 415 F.2d 1393, 1404-05, 162 USPQ 541, 550-551 (CCPA 1969). See also, In re Zletz, 893 F.2d 319, 321-22, 13 USPQ2d 1320, 1322 (Fed. Cir. 1989) (“During patent examination the pending claims must be interpreted as broadly as their terms reasonably allow”).
The reason is simply that during patent prosecution when claims can be amended, ambiguities should be recognized, scope and breadth of language explored, and clarification imposed. An essential purpose of patent examination is to fashion claims that are precise, clear, correct, and unambiguous. Only in this way can uncertainties of claim scope be removed, as much as possible, during the administrative process.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter, namely a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.
While independent claims 1, 9, and 17 are each directed to a statutory category (method or manufacture), they each recite a series of steps pertaining to receiving data and identifying content to be displayed based on the received data, which appears to be directed to an abstract idea.  
That is, each of independent claims 1, 9, and 17 pertains to receiving data and to identifying and manipulating data for presentation based on the received data. This concept is not meaningfully different from those concepts found by the courts to be abstract (see: Intellectual Ventures v. Cap One Bank 1077: collecting, displaying and manipulating data; Intellectual Ventures v. Cap One Bank: customizing information and presenting it to users based on particular characteristics; Content Extraction: collecting data, recognizing certain data within the collected data set and storing the recognized data in memory; Electric Power: collecting information, analyzing it, and displaying certain results of the collection and analysis).
At step 2A, prong 1, the claim recites the limitations of “constructing training language material pairs…”, “training the natural language processing model with training language material pairs of a coreference resolution task…”, and “training the natural language processing model with the positive samples of the training language material pairs…”. The broadest reasonable interpretation (BRI) of these limitations encompasses, for example, a person collecting data, recognizing certain data within the collected data set, and storing the recognized data for further training, all of which can be done mentally based on what is observed in the collected data set. These steps, recited at a high level of generality, encompass mental observations, evaluations and judgments, and are similar to “collecting information, analyzing it, and displaying or processing certain results of the collection and analysis,” where the data analysis steps are recited at a high level of generality such that they could practically be performed in the human mind, and to “a claim to collecting and comparing known information (claim 1), which are steps that can be practically performed in the human mind”, see MPEP 2106.04(a)(2)(III)(A). Accordingly, the claim recites an abstract idea.

At step 2A, prong 2, the judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements “wherein the training language material pair comprises a positive sample and a negative sample”, “… to learn the capability of recognizing corresponding positive samples and negative samples”, and “… to learn the capability of the coreference resolution task”. The “a positive sample and a negative sample”, “recognizing… positive samples and negative samples …”, and “learn the coreference resolution task” steps only amount to extra-solution activities of collecting and recognizing data for use in the method and outputting a result (MPEP 2106.05(g)). None of these limitations, taken either alone or in combination, integrates the abstract idea into a practical application.

At Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of “comprising a positive sample and a negative sample”, “recognizing corresponding positive samples and negative samples”, and “learn the coreference resolution task” only amount to further suggestions of applying the judicial exception in a particular technological environment. Furthermore, the “train to learn” and further “train to learn” steps recite well-understood, routine, and conventional (WURC) activities of “collecting information, analyzing it, and displaying certain results of the collection and analysis and further collecting with certain results” (Electric Power). These additional limitations, taken either alone or in combination, fail to amount to significantly more than the judicial exception as these steps do not provide an inventive concept. The claim is not patent eligible.
Independent claims 9 and 17 are rejected using similar analysis as claim 1.
For the dependent claims with further “replacing a target noun… with a noun… acquiring other nouns… taking… as the positive sample… taking… as the negative sample” of claim 2, “inputting… learn to predict reference relationship… adjust… to predict the correct reference relationships…” of claim 3, “masking the pronoun… inputting… predicts the probability… judging… adjust the parameters… ”  of claim 4, “inputting… to predict the reference relationships… adjust…” of claim 5, “masking the pronoun… inputting… predicts the probability… judging… adjust the parameters…” of claim 6, “acquiring the probability… constructing a first loss function… constructing a second loss function… generating the target loss function…” of claim 7, and “acquiring the probability… constructing a first loss function… constructing a second loss function… generating the target loss function…” of claim 8, these additional elements do not integrate the abstract idea into a practical application and do not amount to anything more than merely manipulating and displaying content on a generic/conventional display. 
For at least these reasons, the claimed inventions of each of dependent claims 2-8, 10-16, and 18-20 are directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more and are rejected under 35 USC 101.

Examiner Note
The Examiner cites particular columns, line numbers and/or paragraph numbers in the references as applied to the claims below for the convenience of the Applicant(s). Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the Applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 3 – 8, 11 – 16, and 19 – 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claims 3 – 6, 11 – 14, and 19 – 20 recite the phrase “such that”. The use of the phrase “such that” creates two mutually exclusive interpretations of this limitation. Under the first interpretation, “such that” merely indicates that the natural language processing model learning to predict (or predicting) is an intended result of the inputting step and is, therefore, accorded no patentable weight. Under the second interpretation, “such that” indicates that the natural language processing model is affirmatively required to learn to predict (or to predict). “[I]f a claim is amenable to two or more plausible claim constructions, the USPTO is justified in requiring the applicant to more precisely define the metes and bounds of the claimed invention by holding the claim unpatentable under 35 U.S.C. § 112, second paragraph, as indefinite.” Ex parte Miyazaki, 89 USPQ2d 1207, 1211 (BPAI 2008) (precedential). See also Ex parte McAward, Appeal 2015-006416 (PTAB 2017) (precedential) (affirming the holding in Ex parte Miyazaki).
Therefore, these claims and their dependent claims 7-8 and 15-16 are indefinite.
Claims 2, 10 and 18 recite the terms “a coreference resolution task” and “a preset language material set”. Accordingly, it is unclear whether these terms refer to the same “coreference resolution task” and “preset language material set” recited in the corresponding independent claims or to different ones.
For purposes of examination, the examiner interpreted the claims as being the same “coreference resolution task” and “preset language material set”.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claim(s) 1 – 6, 9 – 14, and 17 – 20 are rejected under 35 U.S.C. 103 as being unpatentable over Qin et al. (CN 101901213 A, “Qin”) in view of Subramanya et al. (US 9514098 B1, “Sub”).
As to claim 1, Qin discloses A method for training a natural language processing model, comprising:
constructing training language material pairs of a coreference resolution task based on a preset language material set, wherein each training language material pair comprises a positive sample and a negative sample; (Qin: [0008-9, Abstract] A natural language pre-processing to bottom of the training corpus (i.e. preset language material set), extracting between possible candidate noun phrase coreference relationship exists; the marking of a noun phrase and a corpus-referring chain is extracted noun phrases, comprises a positive/negative training examples (i.e. training language material pair);… for constructing a dynamic generalized co-reference resolution (i.e. task) method based on examples).
training the natural language processing model with the positive samples of the training language material pairs to enable the natural language processing model to learn the capability of the coreference resolution task. (Qin: [Claim 3] the dynamic generalized co-reference resolution method based on constructed training positive/negative examples comprises: … existing co-reference relationship from a pair of positive examples).
The examiner notes that “to enable” is intended-use language and is accorded no patentable weight.
However, Qin may not explicitly disclose all the aspects of the training the natural language processing model with the training language material pairs to enable the natural language processing model to learn the capability of recognizing corresponding positive samples and negative samples; and
Sub discloses training the natural language processing model with the training language material pairs to enable the natural language processing model to learn the capability of recognizing corresponding positive samples and negative samples; and (Sub: [col 5 ln 57-62, col 12 ln 35-37] a language processor identify a language include training co-reference data for the language…based on such labeled data as a training set to determine learned co-reference embeddings,… the coreference resolver may identify the candidate antecedent noun phrase may be determined to be in a positive and a negative sets of candidate antecedents (i.e. sample). Such positive and negative training pairs are utilized to learn coreference embeddings).
The examiner notes that “to enable” is intended-use language and is accorded no patentable weight.
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that, with both Qin and Sub disclosing natural language processing and therefore being analogous art from the “same field of endeavor”, and when Sub's training co-reference data for the natural language corresponding to positive and negative sets of candidate antecedents was combined with Qin's constructing a dynamic generalized co-reference resolution model with pre-processed corpus, the claimed limitation on the training the natural language processing model with the training language material pairs to enable the natural language processing model to learn the capability of recognizing corresponding positive samples and negative samples would be obvious. The motivation to combine Qin and Sub is to provide a method for determining coreference resolution using distributed word representations with efficiency. (See Sub [col 1 ln 23-25]).

As to claim 2, Qin in view of Sub discloses The method according to claim 1, wherein the constructing training language material pairs of a coreference resolution task based on a preset language material set comprises:
for each language material in the preset language material set, replacing a target noun which does not appear for the first time in the corresponding language material with a pronoun as a training language material; (Sub: [col 11 ln 19-34, col 23 ln 45-48] a noun phrase may be identified as a null antecedent in a collection of co-referential noun phrases… the null antecedent may be the first appearance of the noun phrase in the text document… replacing the given noun phrase (i.e. target noun)… in response to selecting the candidate noun phrase as the antecedent (i.e. pronoun) for the given noun phrase).
The examiner notes that Sub discloses [col 8 ln 19-43] the coreference resolver may determine the referring/antecedent feature representation for a noun phrase. The referring and/or antecedent feature include one or more of a type of mention (e.g., named, nominal, pronominal), a type of entity (e.g., person, location, organization), number of words in the noun phrase, and a gender associated with the noun phrase. The referring/antecedent feature representation is the pronoun.
acquiring other nouns from the training language material; (Sub: [col 1 ln 51-64] identifying distributed word representations for one or more noun phrases (i.e. other nouns)… determining, for each of the one or more noun phrases, a referring/antecedent feature representation, where the referring/antecedent feature representation for a given noun phrase of the one or more noun phrases may include the distributed word representation as the training language materials for the given noun phrase)
taking the training language material and the reference relationship of the pronoun to the target noun as the positive sample of the training language material pair; and (Sub: [col 12 ln 33-52] for each referring noun phrase i, a candidate antecedent noun phrase j (i.e. reference relationship) may be identified. The candidate antecedent noun phrase j may be determined to be in a positive set of candidate antecedents a.sup.+(i),… A noun phrase j belongs to a.sup.+(i) if (j,i) is a positive training pair for coreference… Such positive and negative training pairs are utilized to learn coreference embeddings).
taking the training language material and the reference relationships of the pronoun to other nouns as the negative samples of the training language material pair. (Sub: [col 12 ln 33-52] for each referring noun phrase i, a candidate antecedent noun phrase j (i.e. reference relationship) may be identified. The candidate antecedent noun phrase j may be determined to be in… a negative set of candidate antecedents a.sup.−(i)… The noun phrase j may belong to a.sup.−(i) if (j,i) is a negative training pair for coreference… Such positive and negative training pairs are utilized to learn coreference embeddings).
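For illustration only, and not as a characterization of the claims or of either cited reference, the pair construction recited in claim 2 can be sketched as follows. The names `Sample` and `build_pair` are hypothetical, the noun list is assumed to be supplied by the caller rather than produced by a part-of-speech tagger, and the sketch replaces only the second occurrence of the target noun.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    text: str     # training language material after pronoun substitution
    pronoun: str  # the substituted pronoun
    noun: str     # candidate noun the pronoun is said to refer to
    label: int    # 1 = positive sample, 0 = negative sample


def build_pair(tokens, nouns, target, pronoun="it"):
    """Build one positive sample and its negative samples (hypothetical helper)."""
    positions = [i for i, tok in enumerate(tokens) if tok == target]
    if len(positions) < 2:
        return None  # the target noun never reappears, so nothing is replaced
    replaced = list(tokens)
    replaced[positions[1]] = pronoun  # a non-first occurrence becomes the pronoun
    text = " ".join(replaced)
    positive = Sample(text, pronoun, target, 1)
    # Every other noun still present in the material yields a negative sample.
    negatives = [Sample(text, pronoun, n, 0)
                 for n in nouns if n != target and n in replaced]
    return positive, negatives
```

Under these assumptions, the sentence "Alice met Bob after Alice left work" with target noun "Alice" and pronoun "she" yields one positive sample (she refers to Alice) and two negative samples (she refers to Bob; she refers to work).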

As to claim 3, Qin in view of Sub discloses The method according to claim 1, wherein the training the natural language processing model with training language material pairs to enable the natural language processing model to learn the capability of recognizing corresponding positive samples and negative samples comprises: 
inputting each training language material pair into the natural language processing model, such that the natural language processing model learns to predict whether the reference relationships in the positive sample and the negative sample are correct or not; and (Sub [col 7 ln 1-9, Claim 1] the coreference resolver accesses the content database to retrieve (i.e. inputting) stored distributed word representations for the one or more noun phrases in the text segment as training language material pair for a feed-forward neural network language model includes several variables… determining, a score based on the distance measure between the given noun phrase and the antecedent (i.e. reference relationships);... selecting as the positive/negative sample... based on the determined score (i.e. correct or not)).
when the prediction is wrong, adjusting the parameters of the natural language processing model to adjust the natural language processing model to predict the correct reference relationships in the positive samples and the negative samples. (Sub [col 22 ln 21-29, Claim 1] utilizing each of variations and modifications within the scope of the all parameters… and configurations for performing the function and/or obtaining the results for the natural language processing model… modifying… distributed word representations for a plurality of noun phrases with training pair of referring/antecedent feature based on the determined score (i.e. prediction result) for the prediction of the lower score is a wrong prediction… based on labeled data includes co-referential (i.e. reference relationship) annotations to identify positive and negative sets of candidate antecedents).

As to claim 4, Qin in view of Sub discloses The method according to claim 1, wherein the training the natural language processing model with the positive samples of training language material pairs to enable the natural language processing model to learn the capability of the coreference resolution task comprises:
masking the pronoun in the training language material of the positive sample of each training language material pair; (Qin: [0009] the marking (i.e. masking) of a noun phrase and a corpus-referring (i.e. pronoun) chain is extracted noun phrases for positive/negative training examples).
inputting the training language material with the masked pronoun into the natural language processing model, such that the natural language processing model predicts the probability that the pronoun belongs to each noun in the training language material; (Sub: [col 12 ln 26-32, col 7 ln 14-16] the coreference resolver access the content database to retrieve training data that may include labeled data. Such labeled data may include co-referential annotations of the one or more noun phrases appearing in the text. Based on such that of a word W.sub.t is computed from a context of previous n words W.sub.t-n, . . . W.sub.t-1… represent a probability distribution prediction on contextual analysis).
based on the probability that the pronoun belongs to each noun in the training language material predicted by the natural language processing model, and the target noun to which the pronoun marked in the positive sample refers, generating a target loss function; (Sub: [col 7 ln 14-38] The model includes an output layer that represent a probability distribution of W.sub.t based on contextual analysis… generating a loss function, a learning rate for a stochastic gradient descent (“SGD”) utilized in determining the distributed word representation, and model architecture… a hinge-loss function is selected that associates a higher score with the word w selected from the correct (i.e. positive sample) next word than for a randomly selected incorrect word ŵ from the vocabulary).
judging whether the target loss function is converged; and (Sub: [col 6 ln 2-7] The coreference resolver utilize the labeled data as a training set to determine learned coreference embeddings of the one or more noun phrases… based on optimizing (i.e. converged) a loss function relevant to coreference resolution).
adjusting the parameters of the natural language processing model based on a gradient descent method if the target loss function is not converged. (Sub: [col 7 ln 21-25] Additional alternate variables (i.e. parameters) may be utilized, including a learning rate for a stochastic gradient descent (“SGD”) utilized in determining the distributed word representation, and model architecture… if an error used in a loss function is not optimized).

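For illustration only, the sequence recited in claim 4 (predict a probability for each candidate noun, generate a loss from the marked target noun, judge convergence, and otherwise adjust parameters by gradient descent) can be sketched with a toy linear scorer. This is not the applicant's model or the model of any cited reference. The function name `train_masked_pronoun`, the per-noun feature vectors, the softmax scorer, the negative log-likelihood loss, and the convergence tolerance are all assumptions made for the sketch.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_masked_pronoun(features, target_idx, lr=0.5, tol=1e-6, max_steps=10_000):
    """features[j] is an assumed feature vector for candidate noun j."""
    dim = len(features[0])
    w = [0.0] * dim
    prev_loss = float("inf")
    for _ in range(max_steps):
        # Probability that the masked pronoun belongs to each candidate noun.
        scores = [sum(wi * xi for wi, xi in zip(w, f)) for f in features]
        probs = softmax(scores)
        # Loss based on the target noun marked in the positive sample.
        loss = -math.log(probs[target_idx])
        # Judge convergence: stop once the loss no longer changes appreciably.
        if abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
        # Not converged: one gradient descent step on the parameters.
        for k in range(dim):
            grad_k = sum((p - (1.0 if j == target_idx else 0.0)) * f[k]
                         for j, (p, f) in enumerate(zip(probs, features)))
            w[k] -= lr * grad_k
    return w, probs, loss
```

In this sketch, convergence is judged by the change in loss between successive steps; other criteria (for example, a gradient-norm threshold) would serve equally well.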
As to claim 5, Qin in view of Sub discloses The method according to claim 2, wherein the training the natural language processing model with training language material pairs to enable the natural language processing model to learn the capability of recognizing corresponding positive samples and negative samples comprises:
inputting each training language material pair into the natural language processing model, such that the natural language processing model learns to predict whether the reference relationships in the positive sample and the negative sample are correct or not; and (Sub [col 7 ln 1-9, Claim 1] the coreference resolver accesses the content database to retrieve (i.e. inputting) stored distributed word representations for the one or more noun phrases in the text segment as training language material pair for a feed-forward neural network language model includes several variables… determining, a score based on the distance measure between the given noun phrase and the antecedent (i.e. reference relationships);... selecting as the positive/negative sample... based on the determined score (i.e. correct or not)).
when the prediction is wrong, adjusting the parameters of the natural language processing model to adjust the natural language processing model to predict the correct reference relationships in the positive samples and the negative samples. (Sub [col 22 ln 21-29, Claim 1] utilizing each of variations and modifications within the scope of the all parameters… and configurations for performing the function and/or obtaining the results for the natural language processing model… modifying… distributed word representations for a plurality of noun phrases with training pair of referring/antecedent feature based on the determined score (i.e. prediction result) for the prediction of the lower score is a wrong prediction… based on labeled data includes co-referential (i.e. reference relationship) annotations to identify positive and negative sets of candidate antecedents).

As to claim 6, Qin in view of Sub discloses The method according to claim 2, wherein the training the natural language processing model with the positive samples of training language material pairs to enable the natural language processing model to learn the capability of the coreference resolution task comprises:
masking the pronoun in the training language material of the positive sample of each training language material pair; (Qin: [0009] the marking (i.e. masking) of a noun phrase and a corpus-referring (i.e. pronoun) chain is extracted noun phrases for positive/negative training examples).
inputting the training language material with the masked pronoun into the natural language processing model, such that the natural language processing model predicts the probability that the pronoun belongs to each noun in the training language material; (Sub: [col 12 ln 26-32, col 7 ln 14-16] the coreference resolver access the content database to retrieve training data that may include labeled data. Such labeled data may include co-referential annotations of the one or more noun phrases appearing in the text. Based on such that of a word W.sub.t is computed from a context of previous n words W.sub.t-n, . . . W.sub.t-1… represent a probability distribution prediction on contextual analysis).
based on the probability that the pronoun belongs to each noun in the training language material predicted by the natural language processing model, and the target noun to which the pronoun marked in the positive sample refers, generating a target loss function; (Sub: [col 7 ln 14-38] The model includes an output layer that represent a probability distribution of W.sub.t based on contextual analysis… generating a loss function, a learning rate for a stochastic gradient descent (“SGD”) utilized in determining the distributed word representation, and model architecture… a hinge-loss function is selected that associates a higher score with the word w selected from the correct (i.e. positive sample) next word than for a randomly selected incorrect word ŵ from the vocabulary).
judging whether the target loss function is converged; and (Sub: [49] a loss function is identified based on the positive and negative sets of candidate antecedents and the inner product).
adjusting the parameters of the natural language processing model based on a gradient descent method if the target loss function is not converged. (Sub: [44] coreference embeddings learning model is based on an iterative algorithm, such as an algorithm that optimizes (i.e. adjusts) a loss function relevant to coreference resolution… a stochastic gradient descent method is utilized to learn the coreference embeddings… for a number of incorrect candidate antecedent noun phrases covered).

Regarding claims 9 – 14, these claims recite a device corresponding to the method of claims 1 – 6, respectively; therefore, the same rationale of rejection is applicable.
Regarding claims 17 – 20, these claims recite a computer readable storage medium corresponding to the method of claims 1 – 4, respectively; therefore, the same rationale of rejection is applicable.

Claim(s) 7 – 8 and 15 – 16 are rejected under 35 U.S.C. 103 as being unpatentable over Qin in view of Sub and further in view of Sim et al. (US 20200134442 A1, “Sim”).

As to claim 7, Qin in view of Sub discloses The method according to claim 4, wherein the based on the probability that the pronoun belongs to each noun in the training language material predicted by the natural language processing model, and the target noun to which the pronoun marked in the positive sample refers, generating a target loss function comprises: 
However, Qin in view of Sub may not explicitly disclose all the aspects of the acquiring the probability that the pronoun belongs to the target noun predicted by the natural language processing model based on the target noun to which the pronoun marked in the positive sample refers;
constructing a first loss function based on the probability that the pronoun belongs to the target noun predicted by the natural language processing model;
constructing a second loss function based on the probabilities that the pronoun belongs to other nouns than the target noun predicted by the natural language processing model; and 
generating the target loss function based on the first loss function and the second loss function.
Sim discloses acquiring the probability that the pronoun belongs to the target noun predicted by the natural language processing model based on the target noun to which the pronoun marked in the positive sample refers; (Sim: [0092, 0062] receives the learned representation for the input sequence from the sample representation program and predicts the probability of the sample constituting a task… a task classifier can be trained by the machine learning program, based on the adapted source samples as “positive” sample to predict the task labels on the target samples from the target corpus).
Sim discloses constructing a first loss function based on the probability that the pronoun belongs to the target noun predicted by the natural language processing model; (Sim: [0044] Model NN designs a training process to arrive at appropriate weights by choosing a number of neuron layers... Training data is fed into the NN and results are compared to an objective function (i.e. first error function) that provides an indication of error indicating a measure of how wrong the NN's result is compared to an expected result… based on the probability of the target sample as a positive class from the target corpus ).
Sim discloses  5constructing a second loss function based on the probabilities that the pronoun belongs to other nouns than the target noun predicted by the natural language processing model; and (Sim: [0044] The error is then used to correct the weights, over the iteration of the objective function (i.e. second error function) of the weights will collectively converge to encode the operational data into the NN using the target sample as “negative’ sample).
Sim discloses generating the target loss function based on the first loss function and the second loss function. (Sim: [0044, 0098] This process may be called an optimization of the objective function (e.g., a cost or loss function), whereby the cost or loss is minimized based on the first objection function and the second objection).
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that with both Qin in view of Sub and Sim disclosing natural language processing, which are analogous art from the “same field of endeavor”, and, when Sim's iteration of the objective function of the weights collectively converge for the training samples was combined with Qin in view of Sub's constructing a dynamic generalized co-reference resolution model with pre-processed corpus, the claimed limitation on the acquiring the probability that the pronoun belongs to the target noun predicted by the natural language processing model based on the target noun to which the pronoun marked in the positive sample refers; 
constructing a first loss function based on the probability that the pronoun belongs to the target noun predicted by the natural language processing model;
constructing a second loss function based on the probabilities that the pronoun belongs to other nouns than the target noun predicted by the natural language processing model; and 
generating the target loss function based on the first loss function and the second loss function would be obvious. The motivation to combine Qin in view of Sub and Sim is to provide an approach to reduce bias from models trained using public corpora with efficiency. (See Sim [007]).
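For illustration only, the loss construction recited in claim 7 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's or Sim's implementation: the function names, the negative log-likelihood form of the first loss, the -log(1 - p) form of the second loss, the clamping constant, and the unweighted sum used to combine the two losses are all hypothetical choices that neither the claim language nor the cited references specify.

```python
import math

EPS = 1e-12  # hypothetical clamp to keep log() away from zero


def first_loss(probs, target_idx):
    """Loss on the probability assigned to the marked target noun.

    Assumption: negative log-likelihood; smaller when the model is
    more confident that the pronoun refers to the target noun.
    """
    return -math.log(max(probs[target_idx], EPS))


def second_loss(probs, target_idx):
    """Loss on the probabilities assigned to all other nouns.

    Assumption: sum of -log(1 - p) over non-target nouns; smaller when
    the model assigns low probability to the wrong nouns.
    """
    return -sum(
        math.log(max(1.0 - p, EPS))
        for i, p in enumerate(probs)
        if i != target_idx
    )


def target_loss(probs, target_idx):
    """Combine both losses. Assumption: a plain unweighted sum."""
    return first_loss(probs, target_idx) + second_loss(probs, target_idx)


# probs[i] is the predicted probability that the pronoun refers to noun i;
# index 0 is the target noun marked in the positive sample.
good = target_loss([0.9, 0.05, 0.05], target_idx=0)
bad = target_loss([0.1, 0.8, 0.1], target_idx=0)
```

Under these assumptions, a prediction that concentrates probability on the target noun (`good`) yields a smaller combined loss than one that favors a different noun (`bad`).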

As to claim 8, Qin in view of Sub discloses the method according to claim 6, wherein the based on the probability that the pronoun belongs to each noun in the training language material predicted by the natural language processing model, and the target noun to which the pronoun marked in the positive sample refers, generating a target loss function comprises: 
However, Qin in view of Sub may not explicitly disclose all the aspects of the acquiring the probability that the pronoun belongs to the target noun predicted by the natural language processing model based on the target noun to which the pronoun marked in the positive sample refers;  
constructing a first loss function based on the probability that the pronoun belongs to the target noun predicted by the natural language processing model; 
constructing a second loss function based on the probabilities that the pronoun belongs to other nouns than the target noun predicted by the natural language processing model; and 
generating the target loss function based on the first loss function and the second loss function. 
Sim discloses acquiring the probability that the pronoun belongs to the target noun predicted by the natural language processing model based on the target noun to which the pronoun marked in the positive sample refers; (Sim: [0092, 0062] receives the learned representation for the input sequence from the sample representation program and predicts the probability of the sample constituting a task… a task classifier can be trained by the machine learning program, based on the adapted source samples as “positive” samples to predict the task labels on the target samples from the target corpus).
Sim discloses constructing a first loss function based on the probability that the pronoun belongs to the target noun predicted by the natural language processing model; (Sim: [0044] Model NN designs a training process to arrive at appropriate weights by choosing a number of neuron layers... Training data is fed into the NN and results are compared to an objective function (i.e. first error function) that provides an indication of error indicating a measure of how wrong the NN's result is compared to an expected result… based on the probability of the target sample as a positive class from the target corpus).
Sim discloses constructing a second loss function based on the probabilities that the pronoun belongs to other nouns than the target noun predicted by the natural language processing model; and (Sim: [0044] The error is then used to correct the weights; over iterations of the objective function (i.e. second error function), the weights will collectively converge to encode the operational data into the NN using the target sample as a “negative” sample).
Sim discloses generating the target loss function based on the first loss function and the second loss function. (Sim: [0044, 0098] This process may be called an optimization of the objective function (e.g., a cost or loss function), whereby the cost or loss is minimized based on the first objective function and the second objective function).
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that with both Qin in view of Sub and Sim disclosing natural language processing, which are analogous art from the “same field of endeavor”, and, when Sim's iteration of the objective function of the weights collectively converge for the training samples was combined with Qin in view of Sub's constructing a dynamic generalized co-reference resolution model with pre-processed corpus, the claimed limitation on the acquiring the probability that the pronoun belongs to the target noun predicted by the natural language processing model based on the target noun to which the pronoun marked in the positive sample refers; 
constructing a first loss function based on the probability that the pronoun belongs to the target noun predicted by the natural language processing model;
constructing a second loss function based on the probabilities that the pronoun belongs to other nouns than the target noun predicted by the natural language processing model; and 
generating the target loss function based on the first loss function and the second loss function would be obvious. The motivation to combine Qin in view of Sub and Sim is to provide an approach to reduce bias from models trained using public corpora with efficiency. (See Sim [007]).

Regarding claims 15-16, these claims recite a device corresponding to the method of claims 7-8, respectively; therefore, the same rationale of rejection is applicable.




Conclusion
	
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JENQ-KANG (Kang) CHU whose telephone number is (571)270-7396. The examiner can normally be reached M-F 8-6 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kavita Padmanabhan, can be reached at (571)272-8352. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JENQ-KANG CHU/Examiner, Art Unit 2176                                                                                                                                                                                                        

/ARIEL MERCADO/Primary Examiner, Art Unit 2176