DETAILED ACTION
Notice of AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 03/29/2022 has been considered by the examiner.
Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference character(s) not mentioned in the description: Fig. 2, element 208.  Corrected drawing sheets in compliance with 37 CFR 1.121(d), or amendment to the specification to add the reference character(s) in the description in compliance with 37 CFR 1.121(b) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-6, 8-13, and 15-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 

Claim 1 recites a method of representing multi-turn conversations, e.g., conversations between two or more entities, such as two people.  Under the broadest reasonable interpretation, these limitations cover performance of the limitations in the human mind with the assistance of physical aids (e.g., pen and paper), but for the recitation of generic computer components.  That is, other than reciting “processor”, nothing in the claim precludes the steps from practically being performed in the mind.  For example, a human could receive data corresponding to a conversation having one or more utterances (e.g., receive and read a transcript of a conversation between two people, such as a transcript of a telephone call or a legal deposition), identify contextual representations for the one or more utterances (e.g., mentally, or with a pen, label certain words with contextual information, such as position and, for pronouns, a reference back to the antecedent of the pronoun), determine a span corresponding to the identified contextual representations (e.g., look in the transcript for the antecedent of a pronoun and note the page, line, and word number), and rewrite the one or more utterances based on maximizing a probability associated with the determined span (e.g., revise the transcript to replace pronouns with the proper nouns, and in cases where the transcript may be ambiguous, select the proper noun that is most likely to be associated with the pronoun).
The judicial exception is not integrated into a practical application. In particular, the claim recites only generic computing components, i.e., a “processor”. Such generic computing components are recited at a high level of generality (i.e., as a generic processor performing a generic computer function of receiving, identifying, determining, and re-ordering/rewriting information) such that they amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Claim 1 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional limitations of using generic computer components amounts to no more than mere instructions to apply the exception using generic computer components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. Claim 1 is not patent eligible.
Claim 2 depends from claim 1 and does not remedy the deficiencies of claim 1 and is therefore rejected under the same grounds as claim 1 above.  Claim 2 further recites rewriting one or more utterances based on recovering an omission of one or more words from the conversation (e.g., mentally reviewing a transcript and identifying a zero pronoun situation and revising the sentence to refer to the missing proper noun).  None of the additional limitations recited in claim 2 amount to anything more than the same or a similar abstract idea as recited in claim 1.    Nor do any limitations in claim 2: (a) integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea or (b) amount to significantly more than the judicial exception because the additional limitations of using generic computer components amounts to no more than mere instructions to apply the exception using generic computer components.  Moreover, outputting data to a device is merely post-solution activity. Claim 2 is not patent eligible.
Claim 3 depends from claim 1 and does not remedy the deficiencies of claim 1 and is therefore rejected under the same grounds as claim 1 above.  Claim 3 further recites rewriting one or more utterances based on recovering a co-reference of one or more words from the conversation (e.g., mentally reviewing a transcript and identifying a pronoun and then revising the sentence to refer to the missing proper noun).  None of the additional limitations recited in claim 3 amount to anything more than the same or a similar abstract idea as recited in claim 1.    Nor do any limitations in claim 3: (a) integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea or (b) amount to significantly more than the judicial exception because the additional limitations of using generic computer components amounts to no more than mere instructions to apply the exception using generic computer components.  Moreover, outputting data to a device is merely post-solution activity. Claim 3 is not patent eligible.
Claim 4 depends from claim 1 and does not remedy the deficiencies of claim 1 and is therefore rejected under the same grounds as claim 1 above.  Claim 4 further recites rewriting one or more utterances based on generating a candidate sentence corresponding to the utterances (e.g., mentally reviewing a transcript and identifying a pronoun and then writing a candidate sentence that proposes a replacement proper noun).  None of the additional limitations recited in claim 4 amount to anything more than the same or a similar abstract idea as recited in claim 1.    Nor do any limitations in claim 4: (a) integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea or (b) amount to significantly more than the judicial exception because the additional limitations of using generic computer components amounts to no more than mere instructions to apply the exception using generic computer components.  Moreover, outputting data to a device is merely post-solution activity. Claim 4 is not patent eligible.
Claim 5 depends from claim 4 and does not remedy the deficiencies of claim 4 and is therefore rejected under the same grounds as claim 4 above.  Claim 5 further recites rewriting one or more utterances based on sampling tags at one or more positions of the one or more utterances (e.g., mentally reviewing a transcript and labeling or highlighting certain words, and then using some of those labels or highlights as considerations when rewriting the sentence).  None of the additional limitations recited in claim 5 amount to anything more than the same or a similar abstract idea as recited in claim 4.    Nor do any limitations in claim 5: (a) integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea or (b) amount to significantly more than the judicial exception because the additional limitations of using generic computer components amounts to no more than mere instructions to apply the exception using generic computer components.  Moreover, outputting data to a device is merely post-solution activity. Claim 5 is not patent eligible.
Claim 6 depends from claim 5 and does not remedy the deficiencies of claim 5 and is therefore rejected under the same grounds as claim 5 above.  Claim 6 further recites that the candidate sentence is generated based on minimizing a tagging loss value associated with the sampled tags (e.g., mentally reviewing a transcript and labeling certain words, and then asking a second person to review the labeling for confirmation, and then using the confirmed tags to rewrite a sentence).  None of the additional limitations recited in claim 6 amount to anything more than the same or a similar abstract idea as recited in claim 5.    Nor do any limitations in claim 6: (a) integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea or (b) amount to significantly more than the judicial exception because the additional limitations of using generic computer components amounts to no more than mere instructions to apply the exception using generic computer components.  Moreover, outputting data to a device is merely post-solution activity. Claim 6 is not patent eligible.

Claim 8 claims a computer system that corresponds to the method of claim 1 and is therefore rejected under the same grounds as claim 1 above.  The additional limitations in claim 8 such as “computer-readable non-transitory storage media”, “computer program code”, and “computer processors” are merely generic computing components that do not (a) integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea or (b) amount to significantly more than the judicial exception because the additional limitations of using generic computer components amounts to no more than mere instructions to apply the exception using generic computer components.  

Claim 9 depends from claim 8 and claims a computer system that corresponds to the method of claim 2 and is therefore rejected under the same grounds as claims 2 and 8 above.
Claim 10 depends from claim 8 and claims a computer system that corresponds to the method of claim 3 and is therefore rejected under the same grounds as claims 3 and 8 above.
Claim 11 depends from claim 8 and claims a computer system that corresponds to the method of claim 4 and is therefore rejected under the same grounds as claims 4 and 8 above.
Claim 12 depends from claim 11 and claims a computer system that corresponds to the method of claim 5 and is therefore rejected under the same grounds as claims 5 and 11 above.
Claim 13 depends from claim 12 and claims a computer system that corresponds to the method of claim 6 and is therefore rejected under the same grounds as claims 6 and 12 above.
Claim 15 claims a computer readable medium that corresponds to the method of claim 1 and is therefore rejected under the same grounds as claim 1 above. The additional limitations in claim 15 such as “non-transitory computer readable medium”, “computer program”, and “computer processors” are merely generic computing components that do not (a) integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea or (b) amount to significantly more than the judicial exception because the additional limitations of using generic computer components amounts to no more than mere instructions to apply the exception using generic computer components.  
Claim 16 depends from claim 15 and claims a computer readable medium that corresponds to the method of claim 2 and is therefore rejected under the same grounds as claims 2 and 15 above.
Claim 17 depends from claim 15 and claims a computer readable medium that corresponds to the method of claim 3 and is therefore rejected under the same grounds as claims 3 and 15 above.
Claim 18 depends from claim 15 and claims a computer readable medium that corresponds to the method of claim 4 and is therefore rejected under the same grounds as claims 4 and 15 above.
Claim 19 depends from claim 18 and claims a computer readable medium that corresponds to the method of claim 5 and is therefore rejected under the same grounds as claims 5 and 18 above.
Claim 20 depends from claim 19 and claims a computer readable medium that corresponds to the method of claim 6 and is therefore rejected under the same grounds as claims 6 and 19 above.

	Claims 7 and 14 recite “wherein the contextual representations are determined by a Bidirectional Encoder Representation from Transformers (BERT) encoder.”  This limitation is directed to a particular application of the BERT encoder and cannot practically be interpreted as being performed in the human mind, or by a human with pen and paper, and therefore claims 7 and 14 satisfy the requirements of 35 U.S.C. 101.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1, 2, 4, 8, 9, 11, 15, 16, and 18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Zhang, W., et al. “Neural recovery machine for Chinese dropped pronoun.” Frontiers of Computer Science, 13(5), pp. 1023-1033 (2019), hereinafter referenced as ZHANG.

Regarding claim 1, ZHANG discloses:
A method of representing multi-turn conversations, (neural recovery machine implementing a process for recovering dropped pronouns in dialogue; p. 1023, section 1; example 3-turn dialogue in Table 1; p. 1023; data sets included conversation data of 2501 sentences and question answering dialogue of 11,160 sentences; p. 1024, section 2.1) executable by a processor, (a neural processing machine, which is a computer that necessarily includes a processor; p. 1023, section 1; pre-processing performed by LTP cloud service; p. 1027, section 4.1) comprising:
receiving data corresponding to a conversation having one or more utterances; (Fig. 2, input layer of neural network of machine receives raw input of one or more words; example 3-turn dialogue in Table 1; p. 1023; data sets included conversation data of 2501 sentences and question answering dialogue of 11,160 sentences; p. 1024, section 2.1)
identifying contextual representations for the one or more utterances; (each word has a context embedding, which is constructed by concatenating the word embedding in a specific window; p. 1025, section 3.1, p. 1026, section 3.2, p. 1027, section 3.4; each word embedding has 300 dimensions; p. 1028, section 4.2, table 9; Fig. 4, left and right context are considered; p. 1027, section 3.4)
determining a span corresponding to the identified contextual representations; and (Fig. 1, position of dropped pronoun is identified; pp. 1025-1026, section 3.1; Fig. 2, a neural network model is used to determine the position of the dropped pronoun within the raw input, section 3.3; Fig. 3 shows that dropped position is 3rd word in the sentence)
rewriting the one or more utterances (Fig. 1, dropped pronoun is generated and placed in sentence, e.g., “he” is determined as dropped pronoun in 3rd position in the sentence; pp. 1025-1026, section 3.1; see also Fig. 3, a neural network model is used to generate the dropped pronoun and place the dropped pronoun in the sentence; Fig. 4, a zero pronoun-specific neural network (ZPSNN) is used to generate a noun phrase to replace the dropped pronoun; p. 1027, section 3.4) based on maximizing a probability associated with the determined span. (Figs. 2 and 3, the output layer of the neural network utilizes a softmax function (see equation 3), which assigns decimal probabilities to each output node, e.g., each possible dropped pronoun (of which there may be 14 for the OntoNotes 4.0 example and 10 for the Baidu Zhidao dataset example), where the highest probability is determined as the output of the network, e.g., the generated dropped pronoun; p. 1025, section 2.3, table 4 (the 10/14 dropped pronouns used by Baidu Zhidao/OntoNotes 4.0, respectively); pp. 1026-1027, sections 3.2-3.3; the examiner notes that in paras. 0026 and 0027 in the instant specification, applicant similarly discloses using a softmax equation to determine probability distributions)
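
The softmax-and-select operation mapped to the final limitation above can be sketched as follows. This is a minimal illustration only: the four candidate labels and the raw scores are hypothetical values chosen for the example, and the code is not taken from ZHANG or from the instant specification.

```python
import math

def softmax(scores):
    """Convert raw output-layer scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def most_probable(labels, scores):
    """Return the label with the highest softmax probability and that probability."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]

# Hypothetical candidate pronouns and output scores (illustrative values only).
candidates = ["I", "you", "he", "none"]
scores = [0.2, 0.1, 2.5, 0.4]

label, prob = most_probable(candidates, scores)
print(label, round(prob, 3))
```

Under these assumed scores the candidate with the largest raw score, "he", is also the candidate with the largest probability, because softmax preserves the ordering of its inputs.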

Regarding claim 2, ZHANG discloses the method of claim 1.  ZHANG further discloses:
wherein the one or more utterance is rewritten based on recovering an omission of one or more words from the conversation. (Table 1, dropped pronouns in brackets were omitted in original text; p. 1023, section 1; Fig. 1, dropped pronoun is generated and placed in sentence, e.g., “he” is determined as dropped pronoun in 3rd position in the sentence; pp. 1025-1026, section 3.1; see also Fig. 3, a neural network model is used to generate the dropped pronoun and place the dropped pronoun in the sentence; Fig. 4, a zero pronoun-specific neural network (ZPSNN) is used to generate a noun phrase to replace the dropped pronoun; p. 1027, section 3.4)

Regarding claim 4, ZHANG discloses the method of claim 1.  ZHANG further discloses:
wherein the rewriting the one or more utterances comprises generating a candidate sentence corresponding to the utterances. (Fig. 1, “He says [he] will buy an iphone 6s” generated sentence; pp. 1025-1026, section 3.1; see also Fig. 3, “He says he will buy an iphone 6s” generated sentence; Fig. 4, a zero pronoun-specific neural network (ZPSNN) is used to generate a noun phrase to replace the dropped pronoun; p. 1027, section 3.4)
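
For illustration, the generation of a candidate sentence from a recovered pronoun and its position can be sketched as below, using the Fig. 1 example sentence cited above. The function name and the whitespace tokenization are assumptions made for this sketch and do not reflect how ZHANG implements the step.

```python
def insert_dropped_pronoun(tokens, position, pronoun):
    """Return a new token list with `pronoun` placed at the 1-based `position`."""
    if not 1 <= position <= len(tokens) + 1:
        raise ValueError("position is outside the sentence")
    index = position - 1  # convert the 1-based position to a 0-based list index
    return tokens[:index] + [pronoun] + tokens[index:]

# Sentence with the pronoun dropped; the recovered pronoun belongs in the 3rd position.
original = ["He", "says", "will", "buy", "an", "iphone", "6s"]
candidate = insert_dropped_pronoun(original, 3, "he")
print(" ".join(candidate))  # prints "He says he will buy an iphone 6s"
```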

	Regarding claim 8, ZHANG discloses:
A computer system for representing multi-turn conversations, (neural recovery machine implementing a process for recovering dropped pronouns in dialogue; p. 1023, section 1; example 3-turn dialogue in Table 1; p. 1023; data sets included conversation data of 2501 sentences and question answering dialogue of 11,160 sentences; p. 1024, section 2.1) the computer system comprising: one or more computer-readable non-transitory storage media configured to store computer program code; and one or more computer processors configured to access said computer program code and operate as instructed by said computer program code, said computer program code including: (a neural processing machine, which is a computer that necessarily includes a processor and storage configured to store code executed by the processor; p. 1023, section 1; pre-processing performed by LTP cloud service; p. 1027, section 4.1; the examiner further notes that applicant acknowledges that types of computing devices in a cloud computing environment are understood; instant specification, para. 0047)
The remaining limitations in claim 8 correspond to the method of claim 1 and are rejected under the same grounds as explained above with respect to claim 1.

Claim 9 depends from claim 8 and claims a computer system that corresponds to the method of claim 2 and is therefore rejected under the same grounds as claims 2 and 8 above.
Claim 11 depends from claim 8 and claims a computer system that corresponds to the method of claim 4 and is therefore rejected under the same grounds as claims 4 and 8 above.

Regarding claim 15, ZHANG discloses:
A non-transitory computer readable medium having stored thereon a computer program for representing multi-turn conversations, (neural recovery machine implementing a process for recovering dropped pronouns in dialogue; p. 1023, section 1; example 3-turn dialogue in Table 1; p. 1023; data sets included conversation data of 2501 sentences and question answering dialogue of 11,160 sentences; p. 1024, section 2.1) the computer program configured to cause one or more computer processors to: (a neural processing machine, which is a computer that necessarily includes a processor and storage configured to store code executed by the processor; p. 1023, section 1; pre-processing performed by LTP cloud service; p. 1027, section 4.1; the examiner further notes that applicant acknowledges that types of computing devices in a cloud computing environment are understood; instant specification, para. 0047)
The remaining limitations in claim 15 correspond to the method of claim 1 and are rejected under the same grounds as explained above with respect to claim 1.

Claim 16 depends from claim 15 and claims a computer readable medium that corresponds to the method of claim 2 and is therefore rejected under the same grounds as claims 2 and 15 above.
Claim 18 depends from claim 15 and claims a computer readable medium that corresponds to the method of claim 4 and is therefore rejected under the same grounds as claims 4 and 15 above.
Claim Rejections - 35 USC § 103
Claims 3, 10, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over ZHANG in view of Chen, C., et al. “Chinese zero pronoun resolution: Some recent advances.” Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1360-1365 (2013), hereinafter referenced as CHEN.

Regarding claim 3, ZHANG discloses the method of claim 1.  However, ZHANG fails to explicitly teach:
wherein the one or more utterance is rewritten based on recovering a co-reference corresponding to one or more words from the conversation.

However, in a related field of endeavor, CHEN pertains to resolving anaphoric zero pronouns (AZP) to candidate antecedents to establish coreference links.  (p. 1360, section 1).  A support vector machine (SVM) is trained to distinguish AZPs from non-AZPs. (p. 1361, section 2.1).  Candidate noun phrases are analyzed to determine if they are coreferent with an AZP.  (p. 1362, section 3).

The combination of ZHANG in view of CHEN makes obvious:
wherein the one or more utterance is rewritten based on recovering a co-reference corresponding to one or more words from the conversation. (CHEN discloses a SVM classifier that identifies an anaphoric zero pronoun and its candidate antecedents; CHEN, p. 1360, section 1, p. 1361, section 2.1, p. 1362, section 3; the combination of ZHANG and CHEN now uses the SVM classifier in CHEN to supplement the dropped pronoun recovery task in addition to recovering an anaphoric zero pronoun for a co-referent antecedent, indeed, ZHANG specifically cites to CHEN as related work and that zero anaphora resolution “can be used either as features or an application for the DP recovery task”; ZHANG, p. 1031, section 5; further, as disclosed in ZHANG, the NRM in ZHANG could be used to improve the zero pronoun resolution, which is the problem addressed in CHEN; ZHANG, p. 1030, sections 4.5-4.6). 

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the zero anaphora resolution teachings of CHEN with ZHANG.  Indeed, as referenced above, one of ordinary skill would be motivated to make such a combination because ZHANG explicitly cites to CHEN as related work and explains that zero anaphora resolution “can be used either as features or an application for the DP recovery task”. (ZHANG, p. 1031, section 5).  Further, as noted in CHEN, one of ordinary skill in the art would be motivated to utilize the teachings of CHEN to resolve zero pronouns, which are more difficult to resolve than overt pronouns because zero pronouns lack grammatical attributes such as number or gender.  (CHEN, p. 1360, section 1).
The examiner further notes that ZHANG discloses that the developed neural recovery machine can improve the performance of a zero pronoun specific neural network by recovering the dropped pronouns for the anaphoric zero pronouns.  (ZHANG, p. 1030, sections 4.5-4.6).

Claim 10 depends from claim 8 and claims a computer system that corresponds to the method of claim 3 and is therefore rejected under the same grounds as claims 3 and 8 above.
Claim 17 depends from claim 15 and claims a computer readable medium that corresponds to the method of claim 3 and is therefore rejected under the same grounds as claims 3 and 15 above.

Claims 5-7, 12-14, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over ZHANG in view of Yavuz et al., U.S. Patent Application Publication 2021/0375269 A1, hereinafter referenced as YAVUZ.

Regarding claim 5, ZHANG discloses the method of claim 4.  However, ZHANG fails to explicitly teach:
further comprising generating the candidate sentence based on sampling tags at one or more positions of the one or more utterances.

However, in a related field of endeavor, YAVUZ discloses utilizing a neural network to tag words with dialog acts based on user intent.  (para. 0003).  For example, in multi-turn conversations between a user and system, certain utterances in the dialogue may be tagged with a predicted dialogue act to be performed by the system.  (Figs. 2A and 2B, para. 0025).  A language model, such as the bidirectional encoder representation transformer (BERT) may be used to perform the dialog act tagging.  (para. 0024).  The language model may be trained with reference to a labeled data source, and then a supervised tagging loss may be calculated as a cross-entropy loss.  (paras. 0039-0041).  The supervised tagging loss may be backpropagated to improve the performance of the language model, e.g., BERT. (para. 0042).
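
By way of illustration, a supervised tagging loss computed as a cross-entropy between predicted tag probabilities and labeled tags can be sketched as follows. The tag names "inform" and "request", the probability values, and the function name are hypothetical and are used only for this example; the sketch does not reproduce the loss module of YAVUZ.

```python
import math

def tagging_cross_entropy(predicted, gold):
    """Mean negative log-probability assigned to the labeled tag at each position.

    predicted: list of dicts mapping tag -> predicted probability, one per position
    gold:      list of labeled tags, one per position
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    total = 0.0
    for distribution, tag in zip(predicted, gold):
        total += -math.log(distribution[tag])
    return total / len(gold)

# Hypothetical predicted distributions for two utterances and their labeled tags.
predicted = [
    {"inform": 0.5, "request": 0.5},
    {"inform": 0.25, "request": 0.75},
]
gold = ["inform", "request"]

loss = tagging_cross_entropy(predicted, gold)
print(round(loss, 4))
```

The loss falls toward zero as the model assigns more probability to the labeled tags, which is the quantity a backpropagation step would reduce.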

The combination of ZHANG in view of YAVUZ makes obvious:
further comprising generating the candidate sentence based on sampling tags at one or more positions of the one or more utterances. (YAVUZ discloses tagging utterances using a language model, e.g., BERT; YAVUZ, paras. 0024, 0040; the combination of ZHANG and YAVUZ now uses a language model such as BERT to tag the input words with contextual information, such as word position information, parts-of-speech, or other tags that the neural networks in ZHANG can use as factors to determine a dropped pronoun, indeed, ZHANG already discloses using 200 or 300 dimensions of word embedding, so additional embedding can be performed using the tagging of YAVUZ; ZHANG, p. 1028, section 4.2 and table 9; the neural network in ZHANG now utilizes these tags as nodes, where the neural network is trained to assign relative weights to each node/tag, and a representative part of the nodes/tags is weighted, e.g., sampling, to determine the output position of the dropped pronoun which is used to output a sentence including a noun phrase for the dropped pronoun; ZHANG; Figs. 1 and 3, pp. 1025-1026, section 3.1).

	Therefore, one of ordinary skill in the art would be motivated to combine the teachings of YAVUZ with ZHANG, particularly the teachings in YAVUZ relating to using a language model such as BERT to tag words with intent and contextual information, to provide additional information for the neural network in ZHANG to consider when determining the position and resolution of a dropped pronoun.  As disclosed in YAVUZ, one of ordinary skill would be motivated to make such a combination in order to take advantage of pre-trained language models, e.g., BERT, and provide cross-domain generalization, so that instead of completely retraining the language model, the language model can be fine-tuned to save resources.  (YAVUZ, paras. 0021, 0027, 0069-71).  Utilizing cross-domain generalization allows the model to be more easily trained to perform similar tasks (e.g., a model for restaurant reservations can more easily be adapted for flight booking).  (YAVUZ, para. 0019). Therefore, the neural recovery machine in ZHANG could benefit from a language model such as BERT without needing to completely re-train the model to utilize the tags/embeddings important to ZHANG.

Regarding claim 6, ZHANG in view of YAVUZ discloses the method of claim 5.  However, ZHANG fails to explicitly teach:
wherein the candidate sentence is generated based on minimizing a tagging loss value associated with the sampled tags.  

However, in a related field of endeavor, YAVUZ, as explained above, discloses using a pre-trained language model such as BERT to perform utterance tagging.  The combination of ZHANG in view of YAVUZ makes obvious:
wherein the candidate sentence is generated based on minimizing a tagging loss value associated with the sampled tags.  (YAVUZ discloses a supervised tagging loss module configured to update a language model, e.g., BERT, by backpropagating the supervised tagging loss, which is calculated using a cross-entropy loss calculation in view of tags from a labeled data source and predicted tags from the language model, e.g., iteratively fine-tuning the model to minimize the supervised tagging loss via backpropagation; YAVUZ, Fig. 4A, paras. 0039-0042; the combination of ZHANG and YAVUZ now updates the BERT language model using a supervised tagging loss using the updated tags and generates an output sentence, utilizing the BERT language model, as explained above in the rejection of claim 5).

Therefore, one of ordinary skill in the art would be motivated to combine the teachings of YAVUZ with ZHANG, particularly the teachings in YAVUZ relating to calculating a supervised tagging loss and backpropagating such loss to update a language model such as BERT, for use in ZHANG to consider when determining the position and resolution of a dropped pronoun.  As disclosed in YAVUZ, one of ordinary skill would be motivated to make such a combination in order to take advantage of pre-trained language models, e.g., BERT, and provide cross-domain generalization, so that instead of completely retraining the language model, the language model can be fine-tuned to save resources.  (YAVUZ, paras. 0021, 0027, 0069-71).  Utilizing cross-domain generalization allows the model to be more easily trained to perform similar tasks (e.g., a model for restaurant reservations can more easily be adapted for flight booking).  (YAVUZ, para. 0019). Therefore, the neural recovery machine in ZHANG could benefit from a language model such as BERT without needing to completely re-train the model to utilize the tags/embeddings important to ZHANG.

Regarding claim 7, ZHANG discloses the method of claim 1.  However, ZHANG fails to explicitly teach:
wherein the contextual representations are determined by a Bidirectional Encoder Representation from Transformers (BERT) encoder.

However, in a related field of endeavor, YAVUZ, as explained above, discloses using a pre-trained language model such as BERT to perform utterance tagging.  The combination of ZHANG in view of YAVUZ makes obvious:
wherein the contextual representations are determined by a Bidirectional Encoder Representation from Transformers (BERT) encoder. (YAVUZ discloses tagging utterances using a language model, e.g., BERT; YAVUZ, paras. 0024, 0040; the combination of ZHANG and YAVUZ now uses a language model such as BERT to tag the input words with contextual information, such as word position information, parts-of-speech, or other tags that the neural networks in ZHANG can use as factors to determine a dropped pronoun, indeed, ZHANG already discloses using 200 or 300 dimensions of word embedding, so additional embedding can be performed using the tagging of YAVUZ; ZHANG, p. 1028, section 4.2 and table 9).

Therefore, one of ordinary skill in the art would be motivated to combine the teachings of YAVUZ with ZHANG, particularly the teachings in YAVUZ relating to using a language model such as BERT, for use in ZHANG to consider when determining the position and resolution of a dropped pronoun.  As disclosed in YAVUZ, one of ordinary skill would be motivated to make such a combination in order to take advantage of pre-trained language models, e.g., BERT, and provide cross-domain generalization, so that instead of completely retraining the language model, the language model can be fine-tuned to save resources.  (YAVUZ, paras. 0021, 0027, 0069-71).  Utilizing cross-domain generalization allows the model to be more easily trained to perform similar tasks (e.g., a model for restaurant reservations can more easily be adapted for flight booking).  (YAVUZ, para. 0019). Therefore, the neural recovery machine in ZHANG could benefit from a language model such as BERT without needing to completely re-train the model to utilize the tags/embeddings important to ZHANG.

Claim 12 depends from claim 11 and claims a computer system that corresponds to the method of claim 5 and is therefore rejected under the same grounds as claims 5 and 11 above.
Claim 13 depends from claim 12 and claims a computer system that corresponds to the method of claim 6 and is therefore rejected under the same grounds as claims 6 and 12 above.
Claim 14 depends from claim 8 and claims a computer system that corresponds to the method of claim 7 and is therefore rejected under the same grounds as claims 7 and 8 above.
Claim 19 depends from claim 18 and claims a computer readable medium that corresponds to the method of claim 5 and is therefore rejected under the same grounds as claims 5 and 18 above.
Claim 20 depends from claim 19 and claims a computer readable medium that corresponds to the method of claim 6 and is therefore rejected under the same grounds as claims 6 and 19 above.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20220068462 A1 (Dolan et al.) discloses a system and method for remembering content received from a patient for future use and understood by computerized natural language processing during a chatbot therapy session.
US 20210174204 A1 (Yin et al.) discloses using a neural network model for natural language processing (NLP).  Coreference resolution aims to cluster entities and pronouns that refer to the same object.  (Para. 0085).
US 20190188257 A1 (Iida et al.) discloses a context analysis apparatus and discloses conventional anaphora/ellipsis resolution techniques that use clues obtained from an anaphor (pronouns, zero-pronouns, etc.) and its candidate antecedent.
US 20190035387 A1 (Zitouni et al.) discloses a mechanism to adapt a machine learning model used in a language understanding model that has been trained using a first set of user input having a first set of features to effectively operate using user input having a second set of features. Losses are defined based on the first set of features, the second set of features or features common to both the first set and second set. The losses comprise one or more of a source side tagging loss, a reconstruction loss, an adversarial domain classification loss, a non-adversarial domain classification loss, an orthogonality loss, and target side tagging loss.
Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).  Discloses the BERT language model, referenced in YAVUZ.
Elgohary, Ahmed, et al. "Can you unpack that? learning to rewrite questions-in-context." Can You Unpack That? Learning to Rewrite Questions-in-Context (2019), pp. 1-9. Discloses a question answering mechanism for rewriting questions-in-context.
Pan, Zhufeng, et al. "Improving open-domain dialogue systems via multi-turn incomplete utterance restoration." Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.  Pp. 1824-1833. Discloses a multi-turn conversation system that can better understand incomplete utterances.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL C LEE whose telephone number is (571)272-4933. The examiner can normally be reached M-F 9:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached on 571-272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MICHAEL C. LEE/Examiner, Art Unit 2655
/ANDREW C FLANDERS/Supervisory Patent Examiner, Art Unit 2655