DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The amendment filed on December 13th, 2021 has been entered. Claims 1-20 remain pending in the application. 
Applicant’s amendments to the claims have overcome each and every 112(b) rejection previously set forth in the Non-Final Office Action mailed on October 15th, 2021. As such, these rejections have been withdrawn.
Response to Arguments
Applicant’s arguments filed December 13th, 2021 have been fully considered but they are not persuasive.
In regards to claim 1, the applicant argues that:
At the page 5 of the Office Action, it is acknowledged that Yin does not teach the identification, by the computer, of the dropped pronoun based on a probability value associated with the contextual presentation. The rejection is based on the assertion that Section 3.1, lines 31-33 of Wang disclose the claimed identification of the dropped pronoun based on a probability value. 
However, the above mentioned Section of Wang describes that the contextual representation is integrated into the reconstructor, based on which ZP prediction is conducted. Thus, Wang merely describes a prediction component, which is completely different from the claimed probability value associated with the contextual representation. Thus, Wang fails to suggest, at least "identifying, by the computer, the 
Therefore, Wang fails to cure the deficient disclosure of Yin. Accordingly, even if Wang, and Yin could have somehow been combined, the combination of Wang, and Yin would still have failed to have taught or suggested the combination of limitations recited in claim 1. Obviousness requires a suggestion of all limitations in a claim. See, e.g., In re Wada and Murphy, Appeal 2007-3733, citing CFMT, Inc. v. Yieldup International Corp., 349 F.3d 1333 (Fed. Cir. 2003). As a result, claim 1 and its dependent claims 2-10 are patentable over the combination of Yin and Wang for at least these reasons.
The examiner respectfully disagrees with this assertion. As noted in Section 3, lines 18-20, “The contextual representation is integrated into the reconstructor”. This reconstructor is described in Section 3.1, lines 10-12, along with Equation (1), where h_t^rec, the hidden state in the reconstructor, is shown to be a function of the context vectors; that is, it is based on the contextual representations. It is then described in Section 3.1, lines 31-33 that the zero pronoun is predicted based on a probability. Section 3.1, lines 34-36 and Equation (4) then teach that the probability is calculated using a function of h_t^rec, which, as noted above, is calculated based on the contextual representations. Thus, the probability is transitively associated with the contextual representations, and Wang teaches the claimed limitation. 
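For illustration only, the dependency chain relied upon above (context vector, then reconstructor hidden state, then probability) can be sketched in Python. This is a toy sketch and not Wang's actual Equations (1) and (4): the function names, the label set, the weight matrix, and the tanh and softmax forms are all assumptions chosen to show that a probability computed from a hidden state necessarily varies with the context vector from which that state was derived.

```python
import math

# Hypothetical label set and weights, chosen only for this illustration.
LABELS = ["N", "he", "it"]          # "N" stands for "no zero pronoun"
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]


def reconstructor_state(prev_state, context):
    """Toy stand-in for a hidden-state update: depends on the context vector."""
    return [math.tanh(p + c) for p, c in zip(prev_state, context)]


def zp_probabilities(hidden, weights):
    """Toy stand-in for a probability computed as a softmax over the hidden state."""
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in weights]
    peak = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_zp(prev_state, context):
    """Return the most probable label and the full distribution."""
    hidden = reconstructor_state(prev_state, context)
    probs = zp_probabilities(hidden, WEIGHTS)
    best = max(range(len(LABELS)), key=probs.__getitem__)
    return LABELS[best], probs
```

In this sketch, changing only the context vector passed to predict_zp changes the resulting distribution and may change the predicted label, which is the sense in which the probability is associated with the contextual representation.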
With respect to the dependent claims under 35 U.S.C. 103, no further arguments are provided. Hence, the applicant’s arguments are not persuasive.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 


The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 

Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but were nonetheless considered for interpretation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. However, these claim limitations are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph because the claim limitation(s) recite(s) sufficient structure, materials, or acts to entirely perform the recited function.  Such claim limitation(s) is/are: "self-attention module for predicting the start position and the end position of the span" in claims 9 and 18. “Module” is considered a substitute for “means” that is a generic placeholder (Prong A), and it is further modified by the functional language “for predicting…” (Prong B). However, it has been determined that “self-attention” does modify the generic placeholder with sufficient structure for performing the claimed function, as an art-recognized structure (See Devlin et al., “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”, and Yin et al., "Zero Pronoun Resolution with Attention-based Neural Network". Also see MPEP 2181(I)(C)).
Because this/these claim limitation(s) is/are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are not being interpreted to cover only the corresponding 
If applicant intends to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to remove the structure, materials, or acts that performs the claimed function; or (2) present a sufficient showing that the claim limitation(s) does/do not recite sufficient structure, materials, or acts to perform the claimed function.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-7 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims are drawn to a series of elements that can broadly be construed as performance in the mind with the aid of pen and paper. The claims recite methods consisting of several steps performed by a generic computer. This judicial exception is not integrated into a practical application because, while it does claim the use o (MPEP 2016(III)(C)). The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As noted above, the computer is described generically, without further improvement or meaningful limitation that would allow it to qualify as significantly more (MPEP 2106.05(I)(A)). The claims are not directed to patent eligible subject matter.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-4, 9-14, 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Yin et al. (hereinafter "Yin", "Zero Pronoun Resolution with Attention-based Neural Network") in view of Wang et al. (hereinafter "Wang", "One Model to Learn Both: Zero Pronoun Prediction and Translation").
The term “contextual representations” was interpreted as follows as its broadest reasonable interpretation consistent with the specification: 

In regards to Claims 1, 11, and 20, Yin teaches:
a reception, by a computer, of data corresponding to one or more input words (Section 3.12, line 2: “We run experiments on… the OntoNotes dataset”);
a determination of contextual representations for the received input data (Section 3.2, lines 2-3; real-valued vectors determined from context words are considered to be contextual representations for the received input data); and
a determination, by the computer, of a span associated with one or more of the received input words corresponding to which of the input words the dropped pronoun refers (Section 3.2.3, line 13; The “span” in the instant claim is merely identification of the input words that the dropped pronoun refers to. This is functionally identical to the “candidate… selected to be the antecedent of the anaphoric zero pronoun (i.e. dropped pronoun)” described by Yin).
However, Yin does not teach the identification, by the computer, of the dropped pronoun based on a probability value associated with the contextual representations.
Wang teaches the identification, by a computer, of a dropped pronoun based on a probability value associated with the contextual representations (Section 3, lines 18-20: “The contextual representation is integrated into the reconstructor, based on which ZP (zero pronoun) prediction is conducted.” Here, “ZP prediction” is equivalent to “identification of a dropped pronoun”. Section 3.1, lines 31-33 also note that the pronoun’s identity is determined by probability). 
Yin notes that zero pronouns create difficulties in natural language processing because they are gaps without text (Section 1, lines 9-10). Thus, a system that first identifies the text that is dropped would resolve pronouns with much more ease. In addition, Wang teaches that performing ZP pronoun 
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yin to incorporate the teachings of Wang to predict the dropped pronoun before resolving it. Doing so would remedy the issue of lack of representation for zero pronouns recognized by Yin without the use of an external, potentially lower accuracy prediction model as recognized by Wang.
In regards to claims 2 and 12, Yin and Wang teach the elements of claims 1 and 11 as written above. Wang further teaches the detection, by a computer, that the dropped pronoun was dropped (Section 3.1, lines 19-34; Wang shows how each space between the input words is labelled as either having a zero pronoun or not (“’N’ denotes no ZP”); also Section 3, lines 4-13). Wang describes the claimed step (detecting that the dropped pronoun was dropped) as a necessary element of dropped pronoun identification.
In regards to claims 3 and 13, Yin and Wang teach the elements of claims 2 and 12 as written above. Wang further teaches the detection of the dropped pronoun using a binary classification (Section 3.1, lines 24-28; the label can be either No ZP or any ZP) based on a second probability value (Section 3.1, lines 31-33; integration of this element would require an additional probability value) associated with the contextual representations (Section 3.1, lines 22-24).
Here, Wang describes the claimed step (detecting that the dropped pronoun was dropped) as a necessary element of dropped pronoun identification.  
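As a hedged illustration of the binary reading applied above (each gap between input words is either "No ZP" or "any ZP"), the following Python sketch collapses a per-gap label distribution into a single detection probability. The function name, the dictionary layout, and the 0.5 threshold are assumptions made for this example; they are not taken from Wang.

```python
def detect_dropped_pronouns(gap_label_probs, threshold=0.5):
    """For each gap between words, decide whether any zero pronoun is present.

    gap_label_probs: list of dicts mapping a label to its probability, where
    the label "N" means "no zero pronoun" (hypothetical layout).
    """
    decisions = []
    for probs in gap_label_probs:
        # Probability of "any ZP" is the mass not assigned to the "N" label.
        p_any_zp = 1.0 - probs.get("N", 0.0)
        decisions.append(p_any_zp > threshold)
    return decisions
```

For example, a gap whose distribution places 0.9 on "N" is classified as containing no dropped pronoun, while a gap placing only 0.2 on "N" is classified as containing one.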
In regards to claims 4 and 14, Yin and Wang teach the elements of claims 2 and 12 as written above. Wang additionally teaches the calculation of a loss value based on the detecting of the dropped pronoun, and the identifying of the dropped pronoun (Section 3.1, lines 43-50: the loss function would 
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yin to incorporate the teachings of Wang to calculate a loss value based on the combined elements of the two because loss functions based on the results of task completion during training are an integral element of neural network-based models for natural language processing. Thus, the incorporation of new tasks would require the incorporation of new bases for the calculation of the loss function.
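To make concrete what a loss value based on both the detecting and the identifying could look like, the following Python sketch sums two cross-entropy terms, one per task. This is an assumed formulation offered only for illustration; the function names, the weighting parameter, and the additive form are hypothetical and are not asserted to be the loss function of Wang or of the instant application.

```python
import math


def cross_entropy(probs, gold):
    """Negative log-likelihood of the gold label under a predicted distribution."""
    return -math.log(probs[gold])


def joint_loss(detect_probs, detect_gold, ident_probs, ident_gold, weight=1.0):
    """Hypothetical joint loss: detection term plus weighted identification term."""
    detection_term = cross_entropy(detect_probs, detect_gold)
    identification_term = cross_entropy(ident_probs, ident_gold)
    return detection_term + weight * identification_term
```

Under this assumed form, adding a new task to the model adds a new term to the loss, which is the relationship the rationale above relies on.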
In regards to claims 9 and 18, Yin and Wang teach the elements of claims 1 and 11 as written above. Yin further teaches the determination of a start and an end position for the span based on a concatenation of the one or more received input words (Section 3.2.1, lines 28-30) and on a self-attention determination module for predicting the start position and the end position of the span (Section 3.2.1, line 1: the “mechanism” is construed as a “module”).
In regards to claims 10 and 19, Yin and Wang teach the elements of claims 1 and 11 as written above. Wang further teaches the identification of the dropped pronoun based on the probability value associated with the contextual representations as comprising identifying, based on the probability value (Section 3.1, lines 31-33), that the dropped pronoun would occur between two of the received input words (Section 3.1, lines 22-24).
Claims 5, 6, 8, 15, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Yin in view of Wang and Devlin et al. (hereinafter “Devlin”, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”).
In regards to claims 5 and 15, Yin and Wang teach the elements of claims 1 and 11 as written above. However, Yin and Wang do not teach the appending, by the computer, of a class token to the 
Devlin teaches the appending, by the computer, of a class token to the received input word data, wherein the appended class token corresponds to a start position associated with the received input words (Section 3, lines 53-55: “The first token of every sequence is always a special classification token”).
This class token is an integral feature of BERT, a language representation model described by Devlin. Devlin teaches that BERT is an improvement over previous language representation models because a single pre-trained model is able to be fine-tuned to a wide range of tasks without substantial architectural modifications (Abstract, lines 9-15).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yin to incorporate the teachings of Devlin to append a class token to the input data that corresponds to a start position with the received input words in the process of utilizing BERT. This would take advantage of BERT’s moldability in processing two related tasks (dropped pronoun recovery and dropped pronoun resolution) with one pre-trained model, as recognized by Devlin.
In regards to claim 6, Yin, Wang, and Devlin teach the elements of claim 5 as above. Devlin further teaches assigning a span of zero to the appended class token (Section 4.3, lines 6-8: Devlin teaches the assignment of an answer span that starts and ends at the CLS token for questions that do not have an answer. That is, manually assigning a span of zero to the CLS token where it is known that no answer exists simplifies the task).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yin to incorporate the teachings of Devlin to assign a span of zero to the class token in order to simplify the task, as recognized by Devlin.
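The zero-length span at the class token discussed above can be illustrated with a short Python sketch. It assumes per-token start and end scores with the class token at index 0, and it compares the class-token ("null") span score against the best non-null span. The function name and the omission of any decision threshold are simplifications made for this example and are not a reproduction of Devlin's implementation.

```python
def best_span(start_scores, end_scores):
    """Return (start, end) token indices of the highest-scoring span.

    Index 0 is assumed to hold the class token; the span (0, 0) is the
    zero-length "no antecedent / no answer" span.
    """
    best = (0, 0)
    best_score = start_scores[0] + end_scores[0]   # score of the null span
    n = len(start_scores)
    for i in range(1, n):
        for j in range(i, n):                      # end may not precede start
            score = start_scores[i] + end_scores[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best
```

When the class-token scores dominate, the function returns (0, 0), so that "nothing to resolve" is expressed in the same output format as an ordinary span.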
In regards to claims 8 and 17, Yin and Wang teach the elements of claims 1 and 11 as above. However, Yin and Wang do not teach the use of a BERT encoder to determine contextual representations.
Devlin teaches the use of a BERT encoder to determine contextual representations (Abstract, lines 6-9).
Devlin teaches that BERT is an improvement over previous language representation models because a single pre-trained model is able to be fine-tuned to a wide range of tasks without substantial architectural modifications (Abstract, lines 9-15).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yin to incorporate the teachings of Devlin to determine contextual representations by a BERT encoder. This would take advantage of BERT’s moldability in processing two related tasks (dropped pronoun recovery and dropped pronoun resolution) with one pre-trained model, as recognized by Devlin.
Claims 7 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Yin in view of Wang and Chen et al. (hereinafter "Chen", "Chinese Zero Pronoun Resolution with Deep Neural Networks").
Yin and Wang teach the elements of claims 1 and 11 as above. Wang further teaches the determination that the data corresponding to the received input words contains a dropped pronoun (Section 3.1, lines 19-34; Wang shows how each space between the input words is labelled as either having a zero pronoun or not (“’N’ denotes no ZP”); also Section 3, lines 4-13). Here, Wang describes the claimed step (detecting that the dropped pronoun was dropped) as a necessary element of dropped pronoun identification.
However, Yin and Wang do not teach the determination of a type associated with the dropped pronoun.

Chen teaches that the use of lexical features contributes significantly to the performance of anaphoric zero pronoun resolvers (Section 1, lines 48-53). Thus, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Yin to incorporate the teachings of Chen to determine a type associated with the dropped pronoun in order to improve the performance of dropped pronoun resolution.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Tan et al. ("Detecting and Translating Dropped Pronouns in Neural Machine Translation") teach the use of a neural network for dropped pronoun detection and recovery for the purpose of machine translation.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDER J KIM whose telephone number is (571) 272-4442. The examiner can normally be reached M-F 7:30 AM - 5:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached on (571) 272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ALEXANDER JOONGIE KIM/Examiner, Art Unit 2655
/ANDREW C FLANDERS/Supervisory Patent Examiner, Art Unit 2655