DETAILED ACTION
This is a response to the Amendment to Application # 16/006,691 filed on May 16, 2022 in which claims 1, 3-7, 10, and 19 were amended.  

Continued Examination Under 37 C.F.R. § 1.114
A request for continued examination under 37 C.F.R. § 1.114, including the fee set forth in 37 C.F.R. § 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 C.F.R. § 1.114, and the fee set forth in 37 C.F.R. § 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 C.F.R. § 1.114. Applicant's submission filed on May 16, 2022 has been entered.
 
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
Claims 1-20 are pending, of which claims 1-4, 10-13, 19, and 20 are rejected under 35 U.S.C. § 102(a)(2) and claims 5-8 and 14-17 are rejected under 35 U.S.C. § 103.

Claim Objections
Claims 1-18 are objected to for failing to comply with 37 C.F.R. § 1.75(g), which requires “[t]he least restrictive claim should be presented as claim number 1” (emphasis added). See also MPEP § 608.01(i). In the present application, the claim presented as claim number 10 is the least restrictive claim of the independent claims. This objection will be held in abeyance upon Applicant’s request.
	

Claim Rejections - 35 U.S.C. § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. § 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-4, 10-13, 19, and 20 are rejected under 35 U.S.C. § 102(a)(2) as being anticipated by Cohen et al., US Publication 2019/0197154 (hereinafter Cohen), as cited on the Notice of References Cited dated October 25, 2021.

Regarding claim 1, Cohen discloses a system for natural language processing, the system comprising “a memory storing instructions; and a processor coupled with the memory and configured, when executing the instructions on the memory” (Cohen ¶ 7). Additionally, Cohen discloses the instructions “to cause the system to: encode first words from a context and second words from a question” (Cohen ¶¶ 52, 63, Fig. 9), where the features of the bar graph (i.e., the context) are encoded (Cohen ¶ 52) and shown to include words in the form of labels (Cohen Fig. 9), and where an example is given of encoding the question “what is the value of xyz?” (Cohen ¶ 63). Further, Cohen discloses “wherein the question is separate from but related to the context” (Cohen ¶ 18) where the question is about (i.e., related to) the graph but is not itself the graph. Moreover, Cohen discloses “the encodings performed in parallel” (Cohen ¶ 77) by indicating that any steps may be performed in parallel. Likewise, Cohen discloses “decode the encoded context and the encoded question” (Cohen ¶¶ 60, 86) where the results of the CNN are decoded (Cohen ¶ 86) and where the input of the CNN is detailed as being the encoded context and question (Cohen ¶ 60).
Cohen also discloses “generate, based on the decoded context and the decoded question, a first distribution over the first words from the context, a second distribution over the second words from the question, and a third distribution over third words in a vocabulary; generate a first weight of the first distribution, a second weight of the second distribution, and a third weight of the third distribution” (Cohen ¶¶ 51, 64, 86-88, 115 and Fig. 3) where a distribution is generated (Cohen ¶ 88) based on the results of the CNN (i.e., the decoded context and question, Cohen ¶ 86). Cohen discloses that the feature vector of the entire graph, which would include the labels (i.e., the first words), is weighted (Cohen ¶ 64); that the attention-weighted features include the words of the question (Cohen ¶ 115); and that a weighted distribution for the output (i.e., an answer) is used (Cohen ¶ 51). Finally, Cohen discloses “generate a composite distribution based on the first weight, second weight, and third weight; and select words for inclusion in an answer using the composite distribution” (Cohen ¶ 64) by combining the feature vectors to generate output.
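For illustration only, the final limitations of claim 1 (three distributions, three weights, a composite distribution, and word selection) can be read as a weighted mixture. The following is a minimal sketch under that assumed reading; it is not taken from Cohen or from the application, and every name and numeric value in it (composite_distribution, select_word, the example words, probabilities, and weights) is hypothetical.

```python
from collections import defaultdict


def composite_distribution(sources, weights):
    """Mix several word distributions into one composite distribution.

    sources: list of (words, probabilities) pairs, one pair per distribution
             (here: context words, question words, vocabulary words).
    weights: one non-negative weight per distribution, summing to 1.
    Assumption: the composite is a simple weighted sum; the claim language
    only says it is "based on" the three weights.
    """
    if len(sources) != len(weights):
        raise ValueError("need exactly one weight per distribution")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    mixed = defaultdict(float)
    for (words, probs), weight in zip(sources, weights):
        for word, prob in zip(words, probs):
            # A word that appears in several sources accumulates mass.
            mixed[word] += weight * prob
    return dict(mixed)


def select_word(dist):
    """Pick the highest-probability word from the composite distribution."""
    return max(dist, key=dist.get)


# Hypothetical example values, loosely following the "xyz" question in the text.
context = (["sales", "xyz", "2020"], [0.2, 0.7, 0.1])
question = (["what", "is", "the", "value", "of", "xyz"],
            [0.05, 0.05, 0.05, 0.15, 0.05, 0.65])
vocabulary = (["the", "value", "seven"], [0.3, 0.3, 0.4])

dist = composite_distribution([context, question, vocabulary], [0.5, 0.3, 0.2])
answer_word = select_word(dist)
```

Because each input distribution sums to 1 and the weights sum to 1, the composite also sums to 1, so it can be used directly to select answer words.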

Regarding claim 10, it merely recites the method performed by the system of claim 1. The method comprises executing computer software modules for performing the various functions. Cohen comprises computer software modules for performing the same functions. Thus, claim 10 is rejected using the same rationale set forth in the above rejection for claim 1.

Regarding claim 19, it merely recites a non-transitory machine-readable medium for embodying the system of claim 1. The medium comprises computer software modules for performing the various functions. Cohen comprises computer software modules for performing the same functions. Thus, claim 19 is rejected using the same rationale set forth in the above rejection for claim 1.

Regarding claims 2, 11, and 20, Cohen discloses the limitations contained in parent claims 1, 10, and 19 for the reasons discussed above. In addition, Cohen discloses “wherein the context and the question correspond to a natural language processing task type selected from question answering, machine translation, document summarization, database query generation, sentiment analysis, natural language inference, semantic role labeling, relation extraction, goal oriented dialogue, and pronoun resolution” (Cohen Abstract) where the context and the question correspond to question answering.

Regarding claims 3 and 12, Cohen discloses the limitations contained in parent claims 1 and 10 for the reasons discussed above. In addition, Cohen discloses “determine a coattention between the first words in the context and the second words in the question” (Cohen ¶ 23) by determining an attention model for the visualization and the query, which a person of ordinary skill in the art would understand to be a “coattention” model within the plain and ordinary meaning of the term, which is a model that focusses on both the image and the question.1

Regarding claims 4 and 13, Cohen discloses the limitations contained in parent claims 1 and 10 for the reasons discussed above. In addition, Cohen discloses “generate an attention across the context and an attention across the question in parallel; and generate final encodings of the context and the question in parallel based on the generated attention.” (Cohen ¶¶ 23, 77).

Claim Rejections - 35 U.S.C. § 103
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


This application currently names joint inventors. In considering patentability of the claims, the Examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicants are advised of the obligation under 37 C.F.R. § 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. § 102(b)(2)(C) for any potential 35 U.S.C. § 102(a)(2) prior art against the later invention.

Claims 5 and 14 are rejected under 35 U.S.C. § 103 as being unpatentable over Cohen in view of Xiong et al., Dynamic Coattention Networks for Question Answering, Published as a Conference Paper at the International Conference on Learning Representations. Toulon, France. April 24-26, 2017. pp. 1-14, as cited on the Information Disclosure Statement dated February 19, 2019 (hereinafter Xiong).

Regarding claims 5 and 14, Cohen discloses the limitations contained in parent claims 1 and 10 for the reasons discussed above. In addition, Cohen discloses “encode the words in the context and words in the question in parallel” (Cohen ¶ 77) by indicating that any steps may be performed in parallel.
Cohen does not appear to explicitly disclose “project the encodings of the words in the context and the words in the question in parallel; and further encode the projections of the encodings.”
However, Xiong discloses a question and answering system including “project the encodings of the words in the context and the words in the question in parallel” (Xiong 3, 6) by disclosing that the implementation system includes linear networks and that the operations may be performed in parallel. Additionally, Xiong discloses “further encode the projections of the encodings” (Xiong 3, Fig. 2) by providing a diagram with a series of parallel bi-directional LSTMs.
Cohen and Xiong are analogous art because they are from the “same field of endeavor,” namely that of question and answering systems. 
Prior to the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Cohen and Xiong before him or her, to modify the network of Cohen to include the linear networks of Xiong.
The motivation for doing so would have been that the use of the system described by Xiong in question and answer systems has been shown to provide a more accurate result than previous methods. (Xiong 1).

Claims 6 and 15 are rejected under 35 U.S.C. § 103 as being unpatentable over Cohen in view of Sebastian Ruder; Deep Learning for NLP Best Practices; July 25, 2017; ruder.io; Pages 1-25 (hereinafter Ruder), as cited on the Notice of References Cited dated October 25, 2021.

Regarding claims 6 and 15, Cohen discloses the limitations contained in parent claims 1 and 10 for the reasons discussed above. In addition, Cohen does not appear to explicitly disclose “encode and embed an intermediate version of the answer; generate an attention between the encoded and embedded intermediate version of the answer and a final encoding of the context; generate an intermediate decoder state from the generated attention; and generate context and question decoder states based on a final encoding of the context, a final encoding of the question, and the intermediate decoder state.”
However, Ruder discloses a neural network configured to “encode and embed an intermediate version of the data” (Ruder 4-5) by indicating that there are eight layers, which would mean that the result of each layer, except the final layer, would be an “intermediate” version. Additionally, Ruder discloses “generate an attention between the encoded and embedded intermediate version of the data and a final encoding of the data.” (Ruder 10-11). Further, Ruder discloses “generate an intermediate decoder state from the generated attention” (Ruder 4-5, 12-13) by indicating that it is well-known to use LSTM on the encoded data. Finally, Ruder discloses “generate data decoder states based on a final encoding of the data, a final encoding of the data, and the intermediate decoder state” (Ruder 8) by disclosing that the decoder states are generated based on the current position and previous states (i.e., intermediate states).
Further, a person of ordinary skill in the art prior to the effective filing date would have recognized that when Ruder was combined with Cohen, the specific data, e.g., the context and the questions, of Cohen would be operated on according to the neural network components of Ruder. Therefore, the combination of Cohen and Ruder at least teaches and/or suggests the claimed limitations “encode and embed an intermediate version of the answer; generate an attention between the encoded and embedded intermediate version of the answer and a final encoding of the context; generate an intermediate decoder state from the generated attention; and generate context and question decoder states based on a final encoding of the context, a final encoding of the question, and the intermediate decoder state,” rendering them obvious.
Cohen and Ruder are analogous art because they are from the “same field of endeavor,” namely that of neural networks. 
Prior to the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Cohen and Ruder before him or her, to modify the neural networks of Cohen to include the self-attention layers and bi-directional LSTM memory of Ruder.
The motivation for doing so would have been that these practices are known to be “best practices” within the art. (Ruder 1). 

Claims 7, 8, 16, and 17 are rejected under 35 U.S.C. § 103 as being unpatentable over Cohen in view of Bengio, et al., Curriculum learning, In Proceedings of the 26th International Conference on Machine Learning, Montreal, Canada, 2009, 8 pages, (hereinafter Bengio), as cited on the Information Disclosure Statement dated February 19, 2019.

Regarding claims 7 and 16, Cohen discloses the limitations contained in parent claims 1 and 10 for the reasons discussed above. In addition, Cohen does not appear to explicitly disclose “wherein the system is trained against a subset of task types, wherein the system is further trained against a full set of task types that the system is designed to process after the system is trained against the subset of task types.”
However, Bengio discloses a machine learning system including the requirement “wherein the system is further trained against a full set of task types that the system is designed to process after the system is trained against the subset of task types” (Bengio 1) by using a curriculum learning strategy. A person of ordinary skill in the art would understand a “curriculum” strategy to be a strategy that begins training with a small training set and then increases the difficulty of the training set in size and complexity until the system has been trained on everything. (Bengio, § 1 Introduction).
Cohen and Bengio are analogous art because they are from the “same field of endeavor,” namely that of machine learning systems. 
Prior to the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Cohen and Bengio before him or her, to modify the training of Cohen to include the curriculum learning of Bengio.
The motivation for doing so would have been that curriculum learning improves the speed and quality of the training process.  
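For illustration only, the two-phase training recited in claims 7 and 16 (a subset of task types first, then the full set) can be sketched as a simple schedule. This is a minimal sketch under stated assumptions: it orders training by task type as the claim language does, whereas the curriculum strategy described above is characterized in terms of training-set size and complexity. The function name, the round-robin choice within each phase, and the step counts are all hypothetical.

```python
def curriculum_schedule(all_tasks, starter_tasks, warmup_steps, total_steps):
    """Yield (step, task) pairs: the starter subset first, then the full set.

    Assumption: tasks are visited round-robin within each phase; neither the
    claims nor the cited reference prescribes this particular ordering.
    """
    starter = [task for task in all_tasks if task in starter_tasks]
    if not starter:
        raise ValueError("starter_tasks must include at least one known task")
    for step in range(total_steps):
        # Phase 1 draws only from the subset; phase 2 draws from every task.
        pool = starter if step < warmup_steps else all_tasks
        yield step, pool[step % len(pool)]


# Task type names are taken from the list quoted for claims 2, 11, and 20.
all_tasks = ["question answering", "machine translation", "document summarization"]
schedule = list(curriculum_schedule(all_tasks, {"question answering"},
                                    warmup_steps=2, total_steps=5))
```

In this sketch the first two steps train only on question answering, and the remaining three steps cycle through all three task types.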

Regarding claims 8 and 17, the combination of Cohen and Bengio discloses the limitations contained in parent claims 7 and 16 for the reasons discussed above. In addition, the combination of Cohen and Bengio discloses “wherein the subset of task types are selected according to a curriculum strategy.” (Bengio 1).
	
Allowable Subject Matter
Claims 9 and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Response to Arguments
Applicant’s request filed May 16, 2022, to hold the objections to claims 1-18 in abeyance (Remarks 8) has been noted.

Applicant’s arguments filed May 16, 2022, with respect to the objections to claims 1, 7, 10, 16, and 19 and the rejection of claims 1-9 under 35 U.S.C. § 112(b) (Remarks 8-9) have been fully considered and are persuasive. The objections to claims 1, 7, 10, 16, and 19 and the rejection of claims 1-9 under 35 U.S.C. § 112(b) have been withdrawn. 

Applicant’s arguments filed May 16, 2022, with respect to the rejection of claims 1-8, 10-17, 19, and 20 under 35 U.S.C. §§ 102 and 103 (Remarks 9-10) have been considered but are moot in view of the new ground(s) of rejection. 

	
Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant's disclosure:
Gal et al., US Publication 2011/0065082, System and method for implementing a question and answer system.
Perez, US Publication 2018/0137854, System and method for implementing a question and answer system.
Martin et al., US Publication 2018/0247549, System and method for implementing a question and answer system.
Shu et al., US Publication 2020/0066262, System and method for implementing a question and answer system.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANDREW R DYER whose telephone number is (571) 270-3790. The examiner can normally be reached Monday-Friday 7:30-3:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kavita Padmanabhan, can be reached at 571-272-8352. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ANDREW R DYER/Primary Examiner, Art Unit 2176

1 Jiasen Lu, Jianwei Yang, Dhruv Batra, and Devi Parikh; Hierarchical question-image co-attention for visual question answering; 2016; In Proceedings of the 30th International Conference on Neural Information Processing Systems (NIPS'16); Curran Associates Inc., Red Hook, NY, USA; Page 1.