DETAILED ACTION
This is a response to Application # 16/006,691 filed on June 12, 2018 in which claims 1-20 were presented for examination.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
Claims 1-20 are pending, of which claims 1-9 and 16 are rejected under 35 U.S.C. § 112(b); claims 1, 2, 10, 11, 19, and 20 are rejected under 35 U.S.C. § 102(a)(2); and claims 3-8 and 12-17 are rejected under 35 U.S.C. § 103.

Information Disclosure Statement
The information disclosure statement filed February 19, 2019 fails to comply with the provisions of 37 C.F.R. § 1.97, 1.98 and MPEP § 609 because the non-patent literature submitted does not contain the proper bibliographic information as required by 37 C.F.R. § 1.98(b)(5). Specifically, 37 C.F.R. § 1.98(b)(5) states “[e]ach publication listed in an information disclosure statement must be identified by … title … [and] relevant pages of the publication.” (Emphasis added). The citations for NPL items 1 and 3 include page numbers that do not match those of the provided documents and the citation for NPL item 10 does not include any page numbers. Finally, the citation for NPL item 13 recites a title different from the provided document. It has been placed in the application file, but the information referred to therein has not been considered as to the merits. The remainder of the information disclosure statement complies with the provisions of 37 C.F.R. § 1.97, 1.98 and MPEP § 609, and has been placed in the application file and the information referred to therein has been considered as to the merits.

The information disclosure statement filed February 19, 2019 fails to comply with the provisions of 37 C.F.R. § 1.97, 1.98 and MPEP § 609 because the non-patent literature submitted does not contain the proper bibliographic information as required by 37 C.F.R. § 1.98(b)(5). Specifically, 37 C.F.R. § 1.98(b)(5) states “[e]ach publication listed in an information disclosure statement must be identified by … relevant pages of the publication.” (Emphasis added). The citation for NPL item 21 contains two conflicting sets of page numbers; the citations for NPL items 32, 37, and 49 do not include any page numbers; and the citations for NPL items 25, 28, 33, 41, 46, 49, and 50 include page numbers that do not match those of the provided documents. It has been placed in the application file, but the information referred to therein has not been considered as to the merits. The remainder of the information disclosure statement complies with the provisions of 37 C.F.R. § 1.97, 1.98 and MPEP § 609, and has been placed in the application file and the information referred to therein has been considered as to the merits.

The information disclosure statement filed February 19, 2019 fails to comply with the provisions of 37 C.F.R. § 1.97, 1.98 and MPEP § 609 because the non-patent literature submitted does not contain the proper bibliographic information as required by 37 C.F.R. § 1.98(b)(5). Specifically, 37 C.F.R. § 1.98(b)(5) states “[e]ach publication listed in an information disclosure statement must be identified by … relevant pages of the publication.” (Emphasis added). The citations for NPL items 32, 45, and 46 each contain two conflicting sets of page numbers; the citations for NPL items 6 and 20 do not include any page numbers; and the citations for NPL items 19, 22, 24, 26, 28, and 39 include page numbers that do not match those of the provided documents. It has been placed in the application file, but the information referred to therein has not been considered as to the merits. The remainder of the information disclosure statement complies with the provisions of 37 C.F.R. § 1.97, 1.98 and MPEP § 609, and has been placed in the application file and the information referred to therein has been considered as to the merits.

The information disclosure statement filed October 15, 2019 fails to comply with the provisions of 37 C.F.R. § 1.97, 1.98 and MPEP § 609 because the non-patent literature submitted does not contain the proper bibliographic information as required by 37 C.F.R. § 1.98(b)(5). Specifically, 37 C.F.R. § 1.98(b)(5) states “[e]ach publication listed in an information disclosure statement must be identified by … relevant pages of the publication.” (Emphasis added). The citation for NPL 6 recites “whole document,” which does not provide any page numbers as the examiner is unable to determine if the whole document has, in fact, been submitted. Additionally, NPL item 4 is duplicative of a reference provided on the IDS filed February 19, 2019 and, thus, has not been considered a second time.
Finally, 37 C.F.R. § 1.98(a)(2) requires a “legible copy of: (i) Each foreign patent.” Alternatively, “[w]hen the disclosures of two or more patents or publications listed in an information disclosure statement are substantively cumulative, a copy of one of the patents or publications as specified in paragraph (a) of this section may be submitted without copies of the other patents or publications, provided that it is stated that these other patents or publications are cumulative.” 37 C.F.R. § 1.98(c). However, the examiner can find neither a legible copy of Foreign Reference 1 nor a statement indicating that it is cumulative of another reference. 
It has been placed in the application file, but the information referred to therein has not been considered as to the merits. The remainder of the information disclosure statement complies with the provisions of 37 C.F.R. § 1.97, 1.98 and MPEP § 609, and has been placed in the application file and the information referred to therein has been considered as to the merits.

Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. § 119(e) or under 35 U.S.C. §§ 120, 121, or 365(c) is acknowledged. 

Drawings
The drawings are objected to because portions of the text in Figure 1 are small, unfocused, and difficult to read. Applicant should amend all text to be easily readable and reproducible in the drawings.
Corrected drawing sheets in compliance with 37 C.F.R. § 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 C.F.R. § 1.121(d). If the changes are not accepted by the Examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Interpretation
The following is a quotation of 35 U.S.C. § 112(f):
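(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.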

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. § 112(f) because the claim limitations use a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations are: the “multi-layer encoder,” the “multi-layer decoder,” the “pointer generator,” the “switch,” the “parallel self-attention encoders,” and the “self-attention decoder” in claims 1-9.
Because these claim limitations are being interpreted under 35 U.S.C. § 112(f), they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If Applicant does not intend to have these limitations interpreted under 35 U.S.C. § 112(f), Applicant may:  (1) amend the claim limitation(s) to avoid them being interpreted under 35 U.S.C. § 112(f) (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid them being interpreted under 35 U.S.C. § 112(f).

This application also includes one or more claim limitations that use the word “means” or “step” (or the equivalent thereof) but are nonetheless not being interpreted under 35 U.S.C. § 112(f) because the claim limitations recite sufficient structure, materials, or acts to entirely perform the recited function.  Such claim limitations are: the “parallel bi-directional long short term memories” and the “bi-directional long short term memory” in claims 4 and 6.
Because these claim limitations are not being interpreted under 35 U.S.C. § 112(f), they are not being interpreted to cover only the corresponding structure, material, or acts described in the specification as performing the claimed function, and equivalents thereof.
If Applicant intends to have these limitations interpreted under 35 U.S.C. § 112(f), Applicant may: (1) amend the claim limitations to remove the structure, materials, or acts that perform the claimed function; or (2) present a sufficient showing that the claim limitations do not recite sufficient structure, materials, or acts to perform the claimed function.

Additionally, the term “anti-curriculum learning” does not appear to be a known term of art. However, this term is defined by the present specification as situations where “the training sample is selected from those task types which are characterized as being more difficult to learn, have longer answer sequences, and/or involve different types of decoding.” (Spec ¶ 87). Should Applicant intend for a different meaning to apply to this term, the examiner recommends amending the claims to better define the intended meaning.

Claim Objections
Claims 1-18 are objected to for failing to comply with 37 C.F.R. § 1.75(g), which requires “[t]he least restrictive claim should be presented as claim number 1” (emphasis added). See also MPEP § 608.01(i). In the present application, the claim presented as claim number 10 is the least restrictive claim of the independent claims. This objection will be held in abeyance upon Applicant’s request.

Claim 16 is objected to because of the following informalities:  the claim limitation “further comprising training the multi-layer encoder and the multi-layer decoder against a subset of task types being training the multi-layer decoder and the multi-layer encoder against a full set of task types” (emphasis added) is grammatically incorrect. Appropriate correction is required.

Claim Rejections - 35 U.S.C. § 112
The following is a quotation of 35 U.S.C. § 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


Claims 1-9 and 16 are rejected under 35 U.S.C. § 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.

Regarding claims 1-9, the claim limitations “multi-layer encoder,” “multi-layer decoder,” “pointer generator,” “switch,” “parallel self-attention encoders,” and “self-attention decoder” recited or inherited in these claims invoke 35 U.S.C. § 112(f). However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. When a claim invokes 35 U.S.C. § 112(f) for a computer implemented means-plus-function claim, the specification must disclose the specific algorithm required to transform the general-purpose computing equipment into the required special purpose computer. See MPEP § 2181(II)(B).
Therefore, the claim is indefinite and is rejected under 35 U.S.C. § 112(b).
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. § 112(f); 

(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. § 132(a)).
If Applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. § 132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 C.F.R. § 1.75(d) and MPEP §§ 608.01(o) and 2181.

Regarding claims 7 and 16, these claims include the limitation “wherein the system is trained against a subset of task types before being trained against a full set of task types that the system is designed to process.” A broad limitation together with a narrow limitation that falls within the broad limitation (in the same claim) may be considered indefinite if the resulting claim does not clearly set forth the metes and bounds of the patent protection desired. See MPEP § 2173.05(c). In the present instance, the claims recite the broad recitation “wherein the system is trained against a subset of task 
If Applicant intends for these claims to require the system to be trained against a full set of task types, the examiner recommends amending these claims to “wherein the system is trained against a subset of task types wherein the system is further  after the system is trained against the subset of task types.”
If Applicant intends for these claims to not require the system to be trained against a full set of task types, the examiner recommends amending these claims to “wherein the system is trained against a subset of task types 
Finally, if Applicant does not intend to require the system to be trained against a full set of task types but does intend to place a timing requirement if the system were to be trained against a full set of task types, the examiner recommends making such a statement on the record and making a clarifying amendment such as “wherein the system is trained against a subset of task types before any training  occurs.”

Claim Rejections - 35 U.S.C. § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory 

The following is a quotation of the appropriate paragraphs of 35 U.S.C. § 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

 (a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 2, 10, 11, 19, and 20 are rejected under 35 U.S.C. § 102(a)(2) as being anticipated by Lao et al., US Publication 2018/0114108 (hereinafter Lao).

Regarding claim 1, Lao discloses a system for natural language processing, the system comprising “a multi-layer encoder for encoding first words from a context and second words from a question in parallel” (Lao ¶¶ 3, 21, 22, and 75) by encoding an input passage (Lao ¶ 21) that contains a sequence of words (Lao ¶ 22, i.e., a first and second word) that may be encoded in parallel (Lao ¶ 75) and may contain multiple layers (Lao ¶ 3). The present specification uses the term “context” to refer to the words that are around the selected words (Spec. ¶ 20 and Fig. 1), and thus, the sequence of words of Lao includes both context and words from the question. (See also Lao ¶ 34). Additionally, the terms “question” and “answer” appear to refer to the query and the query result, respectively, and, therefore, because the answer of Lao is a query, it is a “question” within the scope of the present invention.
Further, Lao discloses “a multi-layer decoder for decoding the encoded context and the encoded question” (Lao ¶¶ 3, 31) by decoding the encoded input sequence using a recurrent neural network (Lao ¶ 31) and indicating that recurrent neural networks may contain multiple layers (Lao ¶ 33). Moreover, Lao discloses “a pointer generator for generating distributions over the first words from the by generating a distribution for the tokens of the decoded input passage.
Likewise, Lao discloses “a switch for: generating a weighting of the distribution over the first words from the context, the distribution over the second words from the question, and the distribution over the third words in the vocabulary; generating a composite distribution based on the weighting of the distribution over the first words from the context, the distribution over the second words from the question, and the distribution over the third words in the vocabulary” (Lao ¶ 44) by generating a weighted average (i.e., a composite distribution) that comprises each of the entries weighted by time step (i.e., a weighting). Finally, Lao discloses “selecting words for inclusion in an answer using the composite distribution” (Lao ¶ 40) by selecting the highest scoring token as the answer to the query.
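For illustration only, the claimed “switch” limitation may be pictured with the following minimal Python sketch, which mixes three word distributions using a set of weights and then selects the highest scoring word. The function names, the dictionary representation, the example words, and the weight values are all hypothetical; the sketch is neither taken from Lao nor from the present specification and is not a construction of the claims.

```python
def composite_distribution(p_context, p_question, p_vocab, weights):
    """Mix three word distributions into one composite distribution.

    Each distribution is a dict mapping a word to its probability.
    `weights` is a (context, question, vocabulary) triple that sums to 1.
    All names and the dict representation are assumptions for illustration.
    """
    w_c, w_q, w_v = weights
    if abs(w_c + w_q + w_v - 1.0) > 1e-9:
        raise ValueError("switch weights must sum to 1")
    words = set(p_context) | set(p_question) | set(p_vocab)
    # A word absent from a distribution contributes probability 0 from it.
    return {
        word: w_c * p_context.get(word, 0.0)
        + w_q * p_question.get(word, 0.0)
        + w_v * p_vocab.get(word, 0.0)
        for word in words
    }


def select_word(distribution):
    """Return the word with the highest composite probability."""
    return max(distribution, key=distribution.get)


# Hypothetical example values, chosen only to make the arithmetic easy to follow.
p_context = {"paris": 0.7, "france": 0.3}
p_question = {"capital": 0.6, "france": 0.4}
p_vocab = {"the": 0.5, "paris": 0.2, "yes": 0.3}

mixed = composite_distribution(p_context, p_question, p_vocab, (0.5, 0.2, 0.3))
answer_word = select_word(mixed)
```

In this sketch the composite value for a word is simply the weighted sum of its probability under each source distribution, so the composite remains a valid distribution whenever the weights sum to one.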

Regarding claim 10, it merely recites the method performed by the system of claim 1. The method comprises executing computer software modules for performing the various functions. Lao comprises computer software modules for performing the same functions. Thus, claim 10 is rejected using the same rationale set forth in the above rejection for claim 1.

Regarding claim 19, it merely recites a non-transitory machine-readable medium for embodying the system of claim 1. The medium comprises computer software modules for performing the various functions. Lao comprises computer software modules for performing the same functions. Thus, claim 19 is rejected using the same rationale set forth in the above rejection for claim 1.

Regarding claims 2, 11, and 20, Lao discloses the limitations contained in parent claims 1, 10, and 19 for the reasons discussed above. In addition, Lao discloses “wherein the context and the question correspond to a natural language processing task type selected from question answering, machine where the task type is question answering.

Claim Rejections - 35 U.S.C. § 103
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


This application currently names joint inventors. In considering patentability of the claims, the Examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicants are advised of the obligation under 37 C.F.R. § 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. § 102(b)(2)(C) for any potential 35 U.S.C. § 102(a)(2) prior art against the later invention.

Claims 3, 5, 12, and 14 are rejected under 35 U.S.C. § 103 as being unpatentable over Lao in view of Xiong et al., Dynamic Coattention Networks for Question Answering, Published as a Conference Paper at the International Conference on Learning Representations. Toulon, France. April 24-26, 2017. pp. 1-14, as cited on the Information Disclosure Statement dated February 19, 2019 (hereinafter Xiong).

Regarding claim 3, Lao discloses the limitations contained in parent claim 1 for the reasons discussed above. In addition, Lao does not appear to explicitly disclose “wherein the multi-layer encoder comprises: a coattention network for determining a coattention between the first words in the context and the second words in the question; and parallel bi-directional long short term memories to compress outputs from the coattention layer.”
However, Xiong discloses a question and answering system “wherein the multi-layer encoder comprises: a coattention network for determining a coattention between the first words in the context and the second words in the question” (Xiong 1) by using a coattention network for determining coattention between the question and the interactions (i.e., context). Additionally, Xiong discloses “parallel bi-directional long short term memories to compress outputs from the coattention layer” (Xiong 3, Fig. 2) by providing a diagram with a series of parallel bi-directional LSTMs.
Lao and Xiong are analogous art because they are from the “same field of endeavor,” namely that of question and answering systems. 
Prior to the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Lao and Xiong before him or her to modify the attention of Lao to include the coattention layers of Xiong.
The motivation for doing so would have been that the use of a coattention system in question and answer systems has been shown to provide a more accurate result than previous methods. (Xiong 1).
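For illustration only, the general idea of a “coattention” between context words and question words may be pictured with the following simplified Python sketch. It computes an affinity score between every context word vector and every question word vector and then normalizes those scores in both directions. The function names and the toy two-dimensional vectors are hypothetical, and the sketch covers only this attention-weight step under stated assumptions; it is not the full network of Xiong and is not a construction of the claims.

```python
import math


def softmax(scores):
    """Normalize a list of scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def coattention_weights(context_vecs, question_vecs):
    """Return attention weights in both directions.

    affinity[i][j] is the dot product of context word i and question word j.
    The first result gives, for each context word, weights over question words;
    the second gives, for each question word, weights over context words.
    """
    affinity = [
        [sum(c * q for c, q in zip(c_vec, q_vec)) for q_vec in question_vecs]
        for c_vec in context_vecs
    ]
    context_to_question = [softmax(row) for row in affinity]
    question_to_context = [softmax(list(col)) for col in zip(*affinity)]
    return context_to_question, question_to_context


# Hypothetical toy encodings: three context words and two question words.
context_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
question_vecs = [[1.0, 0.0], [0.0, 2.0]]

c2q, q2c = coattention_weights(context_vecs, question_vecs)
```

The point of the sketch is only that the same affinity scores are normalized twice, once per context word and once per question word, so that attention is computed jointly over both sequences.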

Regarding claims 5 and 14, Lao discloses the limitations contained in parent claims 1 and 10 for the reasons discussed above. In addition, Lao discloses “wherein the multi-layer encoder comprises: parallel encoding layers for encoding the words in the context and words in the question in parallel” (Lao ¶¶ 3, 21, 22, and 75) by encoding an input passage (Lao ¶ 21) that contains a sequence of words (Lao ¶ 22, i.e., a first and second word) that may be encoded in parallel (Lao ¶ 75) and may contain multiple layers (Lao ¶ 3). The present specification uses the term “context” to refer to the words that are around the selected words (Spec. ¶ 20 and Fig. 1), and thus, the sequence of words of Lao includes both context and words from the question. (See also Lao ¶ 34).
Lao does not appear to explicitly disclose “parallel linear networks for projecting the encodings of the words in the context and the words in the question in parallel; and a bidirectional long short term memory for further encoding the projections of the encodings.”
However, Xiong discloses a question and answering system including “parallel linear networks for projecting the encodings of the words in the context and the words in the question in parallel” (Xiong 3, 6) by disclosing that the implementation system includes linear networks and that the operations may be performed in parallel. Additionally, Xiong discloses “a bidirectional long short term memory for further encoding the projections of the encodings” (Xiong 3, Fig. 2) by providing a diagram with a series of parallel bi-directional LSTMs.
Lao and Xiong are analogous art because they are from the “same field of endeavor,” namely that of question and answering systems. 
Prior to the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Lao and Xiong before him or her to modify the network of Lao to include the linear networks of Xiong.
The motivation for doing so would have been that the use of the system described by Xiong in question and answer systems has been shown to provide a more accurate result than previous methods. (Xiong 1).

Regarding claim 12, Lao discloses the limitations contained in parent claim 10 for the reasons discussed above. In addition, Lao does not appear to explicitly disclose “determining a coattention between the first words in the context and the second words in the question.”
However, Xiong discloses a question and answering system “determining a coattention between the first words in the context and the second words in the question” (Xiong 1) by using a coattention network for determining coattention between the question and the interactions (i.e., context). 
Lao and Xiong are analogous art because they are from the “same field of endeavor,” namely that of question and answering systems. 
Prior to the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Lao and Xiong before him or her to modify the attention of Lao to include the coattention layers of Xiong.
The motivation for doing so would have been that the use of a coattention system in question and answer systems has been shown to provide a more accurate result than previous methods. (Xiong 1).

Claims 4, 6, 13, and 15 are rejected under 35 U.S.C. § 103 as being unpatentable over Lao in view of Sebastian Ruder; Deep Learning for NLP Best Practices; July 25, 2017; ruder.io; Pages 1-25 (hereinafter Ruder).

Regarding claims 4 and 13, Lao discloses the limitations contained in parent claims 1 and 10 for the reasons discussed above. In addition, Lao does not appear to explicitly disclose “wherein the multi-layer encoder comprises: parallel self-attention encoders for generating an attention across the context and an attention across the question in parallel; and parallel bi-directional long short term memories for generating final encodings of the context and the question in parallel based on the generated attention.”
However, Ruder discloses a neural network “wherein the multi-layer encoder comprises: parallel self-attention encoders for generating an attention across the data …” (Ruder 10-11) by using a self-attention layer. Further, Ruder discloses “parallel bi-directional long short term memories for generating final encodings of the data … based on the generated attention” (Ruder 4-5, 12-13) by indicating that it is well-known to use LSTM on the encoded data and indicating that the LSTM may be bi-directional.
Further, a person of ordinary skill in the art prior to the effective filing date would have recognized that when Ruder was combined with Lao, the specific data, e.g., the context and the questions, of Lao would be operated on according to the neural network components of Ruder and that the neural network of Ruder would be operated in parallel as taught by Lao. Therefore, the combination of Lao and Ruder at least teaches and/or suggests the claimed limitations “wherein the multi-layer encoder comprises: parallel self-attention encoders for generating an attention across the context and an attention across the question in parallel; and parallel bi-directional long short term memories for generating final encodings of the context and the question in parallel based on the generated attention,” rendering them obvious.
Lao and Ruder are analogous art because they are from the “same field of endeavor,” namely that of neural networks. 
Prior to the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Lao and Ruder before him or her to modify the neural networks of Lao to include the self-attention layers and bi-directional LSTM memory of Ruder.
The motivation for doing so would have been that these practices are known to be “best practices” within the art. (Ruder 1). 

Regarding claims 6 and 15, Lao discloses the limitations contained in parent claims 1 and 10 for the reasons discussed above. In addition, Lao does not appear to explicitly disclose “wherein the multi-layer decoder comprises: an encoding and embedding layer for encoding and embedding an intermediate version of the answer; a self-attention decoder for generating an attention between the encoded and embedded intermediate version of the answer and a final encoding of the context; a long short term memory for generating an intermediate decoder state from outputs of the self-attention decoder; and a context and question attention network for generating context and question decoder states based on a final encoding of the context, a final encoding of the question, and the intermediate decoder state.”
However, Ruder discloses a neural network “wherein the multi-layer decoder comprises: an encoding and embedding layer for encoding and embedding an intermediate version of the data” (Ruder 4-5) by indicating that there are eight layers, which would mean that the result of each layer, except the final layer, would be an “intermediate” version. Additionally, Ruder discloses “a self-attention decoder for generating an attention between the encoded and embedded intermediate version of the data and a final encoding of the data.” (Ruder 10-11). Further, Ruder discloses “a long short term memory for generating an intermediate decoder state from outputs of the self-attention decoder” (Ruder 4-5, 12-13) by indicating that it is well-known to use LSTM on the encoded data. Finally, Ruder discloses “a data attention network for generating data decoder states based on a final encoding of the data, a final encoding of the data, and the intermediate decoder state” (Ruder 8) by disclosing that the decoder states are generated based on the current position and previous states (i.e., intermediate states).
Further, a person of ordinary skill in the art prior to the effective filing date would have recognized that when Ruder was combined with Lao, the specific data, e.g., the context and the questions, of Lao would be operated on according to the neural network components of Ruder. 
Lao and Ruder are analogous art because they are from the “same field of endeavor,” namely that of neural networks. 
Prior to the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Lao and Ruder before him or her to modify the neural networks of Lao to include the self-attention layers and bi-directional LSTM memory of Ruder.
The motivation for doing so would have been that these practices are known to be “best practices” within the art. (Ruder 1). 

Claims 7, 8, 16, and 17 are rejected under 35 U.S.C. § 103 as being unpatentable over Lao in view of Bengio, et al., Curriculum learning, In Proceedings of the 26th International Conference on Machine Learning, Montreal, Canada, 2009, 8 pages, (hereinafter Bengio), as cited on the Information Disclosure Statement dated February 19, 2019.

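Regarding claims 7 and 16, Lao discloses the limitations contained in parent claims 1 and 10 for the reasons discussed above. In addition, Lao does not appear to explicitly disclose “wherein the system is trained against a subset of task types before being trained against a full set of task types that the system is designed to process.”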
However, Bengio discloses a machine learning system including the requirement “wherein the system is trained against a subset of task types before being trained against a full set of task types that the system is designed to process” (Bengio 1) by using a curriculum learning strategy. 
Lao and Bengio are analogous art because they are from the “same field of endeavor,” namely that of machine learning systems. 
Prior to the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art, having the teachings of Lao and Bengio before him or her, to modify the training of Lao to include the curriculum learning of Bengio.
The motivation for doing so would have been that curriculum learning improves the speed and quality of the training process.  

Regarding claims 8 and 17, the combination of Lao and Bengio discloses the limitations contained in parent claims 7 and 16 for the reasons discussed above. In addition, the combination of Lao and Bengio discloses “wherein the subset of task types are selected according to a curriculum strategy.” (Bengio 1).

Allowable Subject Matter
Claim 18 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant's disclosure:
Guo et al., US Publication 2018/0082184, Artificial intelligence based question answering system.
Min et al., US Publication 2018/0121785, Artificial intelligence based question answering system.
Chang et al., US Publication 2018/0143978, Artificial intelligence based question answering system.
Xiong et al., US Publication 2018/0329884, Artificial intelligence based question answering system.
Lee et al., US Publication 2018/0336183, Artificial intelligence based question answering system.
Ke et al., US Publication 2018/0365321, Artificial intelligence based question answering system.
Miller et al., US Publication 2019/0005021, Artificial intelligence based question answering system.
Yuan et al., US Publication 2019/0043379, Artificial intelligence based question answering system.
Lei, US Publication 2019/0122101, Artificial intelligence based question answering system.
Cohen et al., US Publication 2019/0197154, Artificial intelligence based question answering system.
Bajaj et al., US Publication 2019/0228099, Artificial intelligence based question answering system.
Celikyilmaz et al., US Publication 2019/0287012, Artificial intelligence based question answering system.

Chiu et al., US Patent 10,281,885, Artificial intelligence based question answering system.
Merity et al., US Patent 10,565,493, Artificial intelligence based question answering system.
McCann et al., US Patent 10,776,581, Artificial intelligence based question answering system.
Flunkert et al., US Patent 10,936,947, Artificial intelligence based question answering system.
Nam et al.; Dual Attention Networks for Multimodal Reasoning and Matching; March 21, 2017; 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Pages 1-9; Artificial intelligence based question answering system.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANDREW R DYER whose telephone number is (571)270-3790.  The examiner can normally be reached on Monday-Friday 7:30-3:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kavita Stanley, can be reached on 571-272-8352.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic 

/ANDREW R DYER/ Primary Examiner, Art Unit 2176