Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) was submitted on 1/7/2019. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the Examiner.

Allowable Subject Matter
Claims 1-3 would be allowable if rewritten or amended to overcome the issues as set forth in this Office action.
The following is a statement of reasons for the indication of allowable subject matter: the prior art references do not explicitly teach the claim limitations.
Li teaches a meta-learner with a “two-level learning process” for classification task learning in a “meta space”. In this meta-learning process, a plurality of data batches is provided to each meta-learner using few-shot learning techniques. The process involves learning meta parameters, calculating empirical loss via gradient descent, and calculating prediction loss via mean squared error. The process also utilizes weight matrices that are initialized with a certain mean and standard deviation value. However, Li does not explicitly teach the generation of the first and second meta weight values, the comparison of those weight values to generate a slow weight value, storing the slow weight value in memory, the comparison steps to generate a third meta information and a fast weight, the transmission of the third meta information, the storing of the fast weight in memory, and the parameterizing of the meta weight values with the fast weight to update the slow weight. As such, the claim limitations are distinguishable from the reference.
Gupta teaches a “multipart artificial neural network” (ANN) for meta-learning to classify questions. Input data can be received by the ANN to train the ANN on auxiliary tasks to assign labels to the input and to make predictions related to relationships between the words. The meta-learning process uses supervised learning to make a prediction in a hypothesis space that includes computing a loss function with regards to the prediction. The process also includes calculating weight values for the ANN and its nodes. In addition, Gupta also teaches a memory device for storing information. However, Gupta does not explicitly teach the generation of the first and second meta weight values, the comparison of those weight values to generate a slow weight value, storing the slow weight value in memory, the comparison steps to generate a third meta information and a fast weight, the transmission of the third meta information, the storing of the fast weight in memory, and the parameterizing of the meta weight values with the fast weight to update the slow weight. As such, the claim limitations are distinguishable from the reference.
Vilalta teaches data classification via a classification system that uses a meta-learning process that includes model selection and generation of meta-rules and meta-features. The system receives a training domain data set and generates meta-features from the domain data set. Then, “the meta-learning model selection process 700 attempts to look for correlations among the meta-features to generate rules that show when to assign a specific model to a certain domain. The selection process 800 identifies the best matching rule to assign a model to a new domain. The meta-rules generation process 900 looks for correlations between meta features and models to determine when a model is best for a specific domain dataset.” The meta-learning process also involves calculating weighted distances between the training sample data sets to measure the variability of the class label associated with the training sample data sets. In addition, Vilalta also teaches a memory device for storing instructions. However, Vilalta does not explicitly teach the generation of the first and second meta weight values, the comparison of those weight values to generate a slow weight value, storing the slow weight value in memory, the comparison steps to generate a third meta information and a fast weight, the transmission of the third meta information, the storing of the fast weight in memory, and the parameterizing of the meta weight values with the fast weight to update the slow weight. As such, the claim limitations are distinguishable from the reference.
Ba teaches training of a neural network with classification tasks, generation of slow and fast weights in a neural network, and a storing or caching of the slow and fast weights in relation to memory. However, Ba does not explicitly teach the meta-learning aspects such as the meta space, the generation of the first and second meta information values, the generation of the first and second meta weight values, the comparison of those weight values to generate a slow weight value, the comparison steps to generate a third meta information and a fast weight, the transmission of the third meta information, and the parameterizing of the meta weight values with the fast weight to update the slow weight. As such, the claim limitations are distinguishable from the reference.

Claims 4-14 would be allowable if rewritten or amended to overcome the issues as set forth in this Office action.
The following is a statement of reasons for the indication of allowable subject matter: the prior art references do not explicitly teach the claim limitations. 
Li teaches a meta-learner with a “two-level learning process” for classification task learning in a “meta space”. In this meta-learning process, a plurality of data batches is provided to each meta-learner using few-shot learning techniques. The process involves learning meta parameters, calculating empirical loss via gradient descent, and calculating prediction loss via mean squared error. The process also utilizes weight matrices that are initialized with a certain mean and standard deviation value. However, Li does not explicitly teach the first and second meta weights, generation of the representation loss and the representation loss gradients, generating the first fast weight, generating the task loss and the task loss gradient, mapping the task loss gradient through parameterizing by a second meta weight, integration via parameterizing of the respective slow weights and fast weights, generating a third fast weight via reading the weight memory with soft attention, generating the training loss via parameterizing with integration of the second slow and fast weights, and updating the various weights using the training loss and loss gradient associated with the respective weights. As such, the claim limitations are distinguishable from the reference.
Duan teaches a meta-learning framework that utilizes a one-shot imitation learning technique. The technique involves using a soft attention module with attention weights and inputs comprising a query, context vectors, and memory vectors. The outputs comprise “a weighted combination of the memory content, where the weights are given by a softmax operation over the attention weights.” Duan also teaches computing a cross-entropy loss related to an action and a demonstration of a task. However, Duan does not explicitly teach the first and second meta weights, generation of the representation loss and the representation loss gradients, generating the first fast weight, generating the task loss and the task loss gradient, mapping the task loss gradient through parameterizing by a second meta weight, integration via parameterizing of the respective slow weights and fast weights, generating a third fast weight, generating the training loss via parameterizing with integration of the second slow and fast weights, and updating the various weights using the training loss and loss gradient associated with the respective weights. As such, the claim limitations are distinguishable from the reference.
Gupta teaches a “multipart artificial neural network” (ANN) for meta-learning to classify questions. Input data can be received by the ANN to train the ANN on auxiliary tasks to assign labels to the input and to make predictions related to relationships between the words. The meta-learning process uses supervised learning to make a prediction in a hypothesis space that includes computing a loss function with regards to the prediction. The process also includes calculating weight values for the ANN and its nodes. In addition, Gupta also teaches a memory for storing information. However, Gupta does not explicitly teach the first and second meta weights, generation of the representation loss and the representation loss gradients, generating the first fast weight, generating the task loss and the task loss gradient, mapping the task loss gradient through parameterizing by a second meta weight, integration via parameterizing of the respective slow weights and fast weights, generating a third fast weight via reading the weight memory with soft attention, generating the training loss via parameterizing with integration of the second slow and fast weights, and updating the various weights using the training loss and loss gradient associated with the respective weights. As such, the claim limitations are distinguishable from the reference.
Vilalta teaches data classification via a classification system that uses a meta-learning process that includes model selection and generation of meta-rules and meta-features. The system receives a training domain data set and generates meta-features from the domain data set. Then, “the meta-learning model selection process 700 attempts to look for correlations among the meta-features to generate rules that show when to assign a specific model to a certain domain. The selection process 800 identifies the best matching rule to assign a model to a new domain. The meta-rules generation process 900 looks for correlations between meta features and models to determine when a model is best for a specific domain dataset.” The meta-learning process also involves calculating weighted distances between the training sample data sets to measure the variability of the class label associated with the training sample data sets. In addition, Vilalta also teaches a memory for storing instructions. However, Vilalta does not explicitly teach the first and second meta weights, generation of the representation loss and the representation loss gradients, generating the first fast weight, generating the task loss and the task loss gradient, mapping the task loss gradient through parameterizing by a second meta weight, integration via parameterizing of the respective slow weights and fast weights, generating a third fast weight via reading the weight memory with soft attention, generating the training loss via parameterizing with integration of the second slow and fast weights, and updating the various weights using the training loss and loss gradient associated with the respective weights. As such, the claim limitations are distinguishable from the reference.
Ba teaches training of a neural network with classification tasks, generation of slow and fast weights in a neural network, and a storing or caching of the slow and fast weights in relation to memory. However, Ba does not explicitly teach the first and second meta weights, generation of the representation loss and the representation loss gradients, generating the task loss and the task loss gradient, mapping the task loss gradient through parameterizing by a second meta weight, integration via parameterizing of the respective slow weights and fast weights, generating a third fast weight via reading the weight memory with soft attention, generating the training loss via parameterizing with integration of the second slow and fast weights, and updating the various weights using the training loss and loss gradient associated with the respective weights. As such, the claim limitations are distinguishable from the reference.
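For illustration only, the soft attention read quoted from Duan above (“a weighted combination of the memory content, where the weights are given by a softmax operation over the attention weights”) can be sketched as follows. This is a minimal sketch and not a characterization of the actual implementation of Duan or of the claimed invention; the function names, the list-based data layout, and the example values are hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw attention scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention_read(attention_scores, memory):
    """Return the softmax-weighted combination of the memory rows.

    attention_scores: one raw score per memory row (hypothetical input).
    memory: list of equal-length vectors, one per memory slot.
    """
    weights = softmax(attention_scores)
    dim = len(memory[0])
    return [
        sum(w * row[i] for w, row in zip(weights, memory))
        for i in range(dim)
    ]


# Equal scores give equal weights, so the read is the average of the rows.
read = soft_attention_read([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

With equal scores, each memory row receives a weight of 0.5, so the result is the element-wise average of the two rows.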

Claim 16 would be allowable if rewritten to overcome the issues as set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:  claim 16 recites that the meta and base learner modules cooperatively integrate a first slow weight and a first fast weight using an augmentation layer approach, wherein an input of an augmentation layer is first transformed by the slow and fast weights, then passed through a non-linearity resulting in separate activation vectors, then the activation vectors are aggregated by an element-wise vector addition. 
Jeng teaches the modules for the meta learner and base learner, but it does not explicitly teach an integration of the slow and fast weights using an augmentation layer approach, and the transformation, passing, and aggregating of the vectors via element-wise addition. 
Vinyals teaches meta-learning, non-linearity via rectified linear unit (ReLU) activation, and memory augmented neural networks, but it does not explicitly teach base learner module, slow and fast weights, an integration of the slow and fast weights using an augmentation layer approach, and the transformation, passing, and aggregating of the vectors via element-wise addition. 
Ba teaches slow and fast weights, augmentation layer approach, non-linearity via rectified linear unit (ReLU) activation, vectors, and vector addition, but it does not explicitly teach meta and base learner modules, an integration of the slow and fast weights using an augmentation layer approach, that an input is first transformed by the slow and fast weights before being passed through the non-linearity to result in separate activation vectors, and then followed by an aggregation of those activation vectors via element-wise vector addition.
Therefore, the claim limitations are distinguishable from the cited references. 
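For illustration only, the augmentation layer approach recited in claim 16 (the input is first transformed by the slow and fast weights, each result is passed through a non-linearity to produce separate activation vectors, and the activation vectors are aggregated by element-wise vector addition) can be sketched as follows. This is a minimal sketch under stated assumptions and not a characterization of Applicant's actual implementation: ReLU is assumed as the non-linearity, the weights are assumed to be plain matrices, and all function names and example values are hypothetical.

```python
def relu(vector):
    """Assumed non-linearity: rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in vector]


def matvec(weights, x):
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def augmented_layer(x, slow_weights, fast_weights):
    """Integrate slow and fast weights in a single layer.

    1. Transform the input separately by the slow and the fast weights.
    2. Pass each result through the non-linearity (separate activations).
    3. Aggregate the two activation vectors by element-wise addition.
    """
    slow_activation = relu(matvec(slow_weights, x))
    fast_activation = relu(matvec(fast_weights, x))
    return [s + f for s, f in zip(slow_activation, fast_activation)]


# Hypothetical example values.
x = [1.0, -2.0]
slow = [[1.0, 0.0], [0.0, 1.0]]
fast = [[0.0, -1.0], [1.0, 1.0]]
output = augmented_layer(x, slow, fast)
```

In this sketch the non-linearity is applied to each branch before the addition, which is the ordering the claim recites; summing the two transformed inputs first and applying the non-linearity once would be a different operation.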

Claim Objections
Claim 4 is objected to because of the following informalities: in step (i) the limitation recites “representation loss associated using”. The limitation lacks a connecting term, e.g., “with”. The claim can be amended as follows: “representation loss associated with using”. Appropriate correction is required.

Claim 13 is objected to because of the following informalities: in step (i) the limitation recites “representation loss associated using”. The limitation lacks a connecting term, e.g., “with”. The claim can be amended as follows: “representation loss associated with using”. Appropriate correction is required.

Claim 20 is objected to because of the following informalities: there should be a period at the end of the claim. The claim should be amended like so: “(MLP).”. Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: meta learner module and base learner module in claims 15, 16, 19, and 20. Claims 17 and 18 are also interpreted under §112(f) by virtue of their dependency. 
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 15-20 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. 
Claims 15, 16, 19, and 20 invoke 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph by reciting a meta learner module and a base learner module. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. The modules are only described in specification [0028] as “two main learning modules” in correlation with a method, but with no description of any relevant corresponding structure. It has been recognized by the courts that “merely restating a function associated with a means-plus-function limitation is insufficient to provide the corresponding structure for definiteness” (see MPEP 2181(IV)). Accordingly, claims 15, 16, 19, and 20 are rejected for lack of written description. Claims 17 and 18 are also rejected by virtue of their dependency and they do not provide any further clarification on the issue.

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 recites the limitation “the steps” in the preamble on line 5. There is insufficient antecedent basis for this limitation in the claim.

Claim 1 recites the limitation “the memory” in step j. However, it is not clear whether this memory is referring back to the memory in the preamble or in step f since both recite a separate instance of “a memory”. As such, clarification is sought as to which instance of memory is being referred back to and whether Applicant intended for the memory in the preamble and step f to be different from or the same as each other. Appropriate correction is required. 
Claim 1 also recites the limitation “the input task data set” in step k. However, it is not clear whether this input task data set is referring back to the input task data set in the preamble or in step g since both recite a separate instance of “input task data set”. As such, clarification is sought as to which instance of input task data set is being referred back to and whether Applicant intended for the input task data set in the preamble and step g to be different from or the same as each other. Appropriate correction is required.
Claims 2-3 are rejected by virtue of their dependency from claim 1 and because they do not provide further clarification on the issues.

Claim 2 recites the limitation “the first and second meta values” in line 5. There is insufficient antecedent basis for this limitation in the claim. While it is likely that Applicant intended to refer back to claim 1 for antecedent basis, the antecedent basis is still insufficient because claim 1 does not recite “first and second meta values”. Rather, claim 1 recites in step k “the first and second meta weight values”. Appropriate correction is required.
Claim 3 is rejected by virtue of its dependency from claim 2 and because it does not provide further clarification on the issue.

Claim 4 recites the limitation “the steps” in the preamble on line 5. There is insufficient antecedent basis for this limitation in the claim.
Claim 4 also recites the limitation “the task-dependent input representation” in step (iv). There is insufficient antecedent basis for this limitation in the claim. It is likely that Applicant intended to recite: “the first task-dependent input representation”.
Claims 5-12 are rejected by virtue of their dependency from claim 4 and because they do not provide further clarification on the issues.

Claim 5 recites the limitation “(i) the integration”. However, it is not clear whether this integration is referring back to the integration in claim 4 in step (iv) or in step d(i) since both recite a separate instance of “an integration of the first slow weight and the first fast weight”. As such, clarification is sought as to which instance of integration is being referred back to and whether Applicant intended for the integration in step (iv) and in step d(i) to be different from or the same as each other. Appropriate correction is required.
Claim 6 is rejected by virtue of its dependency from claim 5 and because it does not provide further clarification on the issue.

Claim 13 recites the limitation “the task-dependent input representation” in step (iv). There is insufficient antecedent basis for this limitation in the claim. It is likely that Applicant intended to recite: “the first task-dependent input representation”.
Claim 14 is rejected by virtue of its dependency from claim 13 and because it does not provide further clarification on the issue.

Claim 19 recites the limitation “from the set of examples”. There is insufficient antecedent basis for this limitation in the claim. While it is likely that Applicant intended to refer back to claim 15 for antecedent basis, the antecedent basis is still insufficient because claim 15 does not recite a “set of examples” on its own. Rather, claim 15 recites in step b(i) “a support set of examples” and in step b(iii) “a training set of examples”.
In addition, since there are two separate instances of “set of examples”, it is not clear whether this set is referring back to the set in step b(i) or step b(iii) of claim 15. As such, the claim must be amended to recite the appropriate antecedent basis and clarification is sought as to which instance of set of examples is being referred back to.

Claim limitations “a meta learner module” and “a base learner module” in claims 15, 16, 19, and 20 invoke 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. 
The modules are only described in specification [0028] as “two main learning modules” in correlation with a method, but with no description of any relevant corresponding structure. It has been recognized by the courts that “merely restating a function associated with a means-plus-function limitation is insufficient to provide the corresponding structure for definiteness” (see MPEP 2181(IV)).
Therefore, claims 15, 16, 19, and 20 are indefinite and are rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph. Claims 17 and 18 are also rejected by virtue of their dependency and because they do not provide any further clarification on the issue.
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 
(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1, 4, 7, 9-13, 15, and 17-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 

Step 1 for all claims
	Under the first part of the analysis, claims 1, 4, 7, and 9-12 recite a method, claim 13 recites a non-transitory computer readable medium, and claims 15 and 17-19 recite a system. Accordingly, these claims fall within the four statutory categories and the analysis now proceeds to Step 2A, Prongs 1 and 2 and then Step 2B.
 
Claim 1
Step 2A, prong 1: the following limitations recite mental processes:
“A method of classifying an input task data set by meta level continual learning, …
a) analyzing a first training data set to thereby generate a first meta information value in a task space; 
b) assigning the first meta information value to the first training data set to generate a first meta weight value in a meta space; 
c) analyzing a second training data set that is distinct from the first training data set to generate a second meta information value in the task space; 
d) assigning the second meta information value to the second training data set to generate a second meta weight value in the meta space; 
e) comparing the first meta weight value and the second meta weight value to generate a slow weight value; …
g) comparing an input task data set to the slow weight value to generate a third meta information value in the task space; …
i) comparing the third meta information value to the slow weight value to generate a fast weight value in the meta space; … and
k) parameterizing the first and second meta weight values with the fast weight value to update the slow weight value, whereby a value is associated with the input task data set, thereby classifying the input task data set by meta level continual learning.”
The above limitations describe mental processes because, under a broadest reasonable interpretation (BRI), they involve: classifying input data; analyzing first and second training data sets; assigning first and second meta information values; comparing the first and second meta weight values to generate a slow weight value; comparing an input task data set to the slow weight value to generate a third meta information value; comparing the third meta information value to the slow weight value to generate a fast weight value; and parameterizing the first and second meta weight values, which enables classifying the input task data set. 
Thus, the claim recites mental processes based on observations, evaluations, judgments, or opinions that are performable in the human mind or with the aid of pencil and paper (see MPEP 2106.04(a)(2)(III)). Indeed, the claim limitations mainly relate to classifying, analyzing, assigning, comparing, and parameterizing various types of data and values, which are processes that are based on observations, evaluations, judgments, or opinions and thus are performable in the human mind or with the aid of pencil and paper. Here, parameterizing merely denotes a description or representation of the weight values with the fast weight, which is a process that can also be performed mentally or with the aid of pencil and paper. Furthermore, the generation of the various information and weight values from the analyzing and comparing steps also denotes mental processes because one can, mentally or with the aid of pencil and paper, generate values by analyzing and comparing the corresponding data. That is, the generation steps are also based on observations, evaluations, judgments, or opinions and thus denote mental processes. 
As such, these limitations are conceivably performed mentally or with the aid of paper and pencil and thus are considered as mental processes.
Step 2A, prong 2: the following limitations recite additional elements:
“…by a processor and a memory with computer code instructions stored thereon, the memory operatively coupled to the processor such that, when executed by the processor, the computer code instructions cause the system to implement the method, the method comprising the steps of: …
f) storing the slow weight value in a memory that is accessible by the task space and the meta space; …
 h) transmitting the third meta information value from the task space to the meta space; …
j) storing the fast weight in the memory; ….”
The preamble recites additional elements related to mere instructions for applying the judicial exception, i.e. the steps related to performing the mental processes of classifying an input task data set, on a generic computing device such as the memory and processor with computer code instructions (see MPEP 2106.05(f)). The recitation of the memory and processor with computer code instructions denotes a generic computing environment (see MPEP 2106.05(h)) and does not amount to anything more than the use of the generic computing environment to perform the mental processes (see MPEP 2106.04(a)(2)(III)(C)). 
The limitations describing storing the slow weight value and storing the fast weight value recite, at a high level of generality, the additional element of an insignificant extra-solution activity related to mere data storage (see MPEP 2106.05(g)). The limitation describing transmitting the third meta information value recites, at a high level of generality, the additional element of an insignificant extra-solution activity related to mere data output (see MPEP 2106.05(g)). 
Thus, the limitations taken together do not integrate the judicial exception into a practical application. 
Step 2B: the limitations recited above do not amount to significantly more than the judicial exception. As stated above, the preamble reciting the processor and the memory with computer code instructions relates to mere instructions to apply the judicial exception on generic computing devices. Such application does not amount to significantly more than the judicial exception because the use of generic computing tools to execute the instructions for the judicial exception does not denote anything significantly more than the judicial exception (see MPEP 2106.05(f)). 
Furthermore, specifying the utilization of the memory and processor with computer code instructions is an implementation in a generic computer environment, which was held in FairWarning IP, LLC v. Iatric Sys., Inc. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)).
Additionally, the limitations reciting storing the slow and fast weight values and transmitting the third meta information value denote mere data storage and data output indicative of insignificant extra-solution activities (see MPEP 2106.05(g)). The courts have held that “receiving or transmitting data over a network” and “storing and retrieving information in memory” are well-understood, routine, and conventional activities when recited at a high level of generality (see MPEP 2106.05(d)(II)). 
As such, the limitations do not amount to significantly more than the judicial exception. 

Claim 4
Step 2A, prong 1: the following limitations recite mental processes with mathematical concepts:
“a) for each of a set of T support examples from a set of N support examples, N and T being integers,
i) generating a representation loss associated using a representation learning function parameterized by a first slow weight (Q), and 
ii) generating a representation loss gradient based on the representation loss and a loss gradient associated with the first slow weight (Q); 
b) generating a first fast weight by evaluating a first generating function parameterized by a first meta weight (G) and the loss gradients associated with the first slow weights generated for the T support examples; 
c) for each of the set of N support examples, 
i) generating a task loss using a base learning function parameterized by a second slow weight (W), 
ii) generating a task loss gradient based on the task loss and a loss gradient associated with the second slow weight (W), 
iii) mapping the task loss gradient, through a second generating function parameterized by a second meta weight (Z), to a second fast weight, and …, and

iv) generate a first task-dependent input representation using the representation learning function parameterized by an integration of the first slow weight and the first fast weight, and…; 
d) for each of a set of L training examples, L being an integer, 
i) generate a second task-dependent input representation using the representation learning function parameterized by an integration of the first slow weight and the first fast weight, 
ii)…, using the second task-dependent input representation, to generate a third fast weight, and 
iii) generate a training loss using a base learning function parameterized by an integration of the second slow weight and the second fast weight, added to a previous training loss; and 
e) update the first slow weight, the second slow weight, the first meta weight and the second meta weight using the training loss and a loss gradient associated with the first slow weight, the second slow weight, the first meta weight and the second meta weight.”
The above limitations describe mental processes because, under a broadest reasonable interpretation (BRI), they involve: generating a representation loss and a representation loss gradient for support examples; generating a first fast weight by evaluating a first generating function; generating a task loss and a task loss gradient for the support examples; mapping the task loss gradient for the support examples; generating a first task-dependent input representation for the support examples; generating a second task-dependent input representation for training examples; generating a third fast weight from reading memory for the training examples; generating a training loss for the training examples; and updating the various weights. 
Thus, the claim recites mental processes based on observations, evaluations, judgments, or opinions that are performable in the human mind or with the aid of pencil and paper (see MPEP 2106.04(a)(2)(III)). Indeed, the claim limitations mainly relate to generating various types of losses and loss gradients, mapping loss gradient, and updating various weights, which are processes that are based on observations, evaluations, judgments, or opinions and thus are performable in the human mind or with the aid of pencil and paper. In addition, the recitation of parameterizing merely denotes a description or representation of the weight values, which is a process that can also be performed mentally or with the aid of pencil and paper. As such, these limitations are conceivably performed mentally or with the aid of paper and pencil.
Furthermore, these limitations also recite mathematical concepts because, as described above, the description of loss gradients and learning functions based on integration of various weight values relate to mathematical relationships and calculations (see MPEP 2106.04(a)(2)(I)). That is, the limitations denote mental processes with mathematical concepts because the limitations are based on observations, evaluations, judgments, or opinions related to the mathematical concepts that are performable in the human mind or with the aid of pencil and paper.
 As such, the limitations denote mental processes with mathematical concepts. 
Step 2A, prong 2: the following limitations recite additional elements:
“A method of facilitating one-shot learning in a neural network, by a processor and an instruction memory with computer code instructions stored thereon, the instruction memory operatively coupled to the processor such that, when executed by the processor, the computer code instructions cause the system to implement the method, the method comprising the steps of: …
… storing the second fast weight in a weight memory …
… indexing the weight memory with the task-dependent input representation …
… reading the weight memory with soft attention ….”
The preamble recites additional elements related to mere instructions for applying the judicial exception on a generic computing device such as the memory and processor with computer code instructions (see MPEP 2106.05(f)). The recitation of the memory and processor with computer code instructions denotes a generic computing environment (see MPEP 2106.05(h)). 
While the preamble recites “facilitating one-shot learning in a neural network”, this is insufficient to integrate the claim limitations into a practical application because, aside from this statement in the preamble, the claim limitations do not describe a neural network or a one-shot learning technique. Rather, the limitations describe various generation steps, a mapping step, a reading step, and an updating step. As such, the limitations do not provide a concrete tie between these steps and the facilitation of one-shot learning in the neural network. Indeed, the limitations as recited might not have any relation to a neural network or one-shot learning in a neural network at all because one can apply the limitations simply to model various data and to generate the losses associated with various weight values. Thus, when reading the preamble in the context of the entire claim, the recitation “facilitating one-shot learning in a neural network” is not limiting because the body of the claim describes a complete invention and the language recited solely in the preamble does not provide any distinct definition of any of the claimed invention’s limitations. Therefore, the preamble of the claim(s) is not considered a limitation and is of no significance to claim construction. See Pitney Bowes, Inc. v. Hewlett-Packard Co., 182 F.3d 1298, 1305, 51 USPQ2d 1161, 1165 (Fed. Cir. 1999). See MPEP § 2111.02. As such, the recitation of “facilitating one-shot learning in a neural network” in the preamble does not integrate the claim into a practical application.
The limitation describing storing the second fast weight value recites, at a high level of generality, the additional element of an insignificant extra-solution activity related to mere data storage (see MPEP 2106.05(g)). The limitations describing indexing and reading the weight memory relate to mere instructions for applying the judicial exception (see MPEP 2106.05(f)). 
Thus, the limitations taken together do not integrate the judicial exception into a practical application. 
Step 2B: the limitations recited above do not amount to significantly more than the judicial exception. As stated above, the preamble and the limitations describing indexing and reading the weight memory relate to mere instructions to apply the judicial exception on generic computing devices. Such application does not amount to significantly more than the judicial exception because the use of generic computing tools to execute the instructions for the judicial exception does not denote anything significantly more than the judicial exception (see MPEP 2106.05(f)). 
Furthermore, specifying the utilization of the memory and processor with computer code instructions is an implementation in a generic computer environment, which was held in FairWarning IP, LLC v. Iatric Sys., Inc. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)).
Additionally, the limitation reciting storing the second fast weight value denotes mere data storage indicative of an insignificant extra-solution activity (see MPEP 2106.05(g)). The courts have held that “storing and retrieving information in memory” is a well-understood, routine, and conventional activity when recited at a high level of generality (see MPEP 2106.05(d)(II)). 
As such, the limitations do not amount to significantly more than the judicial exception. 

Claim 7
Step 2A, prong 1: the claim inherits the mental processes with mathematical concepts from the independent claim. The claim does not recite additional abstract ideas. 
Step 2A, prong 2: the claim recites the additional element: “wherein the set of N support examples comprise class labels”. The composition of the N support examples denotes a field of use (see MPEP 2106.05(h)). 
Step 2B: the limitation recited above does not amount to significantly more than the judicial exception because it merely denotes a field of use related to the N support examples (see MPEP 2106.05(h)).

Claim 9
Step 2A, prong 1: the claim recites a mental process with a mathematical concept: “wherein generating the task loss further comprises utilizing a loss function capable of capturing a representation learning objective”. 
The limitation denotes a mental process because, under a broadest reasonable interpretation (BRI), it involves generating the task loss, which is a process that is based on observations, evaluations, judgments, or opinions that are performable in the human mind or with the aid of pencil and paper. Wherein the loss function capable of capturing a representation learning objective denotes a mathematical relationship (see MPEP 2106.04(a)(2)(I)). As such, the limitation denotes a mental process with a mathematical concept because the limitation is based on observations, evaluations, judgments, or opinions related to the mathematical concept that is performable in the human mind or with the aid of pencil and paper.
Step 2A, prong 2: the claim does not recite any additional elements that integrate the judicial exception into a practical application.
Step 2B: the claim does not recite any additional elements that amount to significantly more than the judicial exception.

Claim 10
Step 2A, prong 1: the claim recites a mathematical concept: “wherein the loss function is a cross-entropy loss function when the set of N support examples has a single example per class”. The limitation denotes a mathematical relationship because a cross-entropy loss function relates to a mathematical relationship or calculation or formula (see MPEP 2106.04(a)(2)(I)). As such, the limitation denotes a mathematical concept.
Step 2A, prong 2: the claim does not recite any additional elements that integrate the judicial exception into a practical application.
Step 2B: the claim does not recite any additional elements that amount to significantly more than the judicial exception.
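For illustration only, and without limiting the claim, the mathematical relationship denoted by a cross-entropy loss function can be sketched as follows. The function names, the softmax step, and the sample values are assumptions made for this sketch and are not taken from the claims or the specification.

```python
import math

def softmax(logits):
    # Normalize raw scores into probabilities (shifted by the max for stability).
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, true_class):
    # Negative log-probability assigned to the correct class.
    probs = softmax(logits)
    return -math.log(probs[true_class])

# Hypothetical scores for a three-class task with one support example per class.
loss = cross_entropy([2.0, 0.5, -1.0], true_class=0)
```

The sketch reduces to an exponential, a sum, a division, and a logarithm, i.e. a calculation of the kind discussed in MPEP 2106.04(a)(2)(I).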

Claim 11
Step 2A, prong 1: the claim recites a mathematical concept: “wherein the loss function is a contrastive loss function when the set of N support examples has a more than one example per class”. The limitation denotes a mathematical relationship because a contrastive loss function relates to a mathematical relationship or calculation or formula (see MPEP 2106.04(a)(2)(I)). As such, the limitation denotes a mathematical concept.
Step 2A, prong 2: the claim does not recite any additional elements that integrate the judicial exception into a practical application.
Step 2B: the claim does not recite any additional elements that amount to significantly more than the judicial exception.
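For illustration only, and without limiting the claim, one common pairwise form of a contrastive loss can be sketched as follows. The claim does not specify a particular form, so the Euclidean distance, the margin parameter, and all names below are assumptions made for this sketch.

```python
import math

def euclidean(u, v):
    # Straight-line distance between two representation vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def contrastive_loss(u, v, same_class, margin=1.0):
    # Same-class pairs are penalized by squared distance; different-class
    # pairs are penalized only when they fall inside the margin.
    d = euclidean(u, v)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2
```

Under these assumptions the loss is a closed-form expression of distances between examples of the same or different classes, which is why it applies when a class has more than one example.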

Claim 12
Step 2A, prong 1: the claim recites a mathematical concept: “comprises an attention function and a normalizing function”. The limitation denotes a mathematical relationship because attention and normalizing functions relate to a mathematical relationship or calculation or formula (see MPEP 2106.04(a)(2)(I)). As such, the limitation denotes a mathematical concept.
Step 2A, prong 2: the claim recites the additional element: “wherein reading the memory with soft attention”. The limitation relates to mere instructions for applying the judicial exception (see MPEP 2106.05(f)). Thus, it does not integrate the judicial exception into a practical application. 
Step 2B: the limitation recited above does not amount to significantly more than the judicial exception. As stated above, the limitation describing reading the memory relates to mere instructions to apply the judicial exception on generic computing devices. Such application does not amount to significantly more than the judicial exception because the use of generic computing tools to execute the instructions for the judicial exception does not denote anything significantly more than the judicial exception (see MPEP 2106.05(f)). 
As such, the limitation does not amount to significantly more than the judicial exception. 
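For illustration only, and without limiting the claim, reading a memory with soft attention by way of an attention function and a normalizing function can be sketched as follows. The choice of cosine similarity as the attention function and softmax as the normalizing function, together with all names and values, are assumptions made for this sketch and are not taken from the claims.

```python
import math

def cosine(u, v):
    # Attention function (assumed): similarity between a key and the query.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)

def softmax(scores):
    # Normalizing function (assumed): scores become weights that sum to one.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_read(keys, values, query):
    # Weighted sum of every stored value, rather than a single hard lookup.
    weights = softmax([cosine(k, query) for k in keys])
    dim = len(values[0])
    read = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return read, weights

# Hypothetical memory with two stored entries.
read, weights = soft_read([[1.0, 0.0], [0.0, 1.0]], [[10.0], [20.0]], [1.0, 0.0])
```

In this sketch the read result is a weighted average of the stored values, with the entry whose key best matches the query receiving the largest weight.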

Claim 13 is substantially similar to independent claim 4 and thus is rejected for the same reasons as claim 4. Claim 13 merely adds a “non-transitory computer-readable medium with computer code instruction stored thereon, the computer code instructions, when executed by a processor, cause an apparatus” to perform the method, which is an additional element related to mere instructions for applying the judicial exception (see MPEP 2106.05(f)) and does not integrate the judicial exception into a practical application. The recitation of the computer-readable medium and processor with computer code instructions denotes a generic computing environment (see MPEP 2106.05(h)). Furthermore, the recitation of generic computing components to perform the mental process still amounts to a mental process that can be performed on a generic computer. See MPEP 2106.04(a)(2)(III)(C). Similarly, the use of the generic computer to perform the mathematical concepts intertwined with the mental processes amounts to the use of a generic computer element to perform the judicial exception and nothing more.  

Claim 15
Step 2A, prong 1: the following limitations recite mental processes: 
“(ii) generate one or more fast weights, and 
(iii) optimize one or more slow weights…, based on the one or more fast weights and a training set of examples”. 
The above limitations describe mental processes because, under a broadest reasonable interpretation (BRI), they involve: generating fast weights and optimizing slow weights. Thus, the claim recites mental processes based on observations, evaluations, judgments, or opinions that are performable in the human mind or with the aid of pencil and paper (see MPEP 2106.04(a)(2)(III)). Indeed, one can mentally or with the aid of pencil and paper generate data comprising weights via observations, evaluations, judgments, or opinions. Likewise, one can mentally or with the aid of pencil and paper optimize weights based on other data such as fast weights and training sets via observations, evaluations, judgments, or opinions. As such, these limitations denote mental processes.
Step 2A, prong 2: the following limitations recite additional elements:
“A system for facilitating one-shot learning in neural network, comprising: 
a) a meta learner module; 
b) a base learner module operatively coupled to the meta learner module, the meta learner module and base learner module configured to cooperatively (i) acquire meta information from a support set of examples, 
… used by the base learner module …; and 
c) a memory device operatively coupled to the meta learner module and the base learner module, the meta learner module and base learner module being configured to cooperatively store the one or more slow weights and the one or more fast weights in the memory device.”
While the preamble recites “facilitating one-shot learning in a neural network”, this is insufficient to integrate the claim limitations into a practical application because, aside from this statement in the preamble, the claim limitations do not describe a neural network or a one-shot learning technique. Rather, the limitations describe an acquiring step, a generation step, an optimization step, and a storing step. As such, the limitations do not provide a concrete tie between these steps and the facilitation of one-shot learning in the neural network. Indeed, the limitations as recited might not have any relation to a neural network or one-shot learning in a neural network at all because one can apply the limitations simply to model various data and to acquire, generate, optimize, and store data associated with various weights. Thus, when reading the preamble in the context of the entire claim, the recitation “facilitating one-shot learning in a neural network” is not limiting because the body of the claim describes a complete invention and the language recited solely in the preamble does not provide any distinct definition of any of the claimed invention’s limitations. Therefore, the preamble of the claim(s) is not considered a limitation and is of no significance to claim construction. See Pitney Bowes, Inc. v. Hewlett-Packard Co., 182 F.3d 1298, 1305, 51 USPQ2d 1161, 1165 (Fed. Cir. 1999). See MPEP § 2111.02. As such, the recitation of “facilitating one-shot learning in a neural network” in the preamble does not integrate the claim into a practical application.
The limitations reciting the meta and the base learner modules denote additional elements related to mere instructions for applying the judicial exception on a generic computing system such as the meta and the base learner modules (see MPEP 2106.05(f)). The limitation describing acquiring meta information recites, at a high level of generality, the additional element of an insignificant extra-solution activity related to mere data gathering (see MPEP 2106.05(g)). The limitations describing the use by the base learner module relate to mere instructions for applying the judicial exception (see MPEP 2106.05(f)). The system in the preamble and the limitations describing the operative coupling between the two learner modules and the operative coupling between the memory device and the two learner modules denote a field of use indicative of a generic computing environment (see MPEP 2106.05(h)). The limitation describing storing the slow and fast weights in the memory device recites, at a high level of generality, the additional element of an insignificant extra-solution activity related to mere data storage (see MPEP 2106.05(g)). 
Thus, the limitations taken together do not integrate the judicial exception into a practical application. 
Step 2B: the limitations recited above do not amount to significantly more than the judicial exception. As stated above, the limitations describing the meta and the base learner modules and the use by the base learner module relate to mere instructions to apply the judicial exception on generic computing devices, wherein such application does not amount to significantly more than the judicial exception because the use of generic computing tools to execute the instruction for the judicial exception does not denote anything significantly more than the judicial exception (see MPEP 2106.05(f)). 
Additionally, the system in the preamble and the limitations describing the operative coupling between the two learner modules and the operative coupling between the memory device and the two learner modules denote a field of use indicative of a generic computing environment (see MPEP 2106.05(h)). An implementation in a generic computer environment was held in FairWarning IP, LLC v. Iatric Sys., Inc. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)).
The limitations reciting acquiring meta information and storing the weights in the memory device denote mere data gathering and data storage indicative of insignificant extra-solution activities (see MPEP 2106.05(g)). The courts have held that “receiving or transmitting data over a network” and “storing and retrieving information in memory” are well-understood, routine, and conventional activities when recited at a high level of generality (see MPEP 2106.05(d)(II)). 
As such, the limitations do not amount to significantly more than the judicial exception. 

Claim 17
Step 2A, prong 1: the following limitations recite a mathematical concept: “wherein the non-linearity is implemented with a rectified linear unit (ReLU)”. The limitation denotes a mathematical relationship because a rectified linear unit relates to a mathematical relationship or calculation or formula (see MPEP 2106.04(a)(2)(I)). As such, the limitation denotes a mathematical concept.
Step 2A, prong 2: the claim does not recite any additional elements that integrate the judicial exception into a practical application.
Step 2B: the claim does not recite any additional elements that amount to significantly more than the judicial exception.
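For illustration only, and without limiting the claim, the mathematical relationship denoted by a rectified linear unit can be sketched as follows. The function names and sample values are assumptions made for this sketch.

```python
def relu(x):
    # Rectified linear unit: passes positive inputs, clamps the rest to zero.
    return max(0.0, x)

def relu_layer(xs):
    # Element-wise application of the non-linearity to a vector of values.
    return [relu(x) for x in xs]

# Hypothetical pre-activation values.
activated = relu_layer([-2.0, 0.0, 3.5])
```

The sketch consists of a single comparison per value, which is consistent with characterizing the recited non-linearity as a mathematical relationship or formula.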

Claim 18
Step 2A, prong 1: the claim inherits the mental processes from the independent claim. The claim does not recite additional mental processes. 
Step 2A, prong 2: the claim recites the additional element: “wherein the support set of examples and the training set of examples further comprise class labels”. The composition of the support set of examples denotes a field of use (see MPEP 2106.05(h)). 
Step 2B: the limitation recited above does not amount to significantly more than the judicial exception because it merely denotes a field of use related to the support set of examples (see MPEP 2106.05(h)).

Claim 19
Step 2A, prong 1: the following limitations recite mental processes: “evaluate each example instance from the set of examples, and generate the one or more fast weights and optimize the one or more slow weights based on the example instance, before an evaluation of a subsequent example instance.”
The above limitations describe mental processes because, under a broadest reasonable interpretation (BRI), they involve: evaluating each example instance and generating the fast weights and optimizing the slow weights before a subsequent example instance is evaluated. Thus, the claim recites mental processes because the processes are based on observations, evaluations, judgments, or opinions that are performable in the human mind or with the aid of pencil and paper (see MPEP 2106.04(a)(2)(III)). Indeed, the claim limitations mainly relate to evaluating example instance data and generating and optimizing weights based on the example instance data before a subsequent example instance is evaluated. As such, these limitations are conceivably performed mentally or with the aid of paper and pencil and thus are considered as mental processes.
Step 2A, prong 2: the claim recites the additional element: “wherein the meta learner and base learner modules are configured to”. The limitation recites additional elements related to mere instructions for applying the judicial exception on a generic computing device such as the two learner modules (see MPEP 2106.05(f)). The recitation of the two learner modules denotes generic computer components (see MPEP 2106.05(h)).
Step 2B: the limitation recited above does not amount to significantly more than the judicial exception. As stated above, the limitation describing the two learner modules relates to mere instructions to apply the judicial exception on generic computing devices. Such application does not amount to significantly more than the judicial exception because the use of generic computing tools to execute the instructions for the judicial exception does not denote anything significantly more than the judicial exception (see MPEP 2106.05(f)). The limitation does not amount to anything more than the use of the generic computing environment to perform the mental processes (see MPEP 2106.04(a)(2)(III)(C)). 
Furthermore, specifying the utilization of the two learner modules is an implementation in a generic computer environment, which was held in FairWarning IP, LLC v. Iatric Sys., Inc. to be merely indicative of a field of use or technological environment and thus not significantly more than the judicial exception (see MPEP 2106.05(h)).
As such, the limitations do not amount to significantly more than the judicial exception. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 15 and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Jeng et al. (U.S. Pat. App. Pre-Grant Pub. No. 2004/0024745, hereinafter Jeng) in view of Vinyals et al., “Matching Networks for One Shot Learning” (hereinafter Vinyals), and Ba et al., “Using Fast Weights to Attend to the Recent Past” (hereinafter Ba).


Regarding claim 15, Jeng teaches:
A system …, comprising ([0017]: describing a computing system as shown in Fig. 1.): 
a) a meta learner module ([0026]-[0028]: describing a “meta learner” that is part of a computing system with servers and program software/modules to implement the meta learner ([0017]-[0018]), denoting a meta learner module.) 
b) a base learner module operatively coupled to the meta learner module, the meta learner module and base learner module configured to cooperatively ([0026]-[0028]: describing an operative coupling between the meta learner and “base learner[] modules”, wherein data, e.g. computation results related to various tasks, are cooperatively passed between the two. This is shown in Fig. 5. The meta learner module was previously described.)
(i) acquire meta information from a support set of examples ([0024]-[0025]: describing meta processing of domain data sets, i.e. support set of examples, to obtain information.), 
…, and 
… by the base learner module ([0028]: “by base learner[] modules”.),…; and 
c) a memory device operatively coupled to the meta learner module and the base learner module ([0017]: describing that the system comprises computers and servers, which have associated memory operatively coupled. The system also includes the meta learner and base learner modules, which are operatively coupled together and with the system ([0026]-[0028]). See also Figs. 1 and 5: showing the system and the two learners. The meta learner module was previously described.), the meta learner module and base learner module being configured to cooperatively ([0017]-[0018] and [0026]-[0028]: describing an operative coupling between the meta learner and base learner modules. This is shown in Fig. 5.)… in the memory device ([0017]: describing that the system comprises computers and servers, which have associated memory.).

While the cited reference Jeng teaches the above limitations of claim 15, it does not explicitly teach: “for facilitating one-shot learning in neural network” in the preamble. Vinyals teaches: one-shot learning within a neural network (Vinyals Sections 2-2.2 and 5). 
Thus, it would have been obvious to a person having ordinary skill in the art (PHOSITA) before the effective filing date (EFD) to modify the system for meta learning in Jeng to include the one-shot learning in Vinyals. Doing so would enable a “new neural architecture that, by way of its corresponding training regime, is capable of state-of-the-art performance on a variety of one-shot classification tasks” (Vinyals Section 5). 

While the cited references in combination teach the above limitations of claim 15, they do not explicitly teach: “(ii) generate one or more fast weights”; “(iii) optimize one or more slow weights used… based on the one or more fast weights and a training set of examples”; and “store the one or more slow weights and the one or more fast weights”. Ba teaches: 
“(ii) generate one or more fast weights”: describing generation of fast weights in a neural network (Ba Sections 3 and 3.1). 
“(iii) optimize one or more slow weights used … based on the one or more fast weights and a training set of examples”: describing that the slow weights of neural network layers can learn, i.e. are optimized, by stochastic gradient descent (Ba Section 3 and Supplemental Section A). The learning of the slow weights of the neural network occurs in correlation with fast weights (Ba Section 3) and with training data (Ba Section 4.1 and Supplemental Section A). See also Figs. 1 and 3 showing the slow and fast weights in the layers of the neural network.
“store the one or more slow weights and the one or more fast weights”: describing storing/caching of the slow and fast weights in relation to memory (Ba Sections 3 and 4.2).
Thus, it would have been obvious to a person having ordinary skill in the art (PHOSITA) before the effective filing date (EFD) to modify the system for meta learning with one-shot learning in the combined cited references to include the slow and fast weights in Ba. Doing so would enable an improvement in “machine learning by showing that the performance of [neural networks] on a variety of different tasks can be improved by introducing a mechanism that allows each new state of the hidden units to be attracted towards recent hidden states in proportion to their scalar products with the current state” (Ba Section 5). 
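
For illustration only, the mechanism quoted above from Ba Section 5 can be sketched as a decayed outer-product (Hebbian-style) fast-weight update. The sketch below is not taken from the record: the function names, the decay rate, and the fast learning rate are assumptions chosen for this example, and slow-weight training by gradient descent is omitted. It shows that multiplying the fast-weight matrix by a current state returns a stored hidden state scaled by its scalar product with that current state.

```python
# Illustrative sketch only; names and numeric values are assumptions,
# not an implementation disclosed by Ba or by the application.

def dot(u, v):
    """Scalar product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def update_fast_weights(A, h, decay, lr):
    """Return decay * A + lr * (h outer h) for a square matrix A."""
    n = len(h)
    return [[decay * A[i][j] + lr * h[i] * h[j] for j in range(n)]
            for i in range(n)]

def matvec(A, v):
    """Matrix-vector product."""
    return [dot(row, v) for row in A]

# Start with empty fast weights and store one recent hidden state.
fast = [[0.0, 0.0], [0.0, 0.0]]
recent_hidden = [1.0, 0.0]
fast = update_fast_weights(fast, recent_hidden, decay=0.9, lr=0.5)

# The fast-weight contribution for a current state equals
# lr * (recent_hidden . current) * recent_hidden.
current = [2.0, 3.0]
pull = matvec(fast, current)
print(pull)
```

In this sketch the contribution `pull` points along the stored hidden state, with a magnitude set by the scalar product of the stored and current states.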

Regarding claim 17, the rejection of claim 15 is incorporated. Vinyals further teaches:
The system of claim 15, wherein the non-linearity is implemented with a rectified linear unit (ReLU) (Vinyals Section 4.1.1: describing that the neural network model comprises “a Relu non-linearity”.).
Thus, it would have been obvious to a person having ordinary skill in the art (PHOSITA) before the effective filing date (EFD) to modify the system for meta learning with the slow and fast weights in the combined cited references to include the ReLU in Vinyals. Doing so would enable a “new neural architecture that, by way of its corresponding training regime, is capable of state-of-the-art performance on a variety of one-shot classification tasks” (Vinyals Section 5). 
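
For illustration only, a rectified linear unit passes positive inputs unchanged and maps negative inputs to zero. The helper names below are assumptions for this example and are not taken from Vinyals or from the application.

```python
# Illustrative sketch only; function names are hypothetical.

def relu(x: float) -> float:
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)

def relu_layer(values):
    """Apply ReLU element-wise to a list of pre-activations."""
    return [relu(v) for v in values]

activations = relu_layer([-2.0, 0.0, 3.5])
print(activations)
```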
Regarding claim 18, the rejection of claim 15 is incorporated. Vinyals further teaches:
The system of claim 15, wherein the support set of examples and the training set of examples further comprise class labels (Vinyals Sections 2.1, 4.1.2, and 4.1.3: describing that the “support set of k examples” and “test example” data sets comprise class labels.).
Thus, it would have been obvious to a person having ordinary skill in the art (PHOSITA) before the effective filing date (EFD) to modify the system for meta learning with the slow and fast weights in the combined cited references to include the class labels in Vinyals. Doing so would enable a “training procedure [that] is based on a simple machine learning principle: test and train conditions must match. Thus to train our network to do rapid learning, we train it by showing only a few examples per class, switching the task from minibatch to minibatch, much like how it will be tested when presented with a few examples of a new task” (Vinyals Section 1). 
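
For illustration only, the role of class labels in a support set can be sketched as follows: each support example carries a label, and a test example is assigned the label whose support examples receive the most similarity-weighted attention. The sketch assumes cosine similarity followed by a softmax over raw feature vectors; the data, labels, and function names are hypothetical, and no learned embedding network is modeled.

```python
# Illustrative sketch only; data and names are assumptions for this example.
import math
from collections import defaultdict

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def classify(query, support):
    """Return (predicted_label, per-label attention mass) for a query.

    `support` is a list of (feature_vector, class_label) pairs.
    """
    sims = [cosine(query, x) for x, _ in support]
    peak = max(sims)
    exps = [math.exp(s - peak) for s in sims]  # numerically stable softmax
    total = sum(exps)
    mass = defaultdict(float)
    for (_, label), e in zip(support, exps):
        mass[label] += e / total
    return max(mass, key=mass.get), dict(mass)

# One labeled example per class (a one-shot support set).
support_set = [([1.0, 0.0], "cat"), ([0.0, 1.0], "dog")]
label, weights = classify([0.9, 0.1], support_set)
print(label, weights)
```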

Regarding claim 19, the rejection of claim 15 is incorporated. Jeng teaches:
The system of claim 15, wherein the meta learner and base learner modules are configured to evaluate each example instance from the set of examples ([0026] and [0028]: describing that the meta learner and base learner modules analyze each data set from a plurality of data sets, i.e. set of examples.), and ….

While the cited reference Jeng teaches the above limitations of claim 19, it does not explicitly teach: “generate the one or more fast weights and optimize the one or more slow weights based on the example instance, before an evaluation of a subsequent example instance”. Ba further teaches: 
“generate the one or more fast weights (Ba Sections 3 and 3.1: describing generation of fast weights in the neural network) and optimize the one or more slow weights based on the example instance (Ba Section 3 and Supplemental Section A: describing that the slow weights of neural network layers can learn, i.e. are optimized, by stochastic gradient descent based on current inputs.), before an evaluation of a subsequent example instance (Ba Sections 3 and 3.1: describing an analysis at each iteration in a current hidden state with the current inputs at a “particular time step”. The inputs include a plurality of validation and test examples (Ba Section 4.1). That is, the current inputs are analyzed at each iteration before additional inputs are analyzed.)”.
Thus, it would have been obvious to a person having ordinary skill in the art (PHOSITA) before the effective filing date (EFD) to modify the system for meta learning with one-shot learning in the combined cited references to include the slow and fast weights in Ba. A motivation to combine the cited references with Ba was previously given.

Regarding claim 20, the rejection of claim 15 is incorporated. Jeng teaches:
The system of claim 15, wherein the meta learner and base learner modules ([0017]-[0018] and [0026]-[0028]: describing the meta learner and base learner modules.)….  

While the cited reference Jeng teaches the above limitations of claim 20, it does not explicitly teach: “are integrated by a layer augmented multilayer perceptron (MLP)”. Ba further teaches: an integration process that involves augmenting multiple hidden layers of the neural network, i.e. a multilayer perceptron (Ba Section 4.2. See also Figs. 1 and 3: showing the augmentation of the hidden layers.).
Thus, it would have been obvious to a person having ordinary skill in the art (PHOSITA) before the effective filing date (EFD) to modify the system for meta learning with one-shot learning in the combined cited references to include the MLP in Ba. Doing so would enable an improvement in “machine learning by showing that the performance of [neural networks] on a variety of different tasks can be improved by introducing a mechanism that allows each new state of the hidden units to be attracted towards recent hidden states in proportion to their scalar products with the current state” (Ba Section 5). 
 
Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant's disclosure:
Bilalli et al., “Automated Data Pre-Processing via Meta-Learning”: describing a meta-learner for pre-processing data tasks. The process involves creating a meta-space based on “metadata that can be extracted from datasets and from the executions of algorithms on those datasets”. From the meta-space, a plurality of meta-learners is built.
Datta et al. (U.S. Pat. App. Pre-Grant Pub. No. 2009/0083332): describing meta learning comprising a meta-learner for annotating data, wherein the meta-learner is “trained on available data with feedback”.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SELENE A HAEDI whose telephone number is (571)270-5762. The examiner can normally be reached M-F 11 AM - 7 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, OMAR FERNANDEZ RIVAS, can be reached at (571)272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SELENE A. HAEDI/Examiner, Art Unit 2128