DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 7/26/2021 has been entered.
 
Response to Amendments
Acknowledgement is made of Applicant's claim amendments filed on 7/26/2021. The claim amendments are entered. Presently, claims 12, 16-31, 43, 47-49, 52-53, and 56 remain pending. Claims 7-11, 13-15, 45-46, 51, 54-55, and 57-62 were previously canceled. Claims 1-6, 32-42, 44, and 50 were withdrawn from consideration as part of a restriction. Claims 12 and 43 are amended.

Applicant has amended the specification to include the requisite trademark designations for the various trademarks. Accordingly, the specification objection is withdrawn. 


Response to Arguments
Applicant's arguments filed on 7/26/2021 have been fully considered but they are not persuasive.

Applicant argues that Diamos does not apply because it allegedly does not teach all the limitations of the newly presented claim amendments (Applicant's Reply pgs. 9-10). Diamos by itself does not explicitly teach all the amended claim limitations, and accordingly, the §102 rejection is withdrawn.
However, Diamos in combination with Kwasny, which has been incorporated into the §103 rejection of the independent claims as necessitated by Applicant's amendments, does teach the amended claim limitations. Accordingly, an updated rejection is presented under §103.

Applicant also reiterates similar arguments regarding Diamos and the interpretation of the terms “target”, “transformer”, and “auto-associative” (Applicant's Reply pgs. 9-10). Those arguments were previously addressed in the Final Office Action (FOA), and the responses set forth there are reiterated here. Reference can be made to the FOA for further details.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 12, 16, 43, and 47 are rejected under 35 U.S.C. 103 as being unpatentable over Diamos et al. (U.S. Pat. App. Pre-Grant Pub. No. 2017/0169326, hereinafter Diamos) in view of Kwasny et al. (U.S. Pat. No. 6,285,992, hereinafter Kwasny).

Regarding claim 12, Diamos teaches:
A computer system comprising: 
a set of processor cores ([0051]-[0052]: “multiple processor cores”.); and 
computer memory in communication with the set of processor cores, wherein the computer memory stores software that when executed by the set of processor cores, causes the set of processor cores to recursively train a recurrent neural network with a plurality of input examples, such that ([0028] and [0035]-[0036]: describing neural network architecture that “when implemented in software and executed on processors, these neural network architectures are commonly represented using one two-dimensional matrix”. Wherein the neural network can comprise a recurrent neural network (RNN) that can be trained recursively using input data ([0039]-[0044], [0047], and [0061]).): 
the recurrent neural network comprises a deep neural network that comprises N+1 layers, numbered 0, ..., N, wherein N > 3, and wherein layer 0 is an input layer and layer N is an output layer of the recurrent neural network, and wherein layers 1 to N-1 are between the input layer and the output layer (Figs. 2A-3A: showing the RNN architecture comprising a plurality of layers, wherein the plurality of layers denote depth, with a RNN being a type of deep neural network (DNN). Similarly, see [0034]-[0035]: describing the RNN architecture and operation in further detail, wherein the architecture comprises input, intermediate, and output layers and nodes.); 
the recurrent neural network is trained to produce an output pattern for each of the input examples ([0038]-[0039] and [0072]: describing that “a single input sequence x and corresponding output sequence y be sampled from a training set”, so that “an output sequence [can be] transformed from an input sequence”.); 
a target for the output pattern for each input example is the input example ([0046]: “Each individual value 134 in output vector 136 corresponds to a character that is associated with at least one input value from input data vectors 118-124”. See also [0038]-[0039] and [0072]: describing that “a single input sequence x and corresponding output sequence y be sampled from a training set”, so that “an output sequence [can be] transformed from an input sequence”.) …; and
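the recurrent neural network comprises a plurality of directed arcs, wherein at least some of the directed arcs are between a node in one layer of the recurrent neural network and a node in another layer of the recurrent neural network (Figs. 2A-3A: showing the RNN architecture comprising a plurality of layers with corresponding arcs and nodes. Similarly, see [0034]-[0035]: describing the RNN architecture and operation in further detail, wherein the architecture comprises input, intermediate, and output layers and nodes.).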

While the cited reference teaches the limitations of claim 12, it does not explicitly teach: “such that the recurrent neural network auto-associatively learns, through the recursive training, to memorize the plurality of input examples” on lines 14-15. Kwasny discloses the claim limitations, teaching: a neural network that operates “based on [a] sequential recursive auto-associative memory” that can be trained in an auto-associative manner so that the outputs and inputs match (Kwasny col. 5, line 59 to col. 6, line 35). That is, an input can be memorized via the operation and training of the neural network.
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the system in the cited reference to include the training in Kwasny. Doing so would enable “[a] neural network approach derived from sequential recursive auto-associative memory [to be] used to parse the wavelet coefficients and hierarchy data. Since the wavelet coefficients are continuous, linear output instead of sigmoidal output is used. This variation is therefore referred to as linear output sequential recursive auto-associative memory, or LOSRAAM. The objective of training the LOSRAAM network is to have the output exactly match the input.” (Kwasny Abstract).
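
For illustration only, the training objective quoted above from Kwasny (the output matching the input) can be sketched in a few lines. The sketch is a simplification under stated assumptions: it uses a single linear layer trained with the delta rule rather than Kwasny's LOSRAAM network, and the function names, learning rate, epoch count, and example vectors are hypothetical and appear in neither reference.

```python
def recall(weights, x):
    """Linear output: y = W x (no sigmoid is applied)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in weights]


def train_autoassociator(examples, lr=0.1, epochs=500):
    """Delta-rule training in which the target for each example is the example itself."""
    n = len(examples[0])
    weights = [[0.0] * n for _ in range(n)]  # assumed zero initialization
    for _ in range(epochs):
        for x in examples:
            y = recall(weights, x)
            # Auto-associative error: the output is compared with the input.
            err = [y_i - x_i for y_i, x_i in zip(y, x)]
            for i in range(n):
                for j in range(n):
                    weights[i][j] -= lr * err[i] * x[j]
    return weights


# Hypothetical input examples, chosen only for illustration.
examples = [[1.0, 0.5, -0.2], [0.0, 1.0, 0.3]]
weights = train_autoassociator(examples)
for x in examples:
    print(x, [round(v, 4) for v in recall(weights, x)])
```

Because the target of every update is the input itself, the trained weights reproduce each stored example when that example is presented again, which is the sense of "memorize" used in the limitation.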

Regarding claim 16, Diamos teaches:
The computer system of claim 12, wherein the software causes the set of processor cores to train the recurrent neural network such that the only directed arcs in the recurrent neural network from a higher numbered layer to a lower numbered layer are from a node in the output layer N to a node in the input layer 0 (Figs. 2D and 3A: showing that the output nodes and layers at t5 correspond back to the previous nodes and layers subsequently all the way back to the initial input nodes and layers at t0.).

Regarding claim 43, Diamos teaches:
A method comprising: 
training, recursively, by a computer system that comprises a set of processor cores, a recurrent neural network with a plurality of input examples, such that ([0028] and [0035]-[0036]: describing neural network architecture that “when implemented in software and executed on processors, these neural network architectures are commonly represented using one two-dimensional matrix”. Wherein the neural network can comprise a recurrent neural network (RNN) that can be trained recursively using input data ([0039]-[0044], [0047], and [0061]). With the RNN able to operate on “multiple processor cores” ([0051]-[0052]).):
the recurrent neural network comprises a deep neural network that comprises N+1 layers, numbered 0, ..., N, wherein N > 3, and wherein layer 0 is an input layer and layer N is an output layer of the recurrent neural network, and wherein layers 1 to N-1 are between the input layer and the output layer (Figs. 2A-3A: showing the RNN architecture comprising a plurality of layers, wherein the plurality of layers denote depth, with a RNN being a type of deep neural network (DNN). Similarly, see [0034]-[0035]: describing the RNN architecture and operation in further detail, wherein the architecture comprises input, intermediate, and output layers and nodes.);
the recurrent neural network is trained to produce an output pattern for each of the input examples ([0038]-[0039] and [0072]: describing that “a single input sequence x and corresponding output sequence y be sampled from a training set”, so that “an output sequence [can be] transformed from an input sequence”.); 
a target for the output pattern for each input example is the input example ([0046]: “Each individual value 134 in output vector 136 corresponds to a character that is associated with at least one input value from input data vectors 118-124”. See also [0038]-[0039] and [0072]: describing that “a single input sequence x and corresponding output sequence y be sampled from a training set”, so that “an output sequence [can be] transformed from an input sequence”.) …; and
the recurrent neural network comprises a plurality of directed arcs, wherein at least some of the directed arcs are between a node in one layer of the recurrent neural network and a node in another layer of the recurrent neural network (Figs. 2A-3A: showing the RNN architecture comprising a plurality of layers with corresponding arcs and nodes. Similarly, see [0034]-[0035]: describing the RNN architecture and operation in further detail, wherein the architecture comprises input, intermediate, and output layers and nodes.).

While the cited reference teaches the limitations of claim 43, it does not explicitly teach: “such that the recurrent neural network auto-associatively learns, through the recursive training, to memorize the plurality of input examples” on lines 10-12. Kwasny discloses the claim limitations, teaching: a neural network that operates “based on [a] sequential recursive auto-associative memory” that can be trained in an auto-associative manner so that the outputs and inputs match (Kwasny col. 5, line 59 to col. 6, line 35). That is, an input can be memorized via the operation and training of the neural network.
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the system in the cited reference to include the training in Kwasny. Doing so would enable “[a] neural network approach derived from sequential recursive auto-associative memory [to be] used to parse the wavelet coefficients and hierarchy data. Since the wavelet coefficients are continuous, linear output instead of sigmoidal output is used. This variation is therefore referred to as linear output sequential recursive auto-associative memory, or LOSRAAM. The objective of training the LOSRAAM network is to have the output exactly match the input.” (Kwasny Abstract).

Regarding claim 47, claim 47 is substantially similar to claim 16 and therefore is rejected on the same ground as claim 16. Claim 47 is a method claim that corresponds to system claim 16.

Claims 17, 27-29, 31, and 48 are rejected under 35 U.S.C. 103 as being unpatentable over Diamos et al. (U.S. Pat. App. Pre-Grant Pub. No. 2017/0169326, hereinafter Diamos) and Kwasny et al. (U.S. Pat. No. 6,285,992, hereinafter Kwasny) in view of Yu et al. (U.S. Pat. App. Pre-Grant Pub. No. 2017/0330068, hereinafter Yu).


Regarding claim 17, Diamos teaches:
The computer system of claim 16, wherein: 
the output layer of the recurrent neural network comprise a plurality of output layer nodes (Figs. 2D, 3A, 5, and 6: showing output layer nodes of the RNN); 
the input layer of the recurrent neural network comprise a plurality of input layer nodes (Figs. 2D, 3A, 5, and 6: showing input layer nodes of the RNN);
the quantity of input layer nodes equals the quantity of output layer nodes, such that each output layer node has one and only one corresponding input layer node (Figs. 5 and 6: showing the output layer nodes with the corresponding input layer nodes); and 
….

Regarding claim 17, the rejection of claim 16 is incorporated. While the cited references teach the claim limitations, they do not explicitly teach: “the only directed arcs in the recurrent neural network that are from a higher numbered layer to a lower numbered layer are directed arcs from the output layer to the input layer, wherein there is a directed arc from each output layer node to its associated input layer node”. Yu discloses the claim limitations, teaching: that the RNN has directed arcs from the output nodes back to corresponding input nodes (Yu Figs. 2, 4, 7, and 8). Similarly, see Yu [0024], [0034]-[0035], [0045]-[0048]: describing the RNN comprising arcs in further detail.
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the system in the cited references to include the RNN arcs in Yu. Doing so would enable “obtain[ing] data in a first modality; propagat[ing] the data in the first modality through a neural network, thereby generating network outputs, wherein the neural network includes a first-stage neural network and a second-stage neural network, wherein the first-stage neural network includes two or more layers, wherein each layer of the two or more layers of the first-stage neural network includes a plurality of respective nodes, wherein the second-stage neural network includes two or more layers, one of which is an input layer and one of which is an output layer, and wherein each node in each layer of the first-stage neural network is connected to the input layer of the second-stage neural network; calculate a gradient of a loss function based on the network outputs; backpropagate the gradient through the neural network; and update the neural network based on the backpropagation of the gradient.” (Yu Abstract).

Regarding claim 27, the rejection of claim 12 is incorporated. While the cited references teach the claim limitations, they do not explicitly teach: “wherein at least some of the input examples are labeled examples that have, for each such input example, a classification category label such that the recurrent neural network is trained to act as a classifier”. Yu discloses the claim limitations, teaching: that “[t]he system includes a specially-configured computing device 170 that implements a neural network 100 that accepts multiple modalities of data 150A-B as inputs and that performs detection, segmentation, or classification.” (Yu [0022]-[0023]).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the system in the cited references to include the classification in Yu. Doing so would enable “obtain[ing] data in a first modality; propagat[ing] the data in the first modality through a neural network, thereby generating network outputs, wherein the neural network includes a first-stage neural network and a second-stage neural network, wherein the first-stage neural network includes two or more layers, wherein each layer of the two or more layers of the first-stage neural network includes a plurality of respective nodes, wherein the second-stage neural network includes two or more layers, one of which is an input layer and one of which is an output layer, and wherein each node in each layer of the first-stage neural network is connected to the input layer of the second-stage neural network; calculate a gradient of a loss function based on the network outputs; backpropagate the gradient through the neural network; and update the neural network based on the backpropagation of the gradient.” (Yu Abstract).

Regarding claim 28, the rejection of claim 27 is incorporated. While the cited references teach the limitations of claim 27, Yu further teaches: “wherein the classification category labels comprise error-correcting encoding”, disclosing that “to train the neural network 200, some embodiments use a loss function L (e.g., a reconstruction error, and a classification error)” (Yu [0025]).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the system in the cited references to include the classification error in Yu. Doing so would enable the neural network to be trained, and “[o]nce the neural network 100 is trained, the computing device 170 can use the neural network 100, for example for image segmentation, object detection, and object classification.” (Yu [0023]).

Regarding claim 29, the rejection of claim 12 is incorporated. While the cited references teach the claim limitations, they do not explicitly teach: “the input examples are digital images; and the recurrent neural network is trained to, in operation, retrieve one of the digital images in response to receiving as input a portion of the digital image.” Yu discloses the claim limitations, teaching: 
“the input examples are digital images (Yu [0022]-[0023]: describing input data as comprising various RGB [red green blue] images); and
the recurrent neural network is trained to, in operation, retrieve one of the digital images in response to receiving as input a portion of the digital image (Yu [0022]-[0023]: “The system includes a specially-configured computing device 170 that implements a neural network 100 that accepts multiple modalities of data 150A-B as inputs and that performs detection, segmentation, or classification. Examples of data modalities include RGB images, … hyperspectral images….”).”
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the system in the cited references to include the receiving of data in Yu. Doing so would enable “obtain[ing] data in a first modality; propagat[ing] the data in the first modality through a neural network, thereby generating network outputs, wherein the neural network includes a first-stage neural network and a second-stage neural network, wherein the first-stage neural network includes two or more layers, wherein each layer of the two or more layers of the first-stage neural network includes a plurality of respective nodes, wherein the second-stage neural network includes two or more layers, one of which is an input layer and one of which is an output layer, and wherein each node in each layer of the first-stage neural network is connected to the input layer of the second-stage neural network; calculate a gradient of a loss function based on the network outputs; backpropagate the gradient through the neural network; and update the neural network based on the backpropagation of the gradient.” (Yu Abstract).

Regarding claim 31, the rejection of claim 12 is incorporated. While the cited references teach the claim limitations, they do not explicitly teach: “the input examples are document files; and the recurrent neural network is trained to, in operation, retrieve one of the document files in response to receiving as input a portion of the document file”. Yu discloses the claim limitations, teaching:
“the input examples are document files (Yu [0022]-[0023]: describing input data as comprising “text from annotations and sentences”); and 
the recurrent neural network is trained to, in operation, retrieve one of the document files in response to receiving as input a portion of the document file (Yu [0022]-[0023]: “The system includes a specially-configured computing device 170 that implements a neural network 100 that accepts multiple modalities of data 150A-B as inputs and that performs detection, segmentation, or classification. Examples of data modalities include … text from annotations and sentences, depth maps ….”)”.
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the system in the cited references to include the receiving of data in Yu. Doing so would enable “obtain[ing] data in a first modality; propagat[ing] the data in the first modality through a neural network, thereby generating network outputs, wherein the neural network includes a first-stage neural network and a second-stage neural network, wherein the first-stage neural network includes two or more layers, wherein each layer of the two or more layers of the first-stage neural network includes a plurality of respective nodes, wherein the second-stage neural network includes two or more layers, one of which is an input layer and one of which is an output layer, and wherein each node in each layer of the first-stage neural network is connected to the input layer of the second-stage neural network; calculate a gradient of a loss function based on the network outputs; backpropagate the gradient through the neural network; and update the neural network based on the backpropagation of the gradient.” (Yu Abstract).

Regarding claim 48, claim 48 is substantially similar to claim 17 and therefore is rejected on the same ground as claim 17. Claim 48 is a method claim that corresponds to system claim 17.

Claims 18-20 and 49 are rejected under 35 U.S.C. 103 as being unpatentable over Diamos et al. (U.S. Pat. App. Pre-Grant Pub. No. 2017/0169326, hereinafter Diamos) and Kwasny et al. (U.S. Pat. No. 6,285,992, hereinafter Kwasny) in view of Hori et al. (U.S. Pat. App. Pre-Grant Pub. No. 2017/0221474, hereinafter Hori).

Regarding claim 18, the rejection of claim 12 is incorporated. While the cited references teach the claim limitations, they do not explicitly teach: “wherein the computer memory stores software that when executed by the set of processor cores causes the set of processor cores to train the recurrent neural network by back propagating partial derivatives of a loss function through the recurrent neural network.” Hori discloses the claim limitations, teaching: “We obtain partial derivatives of the loss function L(Λ) with respect to ΛL for back propagation over time with a (BPTT) procedure. For simplicity, here we only use the derivative with respect to each RNNLM's [recurrent neural network language model] output….” (Hori [0041]). 
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the system in the cited references to include the computations in Hori. Doing so would enable “training a language model to reduce recognition errors, wherein the language model is a recurrent neural network language model (RNNLM) by first acquiring training samples… [Wherein the process comprises determining] gradients for the hypotheses using the word errors and gradients for words in the hypotheses. Lastly, parameters of the RNNLM are updated using a sum of the gradients.” (Hori Abstract).
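
For illustration only, back propagating partial derivatives of a loss function through a recurrent network over time can be sketched as follows. The sketch assumes a hypothetical scalar recurrence h_t = tanh(w*h_(t-1) + u*x_t) with a squared-error loss on the final state; it is not Hori's RNNLM or the claimed system, and all names and numeric values are assumptions.

```python
import math


def forward(w, u, xs):
    """Run the scalar recurrence and return every hidden state, starting from h_0 = 0."""
    hs = [0.0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs


def loss(w, u, xs, target):
    """Squared error between the final hidden state and the target."""
    return 0.5 * (forward(w, u, xs)[-1] - target) ** 2


def bptt_gradients(w, u, xs, target):
    """Back propagate dL/dh through time, accumulating dL/dw and dL/du."""
    hs = forward(w, u, xs)
    dh = hs[-1] - target              # dL/dh_T
    dw = du = 0.0
    for t in range(len(xs), 0, -1):
        da = dh * (1.0 - hs[t] ** 2)  # through tanh: d tanh(a)/da = 1 - tanh(a)^2
        dw += da * hs[t - 1]          # the same w is reused at every time step
        du += da * xs[t - 1]
        dh = da * w                   # dL/dh_(t-1)
    return dw, du


# Hypothetical values, chosen only for illustration.
dw, du = bptt_gradients(0.8, -0.3, [0.5, -1.0, 0.25], 0.2)
print(dw, du)
```

The partial derivatives are summed across time steps because the same weights are applied at every step of the recurrence.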

Regarding claim 19, claim 19 is substantially similar to claim 18 and therefore is rejected on the same ground as claim 18. Claim 19 is a method claim that corresponds to system claim 18.

Regarding claim 20, claim 20 is substantially similar to claim 18 and therefore is rejected on the same ground as claim 18. Claim 20 is a method claim that corresponds to system claim 18.

Regarding claim 49, claim 49 is substantially similar to claim 18 and therefore is rejected on the same ground as claim 18. Claim 49 is a method claim that corresponds to system claim 18.

Claims 21-26, 30, 52, 53, and 56 are rejected under 35 U.S.C. 103 as being unpatentable over Diamos et al. (U.S. Pat. App. Pre-Grant Pub. No. 2017/0169326, hereinafter Diamos) and Kwasny et al. (U.S. Pat. No. 6,285,992, hereinafter Kwasny) in view of Yu et al. (U.S. Pat. App. Pre-Grant Pub. No. 2017/0330068, hereinafter Yu) and Hammond et al. (U.S. Pat. App. Pre-Grant Pub. No. 2017/0213131, hereinafter Hammond).

Regarding claim 21, the rejection of claim 17 is incorporated. While the cited references teach the claim limitations, they do not explicitly teach: “The computer system of claim 17, wherein the computer memory stores software that when executed by the set of processor cores causes the set of processor cores to train the recurrent neural network by, for each input example: randomly transforming the input example; and recursively providing the randomly transformed input example to the content-addressable auto-associative memory system for training, until an output of the content-addressable auto-associative memory system converges to the input example.” Hammond discloses the claim limitations, teaching:
“The computer system of claim 17, wherein the computer memory stores software that when executed by the set of processor cores causes the set of processor cores to train the recurrent neural network by, for each input example (Hammond [0040]: describing software or hardware that can be utilized to “design an AI model, build the AI model, train the AI model to provide a trained AI model, and deploy the trained AI model”. Whereby various inputs can be introduced into the system for training ([0103]). Similarly, see [0183]: describing the computing devices comprising processors.):  
randomly transforming the input example (Hammond [0067]: describing that a “stream node can operate directly on input data, data from other stream nodes, data from concept nodes, from literals, and from built in functions (for example to return random data or sequence data). For example, the following Inkling™ code block declares a functional transformation of data that is explicitly specified….”); and
recursively providing the randomly transformed input example to the content-addressable auto-associative memory system for training, until an output of the content-addressable auto-associative memory system converges to the input example (Hammond [0091]: describing “recursive simulations and training session on each node making up the neural network”. Similarly, see [0133]: describing a “recursive deep-learning neural network architecture like a long short-term memory”. See also [0098]: describing that “[e]ach individual neural unit [in a neural network] can have, for example, a summation function, which combines the values of all its inputs together.” Wherein the neural network can comprise a content addressable auto-associative memory system (see [0100]).).”
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the system in the cited references to include the random data in Hammond. Doing so would enable “an artificial intelligence (“AI”) engine configured to work with a graphical user interface (“GUI”). The AI engine can include an architect module, instructor module, and learner module AI-engine modules. The GUI can be configured with a text editor and a mental-model editor to enable an author to define a mental model to be learned by an AI model, the mental model including an input, one or more concept nodes, and an output. The architect module can be configured to propose a neural-network layout from an assembly code compiled from a source code in a pedagogical programming language, the learner module can be configured to build the AI model from the neural-network layout, and the instructor module can be configured to train the AI model on the one or more concept nodes.” (Hammond Abstract).
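
For illustration only, the claim 21 limitations quoted above (randomly transforming an input example and recursively providing the transformed example for training until the output converges to the input example) can be sketched as follows. The sketch is drawn from neither Hammond nor the application: the Gaussian-noise distortion, the single linear layer, the delta-rule update, and every name and parameter value are assumptions.

```python
import random


def random_transform(example, rng, scale=0.1):
    """Hypothetical distortion: add small Gaussian noise to each component."""
    return [v + rng.gauss(0.0, scale) for v in example]


def apply_memory(weights, x):
    """Linear memory output: y = W x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in weights]


def train_until_converged(weights, example, rng, lr=0.1, tol=1e-6, max_steps=10_000):
    """Repeatedly present one distorted example until the output matches the clean example."""
    distorted = random_transform(example, rng)
    for step in range(1, max_steps + 1):
        out = apply_memory(weights, distorted)
        err = [o - t for o, t in zip(out, example)]  # target is the undistorted example
        if max(abs(e) for e in err) < tol:
            return distorted, step
        for i, e in enumerate(err):
            for j, d in enumerate(distorted):
                weights[i][j] -= lr * e * d          # delta-rule update, done in place
    raise RuntimeError("output did not converge within max_steps")


# Hypothetical example and seed, chosen only for illustration.
rng = random.Random(0)
weights = [[0.0] * 3 for _ in range(3)]
distorted, steps = train_until_converged(weights, [1.0, 0.5, -0.2], rng)
print(steps, apply_memory(weights, distorted))
```

The `max_steps` guard is a design choice of the sketch: it keeps the loop from running indefinitely if a chosen learning rate or distortion prevents convergence.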

Regarding claim 22, the rejection of claim 21 is incorporated. Diamos teaches:
The computer system of claim 21, wherein the computer memory stores software that when executed by the set of processor cores causes the set of processor cores to transform an input example by performing a distortion on the input example that comprises a distortion selected from the group consisting of: 
translating the input example; 
rotating the input example;
linearly transforming the input example ([0047]-[0048]: describing that an intermediate layer of a RNN can “comprise a sub-group of individual neurons 142 that are connected to inputs values 145 and produce output values 146 that may undergo a linear and/or non-linear transformation”); 
degrading the input example; and 
subsampling the input example.

Regarding claim 23, the rejection of claim 21 is incorporated. While the cited references teach the claim limitations, they do not explicitly teach: “wherein the transformations of the input examples are controlled by one or more hyperparameters”. Hammond discloses the claim limitations, teaching: that “[t]he instructor module 324 [of the AI system] can optionally include hyperlearner module 325, which can be referred to herein as the Hyperlearner, and which can be configured to select one or more hyperparameters for any one or more of a neural network configuration, a learning algorithm, a learning optimizer, and the like.” (Hammond [0046]). 
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the system in the cited references to include the hyperparameters in Hammond. Doing so would enable “[t]he instructor module 324 [to] instruct the learner module 328 on training the neural network 104 (e.g., which lessons should be taught in which order) with the one or more curriculums for training the one or more concepts in the mental mode using the training data and one or more hyperparameters from the hyperlearner module 325.” (Hammond [0048]).

Regarding claim 24, Diamos teaches:
The computer system of claim 23, further comprising: 
a second set of processor cores ([0051]-[0052]: “multiple processor cores”.); and 
second computer memory in communication with the second set of processor cores, wherein the second computer memory stores software that when executed by the second set of processor cores causes the second set of processor cores to ([0028] and [0035]-[0036]: describing neural network architecture that “when implemented in software and executed on processors, these neural network architectures are commonly represented using one two-dimensional matrix”.)
….

While the cited references teach the above limitations of claim 24, they do not explicitly teach: “to implement a machine-learning learning coach that learns, through machine learning, values for the one or more hyperparameters.” Hammond discloses this limitation, teaching the training of the AI model with hyperparameters (Hammond [0046]). See also Hammond [0175]: describing the computing devices comprising processors, software, hardware, and servers for the machine learning.
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the system in the cited references to include the hyperparameters in Hammond. Doing so would enable “[t]he instructor module 324 [to] instruct the learner module 328 on training the neural network 104 (e.g., which lessons should be taught in which order) with the one or more curriculums for training the one or more concepts in the mental mode using the training data and one or more hyperparameters from the hyperlearner module 325.” (Hammond [0048]). 

Regarding claim 25, the rejection of claim 21 is incorporated. Diamos teaches:
The computer system of claim 21, wherein the computer memory stores software that when executed by the set of processor cores further causes the set of processor cores to train the recurrent neural network with negative input examples ([0087]: describing an “inverse of a transformation that is performed by the first layer, e.g., to generate an intermediate representation having different size or dimensions, such that processing the inverse leads to the original format of the first layer”.).

Regarding claim 26, the rejection of claim 25 is incorporated. While the cited references teach the limitations of claim 25, they do not explicitly teach: “wherein the negative input examples comprise input examples where the output of the recurrent neural network, in operation, does not converge to an input example”. Hammond discloses this limitation, teaching that the AI system’s compiler “can match-check the schemas and report one or more errors if the schemas expected to match do not match” (Hammond [0083]), wherein the schema can denote different types of input data or output data (Hammond [0078]-[0079]) and the compiler is part of the AI system (Hammond [0096]).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the system in the cited references to include the match checking and lack of match in Hammond. Doing so would enable a compiler and mental model block of an AI engine, whereby “[a] block is a collection of one or more schemas, one or more concept nodes, one or more stream nodes…. A block can include a single input and a single output using reserved names for the input and the output.” (Hammond [0084]).  

Regarding claim 30, the rejection of claim 12 is incorporated. While the cited references teach the limitations of claim 12, they do not explicitly teach: “the input examples are audio files; and the recurrent neural network is trained to, in operation, retrieve one of the audio files in response to receiving as input a portion of the audio file”. Hammond discloses these limitations, teaching: 
“the input examples are audio files” (Hammond [0079]: describing that the input data can comprise “audio” data); and 
“the recurrent neural network is trained to, in operation, retrieve one of the audio files in response to receiving as input a portion of the audio file” (Hammond [0060], [0063], and [0096]: describing that the AI system can receive the input data and be trained).  
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the system in the cited references to include the data types in Hammond. Doing so would enable an arrangement in which “[t]he AI engine is thus configured to make determinations regarding i) when to train the neural network on each of the one or more concepts and ii) how extensively to train the neural network on each of the one or more concepts.” (Hammond [0139]). The AI engine can also receive various data types (Hammond [0080]).

Regarding claim 52, claim 52 is substantially similar to claim 21 and therefore is rejected on the same ground as claim 21. Claim 52 is a method claim that corresponds to system claim 21.

Regarding claim 53, claim 53 is substantially similar to claim 22 and therefore is rejected on the same ground as claim 22. Claim 53 is a method claim that corresponds to system claim 22.

Regarding claim 56, claim 56 is substantially similar to claim 25 and therefore is rejected on the same ground as claim 25. Claim 56 is a method claim that corresponds to system claim 25.

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant's disclosure:
Commons (U.S. Pat. No. 9,875,440): describing a neural network comprising hidden layers that can operate in an auto-associative manner. The neural network can analyze and infer patterns to determine if a certain input belongs to a pattern that has been identified by the neural network. 
Hartstein et al. (U.S. Pat. No. 5,010,512): describing a neural network with an associative memory. The associative memory can operate dynamically or statically to learn the data. New states can be learned continuously over time while old states that are not in use can be forgotten over time. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SELENE A HAEDI whose telephone number is (571)270-5762.  The examiner can normally be reached on M-F 11 AM - 7 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen can be reached on (571)272-3768.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


/S.H./Examiner, Art Unit 2121




/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121