DETAILED ACTION
This action is in response to the claims filed 11/01/2018. Claims 1-21 are pending and have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because:

Regarding Claim 1
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
at least a first array of values corresponding to a first parameter in a first equation that is used to calculate values of the first gating signal was calculated based on training data provided to the recurrent neural network
calculating a first value for the first gating signal based on the first equation using the first array of values as the first parameter;
generating a first output based on the first data and the first value for the first gating signal;
generating a second output based on the second data, and the first output;
and providing a third output identifying one or more characteristics of the input data based on the first output and the second output.

as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass the following: “Calculating values…” (a mathematical calculation). Calculating values using an equation based on parameters and training data amounts to little more than a mathematical calculation; the limitation “based on training data” only serves to further describe the data used in the calculation and does not link the abstract idea to the process of training a machine learning model. Furthermore, the limitations in the context of this claim also encompass “generating output…” and “providing output…” (evaluation performed in the mind). Simply generating or providing output data based on other data is, broadly speaking, an analysis step that can be performed in the mind. As such, the claim recites an abstract idea. Finally, the limitations “wherein the first data and the second data form at least a portion of a sequence of data and the second data comes after the first data in the sequence” and “and the first equation includes not more than two parameters corresponding to arrays of values;” only serve to further describe the recited abstract ideas.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In addition, the claim recites additional element(s) (“providing the first data as input to a recurrent neural network”, “wherein the recurrent neural network includes at least a first gate corresponding to a first gating signal” and “providing the second data as input to the recurrent neural network;”) that only generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Furthermore, the additional element of “receiving input data that includes at least first data and second data” in the context of the claim amounts to sending and receiving data, corresponding to simply appending well-understood, routine, conventional activities. (See MPEP 2106.05(d)). Therefore, the claim is not patent eligible.
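For illustration only, the sequence of steps recited in claim 1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and is not a characterization of Applicant's implementation: the function names `recurrent_step` and `analyze`, the sigmoid and tanh activations, the blending rule, and the way the third output is derived are all hypothetical choices made so that the example runs.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def recurrent_step(x_t, h_prev, u):
    """One recurrent step whose gate equation uses a single array parameter u."""
    # Gate value computed from one array of values (assumed element-wise form).
    gate = [sigmoid(u_k * h_k) for u_k, h_k in zip(u, h_prev)]
    # Hypothetical candidate state; a trained network would use learned weights here.
    candidate = [math.tanh(x_k + h_k) for x_k, h_k in zip(x_t, h_prev)]
    # The step output blends the previous output and the candidate under the gate.
    return [(1.0 - g) * h_k + g * c for g, h_k, c in zip(gate, h_prev, candidate)]

def analyze(first_data, second_data, u):
    h0 = [0.0] * len(u)
    first_output = recurrent_step(first_data, h0, u)             # from first data and gate value
    second_output = recurrent_step(second_data, first_output, u)  # from second data and first output
    # Hypothetical "third output": index of the largest summed component.
    combined = [a + b for a, b in zip(first_output, second_output)]
    third_output = max(range(len(combined)), key=combined.__getitem__)
    return first_output, second_output, third_output

first, second, label = analyze([1.0, 0.0], [0.0, 1.0], u=[0.5, 0.5])
```

Every operation in the sketch is an arithmetic evaluation of one value from other values, which is the characterization relied upon in the Step 2A Prong One analysis above.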

Regarding Claim 2
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
wherein the first parameter is an n X n matrix, and the first output is an n-element vector, wherein n > 1.  
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “calculating a first value for the first gating signal based on the first equation using the first array of values as the first parameter;” and further define the abstract idea, the above limitation including: “wherein the first parameter is an n X n matrix, and the first output is an n-element vector, wherein n > 1.” (mathematical calculation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 3
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
calculating a second value for the first gating signal based on the first equation using the first parameter and the first output as input data, wherein calculating the second value comprises multiplying the first parameter and the first output.

as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “calculating a first value for the first gating signal based on the first equation using the first array of values as the first parameter;” (mathematical calculation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.
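For illustration only, the multiplication recited in claims 2 and 3 (an n X n matrix parameter multiplied by an n-element output vector) can be sketched as follows. This is a minimal sketch and not a characterization of Applicant's implementation: the names `matvec` and `gate_from_matrix` are hypothetical, and the sigmoid activation is an assumption that the claims do not recite.

```python
import math

def matvec(U, h):
    """Multiply an n x n matrix U (a list of rows) by an n-element vector h."""
    return [sum(u_ij * h_j for u_ij, h_j in zip(row, h)) for row in U]

def gate_from_matrix(U, h_prev):
    # Assumed gate equation: gate = sigmoid(U * h_prev), with U the only array parameter.
    return [1.0 / (1.0 + math.exp(-v)) for v in matvec(U, h_prev)]

U = [[1.0, 0.0],
     [0.0, -1.0]]          # hypothetical trained 2 x 2 parameter
h_prev = [0.0, 0.0]        # hypothetical first output
gate = gate_from_matrix(U, h_prev)
```

The sketch consists solely of multiplications, additions, and one function evaluation per element.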

Regarding Claim 4
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
wherein the first parameter is an n-element vector, and the first output is an n-element vector, wherein n > 1.  
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “calculating a first value for the first gating signal based on the first equation using the first array of values as the first parameter;” and further define the abstract idea, the above limitation including: “wherein the first parameter is an n-element vector, and the first output is an n-element vector, wherein n > 1.” (mathematical calculation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.
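For illustration only, one possible reading of claim 4, in which the first parameter is an n-element vector rather than a matrix, is an element-wise product with the n-element first output. This is a minimal sketch under that assumed reading: the name `gate_from_vector` is hypothetical, and the element-wise form and the sigmoid activation are assumptions that the claim does not recite.

```python
import math

def gate_from_vector(u, h_prev):
    """Gate computed from an n-element parameter u and an n-element output h_prev."""
    if len(u) != len(h_prev):
        raise ValueError("parameter and output must have the same length n")
    # Assumed form: gate_k = sigmoid(u_k * h_prev_k), using n parameter values in total.
    return [1.0 / (1.0 + math.exp(-(u_k * h_k))) for u_k, h_k in zip(u, h_prev)]

gate = gate_from_vector([2.0, -3.0], [0.0, 0.0])
```

Under this reading the gate uses n parameter values instead of the n X n values of the matrix form.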

Regarding Claim 5
Step 1 Analysis: The claim is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a system for carrying out the method of claim 1. The Step 2A Prong One Analysis for claim 1 is applicable here since claim 5 carries out the method of claim 1 but for the recitation of the additional element “wherein the recurrent neural network comprises a long short-term memory (LSTM) unit.” (generally linking the use of the judicial exception to a particular field of use).
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) “wherein the recurrent neural network comprises a long short-term memory (LSTM) unit.” that only generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 6
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
wherein the first gate is an input gate, and the first equation includes neither a weight matrix W nor an input vector xt.  
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “calculating a first value for the first gating signal based on the first equation using the first array of values as the first parameter;” and further define the abstract idea, the above limitation including: “wherein the first gate is an input gate, and the first equation includes neither a weight matrix W nor an input vector xt.” (mathematical calculation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 7
Step 1 Analysis: The claim is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a system for carrying out the method of claim 1. The Step 2A Prong One Analysis for claim 1 is applicable here since claim 7 carries out the method of claim 1 but for the recitation of the additional element “wherein the recurrent neural network comprises a gated recurrent unit (GRU)” (generally linking the use of the judicial exception to a particular field of use).
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) “wherein the recurrent neural network comprises a gated recurrent unit (GRU)” that only generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 8
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
wherein the first gate is an update gate, and the first equation does not include a weight matrix Wz, an input vector Xt, nor a bias vector bz.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “calculating a first value for the first gating signal based on the first equation using the first array of values as the first parameter;” and further define the abstract idea, the above limitation including: “wherein the first gate is an update gate, and the first equation does not include a weight matrix Wz, an input vector Xt, nor a bias vector bz.” (mathematical calculation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 9
Step 1 Analysis: The claim is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a system for carrying out the method of claim 1. The Step 2A Prong One Analysis for claim 1 is applicable here since claim 9 carries out the method of claim 1 but for the recitation of the additional element “wherein the recurrent neural network comprises a minimal gated unit (MGU)” (generally linking the use of the judicial exception to a particular field of use).
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) “wherein the recurrent neural network comprises a minimal gated unit (MGU)” that only generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 10
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
wherein the first gate is a forget gate, and the first equation includes a bias vector bf, and does not include a weight matrix Wf, an input vector Xt, a weight matrix Uf, nor an activation unit ht-1 generated at a previous step
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “calculating a first value for the first gating signal based on the first equation using the first array of values as the first parameter;” and further define the abstract idea, the above limitation including: “wherein the first gate is a forget gate, and the first equation includes a bias vector bf, and does not include a weight matrix Wf, an input vector Xt, a weight matrix Uf, nor an activation unit ht-1 generated at a previous step” (mathematical calculation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.
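For illustration only, a forget gate whose equation includes only a bias vector bf, as recited in claim 10, reduces to evaluating a function of bf alone. This is a minimal sketch and not a characterization of Applicant's implementation: the name `forget_gate_bias_only` and the example values are hypothetical, and the sigmoid activation is an assumption that the claim does not recite.

```python
import math

def forget_gate_bias_only(b_f):
    """Forget gate that depends only on the bias vector b_f.

    No weight matrix, input vector, or previous activation appears in the
    equation, so the gate value is the same at every step of the sequence.
    """
    return [1.0 / (1.0 + math.exp(-b)) for b in b_f]

b_f = [0.0, 2.0]                       # hypothetical trained bias values
f_step1 = forget_gate_bias_only(b_f)   # gate at one step
f_step2 = forget_gate_bias_only(b_f)   # gate at the next step (identical)
```

Because neither the input nor the prior activation enters the equation, the gate value is fixed once bf is known.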

Regarding Claim 11
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
wherein the recurrent neural network uses no more than half as many parameter values as a second recurrent neural network that uses matrices U, W, and b to calculate a gating signal corresponding to the first gating signal.  
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “calculating a first value for the first gating signal based on the first equation using the first array of values as the first parameter;” and further define the abstract idea, the above limitation including: “wherein the recurrent neural network uses no more than half as many parameter values as a second recurrent neural network that uses matrices U, W, and b to calculate a gating signal corresponding to the first gating signal.” (mathematical calculation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.
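For illustration only, the parameter comparison recited in claim 11 can be worked as simple arithmetic for a single gate. This is a minimal sketch under stated assumptions: it counts the parameter values of one gate only (the claim compares whole networks), it assumes W is n X m, U is n X n, and b has n elements, and the function names, the variant labels, and the sizes n = m = 100 are hypothetical.

```python
def full_gate_params(n, m):
    """Parameter values in a gate that uses W (n x m), U (n x n), and b (n)."""
    return n * m + n * n + n

def reduced_gate_params(n, variant):
    """Parameter values in a gate that keeps only some of the arrays."""
    if variant == "U_only":    # gate equation uses only the n x n matrix U
        return n * n
    if variant == "b_only":    # gate equation uses only the n-element bias b
        return n
    raise ValueError("unknown variant: " + variant)

n, m = 100, 100                              # hypothetical hidden and input sizes
full = full_gate_params(n, m)                # 100*100 + 100*100 + 100
u_only = reduced_gate_params(n, "U_only")    # 100*100
b_only = reduced_gate_params(n, "b_only")    # 100
at_most_half = 2 * u_only <= full
```

With these assumed sizes the full gate has 20100 parameter values and the U-only gate has 10000, so the U-only gate uses no more than half as many values for that gate.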

Regarding Claim 12
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
wherein the input data is audio data, and the third output is an ordered set of words representing speech in the audio data.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “and providing a third output identifying one or more characteristics of the input data based on the first output and the second output.” and further define the abstract idea, the above limitation including: “wherein the input data is audio data, and the third output is an ordered set of words representing speech in the audio data.” (evaluation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 13
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
wherein the input data is a first ordered set of words in a first language, and the third output is a second ordered set of words in a second language representing a translation from the first language to the second language.  
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “and providing a third output identifying one or more characteristics of the input data based on the first output and the second output.” and further define the abstract idea, the above limitation including: “wherein the input data is a first ordered set of words in a first language, and the third output is a second ordered set of words in a second language representing a translation from the first language to the second language.” (evaluation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 14
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
wherein the second output is calculated as ht = Ot ⊙ g(Ct), where g is a non-linear activation function, Ct is an output of a memory cell of an LSTM unit, Ot is an output gate signal, and ⊙ is element-wise (Hadamard) multiplication.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “generating a second output based on the second data, and the first output;” and further define the abstract idea, the above limitation including: “wherein the second output is calculated as ht = Ot ⊙ g(Ct), where g is a non-linear activation function, Ct is an output of a memory cell of an LSTM unit, Ot is an output gate signal, and ⊙ is element-wise (Hadamard) multiplication.” (mathematical calculation and evaluation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.
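For illustration only, the output equation recited in claim 14, in which the second output is the element-wise (Hadamard) product of the output gate signal Ot and g(Ct), can be sketched as follows. This is a minimal sketch and not a characterization of Applicant's implementation: the names `hadamard` and `lstm_output` are hypothetical, and the choice of tanh for the non-linear activation function g is an assumption, since the claim does not name a particular function.

```python
import math

def hadamard(a, b):
    """Element-wise (Hadamard) product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [a_k * b_k for a_k, b_k in zip(a, b)]

def lstm_output(o_t, c_t, g=math.tanh):
    # h_t = o_t (element-wise product) g(c_t); g defaults to tanh by assumption.
    return hadamard(o_t, [g(c) for c in c_t])

h_t = lstm_output([1.0, 0.5], [0.0, 1.0])   # hypothetical gate and cell values
```

Each element of the result is a single multiplication of one gate value by one activation value.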

Regarding Claim 15
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
at least one gating signal has a different dimension than an output signal of a memory cell of one of the plurality of LSTM units
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “calculating a first value for the first gating signal based on the first equation using the first array of values as the first parameter;” and further define the abstract idea, the above limitation including: “at least one gating signal has a different dimension than an output signal of a memory cell of one of the plurality of LSTM units” (mathematical calculation). As such, the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In addition, the claim recites additional element(s) (“wherein the recurrent neural network comprises a plurality of LSTM units”) that only generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 16
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
wherein an update gate signal is a scalar.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “calculating a first value for the first gating signal based on the first equation using the first array of values as the first parameter;” and further define the abstract idea, the above limitation including: “wherein an update gate signal is a scalar.” (mathematical calculation). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 17
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
wherein a forget gate signal is a scalar
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “calculating a first value for the first gating signal based on the first equation using the first array of values as the first parameter;” and further define the abstract idea, the above limitation including: “wherein a forget gate signal is a scalar” (mathematical calculation). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 18
Step 1 Analysis: The claim is directed to a [method], which is directed to [a process], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [method for analyzing data using a reduced parameter gating signal]. Each of the following limitations:
at least a second array of values corresponding to a second parameter in a second equation that is used to calculate values of the memory cell signal was calculated based on training data provided to the recurrent neural network, the second equation includes not more than one parameter corresponding to a multidimensional array of values
calculating a first value for the memory-cell signal;
and generating the first output based on the first data, the first value for the first gating signal, and the first value for the memory-cell signal.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “at least a second array of values corresponding to a second parameter in a second equation that is used to calculate values of the memory cell signal was calculated based on training data provided to the recurrent neural network, the second equation includes not more than one parameter corresponding to a multidimensional array of values” and “calculating a first value for the memory-cell signal;” and “generating the first output based on the first data, the first value for the first gating signal, and the first value for the memory-cell signal.” (mathematical calculation and evaluation). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In addition, the claim recites additional element(s) (“wherein the recurrent neural network includes a memory cell corresponding to a memory cell signal,”) that only generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 19
Step 1 Analysis: The claim is directed to a [system], which is directed to [a machine], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [A system for analyzing sequential data using a reduced parameter gating signal]. Each of the following limitations:
at least a first array of values corresponding to a first parameter in a first equation that is used to calculate values of the first gating signal was calculated based on training data provided to the recurrent neural network,
a second array of values corresponding to a second parameter in a second equation that is used to calculate values of the memory-cell signal was calculated based on the training data provided to the recurrent neural network,
the first equation includes not more than two parameters corresponding to arrays of values, and the second equation includes not more than one parameter corresponding to a multidimensional array of values;
calculate a first value for the first gating signal based on the first equation using the first array of values as the first parameter;
calculate a first value for the memory-cell signal based on the second equation using the second array of values as the second parameter;
generate a first output based on the first data, the first value for the first gating signal, and the first value for the memory-cell signal; 
generate a second output based on the second data, and the first output;
and provide a third output identifying one or more characteristics of the input data based on the first output and the second output.
wherein the first data and the second data form at least a portion of a sequence of data and the second data comes after the first data in the sequence;
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. For example, but for the generic computer components language (“at least one processor that is programmed to”) the above limitations in the context of this claim encompass the following: “Calculating values…” (a mathematical calculation). Calculating values using an equation based on parameters and training data amounts to little more than a mathematical calculation; the limitation “based on training data” only serves to further describe the data used in the calculation, and does not link the abstract idea to the process of training a machine learning model. Furthermore, the limitations in the context of this claim also encompass “generating output…” and “providing output…” (evaluation performed in the mind). Simply generating/providing output data based on other data, broadly speaking, is an analysis step that can be performed in the mind. As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) that are mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional element(s) of “at least one processor that is programmed to”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. In addition, the claim recites additional element(s) (“provide the first data as input to a recurrent neural network,”, “wherein the recurrent neural network comprises a long short-term memory (LSTM) unit including at least a first gate corresponding to a first gating signal, and a memory cell corresponding to a memory-cell signal,” and “provide the second data as input to the recurrent neural network;”) that only generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Furthermore, the additional element of “receive input data that includes at least first data and second data” in the context of the claim amounts to sending and receiving data, corresponding to simply appending well-understood, routine, conventional activities. (See MPEP 2106.05(d)). Therefore, the claim is not patent eligible.

Regarding Claim 20
Step 1 Analysis: The claim is directed to a [system], which is directed to [a machine], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [A system for analyzing sequential data using a reduced parameter gating signal]. Each of the following limitations:

[Image media_image1.png: limitations of claim 20, reproduced as an image in the original action]
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “calculate a first value for the memory-cell signal based on the second equation” and further define the abstract idea, the above limitation including: the limitations of claim 20 (mathematical calculation). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) that are mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional element(s) of “at least one processor that is programmed to”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 21
Step 1 Analysis: The claim is directed to a [system], which is directed to [a machine], one of the statutory categories.
Step 2A Prong One Analysis: The claim recites [A system for analyzing sequential data using a reduced parameter gating signal]. Each of the following limitations:

[Image media_image2.png: limitations of claim 21, reproduced as an image in the original action]
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. The above limitations in the context of this claim encompass “calculate a first value for the memory-cell signal based on the second equation” and further define the abstract idea, the above limitation including: the limitations of claim 21 (mathematical calculation). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) that are mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional element(s) of “at least one processor that is programmed to”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.
	

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5, 15, 16, 17 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Olah et al., “Understanding LSTM Networks” (hereinafter Olah), further in view of Chen et al., “A Gentle Tutorial of Recurrent Neural Network with Error Backpropagation” (hereinafter Chen).

Regarding claim 1
Olah teaches, A method for analyzing data using a reduced parameter gating signal, the method comprising:  receiving input data that includes at least first data and second data, (
[Image media_image3.png: Olah Figure 6, annotated by examiner]
pg 5, as shown in Figure 6 (annotated by examiner), the LSTM module receives at least two input data, xt-1 and xt) providing the first data as input to a recurrent neural network; providing the second data as input to the recurrent neural network; (pg 4 ¶01 “Long Short Term Memory networks – usually just called “LSTMs” – are a special kind of RNN, capable of learning long-term dependencies.” The LSTM, a type of RNN, is a module that receives the previously mapped input data, including the first and second data, as shown in Figure 6) wherein the recurrent neural network includes at least a first gate corresponding to a first gating signal, (pg 6 ¶01 The Core Idea Behind LSTMs “The LSTM does have the ability to remove or add information to the cell state, carefully regulated by structures called gates.” Pg 6 Step-by-Step LSTM Walk Through ¶01 “The first step in our LSTM is to decide what information we’re going to throw away from the cell state. This decision is made by a sigmoid layer called the “forget gate layer.” It looks at ht-1 and xt, and outputs a number between 0 and 1 for each number in the cell state” 
[Image media_image4.png: Olah Figure 9, pg 7]
internal to a LSTM Cell (shown in Figure 9 pg 7)  there is at least a first gating signal, which is calculated by a first gate.) at least a first array of values corresponding to a first parameter in a first equation that is used to calculate values of the first gating signal (pg 7 Figure 9 
[Image media_image5.png: forget gate equation, Olah Figure 9, pg 7]
The equation includes an array of values corresponding to a first parameter in the equation. One such parameter is the parameter Wf; ht-1 and xt are not parameters but input values, as shown in the figure. As made clear by the equation, the parameter is used to calculate ft, the first signal) calculating a first value for the first gating signal based on the first equation using the first array of values as the first parameter; (pg 7 Figure 9 
[Image media_image5.png: forget gate equation, Olah Figure 9, pg 7]
The equation includes an array of values corresponding to a first parameter in the equation, as stated previously. ft corresponds to the first value) generating a first output based on the first data and the first value for the first gating signal; (pg 7 with reference to Figure 10 “We multiply the old state by ft, forgetting the things we decided to forget earlier. Then we add it*Ct. This is the new candidate values, scaled by how much we decided to update each state value.”
[Image media_image6.png: cell state update equation, Olah Figure 10, pg 7]
 Ct is generated based on the first value, which is itself based on the first data. Pg 8 ¶01 “Finally, we need to decide what we’re going to output. This output will be based on our cell state, but will be a filtered version” 
[Image media_image7.png: output equation, Olah pg 8]
ht is the first output, which is calculated based on Ct thus based on the first data and the first value) generating a second output based on the second data, and the first output (Figure 6 pg 5
[Image media_image8.png: Olah Figure 6, pg 5]
As discussed previously, an RNN is made up of multiple cells. Each cell takes in input to generate output, and each cell in an LSTM operates in the manner previously described, in which a gate signal is used to generate an output.)
	Olah does not explicitly teach, …form at least a portion of a sequence of data and the second data comes after the first data in the sequence;… was calculated based on training data provided to the recurrent neural network, and providing a third output identifying one or more characteristics of the input data based on the first output and the second output.
	However Chen, when addressing training recurrent neural network cells with trained gating signals and processing on first and second sequence data, teaches, …form at least a portion of a sequence of data and the second data comes after the first data in the sequence; (pg 5 “The core of LSTM is a memory unit (or cell) ct in Fig. 2, which encodes the information of the inputs that have been observed up to that step….given a sequence data {x1, ..., xT }” the input that is encoded by the LSTM module is first and second data from the ordered sequence X.) was calculated based on training data provided to the recurrent neural network, (pg 9 “If the groundtruth at time t is yt , we can consider minimizing least square ½ (yt−zt)^2 or cross entropy to estimate model parameters….Thus, for the top layer classification with weight Whz, we can take derivative w.r.t. zt and Whz respectively” pg 8 Section 3.3 “we can backpropagate the error from T to 1 via Eqs. 25-34. After we get gradients using backpropagation, the model θ can be learnt with gradient based methods, such as stochastic gradient descent and L-BFGS)” The model parameters are estimated based on the data fed to the network. The data used to calculate the derivative of the weight parameters with respect to the weight matrix is the training data.) and providing a third output identifying one or more characteristics of the input data based on the first output and the second output. (pg 6 “Same as RNNs to make prediction, we can add a linear model over the hidden state ht , and output the likelihood with softmax function” the hidden state output ht corresponding to the second output, which is based on the first output, generates an output zt based on the second output and linear model parameters. zt is a prediction which corresponds to identifying a characteristic of the input data.)
	It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a method that calculates outputs for recurrent network cells to make an output prediction based on trained parameters, as taught by Chen, into the disclosed invention of Olah.
One of ordinary skill in the art would have been motivated to make this modification in order to apply LSTMs to the problem of sequence labeling (“The goal of this work is for sequence labeling, i.e. classify all items in a sequence” Chen pg 1)
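For illustration only, the mapping above can be sketched in a few lines of Python. This is a minimal sketch, not code from Olah or Chen: it assumes the standard LSTM equations shown in Olah's figures, a softmax layer over the last hidden state as Chen describes (bias omitted), and placeholder weights that are not trained. The names lstm_step, analyze, constant_params and W_hz as a Python variable are hypothetical.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(W, v):
    # Multiply matrix W (a list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def affine(W, v, b):
    return [a + bk for a, bk in zip(matvec(W, v), b)]

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM cell step; returns (h_t, c_t, f_t)."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(p["Wf"], z, p["bf"])]    # forget gate signal
    i_t = [sigmoid(a) for a in affine(p["Wi"], z, p["bi"])]    # input gate signal
    o_t = [sigmoid(a) for a in affine(p["Wo"], z, p["bo"])]    # output gate signal
    g_t = [math.tanh(a) for a in affine(p["Wc"], z, p["bc"])]  # candidate values
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t, f_t

def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

def constant_params(n, m, value=0.1):
    # Placeholder (untrained) weights: an assumption for this sketch only.
    def W():
        return [[value] * (n + m) for _ in range(n)]
    def b():
        return [0.0] * n
    return {"Wf": W(), "bf": b(), "Wi": W(), "bi": b(),
            "Wo": W(), "bo": b(), "Wc": W(), "bc": b()}

def analyze(xs, p, W_hz):
    """Feed the sequence xs through the cell, then classify the last hidden state."""
    n = len(p["bf"])
    h, c = [0.0] * n, [0.0] * n
    hidden = []
    for x_t in xs:  # the second data is processed after the first data
        h, c, _ = lstm_step(x_t, h, c, p)
        hidden.append(h)
    prediction = softmax(matvec(W_hz, hidden[-1]))  # the "third output"
    return hidden, prediction

params = constant_params(n=2, m=1)
hidden, prediction = analyze([[1.0], [2.0]], params,
                             W_hz=[[0.5, -0.5], [-0.5, 0.5]])
```

In this sketch hidden[0] plays the role of the first output, hidden[1] the second output (it depends on hidden[0] through h_prev and c_prev), and prediction the third output.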

Regarding claim 5
	Further Olah/Chen teaches the method of claim 1
	Further Olah teaches, wherein the recurrent neural network comprises a long short-term memory (LSTM) unit ( pg 4 “Long Short Term Memory networks – usually just called “LSTMs” – are a special kind of RNN, capable of learning long-term Dependencies”  the neural network described includes at least one LSTM unit or cell in the RNN network.)

Regarding claim 18
	Further Olah/Chen teaches the method of claim 1
	Further Olah teaches, wherein the recurrent neural network includes a memory cell corresponding to a memory cell signal, ( pg 7 ¶01 “The next step is to decide what new information we’re going to store in the cell state. This has two parts. First, a sigmoid layer called the “input gate layer” decides which values we’ll update. Next, a tanh layer creates a vector of new candidate values, that could be added to the state” 
[Image media_image9.png: Olah pg 7, LSTM cell figure]
Part of the LSTM cell includes a component that produces a memory cell signal Ct. Ct is the memory cell signal because it is the signal that indicates “the new information we’re going to store in the cell state”. Storing is indicative of memory.) the method further comprising: calculating a first value for the memory-cell signal; and generating the first output based on the first data, the first value for the first gating signal, and the first value for the memory-cell signal. ( pg 8 
[Image media_image10.png: Olah pg 8, LSTM cell figure]
The output ht corresponding to the first output is calculated based on the first data, xt, the first value for the memory cell, Ct, and the first value for the first gating signal ot.) a second array of values…that is used to calculate values of the memory cell signal …the second equation includes not more than one parameter corresponding to a multidimensional array of values, (pg 7
[Image media_image9.png: Olah pg 7, LSTM cell figure]
Part of the LSTM cell calculates the memory signal Ct based on a second array of values, Wc. The second equation, which calculates the memory signal, has only one multidimensional parameter)
Further Chen teaches, was calculated based on training data provided to the recurrent neural network, (pg 9 “If the groundtruth at time t is yt , we can consider minimizing least square ½ (yt−zt)^2 or cross entropy to estimate model parameters….Thus, for the top layer classification with weight Whz, we can take derivative w.r.t. zt and Whz respectively” pg 8 Section 3.3 “we can backpropagate the error from T to 1 via Eqs. 25-34. After we get gradients using backpropagation, the model θ can be learnt with gradient based methods, such as stochastic gradient descent and L-BFGS)” The model parameters are estimated based on the data fed to the network. The data used to calculate the derivative of the weight parameters with respect to the weight matrix is the training data.)
For the reasons to combine Olah and Chen, see the rejection of claim 1

Regarding claim 15
	Further Olah/Chen teaches the method of claim 1
Further Olah teaches, wherein the recurrent neural network comprises a plurality of LSTM units, ( pg 5
[Image media_image3.png: Olah Figure 6, annotated by examiner]
Figure 6, which was referenced in the rejection of claim 1, teaches a plurality of LSTM cells that form the neural network) and at least one gating signal has a different dimension than an output signal of a memory cell of one of the plurality of LSTM units. (pg 5 ¶02 “In the above diagram, each line carries an entire vector, from the output of one node to the inputs of others.” The LSTM unit takes as input a vector of scalar gating signal values. The output signal of a memory cell, as shown, is a vector of dimension 1. Conversely, the input includes at least one gating signal which is a scalar, and a scalar has a dimension of 0. Thus the gating signal and the output signal have a different dimension.)
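For illustration only, the dimension comparison above can be sketched as follows. The sketch assumes that "dimension" is read as the number of axes (a scalar has 0, a vector has 1), which is the reading the reasoning above relies on; the helper name ndim and the sample values are hypothetical.

```python
def ndim(value):
    """Count the nesting depth of lists: scalar -> 0, vector -> 1, matrix -> 2."""
    depth = 0
    while isinstance(value, list):
        depth += 1
        value = value[0] if value else None
    return depth

gating_signal = 0.73                     # a single scalar gating value (made-up number)
cell_output_signal = [0.12, -0.40, 0.05] # a vector output signal (made-up numbers)

different_dimension = ndim(gating_signal) != ndim(cell_output_signal)
```

Under this reading, different_dimension is true whenever one signal is a scalar and the other is a vector, regardless of the vector's length.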

Regarding claim 16
	Further Olah/Chen teaches the method of claim 1
Further Olah teaches, wherein an update gate signal is a scalar (pg 5 Figure 6 “In the above diagram, each line carries an entire vector, from the output of one node to the inputs of others.” Examiner notes, as described in claim 15, the bold arrows in the figure depict the flow of the gating signals. The vector output by the update function includes a plurality of scalar values, or gating signals.)

Regarding claim 17
	Further Olah/Chen teaches the method of claim 1
Further Olah teaches, wherein a forget gate signal is a scalar. (pg 5 figure 6 “In the above diagram, each line carries an entire vector, from the output of one node to the inputs of others.” Examiner notes, as described in claim 15, the bold arrows in the figure depict the flow of the gating signals. The vector output by the forget function includes a plurality of scalar values, or gating signals.)

Claims 2, 3 and 4 are rejected under 35 U.S.C. 103 as being unpatentable over Olah/Chen, further in view of Liu et al., “CNN-LSTM Neural Network Model for Quantitative Strategy Analysis in Stock Markets” (hereinafter Liu).

Regarding claim 2
Further Olah/Chen teaches the method of claim 1
Olah/Chen does not explicitly teach, wherein the first parameter is a n X n matrix, and the first output is an n-element vector, wherein n >= 1
However Liu, when addressing issues related to parameter size and input data size of recurrent neural network cells, teaches, wherein the first parameter is a n X n matrix, and the first output is an n-element vector, wherein n >= 1 (Section 2.2 “The vector formulas for LSTM layer forward pass are given in [15]. In order to facilitate your understanding, just listed below:... where xt is the input vector at time t, the W are input weight matrices, the R are square recurrent weight matrices, the p are peephole weight vectors and b are bias vectors.” The first parameter corresponds to the square parameter matrix R; xt is the input vector. By definition a vector includes n elements wherein n>=1.)
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a recurrent neural network with both square parameter matrices and 1d parameter vectors, as taught by Liu, into the disclosed invention of Olah/Chen.
One of ordinary skill in the art would have been motivated to make this modification in order to implement an LSTM RNN disclosed by Olah and Chen to solve the problem presented by Liu.

Regarding claim 3
	Further Olah/Chen/Liu teaches the method of claim 2
	Further Olah teaches, further comprising calculating a second value for the first gating signal based on the first equation using the first parameter and the first output as input data, wherein calculating the second value comprises multiplying the first parameter and the first output. (pg 5 as stated previously the second cell in the figure B cited in claim one produces a second value ft which is based on the previous cell's output ht-1 and uses the first parameter Wf as input data. As shown in the equation, Wf is multiplied by ht-1 in order to generate the second value.)
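For illustration only, the multiplication of an n X n parameter by an n-element previous output can be sketched as below. This is a simplified sketch in the notation quoted from Liu (W input weights, R square recurrent weights, b bias) with the peephole term omitted; the function name gate_value and all numeric values are hypothetical and are not taken from any cited reference.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(M, v):
    # Multiply matrix M (a list of rows) by vector v, checking shapes first.
    if any(len(row) != len(v) for row in M):
        raise ValueError("matrix column count must match vector length")
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gate_value(W, x_t, R, h_prev, b):
    """Gate signal sigma(W x_t + R h_prev + b), with the peephole term omitted."""
    n = len(h_prev)
    if len(R) != n:
        raise ValueError("R must be an n x n matrix for an n-element output")
    inp = matvec(W, x_t)     # contribution of the current input
    rec = matvec(R, h_prev)  # first parameter multiplied by the first output
    return [sigmoid(a + r + bk) for a, r, bk in zip(inp, rec, b)]

# Second gate value computed from the first output h1 (made-up numbers, n = 2).
h1 = [1.0, -1.0]
R_identity = [[1.0, 0.0], [0.0, 1.0]]
second_value = gate_value(W=[[0.0], [0.0]], x_t=[3.0],
                          R=R_identity, h_prev=h1, b=[0.0, 0.0])
```

Because R is n X n and the previous output has n elements, the product R h_prev is again an n-element vector, so the gate signal has one entry per element of the output.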

Regarding claim 4
	Further Olah/Chen teaches the method of claim 1
	Olah/Chen does not explicitly teach, wherein the first parameter is an n-element vector, and the first output is an n-element vector, wherein n >= 1
	However, Liu, when addressing issues related to parameter size and input data size of recurrent neural network cells, teaches wherein the first parameter is an n-element vector, and the first output is an n-element vector, wherein n >= 1 (Section 2.2 “The vector formulas for LSTM layer forward pass are given in [15]. In order to facilitate your understanding, just listed below:... where xt is the input vector at time t, the W are input weight matrices, the R are square recurrent weight matrices, the p are peephole weight vectors and b are bias vectors.” The first parameter corresponds to the bias vector b; xt is the input vector. By definition a vector includes n elements wherein n >= 1.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a recurrent neural network with both square parameter matrices and 1-D parameter vectors, as taught by Liu, into the disclosed invention of Olah/Chen.
One of ordinary skill in the art would have been motivated to make this modification in order to implement the LSTM RNN disclosed by Olah and Chen to solve the problem presented by Liu.

Claims 6 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Olah/Chen, further in view of Lu et al. “Empirical Evaluation of A New Approach to Simplifying Long Short-term Memory (LSTM)*” hereinafter Lu.

Regarding claim 6
	Further Olah/Chen teaches the method of claim 1
	Olah/Chen does not explicitly teach, wherein the first gate is an input gate, and the first equation includes neither a weight matrix Wi nor an input vector xt.  
	However Lu when addressing issues related to parameter reduced gating functions in recurrent neural network cells teaches, wherein the first gate is an input gate, and the first equation includes neither a weight matrix Wi nor an input vector xt.  (pg 2 Section 2 ¶03 “Here, three simplifications were made to the vanilla LSTM by removing certain components from all the three gates as follows:… 3) No Input Signal and No Hidden Unit Signal… 
[image: media_image11.png]
” The modified gate signal includes neither a weight matrix Wi nor an input vector xt.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a recurrent neural network with modified gating functions, as taught by Lu, into the disclosed invention of Olah/Chen.
One of ordinary skill in the art would have been motivated to make this modification in order to implement LSTM variants. (“The main benefit of the three LSTM variants is to reduce the number of parameters involved, and thus reduce the model complexity and the computation cost” (pg 4 right column Lu))

Regarding claim 14
	Further Olah/Chen teaches the method of claim 1
	Olah/Chen does not explicitly teach, wherein the second output is calculated as ht = Ot ⊙ g(Ct), where g is a non-linear activation function, Ct is an output of a memory cell of an LSTM unit, Ot is an output gate signal, and ⊙ is element-wise (Hadamard) multiplication
	However, Lu, when addressing issues related to parameter reduced gating functions in recurrent neural network cells, teaches wherein the second output is calculated as ht = Ot ⊙ g(Ct), where g is a non-linear activation function, Ct is an output of a memory cell of an LSTM unit, Ot is an output gate signal, and ⊙ is element-wise (Hadamard) multiplication (pg 2 Section 2 ¶02 “The equations for the LSTM memory block are given as follows:… 
[image: media_image12.png]
 the operator * denotes the element-wise vector product” The element-wise vector product is equivalent to Hadamard multiplication. Furthermore, tanh is a non-linear activation function.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a recurrent neural network with modified gating functions, as taught by Lu, into the disclosed invention of Olah/Chen.
One of ordinary skill in the art would have been motivated to make this modification in order to implement LSTM variants. (“The main benefit of the three LSTM variants is to reduce the number of parameters involved, and thus reduce the model complexity and the computation cost” (pg 4 right column Lu))
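For illustration only, the output calculation ht = Ot ⊙ g(Ct) recited in claim 14 can be sketched as follows, assuming g is tanh (the activation function noted above). The function names and numeric values are hypothetical and are not taken from Lu or the claims.

```python
import math


def hadamard(a, b):
    """Element-wise (Hadamard) product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x * y for x, y in zip(a, b)]


def lstm_output(o_t, c_t, g=math.tanh):
    """Compute h_t = o_t (*) g(c_t), where (*) is the element-wise product."""
    return hadamard(o_t, [g(c) for c in c_t])


# Hypothetical output-gate signal and memory-cell values.
o_t = [1.0, 0.0, 0.5]
c_t = [0.0, 2.0, 1.0]

h_t = lstm_output(o_t, c_t)
print(h_t)
```

An output-gate value of 0 suppresses the corresponding memory-cell element entirely, while a value of 1 passes g(Ct) through unchanged.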


Claims 7, 9, and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Olah/Chen, further in view of Zhou et al. “Minimal Gated Unit for Recurrent Neural Networks” hereinafter Zhou.

Regarding claim 7
	Further Olah/Chen teaches the method of claim 1
	Olah/Chen does not explicitly teach, wherein the recurrent neural network comprises a gated recurrent unit (GRU). 
	However, Zhou, when addressing variations of the standard LSTM recurrent neural network cell, teaches wherein the recurrent neural network comprises a gated recurrent unit (GRU). (pg 4 ¶02 “The Gated Recurrent Unit (GRU) architecture further simplifies LSTM-like units… GRU contains two gates: an update gate z (whose role is similar to the forget gate) and a reset gate r (whose role loosely matches the input gate). GRU’s update rules are shown as Equation (5a) to (5d)” A GRU is a variation of a recurrent neural network similar to LSTM, with its own cell state equations.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a GRU or MGU cell in a recurrent neural network, as taught by Zhou, into the disclosed invention of Olah/Chen.
One of ordinary skill in the art would have been motivated to make this modification in order to implement simplified LSTM models in which “the accuracy of GRU is usually higher than that of LSTM, albeit the fact that GRU has one less hidden state and one less gate than LSTM.” (pg 2 ¶02) and further “With only one gate, we expect MGU will have significantly fewer parameters to learn than GRU or LSTM, and also fewer components or variations to tune.” (pg 2 column 1 Zhou)

Regarding claim 9
	Further Olah/Chen teaches the method of claim 1
	Olah/Chen does not explicitly teach, wherein the recurrent neural network comprises a minimal gated unit (MGU).
	However, Zhou, when addressing variations of the standard LSTM recurrent neural network cell, teaches wherein the recurrent neural network comprises a minimal gated unit (MGU). (pg 2 ¶04 “In this paper, we propose a new variant of GRU (which is also a variant of LSTM), which has minimal number of gates–only one gate! Hence, the proposed method is named as the Minimal Gated Unit (MGU)” An MGU is a variation of a recurrent neural network similar to LSTM, with its own cell state equations.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a GRU or MGU cell in a recurrent neural network, as taught by Zhou, into the disclosed invention of Olah/Chen.
One of ordinary skill in the art would have been motivated to make this modification in order to implement simplified LSTM models in which “the accuracy of GRU is usually higher than that of LSTM, albeit the fact that GRU has one less hidden state and one less gate than LSTM.” (pg 2 ¶02) and further “With only one gate, we expect MGU will have significantly fewer parameters to learn than GRU or LSTM, and also fewer components or variations to tune.” (pg 2 column 1 Zhou)

Regarding claim 11
	Further Olah/Chen teaches the method of claim 1
	Olah/Chen does not explicitly teach, wherein the recurrent neural network uses no more than half as many parameter values as a second recurrent neural network that uses matrices U, W, and b, to calculate a gating signal corresponding to the first gating signal
However, Zhou, when addressing variations of the standard LSTM recurrent neural network cell, teaches wherein the recurrent neural network uses no more than half as many parameter values as a second recurrent neural network that uses matrices U, W, and b, to calculate a gating signal corresponding to the first gating signal (Conclusion “The proposed Minimal Gated Unit (MGU) has the minimal design in any gated hidden unit for RNN. It has only one gate (the forget gate) and does not involve the peephole connection. Hence, the number of parameters in MGU is only half of that in the Long Short-Term Memory (LSTM),” The MGU unit has half as many parameter values as a second neural network, the LSTM, which calculates a gating signal based on at least 3 parameters as shown in equations 4a-4f, where there are at least 4 different parameter matrices (Wf, Wi, Wo, and Wc) and further weight vectors (bf, bi, bo, and bc).)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a GRU or MGU cell in a recurrent neural network, as taught by Zhou, into the disclosed invention of Olah/Chen.
One of ordinary skill in the art would have been motivated to make this modification in order to implement simplified LSTM models in which “the accuracy of GRU is usually higher than that of LSTM, albeit the fact that GRU has one less hidden state and one less gate than LSTM.” (pg 2 ¶02) and further “With only one gate, we expect MGU will have significantly fewer parameters to learn than GRU or LSTM, and also fewer components or variations to tune.” (pg 2 column 1 Zhou)
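For illustration only, the "half as many parameters" relationship cited from Zhou can be checked with a simple count. This is a minimal sketch assuming each gate or candidate block holds one input weight matrix (n x m), one recurrent weight matrix (n x n), and one bias vector (n), with no peephole connections; the function name and the sizes n and m are hypothetical.

```python
def gated_unit_params(n, m, num_blocks):
    """Count parameters for a unit with `num_blocks` gate/candidate blocks.

    Each block is assumed to hold an n x m input matrix, an n x n
    recurrent matrix, and an n-element bias vector.
    """
    return num_blocks * (n * m + n * n + n)


# Hypothetical sizes: hidden size n = 100, input size m = 50.
n, m = 100, 50
lstm = gated_unit_params(n, m, 4)  # forget, input, output gates + candidate
gru = gated_unit_params(n, m, 3)   # update, reset gates + candidate
mgu = gated_unit_params(n, m, 2)   # forget gate + candidate

print(lstm, gru, mgu)  # 60400 45300 30200
```

Under these assumptions the MGU count is exactly half the LSTM count for any n and m, because the per-block size is the same and only the number of blocks changes.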

Claims 8 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Olah/Chen/Zhou, further in view of Lu et al. “Empirical Evaluation of A New Approach to Simplifying Long Short-term Memory (LSTM)*” hereinafter Lu.

Regarding claim 8
	Further Olah/Chen/Zhou teaches the method of claim 7
	Olah/Chen/Zhou does not explicitly teach, wherein the first gate is an update gate, and the first equation does not include a weight matrix Wz, an input vector Xt, nor a bias vector bz.
	However Lu when addressing issues related to parameter reduced gating functions in recurrent neural network cells teaches, wherein the first gate is an update gate, and the first equation does not include a weight matrix Wz, an input vector Xt, nor a bias vector bz. (pg 2 Section 2 ¶03 “Here, three simplifications were made to the vanilla LSTM by removing certain components from all the three gates as follows:… 3) No Input Signal and No Bias… 
[image: media_image13.png]
” The modified gate signal does not include a weight matrix Wz, an input vector xt, or a bias vector bz. The update gate corresponds to the input gate in Lu because the input gate produces a signal used to update the cell state. Further, although Ui is a matrix, it is distinct from the weight matrix Wi, which is the training parameter used in the standard gating function depicted in equation (1)
[image: media_image14.png]
 )
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a recurrent neural network with modified gating functions, as taught by Lu, into the disclosed invention of Olah/Chen.
One of ordinary skill in the art would have been motivated to make this modification in order to implement LSTM variants. (“The main benefit of the three LSTM variants is to reduce the number of parameters involved, and thus reduce the model complexity and the computation cost” (pg 4 right column Lu))


Regarding claim 10
	Further Olah/Chen/Zhou teaches the method of claim 9
	Olah/Chen/Zhou does not explicitly teach, wherein the first gate is a forget gate, and the first equation includes a bias vector bf, and does not include a weight matrix Wf, an input vector Xt, a weight matrix Uf, nor an activation unit ht-1 generated at a previous step.
	However, Lu, when addressing issues related to parameter reduced gating functions in recurrent neural network cells, teaches wherein the first gate is a forget gate, and the first equation includes a bias vector bf, and does not include a weight matrix Wf, an input vector Xt, a weight matrix Uf, nor an activation unit ht-1 generated at a previous step. (pg 2 Section 2 ¶03 “Here, three simplifications were made to the vanilla LSTM by removing certain components from all the three gates as follows:… 3) No Input Signal and No Bias… 
[image: media_image15.png]
” The modified forget gate signal does not include a weight matrix Wf, an input vector xt, a weight matrix Uf, or an activation unit ht-1 generated at a previous timestep. The equation cited is the forget gate with only a bias vector bf; although it includes an activation function, it does not include an activation unit, ht-1, generated from the previous step.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a recurrent neural network with modified gating functions, as taught by Lu, into the disclosed invention of Olah/Chen.
One of ordinary skill in the art would have been motivated to make this modification in order to implement LSTM variants. (“The main benefit of the three LSTM variants is to reduce the number of parameters involved, and thus reduce the model complexity and the computation cost” (pg 4 right column Lu))


Claims 12 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Olah/Chen, further in view of Hannun et al. “Deep Speech: Scaling up end-to-end speech recognition” hereinafter Hannun.

Regarding claim 12
Further Olah/Chen teaches the method of claim 1
Olah/Chen does not explicitly teach, wherein the input data is audio data, and the third output is an ordered set of words representing speech in the audio data.
However, Hannun, when addressing the use of recurrent neural networks to process sequential data for classification, teaches wherein the input data is audio data, and the third output is an ordered set of words representing speech in the audio data. (Section 2 “The core of our system is a recurrent neural network (RNN) trained to ingest speech spectrograms and generate English text transcriptions.” The final output of English text is an ordered set of words representing the audio data ingested by the RNN.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use a recurrent neural network to map sequential audio data to a classification, as taught by Hannun, with the disclosed invention of Olah/Chen.
One of ordinary skill in the art would have been motivated to make this modification in order to implement the LSTM RNN disclosed by Olah and Chen to solve the sequence prediction problem presented by Hannun.

Regarding claim 19
	Olah teaches, A system for analyzing sequential data using a reduced parameter gating signal, receive input data that includes at least first data and second data, (
[image: media_image3.png]
pg 5: as shown in Figure 6, the LSTM module receives at least two input data xt-1 and xt) provide the first data as input to a recurrent neural network, wherein the recurrent neural network comprises a long short-term memory (LSTM) unit; generate a second output based on the second data, and the first output; (pg 4 ¶01 “Long Short Term Memory networks – usually just called “LSTMs” – are a special kind of RNN, capable of learning long-term dependencies.” The LSTM, a type of RNN, is a module that receives the previously mapped input data, including the first and second data, as shown in Figure 6) including at least a first gate corresponding to a first gating signal, (pg 5 The Core Idea Behind LSTMs “The LSTM does have the ability to remove or add information to the cell state, carefully regulated by structures called gates.” Pg 6 Step-by-Step LSTM Walk Through “The first step in our LSTM is to decide what information we’re going to throw away from the cell state. This decision is made by a sigmoid layer called the “forget gate layer.” It looks at ht-1 and xt, and outputs a number between 0 and 1 for each number in the cell state” 
[image: media_image4.png]
Internal to an LSTM cell (shown in Figure 9) there is at least a first gating signal, which is calculated by a first gate.) at least a first array of values corresponding to a first parameter in a first equation that is used to calculate values of the first gating signal (Figure 9 
[image: media_image5.png]
The equation includes an array of values corresponding to a first parameter in the equation. One such parameter is the parameter Wf; ht-1 and xt are not parameters but input values, as shown in the figure. As made clear by the equation, the parameter is used to calculate ft, the first signal.) calculate a first value for the first gating signal based on the first equation using the first array of values as the first parameter; (Figure 9
[image: media_image5.png]
The equation includes an array of values corresponding to a first parameter in the equation, as stated previously. ft corresponds to the first value.) wherein the recurrent neural network …including … a memory cell corresponding to a memory cell signal, (pg 7 “The next step is to decide what new information we’re going to store in the cell state. This has two parts. First, a sigmoid layer called the “input gate layer” decides which values we’ll update. Next, a tanh layer creates a vector of new candidate values, that could be added to the state” 
[image: media_image16.png]
part of the LSTM cell includes a component that produces a memory cell signal Ct.) a second array of values corresponding to a second parameter in a second equation that is used to calculate values of the memory-cell signal … the first equation includes not more than two parameters corresponding to arrays of values, and the second equation includes not more than one parameter corresponding to a multidimensional array of values (pg 7
[image: media_image16.png]
part of the LSTM cell calculates the memory signals, Ct, based on multidimensional array of values, or second array of values, the Wc matrix, this corresponds to the Second equation. Further, the first equation depicted in Figure 10 includes two parameters Wi and bi.) calculate a first value for the memory-cell signal based on the second equation using the second array of values as the second parameter; generate a first output based on the first data, the first value for the first gating signal, and the first value for the memory-cell signal ( 
[image: media_image17.png]
the output ht corresponding to the first output is calculated based on the first data, xt, the first value for the memory cell, Ct, and the first value for the first gating signal ot.)  
	Olah does not explicitly teach, …form at least a portion of a sequence of data and the second data comes after the first data in the sequence; provide the second data as input to the recurrent neural network;… was calculated based on training data provided to the recurrent neural network and provide a third output identifying one or more characteristics of the input data based on the first output and the second output; the system comprising: at least one processor that is programmed 
	However, Chen, when addressing recurrent neural network cells with trained gating signals, teaches … form at least a portion of a sequence of data and the second data comes after the first data in the sequence; provide the second data as input to the recurrent neural network; (pg 5 “The core of LSTM is a memory unit (or cell) ct in Fig. 2, which encodes the information of the inputs that have been observed up to that step….given a sequence data {x1, ..., xT }” The input that is encoded by the LSTM module is first and second data from the ordered sequence X.) was calculated based on training data provided to the recurrent neural network (pg 9 “If the groundtruth at time t is yt , we can consider minimizing least square ½ (yt−zt)^2 or cross entropy to estimate model parameters….Thus, for the top layer classification with weight Whz, we can take derivative w.r.t. zt and Whz respectively” pg 8 Section 3.3 “we can backpropagate the error from T to 1 via Eqs. 25-34. After we get gradients using backpropagation, the model θ can be learnt with gradient based methods, such as stochastic gradient descent and L-BFGS)” The model parameters are estimated based on the data fed to the network. The data used to calculate the derivative of the weight parameters with respect to the weight matrix is the training data.) and provide a third output identifying one or more characteristics of the input data based on the first output and the second output. (pg 6 “Same as RNNs to make prediction, we can add a linear model over the hidden state ht , and output the likelihood with softmax function” The hidden state output ht, corresponding to the second output, which is based on the first output, is used to generate an output zt based on the second output and linear model parameters. zt is a prediction, which corresponds to identifying a characteristic of the input data.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a method that integrates recurrent network cells to make an output prediction based on calculated training parameters, as taught by Chen, into the disclosed invention of Olah.
One of ordinary skill in the art would have been motivated to make this modification in order to apply LSTMs to the problem of sequence labeling (“The goal of this work is for sequence labeling, i.e. classify all items in a sequence” Chen pg 1)
Olah/Chen does not explicitly teach, the system comprising: at least one processor that is programmed
However Hannun when addressing using recurrent neural networks for sequence classification teaches, the system comprising: at least one processor that is programmed (Abstract “Key to our approach is a well-optimized RNN training system that uses multiple GPUs, as well as a set of novel data synthesis techniques that allow us to efficiently obtain a large amount of varied data for training.” The RNN system is implemented with multiple GPUs or at least one processor.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use a recurrent neural network to map sequential audio data to a classification, as taught by Hannun, with the disclosed invention of Olah/Chen.
One of ordinary skill in the art would have been motivated to make this modification in order to implement the LSTM RNN disclosed by Olah and Chen to solve the sequence prediction problem presented by Hannun.
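For illustration only, the prediction step cited from Chen in the rejection of claim 19 (a linear model over the hidden state ht followed by a softmax) can be sketched as follows. The function names, the bias term b_z, and all numeric values are hypothetical assumptions and are not taken from Chen.

```python
import math


def softmax(scores):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W_hz, b_z, h_t):
    """Compute z_t = softmax(W_hz . h_t + b_z), a likelihood per class."""
    scores = [
        sum(w * h for w, h in zip(row, h_t)) + b
        for row, b in zip(W_hz, b_z)
    ]
    return softmax(scores)


# Hypothetical values: hidden size 2, three output classes.
W_hz = [[1.0, 0.0],
        [0.0, 1.0],
        [0.0, 0.0]]
b_z = [0.0, 0.0, 0.0]
h_t = [2.0, 0.5]  # hidden state (the "second output")

z_t = predict(W_hz, b_z, h_t)
label = max(range(len(z_t)), key=z_t.__getitem__)  # index of most likely class
print(z_t, label)
```

The index with the highest likelihood plays the role of the "third output" in the mapping above, i.e., a characteristic identified from the hidden-state outputs.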

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Olah/Chen, further in view of Tu et al. “Context Gates for Neural Machine Translation” hereinafter Tu.

Regarding claim 13
	Further Olah/Chen teaches the method of claim 1
Olah/Chen does not explicitly teach, wherein the input data is a first ordered set of words in a first language, and the third output is a second ordered set of words in a second language representing a translation from the first language to the second language
However Tu when addressing using recurrent neural networks for machine translation teaches wherein the input data is a first ordered set of words in a first language, and the third output is a second ordered set of words in a second language representing a translation from the first language to the second language. (Section 2 ¶01 “Suppose that x=x1, . . . xj , . . . xJ represents a source sentence and y=y1, . . . yi, . . . yI a target sentence. NMT directly models the probability of translation from the source sentence to the target sentence word by word:”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a recurrent neural network to map text data from one language to another language, as taught by Tu, into the disclosed invention of Olah/Chen.
One of ordinary skill in the art would have been motivated to make this modification in order to implement a recurrent neural network in which “Experimental results show that NMT with context gates achieves consistent and significant improvements in translation quality over different NMT models.” (Conclusion Tu)

Claims 20 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Olah/Chen/Hannun, further in view of Subakan et al. “Diagonal RNNS in symbolic music modeling” hereinafter Subakan.

Regarding claim 20
	Further Olah/Chen/Hannun teaches the system of claim 19
Olah/Chen/Hannun does not explicitly teach, 
[image: media_image18.png]

However, Subakan, when addressing LSTM networks that are modified for faster training while using fewer parameters, teaches 
[image: media_image18.png]
 (Section 2.2 ¶01 “We define the Diagonal RNN as an RNN with diagonal recurrent matrices…Note that element wise multiplying the previous state ht−1 with the W vector is equivalent to having a matrix-vector multiplication Wdiaght−1 where Wdiag is a diagonal matrix, with diagonal entries set to the W vector, and hence the name for Diagonal RNNs…” see equation 6 
[image: media_image19.png]
, examiner notes that W corresponds to the weighting vector, Uc, as claimed. W in the art is a diagonal matrix, which is simply a matrix representation of a W vector.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate an LSTM with reduced parameters that still maintains comparable or better performance relative to standard LSTMs, as taught by Subakan, into the disclosed invention of Olah/Chen/Hannun.
One of ordinary skill in the art would have been motivated to make this modification in order to implement “the diagonal models achieve comparable (if not better) performance by using fewer parameters.” (Conclusion Subakan)
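For illustration only, the equivalence relied upon from Subakan (element-wise multiplication of ht-1 by a weight vector equals multiplication by a diagonal matrix whose diagonal entries are that vector) can be sketched as follows. The helper names and numeric values are hypothetical and are not taken from Subakan.

```python
def elementwise(w, h):
    """Element-wise product of weight vector w and state vector h."""
    return [a * b for a, b in zip(w, h)]


def diag(w):
    """Build a diagonal matrix whose diagonal entries are the vector w."""
    size = len(w)
    return [[w[i] if i == j else 0.0 for j in range(size)] for i in range(size)]


def matvec(M, v):
    """Standard matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


# Hypothetical weight vector and previous hidden state h_{t-1}.
w = [0.5, -1.0, 2.0]
h_prev = [4.0, 3.0, 0.25]

via_vector = elementwise(w, h_prev)
via_matrix = matvec(diag(w), h_prev)
print(via_vector, via_matrix)  # [2.0, -3.0, 0.5] [2.0, -3.0, 0.5]
```

Both routes give the same result, but the vector form stores n values where a full recurrent matrix would store n x n, which is the parameter reduction discussed above.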

Regarding claim 21
	Further Olah/Chen/Hannun teaches the system of claim 19
	Further Chen teaches, 
[image: media_image20.png]
 (pg 5 ¶03 “As to the memory cell itself, it is also controlled with a forget gate, which can reset the memory unit with a sigmoid function… 
[image: media_image21.png]
 The LSTM cell described by Chen includes equation 18, which is used to compute the memory cell signal. Further, the first value of the memory cell signal is given by equation 19. Equation 19 describes a non-linear activation function that operates on second data xt and a weight matrix Wc.)
Olah/Chen/Hannun does not explicitly teach, [the non-linear equation includes 
[image: media_image22.png]
]… uc is a weighting vector, ht-1 is the first output.
However, Subakan, when addressing LSTM networks that are modified for faster training while using fewer parameters, teaches [the non-linear equation includes 
[image: media_image22.png]
]… uc is a weighting vector, ht-1 is the first output. (Section 2.2 ¶01 “We define the Diagonal RNN as an RNN with diagonal recurrent matrices…Note that element wise multiplying the previous state ht−1 with the W vector is equivalent to having a matrix-vector multiplication Wdiaght−1 where Wdiag is a diagonal matrix, with diagonal entries set to the W vector, and hence the name for Diagonal RNNs…” see equation 6 
[image: media_image19.png]
, examiner notes that W corresponds to the weighting vector, Uc, as claimed. W in the art is a diagonal matrix, which is simply a matrix representation of a W vector.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate an LSTM with reduced parameters that still maintains comparable or better performance relative to standard LSTMs, as taught by Subakan, into the disclosed invention of Olah/Chen/Hannun.
One of ordinary skill in the art would have been motivated to make this modification in order to implement “the diagonal models achieve comparable (if not better) performance by using fewer parameters.” (Conclusion Subakan)


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached on Monday-Friday 7:30 am – 4:00 pm (EST).
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at telephone number 571-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://portal.uspto.gov/external/portal. Should you have questions about access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
	
/J.R.G./Examiner, Art Unit 2122
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122