DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
This is a final office action in response to the amendment filed 21 June 2022.  Claims 1 through 3 have been amended.  Claims 36 through 49 are newly added.  Claims 4, 5, and 8 through 34 have been cancelled.  Claims 1 through 3, 6, 7, and 35 through 49 are pending and have been examined. 
Response to Amendment
Applicant’s amendment to claims 1 through 3, cancelation of claims 4, 5, and 8 through 34, and addition of claims 35 through 49 have been entered. 
Applicant’s amendment to Figure 4 of the drawings to correct a typographical error has been accepted and entered. 
Claims 28 through 33 have been canceled; therefore, the 35 U.S.C. 112(f) claim interpretation of those claims is moot. 
Applicant’s amendment to claims 1 through 3, 6, and 7 is sufficient to overcome the 35 U.S.C. 112(a) rejection.  The 35 U.S.C. 112(a) rejection is respectfully withdrawn. 
Claims 1 through 3, 6, 7, and 35 through 49 recite patent eligible subject matter under 35 U.S.C. 101.  The 35 U.S.C. 101 rejection is respectfully withdrawn. 
Claims 1 through 3, 6, 7, and 35 through 38 overcome the 35 U.S.C. 103 rejection.  The 35 U.S.C. 103 rejection is respectfully withdrawn. 
Claims 39 through 49 are rejected under 35 U.S.C. 103, as detailed below. 
Response to Arguments
Applicant’s arguments that the claims as presented herein recite patent eligible subject matter are persuasive.  Particularly persuasive is Applicant’s argument that the recited features do not fall within an enumerated group of abstract ideas, and further that the claims recite improvements to the accuracy of the machine learning model using an embedding matrix and a first and second layer of the recurrent neural network to generate a forecast value predicting future resource consumption.  As a result, the claims provide an inventive concept that overcomes technical challenges with a technical solution.  Therefore, the 35 U.S.C. 101 rejection is respectfully withdrawn for claims 1 through 3, 6, 7, and 35 through 49.
Regarding the pending 35 U.S.C. 103 prior art rejection, the specific ordered combined sequence of claim elements recited in claims 1 through 3, 6, 7, and 35 through 38 cannot be found in the cited prior art and can only be found as recited in Applicant’s Specification.  Any combination of the cited references and/or additional reference(s) to teach all the claim elements, including the features discussed above, would be the result of impermissible hindsight reconstruction.  Therefore, the 35 U.S.C. 103 prior art rejection is respectfully withdrawn for claims 1 through 3, 6, 7, and 35 through 38. 
Applicant’s arguments regarding the application of a 35 U.S.C. 103 rejection to newly added claims 39 through 49 have been fully considered, but are moot because the arguments do not apply to the prior art cited in the 35 U.S.C. 103 rejection of claims 39 through 49 detailed below.  Further, claims 39 through 49 are recited more broadly than claims 1 through 3, 6, 7, and 35 through 38.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f):
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action.  Such claim limitations in claims 45 through 49 are: “means for receiving training data,” “means for training a recurrent neural network,” “means for generating dimensional-transformation data,” “means for learning the model parameters of the recurrent neural network,” “means for collecting usage data resulting from user interaction,” “means for generating the forecast value of the metric for a particular element,” and “means for displaying the forecast value of the metric for the particular element.” A review of the specification, specifically paragraphs [0038-0039], identifies a computing device as the corresponding structure. 
Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action. Such claim limitations in claims 39 through 44 are: “receiving training data,” “training a recurrent neural network,” “generating dimensional-transformation data,” and “learning the model parameters of the recurrent neural network.” A review of the specification, specifically paragraphs [0038-0039], identifies a computing device as the corresponding structure.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 39 through 49 are rejected under 35 U.S.C. 103 as being unpatentable over Raviv et al. (US 2018/0218256) in view of Portegys et al. (US 2015/0026108).
Regarding New Claim 39, Raviv et al. discloses a method comprising: receiving training data describing a time series of values of a metric for a plurality of elements, (… a method of training an artificial neural network to generate synthetic behavior samples is disclosed. The method includes training, a convolutional auto encoder of the artificial neural network, to generate a representation of an original behavior sample received from behavior data of a plurality of users. Raviv et al. [para. 0013-0016]. … The behavior data 402 may comprise a sequence of different samples from multiple sensors of different users. Each sensor may be sampled independently in multiple different time intervals, resulting in a multichannel time series. The sequence of samples may be referred to as a sequence of multi-dimensional time based samples (one dimension for each sensor). Time based samples refer to samples received from the sensor at a given time. Raviv et al. [para. 0061; Fig. 4]);
While Raviv et al. discloses user interaction data as a type of collected behavior data (For example, the behavior data collection may include the force of a touch on a touch screen, the length of the touch, the orientation of the phone, the time of use, and/or currently running applications. Raviv et al. [para. 0032]), Raviv et al. fails to explicitly disclose the metric describing computational resource consumption. Portegys et al. discloses this limitation. (… methods and systems for managing the capacity of one or more servers under a varying load. … methods disclosed here use observations of statistical fluctuations in actual usage as a means of prediction. … A neural network may be used to accept corresponding load and health measurements as inputs, and calculate a relationship (e.g., correlation, cause-effect, etc.) between those inputs using a learn module. Moreover, a predictor module may generate a predicted health of the plurality of server machines using the neural network. … a neural network comprises: (1) units (e.g., also sometimes referred to as “cells”), and (2) directed weighted links (connections) between them. Portegys et al. [para. 0008, 0020-0023, 0054-0058; Fig. 1, 4-5].  … corresponding to FIGS. 7C, 7D, and 7E, the server machine 704 may setup a hypothetical load 720 of seventy users with varying processor, memory, I/O usage to predict 722 the health of the plurality of servers executing the users' applications. Portegys et al. [para. 0071]). It would have been obvious to one of ordinary skill in the art of predictive data analytics before the effective filing date of the claimed invention to modify the data analysis features of Raviv et al. to include the metric describing computational resource consumption as disclosed by Portegys et al. to provide dynamic provisioning recommendations with corresponding confidence scores.  Portegys et al. [para. 0020].
and training a recurrent neural network configured with model parameters to generate a forecast value of the metric based on the training data and including: (Neural networks may also have recurrent or feedback (also called top-down) connections. … A recurrent architecture may be helpful in recognizing patterns that span more than one of the input data chunks that are delivered to the neural network in a sequence.  Raviv et al. [para. 0043, 0051-0052]. … After learning, the DCN may be presented with new images 326 and a forward pass through the network may yield an output 322 that may be considered an inference or a prediction of the DCN. Raviv et al. [para. 0049]);
generating, by an embedding layer of the recurrent neural network, dimensional-transformation data including an embedding matrix configured to transform the training data into a simplified representation to determine similarity of the plurality of elements, one to another, with respect to the metric over the time series, (The values in the feature map may be further processed with a non-linearity, such as a rectification, max(0,x). Values from adjacent neurons may be further pooled, which corresponds to down sampling, and may provide additional local invariance and dimensionality reduction. Normalization, which corresponds to whitening, may also be applied through lateral inhibition between neurons in the feature map. … The deep convolutional network 350 may include multiple different types of layers based on connectivity and weight sharing. As shown in FIG. 3B, the exemplary deep convolutional network 350 includes multiple convolution blocks (e.g., C1 and C2). Each of the convolution blocks may be configured with a convolution layer, a normalization layer (LNorm), and a pooling layer. Raviv et al. [para. 0052-0056; Fig. 3A]. … an embedding of original behavior sample vectors (e.g., user behavior samples) is learned by the neural network. The learned embedding may capture the vector distribution and the encoded distribution. The probability distribution is learned (e.g., estimated) from the embedding space. … A combined per-user distribution and a combined distribution across all users are determined after all of the samples have been processed. In one configuration, the per-user distribution and the distribution across all users are estimated with an increased weight on sample length, sequence start, and sequence endings.  Raviv et al. [para. 0064-0072]. … an artificial neural network of the behavior generator generates a synthetic behavior sample based on the vector. 
The artificial neural network may comprise at least a bottleneck layer, a trained decoder layer, and a trained de-convolutional layer. The vector may be input to the bottleneck layer and further processed by the trained decoder layer and also a trained de-convolutional layer to generate a synthetic behavior sample. Raviv et al. [para. 0073-0077; Fig. 6]);
the generating including learning one or more encoding weights of the embedding matrix as part of the training; (To adjust the weights, a learning algorithm may compute a gradient vector for the weights. The gradient may indicate an amount that an error would increase or decrease if the weight were adjusted slightly. At the top layer, the gradient may correspond directly to the value of a weight connecting an activated neuron in the penultimate layer and a neuron in the output layer. In lower layers, the gradient may depend on the value of the weights and on the computed error gradients of the higher layers. The weights may then be adjusted so as to reduce the error. Raviv et al. [para. 0047-0048, 0062-0065]);
and learning, by a … generation layer of the recurrent neural network, the model parameters of the recurrent neural network based on the simplified representation to generate the… metric and based in part on hidden states that are not directly observed from the training data. (For varying length samples, the length distribution may be estimated. Alternatively, a random hidden markov process may be used for varying length samples. …A sample that is missing data points may be compensated by interpolation and extrapolation at an interpolation layer 404. The interpolation and extrapolation may create missing data points in a sample such that the all samples have the same size (e.g., same number of data points).  Raviv et al. [para. 0060-0063]).
While Raviv et al. discloses yielding an output that may be a prediction (Raviv et al. [para. 0049]), Raviv et al. fails to explicitly disclose learning, by a forecast value generation layer of the recurrent neural network, the model parameters of the recurrent neural network based on the simplified representation to generate the forecast value of the metric and based in part on hidden states that are not directly observed from the training data. Portegys et al. discloses this limitation.  (… a neural network comprises: (1) units (e.g., also sometimes referred to as “cells”), and (2) directed weighted links (connections) between them. FIG. 5 shows a diagram of an illustrative neural network. …  a “hidden” unit, as illustrated in FIG. 5. Portegys et al. [para. 0008, 0020-0023, 0054-0058; Fig. 1, 4-5]. … The capacity prediction and learning server 106A may monitor, in step 404, the health of and load on the system as a result of the simulated load, and as before will record (in step 406) the measurements in the data store 128. As a result of the additional training in step 408, the learn module may adjust its neural network to provide a more confident (i.e., higher confidence score) prediction in the future.  Portegys et al. [para. 0064-0067, 0070-0072; Fig. 7A-7G]). It would have been obvious to one of ordinary skill in the art of predictive data analytics before the effective filing date of the claimed invention to modify the data analysis features of Raviv et al. to include learning, by a forecast value generation layer of the recurrent neural network, the model parameters of the recurrent neural network based on the simplified representation to generate the forecast value of the metric and based in part on hidden states that are not directly observed from the training data as disclosed by Portegys et al. to provide dynamic provisioning recommendations with corresponding confidence scores.  Portegys et al. [para. 0020]. 
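For illustration of the structure recited in claim 39 only, and not as a characterization of Applicant's implementation or of either cited reference, the following minimal Python sketch shows a forward pass in which an embedding matrix first reduces each time step to a simplified representation and a recurrent layer then carries a hidden state, which is never part of the input data, forward to produce a forecast value. The function name rnn_forecast, the parameter names embed, w_in, w_rec, and w_out, and all numeric values are hypothetical, and the training that would learn the weights is omitted.

```python
import math


def rnn_forecast(series, embed, w_in, w_rec, w_out):
    """Embed each time step, update the hidden state, and emit one forecast.

    series: list of input vectors, one per time step
    embed:  k x d embedding matrix (hypothetical weights)
    w_in:   h x k input-to-hidden weights
    w_rec:  h x h hidden-to-hidden (recurrent) weights
    w_out:  length-h hidden-to-output weights
    """
    hidden = [0.0] * len(w_rec)  # hidden state, not observed in the data
    for x in series:
        # Embedding step: project the d-dimensional input down to k dimensions.
        z = [sum(e * v for e, v in zip(row, x)) for row in embed]
        # Recurrent step: new hidden state depends on z and the prior state.
        hidden = [
            math.tanh(
                sum(a * b for a, b in zip(w_in[i], z))
                + sum(a * b for a, b in zip(w_rec[i], hidden))
            )
            for i in range(len(w_rec))
        ]
    forecast = sum(a * b for a, b in zip(w_out, hidden))
    return forecast, hidden


# Illustrative call: two time steps, 2 input dimensions, 1 embedded dimension.
forecast, hidden = rnn_forecast(
    series=[[1.0, 0.0], [0.5, 0.5]],
    embed=[[1.0, 0.0]],
    w_in=[[1.0]],
    w_rec=[[0.5]],
    w_out=[2.0],
)
```

In this sketch the hidden state is bounded by the tanh nonlinearity and is updated once per time step, so the recurrent weights determine how the hidden state evolves over the series.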
Regarding New Claim 40, Raviv et al. and Portegys et al. combined disclose a method, wherein the model parameters define how the hidden states evolve over a particular time interval based on the training data. (The behavior data 402 may comprise a sequence of different samples from multiple sensors of different users. Each sensor may be sampled independently in multiple different time intervals, resulting in a multichannel time series. The sequence of samples may be referred to as a sequence of multi-dimensional time based samples (one dimension for each sensor).  … After aligning the samples at the interpolation layer 404, one of the aligned samples (e.g., original behavior sample vectors) is selected and input to a convolutional layer 406. Raviv et al. [para. 0061-0063; Fig. 4]. …  After the training is performed using all of the behavior data 402, a probability distribution of the behavior data 402 is captured. Raviv et al. [para. 0069, 0073-0077; Fig. 6]).
Regarding New Claim 41, Raviv et al. and Portegys et al. combined disclose a method, wherein the recurrent neural network is configured to implement a hidden Markov model to generate the model parameters. (For varying length samples, the length distribution may be estimated. Alternatively, a random hidden markov process may be used for varying length samples. Raviv et al. [para. 0060]).
Regarding New Claim 42, Raviv et al. and Portegys et al. combined disclose a method, wherein the recurrent neural network is trained to learn separate models for each element of the plurality of elements, and wherein the training further includes determining correlations between the separate models. (A convolutional neural network refers to a type of feed-forward artificial neural network. A neural network, such as an artificial neural network, with an interconnected group of artificial neurons (e.g., neuron models) may be a computational device or may be a method to be performed by a computational device. Raviv et al. [para. 0004]. …  a locally connected layer of a network may be configured so that each neuron in a layer will have the same or a similar connectivity pattern, but with connections strengths that may have different values (e.g., 310, 312, 314, and 316). Raviv et al. [para. 0044-0045]).
Regarding New Claim 43, Raviv et al. and Portegys et al. combined disclose a method, wherein the one or more encoding weights and the model parameters are jointly learned as part of the training. (To adjust the weights, a learning algorithm may compute a gradient vector for the weights. The gradient may indicate an amount that an error would increase or decrease if the weight were adjusted slightly. … This manner of adjusting the weights may be referred to as “back propagation” as it involves a “backward pass” through the neural network. Raviv et al. [para. 0047, 0050]).
Regarding New Claim 44, Raviv et al. and Portegys et al. combined disclose a method, wherein the embedding matrix includes a plurality of columns, the plurality of columns of the embedding matrix describing a vector representation of a corresponding dimension from a data space of the training data, and wherein one or more items included in the vector representation of the embedding matrix indicate a contribution of a respective dimension of the data space of the training data to a resulting low dimension of a corresponding simplified representation. (The program code for generating synthetic behavior samples with a behavior generator is executed by a processor and includes program code to draw, at a behavior generator, a vector from a probability distribution obtained from behavior data of a plurality of users.  Raviv et al. [para. 0011-0012, 0040-0041]. … The outputs of the convolutional connections may be considered to form a feature map in the subsequent layer 318 and 320, with each element of the feature map (e.g., 320) receiving input from a range of neurons in the previous layer (e.g., 318) and from each of the multiple channels. The values in the feature map may be further processed with a non-linearity, such as a rectification, max(0,x). Values from adjacent neurons may be further pooled, which corresponds to down sampling, and may provide additional local invariance and dimensionality reduction. …    Aspects of the present disclosure are not limited to the 2D convolutional neural network of FIG. 3A. Raviv et al. [para. 0052-0055; Fig. 3A]).
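For illustration of the embedding matrix structure recited in claim 44 only, the following minimal Python sketch treats each column of a hypothetical embedding matrix as the vector representation of one dimension of the training data space, with each item in a column indicating how much that input dimension contributes to one low dimension of the simplified representation. The names EMBEDDING, column, and embed and all values are assumptions made for this sketch and are not taken from the application or the cited references.

```python
# Hypothetical 2 x 4 embedding matrix: 4 input dimensions -> 2 low dimensions.
EMBEDDING = [
    [0.5, 0.0, 0.25, 0.0],
    [0.0, 1.0, 0.0, 0.5],
]


def column(matrix, j):
    """Return column j: the vector representation of input dimension j."""
    return [row[j] for row in matrix]


def embed(matrix, x):
    """Project input vector x into the low-dimensional representation."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


x = [2.0, 1.0, 4.0, 2.0]
z = embed(EMBEDDING, x)   # simplified representation of x
c2 = column(EMBEDDING, 2)  # contribution of input dimension 2 to each low dimension
```

Here input dimension 2 contributes with weight 0.25 to the first low dimension and not at all to the second, which is what the items of its column record.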

Regarding New Claim 45, Raviv et al. discloses a system comprising: means for receiving training data describing a time series of values of a metric for a plurality of elements; means for training a recurrent neural network configured with model parameters to generate a forecast value of the metric based on the training data and including: (… an apparatus including means for training, a convolutional auto encoder of the artificial neural network, to generate a representation of an original behavior sample received from behavior data of a plurality of users.  Raviv et al. [para. 0013-0016]. … The behavior data 402 may comprise a sequence of different samples from multiple sensors of different users. Each sensor may be sampled independently in multiple different time intervals, resulting in a multichannel time series. The sequence of samples may be referred to as a sequence of multi-dimensional time based samples (one dimension for each sensor). Time based samples refer to samples received from the sensor at a given time. Raviv et al. [para. 0061; Fig. 4] …After learning, the DCN may be presented with new images 326 and a forward pass through the network may yield an output 322 that may be considered an inference or a prediction of the DCN. Raviv et al. [para. 0049]… Neural networks may also have recurrent or feedback (also called top-down) connections. … A recurrent architecture may be helpful in recognizing patterns that span more than one of the input data chunks that are delivered to the neural network in a sequence.  Raviv et al. [para. 0043, 0051]);
means for generating, by a first layer of the recurrent neural network, dimensional-transformation data including an embedding matrix configured to transform the training data into a simplified representation to determine similarity of the plurality of elements, one to another, with respect to the metric over the time series, the generating including learning one or more encoding weights of the embedding matrix as part of the training; (The values in the feature map may be further processed with a non-linearity, such as a rectification, max(0,x). Values from adjacent neurons may be further pooled, which corresponds to down sampling, and may provide additional local invariance and dimensionality reduction. Normalization, which corresponds to whitening, may also be applied through lateral inhibition between neurons in the feature map. … The deep convolutional network 350 may include multiple different types of layers based on connectivity and weight sharing. As shown in FIG. 3B, the exemplary deep convolutional network 350 includes multiple convolution blocks (e.g., C1 and C2). Each of the convolution blocks may be configured with a convolution layer, a normalization layer (LNorm), and a pooling layer. Raviv et al. [para. 0052-0056; Fig. 3A]. … an embedding of original behavior sample vectors (e.g., user behavior samples) is learned by the neural network. The learned embedding may capture the vector distribution and the encoded distribution. The probability distribution is learned (e.g., estimated) from the embedding space. … A combined per-user distribution and a combined distribution across all users are determined after all of the samples have been processed. In one configuration, the per-user distribution and the distribution across all users are estimated with an increased weight on sample length, sequence start, and sequence endings.  Raviv et al. [para. 0064-0072]. 
… an artificial neural network of the behavior generator generates a synthetic behavior sample based on the vector. The artificial neural network may comprise at least a bottleneck layer, a trained decoder layer, and a trained de-convolutional layer. The vector may be input to the bottleneck layer and further processed by the trained decoder layer and also a trained de-convolutional layer to generate a synthetic behavior sample. Raviv et al. [para. 0073-0077; Fig. 6]).
means for learning, by a second layer of the recurrent neural network, the model parameters of the recurrent neural network based on the simplified representation to generate the forecast value of the metric, the model parameters based in part on hidden states that are not directly observed from the training data; (… After aligning the samples at the interpolation layer 404, one of the aligned samples (e.g., original behavior sample vectors) is selected and input to a convolutional layer 406. Raviv et al. [para. 0060-0063; Fig. 4]. …  After the training is performed using all of the behavior data 402, a probability distribution of the behavior data 402 is captured. Raviv et al. [para. 0069, 0073-0077; Fig. 6]. … The outputs of the convolutional connections may be considered to form a feature map in the subsequent layer 318 and 320, with each element of the feature map (e.g., 320) receiving input from a range of neurons in the previous layer (e.g., 318) and from each of the multiple channels. The values in the feature map may be further processed with a non-linearity, such as a rectification, max(0,x). Values from adjacent neurons may be further pooled, which corresponds to down sampling, and may provide additional local invariance and dimensionality reduction. …  Aspects of the present disclosure are not limited to the 2D convolutional neural network of FIG. 3A. Raviv et al. [para. 0052-0055; Fig. 3A]).
means for collecting usage data resulting from user interaction with a user interface via a network at respective client devices of a plurality of client devices; (To perform an accurate classification (e.g., authentication of a user) the convolutional neural network should be initially trained and/or tuned, after the initial training, with training data. The training data may include positive samples and negative samples. For behavior classification, the positive behavior samples may be obtained from behavior data generated from a device user's behavior (e.g., interaction with the device). Raviv et al. [para. 0005, 0014].  … the behavior data collection may include the force of a touch on a touch screen, the length of the touch, the orientation of the phone, the time of use, and/or currently running applications. Raviv et al. [para. 0032, 0061]);
While Raviv et al. discloses that the output may be a prediction (Raviv et al. [para. 0049]), Raviv et al. fails to explicitly disclose a means for generating the forecast value of the metric for a particular element that is a subject of the user interaction using the trained recurrent neural network.  Portegys et al. discloses this limitation. ( … a neural network comprises: (1) units (e.g., also sometimes referred to as “cells”), and (2) directed weighted links (connections) between them. FIG. 5 shows a diagram of an illustrative neural network. … a “hidden” unit, as illustrated in FIG. 5. Portegys et al. [para. 0008, 0020-0023, 0054-0058; Fig. 1, 4-5]. … The capacity prediction and learning server 106A may monitor, in step 404, the health of and load on the system as a result of the simulated load, and as before will record (in step 406) the measurements in the data store 128. As a result of the additional training in step 408, the learn module may adjust its neural network to provide a more confident (i.e., higher confidence score) prediction in the future.  Portegys et al. [para. 0064-0067, 0070-0072; Fig. 7A-7G]). It would have been obvious to one of ordinary skill in the art of predictive data analytics before the effective filing date of the claimed invention to modify the data analysis features of Raviv et al. to include means for generating the forecast value of the metric for a particular element that is a subject of the user interaction using the trained recurrent neural network as disclosed by Portegys et al. to provide dynamic provisioning recommendations with corresponding confidence scores.  Portegys et al. [para. 0020].
Regarding New Claim 46, Raviv et al. and Portegys et al. combined disclose a system, further comprising means for displaying the forecast value of the metric for the particular element in the user interface in real time responsive to user interaction. Portegys et al. discloses this limitation. (… a client device 102 that displays application output generated by an application remotely executing on a server 106 or other remotely located machine. Portegys et al. [para. 0027]. …  Referring to FIGS. 7A-7G, which include illustrative screenshots of a graphical user interface (GUI) tool for simulating, monitoring, and predicting the load and health of one or more server computers in accordance with various aspects of the disclosure.  Portegys et al. [para. 0070-0072; Fig. 7A-7G]). It would have been obvious to one of ordinary skill in the art of predictive data analytics before the effective filing date of the claimed invention to modify the data analysis features of Raviv et al. to include a means for displaying the forecast value of the metric for the particular element in the user interface in real time responsive to user interaction as disclosed by Portegys et al. to provide dynamic provisioning recommendations with corresponding confidence scores.  Portegys et al. [para. 0020].
Regarding New Claim 47, Raviv et al. and Portegys et al. combined disclose a system, wherein the model parameters define how the hidden states evolve over a particular time interval based on the training data. (The behavior data 402 may comprise a sequence of different samples from multiple sensors of different users. Each sensor may be sampled independently in multiple different time intervals, resulting in a multichannel time series. The sequence of samples may be referred to as a sequence of multi-dimensional time based samples (one dimension for each sensor).  … After aligning the samples at the interpolation layer 404, one of the aligned samples (e.g., original behavior sample vectors) is selected and input to a convolutional layer 406. Raviv et al. [para. 0061-0063; Fig. 4]. …  After the training is performed using all of the behavior data 402, a probability distribution of the behavior data 402 is captured. Raviv et al. [para. 0069, 0073-0077; Fig. 6]).
Regarding New Claim 48, Raviv et al. and Portegys et al. combined disclose a system, wherein the recurrent neural network is configured to implement a hidden Markov model to generate the model parameters. (For varying length samples, the length distribution may be estimated. Alternatively, a random hidden markov process may be used for varying length samples. Raviv et al. [para. 0060]).
Regarding New Claim 49, Raviv et al. and Portegys et al. combined disclose a system, wherein the recurrent neural network is trained to learn separate models for each element of the plurality of elements, and wherein the training further includes determining correlations between the separate models. (A convolutional neural network refers to a type of feed-forward artificial neural network. A neural network, such as an artificial neural network, with an interconnected group of artificial neurons (e.g., neuron models) may be a computational device or may be a method to be performed by a computational device. Raviv et al. [para. 0004]. …  a locally connected layer of a network may be configured so that each neuron in a layer will have the same or a similar connectivity pattern, but with connections strengths that may have different values (e.g., 310, 312, 314, and 316). Raviv et al. [para. 0044-0045]).
Allowable Subject Matter
Claims 1 through 3, 6, 7, and 35 through 38 are allowable.  
Regarding the subject matter eligibility rejection under 35 U.S.C. 101, Examiner submits that the claims are directed to patent-eligible subject matter because the claims include the above-mentioned technical improvements and, when viewed as a whole, amount to significantly more than any recited abstract idea.
Regarding the prior art rejection under 35 U.S.C. 103, the specific ordered combined sequence of claim elements recited in claims 1 through 3, 6, 7, and 35 through 38 cannot be found in the cited prior art and can only be found as recited in Applicant’s Specification. Any combination of the cited references and/or additional reference(s) to teach all the claim elements, including the features discussed above, would be the result of impermissible hindsight reconstruction.
Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant’s disclosure:
Trask et al. (US 2016/0247061) – a computer-implemented neural network includes a plurality of neural nodes, where each of the neural nodes has a plurality of input weights corresponding to a vector of real numbers. The neural network also includes an input neural node corresponding to a linguistic unit selected from an ordered list of a plurality of linguistic units, and an embedding layer with a plurality of embedding node partitions. Each embedding node partition includes one or more neural nodes. Each of the embedding node partitions corresponds to a position in the ordered list relative to a focus term, is configured to receive an input from an input node, and is configured to generate an output. The neural network also includes a classifier layer with a plurality of neural nodes, each configured to receive the embedding outputs from the embedding layer, and configured to generate an output corresponding to a probability that a particular linguistic unit is the focus term.
Principe et al. (US 2016/0242690) - The methods for neural data analysis and decoding for BMIs can include the following: qualitatively capture the intrinsic relationship between the neural modulation and the variable to be decoded, highlight important features or dimensions of the neural response, and improve the performance of subsequent decoding.  A variety of methods consider the formation of features from the dimensions. Starting on the far left of FIG. 4, feature selection is the simplest approach as it comprises an inclusion or exclusion decision for each dimension. Although the relative importance of a feature is assessed, the information is used to select the features and then is lost; whereas feature weighting explicitly optimizes the relative contribution of the individual dimensions. Feature weighting can be used whenever there is an underlying distance metric. When applicable, linear projections can be used to find linear combinations, weighted sums, of the dimensions or, more generally, the correlation between dimensions that serve as better features than a weighted combination of features.
Kumar et al. (US 2016/0335053) - Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the methods includes receiving a plurality of high-dimensional data items; generating a circulant embedding matrix for the high-dimensional data items, wherein the circulant embedding matrix is a matrix that is fully specified by a single vector; for each high-dimensional data item, generating a compact representation of the high-dimensional data item, comprising computing a product of the circulant embedding matrix and the high dimensional data item by performing a circular convolution of the single vector that fully specifies the circulant embedding matrix and the high dimensional data item using a Fast Fourier Transform (FFT); and generating a compact representation of the high dimensional data item by computing a binary map of the computed product.
Applicant’s amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LETORIA G KNIGHT whose telephone number is (571)270-0485. The examiner can normally be reached M-F 9am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Rutao WU can be reached on 571-272-6045. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/L.G.K/ Examiner, Art Unit 3623
/CHARLES GUILIANO/ Primary Examiner, Art Unit 3623