DETAILED ACTION
This action is in response to the claims filed 11/01/2022 for application 16/694,921. Claims 1-15, 17, and 18 have been amended. Thus, claims 1-20 are currently pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claim 18 is objected to because of the following informalities:  "wherein first neural network" should read "wherein the first neural network".  Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitations use a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations are:
a preprocessor configured to generate… in claim 1
a learner configured to generate… in claim 1
a feature learner configured to calculate… in claim 2
a time series learner configured to calculate… in claim 2
a weight value controller configured to adjust… in claim 2
a missing value processor configured to generate… in claim 3
a time processor configured to generate… in claim 3
a feature weight value calculator configured to calculate… in claim 3
a feature weight value applicator configured to apply… in claim 3
a time series weight value calculator configured to calculate… in claim 4
a time series weight value applicator configured to apply… in claim 4
an integrated weight value applicator configured to generate… in claim 8
a predictor configured to generate… in claim 9
a feature predictor configured to generate… in claim 10
a time series predictor configured to generate… in claim 10
a result generator configured to calculate… in claim 10
Because these claim limitations are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitations to avoid them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding claim 1, 
Step 1 Analysis: Claim 1 is directed to a process, which falls within one of the four statutory categories. 
Step 2A Prong 1 Analysis: Claim 1 recites, in part, to generate interval data, add an interpolation value to a missing value…, generate masking data for distinguishing the missing value, generate a weight value group and a time series weight value group, and generating the feature weight value and a second parameter for generating the time series weight value. These limitations, as drafted, are processes that, under the broadest reasonable interpretation, cover performance of the limitations in the mind, which falls within the “Mental Processes” grouping of abstract ideas. Specifically:
to generate interval data can be considered to be an evaluation in the human mind,
add an interpolation value to a missing value… can be considered to be an evaluation in the human mind, 
generate masking data for distinguishing the missing value can be considered to be an evaluation in the human mind,
generate a weight value group and a time series weight value group can be considered to be an evaluation in the human mind,
generating the feature weight value and a second parameter for generating the time series weight value can be considered to be an evaluation in the human mind,
and to generate a prediction result corresponding to a prediction time for a second time series data can be considered to be an evaluation in the human mind.
Accordingly, the claim recites an abstract idea.
Step 2A Prong 2 Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements “prediction model”, “using first neural network comprising a feed-forward neural network”, “using a second neural network comprising a recurrent neural network”, and “a predictor including one or more neural networks”. These recited elements are only generally linked to the judicial exception. Additionally, the claim recites the elements “preprocessor” and “learner”. These elements invoke 35 U.S.C. 112(f) and can be interpreted to be hardware, as disclosed in paragraph [0033] of the specification. Thus, the elements in the claim are recited at a high level of generality (i.e., as a generic processor performing a generic computer function of generating an index) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of utilizing a prediction model, using a first neural network comprising a feed-forward neural network, using a second neural network comprising a recurrent neural network, and a predictor including one or more neural networks to perform the steps of the claimed process amount to no more than generally linking the elements to the judicial exception. Additionally, the preprocessor and learner amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 2, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein the learner includes: a feature learner configured to calculate the feature weight value, based on the masking data, the interval data, the interpolation data, and the first parameter, and generate a first learning result, based on the feature weight value; a time series learner configured to calculate the time series weight value, based on the first learning result and the second parameter, and generate a second learning result, based on the time series weight value; and a weight value controller configured to adjust the first parameter or the second parameter, based on the first learning result or the second learning result. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 1 above.
The claim does recite the additional elements of “feature learner”, “time series learner”, “weight value controller”, “using the first neural network”, and “using the second neural network”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 1 above. The claim is not patent eligible.

Regarding claim 3, the rejection of claim 2 is further incorporated, and further, the claim recites: wherein the feature learner includes: a missing value processor configured to generate first correction data of the interpolation data, based on the masking data; a time processor configured to generate second correction data of the interpolation data, based on the interval data; a feature weight value calculator configured to calculate the feature weight value, based on the first parameter, the first correction data, and the second correction data; and a feature weight value applicator configured to apply the feature weight value to the interpolation data. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 1 above.
The claim does recite the additional elements of “a missing value processor”, “time processor”, “feature weight value calculator”, “feature weight value applicator”, and “using the first neural network”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 1 above. The claim is not patent eligible.
Regarding claim 4, the rejection of claim 2 is further incorporated, and further, the claim recites: wherein the time series learner includes: a time series weight value calculator configured to calculate the time series weight value, based on the first learning result and the second parameter; and a time series weight value applicator configured to apply the time series weight value to the first learning result. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 1 above.
The claim does recite the additional elements of “a time series weight value calculator”, “time series weight value applicator”, and “using the second neural network”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 1 above. The claim is not patent eligible.

Regarding claim 5, the rejection of claim 2 is further incorporated, and further, the claim recites: wherein the learner includes: a feature learner configured to calculate the feature weight value, based on the masking data, the interpolation data, and the first parameter, and generate a first learning result, based on the feature weight value; a time series learner configured to calculate the time series weight value, based on the interval data, the first learning result, and the second parameter, and generate a second learning result, based on the time series weight value; and a weight value controller configured to adjust the first parameter or the second parameter, based on the first learning result or the second learning result. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 1 above.
The claim does recite the additional elements of “feature learner”, “time series learner”, “weight value controller”, “using the first neural network”, and “using the second neural network”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 1 above. The claim is not patent eligible.

Regarding claim 6, the rejection of claim 5 is further incorporated, and further, the claim recites: wherein the feature learner includes: a missing value processor configured to generate correction data of the interpolation data, based on the masking data; a feature weight value calculator configured to calculate the feature weight value, based on the first parameter and the correction data; and a feature weight value applicator configured to apply the feature weight value to the interpolation data. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 1 above.
The claim does recite the additional elements of “missing value processor”, “feature weight value calculator”, “feature weight value applicator”, and “using the first neural network”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 1 above. The claim is not patent eligible.

Regarding claim 7, the rejection of claim 5 is further incorporated, and further, the claim recites: wherein the time series learner includes: a time processor configured to generate correction data of the first learning result, based on the interval data; a time series weight value calculator configured to calculate the time series weight value, based on the second parameter and the correction data; and a time series weight value applicator configured to apply the time series weight value to the first learning result. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 1 above.
The claim does recite the additional elements of “time processor”, “time series weight value calculator”, “time series weight value applicator”, and “using the second neural network”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 1 above. The claim is not patent eligible.

Regarding claim 8, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein the learner includes: a feature learner configured to calculate the feature weight value, based on the masking data, the interpolation data, and the first parameter; a time series learner configured to calculate the time series weight value, based on the interval data, the interpolation data, and the second parameter; and an integrated weight value applicator configured to generate a learning result, based on the feature weight value and the time series weight value; and a weight value controller configured to adjust the first parameter or the second parameter, based on the learning result. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 1 above.
The claim does recite the additional elements of “feature learner”, “time series learner”, “integrated weight value applicator”, “weight value controller”, “using the first neural network”, and “using the second neural network”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 1 above. The claim is not patent eligible.

Regarding claim 9, 
Step 1 Analysis: Claim 9 is directed to a process, which falls within one of the four statutory categories. 
Step 2A Prong 1 Analysis: Claim 9 recites, in part, to generate interval data, add an interpolation value to a missing value…, generate masking data for distinguishing the missing value, generate a weight value group and a time series weight value group, and generating the feature weight value and a second parameter for generating the time series weight value. These limitations, as drafted, are processes that, under the broadest reasonable interpretation, cover performance of the limitations in the mind, which falls within the “Mental Processes” grouping of abstract ideas. Specifically:
to generate interval data can be considered to be an evaluation in the human mind,
add an interpolation value to a missing value… can be considered to be an evaluation in the human mind, 
generate masking data for distinguishing the missing value can be considered to be an evaluation in the human mind,
generate a weight value group and a time series weight value group can be considered to be an evaluation in the human mind,
and generating the feature weight value and a second parameter for generating the time series weight value can be considered to be an evaluation in the human mind.
Accordingly, the claim recites an abstract idea.
Step 2A Prong 2 Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements “prediction model”, “using first neural network comprising a feed-forward neural network”, “using a second neural network comprising a recurrent neural network”, and “using a third neural network comprising a fully-connected neural network”. These recited elements are only generally linked to the judicial exception. Additionally, the claim recites the elements “preprocessor” and “predictor”. These elements invoke 35 U.S.C. 112(f) and can be interpreted to be hardware, as disclosed in paragraph [0033] of the specification. Thus, the elements in the claim are recited at a high level of generality (i.e., as a generic processor performing a generic computer function of generating an index) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of utilizing a prediction model, using a first neural network comprising a feed-forward neural network, using a second neural network comprising a recurrent neural network, and using a third neural network comprising a fully-connected neural network to perform the steps of the claimed process amount to no more than generally linking the elements to the judicial exception. Additionally, the preprocessor and predictor amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 10, the rejection of claim 9 is further incorporated, and further, the claim recites: wherein the predictor includes: a feature predictor configured to generate a first result, based on the feature weight value; a time series predictor configured to generate a second result, based on the time series weight value; and a result generator configured to calculate the prediction result corresponding to a prediction time, based on the second result. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 9 above.
The claim does recite the additional elements of “feature predictor”, “time series predictor”, “result generator”, “using the first neural network”, “using the second neural network”, and “using the third neural network”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 9 above. The claim is not patent eligible.

Regarding claim 11, the rejection of claim 10 is further incorporated, and further, the claim recites: wherein the feature predictor includes: a missing value processor configured to encode the interpolation data, based on the masking data; a time processor configured to model the interval data; a feature weight value calculator configured to generate feature analysis data, based on the encoded interpolation data and to generate the feature weight value, based on the feature analysis data and the modeled interval data; and a feature weight value applicator configured to apply the feature weight value to the feature analysis data to generate the first result. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 9 above.
The claim does recite the additional elements of “a missing value processor”, “time processor”, “feature weight value calculator”, “feature weight value applicator”, and “using the first neural network”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 9 above. The claim is not patent eligible.
Regarding claim 12, the rejection of claim 10 is further incorporated, and further, the claim recites: wherein the feature predictor includes: a missing value processor configured to merge the masking data and the interpolation data; a time processor configured to model the interval data; a feature weight value calculator configured to generate feature analysis data, based on the merged data, and generate the feature weight value, based on the feature analysis data and the modeled interval data; and a feature weight value applicator configured to apply the feature weight value to the feature analysis data to generate the first result. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 9 above.
The claim does recite the additional elements of “a missing value processor”, “time processor”, “feature weight value calculator”, “feature weight value applicator”, and “using the first neural network”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 9 above. The claim is not patent eligible.

Regarding claim 13, the rejection of claim 10 is further incorporated, and further, the claim recites: wherein the feature predictor includes: a missing value processor configured to model the masking data; a time processor configured to model the interval data; a feature weight value calculator configured to generate feature analysis data, based on the interpolation data, and generate the feature weight value, based on the modeled masking data, the modeled interval data, and the feature analysis data; and a feature weight value applicator configured to apply the feature weight value to the feature analysis data to generate the first result. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 9 above.
The claim does recite the additional elements of “a missing value processor”, “time processor”, “feature weight value calculator”, “feature weight value applicator”, and “using the first neural network”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 9 above. The claim is not patent eligible.

Regarding claim 14, the rejection of claim 10 is further incorporated, and further, the claim recites: wherein the feature predictor includes: a missing value processor configured to model the masking data; a time processor configured to merge the interval data and the interpolation data; a feature weight value calculator configured to generate feature analysis data, based on the merged data, and generate the feature weight value, based on the feature analysis data and the modeled masking data; and a feature weight value applicator configured to apply the feature weight value to the feature analysis data to generate the first result. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 9 above.
The claim does recite the additional elements of “a missing value processor”, “time processor”, “feature weight value calculator”, “feature weight value applicator”, and “using the first neural network”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 9 above. The claim is not patent eligible.

Regarding claim 15, the rejection of claim 10 is further incorporated, and further, the claim recites: wherein the time series predictor includes: a time series weight value calculator configured to generate time series analysis data, based on the first result, and generate the time series weight value, based on the time series analysis data; and a time series weight value applicator configured to apply the time series weight value to the first result or the time series analysis data. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 9 above.
The claim does recite the additional elements of “a time series weight value calculator”, “time series weight value applicator”, and “using the second neural network”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 9 above. The claim is not patent eligible.

Regarding claim 16, the rejection of claim 10 is further incorporated, and further, the claim recites: wherein the feature predictor calculates the feature weight value, based on the masking data and the interpolation data, and wherein the time series predictor calculates the time series weight value, based on the first result and the interval data. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 9 above.
The claim does recite the additional elements of “a feature predictor” and “time series predictor”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 9 above. The claim is not patent eligible.

Regarding claim 17, the rejection of claim 9 is further incorporated, and further, the claim recites: wherein the predictor includes: a feature predictor configured to calculate the feature weight value, based on the masking data and the interpolation data; a time series predictor configured to calculate the time series weight value, based on the interval data and the interpolation data; an integrated weight value applicator configured to generate an integrated result corresponding to the interpolation data, based on the feature weight value and the time series weight value; and a result generator configured to calculate the prediction result corresponding to a prediction time, based on the integrated result. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 9 above.
The claim does recite the additional elements of “a feature predictor”, “time series predictor”, “an integrated weight value applicator”, “a result generator”, “using the first neural network”, “using the second neural network”, and “using the third neural network”; however, they do not amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception, for the reasons set forth in connection with the rejection of claim 9 above. The claim is not patent eligible.

Regarding claim 18, 
Step 1 Analysis: Claim 18 is directed to a process, which falls within one of the four statutory categories. 
Step 2A Prong 1 Analysis: Claim 18 recites, in part, generating interpolation data by adding an interpolation value to a missing value of time series data; generating interval data, based on a time interval of the time series data; generating masking data, based on the missing value; generating a feature weight value depending on a time and a feature of the time series data, based on the interpolation data, the interval data, and the masking data; generating a first result, based on the feature weight value; generating a time series weight value depending on a time flow of the time series data, based on the first result; and generating a second result, based on the time series weight value. These limitations, as drafted, are processes that, under their broadest reasonable interpretation, cover performance of the limitations in the mind, which falls within the “Mental Processes” grouping of abstract ideas. In particular:
generating interpolation data by adding an interpolation value to a missing value of time series data can be considered to be an evaluation in the human mind,
generating interval data, based on a time interval of the time series data can be considered to be an evaluation in the human mind, 
generating masking data, based on the missing value can be considered to be an evaluation in the human mind,
generating a feature weight value depending on a time and a feature of the time series data, based on the interpolation data, the interval data, and the masking data can be considered to be an evaluation in the human mind,
generating a first result, based on the feature weight value can be considered to be an evaluation in the human mind,
generating a time series weight value depending on a time flow of the time series data, based on the first result can be considered to be an evaluation in the human mind, and
generating a second result, based on the time series weight value can be considered to be an evaluation in the human mind.
Accordingly, the claim recites an abstract idea.
Step 2A Prong 2 Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements of “time series data processing device”, “using a first neural network”, and “using a second neural network”. These elements are recited at a high level of generality (i.e. as a generic processor performing a generic computer function of generating an index) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of utilizing a time series data processing device, using a first neural network, and using a second neural network amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.
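For illustration only, the first three recited steps of claim 18 can be sketched as follows. The claim does not specify how the interpolation value, the interval data, or the masking data are computed. This sketch assumes last-observation-carried-forward as the interpolation value, the elapsed time since the previous record as the interval data, and a 1/0 flag for observed/missing values as the masking data. The function name `preprocess` and its arguments are hypothetical and are taken from neither the claims nor the specification.

```python
from typing import List, Optional, Tuple


def preprocess(
    times: List[float], values: List[Optional[float]]
) -> Tuple[List[float], List[float], List[int]]:
    """Return (interpolation_data, interval_data, masking_data).

    Hypothetical helper: a missing value is represented by None.
    """
    interpolation_data: List[float] = []
    interval_data: List[float] = []
    masking_data: List[int] = []

    last_seen = 0.0  # assumed fallback when no earlier observation exists
    prev_time = times[0] if times else 0.0

    for t, v in zip(times, values):
        # Masking data: 1 marks an observed value, 0 marks a missing value.
        masking_data.append(0 if v is None else 1)
        if v is not None:
            last_seen = v
        # Interpolation data: missing values take the last observed value
        # (an assumed choice of interpolation value).
        interpolation_data.append(last_seen)
        # Interval data: elapsed time since the previous record.
        interval_data.append(t - prev_time)
        prev_time = t

    return interpolation_data, interval_data, masking_data
```

Any other interpolation rule (for example a mean or a linear fit) could be substituted without changing the shape of the three outputs.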

Regarding claim 19, the rejection of claim 18 is further incorporated, and further, the claim recites: adjusting a parameter for generating the feature weight value or the time series weight value, based on the second result. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 18 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 20, the rejection of claim 18 is further incorporated, and further, the claim recites: calculating a prediction result corresponding to a prediction time, based on the second result. This limitation amounts to additional mental steps in addition to the judicial exception identified in the rejection of claim 18 above.
The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-2, 4-5, and 8-20 are rejected under 35 U.S.C. 103 as being unpatentable over Kurasawa et al. ("US 20190228291 A1", hereinafter "Kurasawa") in view of Brezak et al. ("A Comparison of Feed-forward and Recurrent Neural Networks in Time Series Forecasting", hereinafter "Brezak") and further in view of Yao et al. ("Deep Multi-View Spatial-Temporal Network for Taxi Demand Prediction", hereinafter "Yao").

Regarding claim 1, Kurasawa teaches A time series data processing device comprising: 
a preprocessor (See [¶0058-60], note: This claim invokes 112f, therefore as recited in para [¶0038] of the specification, the elements can be interpreted as hardware or software loaded into memory.) configured to generate interval data, based on a time interval of first time series data (“a data processing unit that processes the received unevenly spaced time-series-data group into an evenly spaced time-series-data group and an omission information group based on the received input time-series data length and the received minimum observation interval” [¶0014]), add an interpolation value to a missing value of the first time series data to generate interpolation data (“In a second method, an omission estimation processing (interpolation or extrapolation) is performed so that the time interval is constant, and then features expressing changes over time are extracted.” [¶0006; See also: “In the above example of unevenly spaced time-series data, when discrete Fourier transform is performed after the interpolation processing by a linear function, three records, namely (13 o'clock, 26 degrees), (15 o'clock, 25 degrees), and (16 o'clock, 22 degrees) are added by means of interpolation of the omission estimation processing” [¶0008]]), and 
generate masking data for distinguishing the missing value (“Next, the model learning unit 14 initializes the model parameters (Step S104). A random value is assigned to the weight parameter A.sub.i of the model parameter and the bias parameter B.sub.i (i=1, 2, 3, 4, 5). Furthermore, “0” is assigned to the omission value of the evenly spaced time-series-data. In the present embodiment, “0” is assigned to the omission value, but it is not limited to this example. An average value, a median value, or an omission processing result may be assigned to the omission value.” [¶0044; assigning “0” to omission value corresponds to generating “masking data”]); and 
a learner configured to generate a weight value group of a prediction model that generates a feature weight value depending on a time and a feature of the first time series data (“For the model of a neural network in which the feature extraction size received by the model learning unit 12 is an intermediate layer, the model learning unit 14 learns a weight vector of each layer where a difference between an element not missing in the matrix of the evenly spaced time-series-data group including omissions, and an element of the output result of the output layer is taken as an error, and generates model parameters.” [¶0031; See [¶0030] “The model design receiving unit 12 receives (receives an input of) a time-series data length, an observation minimum interval, and a feature extraction size.” corresponds to a time and feature of a time series.]) and a time series weight value depending on a time flow of the first time series data (“The feature extraction unit 16 receives the time-series data of the feature extraction target and takes the received time-series data as an input of the model. The feature extraction unit 16 calculates the value of the intermediate layer of the model using the stored model parameters, and outputs the feature representing the temporal changes of the data.” [¶0032; temporal changes correspond to a dependence on a time flow.]), based on the interval data (“a model design receiving unit that receives an input time-series data length, an observation minimum interval” [¶0014]), the interpolation data (¶0006, method is based on interpolation), and the masking data (¶0044, discloses masking data), 
wherein the weight value group includes a first parameter for generating the feature weight value (“the model learning unit 14 learns a weight vector of each layer where a difference between an element not missing in the matrix of the evenly spaced time-series-data group including omissions, and an element of the output result of the output layer is taken as an error, and generates model parameters.” [¶0031; corresponds to a first parameter]) and a second parameter for generating the time series weight value (“The feature extraction unit 16 calculates the value of the intermediate layer of the model using the stored model parameters” [¶0032; model parameters implies a second parameter to generate a time series weight value]).
	Although Kurasawa teaches using a neural network, the reference fails to explicitly teach wherein the weight value group is generated using a first neural network comprising a feed-forward neural network, wherein the time series weight value is generated using a second neural network comprising a recurrent neural network.
	Brezak teaches wherein the weight value group is generated using a first neural network comprising a feed-forward neural network (“Feed–forward neural network analyzed in this paper is the most commonly used MLP NN with three layers… By summing the outputs of all hidden layer neurons (including Bias) and belonging weight factors w” [pg. 2, A. Feed-forward Neural Network, ¶1-3; See further: “Forecasting performances of feed-forward and recurrent neural networks (NN) trained with different learning algorithms are analyzed and compared using the Mackey–Glass nonlinear chaotic time series. This system is a known benchmark test whose elements are hard to predict. Multi–layer Perceptron NN was chosen as a feed-forward neural network because it is still the most commonly used network in financial forecasting models.” [Abstract]]), 
wherein the time series weight value is generated using a second neural network comprising a recurrent neural network (“The Dynamic Multi–layer Perceptron Network (DMLP), proposed in [9], was modified and used as the second, recurrent type of network in this study. Unlike the feedforward MLP NN, this type of network is characterized by a dynamic neuron model, the so-called Dynamic Elementary Processor (DEP) which is structured as an Auto Regressive Moving Average (ARMA) filter, and is built into the network hidden layer. Thus, every hidden layer neuron has the ability to process previous values of its own activity together with new input signals.” [pg. 2, B. Recurrent Neural Network, ¶1]).
Kurasawa and Brezak are both in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and recurrent neural network in time-series forecasting. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s teachings to implement using a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning, thus one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]
However, Kurasawa and Brezak fail to explicitly teach and wherein the weight value group and the time series weight value are adapted to generate, when provided to a predictor including one or more neural networks, a prediction result corresponding to a prediction time for second time series data.
Yao teaches and wherein the weight value group and the time series weight value are adapted to generate, when provided to a predictor including one or more neural networks, a prediction result corresponding to a prediction time for second time series data (“We propose to use a graph of regions to capture this latent semantic, where the edge represents similarity of demand patterns for a pair of regions. Later, regions are encoded into vectors via a graph embedding method and such vectors are used as context features in the model. In the end, a fully connected neural network component is used for prediction.” [pg. 2589, left col, ¶2; See further: “We use the average weekly demand time series as the demand patterns. The average is computed on the training data in the experiment. The graph is fully connected because every two regions can be reached.” [pg. 2591, right col, top para; discloses using time-series data. See Figure 1: “The semantic view first constructs a weighted graph of regions (with weights representing functional similarity). Nodes are encoded into vectors. A fully connected layer is used at the end for jointly training. Finally, a fully connected neural network is used for prediction.”]]).
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Brezak’s teachings to implement a fully connected neural network to generate a prediction result as taught by Yao. One would have been motivated to make this modification in order to use spatial-temporal correlations to improve on traditional time series prediction methods. [pg. 2588, § Introduction, ¶3, Yao]
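For illustration only, the Kurasawa passages quoted above (¶0014 and ¶0044) can be read as the following data preparation: unevenly spaced records are placed on a grid whose step is the minimum observation interval, “0” is assigned to each omitted slot, and an omission flag is kept per slot. This is a minimal sketch under stated assumptions, not Kurasawa’s implementation. The function name `to_even_grid`, the flag convention (1 = omitted), the concatenation used to combine the two groups, and the sample records are all hypothetical.

```python
from typing import List, Tuple


def to_even_grid(
    observations: List[Tuple[float, float]],
    start: float,
    min_interval: float,
    length: int,
) -> Tuple[List[float], List[int], List[float]]:
    """Return (evenly_spaced, omission_info, combined).

    observations: (time, value) records at uneven times.
    """
    evenly_spaced = [0.0] * length  # "0" is assigned to omitted slots
    omission_info = [1] * length    # assumed convention: 1 = omitted

    for t, v in observations:
        idx = round((t - start) / min_interval)
        if 0 <= idx < length:
            evenly_spaced[idx] = v
            omission_info[idx] = 0

    # Assumed combination: values followed by omission flags.
    combined = evenly_spaced + [float(q) for q in omission_info]
    return evenly_spaced, omission_info, combined


# Hypothetical records at 10, 12, and 13 o'clock on a 1-hour grid.
p, q, r = to_even_grid([(10, 20.0), (12, 24.0), (13, 23.0)], 10, 1, 4)
```

In this sketch the slot for 11 o'clock is omitted, so it holds 0.0 in the value group and 1 in the omission group.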

Regarding claim 2, Kurasawa/Brezak/Yao teaches The time series data processing device of claim 1, where Kurasawa teaches wherein the learner includes: 
a feature learner configured to calculate the feature weight value, based on the masking data, the interval data, the interpolation data, and the first parameter, and generate a first learning result, based on the feature weight value (“a model learning unit that learns a weight vector of each layer of a model with a difference being taken as an error, and stores the weight vector as a model parameter in a storage unit, the model being a model of a neural network including an input layer, an output layer, and an intermediate layer, a matrix obtained by combining the evenly spaced time-series-data group including omissions and the omission information group indicating presence or absence of omissions being input to the input layer, a matrix of an evenly spaced time-series-data group of an input time-series data length being output from the output layer, the received feature extraction size being the intermediate layer, and the difference being a difference between an element not missing in a matrix of the evenly spaced time-series-data group including omissions and an element of an output result of the output layer;” [¶0014]); 
a time series learner configured to calculate the time series weight value, based on the first learning result and the second parameter (“a feature extraction unit that receives time-series data of a feature extraction target, calculates a value of the intermediate layer of the model with use of the model parameter stored in the storage unit by inputting the received time-series data of the feature extraction target into the model” [¶0014]), and 
generate a second learning result, based on the time series weight value (“and outputs the calculated value of the intermediate layer as a feature that represents temporal changes in data.” [¶0014]); and 
a weight value controller configured to adjust the first parameter or the second parameter, based on the first learning result or the second learning result (“Moreover, for the non-omitted value of the evenly spaced time-series-data P, it is aimed to learn so as to have the same value in the output layer value X.sub.6, and an error function is designed. The model parameters are optimized by means of a gradient method so as to minimize the error. Adam is used as the gradient method. The gradient method in the present embodiment is not limited to this. As the gradient method, any method of probabilistic gradient descent such as SGD and AdaDelta may be used.” [¶0046; discloses an optimization algorithm, therefore optimizing model parameters would be equivalent to adjusting the first or second parameters.]).
Brezak teaches using the first neural network (“Feed–forward neural network analyzed in this paper is the most commonly used MLP NN with three layers… By summing the outputs of all hidden layer neurons (including Bias) and belonging weight factors w” [pg. 2, A. Feed-forward Neural Network, ¶1-3; See further: “Forecasting performances of feed-forward and recurrent neural networks (NN) trained with different learning algorithms are analyzed and compared using the Mackey–Glass nonlinear chaotic time series. This system is a known benchmark test whose elements are hard to predict. Multi–layer Perceptron NN was chosen as a feed-forward neural network because it is still the most commonly used network in financial forecasting models.” [Abstract]]) and using the second neural network (“The Dynamic Multi–layer Perceptron Network (DMLP), proposed in [9], was modified and used as the second, recurrent type of network in this study. Unlike the feedforward MLP NN, this type of network is characterized by a dynamic neuron model, the so-called Dynamic Elementary Processor (DEP) which is structured as an Auto Regressive Moving Average (ARMA) filter, and is built into the network hidden layer. Thus, every hidden layer neuron has the ability to process previous values of its own activity together with new input signals.” [pg. 2, B. Recurrent Neural Network, ¶1]).
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Yao’s teachings to implement using a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning, thus one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]
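For illustration only, the error described in the quoted Kurasawa passages (¶0031 and ¶0046), in which only non-missing elements contribute to the difference between the input matrix and the output layer, can be sketched as a masked squared error together with its gradient with respect to the output. This is a minimal sketch under stated assumptions. The squared-error form, the function names, and the flag convention (1 = observed) are hypothetical, and the quoted text does not state the exact error function. A gradient method such as SGD or Adam would use a gradient of this kind to adjust the model parameters.

```python
from typing import List


def masked_squared_error(
    target: List[float], output: List[float], observed: List[int]
) -> float:
    """Sum of squared differences over observed (non-missing) elements only."""
    return sum(
        (o - t) ** 2 for t, o, m in zip(target, output, observed) if m
    )


def masked_error_gradient(
    target: List[float], output: List[float], observed: List[int]
) -> List[float]:
    """Gradient of the masked error with respect to each output element.

    Missing elements contribute zero gradient, so they do not drive learning.
    """
    return [
        2.0 * (o - t) if m else 0.0
        for t, o, m in zip(target, output, observed)
    ]
```

Because the second element in a call such as `masked_squared_error([1.0, 0.0, 3.0], [2.0, 5.0, 1.0], [1, 0, 1])` is flagged as missing, its large difference is ignored and only the first and third elements contribute.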

Regarding claim 4, Kurasawa/Brezak/Yao teaches The time series data processing device of claim 2, where Kurasawa teaches wherein the time series learner includes: 
a time series weight value calculator configured to calculate the time series weight value, based on the first learning result and the second parameter (“a feature extraction unit that receives time-series data of a feature extraction target, calculates a value of the intermediate layer of the model with use of the model parameter stored in the storage unit by inputting the received time-series data of the feature extraction target into the model” [¶0014]); and 
a time series weight value applicator configured to apply the time series weight value to the first learning result (“Next, the model learning unit 14 learns the weight vector of each layer that constitutes the model so as to minimize the error (Step S105). Specifically, the evenly spaced time-series-data is referred to as P, the omission information is referred to as Q, and the data that combines the evenly spaced time-series-data group and the omission information group representing the presence or absence of omission is referred to as R.” [¶0045]).
Brezak teaches using the second neural network (“The Dynamic Multi–layer Perceptron Network (DMLP), proposed in [9], was modified and used as the second, recurrent type of network in this study. Unlike the feedforward MLP NN, this type of network is characterized by a dynamic neuron model, the so-called Dynamic Elementary Processor (DEP) which is structured as an Auto Regressive Moving Average (ARMA) filter, and is built into the network hidden layer. Thus, every hidden layer neuron has the ability to process previous values of its own activity together with new input signals.” [pg. 2, B. Recurrent Neural Network, ¶1]).
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Yao’s teachings to implement using a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning, thus one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]

Regarding claim 5, Kurasawa/Brezak/Yao teaches The time series data processing device of claim 1, where Kurasawa teaches wherein the learner includes: 
a feature learner configured to calculate the feature weight value, based on the masking data, the interpolation data, and the first parameter, and generate a first learning result, based on the feature weight value (“a model learning unit that learns a weight vector of each layer of a model with a difference being taken as an error, and stores the weight vector as a model parameter in a storage unit, the model being a model of a neural network including an input layer, an output layer, and an intermediate layer, a matrix obtained by combining the evenly spaced time-series-data group including omissions and the omission information group indicating presence or absence of omissions being input to the input layer, a matrix of an evenly spaced time-series-data group of an input time-series data length being output from the output layer, the received feature extraction size being the intermediate layer, and the difference being a difference between an element not missing in a matrix of the evenly spaced time-series-data group including omissions and an element of an output result of the output layer;” [¶0014]); 
a time series learner configured to calculate the time series weight value, based on the interval data, the first learning result, and the second parameter (“a feature extraction unit that receives time-series data of a feature extraction target, calculates a value of the intermediate layer of the model with use of the model parameter stored in the storage unit by inputting the received time-series data of the feature extraction target into the model” [¶0014]), and 
generate a second learning result, based on the time series weight value (“and outputs the calculated value of the intermediate layer as a feature that represents temporal changes in data.” [¶0014]); and 
a weight value controller configured to adjust the first parameter or the second parameter, based on the first learning result or the second learning result (“Moreover, for the non-omitted value of the evenly spaced time-series-data P, it is aimed to learn so as to have the same value in the output layer value X.sub.6, and an error function is designed. The model parameters are optimized by means of a gradient method so as to minimize the error. Adam is used as the gradient method. The gradient method in the present embodiment is not limited to this. As the gradient method, any method of probabilistic gradient descent such as SGD and AdaDelta may be used.” [¶0046; discloses an optimization algorithm, therefore optimizing model parameters would be equivalent to adjusting the first or second parameters.]).

Brezak teaches using the first neural network (“Feed–forward neural network analyzed in this paper is the most commonly used MLP NN with three layers… By summing the outputs of all hidden layer neurons (including Bias) and belonging weight factors w” [pg. 2, A. Feed-forward Neural Network, ¶1-3; See further: “Forecasting performances of feed-forward and recurrent neural networks (NN) trained with different learning algorithms are analyzed and compared using the Mackey–Glass nonlinear chaotic time series. This system is a known benchmark test whose elements are hard to predict. Multi–layer Perceptron NN was chosen as a feed-forward neural network because it is still the most commonly used network in financial forecasting models.” [Abstract]]) and using the second neural network (“The Dynamic Multi–layer Perceptron Network (DMLP), proposed in [9], was modified and used as the second, recurrent type of network in this study. Unlike the feedforward MLP NN, this type of network is characterized by a dynamic neuron model, the so-called Dynamic Elementary Processor (DEP) which is structured as an Auto Regressive Moving Average (ARMA) filter, and is built into the network hidden layer. Thus, every hidden layer neuron has the ability to process previous values of its own activity together with new input signals.” [pg. 2, B. Recurrent Neural Network, ¶1]).
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Yao’s teachings to implement using a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning, thus one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]

Regarding claim 8, Kurasawa/Brezak/Yao teaches The time series data processing device of claim 1, where Kurasawa teaches wherein the learner includes: 
a feature learner configured to calculate the feature weight value, based on the masking data, the interpolation data, and the first parameter (“a model learning unit that learns a weight vector of each layer of a model with a difference being taken as an error, and stores the weight vector as a model parameter in a storage unit, the model being a model of a neural network including an input layer, an output layer, and an intermediate layer, a matrix obtained by combining the evenly spaced time-series-data group including omissions and the omission information group indicating presence or absence of omissions being input to the input layer, a matrix of an evenly spaced time-series-data group of an input time-series data length being output from the output layer, the received feature extraction size being the intermediate layer, and the difference being a difference between an element not missing in a matrix of the evenly spaced time-series-data group including omissions and an element of an output result of the output layer;” [¶0014]); 
a time series learner configured to calculate the time series weight value, based on the interval data, the interpolation data, and the second parameter (“a feature extraction unit that receives time-series data of a feature extraction target, calculates a value of the intermediate layer of the model with use of the model parameter stored in the storage unit by inputting the received time-series data of the feature extraction target into the model” [¶0014]); and 
an integrated weight value applicator configured to generate a learning result, based on the feature weight value (“a model learning unit that learns a weight vector of each layer of a model with a difference being taken as an error, and stores the weight vector as a model parameter in a storage unit, the model being a model of a neural network including an input layer, an output layer, and an intermediate layer” [¶0014]) and the time series weight value (“and outputs the calculated value of the intermediate layer as a feature that represents temporal changes in data.” [¶0014]); and 
a weight value controller configured to adjust the first parameter or the second parameter, based on the learning result (“Moreover, for the non-omitted value of the evenly spaced time-series-data P, it is aimed to learn so as to have the same value in the output layer value X.sub.6, and an error function is designed. The model parameters are optimized by means of a gradient method so as to minimize the error. Adam is used as the gradient method. The gradient method in the present embodiment is not limited to this. As the gradient method, any method of probabilistic gradient descent such as SGD and AdaDelta may be used.” [¶0046; discloses an optimization algorithm, therefore optimizing model parameters would be equivalent to adjusting the first or second parameters.]).
Brezak teaches using the first neural network (“Feed–forward neural network analyzed in this paper is the most commonly used MLP NN with three layers… By summing the outputs of all hidden layer neurons (including Bias) and belonging weight factors w” [pg. 2, A. Feed-forward Neural Network, ¶1-3; See further: “Forecasting performances of feed-forward and recurrent neural networks (NN) trained with different learning algorithms are analyzed and compared using the Mackey–Glass nonlinear chaotic time series. This system is a known benchmark test whose elements are hard to predict. Multi–layer Perceptron NN was chosen as a feed-forward neural network because it is still the most commonly used network in financial forecasting models.” [Abstract]]) and using the second neural network (“The Dynamic Multi–layer Perceptron Network (DMLP), proposed in [9], was modified and used as the second, recurrent type of network in this study. Unlike the feedforward MLP NN, this type of network is characterized by a dynamic neuron model, the so-called Dynamic Elementary Processor (DEP) which is structured as an Auto Regressive Moving Average (ARMA) filter, and is built into the network hidden layer. Thus, every hidden layer neuron has the ability to process previous values of its own activity together with new input signals.” [pg. 2, B. Recurrent Neural Network, ¶1]).
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Yao’s teachings to implement a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning; thus, one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]
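For illustration only, the parameter adjustment mapped above to Kurasawa ¶0046 (an error computed only over non-omitted elements and minimized by a gradient method) might be sketched as follows. This is not code from Kurasawa or from the application. The two-layer linear model, the function names, the learning rate, and the convention that Q is 1 where a value is omitted are all assumptions, and plain gradient descent stands in for Adam, which ¶0046 names as one option among several.

```python
import numpy as np

def masked_mse(output, target, observed):
    """Mean squared error over observed (non-omitted) elements only."""
    diff = (output - target) * observed
    return float((diff ** 2).sum() / max(observed.sum(), 1))

def train_masked_autoencoder(P, Q, hidden=2, lr=0.1, steps=300, seed=0):
    """Hypothetical sketch.
    P: evenly spaced series, omitted entries filled with 0, shape (N, T).
    Q: omission information, 1 where omitted and 0 where observed (assumed convention).
    """
    rng = np.random.default_rng(seed)
    R = np.concatenate([P, Q], axis=1)      # combined input to the input layer
    observed = 1.0 - Q                      # 1 where a real observation exists
    n = max(observed.sum(), 1)
    W1 = rng.normal(scale=0.1, size=(R.shape[1], hidden))   # input -> intermediate
    W2 = rng.normal(scale=0.1, size=(hidden, P.shape[1]))   # intermediate -> output
    losses = []
    for _ in range(steps):
        H = R @ W1                          # intermediate-layer values
        X = H @ W2                          # output-layer values
        losses.append(masked_mse(X, P, observed))
        G = 2.0 * (X - P) * observed / n    # gradient of the masked error w.r.t. X
        grad_W2 = H.T @ G
        grad_W1 = R.T @ (G @ W2.T)
        W1 -= lr * grad_W1                  # plain gradient descent step
        W2 -= lr * grad_W2
    return W1, W2, losses
```

Because the error is multiplied by the observed-element indicator, omitted positions contribute nothing to the gradient, which is the property the cited passage relies on.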

Regarding claim 9, Kurasawa teaches A time series data processing device comprising: 
a preprocessor (See [¶0058-60], note: This claim invokes 112f, therefore as recited in para [¶0038] of the specification, the elements can be interpreted as hardware or software loaded into memory.) configured to generate interval data, based on a time interval of time series data (“a data processing unit that processes the received unevenly spaced time-series-data group into an evenly spaced time-series-data group and an omission information group based on the received input time-series data length and the received minimum observation interval” [¶0014]), add an interpolation value to a missing value of the time series data to generate interpolation data (“In a second method, an omission estimation processing (interpolation or extrapolation) is performed so that the time interval is constant, and then features expressing changes over time are extracted.” [¶0006; See also: “In the above example of unevenly spaced time-series data, when discrete Fourier transform is performed after the interpolation processing by a linear function, three records, namely (13 o'clock, 26 degrees), (15 o'clock, 25 degrees), and (16 o'clock, 22 degrees) are added by means of interpolation of the omission estimation processing” [¶0008]]), and 
generate masking data for distinguishing the missing value (“Next, the model learning unit 14 initializes the model parameters (Step S104). A random value is assigned to the weight parameter A.sub.i of the model parameter and the bias parameter B.sub.i (i=1, 2, 3, 4, 5). Furthermore, “0” is assigned to the omission value of the evenly spaced time-series-data. In the present embodiment, “0” is assigned to the omission value, but it is not limited to this example. An average value, a median value, or an omission processing result may be assigned to the omission value.” [¶0044; assigning “0” to omission value corresponds to generating “masking data”]); and 
a predictor configured to generate a feature weight value depending on a time and a feature of the time series data (“For the model of a neural network in which the feature extraction size received by the model learning unit 12 is an intermediate layer, the model learning unit 14 learns a weight vector of each layer where a difference between an element not missing in the matrix of the evenly spaced time-series-data group including omissions, and an element of the output result of the output layer is taken as an error, and generates model parameters.” [¶0031; See [¶0030] “The model design receiving unit 12 receives (receives an input of) a time-series data length, an observation minimum interval, and a feature extraction size.” corresponds to a time and feature of a time series.]) and a time series weight value depending on a time flow of the time series data (“The feature extraction unit 16 receives the time-series data of the feature extraction target and takes the received time-series data as an input of the model. The feature extraction unit 16 calculates the value of the intermediate layer of the model using the stored model parameters, and outputs the feature representing the temporal changes of the data.” [¶0032; temporal changes correspond to a dependence on a time flow.]), based on the interval data (“a model design receiving unit that receives an input time-series data length, an observation minimum interval” [¶0014]), the interpolation data ([¶0006; method is based on interpolation]), and the masking data ([¶0044; discloses masking data]), 
and generate a prediction result corresponding to a prediction time for the time series data (“The feature extraction unit 16 may output the value of the intermediate layer together with the time-series data from which the feature has been extracted, and may output information of the difference between the element not missing in the matrix of the evenly spaced time-series-data group including omissions, and the element of the output result of the output layer of the model.” [¶0056]), based on the feature weight value (“the model learning unit 14 learns a weight vector of each layer where a difference between an element not missing in the matrix of the evenly spaced time-series-data group including omissions, and an element of the output result of the output layer is taken as an error, and generates model parameters.” [¶0031; corresponds to a first parameter]) and a second parameter for generating the time series weight value (“The feature extraction unit 16 calculates the value of the intermediate layer of the model using the stored model parameters” [¶0032; model parameters implies a second parameter to generate a time series weight value]).
Although Kurasawa teaches using a neural network, the reference fails to explicitly teach wherein the weight value group is generated using a first neural network comprising a feed-forward neural network, wherein the time series weight value is generated using a second neural network comprising a recurrent neural network.
Brezak teaches wherein the weight value group is generated using a first neural network comprising a feed-forward neural network (“Feed–forward neural network analyzed in this paper is the most commonly used MLP NN with three layers… By summing the outputs of all hidden layer neurons (including Bias) and belonging weight factors w” [pg. 2, A. Feed-forward Neural Network, ¶1-3; See further: “Forecasting performances of feed-forward and recurrent neural networks (NN) trained with different learning algorithms are analyzed and compared using the Mackey–Glass nonlinear chaotic time series. This system is a known benchmark test whose elements are hard to predict. Multi–layer Perceptron NN was chosen as a feed-forward neural network because it is still the most commonly used network in financial forecasting models.” [Abstract]]), 
wherein the time series weight value is generated using a second neural network comprising a recurrent neural network (“The Dynamic Multi–layer Perceptron Network (DMLP), proposed in [9], was modified and used as the second, recurrent type of network in this study. Unlike the feedforward MLP NN, this type of network is characterized by a dynamic neuron model, the so-called Dynamic Elementary Processor (DEP) which is structured as an Auto Regressive Moving Average (ARMA) filter, and is built into the network hidden layer. Thus, every hidden layer neuron has the ability to process previous values of its own activity together with new input signals.” [pg. 2, B. Recurrent Neural Network, ¶1]).
Kurasawa and Brezak are both in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and recurrent neural network in time-series forecasting. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s teachings to implement a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning; thus, one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]
However, Kurasawa and Brezak fail to explicitly teach wherein the prediction result is generated using a third neural network comprising a fully-connected neural network.
Yao teaches wherein the prediction result is generated using a third neural network comprising a fully-connected neural network (“We propose to use a graph of regions to capture this latent semantic, where the edge represents similarity of demand patterns for a pair of regions. Later, regions are encoded into vectors via a graph embedding method and such vectors are used as context features in the model. In the end, a fully connected neural network component is used for prediction.” [pg. 2589, left col, ¶2; See further: “We use the average weekly demand time series as the demand patterns. The average is computed on the training data in the experiment. The graph is fully connected because every two regions can be reached.” [pg. 2591, right col, top para; discloses using time-series data. See Figure 1: “The semantic view first constructs a weighted graph of regions (with weights representing functional similarity). Nodes are encoded into vectors. A fully connected layer is used at the end for jointly training. Finally, a fully connected neural network is used for prediction”]]).
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Brezak’s teachings to implement a fully connected neural network to perform a prediction result as taught by Yao. One would have been motivated to make this modification in order to use spatial-temporal correlations to improve on traditional time series prediction methods. [pg. 2588, § Introduction, ¶3, Yao]
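As an illustrative aid only, the three preprocessor outputs recited in claim 9 (interval data, interpolation data, and masking data) can be sketched for a single unevenly spaced series. Nothing below is taken from Kurasawa or from the application: the function name, the 1 = observed / 0 = missing masking convention, and the definition of interval data as the time elapsed since the previous observation are assumptions. The sample observations are hypothetical values chosen so that linear interpolation yields the three records quoted from ¶0008.

```python
import numpy as np

def preprocess(times, values, step=1.0):
    """Hypothetical sketch: resample an unevenly spaced series onto an even grid."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    # Evenly spaced grid covering the observed time range.
    grid = np.arange(times[0], times[-1] + step, step)
    # Masking data: 1 where the grid point was actually observed, 0 where missing.
    mask = np.isin(grid, times).astype(int)
    # Interpolation data: missing grid points filled by a linear function.
    interpolated = np.interp(grid, times, values)
    # Interval data: time elapsed since the previous observation (0 for the first).
    interval = np.diff(times, prepend=times[0])
    return grid, interpolated, mask, interval

# Hypothetical observations at 12, 14, and 17 o'clock.
grid, interpolated, mask, interval = preprocess([12, 14, 17], [24, 28, 19])
```

With these inputs the interpolated values at 13, 15, and 16 o'clock are 26, 25, and 22 degrees, and the mask marks exactly those three grid points as missing.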

Regarding claim 10, Kurasawa/Brezak/Yao teaches The time series data processing device of claim 9, where Kurasawa teaches wherein the predictor includes: 
a feature predictor configured to generate a first result, based on the feature weight value (“a model learning unit that learns a weight vector of each layer of a model with a difference being taken as an error, and stores the weight vector as a model parameter in a storage unit, the model being a model of a neural network including an input layer, an output layer, and an intermediate layer” [¶0014; output layer from model learning unit outputs a “first result”]); 
a time series predictor configured to generate a second result, based on the time series weight value (“a feature extraction unit that receives time-series data of a feature extraction target, calculates a value of the intermediate layer of the model with use of the model parameter stored in the storage unit by inputting the received time-series data of the feature extraction target into the model, and outputs the calculated value of the intermediate layer as a feature that represents temporal changes in data.” [¶0014; feature extraction unit outputs a “second result”]); and 
a result generator configured to calculate the prediction result corresponding to a prediction time, based on the second result (“In the time-series-data feature extraction device, the feature extraction unit may output the value of the intermediate layer together with the time-series data from which the feature has been extracted, and may output information of the difference between the element not missing in the matrix of the evenly spaced time-series-data group including omissions and the element of the output result of the output layer of the model.” [¶0015]).
Brezak teaches using the first neural network (“Feed–forward neural network analyzed in this paper is the most commonly used MLP NN with three layers… By summing the outputs of all hidden layer neurons (including Bias) and belonging weight factors w” [pg. 2, A. Feed-forward Neural Network, ¶1-3; See further: “Forecasting performances of feed-forward and recurrent neural networks (NN) trained with different learning algorithms are analyzed and compared using the Mackey–Glass nonlinear chaotic time series. This system is a known benchmark test whose elements are hard to predict. Multi–layer Perceptron NN was chosen as a feed-forward neural network because it is still the most commonly used network in financial forecasting models.” [Abstract]]) and using the second neural network (“The Dynamic Multi–layer Perceptron Network (DMLP), proposed in [9], was modified and used as the second, recurrent type of network in this study. Unlike the feedforward MLP NN, this type of network is characterized by a dynamic neuron model, the so-called Dynamic Elementary Processor (DEP) which is structured as an Auto Regressive Moving Average (ARMA) filter, and is built into the network hidden layer. Thus, every hidden layer neuron has the ability to process previous values of its own activity together with new input signals.” [pg. 2, B. Recurrent Neural Network, ¶1]).
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Yao’s teachings to implement a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning; thus, one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]
Yao teaches using the third neural network (“We propose to use a graph of regions to capture this latent semantic, where the edge represents similarity of demand patterns for a pair of regions. Later, regions are encoded into vectors via a graph embedding method and such vectors are used as context features in the model. In the end, a fully connected neural network component is used for prediction.” [pg. 2589, left col, ¶2; See further: “We use the average weekly demand time series as the demand patterns. The average is computed on the training data in the experiment. The graph is fully connected because every two regions can be reached.” [pg. 2591, right col, top para; discloses using time-series data. See Figure 1: “The semantic view first constructs a weighted graph of regions (with weights representing functional similarity). Nodes are encoded into vectors. A fully connected layer is used at the end for jointly training. Finally, a fully connected neural network is used for prediction”]]).
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Brezak’s teachings to implement a fully connected neural network to perform a prediction result as taught by Yao. One would have been motivated to make this modification in order to use spatial-temporal correlations to improve on traditional time series prediction methods. [pg. 2588, § Introduction, ¶3, Yao]
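For orientation only, one possible reading of the three-network arrangement discussed for claims 9 and 10 (a feed-forward network producing a feature weight value, a recurrent network producing a time series weight value, and a fully connected network producing the prediction result) is sketched below. This is a hypothetical forward pass, not an implementation from the application, Kurasawa, Brezak, or Yao. The layer sizes, the softmax normalization, the way each weight is applied, and every name in the code are assumptions.

```python
import numpy as np

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(n_features, hidden=4, seed=0):
    """Random parameters for the hypothetical sketch."""
    rng = np.random.default_rng(seed)
    return {
        "W_ff": rng.normal(scale=0.1, size=(n_features, hidden)),
        "V_ff": rng.normal(scale=0.1, size=(hidden, n_features)),
        "W_x": rng.normal(scale=0.1, size=(n_features, hidden)),
        "W_h": rng.normal(scale=0.1, size=(hidden, hidden)),
        "v_t": rng.normal(scale=0.1, size=hidden),
        "w_out": rng.normal(scale=0.1, size=hidden),
        "b_out": 0.0,
    }

def forward(x, params):
    """x: series of shape (T, F). Returns (prediction, feature_w, time_w)."""
    # First network (feed-forward): one weight per feature at each time step.
    feature_w = softmax(np.tanh(x @ params["W_ff"]) @ params["V_ff"], axis=1)
    first_result = feature_w * x                            # (T, F)
    # Second network (recurrent): a hidden state carried across time steps.
    h = np.zeros(params["W_h"].shape[0])
    states = []
    for t in range(x.shape[0]):
        h = np.tanh(first_result[t] @ params["W_x"] + h @ params["W_h"])
        states.append(h)
    states = np.stack(states)                               # (T, H)
    time_w = softmax(states @ params["v_t"], axis=0)        # one weight per time step
    second_result = time_w @ states                         # (H,)
    # Third network (fully connected): maps the second result to a prediction.
    prediction = float(second_result @ params["w_out"] + params["b_out"])
    return prediction, feature_w, time_w
```

In this sketch the feature weights sum to 1 across features at each time step and the time series weights sum to 1 across time steps; that normalization is a design choice of the example, not a limitation recited in the claims.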

Regarding claim 11, Kurasawa/Brezak/Yao teaches The time series data processing device of claim 10, where Kurasawa teaches wherein the feature predictor includes: 
a missing value processor configured to encode the interpolation data, based on the masking data (“As described above, in the present embodiment, it is possible to: convert the evenly spaced time-series-data group for training to an evenly spaced time-series-data group including omissions, and an omission information group indicating the presence or absence of omissions; learn, while taking these two as inputs, as a self-encoder in which an evenly spaced time-series-data group becomes an output” [¶0055]); 
a time processor configured to model the interval data (“An example of an object of the present invention is to provide a time-series-data feature extraction device that extracts features representing temporal changes in data from time-series-data observed at uneven intervals, a time-series-data feature extraction method, and a time-series-data feature extraction program.” [¶0013]); 
a feature weight value calculator configured to generate feature analysis data, based on the encoded interpolation data and to generate the feature weight value (“By restoring the time-series data observed at uneven intervals from the feature representing the temporal changes of the data and outputting the magnitude of the difference from the original time-series data as a new feature, it is possible to analyze unevenly spaced time-series data that takes account of the accuracy of the feature extraction of the temporal changes in the data. Furthermore, it can be used for an analysis as an index that indicates whether the feature representing temporal changes sufficiently expresses the original time-series data.” [¶0056]), based on the feature analysis data and the modeled interval data (“According to this configuration, in extracting a feature that represents temporal changes in data from time-series data observed at uneven intervals, then by collectively processing the omission estimation process and the feature extraction, the accuracy of the omission estimation processing is prevented from significantly influencing the feature, and the accuracy of analysis such as classification by means of machine learning is improved.” [¶0057]); and 
a feature weight value applicator configured to apply the feature weight value to the feature analysis data to generate the first result (“a model learning unit that learns a weight vector of each layer of a model with a difference being taken as an error, and stores the weight vector as a model parameter in a storage unit, the model being a model of a neural network including an input layer, an output layer, and an intermediate layer” [¶0014; output layer from model learning unit outputs a “first result”]).
Brezak teaches using the first neural network (“Feed–forward neural network analyzed in this paper is the most commonly used MLP NN with three layers… By summing the outputs of all hidden layer neurons (including Bias) and belonging weight factors w” [pg. 2, A. Feed-forward Neural Network, ¶1-3; See further: “Forecasting performances of feed-forward and recurrent neural networks (NN) trained with different learning algorithms are analyzed and compared using the Mackey–Glass nonlinear chaotic time series. This system is a known benchmark test whose elements are hard to predict. Multi–layer Perceptron NN was chosen as a feed-forward neural network because it is still the most commonly used network in financial forecasting models.” [Abstract]]).
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Yao’s teachings to implement a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning; thus, one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]

Regarding claim 12, Kurasawa/Brezak/Yao teaches The time series data processing device of claim 10, where Kurasawa teaches wherein the feature predictor includes:
a missing value processor configured to merge the masking data and the interpolation data (“Specifically, the evenly spaced time-series-data is referred to as P, the omission information is referred to as Q, and the data that combines the evenly spaced time-series-data group and the omission information group representing the presence or absence of omission is referred to as R. To the value X.sub.1 for the input layer, there is input the data R obtained by combining the evenly spaced time-series-data group and the omission information group indicating the presence or absence of omission. Learning is performed so that the output value X.sub.6 of the output layer (shown in Equation (4)) and the evenly spaced time-series-data P approach, without limit, to the value that is not omitted.” [¶0045]); 
a time processor configured to model the interval data (“An example of an object of the present invention is to provide a time-series-data feature extraction device that extracts features representing temporal changes in data from time-series-data observed at uneven intervals, a time-series-data feature extraction method, and a time-series-data feature extraction program.” [¶0013]); 
a feature weight value calculator configured to generate feature analysis data, based on the merged data, and generate the feature weight value, based on the feature analysis data and the modeled interval data (“By restoring the time-series data observed at uneven intervals from the feature representing the temporal changes of the data and outputting the magnitude of the difference from the original time-series data as a new feature, it is possible to analyze unevenly spaced time-series data that takes account of the accuracy of the feature extraction of the temporal changes in the data. Furthermore, it can be used for an analysis as an index that indicates whether the feature representing temporal changes sufficiently expresses the original time-series data.” [¶0056]); and 
a feature weight value applicator configured to apply the feature weight value to the feature analysis data to generate the first result (“a model learning unit that learns a weight vector of each layer of a model with a difference being taken as an error, and stores the weight vector as a model parameter in a storage unit, the model being a model of a neural network including an input layer, an output layer, and an intermediate layer” [¶0014; output layer from model learning unit outputs a “first result”]).
Brezak teaches using the first neural network (“Feed–forward neural network analyzed in this paper is the most commonly used MLP NN with three layers… By summing the outputs of all hidden layer neurons (including Bias) and belonging weight factors w” [pg. 2, A. Feed-forward Neural Network, ¶1-3; See further: “Forecasting performances of feed-forward and recurrent neural networks (NN) trained with different learning algorithms are analyzed and compared using the Mackey–Glass nonlinear chaotic time series. This system is a known benchmark test whose elements are hard to predict. Multi–layer Perceptron NN was chosen as a feed-forward neural network because it is still the most commonly used network in financial forecasting models.” [Abstract]]).
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Yao’s teachings to implement a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning; thus, one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]

Regarding claim 13, Kurasawa/Brezak/Yao teaches The time series data processing device of claim 10, where Kurasawa teaches wherein the feature predictor includes: 
a missing value processor configured to model the masking data (“Furthermore, “0” is assigned to the omission value of the evenly spaced time-series-data. In the present embodiment, “0” is assigned to the omission value, but it is not limited to this example. An average value, a median value, or an omission processing result may be assigned to the omission value.” [¶0044]); 
a time processor configured to model the interval data (“An example of an object of the present invention is to provide a time-series-data feature extraction device that extracts features representing temporal changes in data from time-series-data observed at uneven intervals, a time-series-data feature extraction method, and a time-series-data feature extraction program.” [¶0013]); 
a feature weight value calculator configured to generate feature analysis data, based on the interpolation data (“By restoring the time-series data observed at uneven intervals from the feature representing the temporal changes of the data and outputting the magnitude of the difference from the original time-series data as a new feature, it is possible to analyze unevenly spaced time-series data that takes account of the accuracy of the feature extraction of the temporal changes in the data. Furthermore, it can be used for an analysis as an index that indicates whether the feature representing temporal changes sufficiently expresses the original time-series data.” [¶0056]), and generate the feature weight value, based on the modeled masking data, the modeled interval data, and the feature analysis data (“a model learning unit that learns a weight vector of each layer of a model with a difference being taken as an error, and stores the weight vector as a model parameter in a storage unit, the model being a model of a neural network including an input layer, an output layer, and an intermediate layer, a matrix obtained by combining the evenly spaced time-series-data group including omissions and the omission information group indicating presence or absence of omissions being input to the input layer, a matrix of an evenly spaced time-series-data group of an input time-series data length being output from the output layer, the received feature extraction size being the intermediate layer, and the difference being a difference between an element not missing in a matrix of the evenly spaced time-series-data group including omissions and an element of an output result of the output layer;” [¶0014]); and 
a feature weight value applicator configured to apply the feature weight value to the feature analysis data to generate the first result (“The model learning unit 14 takes a matrix obtained by combining the evenly spaced time-series-data group including omissions, and the omission information group indicating the presence or absence of omissions as an input to the input layer, and takes the matrix of the evenly spaced time-series-data group of the input time-series data length as an output from the output layer” [¶0031]).
Brezak teaches using the first neural network (“Feed–forward neural network analyzed in this paper is the most commonly used MLP NN with three layers… By summing the outputs of all hidden layer neurons (including Bias) and belonging weight factors w” [pg. 2, A. Feed-forward Neural Network, ¶1-3; See further: “Forecasting performances of feed-forward and recurrent neural networks (NN) trained with different learning algorithms are analyzed and compared using the Mackey–Glass nonlinear chaotic time series. This system is a known benchmark test whose elements are hard to predict. Multi–layer Perceptron NN was chosen as a feed-forward neural network because it is still the most commonly used network in financial forecasting models.” [Abstract]]).
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Yao’s teachings to implement a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning; thus, one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]

Regarding claim 14, Kurasawa/Brezak/Yao teaches The time series data processing device of claim 10, where Kurasawa teaches wherein the feature predictor includes: 
a missing value processor configured to model the masking data (“Furthermore, “0” is assigned to the omission value of the evenly spaced time-series-data. In the present embodiment, “0” is assigned to the omission value, but it is not limited to this example. An average value, a median value, or an omission processing result may be assigned to the omission value.” [¶0044]); 
a time processor configured to merge the interval data and the interpolation data (“The model handled by the model learning unit 14 is a neural network. This model is a model composed of three or more layers that always has three layers of input layer, output layer, and intermediate layer. The input to the model learning unit 14 is information combining the evenly spaced time-series-data group including omissions (refer to the portion (B) in FIG. 5), and the omission information group indicating the presence or absence of omissions (refer to the portion (C) in FIG. 5).” [¶0037]); 
a feature weight value calculator configured to generate feature analysis data, based on the merged data (“By restoring the time-series data observed at uneven intervals from the feature representing the temporal changes of the data and outputting the magnitude of the difference from the original time-series data as a new feature, it is possible to analyze unevenly spaced time-series data that takes account of the accuracy of the feature extraction of the temporal changes in the data. Furthermore, it can be used for an analysis as an index that indicates whether the feature representing temporal changes sufficiently expresses the original time-series data.” [¶0056]), and generate the feature weight value, based on the feature analysis data and the modeled masking data (“a model learning unit that learns a weight vector of each layer of a model with a difference being taken as an error, and stores the weight vector as a model parameter in a storage unit, the model being a model of a neural network including an input layer, an output layer, and an intermediate layer, a matrix obtained by combining the evenly spaced time-series-data group including omissions and the omission information group indicating presence or absence of omissions being input to the input layer, a matrix of an evenly spaced time-series-data group of an input time-series data length being output from the output layer, the received feature extraction size being the intermediate layer, and the difference being a difference between an element not missing in a matrix of the evenly spaced time-series-data group including omissions and an element of an output result of the output layer;” [¶0014]); and 
a feature weight value applicator configured to apply the feature weight value to the feature analysis data to generate the first result (“The model learning unit 14 takes a matrix obtained by combining the evenly spaced time-series-data group including omissions, and the omission information group indicating the presence or absence of omissions as an input to the input layer, and takes the matrix of the evenly spaced time-series-data group of the input time-series data length as an output from the output layer” [¶0031]).
Brezak teaches using the first neural network (“Feed–forward neural network analyzed in this paper is the most commonly used MLP NN with three layers… By summing the outputs of all hidden layer neurons (including Bias) and belonging weight factors w” [pg. 2, A. Feed-forward Neural Network, ¶1-3; See further: “Forecasting performances of feed-forward and recurrent neural networks (NN) trained with different learning algorithms are analyzed and compared using the Mackey– Glass nonlinear chaotic time series. This system is a known benchmark test whose elements are hard to predict. Multi– layer Perceptron NN was chosen as a feed-forward neural network because it is still the most commonly used network in financial forecasting models.” [Abstract]])
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and a recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Yao’s teachings to use a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm, as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning; thus, one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]

Regarding claim 15, Kurasawa/Brezak/Yao teaches The time series data processing device of claim 10, where Kurasawa teaches wherein the time series predictor includes: 
a time series weight value calculator configured to generate time series analysis data, based on the first result (“a feature extraction unit that receives time-series data of a feature extraction target, calculates a value of the intermediate layer of the model with use of the model parameter stored in the storage unit by inputting the received time-series data of the feature extraction target into the model” [¶0014]), and 
generate the time series weight value, based on the time series analysis data (“The feature extraction unit 16 calculates the value of the intermediate layer of the model using the stored model parameters” [¶0032; the model parameters imply a second parameter used to generate a time series weight value]); and 
a time series weight value applicator configured to apply the time series weight value to the first result or the time series analysis data (“Next, the model learning unit 14 learns the weight vector of each layer that constitutes the model so as to minimize the error (Step S105). Specifically, the evenly spaced time-series-data is referred to as P, the omission information is referred to as Q, and the data that combines the evenly spaced time-series-data group and the omission information group representing the presence or absence of omission is referred to as R.” [¶0045]).
Brezak teaches using the second neural network (“The Dynamic Multi–layer Perceptron Network (DMLP), proposed in [9], was modified and used as the second, recurrent type of network in this study. Unlike the feedforward MLP NN, this type of network is characterized by a dynamic neuron model, the so-called Dynamic Elementary Processor (DEP) which is structured as an Auto Regressive Moving Average (ARMA) filter, and is built into the network hidden layer. Thus, every hidden layer neuron has the ability to process previous values of its own activity together with new input signals.” [pg. 2, B. Recurrent Neural Network, ¶1])
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and a recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Yao’s teachings to use a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm, as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning; thus, one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]

Regarding claim 16, Kurasawa/Brezak/Yao teaches The time series data processing device of claim 10, where Kurasawa teaches wherein the feature predictor calculates the feature weight value, based on the masking data and the interpolation data (“a model learning unit that learns a weight vector of each layer of a model with a difference being taken as an error, and stores the weight vector as a model parameter in a storage unit, the model being a model of a neural network including an input layer, an output layer, and an intermediate layer, a matrix obtained by combining the evenly spaced time-series-data group including omissions and the omission information group indicating presence or absence of omissions being input to the input layer, a matrix of an evenly spaced time-series-data group of an input time-series data length being output from the output layer, the received feature extraction size being the intermediate layer, and the difference being a difference between an element not missing in a matrix of the evenly spaced time-series-data group including omissions and an element of an output result of the output layer;” [¶0014]), and 
wherein the time series predictor calculates the time series weight value, based on the first result and the interval data (“a feature extraction unit that receives time-series data of a feature extraction target, calculates a value of the intermediate layer of the model with use of the model parameter stored in the storage unit by inputting the received time-series data of the feature extraction target into the model” [¶0014]).

Regarding claim 17, Kurasawa teaches The time series data processing device of claim 9, wherein the predictor includes: 
a feature predictor configured to calculate the feature weight value, based on the masking data and the interpolation data (“a model learning unit that learns a weight vector of each layer of a model with a difference being taken as an error, and stores the weight vector as a model parameter in a storage unit, the model being a model of a neural network including an input layer, an output layer, and an intermediate layer, a matrix obtained by combining the evenly spaced time-series-data group including omissions and the omission information group indicating presence or absence of omissions being input to the input layer, a matrix of an evenly spaced time-series-data group of an input time-series data length being output from the output layer, the received feature extraction size being the intermediate layer, and the difference being a difference between an element not missing in a matrix of the evenly spaced time-series-data group including omissions and an element of an output result of the output layer;” [¶0014]); 
a time series predictor configured to calculate the time series weight value, based on the interval data the interpolation data (“a feature extraction unit that receives time-series data of a feature extraction target, calculates a value of the intermediate layer of the model with use of the model parameter stored in the storage unit by inputting the received time-series data of the feature extraction target into the model” [¶0014]); 
an integrated weight value applicator configured to generate an integrated result corresponding to the interpolation data, based on the feature weight value and the time series weight value (“The feature extraction unit 16 performs processing into the evenly spaced time-series-data including omissions, and the omission information indicating the presence or absence of omissions (Step S203). FIG. 9 shows an example of the processed time-series data of the feature extraction target. The evenly spaced time-series-data is referred to as P′, the omission information indicating the presence or absence of omission is referred to as Q′, and the information that combines the evenly spaced time-series-data group and the omission information group representing the presence or absence of omission is referred to as R′.” [¶0052]); and 
a result generator configured to calculate the prediction result corresponding to a prediction time, based on the integrated result (“In the time-series-data feature extraction device, the feature extraction unit may output the value of the intermediate layer together with the time-series data from which the feature has been extracted, and may output information of the difference between the element not missing in the matrix of the evenly spaced time-series-data group including omissions and the element of the output result of the output layer of the model.” [¶0015]).
Brezak teaches using the first neural network (“Feed–forward neural network analyzed in this paper is the most commonly used MLP NN with three layers… By summing the outputs of all hidden layer neurons (including Bias) and belonging weight factors w” [pg. 2, A. Feed-forward Neural Network, ¶1-3; See further: “Forecasting performances of feed-forward and recurrent neural networks (NN) trained with different learning algorithms are analyzed and compared using the Mackey– Glass nonlinear chaotic time series. This system is a known benchmark test whose elements are hard to predict. Multi– layer Perceptron NN was chosen as a feed-forward neural network because it is still the most commonly used network in financial forecasting models.” [Abstract]]) and using the second neural network (“The Dynamic Multi–layer Perceptron Network (DMLP), proposed in [9], was modified and used as the second, recurrent type of network in this study. Unlike the feedforward MLP NN, this type of network is characterized by a dynamic neuron model, the so-called Dynamic Elementary Processor (DEP) which is structured as an Auto Regressive Moving Average (ARMA) filter, and is built into the network hidden layer. Thus, every hidden layer neuron has the ability to process previous values of its own activity together with new input signals.” [pg. 2, B. Recurrent Neural Network, ¶1]).
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and a recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Yao’s teachings to use a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm, as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning; thus, one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]
Yao teaches using the third neural network (“We propose to use a graph of regions to capture this latent semantic, where the edge represents similarity of demand patterns for a pair of regions. Later, regions are encoded into vectors via a graph embedding method and such vectors are used as context features in the model. In the end, a fully connected neural network component is used for prediction.” [pg. 2589, left col, ¶2; See further: “We use the average weekly demand time series as the demand patterns. The average is computed on the training data in the experiment. The graph is fully connected because every two regions can be reached.” [pg. 2591, right col, top para; discloses using time-series data. See Figure 1: “The semantic view first constructs a weighted graph of regions (with weights representing functional similarity). Nodes are encoded into vectors. A fully connected layer is used at the end for jointly training. Finally, a fully connected neural network is used for prediction.”]]).
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and a recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Brezak’s teachings to use a fully connected neural network to generate a prediction result, as taught by Yao. One would have been motivated to make this modification in order to use spatial-temporal correlations to improve on traditional time series prediction methods. [pg. 2588, § Introduction, ¶3, Yao]

Claims 3, 6, and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Kurasawa in view of Brezak and Yao and further in view of Che et al. ("Recurrent Neural Networks for Multivariate Time Series with Missing Values", hereinafter "Che").


Regarding claim 3, Kurasawa/Brezak/Yao teaches The time series data processing device of claim 2, where Kurasawa teaches wherein the feature learner includes: 
a feature weight value calculator configured to calculate the feature weight value, based on the first parameter, the first correction data, and the second correction data (“the model learning unit 14 learns a weight vector of each layer where a difference between an element not missing in the matrix of the evenly spaced time-series-data group including omissions, and an element of the output result of the output layer is taken as an error, and generates model parameters.” [¶0031; corresponds to applying a feature weight value to interpolation data]), and 
a feature weight value applicator configured to apply the feature weight value to the interpolation data (“The data processing unit 13 processes the time-series data group for training into an evenly spaced time-series-data group including omissions, and an omission information group indicating the presence or absence of omissions (Step S103).” [¶0036, training implies applying the feature weight value to the interpolation data]).
Brezak teaches using the first neural network (“Feed–forward neural network analyzed in this paper is the most commonly used MLP NN with three layers… By summing the outputs of all hidden layer neurons (including Bias) and belonging weight factors w” [pg. 2, A. Feed-forward Neural Network, ¶1-3; See further: “Forecasting performances of feed-forward and recurrent neural networks (NN) trained with different learning algorithms are analyzed and compared using the Mackey– Glass nonlinear chaotic time series. This system is a known benchmark test whose elements are hard to predict. Multi– layer Perceptron NN was chosen as a feed-forward neural network because it is still the most commonly used network in financial forecasting models.” [Abstract]])
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and a recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Yao’s teachings to use a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm, as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning; thus, one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]
However, Kurasawa/Brezak/Yao fails to explicitly teach a missing value processor configured to generate first correction data of the interpolation data, based on the masking data; and a time processor configured to generate second correction data of the interpolation data, based on the interval data.
Che teaches a missing value processor configured to generate first correction data of the interpolation data, based on the masking data (“In this paper, we develop a novel deep learning model based on GRU, namely GRU-D, to effectively exploit two representations of informative missingness patterns, i.e., masking and time interval. Masking informs the model which inputs are observed (or missing)” [pg. 2, ¶2]); 
a time processor configured to generate second correction data of the interpolation data, based on the interval data (“Masking informs the model which inputs are observed (or missing), while time interval encapsulates the input observation patterns. Our model captures the observations and their dependencies by applying masking and time interval (using a decay term) to the inputs and network states of GRU, and jointly train all model components using back-propagation. Thus, our model not only captures the long-term temporal dependencies of time series observations but also utilizes the missing patterns to improve the prediction results.” [pg. 2, ¶2]);
Kurasawa, Brezak, Yao, and Che are all in the same field of endeavor of training prediction models with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and a recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. Che discloses using an RNN for time-series data with missing values. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the feature extraction method of Kurasawa and the neural networks of Brezak and Yao by informing the prediction model of which inputs are observed and which are missing, as taught by Che. One would have been motivated to make this modification in order to improve the prediction results of the model. [pg. 2, ¶2, Che]

Regarding claim 6, Kurasawa/Brezak/Yao teaches The time series data processing device of claim 5, where Kurasawa teaches wherein the feature learner includes: 
a feature weight value calculator configured to calculate the feature weight value, based on the first parameter and the correction data (“the model learning unit 14 learns a weight vector of each layer where a difference between an element not missing in the matrix of the evenly spaced time-series-data group including omissions, and an element of the output result of the output layer is taken as an error, and generates model parameters.” [¶0031; corresponds to applying a feature weight value to interpolation data]); and 
a feature weight value applicator configured to apply the feature weight value to the interpolation data (“The data processing unit 13 processes the time-series data group for training into an evenly spaced time-series-data group including omissions, and an omission information group indicating the presence or absence of omissions (Step S103).” [¶0036, training implies applying the feature weight value to the interpolation data]).
Brezak teaches using the first neural network (“Feed–forward neural network analyzed in this paper is the most commonly used MLP NN with three layers… By summing the outputs of all hidden layer neurons (including Bias) and belonging weight factors w” [pg. 2, A. Feed-forward Neural Network, ¶1-3; See further: “Forecasting performances of feed-forward and recurrent neural networks (NN) trained with different learning algorithms are analyzed and compared using the Mackey– Glass nonlinear chaotic time series. This system is a known benchmark test whose elements are hard to predict. Multi– layer Perceptron NN was chosen as a feed-forward neural network because it is still the most commonly used network in financial forecasting models.” [Abstract]])
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and a recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Yao’s teachings to use a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm, as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning; thus, one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]
However, Kurasawa/Brezak/Yao fails to explicitly teach a missing value processor configured to generate correction data of the interpolation data, based on the masking data.
Che teaches a missing value processor configured to generate correction data of the interpolation data, based on the masking data (“In this paper, we develop a novel deep learning model based on GRU, namely GRU-D, to effectively exploit two representations of informative missingness patterns, i.e., masking and time interval. Masking informs the model which inputs are observed (or missing)” [pg. 2, ¶2]);
Kurasawa, Brezak, Yao, and Che are all in the same field of endeavor of training prediction models with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and a recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. Che discloses using an RNN for time-series data with missing values. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the feature extraction method of Kurasawa and the neural networks of Brezak and Yao by informing the prediction model of which inputs are observed and which are missing, as taught by Che. One would have been motivated to make this modification in order to improve the prediction results of the model. [pg. 2, ¶2, Che]

Regarding claim 7, Kurasawa/Brezak/Yao teaches The time series data processing device of claim 5, where Kurasawa teaches wherein the time series learner includes: 
a time series weight value calculator configured to calculate the time series weight value, based on the second parameter and the correction data (“a feature extraction unit that receives time-series data of a feature extraction target, calculates a value of the intermediate layer of the model with use of the model parameter stored in the storage unit by inputting the received time-series data of the feature extraction target into the model” [¶0014]); and 
a time series weight value applicator configured to apply the time series weight value to the first learning result (“Next, the model learning unit 14 learns the weight vector of each layer that constitutes the model so as to minimize the error (Step S105). Specifically, the evenly spaced time-series-data is referred to as P, the omission information is referred to as Q, and the data that combines the evenly spaced time-series-data group and the omission information group representing the presence or absence of omission is referred to as R.” [¶0045]).
Brezak teaches using the second neural network (“The Dynamic Multi–layer Perceptron Network (DMLP), proposed in [9], was modified and used as the second, recurrent type of network in this study. Unlike the feedforward MLP NN, this type of network is characterized by a dynamic neuron model, the so-called Dynamic Elementary Processor (DEP) which is structured as an Auto Regressive Moving Average (ARMA) filter, and is built into the network hidden layer. Thus, every hidden layer neuron has the ability to process previous values of its own activity together with new input signals.” [pg. 2, B. Recurrent Neural Network, ¶1])
Kurasawa, Brezak, and Yao are all in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and a recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s/Yao’s teachings to use a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm, as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning; thus, one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak]
However, Kurasawa/Brezak/Yao fails to explicitly teach a time processor configured to generate correction data of the first learning result, based on the interval data.
Che teaches a time processor configured to generate correction data of the first learning result, based on the interval data (“Masking informs the model which inputs are observed (or missing), while time interval encapsulates the input observation patterns. Our model captures the observations and their dependencies by applying masking and time interval (using a decay term) to the inputs and network states of GRU, and jointly train all model components using back-propagation. Thus, our model not only captures the long-term temporal dependencies of time series observations but also utilizes the missing patterns to improve the prediction results.” [pg. 2, ¶2]); 
Kurasawa, Brezak, Yao, and Che are all in the same field of endeavor of training prediction models with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and a recurrent neural network in time-series forecasting. Yao discloses a time-series prediction method using a deep multi-view spatial temporal network. Che discloses using an RNN for time-series data with missing values. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the feature extraction method of Kurasawa and the neural networks of Brezak and Yao by informing the prediction model of which inputs are observed and which are missing, as taught by Che. One would have been motivated to make this modification in order to improve the prediction results of the model. [pg. 2, ¶2, Che]

Claims 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Kurasawa in view of Brezak.

Regarding claim 18, Kurasawa teaches A method of operating a time series data processing device, the method comprising: 
generating interpolation data by adding an interpolation value to a missing value of time series data (“In a second method, an omission estimation processing (interpolation or extrapolation) is performed so that the time interval is constant, and then features expressing changes over time are extracted.” [¶0006; See also: “In the above example of unevenly spaced time-series data, when discrete Fourier transform is performed after the interpolation processing by a linear function, three records, namely (13 o'clock, 26 degrees), (15 o'clock, 25 degrees), and (16 o'clock, 22 degrees) are added by means of interpolation of the omission estimation processing” [¶0008]]); 
generating interval data, based on a time interval of the time series data (“a data processing unit that processes the received unevenly spaced time-series-data group into an evenly spaced time-series-data group and an omission information group based on the received input time-series data length and the received minimum observation interval” [¶0014]); 
generating masking data, based on the missing value (“Next, the model learning unit 14 initializes the model parameters (Step S104). A random value is assigned to the weight parameter A.sub.i of the model parameter and the bias parameter B.sub.i (i=1, 2, 3, 4, 5). Furthermore, “0” is assigned to the omission value of the evenly spaced time-series-data. In the present embodiment, “0” is assigned to the omission value, but it is not limited to this example. An average value, a median value, or an omission processing result may be assigned to the omission value.” [¶0044; assigning “0” to omission value corresponds to generating “masking data”]); 
generating a feature weight value depending on a time and a feature of the time series data (“For the model of a neural network in which the feature extraction size received by the model learning unit 12 is an intermediate layer, the model learning unit 14 learns a weight vector of each layer where a difference between an element not missing in the matrix of the evenly spaced time-series-data group including omissions, and an element of the output result of the output layer is taken as an error, and generates model parameters.” [¶0031; See [¶0030] “The model design receiving unit 12 receives (receives an input of) a time-series data length, an observation minimum interval, and a feature extraction size.” corresponds to a time and feature of a time series.]), based on the interpolation data (¶0006; method is based on interpolation), the interval data (“a model design receiving unit that receives an input time-series data length, an observation minimum interval” [¶0014]), and the masking data (¶0044; discloses masking data); 
generating a first result, based on the feature weight value (“a model learning unit that learns a weight vector of each layer of a model with a difference being taken as an error, and stores the weight vector as a model parameter in a storage unit, the model being a model of a neural network including an input layer, an output layer, and an intermediate layer” [¶0014; output layer from model learning unit outputs a “first result”]); 
generating a time series weight value depending on a time flow of the time series data, based on the first result (“The feature extraction unit 16 receives the time-series data of the feature extraction target and takes the received time-series data as an input of the model. The feature extraction unit 16 calculates the value of the intermediate layer of the model using the stored model parameters, and outputs the feature representing the temporal changes of the data.” [¶0032; temporal changes correspond to a dependence on a time flow.]); and 
generating a second result, based on the time series weight value (“a feature extraction unit that receives time-series data of a feature extraction target, calculates a value of the intermediate layer of the model with use of the model parameter stored in the storage unit by inputting the received time-series data of the feature extraction target into the model, and outputs the calculated value of the intermediate layer as a feature that represents temporal changes in data.” [¶0014; feature extraction unit outputs a “second result”]).
Although Kurasawa teaches using a neural network, the reference fails to explicitly teach using a first neural network, wherein first neural network comprises a feed-forward neural network, and using a second neural network wherein the second neural network comprises a recurrent neural network.
	Brezak teaches using a first neural network, wherein first neural network comprises a feed-forward neural network (“Feed–forward neural network analyzed in this paper is the most commonly used MLP NN with three layers… By summing the outputs of all hidden layer neurons (including Bias) and belonging weight factors w” [pg. 2, A. Feed-forward Neural Network, ¶1-3; See further: “Forecasting performances of feed-forward and recurrent neural networks (NN) trained with different learning algorithms are analyzed and compared using the Mackey– Glass nonlinear chaotic time series. This system is a known benchmark test whose elements are hard to predict. Multi– layer Perceptron NN was chosen as a feed-forward neural network because it is still the most commonly used network in financial forecasting models.” [Abstract]]), 
using a second neural network wherein the second neural network comprises a recurrent neural network (“The Dynamic Multi–layer Perceptron Network (DMLP), proposed in [9], was modified and used as the second, recurrent type of network in this study. Unlike the feedforward MLP NN, this type of network is characterized by a dynamic neuron model, the so-called Dynamic Elementary Processor (DEP) which is structured as an Auto Regressive Moving Average (ARMA) filter, and is built into the network hidden layer. Thus, every hidden layer neuron has the ability to process previous values of its own activity together with new input signals.” [pg. 2, B. Recurrent Neural Network, ¶1]).
Kurasawa and Brezak are both in the same field of endeavor of training neural networks with time-series data. Kurasawa discloses a feature extraction method using time-series data. Brezak discloses a method of using a feed-forward neural network and a recurrent neural network in time-series forecasting. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Kurasawa’s teachings to use a feed-forward neural network and a recurrent neural network in a time-series forecasting learning algorithm as taught by Brezak. Feed-forward neural networks and recurrent neural networks are well-known types of neural networks in machine learning; thus, one would have been motivated to make this modification in order to improve the prediction of future data by analyzing past time-series data. [pg. 1, § 1 Introduction, ¶1-4, Brezak] 
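For illustration only, the three preprocessing quantities recited in claim 18 (interpolation data, interval data, and masking data) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application, Kurasawa, or Brezak: the function names, the use of None for a missing value, linear interpolation as the fill method, and the definition of interval data as the gap between consecutive timestamps are all hypothetical choices.

```python
def linear_interpolate(times, values):
    """Fill each missing value (None) from its nearest observed neighbors."""
    observed = [i for i, v in enumerate(values) if v is not None]
    if not observed:
        raise ValueError("at least one observed value is required")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in observed if j < i), default=None)
        right = min((j for j in observed if j > i), default=None)
        if left is None:
            filled[i] = values[right]   # no earlier observation: copy the next one
        elif right is None:
            filled[i] = values[left]    # no later observation: copy the previous one
        else:
            frac = (times[i] - times[left]) / (times[right] - times[left])
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled

def preprocess(times, values):
    """Return (interpolation data, interval data, masking data) for one series."""
    # Masking data: 1 where a value was observed, 0 where it was missing.
    masking = [0 if v is None else 1 for v in values]
    # Interval data: time elapsed since the previous record (0 for the first).
    interval = [0] + [times[i] - times[i - 1] for i in range(1, len(times))]
    interpolation = linear_interpolate(times, values)
    return interpolation, interval, masking
```

A hypothetical series observed at 12 and 14 o'clock but missing at 13 and 15 o'clock would yield a filled series, a list of gaps between records, and a 0/1 mask that a downstream model could take as three separate inputs.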

Regarding claim 19, Kurasawa/Brezak teaches The method of claim 18, where Kurasawa teaches further comprising: adjusting a parameter for generating the feature weight value or the time series weight value, based on the second result (“Moreover, for the non-omitted value of the evenly spaced time-series-data P, it is aimed to learn so as to have the same value in the output layer value X.sub.6, and an error function is designed. The model parameters are optimized by means of a gradient method so as to minimize the error. Adam is used as the gradient method. The gradient method in the present embodiment is not limited to this. As the gradient method, any method of probabilistic gradient descent such as SGD and AdaDelta may be used.” [¶0046; discloses an optimization algorithm, therefore optimizing model parameters would be equivalent to adjusting parameters.]).
Regarding claim 20, Kurasawa/Brezak teaches The method of claim 18, where Kurasawa teaches further comprising: calculating a prediction result corresponding to a prediction time, based on the second result (“In the time-series-data feature extraction device, the feature extraction unit may output the value of the intermediate layer together with the time-series data from which the feature has been extracted, and may output information of the difference between the element not missing in the matrix of the evenly spaced time-series-data group including omissions and the element of the output result of the output layer of the model.” [¶0015]).

Response to Arguments
Applicant's arguments filed 11/01/2022 have been fully considered but they are not persuasive. 

Regarding the 35 U.S.C. § 101 Rejection:
Applicant appears to assert that the newly amended claims cannot be practically performed in the mind because they recite using a feed-forward neural network and a recurrent neural network to perform the claimed methods. Furthermore, applicant appears to assert that the amended feature of generating a prediction result corresponding to a prediction time for time series data integrates the abstract idea into a practical application. The examiner respectfully disagrees. The claims still recite mental steps that can be practically performed in the human mind (i.e., adding an interpolation value to a missing value, generating masking data, generating feature weight values, etc.). The claims as currently recited appear to merely use the neural networks, including to generate the prediction result, as tools to perform these abstract ideas. There are no further details of how these neural networks are used to improve the functioning of a computer, nor do the claims show an improvement to the training/learning of the algorithm. Therefore, the claims merely generally link these elements to the judicial exception and do not integrate the abstract idea into a practical application or amount to significantly more. 

Regarding the 35 U.S.C. § 102 Rejection:
Applicant’s arguments that the prior art of Kurasawa fails to explicitly disclose “using two neural networks, including a feed-forward neural network and recurrent neural network” have been considered but are moot because these newly amended features are now taught by the newly cited references of Brezak and Yao. Please see the updated prior art rejection above.

Applicant’s arguments with respect to the rejections of the dependent claims have been fully considered but they are not persuasive as they rely upon the allowability of the independent claims.

Conclusion
Applicant's amendment necessitated the new grounds of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL H HOANG whose telephone number is (571)272-8491. The examiner can normally be reached Mon-Fri 8:30AM-4:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/M.H.H./Examiner, Art Unit 2122
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122