DETAILED ACTION
This action is in response to the communications filed on 06/30/2022 in which claims 1, 2, 10, 11, and 19 are amended, claims 7 and 16 are canceled, and claims 1-6, 8-15 and 17-20 are pending.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 10 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Kajino ("A Functional Dynamic Boltzmann Machine") in view of Wang ("Temporal-related convolutional-Restricted-Boltzmann-Machine capable of learning relational order via reinforcement learning") in further view of Gao ("Collision Avoidance Control for Advanced Driver Assistance System Based on Deep Discriminant Model").

In regard to claim 1, Kajino teaches: A computer-implemented method for machine prediction, comprising: Dynamic Boltzmann Machine (DyBM) (Kajino, p.1987 "Dynamic Boltzmann machines (DyBMs) are recently developed generative models of a time series.")
Kajino does not teach, but Wang teaches: forming... a Convolutional Dynamic Boltzmann Machine (C-DyBM) by extending a non-convolutional DyBM with a convolutional operation; … using the convolution operation of the C-DyBM, (Wang, p. 2 "… In case of image processing, we extract features in the image by convoluting it with a 2d kernels [a convolution operation] and then construct a 2d layer of RBM [Restricted Boltzmann Machine]"; extending RBM (a non-convolutional BM) with a convolutional operation)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to apply the convolutional extension to RBM of Wang to the DyBM of Kajino (i.e. replacing RBM with DyBM). Doing so would allow the system to process high dimensional and highly abstract data. (Wang, p. 2 "convolutional Restricted Boltzmann Machine has been used to extract features from high dimensional and highly abstract dataset")

Kajino and Wang do not teach, but Gao teaches: by a hardware processor… (Gao, p. 83 "The experiments were all conducted on a computer with a platform of Intel i7-5930K 12-core processor @3.5GHz, 64GB memory, Nvidia Titan X GPU, Ubuntu 16.04, and TensorFlow framework.")
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the Dynamic Boltzmann machines of Kajino and the Temporal-related convolutional-Restricted-Boltzmann-Machine of Wang on the system of Gao. Doing so would automate those machines and allow them to be run on a computer for performance evaluation. (Gao, p. 82 “to evaluate the proposed DDM for different settings of network architecture and model parameters.”)

generating... a prediction of a future event at time t from a past patch of time-series of observations; and (Gao, p. 80 "a deep discriminant model is proposed whose primary goal is to improve vehicle occupant safety by predicting future vehicle collisions [predicting future event] cased by dangerous lane changes in time to activate ADAS."; p. 81 "each branch of the deep network takes as input a plurality of images or values of each input state/physiological data, corresponding to data collected at each time step over the length of the sequence [a past patch of time-series of observations]. Proceeding through the deep network, data collected periodically over the sequence length yields sufficient input data to generate a prediction output."; p. 81 "The time tc [e.g. future time t] taken by the vehicle M from the starting position to the collision position of the two vehicles can be calculated as tc = (vN - vM(0))/ aM.")
performing... a physical action responsive to the prediction of the future event at time t, (Gao, p. 79 "Autonomous cars need to constantly assess the risk of accidents and generate control commands accordingly with the help of Advanced Driver Assistance System (ADAS)… The algorithms estimate the risk level or collision possibility, and decide whether, when [e.g. future time] and how to conduct the beforehand interventions (e.g. warning or braking) [performing a physical action].")
wherein the physical action comprises avoiding an obstacle by automatically controlling one or more driving related functions of a vehicle, responsive to the prediction of the future event at time t being a collision with the obstacle based on a current trajectory of the vehicle. (Gao, p. 79 "In this paper, a novel Deep Discriminant Model, DDM is proposed for predicting imminent collisions caused by dangerous lane change, which can be utilized as a collision avoidance control strategy for advanced driver assistance system... Autonomous cars need to constantly assess the risk of accidents and generate control commands accordingly with the help of Advanced Driver Assistance System (ADAS)… The algorithms estimate the risk level or collision possibility [prediction], and decide whether, when [e.g. future time] and how to conduct the beforehand interventions (e.g. warning or braking) [controlling driving related functions of a vehicle]."; p. 81 "The time tc [future time] taken by the vehicle M from the starting position to the collision position of the two vehicles [future event at time t] can be calculated as tc = (vN - vM(0))/ aM."; See Fig. 1, M: Host vehicle, N: Other vehicle [an obstacle]; the starting position of the host vehicle to the collision position is a current trajectory of the vehicle.)
[image: media_image1.png]
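The time-to-collision expression quoted from Gao, tc = (vN - vM(0))/ aM, can be illustrated with a minimal sketch. The function name, argument names, and sample values below are assumptions made for illustration and do not appear in Gao.

```python
def time_to_collision(v_n: float, v_m0: float, a_m: float) -> float:
    """Evaluate tc = (vN - vM(0)) / aM as quoted from Gao, p. 81.

    v_n  : speed of the other vehicle N
    v_m0 : initial speed of the host vehicle M
    a_m  : acceleration of the host vehicle M (must be non-zero)
    """
    if a_m == 0:
        raise ValueError("aM must be non-zero")
    return (v_n - v_m0) / a_m


# Hypothetical values: vN = 20 m/s, vM(0) = 10 m/s, aM = 2 m/s^2.
t_c = time_to_collision(20.0, 10.0, 2.0)
print(t_c)  # 5.0
```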

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the C-DyBM of the combination of Kajino and Wang on the system of Gao processing sensor/image data from a vehicle. Doing so would allow the C-DyBM to be used in a practical application for predicting vehicle collisions. (Gao, p. 79 "to process the input image sensor data in both time and spatial domain. The... Experiments in a simulation environment showed that the DDM can learn to predict impending collisions with an accuracy of 80.8%...")

In regard to claim 10, the claim recites substantially the same limitations as claim 1; therefore, the rejection applied to claim 1 also applies to claim 10, and further, Kajino and Wang do not teach, but Gao teaches: A computer program product for machine prediction, the computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform a method comprising: (Gao, p. 83 "The experiments were all conducted on a computer with a platform of Intel i7-5930K 12-core processor @3.5GHz, 64GB memory, Nvidia Titan X GPU, Ubuntu 16.04, and TensorFlow framework.")
The rationale for combining the teachings of Kajino, Wang and Gao is the same as set forth in the rejection of claim 1.

In regard to claim 19, the claim recites substantially the same limitations as claim 1; therefore, the rejection applied to claim 1 also applies to claim 19, and further, Kajino and Wang do not teach, but Gao teaches: A computer processing system for machine prediction, comprising: a memory for storing program code; and a processor for running the program code to (Gao, p. 83 "The experiments were all conducted on a computer with a platform of Intel i7-5930K 12-core processor @3.5GHz, 64GB memory, Nvidia Titan X GPU, Ubuntu 16.04, and TensorFlow framework.")
The rationale for combining the teachings of Kajino, Wang and Gao is the same as set forth in the rejection of claim 1.

Claims 2, 8, 9, 11, 17, 18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Kajino in view of Wang in view of Gao in view of Zhao ("Convolutional neural networks for time series classification") in further view of Wiatowski ("Energy Propagation in Deep Convolutional Neural Networks").

In regard to claims 2 and 11, reference is made to the rejection of claims 1 and 10 respectively, and further, Kajino and Wang do not teach, but Gao teaches: the future event at time t is predicted from past observations X^{[<t]} ≡ {x^{[t−d]}}_{d=t−T}^{t−1} occurring before the time t by using a prediction model f... wherein X^{[<t]} ≡ {x^{[t−d]}}_{d=t−T}^{t−1} is past observations occurring before the time t, (Gao, p. 80 "a deep discriminant model [a prediction model f] is proposed whose primary goal is to improve vehicle occupant safety by predicting future vehicle collisions [predicting future event at time t]"; p. 81 "each branch of the deep network takes as input a plurality of images or values of each input state/physiological data, corresponding to data collected at each time step over the length of the sequence. [a past patch of time-series of observations/X^{[<t]}] Proceeding through the deep network, data collected periodically over the sequence length yields sufficient input data to generate a prediction output... ConvLSTMs cells process the image inputs spatiotemporally, or compute the output of the cell according to the inputs and past states of its local neighbors"; p. 82 "Weighting tensors applied as convolutions... all other multiplications involving tensors in the above equations are convolution operations.")
The rationale for combining the teachings of Kajino, Wang and Gao is the same as set forth in the rejection of claim 1.


[image: media_image2.png]
Kajino, Wang and Gao do not teach, but Zhao teaches: 

with model parameters θ ϵ [
[image: media_image3.png]
(Zhao, p. 162 "Time series is an important class of temporal data objects, and can be easily obtained by recording a series of observations [past observations, x] chronologically"; p. 164 "wr ϵ  R lxk ... refer to the weights of the rth convolution filter [W/Wk,i,j/a convolutional parameter of a k-th convolutional map]"; p. 164 "the size of filter kxl where k denotes the variate number of the time series in the preceding layer and l denotes the length of filter [across time with τk/Tk/a width of a patch of a k-th convolutional map]"; ωr(i, j) in Eq (3), i=[1, l (length)], j = [1,k (variate number)], [i or j-th attribute / unit / neuron value xi [t−d] and 0 respectively]; because Wk,i,j in the claim forms a tensor and corresponds to a patch in CNNs, therefore ωr(i, j) teaches Wk,i,j, where wr(i, j) is the weight between the i-th unit/neuron at time t-d (or t- τk) and j-th unit/neuron at time 0; p. 164 "b(r) refer to the... bias of the rth convolution filter [b/bk]"; p. 163 "Input layer has N x k neurons, where k denotes the variate number of input time series and N denotes the length of each univariate series [d = [t-T, t-1], T is a length of the past observations].; p. 164 "where x ϵ R N x k denotes the input time series... [T]"; "Cr(t) refers to the tth component of the rth feature map"; Cr(t) is the output of function; "an activation function f, and the commonly used example is the sigmoid function [h]")

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the convolutional operations of the combination of Kajino, Wang and Gao to include the convolution and pooling operations of Zhao. Doing so would extract the suitable internal structure to generate deep features of the input time series automatically. (Zhao, p. 162 "CNN can discover and extract the suitable internal structure to generate deep features of the input time series automatically by using convolution and pooling operations.")
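The convolution over a multivariate time series described in the Zhao citations (a filter of weights wr with l rows and k columns, a bias b(r), and an activation f such as the sigmoid) can be sketched as follows. This is only an illustration under stated assumptions: stride 1, no padding, and zero-based indexing are choices made here, the function names are hypothetical, and the code is not a reproduction of Zhao's Eq. (3).

```python
import math


def sigmoid(z: float) -> float:
    """Sigmoid activation, the commonly used example of f named in Zhao."""
    return 1.0 / (1.0 + math.exp(-z))


def feature_map(x, w, b):
    """Compute one feature map C(t) of a multivariate time series.

    x : list of N rows, each with k values (length-N series, k variates)
    w : list of l rows, each with k weights (one convolution filter)
    b : scalar bias of the filter
    Assumes stride 1 and no padding, so N - l + 1 outputs are produced.
    """
    n, l, k = len(x), len(w), len(w[0])
    out = []
    for t in range(n - l + 1):
        z = b
        for i in range(l):          # position along time within the filter
            for j in range(k):      # variate index
                z += x[t + i][j] * w[i][j]
        out.append(sigmoid(z))
    return out


# Hypothetical data: N = 4 time steps, k = 2 variates, filter length l = 2.
x = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = [[1.0, -1.0], [0.5, 0.5]]
c = feature_map(x, w, 0.0)
print(len(c))  # 3
```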


[image: media_image4.png]
Kajino, Wang, Gao and Zhao do not teach, but Wiatowski teaches: where Wk,i,j [d,τk] in the k-th convolutional map is selected from one of … U, V... are the model parameters for K number of convolutional maps, λ and μ are decay rates, (Wiatowski, p. 3 "Un[λn]f=∣∣f∗gλn∣∣. (3) We extend (3) to paths on index sets q=(λ1,λ2,…,λn) ∈ Λ1×Λ2×…×Λn =:Λn,n∈N, according to U[q]f = U[(λ1,λ2,…,λn)]f := Un[λn]⋯U2[λ2]U1[λ1]f, (4)… The signals U[q]f , qϵ Λ^n , associated with the n -th network layer, are often referred to as feature maps [k-th convolutional maps] in the deep learning literature."; p. 7 "The next result shows that, under additional structural assumptions on the filters {gλn} λn ϵ Λ, the guaranteed energy decay rate can be improved from polynomial to exponential."; p. 1 "For broad families of wavelets and Weyl-Heisenberg filters, the guaranteed energy decay rate is shown to be exponential in the network depth, i.e., the decay is at least of order O(a^−N) with the decay factor given as a=5/3 in the wavelet case and a=3/2 in the Weyl-Heisenberg case.")
(see Eq. (3): U_n is the model parameter of the k-th convolutional map, and q ϵ Λ^n corresponds to the decay rates. If t-1, t-2, ..., t-d correspond to the 1st, 2nd, ..., d-th filter maps, then q ϵ Λ^n corresponds to λ^d and μ^d.)

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the convolutional operation of the combination of Kajino, Wang, Gao and Zhao to include the energy conservation of CNN of Wiatowski. Doing so would reduce the computational complexity posed by a large number of convolutions. (Wiatowski, p. 1 "Many practical machine learning tasks employ very deep convolutional neural networks. Such large depths pose formidable computational challenges in training and operating the network. It is therefore important to understand how fast the energy contained in the propagated signals (a.k.a. feature maps) decays across layers...")

In regard to claims 8, 17 and 20, reference is made to the rejection of claims 1, 10 and 19 respectively, and further, Kajino, Wang, Gao and Zhao do not teach, but Wiatowski teaches: the convolutional operation with which the DyBM is extended as a one-dimensional convolutional operation. (Wiatowski, p. 3 "Un[λn]f=∣∣f∗gλn∣∣. (3)"; the operator * denotes a one-dimensional convolution, as taught in other literature, e.g. Katsuki on p. 452)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the convolutional operation of the combination of Kajino, Wang, Gao and Zhao to include the energy conservation of CNN of Wiatowski. Doing so would reduce the computational complexity posed by a large number of convolutions.
The rationale for combining the teachings of Kajino, Wang, Gao, Zhao and Wiatowski is the same as set forth in the rejection of claim 2.

In regard to claims 9 and 18, reference is made to the rejection of claims 8 and 17 respectively, and further, Kajino teaches: the one-dimensional convolutional operation extends a summation of an eligibility trace in the C-DyBM. (Kajino, p. 1987 "DyBMs are recently emerging generative models of a binary/real-valued multi-dimensional time series. One of their essential characteristics is a recursively updatable memory unit summarizing all the past data, which is dubbed as an eligibility trace.")
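The notion of a "recursively updatable memory unit summarizing all the past data" quoted from Kajino can be illustrated with a generic exponentially decayed running sum. The update rule, the decay value, and the function names below are assumptions chosen for illustration; the exact indexing and decay convention of the eligibility trace in Kajino may differ.

```python
def update_trace(trace: float, x: float, decay: float) -> float:
    """One recursive update: decay the old summary, then add the new sample."""
    return decay * trace + x


def trace_by_sum(xs, decay: float) -> float:
    """Same quantity computed as an explicit decayed sum over all past data."""
    n = len(xs)
    return sum(decay ** (n - 1 - s) * xs[s] for s in range(n))


# Hypothetical observations (oldest first) and an assumed decay rate of 0.5.
xs = [1.0, 0.0, 2.0, 1.0]
decay = 0.5

trace = 0.0
for x in xs:
    trace = update_trace(trace, x, decay)

print(trace)  # 2.125
```

The recursive form needs only the previous summary and the newest sample, which is why the full history does not have to be stored.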

Claims 3, 5, 6, 12, 14 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Kajino in view of Wang in view of Gao in view of Zhao in further view of Datar ("The sliding-window computation model and results").

In regard to claims 3 and 12, reference is made to the rejection of claims 1 and 10 respectively, and further, Kajino, Wang and Gao do not teach, but Zhao teaches: dynamically down-sampling any of observations and latent representations using a down-sampling function that takes a maximum value over sub-temporal regions (Zhao, "the function g represents the pooling strategy, the most popular used is averaging or max pooling [a maximum value over sub-temporal regions]. It is obvious that pooling operation achieves reducing the point data, while not changing the number of feature maps"; "The advantage of pooling operation is down-sampling the convolutional output bands, thus reducing variability in the hidden activations...")
The rationale for combining the teachings of Kajino, Wang, Gao and Zhao is the same as set forth in the rejection of claim 2.
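Max pooling as a down-sampling step over sub-temporal regions, as described in the Zhao citation, can be sketched as below. Fixed-width, non-overlapping regions and the function name are assumptions made for this illustration only.

```python
def max_pool(series, width: int):
    """Take the maximum over consecutive, non-overlapping regions.

    The last region may be shorter than `width` when the series length
    is not a multiple of `width`.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    return [max(series[i:i + width]) for i in range(0, len(series), width)]


# Hypothetical feature-map values pooled over regions of 3 time steps.
pooled = max_pool([1, 5, 2, 4, 3, 0, 7], 3)
print(pooled)  # [5, 4, 7]
```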


[image: media_image5.png]

Kajino, Wang, Gao and Zhao do not teach, but Datar teaches: along with increasing a window size exponentially, the down-sampling function being [image: media_image6.png], wherein l0 is an initial size of the window, and l is a growth rate of the window. (Datar, p. 160 "Typically, this goal is achieved by maintaining that the bucket sizes grow exponentially from right to left (new to old) [increasing a window size exponentially] and hence the name Exponential Histograms (EH)."; p. 154 "Let Cj = 2^r (r > 0) be the size of the j-th bucket"; B1 is an initial window; 2 is the example of a growth rate.)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the sequence data of the combination of Kajino, Wang, Gao and Zhao into several time windows with exponentially increasing sizes, as taught by Datar. Doing so would allow the system to process data with the weights decaying as the observations get older. (Datar, p. 149 "in certain data-stream processing applications, recent data is more useful and pertinent than older data. In such cases, we would like to answer questions about the data only over the last N most recent data elements…")
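Windows whose sizes grow exponentially from new to old, as in the Datar citation, can be combined with a per-window maximum as sketched below. The window sizes l0, l0*l, l0*l^2, ... are an assumption made for illustration; this is neither the claimed down-sampling function (which appears only as an image in this action) nor Datar's Exponential Histogram algorithm, and the function names are hypothetical.

```python
def exponential_windows(series, l0: int, growth: int):
    """Split `series` (oldest first, newest last) into windows whose sizes
    grow from newest to oldest as l0, l0*growth, l0*growth**2, ...

    The oldest window takes whatever samples remain.
    Returns the windows ordered newest first.
    """
    if l0 < 1 or growth < 1:
        raise ValueError("l0 and growth must be at least 1")
    windows = []
    end = len(series)
    size = l0
    while end > 0:
        start = max(0, end - size)
        windows.append(series[start:end])
        end = start
        size *= growth
    return windows


def downsample_max(series, l0: int, growth: int):
    """Maximum over each exponentially growing window, newest first."""
    return [max(w) for w in exponential_windows(series, l0, growth)]


# Hypothetical data with initial window size 1 and growth rate 2.
data = [9, 1, 2, 8, 3, 4, 5, 6, 7]
wins = exponential_windows(data, 1, 2)
print(wins)                       # [[7], [5, 6], [2, 8, 3, 4], [9, 1]]
print(downsample_max(data, 1, 2))  # [7, 6, 8, 9]
```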

In regard to claims 5 and 14, reference is made to the rejection of claims 3 and 12 respectively, and further, Kajino, Wang, Gao and Zhao do not teach, but Datar teaches: the sub-temporal regions are of varying lengths. (Datar, see Fig. 8.2, B1, B2… Bm are of varying lengths. The sizes/lengths of those windows increase exponentially.)
The rationale for combining the teachings of Kajino, Wang, Gao, Zhao and Datar is the same as set forth in the rejection of claim 3.

In regard to claims 6 and 15, reference is made to the rejection of claims 3 and 12 respectively, and further, Kajino, Wang, Gao and Zhao do not teach, but Datar teaches: the window size for the earliest sub-temporal region is infinity. (Datar, because the sizes of the buckets increase exponentially, and in mathematics, for f(x) = a^x, when a > 1 the value of the exponential function increases rapidly towards infinity for positive x values, therefore the earliest Bm is infinity.)
The rationale for combining the teachings of Kajino, Wang, Gao, Zhao and Datar is the same as set forth in the rejection of claim 3.

Claims 4 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Kajino in view of Wang in view of Gao in view of Zhao in view of Datar in further view of Dhargalkar ("Determining Missing Values in Dimension Incomplete Databases using Spatial-Temporal Correlation Techniques").

In regard to claims 4 and 13, reference is made to the rejection of claims 3 and 12 respectively, and further, Kajino, Wang, Gao, Zhao and Datar do not teach, but Dhargalkar teaches: missing sequence values are ignored by the dynamically down-sampling. (Dhargalkar, p. 603 "The data acquired by the sensor node Ni can be looked as a time series Si=(<yil,Tl>,…,<yin,Tn>) [sequence values], where yik is the sensor data of Ni at time Tk... B. Max-Window Association Rule Mining (Max-Warm) Following are the steps to be followed:- Step1: Determine the missing value 'misVal' using existing WARM method... a. Determine unique values reported by a missing sensor node and their support within a given window size. b. Set 'misVal' equal to value having maximum (higher) support. [missing values are ignored by replacing them with the max value within the window]"; spec. [0041] the missing values by ignoring them in this max operation)

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the sequence data of the combination of Kajino, Wang, Gao, Zhao and Datar to replace missing data with calculated values, as taught by Dhargalkar. Doing so would avoid inaccurate results due to the missing data. (Dhargalkar, p. 601 "The advantage of the proposed approach is that the result of the user query will always have complete and accurate data... Imputation is the process of calculating most probable values and replacing missing data with the calculated values. Since analyzing missing data could lead to inaccurate results, imputation is seen as a way to avoid pitfalls...")
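Steps (a) and (b) quoted from Dhargalkar (count the support of each unique value within a window, then set 'misVal' to the value with maximum support) can be sketched in simplified form. The sketch omits the association-rule mining of the WARM method entirely, represents a missing reading as None, and uses a hypothetical function name; all of these are assumptions for illustration.

```python
from collections import Counter


def impute_by_max_support(window):
    """Replace missing readings (None) with the most frequent observed value.

    Support is taken here as a simple occurrence count within the window.
    If nothing was observed, the window is returned unchanged.
    """
    observed = [v for v in window if v is not None]
    if not observed:
        return list(window)
    mis_val = Counter(observed).most_common(1)[0][0]
    return [mis_val if v is None else v for v in window]


# Hypothetical sensor window with two missing readings.
filled = impute_by_max_support([3, None, 3, 5, None, 3])
print(filled)  # [3, 3, 3, 5, 3, 3]
```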

Response to Arguments
Applicant's amendments with respect to rejection of claims under 35 U.S.C. 112(b) have been fully considered and are sufficient to overcome the rejection. The rejection of the claims under 35 U.S.C. 112(b) has been withdrawn.

Applicant's amendments with respect to rejection of claims under 35 U.S.C. 101 have been fully considered and are sufficient to overcome the rejection. The rejection of the claims under 35 U.S.C. 101 has been withdrawn.

Applicant's arguments filed with respect to the rejection of the claims under 35 U.S.C. 103 have been fully considered but they are not persuasive:

Applicant argues: (see p. 10 bottom): (1) “The preceding cited paragraph of Wang initially mentions a convolutional Restricted Boltzmann machine, and then goes on to state constructing a 2nd layer of the RBM, where it is clear that the reference simply did not mention the word convolutional before RBM but implied it with the initial reference to convolutional Restricted Boltzmann machine. As further disclosed in the Abstract of Wang: "we extend the conventional convolutional-Restricted-Boltzmann-Machine to learn highly abstract features among arbitrary number of time related input maps by constructing a layer of multiplicative units, which capture the relations among inputs.” and (2) “Moreover, a review of reference [1] cited above discloses the following: ‘In order to learn high-level representations, we stack CRBMs into a multilayer architecture analogous to DBNs.’ FIG. 1 of reference [1] also shows an extended CRBM and not an extended RBM.”
Examiner answers: (1) The examiner does not cite “we extend the conventional convolutional-Restricted-Boltzmann-Machine…” or “Recently, convolutional Restricted Boltzmann Machine has been used to extract features…”; instead, the examiner only cites “… we extract features in the image by convoluting it with a 2d kernels [a convolution operation] and then construct a 2d layer of RBM [a non-convolutional BM]…”, which teaches the claim limitation “extending a non-convolutional BM with a convolutional operation.” The citation explains how a convolutional BM is established; it does not mean extending a convolutional Boltzmann Machine with another convolutional operation.
(2) Regarding reference [1], the examiner’s answer is similar to that for Wang; see Lee (reference [1]), sections 3.1 and 3.2: “We use ∗ to denote convolution [a convolution operation]” and W_k∗v in the equation on p. 611, left column, which teach using a convolution operation in the RBM to construct a CRBM. This explains how a CRBM is established by adding a convolutional operation to an RBM; it does not mean extending a CRBM with another convolutional operation.
Also see support from related documents: (Wang, abstract “… the steps of adding a convolutional operation on the basis of a restricted Boltzmann machine to obtain a model structure of a convolutional restricted Boltzmann machine…”)

Applicant argues: (see p. 13 middle): “Regarding claim 2… It seems like the case will go to appeal on at least these grounds. The Examiner is respectfully requested to reconsider the rejection of claims 2 and 11 and see how seemingly ridiculous it would be for one of ordinary skill in the art to be able to piece together that specific equation and the supporting equations from the alleged fragments allegedly shown in the cited references.” 
Examiner answers: Please provide specific details or analyses on why the cited prior art references could not be combined or why a person of ordinary skill in the art would not be motivated to perform the combination.

Conclusion
The art made of record and not relied upon is considered pertinent to applicant's disclosure.
Wang (CN 107330908 A) teaches convolutional restricted Boltzmann machine (CRBM).

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SU-TING CHUANG whose telephone number is (408)918-7519.  The examiner can normally be reached on Monday - Thursday 8-5 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on (571)272-3719.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/S.C./Examiner, Art Unit 2122                  
            
/BRIAN M SMITH/Primary Examiner, Art Unit 2122