Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-16 are rejected under 35 U.S.C. 103 as being unpatentable over Dasgupta (US20190042943A1) in view of Wang (Gaze latent support vector machine for image classification improved by weakly supervised region selection).

Regarding claim 1, Dasgupta teaches A method for a learning apparatus to learn an artificial neural network, the method comprising: obtaining first output data through a first artificial neural network for future data prediction based on an input time series data set

[Image: media_image1.png]

([0063] FIG. 5 shows a diagram of cooperative neural networks performing deep reinforcement learning with partial input assistance, according to some embodiments of the present disclosure. Cooperative neural networks for partial input assistance can include a first neural network 520 and a second neural network 530. First neural network 520 can receive at least some of observation values 521 and output processed values. The examiner notes that Dasgupta teaches [0076] the use of observation values corresponding to time frames as input data to the first and second neural networks. The examiner interprets those time frame observation values to be the timeseries inputs claimed).
obtaining second output data through a second artificial neural network for past data reconstruction using the first output data of the first artificial neural network ([0063] Second neural network 530 can receive action values 531A, the processed values output from first neural network 520 that correspond to the input observation values 521, and any remaining observation values 521 that were not input to first neural network 520, such as observation values representing a reward, and output action values 531A, each sequentially output action values 531A correspond to the next time frame following the time frame of the input values. The examiner notes that the claim does not define past data reconstruction. The examiner considers the second neural network output action values to be the reconstructed values of the input observations).
However, Dasgupta fails to explicitly teach calculating a cost function using the first output data of the first artificial neural network and the second output data of the second artificial neural network. Dasgupta also fails to explicitly teach learning the first artificial neural network using the cost function.
On the other hand, Wang teaches calculating a cost function using the first output data of the first artificial neural network and the second output data of the second artificial neural network ([Page 61, Para. 7] This model generalizes latent SVM [15] by biasing the selection of latent regions based on the gaze information during the training scheme. The training objective of G + LSVM is as follows:

[Image: media_image2.png]

The examiner notes that Wang teaches a combined loss function that comprises a weighted sum of a classification hinge loss function and a gaze loss function. The examiner also notes that Dasgupta and Wang are both considered analogous because they are in the same field of computational neural networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dasgupta’s training models to incorporate calculating a cost function using the first output data of the first artificial neural network and the second output data of the second artificial neural network as taught by Wang [Page 61, Para. 7] in order to solve the tasks of both models simultaneously, which leads to better results [Page 62, Para. 1]).
Furthermore, Wang teaches learning the first artificial neural network using the cost function ([Page 60, Para. 4] Our model is then optimized by reducing a loss function incorporating gaze penalization using the Concave-Convex Procedure (CCCP) [20]. The examiner notes that Dasgupta and Wang are both considered analogous because they are in the same field of computational neural networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dasgupta’s training models to incorporate learning the first artificial neural network using the cost function as taught by Wang [Page 60, Para. 4] to minimize the cost of testing the model after training [Page 60, Para. 4]).
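For illustration only, the general form described above (a weighted sum of a classification hinge loss and a second loss term) can be sketched as follows. This sketch does not reproduce Wang's G + LSVM training objective, which is not legible in this record. The function names, the scalar `second_loss`, and the weighting factor `weight` are all hypothetical.

```python
def hinge_loss(label, score):
    """Classification hinge loss for a label in {-1, +1} and a real-valued score."""
    return max(0.0, 1.0 - label * score)


def combined_loss(label, score, second_loss, weight):
    """Weighted sum of a hinge loss and a second, already computed loss term.

    `second_loss` stands in for any additional penalty (assumption);
    `weight` is a hypothetical trade-off factor.
    """
    return hinge_loss(label, score) + weight * second_loss


if __name__ == "__main__":
    # A correctly classified sample inside the margin plus a small penalty.
    print(combined_loss(label=1, score=0.25, second_loss=0.5, weight=0.2))
```

A single scalar results, so one optimization procedure can reduce both terms at once.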

Regarding claim 2, Dasgupta teaches The method of claim 1, wherein the obtaining of the second output data comprises obtaining the second output data by using, as an input of the second artificial neural network, the first output data of the first artificial neural network and a part of observation data which is included in the time series data set and corresponds to data observed before a time point to be predicted ([0063] Second neural network 530 can receive action values 531A, the processed values output from first neural network 520 that correspond to the input observation values 521, and any remaining observation values 521 that were not input to first neural network 520, such as observation values representing a reward, and output action values 531A, each sequentially output action values 531A correspond to the next time frame following the time frame of the input values. The examiner notes that the claim does not define past data reconstruction. The examiner considers the second neural network output action values to be the reconstructed values of the input observations. The examiner also notes that Dasgupta teaches [0076] the use of observation values corresponding to time frames as input data to the first and second neural networks. The examiner interprets those time frame observation values to be the timeseries inputs claimed).

Regarding claim 3, Dasgupta teaches The method of claim 1. However, Dasgupta fails to explicitly teach wherein the calculating of a cost function comprises calculating the cost function based on a direct error between a future data prediction value corresponding to the first output data and an actual future data observation value, and an indirect error between the second output data corresponding to past observation data reconstructed through the future data prediction value and actual past observation data.
On the other hand, Wang teaches wherein the calculating of a cost function comprises calculating the cost function based on a direct error between a future data prediction value corresponding to the first output data and an actual future data observation value, and an indirect error between the second output data corresponding to past observation data reconstructed through the future data prediction value and actual past observation data ([Page 61, Para. 7] This model generalizes latent SVM [15] by biasing the selection of latent regions based on the gaze information during the training scheme. The training objective of G + LSVM is as follows:

[Image: media_image2.png]

The examiner notes that Wang teaches a combined loss function that comprises a weighted sum of a classification hinge loss function and a gaze loss function. The examiner also notes that Dasgupta and Wang are both considered analogous because they are in the same field of computational neural networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dasgupta’s training models to incorporate calculating the cost function based on a direct error between a future data prediction value corresponding to the first output data and an actual future data observation value, and an indirect error between the second output data corresponding to past observation data reconstructed through the future data prediction value and actual past observation data as taught by Wang [Page 61, Para. 7] in order to solve the tasks of both models simultaneously, which leads to better results [Page 62, Para. 1]).
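For illustration only, the cost function recited in claim 3 can be sketched as the sum of a direct error and an indirect error. This is a minimal sketch under stated assumptions and is not taken from the application, Dasgupta, or Wang: the use of mean squared error, the weighting factor `weight`, and all function names are hypothetical.

```python
def mean_squared_error(predicted, actual):
    """Mean squared error between two equal-length sequences of numbers."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def cost(future_pred, future_actual, past_recon, past_actual, weight=1.0):
    """Direct error plus a weighted indirect error (assumed form).

    direct:   predicted future values vs. actually observed future values
    indirect: reconstructed past values vs. actually observed past values
    """
    direct = mean_squared_error(future_pred, future_actual)
    indirect = mean_squared_error(past_recon, past_actual)
    return direct + weight * indirect
```

With a perfect prediction and a perfect reconstruction both terms vanish, so the cost is zero; an error in either term raises the cost.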

Regarding claim 4, Dasgupta teaches updating parameters of the first artificial neural network in a direction to minimize the cost function, and the parameters of the first artificial neural network are changed such that the direct error and the indirect error are lower than a set value ([0035] Updating section 115 can update the parameters of cooperative neural networks, such as first neural network 120 and second neural network 130. For example, updating section 115 can update the plurality of parameters of first neural network 120 using an error based on the approximated action-value function 117 and a reward. Updating section 115 can update the parameters of first neural network 120 based on backpropagation of the gradient of the parameters of the first neural network 120 with respect to the temporal difference error).
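For illustration only, the limitation of updating parameters in a direction that minimizes the cost function until both errors fall below a set value can be sketched with a toy model. This sketch does not represent Dasgupta's temporal difference update. The one-parameter linear "networks", the learning rate, the constant second-network parameter `b`, and the threshold `set_value` are all hypothetical.

```python
def errors(a, b, xs, ys):
    """Direct and indirect mean squared errors for the toy model.

    First network:  y_pred = a * x       (a is the trainable parameter)
    Second network: x_rec  = b * y_pred  (b is held constant here, an assumption)
    """
    n = len(xs)
    direct = sum((a * x - y) ** 2 for x, y in zip(xs, ys)) / n
    indirect = sum((b * a * x - x) ** 2 for x in xs) / n
    return direct, indirect


def train_first_network(xs, ys, b=0.5, set_value=1e-6, lr=0.01, max_steps=1000):
    """Gradient descent on a until both errors are below set_value."""
    a = 0.0
    n = len(xs)
    for _ in range(max_steps):
        direct, indirect = errors(a, b, xs, ys)
        if direct < set_value and indirect < set_value:
            break
        # Gradient of (direct + indirect) with respect to a.
        grad = sum(2 * (a * x - y) * x for x, y in zip(xs, ys)) / n
        grad += sum(2 * (b * a * x - x) * b * x for x in xs) / n
        a -= lr * grad  # step against the gradient to reduce the cost
    return a, errors(a, b, xs, ys)
```

In this toy setting the data satisfy y = 2x and b = 0.5 inverts that mapping, so both errors shrink together as `a` approaches 2.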

Regarding claim 5, Dasgupta teaches The method of claim 4, wherein the learning of the first artificial neural network fixes parameters of the second artificial neural network and updates the parameters of the first artificial neural network ([0035] Updating section 115 can update the parameters of cooperative neural networks, such as first neural network 120 and second neural network 130. The examiner notes that the claim does not define the term fixing parameters. The examiner considers updating parameters to be the fixing parameters claimed).

Regarding claim 6, Dasgupta teaches The method of claim 1, wherein the time series data set includes input data that is past observation data observed during a certain time interval and target data that is actual future observation data, and the input data includes first input data that is target data to be reconstructed and second input data that is to be used in reconstruction ([0026] Obtaining section 101 can receive data from data storage in communication with apparatus 100. For example, obtaining section 101 can be operable to obtain an action and observation sequence, such as action and observation sequence 119. Action and observation sequence 119 can be obtained sequentially as the actions are performed and the observations are observed. For example, obtaining section 101 can be operable to obtain an observation of a subsequent time frame of action and observation sequence 119. Alternatively, obtaining section 101 can be operable to obtain an entire action and observation sequence for a set of time frames, such as a training sequence, complete with actions and observations at each time frame. The examiner notes that the claim does not define past data reconstruction. The examiner considers the second neural network output action values to be the reconstructed values of the input observations. The examiner also notes that Dasgupta teaches [0076] the use of observation values corresponding to time frames as input data to the first and second neural networks. The examiner interprets those time frame observation values to be the timeseries inputs claimed).
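For illustration only, the partition of the time series data set recited in claim 6 can be sketched as follows. This sketch is not taken from the application or from Dasgupta. The function name, the parameter names, and the choice to treat the oldest samples as the first input data are assumptions.

```python
def split_window(series, input_len, reconstruct_len):
    """Split one window of a time series into the three claimed parts.

    Returns (first_input, second_input, target):
      first_input  - past observations to be reconstructed (assumed: the oldest)
      second_input - remaining past observations, used in the reconstruction
      target       - actual future observations that follow the input data
    """
    if not 0 < reconstruct_len < input_len < len(series):
        raise ValueError("need 0 < reconstruct_len < input_len < len(series)")
    input_data = series[:input_len]
    target = series[input_len:]
    first_input = input_data[:reconstruct_len]
    second_input = input_data[reconstruct_len:]
    return first_input, second_input, target
```

For a six-sample window with `input_len=4` and `reconstruct_len=2`, the first two samples are to be reconstructed, the next two assist the reconstruction, and the last two are the prediction targets.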

Regarding claim 7, Dasgupta teaches The method of claim 6, wherein the obtaining of the second output data comprises receiving the first output data of the first artificial neural network and the second input data as input to obtain the second output data of the second artificial neural network ([0063] Second neural network 530 can receive action values 531A, the processed values output from first neural network 520 that correspond to the input observation values 521, and any remaining observation values 521 that were not input to first neural network 520, such as observation values representing a reward, and output action values 531A, each sequentially output action values 531A correspond to the next time frame following the time frame of the input values).

Regarding claim 8, Dasgupta teaches The method of claim 6. However, Dasgupta fails to explicitly teach wherein the calculating of the cost function comprises: calculating a first error between the first output data of the first artificial neural network and the target data of the time series data set; calculating a second error between the second output data of the second artificial neural network and the first input data of the time series data set; and calculating the cost function based on the first error and the second error.
On the other hand, Wang teaches wherein the calculating of the cost function comprises: calculating a first error between the first output data of the first artificial neural network and the target data of the time series data set; calculating a second error between the second output data of the second artificial neural network and the first input data of the time series data set; and calculating the cost function based on the first error and the second error ([Page 61, Para. 7] This model generalizes latent SVM [15] by biasing the selection of latent regions based on the gaze information during the training scheme. The training objective of G + LSVM is as follows:

[Image: media_image2.png]

The examiner notes that Wang teaches a combined loss function that comprises a weighted sum of a classification hinge loss function and a gaze loss function. The examiner also notes that Dasgupta and Wang are both considered analogous because they are in the same field of computational neural networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dasgupta’s training models to incorporate calculating a first error between the first output data of the first artificial neural network and the target data of the time series data set; calculating a second error between the second output data of the second artificial neural network and the first input data of the time series data set; and calculating the cost function based on the first error and the second error as taught by Wang [Page 61, Para. 7] in order to solve the tasks of both models simultaneously, which leads to better results [Page 62, Para. 1]).

Regarding claim 9, Dasgupta teaches The method of claim 7, wherein the learning of the first artificial neural network comprises changing parameters of the first artificial neural network such that the first error and the second error are respectively lower than a corresponding set value ([0035] Updating section 115 can update the parameters of cooperative neural networks, such as first neural network 120 and second neural network 130. For example, updating section 115 can update the plurality of parameters of first neural network 120 using an error based on the approximated action-value function 117 and a reward. Updating section 115 can update the parameters of first neural network 120 based on backpropagation of the gradient of the parameters of the first neural network 120 with respect to the temporal difference error).

Regarding claim 10, Dasgupta teaches An apparatus for learning an artificial neural network, comprising: an input interface device configured to receive a time-series data set; and a processor coupled to the input interface device and configured to learn a first artificial neural network for future data prediction, wherein the processor is configured to obtain first output data through the first artificial neural network based on the time series data set

[Image: media_image1.png]

([0063] FIG. 5 shows a diagram of cooperative neural networks performing deep reinforcement learning with partial input assistance, according to some embodiments of the present disclosure. Cooperative neural networks for partial input assistance can include a first neural network 520 and a second neural network 530. First neural network 520 can receive at least some of observation values 521 and output processed values. The examiner notes that Dasgupta teaches [0076] the use of observation values corresponding to time frames as input data to the first and second neural networks. The examiner interprets those time frame observation values to be the timeseries inputs claimed).
obtain second output data through a second artificial neural network for past data reconstruction using the first output data ([0063] Second neural network 530 can receive action values 531A, the processed values output from first neural network 520 that correspond to the input observation values 521, and any remaining observation values 521 that were not input to first neural network 520, such as observation values representing a reward, and output action values 531A, each sequentially output action values 531A correspond to the next time frame following the time frame of the input values. The examiner notes that the claim does not define past data reconstruction. The examiner considers the second neural network output action values to be the reconstructed values of the input observations).
However, Dasgupta fails to explicitly teach calculate a cost function using the first output data and the second output data. Dasgupta also fails to explicitly teach learn the first artificial neural network using the cost function.
On the other hand, Wang teaches calculate a cost function using the first output data and the second output data ([Page 61, Para. 7] This model generalizes latent SVM [15] by biasing the selection of latent regions based on the gaze information during the training scheme. The training objective of G + LSVM is as follows:

[Image: media_image2.png]

The examiner notes that Wang teaches a combined loss function that comprises a weighted sum of a classification hinge loss function and a gaze loss function. The examiner also notes that Dasgupta and Wang are both considered analogous because they are in the same field of computational neural networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dasgupta’s training models to incorporate calculating a cost function using the first output data and the second output data as taught by Wang [Page 61, Para. 7] in order to solve the tasks of both models simultaneously, which leads to better results [Page 62, Para. 1]).
Furthermore, Wang teaches learn the first artificial neural network using the cost function ([Page 60, Para. 4] Our model is then optimized by reducing a loss function incorporating gaze penalization using the Concave-Convex Procedure (CCCP) [20]. The examiner notes that Dasgupta and Wang are both considered analogous because they are in the same field of computational neural networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dasgupta’s training models to incorporate learning the first artificial neural network using the cost function as taught by Wang [Page 60, Para. 4] to minimize the cost of testing the model after training [Page 60, Para. 4]).

Regarding claim 11, Dasgupta teaches The apparatus of claim 10, wherein the processor is configured to obtain the second output data by using, as an input of the second artificial neural network, the first output data of the first artificial neural network, and a part of observation data which is included in the time series data set and corresponds to data observed before a time point to be predicted ([0063] Second neural network 530 can receive action values 531A, the processed values output from first neural network 520 that correspond to the input observation values 521, and any remaining observation values 521 that were not input to first neural network 520, such as observation values representing a reward, and output action values 531A, each sequentially output action values 531A correspond to the next time frame following the time frame of the input values. The examiner notes that the claim does not define past data reconstruction. The examiner considers the second neural network output action values to be the reconstructed values of the input observations. The examiner also notes that Dasgupta teaches [0076] the use of observation values corresponding to time frames as input data to the first and second neural networks. The examiner interprets those time frame observation values to be the timeseries inputs claimed).

Regarding claim 12, Dasgupta teaches The apparatus of claim 10. However, Dasgupta fails to explicitly teach wherein the processor is specifically configured to calculate the cost function based on a direct error between a future data prediction value corresponding to the first output data and an actual future data observation value, and an indirect error between the second output data corresponding to past observation data reconstructed through the future data prediction value and actual past observation data.
On the other hand, Wang teaches wherein the processor is specifically configured to calculate the cost function based on a direct error between a future data prediction value corresponding to the first output data and an actual future data observation value, and an indirect error between the second output data corresponding to past observation data reconstructed through the future data prediction value and actual past observation data ([Page 61, Para. 7] This model generalizes latent SVM [15] by biasing the selection of latent regions based on the gaze information during the training scheme. The training objective of G + LSVM is as follows:

[Image: media_image2.png]

The examiner notes that Wang teaches a combined loss function that comprises a weighted sum of a classification hinge loss function and a gaze loss function. The examiner also notes that Dasgupta and Wang are both considered analogous because they are in the same field of computational neural networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dasgupta’s training models to incorporate a processor specifically configured to calculate the cost function based on a direct error between a future data prediction value corresponding to the first output data and an actual future data observation value, and an indirect error between the second output data corresponding to past observation data reconstructed through the future data prediction value and actual past observation data as taught by Wang [Page 61, Para. 7] in order to solve the tasks of both models simultaneously, which leads to better results [Page 62, Para. 1]).
Dasgupta teaches update parameters of the first artificial neural network in a direction to minimize the cost function ([0035] Updating section 115 can update the parameters of cooperative neural networks, such as first neural network 120 and second neural network 130. For example, updating section 115 can update the plurality of parameters of first neural network 120 using an error based on the approximated action-value function 117 and a reward. Updating section 115 can update the parameters of first neural network 120 based on backpropagation of the gradient of the parameters of the first neural network 120 with respect to the temporal difference error).

Regarding claim 13, Dasgupta teaches The apparatus of claim 10, wherein the time series data set includes input data that is past observation data observed during a certain time interval and target data that is actual future observation data, and the input data includes first input data that is target data to be reconstructed and second input data that is to be used in reconstruction ([0026] Obtaining section 101 can receive data from data storage in communication with apparatus 100. For example, obtaining section 101 can be operable to obtain an action and observation sequence, such as action and observation sequence 119. Action and observation sequence 119 can be obtained sequentially as the actions are performed and the observations are observed. For example, obtaining section 101 can be operable to obtain an observation of a subsequent time frame of action and observation sequence 119. Alternatively, obtaining section 101 can be operable to obtain an entire action and observation sequence for a set of time frames, such as a training sequence, complete with actions and observations at each time frame. The examiner notes that the claim does not define past data reconstruction. The examiner considers the second neural network output action values to be the reconstructed values of the input observations. The examiner also notes that Dasgupta teaches [0076] the use of observation values corresponding to time frames as input data to the first and second neural networks. The examiner interprets those time frame observation values to be the timeseries inputs claimed).

Regarding claim 14, Dasgupta teaches The apparatus of claim 13, wherein the processor is configured to receive the first output data of the first artificial neural network and the second input data as input to obtain the second output data of the second artificial neural network ([0063] Second neural network 530 can receive action values 531A, the processed values output from first neural network 520 that correspond to the input observation values 521, and any remaining observation values 521 that were not input to first neural network 520, such as observation values representing a reward, and output action values 531A, each sequentially output action values 531A correspond to the next time frame following the time frame of the input values).

Regarding claim 15, Dasgupta teaches The apparatus of claim 14. However, Dasgupta fails to explicitly teach wherein the processor is specifically configured to calculate a first error between the first output data of the first artificial neural network and the target data of the time series data set, to calculate a second error between the second output data of the second artificial neural network and the first input data of the time series data set, and to calculate the cost function based on the first error and the second error.
On the other hand, Wang teaches wherein the processor is specifically configured to calculate a first error between the first output data of the first artificial neural network and the target data of the time series data set, to calculate a second error between the second output data of the second artificial neural network and the first input data of the time series data set, and to calculate the cost function based on the first error and the second error ([Page 61, Para. 7] This model generalizes latent SVM [15] by biasing the selection of latent regions based on the gaze information during the training scheme. The training objective of G + LSVM is as follows:

[Image: media_image2.png]

The examiner notes that Wang teaches a combined loss function that comprises a weighted sum of a classification hinge loss function and a gaze loss function. The examiner also notes that Dasgupta and Wang are both considered analogous because they are in the same field of computational neural networks. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dasgupta’s training models to incorporate a processor specifically configured to calculate a first error between the first output data of the first artificial neural network and the target data of the time series data set, to calculate a second error between the second output data of the second artificial neural network and the first input data of the time series data set, and to calculate the cost function based on the first error and the second error as taught by Wang [Page 61, Para. 7] in order to solve the tasks of both models simultaneously, which leads to better results [Page 62, Para. 1]).

Regarding claim 16, Dasgupta teaches The apparatus of claim 15, wherein the processor is configured to change parameters of the first artificial neural network such that the first error and the second error are respectively lower than a corresponding set value ([0035] Updating section 115 can update the parameters of cooperative neural networks, such as first neural network 120 and second neural network 130. For example, updating section 115 can update the plurality of parameters of first neural network 120 using an error based on the approximated action-value function 117 and a reward. Updating section 115 can update the parameters of first neural network 120 based on backpropagation of the gradient of the parameters of the first neural network 120 with respect to the temporal difference error).


Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Fannes - US10530388 
(Fannes teaches the use of an encoder to decompose a signal, a predictor to approximate the decomposed signal, and a decoder to reconstruct the signal.)
Kato - US20150254554A1
(Kato teaches a time-series data prediction model using neural networks.)
Keeler - US6243696
(Keeler discloses steps to create a neural network model)

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAMCY ALGHAZZY whose telephone number is (571)272-8824. The examiner can normally be reached Monday-Friday 7:30am-4:30pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas, can be reached at (571) 272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/SHAMCY ALGHAZZY/Examiner, Art Unit 2128  

/OMAR F FERNANDEZ RIVAS/Supervisory Patent Examiner, Art Unit 2128