DETAILED ACTION
Claims 1-20 are pending and have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 07/24/2020 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-4, 6-10 and 13-18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Riemer ("Scalable Recollections for Continual Lifelong Learning").

In regard to claim 1, Riemer teaches: A method for training a parametric machine learning system, comprising; (Riemer, p. 1355 left col., Algorithm 1 Experience Replay Training for Continual Learning with a Scalable Recollection Module; p. 1354 left col. Improving Experience Replay Training "... In this setting the model must learn T tasks sequentially from dataset. At every step it receives a triplet (x; t; y) representing the input, task label, and correct output. There are two models [machine learning system] to be trained: the recollection module which consists of a memory index buffer M, an encoder ENC_Φ and decoder DEC_Ψ; and a predictive task model F_θ. The training proceeds in two phases..."; Two models are the ML system. The recollection module and predictive model both have neural network architectures, i.e. machine learning models, see claim 9 for details.)

compressing a first data; (Riemer, p. 1353 right col. The Scalable Recollection Module (SRM) "… When a new experience is received (in the figure, an image of the numeral '6' [e.g. first data]), the encoder compresses it to a sequence of discrete latent codes (one hot vectors). These codes are concatenated and further compressed to a k bit binary code or 'index' shown in decimal in the figure."; compressing an image)
storing the compressed first data; (Riemer, p. 1353 right col. "… This compressed code [compressed data] is then stored in the index buffer. This path is shown in blue..."; storing the compressed code generated from the image.)
reconstructing a first selected amount of the stored compressed first data; (Riemer, p. 1353 right col. "... Experiences are retrieved from the index buffer by sampling a code [e.g. a selected amount] from the index buffer and passing it through the decoder to create an approximate reconstruction of the original input. This path is shown in red in the figure..."; p. 1354 "When an incoming example is received, we first sample multiple batches of recollections [e.g. a selected amount of compressed data] from the index buffer and decode them into experiences [reconstruct] using the current decoder...")
providing a machine learning system; and (Riemer, p. 1354 left col. Improving Experience Replay Training "... There are two models [machine learning system] to be trained: the recollection module which consists of a memory index buffer M, an encoder ENC_Φ and decoder DEC_Ψ; and a predictive task model F_θ. The training proceeds in two phases..."; Two models are the ML system.)
training the machine learning system with the reconstructed first data. (Riemer, p. 1354 left col. "For the recollection module, we achieve stabilization through a novel extension of experience replay (Lin 1992). We then perform N steps of optimization [e.g. training] on the encoder/decoder parameters Φ and Ψ by interleaving the current input example with a different batch of past recollections at each of the N optimization steps… For each optimization step, the error for each experience in a batch is computed by encoding that experience into a latent code using the encoder and then decoding back to an experience [reconstruct data] to compute the reconstruction error."; the optimization steps of experience replay teach training using reconstructions of compressed data; p. 1354 right col. "... the predictive model F_θ is trained on just one of the recollection sample sets [e.g. training with reconstructed data] (we arbitrarily chose the first) using loss function and learning rate α...")
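
For illustration only, the sequence of steps mapped above (compress, store, reconstruct a selected amount, train) can be sketched as follows. This sketch is not Riemer's implementation: it assumes uniform quantization in place of the discrete variational auto-encoder and a linear model in place of F_θ, and every name in it (LEVELS, compress, reconstruct, train_step, replay_training) is hypothetical.

```python
import random

LEVELS = 16  # hypothetical: 16 quantization bins (4 bits) per feature


def compress(x):
    """Quantize each feature in [0, 1] to an integer code in 0..LEVELS-1."""
    return tuple(min(LEVELS - 1, int(v * LEVELS)) for v in x)


def reconstruct(code):
    """Map each integer code back to the midpoint of its quantization bin."""
    return [(c + 0.5) / LEVELS for c in code]


def train_step(weights, x, y, lr=0.1):
    """One SGD step on squared error for a linear model y ~ w . x."""
    pred = sum(w * v for w, v in zip(weights, x))
    err = pred - y
    return [w - lr * err * v for w, v in zip(weights, x)]


def replay_training(stream, sample_size=4, seed=0):
    """Compress and store each example, then train on sampled reconstructions."""
    rng = random.Random(seed)
    buffer = []           # holds (compressed code, label) pairs, never raw inputs
    weights = [0.0, 0.0]  # assumes two input features
    for x, y in stream:
        buffer.append((compress(x), y))  # compress, then store
        # Select an amount of the stored compressed data.
        batch = rng.sample(buffer, min(sample_size, len(buffer)))
        for code, label in batch:
            # Reconstruct the selected codes and train on the reconstructions.
            weights = train_step(weights, reconstruct(code), label)
    return weights, buffer
```

Setting sample_size to the buffer length would reconstruct all stored codes at each step; a smaller value reconstructs only a selected subset.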

In regard to claim 2, reference is made to the rejection of claim 1, and further, Riemer teaches: further comprising training the machine learning system with raw data. (Riemer, p. 1354 left col. "We then perform N steps of optimization on the encoder/decoder parameters Φ and Ψ by interleaving the current input example [e.g. raw data] with a different batch of past recollections at each of the N optimization steps… learning parameters to successfully reconstruct both the old experiences in the buffer as well as the new experience [e.g. raw data]. In this way the recollection module continues to remember the relevant past experiences in the buffer while integrating new experiences.")

In regard to claim 3, reference is made to the rejection of claim 1, and further, Riemer teaches: further comprising training the machine learning system with data comprising reconstructions of compressed second data, raw data, or both. (Riemer, p. 1354 left col. "For the recollection module, we achieve stabilization through a novel extension of experience replay (Lin 1992). We then perform N steps of optimization on the encoder/decoder parameters Φ and Ψ by interleaving the current input example with a different batch of past recollections at each of the N optimization steps…"; the optimization steps of experience replay teach training using reconstructions of a different set of compressed data, i.e. reconstructions of compressed second data.)

In regard to claim 4, reference is made to the rejection of claim 3, and further, Riemer teaches: further comprising reconstructing a second selected amount of the stored compressed first data and training the machine learning system with the second selected amount of reconstructed first data. (Riemer, p. 1354 left col. "For the recollection module, we achieve stabilization through a novel extension of experience replay (Lin 1992). We then perform N steps of optimization [e.g. training] on the encoder/decoder parameters Φ and Ψ by interleaving the current input example with a different batch of past recollections [e.g. a second selected amount] at each of the N optimization steps… For each optimization step, the error for each experience in a batch is computed by encoding that experience into a latent code using the encoder and then decoding back to an experience [reconstruct data] to compute the reconstruction error."; the optimization steps of experience replay teach training using reconstructions of compressed data; p. 1354 right col. "... the predictive model F_θ is trained on just one of the recollection sample sets [e.g. training with reconstructed data (generated from a different batch / second data)] (we arbitrarily chose the first) using loss function and learning rate α...")

In regard to claims 6 and 18, reference is made to the rejection of claims 1 and 14 respectively, and further, Riemer teaches: wherein data comprises images, strings, audio waves, charts, coordinates, vectors, or text. (Riemer, p. 1353 right col. The Scalable Recollection Module (SRM) "… When a new experience is received (in the figure, an image of the numeral '6' [e.g. first data])…")

In regard to claim 7, reference is made to the rejection of claim 1, and further, Riemer teaches: wherein compressing a first data is performed by a compression model. (Riemer, p. 1353 right col. "The recollection buffer is implemented using a discrete variational auto-encoder [a compression model] (Jang, Gu, and Poole 2017)(Maddison, Mnih, and Teh 2017). A discrete variational autoencoder is a generative neural model with two components: an encoder and a decoder."; p. 1353 right col. "… the encoder compresses it to a sequence of discrete latent codes (one hot vectors). These codes are concatenated and further compressed to a k bit binary code or 'index' shown in decimal in the figure.")

In regard to claims 8 and 15, reference is made to the rejection of claims 7 and 14 respectively, and further, Riemer teaches: wherein the compression model is a product quantization, K-means clustering, Gaussian mixture model, vector quantized variational auto-encoder (VQ-VAE), or Adaptive Resonance Theory network, transform coding, wavelet compression, Huffman coding, run-length encoding, or incremental encoding. (Riemer, p. 1353 right col. "The recollection buffer is implemented using a discrete variational auto-encoder [vector quantized variational auto-encoder (VQ-VAE)] (Jang, Gu, and Poole 2017)(Maddison, Mnih, and Teh 2017)"; p. 1353 right col. "… the encoder compresses it to a sequence of discrete latent codes (one hot vectors). These codes are concatenated... ")

In regard to claims 9 and 17, reference is made to the rejection of claims 1 and 14 respectively, and further, Riemer teaches: wherein the machine learning system is an artificial neural network, decision tree, support vector machine, Bayesian network, or genetic algorithm. (Riemer, p. 1355 right col. "… We model our experiments after (Lopez-Paz and Ranzato 2017) and use a Resnet-18 model as F_θ… our autoencoder models include three convolutional layers in the encoder and three deconvolutional layers in the decoder."; ResNet-18 is a convolutional neural network that is 18 layers deep, so F_θ is an artificial neural network. The autoencoder models are likewise neural network architectures.)

In regard to claim 10, reference is made to the rejection of claim 1, and further, Riemer teaches: wherein the machine learning system is trained to learn to perform a task comprising image classification, audio classification, object detection, regression, visual question answering, and combinations thereof. (Riemer, p. 1355 right col. "Incremental CIFAR-100: (Lopez-Paz and Ranzato 2017) A continual learning split of the CIFAR-100 image classification dataset considering each of the 20 course grained labels to be a task with 2,500 examples each. Omniglot: A character recognition dataset (Lake et al. 2011) in which we consider each of the 50 alphabets to be a task.")

In regard to claim 13, reference is made to the rejection of claim 1, and further, Riemer teaches: wherein the training is performed in a continuous or online manner. (Riemer, p. 1355 left col., Algorithm 1 Experience Replay Training for Continual Learning with a Scalable Recollection Module; p. 1354 left col. Improving Experience Replay Training "The recollection module can be used in many ways in a continual learning setting. In Algorithm 1 we show one approach...")

In regard to claim 14, Riemer teaches:  A parametric machine learning training system comprising: (Riemer, p. 1355 left col., Algorithm 1 Experience Replay Training for Continual Learning with a Scalable Recollection Module; p. 1354 left col. Improving Experience Replay Training "... In this setting the model must learn T tasks sequentially from dataset. At every step it receives a triplet (x; t; y) representing the input, task label, and correct output. There are two models [machine learning system] to be trained: the recollection module which consists of a memory index buffer M, an encoder ENC_Φ and decoder DEC_Ψ; and a predictive task model F_θ. The training proceeds in two phases..."; Two models are the ML system. The recollection module and predictive model both have neural network architectures, i.e. machine learning models, see claim 9 for details.)
a compression system which compresses and reconstructs a first data; (Riemer, p. 1353 right col. "The recollection buffer is implemented using a discrete variational auto-encoder [a compression system] (Jang, Gu, and Poole 2017)(Maddison, Mnih, and Teh 2017). A discrete variational autoencoder is a generative neural model with two components: an encoder and a decoder."; p. 1353 right col. The Scalable Recollection Module (SRM) "… When a new experience is received (in the figure, an image of the numeral '6' [e.g. first data]), the encoder compresses it to a sequence of discrete latent codes (one hot vectors). These codes are concatenated and further compressed to a k bit binary code or 'index' shown in decimal in the figure."; p. 1353 right col. "... Experiences are retrieved from the index buffer by sampling a code from the index buffer and passing it through the decoder to create an approximate reconstruction of the original input. This path is shown in red in the figure..."; p. 1354 "When an incoming example is received, we first sample multiple batches of recollections from the index buffer and decode them into experiences [reconstruct] using the current decoder...")
a memory buffer which stores the compressed first data; (Riemer, p. 1353 right col. "… This compressed code [compressed data] is then stored in the index buffer [a memory buffer]. This path is shown in blue..."; storing the compressed code generated from the image.)
a machine learning system; and (Riemer, p. 1354 left col. Improving Experience Replay Training "... There are two models [machine learning system] to be trained: the recollection module which consists of a memory index buffer M, an encoder ENC_Φ and decoder DEC_Ψ; and a predictive task model F_θ. The training proceeds in two phases..."; Two models are the ML system.)
a computer which trains the machine learning system with the selected stored reconstructed first data. (Riemer, p. 1354 left col. "For the recollection module, we achieve stabilization through a novel extension of experience replay (Lin 1992). We then perform N steps of optimization [e.g. training] on the encoder/decoder parameters Φ and Ψ by interleaving the current input example with a different batch of past recollections at each of the N optimization steps… For each optimization step, the error for each experience in a batch is computed by encoding that experience into a latent code using the encoder and then decoding back to an experience [reconstruct data] to compute the reconstruction error."; the optimization steps of experience replay teach training using reconstructions of compressed data; p. 1354 right col. "... the predictive model F_θ is trained on just one of the recollection sample sets [e.g. training with reconstructed data] (we arbitrarily chose the first) using loss function and learning rate α..."; p. 1352 right col. "Continual learning… (2) retain memories which are useful, (3) manage compute and memory resources over a long period of time..."; compute and memory resources inherently teach a computer, memory, etc.)

In regard to claim 16, reference is made to the rejection of claim 14, and further, Riemer teaches: wherein the memory buffer is an array or a list. (Riemer, p. 1353 right col. "… This compressed code is then stored in the index buffer. This path is shown in blue..."; p. 1353 right col. "… When a new experience is received (in the figure, an image of the numeral '6'), the encoder compresses it to a sequence of discrete latent codes (one hot vectors [e.g. an array])...")

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Riemer in view of Benjamin (US 20070075999 A1).

In regard to claim 5, reference is made to the rejection of claim 1, and Riemer does not teach, but Benjamin teaches: wherein the first selected amount of the stored compressed first data comprises all the stored compressed first data. (Benjamin, [0022] "... This scheme can treat both volumes (group of images) and single images applying loss-less or lossy coding... A progressive approach is applied within the compression-decompression scheme, which enables the user to get previews or overviews of the transmitted image long prior to the time required to transmit the entire image [all the stored compressed data]... interactive approach enables the user to make many of these decisions before the entire image-set is received"; [0173] "The above process is iterated for all resolution levels. The process is stopped either when the user indicates that the visual level is adequate or the entire image has been sent resulting in a perfect, loss-less, replication of the original image on the user's screen... If segmentation and/or windowing and/or lossy compression was performed on the images, the user can request the server to complete the images to their loss-less representation. In such a case, the server will compress and transmit the needed information for the user to complete the images to their loss-less version.")

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified Riemer to include all compressed data when reconstructing an image. Doing so would result in a perfect, loss-less replication of the original image. (Benjamin, [0173] "... the entire image has been sent resulting in a perfect, loss-less, replication of the original image on the user's screen.")

Claims 11-12 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Riemer in view of Liu ("Lifelong Learning for Heterogeneous Multi-Modal Tasks").

In regard to claims 11 and 19, reference is made to the rejection of claims 1 and 14 respectively, and Riemer does not teach, but Liu teaches: wherein the data comprises at least two different modalities. (Liu, p. 6158 left col. abstract "In this work, we investigate the lifelong learning problem from the viewpoint of heterogeneous multi-modal fusion.")

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified Riemer to include a multi-modal lifelong learning framework. Doing so would allow the system to perform on a complicated material recognition task. (Liu, p. 6158 left col. abstract "... we construct a multi-modal lifelong learning framework which deals with the consecutive multi-modal learning tasks and develop an efficient online dictionary learning algorithm to solve the multi-modal lifelong learning problem. Finally, we perform experimental validation on a complicated material recognition task and show the promising results.")

In regard to claims 12 and 20, reference is made to the rejection of claims 1 and 14 respectively, and Riemer does not teach, but Liu teaches: wherein the data comprises at least two of images, strings, audio waves, charts, coordinates, vectors, and text. (Liu, p. 6161 right col. "Fig. 2. The representative modalities for Mesh (Top) and Foam (Bottom). LEFT: Visual Modality; MIDDLE: Tactile Modality; RIGHT: Auditory Modality... The three-axes accelerometer is used to collect the tactile modality samples and the microphone is used for collecting the auditory samples... Fig.2 shows some representative signals. Fig.3 shows the representative images of the surface material instances... Even for such 10 sets of sample including visual, tactile and auditory modalities... It contains 1080 samples for each of the three modalities.")

The rationale for combining the teachings of Riemer and Liu is the same as set forth in the rejection of claim 11.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SU-TING CHUANG whose telephone number is (408)918-7519.  The examiner can normally be reached on Monday - Thursday 8-5 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached on (571)272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/S.C./Examiner, Art Unit 2122                 

/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122