DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4 and 7-14 are rejected under 35 U.S.C. 103 as being unpatentable over Minnen et al (US 20200027247) in view of Bhorkar (US 20200244969) further in view of Schroers et al (US 20210067808).

As to claim 1, Minnen discloses a computer-implemented method for encoding a video that comprises at least two image frames having a sequential order (FIG. 7; see [0059], The compression system is configured to process input data e.g., video data), the method comprising:
encoding, by a computing system comprising one or more computing devices and using an encoder model (FIG. 1, compression system 100 and encoder 106), a prior image frame of the at least two image frames to generate a first latent representation (see [0066], The encoder neural network 106 is configured to process the input data 102 (x) to generate a latent representation 116 (y) of the input data 102; see [0059], input data (e.g., image data, video data); FIG. 7, steps 702-704); 
determining, by the computing system and using a hyperprior encoder model (FIG. 1, hyper-encoder neural network 108), a hyperprior code based at least in part on the first latent representation (see [0069], The hyper-encoder neural network 108 is configured to process the latent representation 116 of the input data to generate a "hyper-prior" 122 (z) (sometimes called a "hyper-parameter"), that is, a latent representation of the conditional entropy model; FIG. 7, step 706); 
determining, by the computing system and using a hyperprior decoder model (FIG. 1, hyper-decoder neural network 110), one or more conditional probability parameters based at least in part on the first latent representation and the hyperprior code (see [0071]-[0072], The hyper-decoder neural network 110 is configured to process the quantized hyper-prior 124 to generate a hyper-decoder output 128 (.PSI.), and the entropy model neural network 112 is configured to process the hyper-decoder output 128 to generate the conditional entropy model … The conditional entropy model specifies a respective code symbol probability distribution corresponding to each code symbol 116 representing the input data. Generally, the output of the entropy model neural network 112 includes distribution parameters that define each code symbol probability distribution of the conditional entropy model; FIG. 7, step 710); 
generating, by the computing system and using an entropy coder (FIG. 1, entropy encoding engine 132), an entropy coding of the current image frame based at least in part on the one or more conditional probability parameters (see [0068], The compression system 100 uses the hyper-encoder neural network 108, the hyper-decoder neural network 110, and the entropy model neural network 112 to generate a conditional entropy model for entropy encoding the code symbols 120 representing the input data; see [0076], The entropy encoding engine 132 is configured to compress the code symbols 120 representing the input data by entropy encoding them in accordance with the conditional entropy model; FIG. 7, step 712); and 
storing, by the computing system, the entropy coding and the hyperprior code (see [0060], storing the compressed representation of the data; see [0047] and [0075]).
Minnen fails to explicitly disclose encoding, by the computing system and using the encoder model, a current image frame that occurs after the prior image frame based on the sequential order to generate a second latent representation; 
determining, by the computing system and using the hyperprior encoder model, the hyperprior code based at least in part on the first latent representation and the second latent representation, wherein the hyperprior code is indicative of differences between the current image frame and the prior image frame, the prior image frame occurring before the current image frame in the sequential order; and
generating, by the computing system and using the entropy coder, the entropy coding of the current image frame based at least in part on the one or more conditional probability parameters and the second latent representation.
However, Bhorkar teaches encoding, by the computing system and using the encoder model, a current image frame that occurs after the prior image frame based on the sequential order to generate a second latent representation (FIG. 4; see [0019], determine latent space representation of the second frame; see FIG. 2, encoder 240 converts an image (e.g., image 210) into a vector in latent space z; see [0047], the latent space representation of the first frame and the latent space representation of the second frame are generated via an autoencoder (e.g., a variational autoencoder (VAE))); 
determining, by the computing system and using the hyperprior encoder model, the hyperprior code based at least in part on the first latent representation and the second latent representation, wherein the hyperprior code is indicative of differences between the current image frame and the prior image frame, the prior image frame occurring before the current image frame in the sequential order (see [0022], server 116 may first detect a correlation between visual properties of a first frame of the sequence of frames and a second frame of the sequence of frames, where the second frame comprises a next frame following the first frame in the sequence of frames, generate a first difference vector comprising a difference between a latent space representation of the second frame and a latent space representation of the first frame in response to detecting the correlation between the visual properties, where the latent space representation of the first frame and the latent space representation of the second frame are generated via an autoencoder; see [0047]).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Minnen using Bhorkar’s teachings to include encoding, by the computing system and using the encoder model, a current image frame that occurs after the prior image frame based on the sequential order to generate a second latent representation; determining, by the computing system and using the hyperprior encoder model, the hyperprior code based at least in part on the first latent representation and the second latent representation, wherein the hyperprior code is indicative of differences between the current image frame and the prior image frame, the prior image frame occurring before the current image frame in the sequential order, in order to improve the quality of reconstructed versions of the video frames and provide efficient, error resilient methods for video compression and transmission (Bhorkar; [0013]).
The combination of Minnen and Bhorkar fails to explicitly disclose generating, by the computing system and using the entropy coder, the entropy coding of the current image frame based at least in part on the one or more conditional probability parameters and the second latent representation.
However, Schroers teaches generating, by the computing system and using the entropy coder, the entropy coding of the current image frame based at least in part on the one or more conditional probability parameters and the second latent representation (FIG. 2, frames 202, 232, 260 and 270 are encoded to generate latent space frames 208, 238, 264 and 274, respectively; see [0052]-[0053], Latent space residual 278 may be entropy coded by 282 based on one or more probability models).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the combination of Minnen and Bhorkar using Schroers’ teachings to include generating, by the computing system and using the entropy coder, the entropy coding of the current image frame based at least in part on the one or more conditional probability parameters and the second latent representation in order to find an optimal encoding and/or decoding function to improve the rate distortion performance and improve compression efficiency (Schroers; [0039]).
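For illustration only, the combined encoding flow mapped above for claim 1 can be sketched as follows. This is a toy model under stated assumptions, not the implementation of Minnen, Bhorkar, or Schroers: the helper names (hyperprior_encode, hyperprior_decode, ideal_code_length_bits), the quantization step, and the discretized Gaussian are all hypothetical stand-ins for the trained neural networks and the entropy coder described in the references. The sketch shows only why conditional probability parameters derived from the prior frame's latent and a difference-indicating hyperprior code can shorten the entropy coding of the current frame's latent. The bits needed to store the hyperprior code itself are not counted.

```python
import math


def gaussian_cdf(x, mean, scale):
    # Cumulative distribution function of a Gaussian with the given mean and scale.
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def symbol_probability(symbol, mean, scale):
    # Probability mass of an integer symbol under a Gaussian discretized to unit-width bins.
    p = gaussian_cdf(symbol + 0.5, mean, scale) - gaussian_cdf(symbol - 0.5, mean, scale)
    return max(p, 1e-12)  # guard against log(0)


def hyperprior_encode(prior_latent, current_latent, step=4):
    # Hypothetical stand-in for a hyperprior encoder: a coarse summary of the
    # element-wise differences between the current and prior latents.
    return [round((c - p) / step) for p, c in zip(prior_latent, current_latent)]


def hyperprior_decode(prior_latent, hyper_code, step=4, scale=2.0):
    # Hypothetical stand-in for a hyperprior decoder: one (mean, scale) pair per
    # latent element, derived from the prior latent and the hyperprior code.
    return [(p + z * step, scale) for p, z in zip(prior_latent, hyper_code)]


def ideal_code_length_bits(latent, params):
    # Ideal entropy-coding cost: sum of -log2(probability) over all symbols.
    return sum(-math.log2(symbol_probability(s, m, sc)) for s, (m, sc) in zip(latent, params))


# Assumed toy quantized latents for two sequential frames.
prior_latent = [10, -3, 7, 0]
current_latent = [11, -3, 12, 1]

hyper_code = hyperprior_encode(prior_latent, current_latent)
params = hyperprior_decode(prior_latent, hyper_code)
conditional_bits = ideal_code_length_bits(current_latent, params)

# Baseline: a fixed, frame-independent distribution for every symbol.
unconditional_bits = ideal_code_length_bits(current_latent, [(0.0, 8.0)] * len(current_latent))
```

In this toy setting the conditional model assigns higher probability to the current latent than the fixed baseline, so its ideal code length is shorter. An actual system would feed these probabilities to an arithmetic or range coder rather than only computing the ideal length.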

As to claim 2, the combination of Minnen, Bhorkar and Schroers further discloses further comprising: encoding, by the computing system and using the encoder model, a third image frame of the at least two image frames that occurs after the current image frame to generate a (Bhorkar; see [0050], latent space representation of the third frame; see also [0061]).

As to claim 3, the combination of Minnen, Bhorkar and Schroers further discloses wherein the current image frame occurs immediately after the prior image frame (Bhorkar; [0022], where the second frame comprises a next frame following the first frame in the sequence of frames; see [0047]). 

As to claim 4, the combination of Minnen, Bhorkar and Schroers further discloses further comprising: performing, by the computing system, internal learning to optimize the second latent representation, the hyperprior code, or both the second latent representation and the hyperprior code (Minnen; see [0046], The system described in this specification is trained using machine learning techniques to adaptively determine the complexity of the hyper-prior for each set of input data, in order to optimize the overall compression rate; [0087], the encoder neural network, the hyper-encoder neural network, the hyper-decoder neural network, the context neural network, the entropy model neural network, and the decoder neural network can be jointly trained to optimize the rate distortion objective function). 

As to claim 7, the combination of Minnen, Bhorkar and Schroers further discloses wherein the hyperprior encoder model comprises a trained neural network (Minnen; see [0069], the hyper-encoder neural network 108 may be a convolutional neural network).

As to claim 8, the combination of Minnen, Bhorkar and Schroers further discloses wherein: determining, by the computing system and using the hyperprior encoder model, the hyperprior code is based only on image information included in the first latent representation and the second latent representation (Bhorkar; see [0022], generate a first difference vector comprising a difference between a latent space representation of the second frame and a latent space representation of the first frame).

As to claim 9, Minnen discloses a computer-implemented method for decoding a video that comprises two or more image frames having a sequential order (FIG. 8; see [0059], input data e.g., video data), the method comprising: 
for the two or more image frames, respectively (see [0059], input data e.g., video data): 
obtaining, by a computing system comprising one or more computing devices (FIG. 2, decompression system 200), a hyperprior code for a current image frame (FIG. 8, steps 802-804; see [0108]-[0109], The system obtains the compressed data (802). As described above, the compressed data includes (i) a compressed (i.e., entropy encoded) quantized latent representation of the data, and (ii) a compressed (i.e., entropy encoded) quantized hyper-prior); 
determining, by the computing system and using a hyperprior decoder model (FIG. 2, hyper-decoder neural network 110), one or more conditional probability parameters for the current frame based at least in part on the hyperprior code for the current image frame (FIG. 8, steps 804-806; see [0071]; see [0109]-[0110], The system determines the conditional entropy model used to entropy encode the quantized latent representation of the data (806). To determine the conditional entropy model, the system processes the quantized hyper-prior using the hyper-decoder neural network, and then processes the hyper-decoder neural network output using the entropy model neural network to generate the distribution parameters defining the conditional entropy model); 
decoding, by the computing system and using the one or more conditional probability parameters for the current frame, an entropy code for the current image frame to obtain a decoded version of a latent representation of the current image frame (FIG. 8, step 808; see [0082]; see [0111], The system entropy decodes the quantized latent representation of the data using the conditional entropy model (808). In particular, the system entropy decodes each code symbol of the quantized latent representation of the data using a corresponding code symbol probability distribution defined by the conditional entropy model); and 
providing, by the computing system, the decoded version of a latent representation of the current image frame for use in decoding a next entropy code for a next sequential image frame (FIG. 2, code symbol 120; see [0085], [0114]). 
Minnen fails to explicitly disclose obtaining, by the computing system comprising one or more computing devices, the hyperprior code for a current image frame and a decoded version of a latent representation of a previous sequential image frame, wherein the hyperprior code is indicative of differences between the current image frame and the previous sequential image frame, the previous sequential image frame occurring before the current image frame in the sequential order; and 
determining, by the computing system and using the hyperprior decoder model, the one or more conditional probability parameters for the current frame based at least in part on the hyperprior code for the current image frame and the decoded version of the latent representation of the previous sequential image frame.
However, Bhorkar teaches obtaining, by the computing system comprising one or more computing devices, the hyperprior code for a current image frame and a decoded version of a latent representation of a previous sequential image frame (see FIG. 5; see [0054], a latent space representation of the first frame, and (2) a first difference vector comprising a difference between a latent space representation of a second frame of the sequence of frames and the latent space representation of the first frame; see [0056], the processing system decodes the latent space representation of the second frame into a decoded version of the second frame; [0061], the processing system may determine a latent space representation of a third frame from a second difference vector and the latent space representation of the second frame), wherein the hyperprior code is indicative of differences between the current image frame and the previous sequential image frame, the previous sequential image frame occurring before the current image frame in the sequential order (see [0054], a first difference vector comprising a difference between a latent space representation of a second frame of the sequence of frames and the latent space representation of the first frame (where the second frame is the next frame following the first frame in the sequence of frames); see [0061], a second difference vector comprising a difference between a latent space representation of a third frame of the sequence of frames and the latent space representation of the second frame, where the third frame is a next frame following the second frame in the sequence of frames).
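Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Minnen using Bhorkar’s teachings to include obtaining, by the computing system comprising one or more computing devices, the hyperprior code for the current image frame and a decoded version of a latent representation of a previous sequential image frame, wherein the hyperprior code is indicative of differences between the current image frame and the previous sequential image frame, the previous sequential image frame occurring before the current image frame in the sequential order, in order to improve the quality of reconstructed versions of the video frames and provide efficient, error resilient methods for video compression and transmission (Bhorkar; [0013]).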
The combination of Minnen and Bhorkar fails to explicitly disclose determining, by the computing system and using the hyperprior decoder model, the one or more conditional probability parameters for the current frame based at least in part on the hyperprior code for the current image frame and the decoded version of the latent representation of the previous sequential image frame.
However, Schroers teaches determining, by the computing system and using the hyperprior decoder model, the one or more conditional probability parameters for the current frame based at least in part on the hyperprior code for the current image frame and the decoded version of the latent representation of the previous sequential image frame (FIG. 2, frames 202, 232, 260 and 270; see [0040], [0045]; see [0049], Latent space reference frame 208 may be decoded by decoder 210 to generate decoded reference frame 212; see [0053], the hyper prior latent variables may be taken into account by the hyperparameter decoder network to describe the probabilities of the actual latents).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the combination of Minnen and Bhorkar using Schroers’ teachings to include determining, by the computing system and using the hyperprior decoder model, the one or more conditional probability parameters for the current frame based at least in part on the hyperprior code for the current image frame and the decoded version of the latent representation of the previous sequential image frame in order to find an optimal encoding and/or decoding function to improve the rate distortion performance and improve compression efficiency (Schroers; [0039]).
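For illustration only, the frame-to-frame dependency recited in claim 9 can be sketched as a decoding loop in which each decoded latent is provided for use in decoding the next frame. This is a minimal sketch under stated assumptions, not the method of any cited reference: predict_means and decode_sequence are hypothetical names, the step size is assumed, and the per-frame residuals stand in for the symbols an actual entropy decoder would recover using the conditional probability parameters.

```python
def predict_means(previous_latent, hyper_code, step=4):
    # Hypothetical stand-in for a hyperprior decoder: predicts the current latent
    # from the previous decoded latent and the current frame's hyperprior code.
    return [p + z * step for p, z in zip(previous_latent, hyper_code)]


def decode_sequence(first_latent, packets):
    # packets holds one (hyper_code, residuals) pair per frame after the first.
    # residuals are an assumed stand-in for entropy-decoded symbols.
    decoded = [list(first_latent)]
    for hyper_code, residuals in packets:
        # The previously decoded latent conditions the decoding of this frame.
        means = predict_means(decoded[-1], hyper_code)
        current = [m + r for m, r in zip(means, residuals)]
        # The decoded latent is kept for use with the next sequential frame.
        decoded.append(current)
    return decoded


# Assumed toy data: the latent of the first frame plus packets for two later frames.
first_latent = [10, -3, 7, 0]
packets = [
    ([0, 0, 1, 0], [1, 0, 1, 1]),
    ([0, -1, 0, 0], [0, 1, -1, 0]),
]
decoded = decode_sequence(first_latent, packets)
```

The loop makes the ordering constraint explicit: the third frame cannot be decoded until the second frame's latent has been recovered, because that latent is an input to the prediction for the third frame.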

As to claim 10, the combination of Minnen, Bhorkar and Schroers further discloses further comprising: 
decoding, by the computing system and using a decoder model (Minnen; FIG. 2, decoder neural network 204), the decoded version of a latent representation of the current image frame to obtain a reconstructed version of the current image frame (Minnen; see [0080], [0085]).

As to claim 11, Minnen discloses one or more non-transitory computer-readable media that store (see [0116]-[0122]): 
a video compression model (FIG. 1, compression system 100), the video compression model comprising: 
a hyperprior encoder model (FIG. 1, hyper-encoder neural network 108); and 
a hyperprior decoder model (FIG. 1, hyper-decoder neural network 110); and 
instructions for performing encoding comprising (see [0116]): 
obtaining a video comprising an ordered sequence of image frames (FIG. 7, step 702; see [0059], The compression system is configured to process input data (e.g., video data)); 
determining a latent representation for at least two sequential image frames in the ordered sequence (FIG. 7, step 704; see [0099], The system processes the data using an encoder neural network to generate a latent representation of the data (704); see [0059], The compression system is configured to process input data (e.g., video data)); 
generating a hyperprior code for the at least two sequential image frames by providing the latent representation associated with input video data (FIG. 7, step 706; see [0100], The system processes the latent representation of the data using a hyper-encoder neural network to generate a latent representation of a conditional entropy model, i.e., a "hyper-prior"; see [0059], input data (e.g., video data)); 
generating one or more conditional probability parameters for the at least two sequential image frames by providing the hyperprior code associated with input video data (see [0071]-[0072], The hyper-decoder neural network 110 is configured to process the quantized hyper-prior 124 to generate a hyper-decoder output 128 (.PSI.), and the entropy model neural network 112 is configured to process the hyper-decoder output 128 to generate the conditional entropy model … The conditional entropy model specifies a respective code symbol probability distribution corresponding to each code symbol 116 representing the input data. Generally, the output of the entropy model neural network 112 includes distribution parameters that define each code symbol probability distribution of the conditional entropy model; FIG. 7, step 710; see [0101]-[0102]); and 
determining an entropy coding for the at least two sequential image frames by providing the conditional probability parameters associated with input video data (FIG. 1, entropy encoding engine 132; see [0068], The compression system 100 uses the hyper-encoder neural network 108, the hyper-decoder neural network 110, and the entropy model neural network 112 to generate a conditional entropy model for entropy encoding the code symbols 120 representing the input data; see [0076], The entropy encoding engine 132 is configured to compress the code symbols 120 representing the input data by entropy encoding them in accordance with the conditional entropy model; FIG. 7, step 712). 
Minnen fails to explicitly disclose generating the hyperprior code for the at least two sequential image frames by providing the latent representation associated with a prior image frame and the latent representation associated with a current image frame to the hyperprior encoder model, wherein the hyperprior code is indicative of differences between the current image frame and the prior image frame; 
generating the one or more conditional probability parameters for the at least two sequential image frames by providing the hyperprior code associated with the current image frame and the latent representation associated with the prior image frame to the hyperprior decoder model; and
determining the entropy coding for the at least two sequential image frames by providing the conditional probability parameters for the current image frame and the latent representation associated with the prior image frame to the entropy coder.
However, Bhorkar teaches generating the hyperprior code for the at least two sequential image frames by providing the latent representation associated with a prior image frame and the latent representation associated with a current image frame to the hyperprior encoder model, wherein the hyperprior code is indicative of differences between the current image frame and the prior image frame (FIG. 4; see [0019], determine latent space representation of the second frame; see FIG. 2, encoder 240 converts an image (e.g., image 210) into a vector in latent space z; see [0022], server 116 may first detect a correlation between visual properties of a first frame of the sequence of frames and a second frame of the sequence of frames, where the second frame comprises a next frame following the first frame in the sequence of frames, generate a first difference vector comprising a difference between a latent space representation of the second frame and a latent space representation of the first frame in response to detecting the correlation between the visual properties, where the latent space representation of the first frame and the latent space representation of the second frame are generated via an autoencoder; see [0047], the latent space representation of the first frame and the latent space representation of the second frame are generated via an autoencoder (e.g., a variational autoencoder (VAE))).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Minnen using Bhorkar’s teachings to include generating the hyperprior code for the at least two sequential image frames by providing the latent representation associated with a prior image frame and the latent representation associated with a current image frame to the hyperprior encoder model, wherein the hyperprior code is indicative of differences between the current image frame and the prior image frame in order to improve the quality of reconstructed versions of the video frames and provide efficient, error resilient methods for video compression and transmission (Bhorkar; [0013]).
The combination of Minnen and Bhorkar fails to explicitly disclose generating the one or more conditional probability parameters for the at least two sequential image frames by providing the hyperprior code associated with the current image frame and the latent representation associated with the prior image frame to the hyperprior decoder model; and
determining the entropy coding for the at least two sequential image frames by providing the conditional probability parameters for the current image frame and the latent representation associated with the prior image frame to the entropy coder.
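However, Schroers teaches determining the entropy coding for the at least two sequential image frames by providing the conditional probability parameters for the current image frame and the latent representation associated with the prior image frame to the entropy coder (FIG. 2, frames 202, 232, 260 and 270 are encoded to generate latent space frames 208, 238, 264 and 274, respectively; see [0052]-[0053], Latent space residual 278 may be entropy coded by 282 based on one or more probability models); and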
generating the hyperprior code for each image frame by providing the latent representation associated with the prior image frame and the latent representation associated with a current image frame to the hyperprior encoder model (FIG. 2, frames 202, 232, 260 and 270 are encoded to generate latent space frames 208, 238, 264 and 274, respectively; see [0053], the hyper prior latent variables).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the combination of Minnen and Bhorkar using Schroers’ teachings to include generating the one or more conditional probability parameters for the at least two sequential image frames by providing the hyperprior code associated with the current image frame and the latent representation associated with the prior image frame to the hyperprior decoder model; and generating the hyperprior code for each image frame by providing the latent representation associated with the prior image frame and the latent representation associated with a current image frame to the hyperprior encoder model in order to find an optimal encoding and/or decoding function to improve the rate distortion performance and improve compression efficiency (Schroers; [0039]).

As to claim 12, the combination of Minnen, Bhorkar and Schroers further discloses wherein the one or more non-transitory computer-readable media further store: 
an encoder model and a decoder model (Minnen; FIG. 1, an encoder neural network 106; FIG. 2, a decoder neural network 204), and wherein determining the latent representation for the at least two sequential image frames in the ordered sequence comprises: 
encoding, using the encoder model, the at least two sequential image frames in the ordered sequence (Minnen; see [0066], The encoder neural network 106 is configured to process the input data 102 (x) to generate a latent representation 116 (y) of the input data 102; see [0059], input data (e.g., video data)).

As to claim 13, Minnen as modified by Bhorkar and Schroers further discloses wherein the one or more non-transitory computer-readable media further store: 
instructions for performing decoding comprising (see [0116]):  
obtaining the hyperprior code for the current image frame (FIG. 8, steps 802-804; see [0108]-[0109], The system obtains the compressed data (802). As described above, the compressed data includes (i) a compressed (i.e., entropy encoded) quantized latent representation of the data, and (ii) a compressed (i.e., entropy encoded) quantized hyper-prior);  
determining, using the hyperprior decoder model, one or more conditional probability parameters for the current frame based at least in part on the hyperprior code for the current image frame (FIG. 8, steps 804-806; see [0071]; see [0109]-[0110], The system determines the conditional entropy model used to entropy encode the quantized latent representation of the data (806). To determine the conditional entropy model, the system processes the quantized hyper-prior using the hyper-decoder neural network, and then processes the hyper-decoder neural network output using the entropy model neural network to generate the distribution parameters defining the conditional entropy model);
decoding, using the one or more conditional probability parameters for the current frame, an entropy code for the current image frame to obtain a decoded version of a latent representation of the current image frame (FIG. 8, step 808; see [0082]; see [0111], The system entropy decodes the quantized latent representation of the data using the conditional entropy model (808). In particular, the system entropy decodes each code symbol of the quantized latent representation of the data using a corresponding code symbol probability distribution defined by the conditional entropy model); 
providing, the decoded version of a latent representation of the current image frame for use in decoding a next sequential image frame (FIG. 2, code symbol 120; see [0085], [0114]).  
Minnen as modified by Bhorkar and Schroers fails to explicitly disclose obtaining a decoded version of the latent representation of a previous sequential image frame; 
determining, using the hyperprior decoder model, the one or more conditional probability parameters for the current frame based at least in part on the hyperprior code for the current image frame and the decoded version of the latent representation of the previous sequential image frame; 
generating a second latent representation based at least in part on the one or more conditional probability parameters and said entropy coding of the sequence of one or more entropy codings; and 
decoding, using the decoder model, the second latent representation to produce a decoded image frame.
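However, Schroers teaches obtaining a decoded version of the latent representation of a previous sequential image frame (FIG. 2, frames 202, 232, 260 and 270; see [0040], [0045]; see [0049], Latent space reference frame 208 may be decoded by decoder 210 to generate decoded reference frame 212);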
determining, using the hyperprior decoder model, the one or more conditional probability parameters for the current frame based at least in part on the hyperprior code for the current image frame and the decoded version of the latent representation of the previous sequential image frame (see [0053], the hyper prior latent variables may be taken into account by the hyperparameter decoder network to describe the probabilities of the actual latents);
generating a second latent representation based at least in part on the one or more conditional probability parameters and said entropy coding of the sequence of one or more entropy codings (FIG. 2 and [0051]-[0052], In latent space, latent space reconstructed frame 274 may be subtracted from latent space target frame 264 to generate latent space residual 278 … Latent space residual 278 may be entropy coded by 282 based on one or more probability models); and 
decoding, using the decoder model, the second latent representation to produce a decoded image frame (FIG. 2 and [0054], After entropy coding, latent space residual 278 and latent space reconstructed frame 274 may be combined as input for decoder 284 to generate decoded target frame 286). 
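Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the combination of Minnen and Bhorkar using Schroers’ teachings to include obtaining a decoded version of the latent representation of a previous sequential image frame; determining, using the hyperprior decoder model, the one or more conditional probability parameters for the current frame based at least in part on the hyperprior code for the current image frame and the decoded version of the latent representation of the previous sequential image frame; generating a second latent representation based at least in part on the one or more conditional probability parameters and said entropy coding of the sequence of one or more entropy codings; and decoding, using the decoder model, the second latent representation to produce a decoded image frame, in order to find an optimal encoding and/or decoding function to improve the rate distortion performance and improve compression efficiency (Schroers; [0039]).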

As to claim 14, the combination of Minnen, Bhorkar and Schroers further discloses wherein the media further store: 
instructions for performing internal learning to optimize one or more outputs of the encoder model, the hyperprior encoder model, or both (Minnen; see [0046], The system described in this specification is trained using machine learning techniques to adaptively determine the complexity of the hyper-prior for each set of input data, in order to optimize the overall compression rate; [0087], the encoder neural network, the hyper-encoder neural network, the hyper-decoder neural network, the context neural network, the entropy model neural network, and the decoder neural network can be jointly trained to optimize the rate distortion objective function).

Claims 5-6 and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Minnen et al (US 20200027247) in view of Bhorkar (US 20200244969) further in view of Schroers et al (US 20210067808) and further in view of Rippel et al (US 20180174052).

As to claim 5, the combination of Minnen, Bhorkar and Schroers fails to explicitly disclose wherein performing, by the computing system, internal learning comprises: 
setting, by the computing system, as learnable parameters one or more of the second latent representation, the hyperprior code, or both the second latent representation and the hyperprior code; 
modifying, by the computing system, the learnable parameters to reduce a loss function, the loss function evaluating one or both of: 
a difference between the current image frame and a decoded image frame generated from the entropy coding of the current image frame; and 
a probability of determining the second latent representation, given the first latent representation and the hyperprior code. 
However, Rippel teaches setting, by the computing system, as learnable parameters one or more of the second latent representation, the hyperprior code, or both the second latent representation and the hyperprior code (see [0037]-[0040], the parameters of the autoencoder 202); 
modifying, by the computing system, the learnable parameters to reduce a loss function (see [0039]-[0040], The compression system 130 may also determine the autoencoder loss including the reconstruction loss 230 and the negative of the discriminator loss 234, and repeatedly update the parameters of the autoencoder 202 by backpropagating error terms obtained from the autoencoder loss), the loss function evaluating one or both of: 
a difference between the current image frame and a decoded image frame generated from the entropy coding of the current image frame (see autoencoder loss including reconstruction loss in [0037], [0039], [0041], [0054]-[0056], [0058], [0065]-[0066] and [0071]; see [0037], the reconstruction loss is determined based on a dissimilarity between the final reconstructed content and the training content); and 
a probability of determining the second latent representation, given the first latent representation and the hyperprior code. 
At the time before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the combination of Minnen, Bhorkar and Schroers using Rippel’s teachings to include setting, by the computing system, as learnable parameters one or more of the second latent representation, the hyperprior code, or both the second latent representation and the hyperprior code; modifying, by the computing system, the learnable parameters to reduce a loss function, the loss function evaluating one or both of: a difference between the current image frame and a decoded image frame generated from the entropy coding of the current image frame; and a probability of determining the second latent representation, given the first latent representation and the hyperprior code in order to reduce compression artifacts in the reconstructed content and to generate reconstructed content that closely resembles the structure of the original content (Rippel; [0006]-[0007], [0012]).

As to claim 6, the combination of Minnen, Bhorkar, Schroers and Rippel further discloses wherein modifying, by the computing system, the learnable parameters to reduce the loss function comprises: 
backpropagating, by the computing system, gradients for the learnable parameters over a number of iterations (Rippel; see backpropagation in [0038], [0040], [0050]-[0053]); and 
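updating, by the computing system, values for each of the learnable parameters at each iteration of the number of iterations (Rippel; see [0050], For one or more subsequent iterations, a forward pass step and a backpropagation step to update the parameters of the autoencoder 302 based on the autoencoder loss function);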
wherein during said modifying, all hyperprior decoder model and decoder model parameters are fixed (Rippel; [0050], parameters are fixed).
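For illustration only, the internal learning recited in claims 5 and 6 (learnable latent values, iterative gradient updates, fixed decoder parameters) can be sketched as follows. This is a minimal sketch under stated assumptions: the toy linear decoder, its weight values, the learning rate, and all function names are hypothetical, and only the reconstruction-difference term of the loss is modeled (the probability term is omitted).

```python
# Hypothetical sketch: per-frame optimization of a latent with a frozen decoder.

DECODER_WEIGHTS = (2.0, -1.0)  # frozen decoder parameters (illustrative values)

def decode(latent):
    """Toy linear decoder: one output sample per latent element."""
    return [w * y for w, y in zip(DECODER_WEIGHTS, latent)]

def reconstruction_loss(frame, latent):
    """Squared difference between the frame and its decoded reconstruction."""
    return sum((x - d) ** 2 for x, d in zip(frame, decode(latent)))

def internal_learning(frame, latent, steps=50, lr=0.05):
    """Gradient descent on the latent only; the decoder weights never change."""
    latent = list(latent)
    for _ in range(steps):
        decoded = decode(latent)
        # d(loss)/d(latent_i) = -2 * w_i * (x_i - w_i * latent_i)
        grads = [-2.0 * w * (x - d)
                 for w, x, d in zip(DECODER_WEIGHTS, frame, decoded)]
        # Update every learnable value at each iteration.
        latent = [y - lr * g for y, g in zip(latent, grads)]
    return latent

frame = [4.0, 3.0]
initial_latent = [0.0, 0.0]
refined_latent = internal_learning(frame, initial_latent)
```

The sketch treats the latent as the only learnable quantity, which is the distinction from ordinary training in which the network weights themselves are updated.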

As to claim 15, the combination of Minnen, Bhorkar and Schroers fails to explicitly disclose wherein performing internal learning comprises: 
setting as learnable parameters one or more of the latent representation for at least one image frame, the hyperprior code determined from said latent representation, or combinations thereof; and 
optimizing a loss function, the loss function evaluating one or both of: 
a difference between said one image frame and a decoded image frame generated from an entropy coding, wherein the entropy coding was generated from said one image frame; and 
a probability of determining the latent representation of said one image frame, given the latent representation of a prior image frame and the hyperprior code determined from said latent representation. 
However, Rippel teaches setting as learnable parameters one or more of the latent representation for at least one image frame, the hyperprior code determined from said latent representation, or combinations thereof (see [0037]-[0040], the parameters of the autoencoder 202); and 
optimizing a loss function (see [0039]-[0040], The compression system 130 may also determine the autoencoder loss including the reconstruction loss 230 and the negative of the discriminator loss 234, and repeatedly update the parameters of the autoencoder 202 by backpropagating error terms obtained from the autoencoder loss), the loss function evaluating one or both of: 
a difference between said one image frame and a decoded image frame generated from an entropy coding, wherein the entropy coding was generated from said one image frame (see autoencoder loss including reconstruction loss in [0037], [0039], [0041], [0054]-[0056], [0058], [0065]-[0066] and [0071]; see [0037], the reconstruction loss is determined based on a dissimilarity between the final reconstructed content and the training content); and 
a probability of determining the latent representation of said one image frame, given the latent representation of a prior image frame and the hyperprior code determined from said latent representation. 
At the time before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the combination of Minnen, Bhorkar and Schroers using Rippel’s teachings to include setting as learnable parameters one or more of the latent representation for at least one image frame, the hyperprior code determined from said latent representation, or combinations thereof; and optimizing a loss function, the loss function evaluating one or both of: a difference between said one image frame and a decoded image frame generated from an entropy coding, wherein the entropy coding was generated from said one image frame; and a probability of determining the latent representation of said one image frame, given the latent representation of a prior image frame and the hyperprior code determined from said latent representation (Rippel; [0006]-[0007], [0012]).

As to claim 16, the combination of Minnen, Bhorkar, Schroers and Rippel further discloses wherein optimizing the loss function comprises: 
backpropagating gradients for the learnable parameters over a number of iterations (Rippel; see backpropagation in [0038], [0040], [0050]-[0053]); and
updating values for each of the learnable parameters at each iteration of the number of iterations (Rippel; see [0050], For one or more subsequent iterations, a forward pass step and a backpropagation step to update the parameters of the autoencoder 302 based on the autoencoder loss function), wherein 
during optimization, all hyperprior decoder model, and decoder model parameters are fixed (Rippel; [0050], parameters are fixed). 


Claims 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Minnen et al (US 20200027247) in view of Bhorkar (US 20200244969) further in view of Rippel et al (US 20180174052).

As to claim 17, Minnen discloses a computing system (FIG. 1) comprising: 
one or more processors (see [0117]); and 
(see [0116]-[0122]; see FIG. 1, compression system 100), the operations comprising: 
obtaining, by the computing system, a training dataset comprising a plurality of sequential image frames (FIG. 1, input data 102; FIG. 7, step 702; see [0098]-[0099], The system receives the data to be compressed (702) … The system processes the data using an encoder neural network to generate a latent representation of the data (704); see [0059], input data (e.g., video data)); 
generating, by the computing system and using a machine-learned conditional entropy model (FIG. 1, hyper-decoder neural network 110 and entropy model neural network 112), a hyperprior code and an entropy code for at least two sequential image frames of the plurality of sequential image frames (FIG. 7, steps 706-708; see [0100]-[0101], The system processes the latent representation of the data using a hyper-encoder neural network to generate a latent representation of a conditional entropy model, i.e., a "hyper-prior" (706) … The system can entropy encode the quantized hyper-prior using, e.g., a pre-determined entropy model defined by one or more predetermined code symbol probability distributions; see [0059], input data (e.g., video data)); 
generating, by the computing system and using the machine-learned conditional entropy model, a reconstruction of the at least two sequential image frames based on the hyperprior code and the entropy code for the image frame (see [0059], [0062], [0080], [0085] and [0114], process the ordered collection of code symbols 120 to generate the reconstruction 202 approximating the input data). 
Minnen fails to explicitly disclose wherein the hyperprior code is indicative of differences between a current image frame and a previous sequential image frame, the previous sequential image frame occurring before the current image frame in the plurality of sequential image frames; evaluating, by the computing system, a loss function that evaluates a difference between the at least two sequential image frames and the reconstruction of the at least two sequential image frames; and modifying, by the computing system, one or more parameters of the machine-learned conditional entropy based at least in part on the loss function. 
However, Bhorkar teaches wherein the hyperprior code is indicative of differences between a current image frame and a previous sequential image frame, the previous sequential image frame occurring before the current image frame in the plurality of sequential image frames (see [0022], server 116 may first detect a correlation between visual properties of a first frame of the sequence of frames and a second frame of the sequence of frames, where the second frame comprises a next frame following the first frame in the sequence of frames, generate a first difference vector comprising a difference between a latent space representation of the second frame and a latent space representation of the first frame in response to detecting the correlation between the visual properties, where the latent space representation of the first frame and the latent space representation of the second frame are generated via an autoencoder; see [0047]).
At the time before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Minnen using Bhorkar’s teachings to include wherein the hyperprior code is indicative of differences between a current image frame and a previous sequential image frame, (Bhorkar; [0013]).
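For illustration only, the difference-vector generation quoted from Bhorkar ([0022]) can be sketched as follows. This is a minimal sketch under stated assumptions: the Pearson correlation test, the 0.9 threshold, the pair-averaging stand-in for an autoencoder, and all function names are hypothetical and are not taken from Bhorkar.

```python
import math

# Hypothetical sketch: send a latent difference vector only when consecutive
# frames are detected as correlated; otherwise send the full latent.

def pearson(a, b):
    """Pearson correlation between two equal-length pixel sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def toy_encode(frame):
    """Stand-in for an autoencoder's encoder: average neighbouring pixel pairs."""
    return [(frame[i] + frame[i + 1]) / 2 for i in range(0, len(frame) - 1, 2)]

def encode_second_frame(first, second, threshold=0.9):
    """Return ("difference", z2 - z1) for correlated frames, else ("full", z2)."""
    z1, z2 = toy_encode(first), toy_encode(second)
    if pearson(first, second) >= threshold:
        return "difference", [b - a for a, b in zip(z1, z2)]
    return "full", z2

first_frame = [10, 12, 20, 22]
next_frame = [11, 13, 21, 23]      # nearly identical content
scene_cut = [22, 10, 12, 20]       # unrelated content
```

Calling `encode_second_frame(first_frame, next_frame)` yields a small difference vector, whereas `encode_second_frame(first_frame, scene_cut)` falls back to the full latent.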
The combination of Minnen and Bhorkar fails to explicitly disclose evaluating, by the computing system, a loss function that evaluates a difference between the at least two sequential image frames and the reconstruction of the at least two sequential image frames; and modifying, by the computing system, one or more parameters of the machine-learned conditional entropy based at least in part on the loss function.
However, Rippel teaches evaluating, by the computing system, a loss function that evaluates a difference between the at least two sequential image frames and the reconstruction of the at least two sequential image frames (see autoencoder loss including reconstruction loss in [0037], [0039], [0041], [0054]-[0056], [0058], [0065]-[0066] and [0071]; see [0037], the reconstruction loss is determined based on a dissimilarity between the final reconstructed content and the training content); and 
modifying, by the computing system, one or more parameters of the machine-learned conditional entropy based at least in part on the loss function (see [0039]-[0040], The compression system 130 may also determine the autoencoder loss including the reconstruction loss 230 and the negative of the discriminator loss 234, and repeatedly update the parameters of the autoencoder 202 by backpropagating error terms obtained from the autoencoder loss). 
At the time before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the combination of Minnen and Bhorkar using (Rippel; [0006]-[0007], [0012]).

As to claim 18, the combination of Minnen, Bhorkar and Rippel further discloses wherein the machine-learned conditional entropy model comprises a hyperprior encoder model configured to, for the at least two sequential image frames, process a latent representation of the image frame and a latent representation of a previous image frame to generate the hyperprior code for the at least two sequential image frames (Minnen; FIG. 1, hyper-encoder neural network 108; [0069], The hyper-encoder neural network 108 is configured to process the latent representation 116 of the input data to generate a "hyper-prior" 122 (z) (sometimes called a "hyper-parameter"), that is, a latent representation of the conditional entropy model; see [0059], input data (e.g., video data)). 

As to claim 19, the combination of Minnen, Bhorkar and Rippel further discloses wherein the machine-learned conditional entropy model comprises a hyperprior decoder model configured to, for the at least two sequential image frames, process the hyperprior code for the image frame and a latent representation of the previous image frame to generate one or more conditional probability parameters for performing entropy coding of the image frame (Minnen; FIG. 2, hyper-decoder neural network 110; [0071], The hyper-decoder neural network 110 is configured to process the quantized hyper-prior 124 to generate a hyper-decoder output 128 (Ψ), and the entropy model neural network 112 is configured to process the hyper-decoder output 128 to generate the conditional entropy model; see [0059], input data (e.g., video data)).

As to claim 20, the combination of Minnen, Bhorkar and Rippel further discloses wherein the one or more conditional probability parameters comprise Gaussian mixture model values (Minnen; [0072]).
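For illustration only, the use of Gaussian mixture model values as conditional probability parameters for entropy coding can be sketched as follows. This is a minimal sketch under stated assumptions: the parameters are taken to be mixture weights, means, and scales; the latent is assumed to be quantized to integers; and the function names and numeric values are hypothetical rather than drawn from Minnen.

```python
import math

# Hypothetical sketch: probability and ideal code length of an integer-quantized
# latent symbol under a Gaussian mixture with given weights, means, and scales.

def normal_cdf(x, mean, scale):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))

def gmm_symbol_probability(symbol, weights, means, scales):
    """Mixture probability mass on the interval [symbol - 0.5, symbol + 0.5]."""
    return sum(
        w * (normal_cdf(symbol + 0.5, m, s) - normal_cdf(symbol - 0.5, m, s))
        for w, m, s in zip(weights, means, scales)
    )

def code_length_bits(symbol, weights, means, scales):
    """Ideal entropy-coding cost of the symbol, in bits."""
    return -math.log2(gmm_symbol_probability(symbol, weights, means, scales))

weights = (0.7, 0.3)   # mixture weights (sum to 1)
means = (0.0, 4.0)
scales = (1.0, 2.0)
bits_likely = code_length_bits(0, weights, means, scales)
bits_unlikely = code_length_bits(10, weights, means, scales)
```

A symbol near a mixture mean receives a high probability and therefore a short code, which is why better-predicted probability parameters lower the bit rate.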
Response to Arguments
Applicant’s amendments and arguments, filed on 10/07/2021, with respect to the rejection(s) of claim(s) 1, 9, 11 and 17 under 103 have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of Bhorkar (US 20200244969).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
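A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.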
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BOUBACAR ABDOU TCHOUSSOU whose telephone number is (571)272-7625. The examiner can normally be reached M-F 8am-4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chris Kelley, can be reached at (571)272-7331. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BOUBACAR ABDOU TCHOUSSOU/Examiner, Art Unit 2482