DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Applicant(s) Response to Official Action
The response filed on 02/09/2021 has been entered and made of record.

Claim Rejections - 35 USC § 102 and 35 USC § 103
Presented arguments have been fully considered, but are rendered moot in view of the new ground(s) of rejection necessitated by amendment(s) initiated by the applicant(s).


Notice re prior art available under both pre-AIA and AIA
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.


Claims 1-7, 9, and 11-22 are rejected under 35 U.S.C. 103 as being unpatentable over Covell et al. (US20200111238A1) (hereinafter Covell).
Note: The NPL document “G. Toderici, D. Vincent, N. Johnston, S. J. Hwang, D. Minnen, J. Shor, and M. Covell, "Full resolution image compression with recurrent neural networks," CoRR, vol. abs/1608.05148, 2016” (hereinafter referred to as Toderici) is incorporated by reference in Covell (see Covell: Para 0048, 0051, 0053), and the NPL documents “Deepak Pathak, Philipp Krahenbuhl, Jeff Donahue, Trevor Darrell, and Alexei Efros, in "Context encoders: Feature learning by Inpainting," in CVPR, 2016” (hereinafter referred to as Pathak) and “Raiko, M. Berglund, G. Alain, and L. Dinh, in "Techniques for learning binary stochastic feedforward neural networks," ICLR, 2015” (hereinafter referred to as Berglund) are incorporated by reference in Covell (see Covell: Para 0066). These references are treated as part of the specification of Covell (see MPEP § 2163.07(b)).

Regarding Claim 1, Covell meets the limitations as follows: 
A method comprising: 
identifying residual video data resulting from a compression of original video data; [i.e. residual between the true image tile and the initial prediction; para 0049, determines a residual image between the particular input tile and the output tile generated by the spatial context prediction neural network 507; Fig. 5b, Para 0070]
encoding the residual video data, [i.e. the encoder 110 encodes the residual between the true image tile and the initial prediction.; Fig. 1, Para 0049; Fig. 5b, Para 0070] by downsizing a feature map of the residual video data, [i.e. extract features from the input residual image followed by three convolutional LSTM layers that reduce the spatial resolution and generate feature maps.; Para 0065, 0056] to create encoded residual video data; [i.e. encoder 110 encodes the residual between the true image tile and the initial prediction, and the encoder executes the encoding iteratively. At each iteration, the encoder 110 receives an encoder input for the iteration and encodes the encoder input to generate a set of binary codes for each tile of the input image that represent the encoded version of the tile.; Para 0049, also refer Para 0009]
binarizing the encoded residual video data, by converting the downsized feature map into a binary feature map, to create binarized residual video data; [i.e. At each time step, the residual encoder network receives an encoder input for the time step and processes the input using an encoder 110 to generate a set of binary codes for the time step.; Fig. 4, Para 0060, and The encoder input for time steps after the first time step are temporary residual images between (i) the residual and (ii) a reconstruction generated by the decoder neural network from the set of binary codes at the previous time step; Para 0064, and The encoder portion of the network uses one convolutional layer to extract features from the input residual image followed by three convolutional LSTM layers that reduce the spatial resolution and generate feature maps. Weights are shared across all iterations and the recurrent connections allow information to propagate from one iteration to the next.; Para 0065, and The binary bottleneck layer 402 maps incoming features to (-1, 1) using a 1×1 convolution followed by a tanh activation function. And when the system applies the trained network to real images, the system binarizes deterministically (b=sign(tanh(x)) with b=1 when x=0).; Para 0066, and The system generates the set of binary codes for a particular tile at each time step by determining whether a quality threshold for the particular tile has been satisfied when the particular tile is reconstructed from the binary codes already generated at the current time step or any previous time steps.; Para 0068]
compressing the binarized residual video data to create compressed residual video data; [i.e. A residual network processes the residual image tile to compress the residual within each tile.; Para 0058, and In the case where the spatial context predictor is not able to recover many image details, reconstruction quality can be improved by compression and reconstruction of the residual images with a recurrent auto-encoder architecture. In each iteration, the residual encoder, extracts features from the input and quantizes them to generate 128 bits. FIG. 4 shows four iterations.; Para 0059] and 
transmitting the compressed residual video data.  [i.e. the residual encoder network receives an encoder input for the time step and processes the input using an encoder 110 to generate a set of binary codes for the time step.; Para 0060, and The system compresses the input image by compressing the binary codes in the encoded representation using a data compression algorithm, e.g., a trained entropy coder. The system may transmit the compressed input image to an image decoder; para 0061]
Covell describes the invention in terms of several embodiments and further discloses that features described in separate embodiments may be combined in one combination or separated into a subcombination or a variation of a subcombination. [Para 0006, 0083] It would therefore have been apparent to a person of ordinary skill in the art that steps may be added to the disclosed method and that existing steps may be removed, modified, or rearranged. Accordingly, steps disclosed in different embodiments may be combined and modified to arrive at the claimed invention.
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system disclosed by Covell in order to encode an image with constant quality. [Covell: Para 0026]

Regarding Claim 2, Note the Rejection for claim 1, wherein Covell further discloses
The method of claim 1, wherein the compression of the original video data results in compressed original video data, and wherein the residual video data is determined by: decompressing the compressed original video data to result in decompressed original video data, and computing a difference between the decompressed original video data and the original video data. [i.e. At the first iteration, the encoder input is the residual between the true image tile and the initial prediction. The encoder 110 encodes the residuals to create binary codes and uses a decoder 114 to reconstruct the input from the binary to capture residual remaining from the previous iteration. The decoded pixel values are stored 122 and used as context for predicting subsequent tiles. At each iteration after the first iteration, the encoder input is a residual tile from the preceding iteration.; Para 0049, The encoder input for time steps after the first time step are temporary residual images between (i) the residual and (ii) a reconstruction generated by the decoder neural network from the set of binary codes at the previous time step.; Para 0064]  
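For illustration only, the residual determination recited in claim 2 (decompress the compressed data, then take the difference from the original) can be sketched as follows. The quantizing codec, the function names, and the sample pixel values are hypothetical stand-ins chosen for this sketch; they are not taken from Covell or from the instant application.

```python
def lossy_compress(frame, step=16):
    """Hypothetical lossy codec: coarse quantization of pixel values."""
    return [p // step for p in frame]


def decompress(codes, step=16):
    """Map each quantization index back to the centre of its bin."""
    return [c * step + step // 2 for c in codes]


def compute_residual(original, decompressed):
    """Residual = original minus decompressed, element by element."""
    return [o - d for o, d in zip(original, decompressed)]


original = [0, 17, 100, 255]
decompressed = decompress(lossy_compress(original))
residual = compute_residual(original, decompressed)
print(decompressed)  # [8, 24, 104, 248]
print(residual)      # [-8, -7, -4, 7]
```

The residual carries only what the lossy step discarded, which is why it can typically be represented with fewer bits than the original data.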

Regarding Claim 3, Note the Rejection for claim 1, wherein Covell further discloses
The method of claim 1, wherein encoding the residual video data includes compressing the residual video data.  [i.e. At each time step, the residual encoder network receives an encoder input for the time step and processes the input using an encoder 110 to generate a set of binary codes for the time step, where the encoder input is the residual image; Para 0060, and the system compresses the input image by compressing the binary codes in the encoded representation using a data compression algorithm; Para 0061]

Regarding Claim 4, Note the Rejection for claim 1, wherein Covell further discloses
The method of claim 1, wherein the residual video data is encoded utilizing a binary residual autoencoder. [i.e. residual network 400 that compresses and reconstructs residual images. The residual network 400 uses a recurrent auto-encoder architecture; Para 0059] 

Regarding Claim 5, Note the Rejection for claim 1, wherein Covell further discloses
The method of claim 1, wherein the encoded residual video data is binarized utilizing a binary residual autoencoder.  [i.e. residual network 400 uses a recurrent auto-encoder architecture; Para 0059, and At each time step, the residual encoder network receives an encoder input for the time step and processes the input using an encoder 110 to generate a set of binary codes for the time step.; Para 0060] 

Regarding Claim 6, Note the Rejection for claim 1, wherein Covell further discloses
The method of claim 5, wherein the binary residual autoencoder includes a binarizer that transforms frame data of the encoded residual video data into a compressible binary bitstream.  [i.e. A residual network processes the residual image tile to compress the residual within each tile.; Para 0058, and reconstruction quality can be improved by compression and reconstruction of the residual images with a recurrent auto-encoder architecture. In each iteration, the residual encoder, extracts features from the input and quantizes them to generate 128 bits.; Para 0059, and The system compresses the input image by compressing the binary codes in the encoded representation using a data compression algorithm,; para 0061]

Regarding Claim 7, Note the Rejection for claim 1, wherein Covell further discloses
The method of claim 6, wherein the binarizer is trained and implemented utilizing a tanh activation.  [i.e. The binary bottleneck layer 402 maps incoming features to (-1, 1) using a 1×1 convolution followed by a tanh activation function … the system binarizes deterministically (b=sign(tanh(x)) with b=1 when x=0); Para 0066, and uses a tanh activation to map the features to three values in the range [-1, 1]; Para 0067]
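For illustration only, the deterministic binarization quoted from Covell Para 0066, b = sign(tanh(x)) with b = 1 when x = 0, can be sketched as follows. The function names, the sample feature values, and the mapping of -1/+1 onto bits 0/1 are assumptions made for this sketch and do not appear in Covell.

```python
import math


def binarize(features):
    """Deterministic binarizer: b = sign(tanh(x)), with b = 1 when x = 0."""
    codes = []
    for x in features:
        t = math.tanh(x)                 # squashes the feature into (-1, 1)
        codes.append(1 if t >= 0 else -1)
    return codes


def to_bits(codes):
    """Assumed packing convention for this sketch: -1 -> '0', +1 -> '1'."""
    return "".join("1" if b == 1 else "0" for b in codes)


codes = binarize([-2.0, -0.1, 0.0, 0.3, 5.0])
print(codes)           # [-1, -1, 1, 1, 1]
print(to_bits(codes))  # 00111
```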

Regarding Claim 9, Note the Rejection for claim 1, wherein Covell further discloses
The method of claim 6, wherein the binarized residual video data is compressed utilizing a binary residual autoencoder. [i.e. a given tile may be represented using fewer bits or a less complex string of bits that can be compressed to a smaller size due to the use of the spatial context predictor neural network and because the compressed bits only need to represent the residual rather than the entire tile.; Para 0022, The residual network 130 is a deep network based on recurrent auto-encoders. The residual network 130 includes an encoder 110; Para 0048, and residual network 400 that compresses and reconstructs residual images. The residual network 400 uses a recurrent auto-encoder architecture; Fig. 4, Para 0059, system generates a set of binary codes for the particular tile by encoding the residual image using an encoder neural network 508, Fig. 4, para 0070]
 
Regarding Claim 11, Note the Rejection for claim 1, wherein Covell further discloses
The method of claim 1, wherein the encoding, binarizing, and compressing are domain-specific, such that: a binary residual autoencoder performs the encoding, binarizing, and compressing of the residual video data; the binary residual autoencoder is trained in a domain-specific manner; and a constrained amount of possible outcomes are used during a training of the binary residual autoencoder.  [Please refer to the mapping, explanation and pertinence of the prior art reference given in claim 1. Covell further discloses that Training data may be context image patches, and image patches may be cropped from a collection of images; Para 0053. Further, Toderici discloses using a “High Entropy (HE)” dataset for training; Pg. 3310, c. 1, section 4, Para 001. Pathak further discloses Given an image with a missing region (e.g., Fig. 1a), we train a convolutional neural network to regress to the missing pixel values (Fig. 1d). We call our model context encoder,; Fig. 1, Pg. 2536, c. 2, Para 003. Note that this interpretation is consistent with the specification of the instant application, Para 0024]
 
Regarding Claim 12, Note the Rejection for claim 1, wherein Covell further discloses
The method of claim 1, wherein the compression of the original video data results in compressed original video data, and wherein the compressed residual video data is transmitted with the compressed original video data to a receiver.  [i.e. the encoder executes the encoding iteratively. … At the first iteration, the encoder input is the residual between the true image tile and the initial prediction (i.e. compressed original video). The encoder 110 encodes the residuals to create binary codes and uses a decoder 114 to reconstruct the input from the binary to capture residual remaining from the previous iteration. The decoded pixel values are stored 122 and used as context for predicting subsequent tiles. At each iteration after the first iteration, the encoder input is a residual tile from the preceding iteration. By reconstructing the tile and capturing the residual remaining from the previous iteration (i.e. compressed residual video data); Para 0009, 0049, and once a tile is encoded, the residual network may send the encoded residuals 142, i.e., binary codes or compressed binary codes, to a decoder 128 for decoding; Para 0050, and The system compresses the input image by compressing the binary codes in the encoded representation using a data compression algorithm, e.g., a trained entropy coder. The system may transmit the compressed input image to an image decoder for decompression of the input image; Para 0061. Therefore, compressed residual video data is transmitted with the compressed original video data to a receiver.]

Regarding Claim 13, Note the Rejection for claim 1, wherein Covell further discloses
The method of claim 1, further comprising: decompressing the original video data that has been compressed at a client device to create decompressed original video data; [i.e. the decoding components, i.e., those components necessary to re-construct the input image, can be located on a client device.]
reconstructing the compressed residual video data at the client device to create reconstructed residual video data; [i.e. The decoder 128 iteratively decodes the binary codes to obtain the residuals between the actual decoded tile and the predicted tile from the spatial context predictor 124.] and 
adding the reconstructed residual video data back to the decompressed original video data to create output video data.  [i.e. A combiner 132 then combines the decoded residual and the predicted tile to obtain the full reconstruction of the tile 150.; Para 0051]
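For illustration only, the combining step quoted from Covell Para 0051 (the decoded residual is combined with the predicted tile to obtain the full reconstruction) can be sketched as an element-wise addition. The function name and the sample tile values are hypothetical, and treating the combiner as plain addition is an assumption of this sketch.

```python
def combine(predicted_tile, decoded_residual):
    """Add the decoded residual back onto the prediction, element by element."""
    if len(predicted_tile) != len(decoded_residual):
        raise ValueError("tile and residual must have the same length")
    return [p + r for p, r in zip(predicted_tile, decoded_residual)]


predicted_tile = [120, 64, 200, 33]   # hypothetical predicted pixel values
decoded_residual = [3, -2, 0, 5]      # hypothetical decoded residual
output_tile = combine(predicted_tile, decoded_residual)
print(output_tile)  # [123, 62, 200, 38]
```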

Regarding Claim 21, Note the Rejection for claim 1 and 12, wherein Covell further discloses
The method of claim 12, wherein the compressed residual video data is transmitted with the compressed original video data over a single channel.  [Refer to the mapping, explanation and pertinence of the prior art reference given in claims 1 and 12. It was well known to a person of ordinary skill in the art before the effective filing date of the claimed invention that a compressed image is typically transmitted over a single channel unless explicitly stated otherwise. Covell does not explicitly or implicitly indicate that the compressed image is transmitted over multiple channels. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to understand that the compressed image is transmitted over a single channel.]

Regarding claim 22, Covell meets the claim limitations as set forth in claims 1, 12 and 21.
The method of claim 12, wherein the compressed residual video data is transmitted with the compressed original video data for use in creating output video data. [Please refer to the mapping, explanation and pertinence of the prior art reference given in claims 1, 12 and 21. Note: the current claim may contain different terminology or additional claim terms compared to the referenced claim(s). However, the explanation and pertinence of the prior art reference provided for the referenced claim(s) address any differing claim limitations. Therefore, that explanation is not duplicated here, and applicant is requested to refer to the explanation and pertinence of the prior art reference provided for the referenced claim(s).]

Regarding claims 14, 15-19 and 20, the claims recite limitations analogous to those of claims 1, 3-7 and 1 above, respectively, and are therefore rejected on the same premise. Therefore, regarding claims 14, 15-19 and 20, Covell meets the claim limitations as set forth in claims 1, 3-7 and 1, respectively. [Please refer to the mapping, explanation and pertinence of the prior art reference given in claims 1, 3-7 and 1, respectively. Note: the current claims may contain different terminology or additional claim terms compared to the referenced claim(s). However, the explanation and pertinence of the prior art reference provided for the referenced claim(s) address any differing claim limitations. Therefore, that explanation is not duplicated here, and applicant is requested to refer to the explanation and pertinence of the prior art reference provided for the referenced claim(s).]
 

Claims 8 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Covell et al. (US20200111238A1) (hereinafter Covell) and further in view of EL-YANIV et al. (US20170286830A1) (hereinafter EL-YANIV).
Note: The NPL document “G. Toderici, D. Vincent, N. Johnston, S. J. Hwang, D. Minnen, J. Shor, and M. Covell, "Full resolution image compression with recurrent neural networks," CoRR, vol. abs/1608.05148, 2016” (hereinafter referred to as Toderici) is incorporated by reference in Covell (see Covell: Para 0048, 0051, 0053), and the NPL documents “Deepak Pathak, Philipp Krahenbuhl, Jeff Donahue, Trevor Darrell, and Alexei Efros, in "Context encoders: Feature learning by Inpainting," in CVPR, 2016” (hereinafter referred to as Pathak) and “Raiko, M. Berglund, G. Alain, and L. Dinh, in "Techniques for learning binary stochastic feedforward neural networks," ICLR, 2015” (hereinafter referred to as Berglund) are incorporated by reference in Covell (see Covell: Para 0066). These references are treated as part of the specification of Covell (see MPEP § 2163.07(b)).

Regarding Claim 8, Note the Rejection for claim 1 and 6, wherein Covell further discloses
The method of claim 6, wherein the binarizer is trained [i.e. includes a training engine 116 for the spatial context predictor 108 and a training engine 118 for the residual encoding network 130; Para 0052, and the system binarizes deterministically (b=sign (tanh(x)) with b=1 when x=0).; Para 0066] and …  
Covell does not explicitly disclose the following claim limitations:
… implemented utilizing a hardtanh activation.
However, in the same field of endeavor EL-YANIV discloses the deficient claim limitations, as follows:
… implemented utilizing a hardtanh activation. [i.e. neural network may be any DNN, including any feed-forward artificial neural network such as a convolutional neural network (CNN), fully connected neural network (FNN) and/or recurrent neural network (RNN); Para 0050, and linear activation function such as Htanh(x) = Clip(x, -1, 1); Para 0081]
Covell discloses a neural network that is trained using a tanh activation. EL-YANIV discloses a neural network, which may be an RNN, that uses the hardtanh activation function for training, which is pertinent to the problem with which the applicant was concerned. Therefore, combining the teachings of Covell with those of EL-YANIV would provide an expected result, thereby resulting in the claimed invention.
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system disclosed by Covell by adding the teachings of EL-YANIV as above, in order to provide a method for training neural networks. [EL-YANIV: Para 0005]
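For illustration only, the hardtanh activation quoted from EL-YANIV Para 0081, Htanh(x) = Clip(x, -1, 1), can be compared against the tanh activation of Covell as follows. The function names and the sample inputs are assumptions of this sketch. Both activations map into the interval [-1, 1] and preserve the sign of the input, so a sign-based binarizer yields the same codes with either one.

```python
import math


def hardtanh(x):
    """Htanh(x) = Clip(x, -1, 1): linear inside [-1, 1], saturated outside."""
    return max(-1.0, min(1.0, x))


def sign_code(value):
    """Sign-based binarization with the convention b = 1 at zero."""
    return 1 if value >= 0 else -1


samples = [-3.0, -0.5, 0.0, 0.5, 2.0]
hard = [hardtanh(x) for x in samples]
same_codes = all(
    sign_code(hardtanh(x)) == sign_code(math.tanh(x)) for x in samples
)
print(hard)        # [-1.0, -0.5, 0.0, 0.5, 1.0]
print(same_codes)  # True
```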

Regarding Claim 10, Note the Rejection for claim 1 and 9, wherein Covell further discloses
The method of claim 9, wherein the binary residual autoencoder … that compresses a binary bitstream produced by the binarizer in order to reduce an amount of data to be transmitted.  [i.e. compressing the binary codes in the encoded representation using a data compression algorithm, e.g., a trained entropy coder.; Para 0061]
Covell does not explicitly disclose the following claim limitations:
… includes a Huffman encoder …
However, in the same field of endeavor EL-YANIV discloses the deficient claim limitations, as follows:
… includes a Huffman encoder [i.e. provided a method for training neural networks; para 0050, and neural network may be any DNN, including any feed-forward artificial neural network such as a convolutional neural network (CNN), fully connected neural network (FNN) and/or recurrent neural network (RNN); Para 0050, and To test the strength of the above described training method, … applying Huffman codes; Para 0105]…
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system disclosed by Covell by adding the teachings of EL-YANIV as above, in order to provide a method for training neural networks. [EL-YANIV: Para 0005]
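For illustration only, Huffman coding of a binary bitstream, as recited in claim 10, can be sketched as follows. This is a generic textbook Huffman coder, not the entropy coder of Covell or the procedure of EL-YANIV; the 4-bit symbol grouping, the function names, and the sample bitstream are assumptions of this sketch.

```python
import heapq
from collections import Counter


def huffman_table(symbols):
    """Build a prefix-free code table {symbol: bitstring} from symbol counts."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate single-symbol input
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tiebreak, partial table); the unique tiebreak
    # ensures the dictionaries themselves are never compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)      # two least frequent subtrees
        w2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, table):
    return "".join(table[s] for s in symbols)


def decode(bits, table):
    inverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:               # prefix-free, so first match wins
            out.append(inverse[current])
            current = ""
    return out


bitstream = "0000" * 6 + "1111" + "0001"     # hypothetical binarizer output
chunks = [bitstream[i:i + 4] for i in range(0, len(bitstream), 4)]
table = huffman_table(chunks)
encoded = encode(chunks, table)
print(len(bitstream), len(encoded))          # 32 10
```

In practice the code table (or the symbol statistics) must also be available to the receiver, so the saving shown here applies to the payload only.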



Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 


Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EXAMINER, DAKSHESH PARIKH, whose telephone number is (571) 272-2777.  The examiner can normally be reached on EXAMINER SCHEDULE.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, SPE, SATH V. PERUNGAVOOR, can be reached on (571) 272-7455.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
 
/DAKSHESH D PARIKH/Primary Examiner, Art Unit 2488