The present application is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION

Notice of Pre-AIA or AIA Status
This Office action is in response to the application filed on 12/15/2017.
Claims 1-40 are presented for examination.

Priority
Applicant’s claims for the benefit of prior-filed U.S. Provisional Patent Application 62/434,600 filed on 12/15/2016, U.S. Provisional Patent Application 62/432,602 filed on 12/15/2016, U.S. Provisional Patent Application 62/432,603 filed on 12/15/2016 and U.S. Provisional Patent Application 62/458,749 filed on 2/14/2017 are acknowledged and admitted.  Receipt is acknowledged of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file.
Information Disclosure Statement
The information disclosure statements submitted on 6/06/2018 and 1/16/2020 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are considered by the examiner.
Drawings
The Drawings filed on 12/15/2017 are acceptable for examination purposes.
Specification
The Specification filed on 12/15/2017 is acceptable for examination purposes.

Claim Objections
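Claims 1, 11, 21 and 31 are objected to because of the following informalities:
Claim 1 line 9 recites, in part, “the machine task output label”. Examiner suggests changing to “the machine learning task output label” by adding ‘learning’.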
Claim 1 line 14 recites “where the encoder portion terminates and before the decoder portion”. Examiner suggests changing to “where the encoder portion terminates before the task portion” by removing ‘and’ and replacing decoder with ‘task’ to reference the task portion recited on line 6.  
Claim 11 lines 8-9 recite, in part, “in comparison to the machine task output label”. There is a lack of antecedent basis for “the machine task output label” on lines 8-9. Examiner suggests changing to “in comparison to a machine learning task output label of a training example” similar to claim 1 lines 9-10.
Claim 21 line 4 recites, in part, “obtain training examples”. Examiner suggests changing to “obtaining training examples”.
Claim 21 lines 4-5 and 9 recite, in part, “and a machine task output label” and “in comparison to the machine task output label”. Examiner suggests changing “machine task output label” to “machine learning task output label” as in claims 1 and 11.
Claim 21 line 6 recites, in part, “training a neural network”. Examiner suggests changing to “training the neural network” to reference lines 1-2, “a neural network”.
Claim 31 line 4 recites, in part, “obtain training examples”. Examiner suggests changing to “obtaining training examples”.
Claim 31 lines 4-5 and 9 recite, in part, “and a machine task output label” and “in comparison to the machine task output label”. Examiner suggests changing “machine task output label” to “machine learning task output label” as in claims 1 and 11.
Appropriate correction is required.


Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 1, 7-8, 11, 17-18, 21, 27-28, 31 and 37-38 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Toderici et al. (US 10192327 B1, hereinafter Toderici).

Regarding claim 1, 
Toderici discloses a method comprising (Toderici Pg. 8, Col. 1 recites “This specification describes methods and systems, including computer programs encoded on computer storage media, for performing image compression across different compression rates on images of arbitrary size using recurrent neural networks.”): 
obtaining training examples, each training example comprising acquired data and a machine learning task output label (Toderici Pg. 12, Col. 10 recites, in part, “For example, the neural network system 100 may be trained on a set of training data by processing training inputs included in the set of training data to generate corresponding outputs. The generated outputs may then be compared to known training outputs included in the set of training data...” Trained on set of training data with training inputs and known training outputs (i.e. obtain training examples with acquired data and output labels)); 
training a neural network using one or more error terms obtained from a loss function to update a set of parameters of the neural network (Toderici Pg. 12, Col. 10 recites, in part, “The generated outputs may then be compared to known training outputs included in the set of training data by computing loss functions and backpropagating loss function gradients with respect to current neural network parameters to determine an updated set of neural network parameters that minimizes the loss functions.” Computing loss functions and backpropagating for updating neural network parameters (i.e. using 1 or more error terms to update neural network parameters)), 
where the neural network comprises an encoder portion and a task portion, and where the loss function comprises (Toderici fig. 1 & Pg. 10, Col. 5 recites “The neural network system 100 includes an encoder network 102, a binarizer 104, a decoder network 106, and a residual error calculator 108.” The neural network system includes an encoder and a decoder network and a residual error calculator (i.e. neural network comprises an encoder portion, a task portion and loss function)): 
a first loss describing an accuracy of a machine learning task output predicted by the neural network in comparison to the machine task output label of the training example (Toderici Pg. 11, Cols. 7-8 recites, in part, “The neural network system 100 may be configured to iteratively repeat the above described process using the residual error 122 as a subsequent neural network system input, i.e., processing the residual error 122 using the encoder network 102, binarizer 104, decoder network 106 and residual error calculator 108 to generate a subsequent residual error.” Additionally, Pg. 11 Col. 8 “In some implementations, the accuracy of an image compression may increase as more bits are processed by the neural network system 100, e.g., as more iterations of the process are performed by the neural network system 100. However, more iterations of the process reduce the compression rate, incurring a tradeoff between image quality and compression rate as described above.” Residual error processing (i.e. a first loss describing accuracy)), and 
a second loss describing an encoding efficiency of compressed codes generated from a compressed representation of the acquired data of the training example output by an intermediate layer of the neural network where the encoder portion terminates and before the decoder portion (Toderici Pg. 11, Cols. 7-8 recite “The image compression rate is determined by the number of bits generated by the binarizer 104 at each iteration, and the number of iterations. For example, for a fixed number of iterations, increasing the number of bits generated by the binarizer 104 at each iteration may improve the image compression rate (although in some implementations this may involve re training the neural network system for each iteration). Alternatively, for a fixed number of bits generated by the binarizer 104 at each iteration, increasing the number of iterations may reduce the image compression rate.” The compression rate (i.e. second loss describing an encoding efficiency of compressed codes generated from a compressed representation.) Additionally, Pg. 11, Col. 8 “In cases where the encoder network 102 and decoder network 106 respectively include one or more recurrent network components, e.g., one or more recurrent neural network layers, the neural network system 100 may be configured to iteratively repeat the above described process at each of multiple time steps.” One or more recurrent neural network layers (i.e. intermediate layer)); and 
storing the set of parameters of the neural network on a computer readable medium (Toderici Pg. 8, Col. 1 recites “In particular, a recurrent neural network can use some or all of the internal state of the network from a previous time step in computing an output at a current time step. An example of a recurrent neural network is a Long Short-Term Memory (LSTM) neural network that includes one or more LSTM memory blocks. Each LSTM memory block can include one or more cells that each include an input gate, a forget gate, and an output gate that allow the cell to store previous states for the cell, e.g., for use in generating a current activation or to be provided to other components of the LSTM neural network.” LSTM memory blocks and store internal state of the network (i.e. store set of parameters of the neural network on computer readable medium)).
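
For illustration only, the two-term loss recited in claim 1 can be sketched as below. This is a minimal sketch under stated assumptions, not a characterization of Applicant's implementation or of Toderici: the function names, the use of cross-entropy for the first loss, the empirical-entropy estimate of code length for the second loss, and the weight rate_weight are all hypothetical.

```python
import math

def task_loss(predicted_probs, label_index):
    """First loss (assumed form): cross-entropy of the predicted distribution against the label."""
    return -math.log(predicted_probs[label_index])

def code_length_bits(code_bits):
    """Second loss (assumed form): empirical-entropy estimate, in bits, of a binary code."""
    n = len(code_bits)
    ones = sum(code_bits)
    total = 0.0
    for count in (ones, n - ones):
        if count:  # skip an absent symbol to avoid log2(0)
            total -= count * math.log2(count / n)
    return total

def combined_loss(predicted_probs, label_index, code_bits, rate_weight=0.01):
    """Weighted sum of the task-accuracy term and the encoding-efficiency term."""
    return task_loss(predicted_probs, label_index) + rate_weight * code_length_bits(code_bits)
```

In this sketch a balanced code such as [0, 1, 0, 1] is charged one bit per symbol, while a constant code is charged nothing, so the second term rewards codes that an entropy coder could shorten.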


[Figure: media_image1.png]

Regarding claim 7,
Toderici discloses the method of claim 1, wherein the neural network further comprises one or more additional encoder portions (Toderici fig. 1 & Pg. 13, Col. 11 recites “As described above with reference to FIG. 1, the encoder network 102 may include one or more recurrent network components, e.g., one or more recurrent neural network layers.” One or more recurrent network components (i.e. one or more additional encoder portions) Additionally, Pg. 10, Col. 5 recites “For convenience, the binarizer 104 of FIG. 1 is shown as being separate to the encoder network 102, however in some implementations the binarizer 104 may be included in the encoder network 102.”), 
wherein nodes in an output layer of each of the one or more encoder portions serve as nodes in an input layer of the task portion of the neural network (Toderici fig. 2 & Pg. 13, Col. 11 recites “For example, as shown in FIG. 2, in some implementations the first stack of neural network layers may include one convolutional neural network layer 206 followed by two stacked LSTM layers. As another example, in some implementations the first stack of neural network layers may include one or more LSTM neural network layers and one or more convolutional LSTM neural network layers.” Additionally, fig. 3 & Pg. 13, Col. 12 recites “The second stack of neural network layers may further include one or more non-LSTM neural network layers, e.g., convolutional neural network layer 310. In some implementations the second stack of neural network layers may include gated recurrent neural network layers. In addition, in some implementations the second stack of neural network layers may include one or more fully connected neural network layers.” The output layer of the 1st stack in the encoder network connected to the input layer of the 2nd stack in the decoder network (i.e. nodes in output layer of the encoder portion serves as nodes in an input layer of the task portion)).


[Figure: media_image2.png]

[Figure: media_image3.png]





Regarding claim 8,
Toderici discloses the method of claim 7, wherein for each of the additional encoder portions of the neural network, the additional encoder portion is trained using (Toderici Pg. 10, Col. 5 recites “In some implementations the encoder network 102 includes one or more recurrent network components, e.g., one or more recurrent neural network layers, that are configured to process data representing a sequence of input images for respective time steps. In these implementations, the encoder network 102 is configured to generate data representing a sequence of encoded representations of the input images.” One or more recurrent network components (i.e. additional encoder portions)): 
the first loss describing the accuracy of the machine learning task output, and an additional loss describing an encoding efficiency of compressed codes generated from a representation outputted by the additional encoder portion (Toderici Pg.10, Col. 5 recites “The decoder recurrent neural network constructs an estimate of the original input image based on the received binary code. The procedure is iteratively repeated with a residual error, i.e., the difference between the original image and the estimation from the decoder recurrent neural network. The neural network system weights are shared between iterations, and the internal states in the recurrent neural networks are propagated to the next iteration. Therefore, residual errors are encoded and decoded in different contexts in different iterations. The image compression rate is determined by the number of bits in the binary code generated at each iteration and by the total number of iterations performed by the system.” Residual error and compression rate determined by number of bits in binary code generated at each iteration (i.e. first loss describing the accuracy and additional loss describing encoding efficiency)).
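
The iterative residual procedure quoted above, in which the number of bits per iteration and the number of iterations together determine the compression rate, can be illustrated with a toy example. The following is a hedged sketch only: it replaces Toderici's recurrent encoder, binarizer and decoder with a hypothetical one-bit sign quantizer and a halving step size, and the name residual_coding and its parameters are assumptions made for illustration.

```python
def residual_coding(signal, iterations, step=0.5):
    """Toy progressive coder: each pass emits one sign bit per element of the current residual."""
    reconstruction = [0.0] * len(signal)
    bits_used = 0
    for _ in range(iterations):
        # Residual between the original signal and the current estimate.
        residual = [s - r for s, r in zip(signal, reconstruction)]
        # Stand-in for the binarizer: one bit per element.
        bits = [1 if e >= 0 else 0 for e in residual]
        # Stand-in for the decoder: refine the estimate using the received bits.
        reconstruction = [r + (step if b else -step) for r, b in zip(reconstruction, bits)]
        bits_used += len(bits)
        step /= 2
    max_error = max(abs(s - r) for s, r in zip(signal, reconstruction))
    return bits_used, max_error
```

Running the sketch on a short signal with values in [-1, 1] shows the tradeoff described in the quoted passage: each additional iteration spends more bits and, for such inputs, tightens the bound on the reconstruction error.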

Regarding claims 11 and 17-18,
Claims 11 and 17-18 are directed to a system comprising a data compression system and a machine learning task system configured to perform methods substantially identical to those recited in claims 1 and 7-8, respectively. Therefore, the rejections to claims 1 and 7-8 apply equally here.
In addition, Toderici discloses the additional limitation of a system, a data compression system, a data acquisition module, a coding module, a machine learning task system and a decoding module (Toderici Pg. 10, Col. 5 recites “The neural network system 100 includes an encoder network 102, a binarizer 104, a decoder network 106, and a residual error calculator 108. For convenience, the binarizer 104 of FIG. 1 is shown as being separate to the encoder network 102, however in some implementations the binarizer 104 may be included in the encoder network 102. Optionally, the neural network system may include an additive image reconstruction module 110 and a gain estimator module 112.” Additionally, Toderici Pgs. 12-13, Cols. 10-11 recite “For convenience, the encoder network 102 and decoder network 106 are illustrated in FIG. 1 as being located in a same system. However, in some implementations the decoder network 106 and encoder network 102 may be distributed across multiple systems. That is, the decoder network 106 may be remote from the encoder network 102 and binarizer 104. For example, a received system input image, e.g., input image 114, may be compressed using the encoder network 102 and binarizer 104 at one end, and transmitted to the decoder network 106 at another end point, where it may be reconstructed and provided as a system output.” Encoder network that can receive input images and has functionality as a data acquisition module, Binarizer which has functionality as a coding module and may be part of the encoder network, and Decoder Network that has functionality as a decoding module).

Regarding claims 21, 27-28, 31 and 37-38,
Claims 21 and 27-28 & 31 and 37-38 are directed to articles of manufacture configured to perform methods substantially identical to those recited in claims 1 and 7-8, respectively. Therefore, the rejections to claims 1 and 7-8 apply equally here.
In addition, Toderici discloses the additional limitation of a computer readable storage medium (Toderici Pg. 15, Col. 16 recites “Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.”).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 4, 12, 14, 22, 24, 32 and 34 are rejected under 35 U.S.C. 103 as being unpatentable over Toderici in view of Choi et al. (US 20180107926 A1, hereinafter Choi).

Regarding claim 2,
Toderici discloses the method of claim 1, wherein training the neural network comprises (Toderici Fig. 1 Element 100 – Neural network system): 
applying the acquired data of the training example as input to the encoder portion of the neural network to obtain a compressed representation of the acquired data (Toderici Pg. 12, Col. 10 recites, in part, “For example, the neural network system 100 may be trained on a set of training data by processing training inputs included in the set of training data to generate corresponding outputs. The generated outputs may then be compared to known training outputs included in the set of training data” Additionally, Toderici Pg. 13, Col. 11 recites “FIG. 2 shows an example encoder network 102, as described above with reference to FIG. 1. The encoder network 102 is configured to receive an input image, e.g., input image 202, and to generate an encoded representation of the input image, e.g., first stack output 204.” Trained on a set of training data (i.e. applying the data of the training example as input)).
However, Toderici does not explicitly disclose determining the second loss by comparing a codelength of compressed codes generated from the compressed representation to a target codelength.
Choi teaches determining the second loss by comparing a codelength of compressed codes generated from the compressed representation to a target codelength (Choi Fig. 9A-B, [0132] and [0157] recite, in part, “Thus, in such cases, it is better to minimize the quantization loss under the constraint of the actual compression ratio, which is a function of the average codeword length resulting from the specific encoding scheme employed at the end. [0157] From this curve, the point that satisfies the target performance and/or target compression ratio can be selected.” Quantization loss under the constraint of the actual compression ratio which is a function of the average codeword length (i.e. codelength generated from compressed representation to a target codelength)).
Choi and Toderici are both directed to neural networks and compression. In view of the teachings of Choi, it would have been obvious to one of ordinary skill in the art to apply the teachings of Choi to Toderici before the effective filing date of the claimed invention in order to efficiently use deep neural networks in the presence of limited storage by accounting for constraints of the actual compression ratio (cf. Choi [0006]-[0007] recites, in part, “Accordingly, although deep neural networks are extremely powerful, they also require a significant amount of resources to implement, particularly in terms of memory storage. [0007] This makes it difficult to deploy deep neural networks on devices with limited storage, such as mobile/portable devices.”).
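
For illustration, the comparison of a codelength to a target codelength discussed in this rejection can be sketched as follows. This is a minimal sketch under assumptions: Huffman coding is used here only as one example of a "specific encoding scheme", the hinge-style penalty is an assumed form of the comparison, and the names huffman_code_lengths and codelength_loss are hypothetical and appear in neither reference.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: codeword length in bits} for a Huffman code built from symbol counts."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (weight, tie_breaker, {symbol: depth}); the tie_breaker keeps dicts from being compared.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every contained symbol one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def codelength_loss(symbols, target_bits):
    """Assumed penalty: number of bits by which the encoded length exceeds the target."""
    lengths = huffman_code_lengths(symbols)
    codelength = sum(lengths[s] for s in symbols)
    return max(0, codelength - target_bits)
```

In this sketch the penalty is zero whenever the encoded length already meets the target, so only codes longer than the target contribute to the loss.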
Regarding claim 4,
The Toderici/Choi Combination discloses the method of claim 2, wherein training the neural network further comprises: applying the compressed representation as input to the task portion of the neural network to obtain a predicted machine learning task output (Toderici Fig. 3 and Pg. 13, Cols. 11-12 recites “FIG. 3 shows an example decoder network 106, as described above with reference to FIG. 1. The decoder network 106 is configured to receive an encoded representation of the system input image, e.g., binarized input 302, and to generate an output image that is a reconstruction of the system input image, e.g., second stack output 304.”); and 
determining the first loss by comparing the machine learning task output to the machine learning task output label of the training example (Toderici Pg. 12, Col. 10 recites “The generated outputs may then be compared to known training outputs included in the set of training data by computing loss functions and backpropagating loss function gradients with respect to current neural network parameters to determine an updated set of neural network parameters that minimizes the loss functions.”).

Regarding claims 12 and 14,
Claims 12 and 14 are directed to a system comprising a data compression system and a machine learning task system configured to perform methods substantially identical to those recited in claims 2 and 4, respectively. Therefore, the rejections to claims 2 and 4 apply equally here.
In addition, Toderici discloses the additional limitation of a system, a data compression system, a data acquisition module, a coding module, a machine learning task system and a decoding module (Toderici Pg. 10, Col. 5 recites “The neural network system 100 includes an encoder network 102, a binarizer 104, a decoder network 106, and a residual error calculator 108. For convenience, the binarizer 104 of FIG. 1 is shown as being separate to the encoder network 102, however in some implementations the binarizer 104 may be included in the encoder network 102. Optionally, the neural network system may include an additive image reconstruction module 110 and a gain estimator module 112.” Additionally, Toderici Pgs. 12-13, Cols. 10-11 recite “For convenience, the encoder network 102 and decoder network 106 are illustrated in FIG. 1 as being located in a same system. However, in some implementations the decoder network 106 and encoder network 102 may be distributed across multiple systems. That is, the decoder network 106 may be remote from the encoder network 102 and binarizer 104. For example, a received system input image, e.g., input image 114, may be compressed using the encoder network 102 and binarizer 104 at one end, and transmitted to the decoder network 106 at another end point, where it may be reconstructed and provided as a system output.” Encoder network that can receive input images and has functionality as a data acquisition module, Binarizer which has functionality as a coding module and may be part of the encoder network, and Decoder Network that has functionality as a decoding module).

Regarding claims 22, 24, 32 and 34,
Claims 22 and 24 & 32 and 34 are directed to articles of manufacture configured to perform methods substantially identical to those recited in claims 2 and 4, respectively. Therefore, the rejections to claims 2 and 4 apply equally here.
In addition, Toderici discloses the additional limitation of a computer readable storage medium (Toderici Pg. 15, Col. 16 recites “Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.”).
Claims 3, 13, 23 and 33 are rejected under 35 U.S.C. 103 as being unpatentable over Toderici in view of Choi and in further view of Theis et al. (US 10623775 B1, hereinafter Theis).

Regarding claim 3,
The Toderici/Choi Combination teaches the method of claim 2 and the target codelength (Choi Fig. 9A-B, [0132] and [0157] recite, in part, “Thus, in such cases, it is better to minimize the quantization loss under the constraint of the actual compression ratio, which is a function of the average codeword length resulting from the specific encoding scheme employed at the end. [0157] From this curve, the point that satisfies the target performance and/or target compression ratio can be selected.” The codeword length of the actual compression ratio (i.e. target codelength)). 
However, the Toderici/Choi Combination does not teach wherein the target codelength is set based on an amount of available bandwidth.
Theis teaches wherein the target codelength is set based on an amount of available bandwidth (Theis Pg. 14, Col. 4 recites “All three components may have parameters used to optimize a tradeoff between using a small number of bits (e.g., high compression and low bandwidth) for an encoded frame of video or an encoded image and having small distortion when the encoded frame of video or encoded image is decoded, such that: 
[Equation image: media_image4.png]
  where, α controls the tradeoff between using a small number of bits for the encoded frame of video or the encoded image and having small distortion when the encoded frame of video or encoded image is decoded.” Optimize for small number of bits would indicate high compression and low bandwidth (i.e. setting based on bandwidth)).
Theis and the Toderici/Choi Combination are both directed to neural networks and compression. In view of the teachings of Theis, it would have been obvious to one of ordinary skill in the art to apply the teachings of Theis to the Toderici/Choi Combination before the effective filing date of the claimed invention in order to deliver large volumes of high quality data where bandwidth is limited by optimizing the bitrate (cf. Theis Pg. 13, Cols. 1-2 recite “In order to reduce the resolution of video, several techniques exist to downscale the resolution of video data to reduce the bitrate. As a result of the disadvantages of current compression approaches, existing network infrastructure and video streaming mechanisms are becoming increasingly inadequate to deliver large volumes of high quality video content to meet ever-growing consumer demands for this type of content. This can be of particular relevance in certain circumstances, for example in relation to live broadcasts, where bandwidth is often limited, and extensive processing and video compression cannot take place at the location of the live broadcast without a significant delay due to inadequate computing resources being available at the location. Advances in training of neural networks have helped to improve performance in a number of domains. However, neural networks have yet to surpass existing codecs in lossy image compression.”).
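
As an illustration of setting a target codelength based on available bandwidth, the arithmetic can be sketched as below. This is a hedged example only: the function name target_codelength_bits and the parameters latency_budget_s and overhead_fraction are hypothetical and are not drawn from the claims, Choi, or Theis.

```python
def target_codelength_bits(bandwidth_bits_per_s, latency_budget_s, overhead_fraction=0.0):
    """Bits that can be sent within the latency budget after reserving protocol overhead."""
    if not 0.0 <= overhead_fraction < 1.0:
        raise ValueError("overhead_fraction must be in [0, 1)")
    return int(bandwidth_bits_per_s * latency_budget_s * (1.0 - overhead_fraction))
```

For example, under these assumptions a 2,000,000 bit/s link with a 0.25 s budget yields a target of 500,000 bits, and reserving half of the link for overhead halves that target.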

Regarding claim 13,
Claim 13 is directed to a system comprising a data compression system and a machine learning task system configured to perform methods substantially identical to those recited in claim 3. Therefore, the rejection to claim 3 applies equally here.
In addition, Toderici discloses the additional limitation of a system, a data compression system, a data acquisition module, a coding module, a machine learning task system and a decoding module (Toderici Pg. 10, Col. 5 recites “The neural network system 100 includes an encoder network 102, a binarizer 104, a decoder network 106, and a residual error calculator 108. For convenience, the binarizer 104 of FIG. 1 is shown as being separate to the encoder network 102, however in some implementations the binarizer 104 may be included in the encoder network 102. Optionally, the neural network system may include an additive image reconstruction module 110 and a gain estimator module 112.” Additionally, Toderici Pgs. 12-13, Cols. 10-11 recite “For convenience, the encoder network 102 and decoder network 106 are illustrated in FIG. 1 as being located in a same system. However, in some implementations the decoder network 106 and encoder network 102 may be distributed across multiple systems. That is, the decoder network 106 may be remote from the encoder network 102 and binarizer 104. For example, a received system input image, e.g., input image 114, may be compressed using the encoder network 102 and binarizer 104 at one end, and transmitted to the decoder network 106 at another end point, where it may be reconstructed and provided as a system output.” Encoder network that can receive input images and has functionality as a data acquisition module, Binarizer which has functionality as a coding module and may be part of the encoder network, and Decoder Network that has functionality as a decoding module).

Regarding claims 23 and 33,
Claims 23 and 33 are directed to articles of manufacture configured to perform methods substantially identical to those recited in claim 3. Therefore, the rejection to claim 3 applies equally here.
In addition, Toderici discloses the additional limitation of a computer readable storage medium (Toderici Pg. 15, Col. 16 recites “Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.”).



Claims 5-6, 9-10, 15-16, 19-20, 25-26, 29-30, 35-36 and 39-40 are rejected under 35 U.S.C. 103 as being unpatentable over Toderici in view of Theis.

Regarding claim 5,
Toderici discloses the method of claim 1, wherein the neural network is trained to maximize the accuracy of the machine learning task output predicted by the neural network (Toderici Pg. 11, Col. 8 recites “In some implementations, the accuracy of an image compression may increase as more bits are processed by the neural network system 100, e.g., as more iterations of the process are performed by the neural network system 100. However, more iterations of the process reduce the compression rate, incurring a tradeoff between image quality and compression rate as described above.” Accuracy increase with more bits processed at each iteration with a trade-off between quality and compression (i.e. train to maximize accuracy)). 
However, Toderici does not explicitly teach while ensuring that the compressed codes generated from the compressed representation does not exceed a threshold codelength that is dependent on available bandwidth.
Theis teaches while ensuring that the compressed codes generated from the compressed representation does not exceed a threshold codelength that is dependent on available bandwidth (Theis Pg. 14, Col. 4 recites “All three components may have parameters used to optimize a tradeoff between using a small number of bits (e.g., high compression and low bandwidth) for an encoded frame of video or an encoded image and having small distortion when the encoded frame of video or encoded image is decoded, such that: 
[Equation image: media_image4.png]
  where, α controls the tradeoff between using a small number of bits for the encoded frame of video or the encoded image and having small distortion when the encoded frame of video or encoded image is decoded.” Additionally, Theis Pg. 15, Col. 5 recites “The non-differentiable number of bits can define an upper bound by first expressing the probability distribution Q in terms of a probability density q, such that: 
    [Equation image media_image5.png not reproduced]
” Upper bound defined with number of bits where a small number of bits would indicate high compression and low bandwidth (i.e. threshold codelength dependent on bandwidth)).
Theis and Toderici are both directed to neural networks and compression. In view of the teachings of Theis, it would have been obvious to one of ordinary skill in the art to apply the teachings of Theis to Toderici before the effective filing date of the claimed invention in order to deliver large volumes of high quality data where bandwidth is limited by optimizing the bitrate (cf. Theis Pg. 13, Cols. 1-2 recite “In order to reduce the resolution of video, several techniques exist to downscale the resolution of video data to reduce the bitrate. As a result of the disadvantages of current compression approaches, existing network infrastructure and video streaming mechanisms are becoming increasingly inadequate to deliver large volumes of high quality video content to meet ever-growing consumer demands for this type of content. This can be of particular relevance in certain circumstances, for example in relation to live broadcasts, where bandwidth is often limited, and extensive processing and video compression cannot take place at the location of the live broadcast without a significant delay due to inadequate computing resources being available at the location. Advances in training of neural networks have helped to improve performance in a number of domains. However, neural networks have yet to surpass existing codecs in lossy image compression.”).
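For illustration only, the limitation addressed above (a codelength that does not exceed a bandwidth-dependent threshold, combined with a bits-versus-distortion tradeoff weighted by α) can be sketched as follows. This is a minimal sketch and not the method of either reference. The equation images cited from Theis are not reproduced in this action, so the weighted form alpha * bits + distortion is an assumption, and the names codelength_threshold, tradeoff_loss and pick_code are hypothetical.

```python
from typing import List, Optional, Tuple

# Each candidate is (codelength in bits, distortion after decoding).
Candidate = Tuple[int, float]


def codelength_threshold(bandwidth_bits_per_s: float, time_budget_s: float) -> int:
    """Hypothetical threshold: bits that can be sent within the time budget."""
    return int(bandwidth_bits_per_s * time_budget_s)


def tradeoff_loss(num_bits: int, distortion: float, alpha: float) -> float:
    """Assumed weighted form: alpha scales the bit cost against distortion."""
    return alpha * num_bits + distortion


def pick_code(
    candidates: List[Candidate],
    bandwidth_bits_per_s: float,
    time_budget_s: float,
    alpha: float,
) -> Optional[Candidate]:
    """Return the lowest-loss candidate whose codelength fits the threshold."""
    limit = codelength_threshold(bandwidth_bits_per_s, time_budget_s)
    feasible = [c for c in candidates if c[0] <= limit]
    if not feasible:
        return None  # no candidate fits the available bandwidth
    return min(feasible, key=lambda c: tradeoff_loss(c[0], c[1], alpha))
```

In this sketch a lower available bandwidth lowers the threshold, which excludes longer codes before the tradeoff is evaluated.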

Regarding claim 6,
Toderici discloses the method of claim 1, wherein the neural network is trained while maintaining the accuracy of the machine learning task output predicted by the neural network at a minimum accuracy threshold (Toderici Pg. 12, Col. 10 recites “The generated outputs may then be compared to known training outputs included in the set of training data by computing loss functions and backpropagating loss function gradients with respect to current neural network parameters to determine an updated set of neural network parameters that minimizes the loss functions.” Backpropagating and minimizing the loss functions for improvement (i.e. maintaining the accuracy)).
However, Toderici does not explicitly disclose wherein the neural network is trained to minimize a codelength of the compressed codes generated from the compressed representation.
Theis teaches wherein the neural network is trained to minimize a codelength of the compressed codes generated from the compressed representation (Theis Pg. 16, Col. 7 recites “In some embodiments, the goal of training the lossy compression algorithm and the inverse of the lossy compression algorithm is to obtain a small reconstruction error using as few bits as possible.” Obtain a small reconstruction error using as few bits as possible (i.e. trained to minimize a codelength)).
Theis and Toderici are both directed to neural networks and compression. In view of the teachings of Theis, it would have been obvious to one of ordinary skill in the art to apply the teachings of Theis to Toderici before the effective filing date of the claimed invention in order to deliver large volumes of high quality data where bandwidth is limited by optimizing the bitrate (cf. Theis Pg. 13, Cols. 1-2 recite “In order to reduce the resolution of video, several techniques exist to downscale the resolution of video data to reduce the bitrate. As a result of the disadvantages of current compression approaches, existing network infrastructure and video streaming mechanisms are becoming increasingly inadequate to deliver large volumes of high quality video content to meet ever-growing consumer demands for this type of content. This can be of particular relevance in certain circumstances, for example in relation to live broadcasts, where bandwidth is often limited, and extensive processing and video compression cannot take place at the location of the live broadcast without a significant delay due to inadequate computing resources being available at the location. Advances in training of neural networks have helped to improve performance in a number of domains. However, neural networks have yet to surpass existing codecs in lossy image compression.”).
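For illustration only, the limitation addressed above (minimizing codelength while the task accuracy is maintained at a minimum accuracy threshold) can be sketched as a constrained selection. This is a minimal sketch under stated assumptions, not the training procedure of either reference. The function name shortest_code_meeting_accuracy and the (codelength, accuracy) pairs are hypothetical.

```python
from typing import List, Optional, Tuple

# Each option is (codelength in bits, task accuracy in [0, 1]).
Option = Tuple[int, float]


def shortest_code_meeting_accuracy(
    options: List[Option], min_accuracy: float
) -> Optional[Option]:
    """Return the option with the fewest bits whose accuracy meets the threshold."""
    feasible = [o for o in options if o[1] >= min_accuracy]
    if not feasible:
        return None  # no option reaches the minimum accuracy
    return min(feasible, key=lambda o: o[0])
```

Here the accuracy threshold acts as the constraint and the codelength is the quantity minimized, which is the reverse of the arrangement discussed for claim 5.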

Regarding claim 9,
Toderici discloses the method of claim 1 (Toderici Pg. 8, Col. 1 recites “This specification describes methods and systems, including computer programs encoded on computer storage media, for performing image compression across different compression rates on images of arbitrary size using recurrent neural networks.”). 
However, Toderici does not explicitly disclose wherein a machine learning task of the machine learning task output predicted by the neural network is one of a classification task, regression task, clustering task, density estimation task, dimensionality reduction task, or multivariate querying task.
Theis teaches wherein a machine learning task of the machine learning task output predicted by the neural network is one of a classification task, regression task, clustering task, density estimation task, dimensionality reduction task, or multivariate querying task (Theis Pg. 19, Cols. 13-14 recite “When initially configuring a machine learning system, particularly when using a supervised machine learning approach, the machine learning algorithm can be provided with some training data or a set of training examples, in which each example is typically a pair of an input signal/vector and a desired output value, label (or classification) or signal. The machine learning algorithm analyses the training data and produces a generalized function that can be used with unseen data sets to produce desired output values or signals for the unseen input vectors/signals.” Configuring a machine learning system with an input vector and a desired classification (i.e. a machine learning task that is a classification task)).
Theis and Toderici are both directed to neural networks and compression. In view of the teachings of Theis, it would have been obvious to one of ordinary skill in the art to apply the teachings of Theis to Toderici before the effective filing date of the claimed invention in order to use neural networks for optimizing image compression (cf. Theis Pg. 13, Cols. 1-2 recite “In order to reduce the resolution of video, several techniques exist to downscale the resolution of video data to reduce the bitrate. As a result of the disadvantages of current compression approaches, existing network infrastructure and video streaming mechanisms are becoming increasingly inadequate to deliver large volumes of high quality video content to meet ever-growing consumer demands for this type of content. This can be of particular relevance in certain circumstances, for example in relation to live broadcasts, where bandwidth is often limited, and extensive processing and video compression cannot take place at the location of the live broadcast without a significant delay due to inadequate computing resources being available at the location. Advances in training of neural networks have helped to improve performance in a number of domains. However, neural networks have yet to surpass existing codecs in lossy image compression.”).


Regarding claim 10,
The Toderici/Theis combination teaches the method of claim 9, wherein the classification task is one of an object detection or object recognition task (Theis Pg. 19, Col. 13 recites “Unsupervised learning is concerned with determining a structure for input data, for example when performing pattern recognition, and typically uses unlabeled data sets.” Pattern recognition (i.e. object recognition)).

Regarding claims 15-16 and 19-20,
Claims 15-16 and 19-20 are directed to a system comprising a data compression system and a machine learning task system configured to perform methods substantially identical to those recited in claims 5-6 and 9-10, respectively. Therefore, the rejections of claims 5-6 and 9-10 apply equally here.
In addition, Toderici discloses the additional limitation of a system, a data compression system, a data acquisition module, a coding module, a machine learning task system and a decoding module (Toderici Pg. 10, Col. 5 recites “The neural network system 100 includes an encoder network 102, a binarizer 104, a decoder network 106, and a residual error calculator 108. For convenience, the binarizer 104 of FIG. 1 is shown as being separate to the encoder network 102, however in some implementations the binarizer 104 may be included in the encoder network 102. Optionally, the neural network system may include an additive image reconstruction module 110 and a gain estimator module 112.” Additionally, Toderici Pgs. 12-13, Cols. 10-11 recite “For convenience, the encoder network 102 and decoder network 106 are illustrated in FIG. 1 as being located in a same system. However, in some implementations the decoder network 106 and encoder network 102 may be distributed across multiple systems. That is, the decoder network 106 may be remote from the encoder network 102 and binarizer 104. For example, a received system input image, e.g., input image 114, may be compressed using the encoder network 102 and binarizer 104 at one end, and transmitted to the decoder network 106 at another end point, where it may be reconstructed and provided as a system output.” Encoder network that can receive input images and has functionality as a data acquisition module, Binarizer which has functionality as a coding module and may be part of the encoder network, and Decoder Network that has functionality as a decoding module).

Regarding claims 25-26, 29-30, 35-36 and 39-40,
Claims 25-26, 29-30, 35-36 and 39-40 are directed to articles of manufacture configured to perform methods substantially identical to those recited in claims 5-6 and 9-10, respectively. Therefore, the rejections of claims 5-6 and 9-10 apply equally here.
In addition, Toderici discloses the additional limitation of a computer readable storage medium (Toderici Pg. 15, Col. 16 recites “Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.”).


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Bernal et al. (US-20180063538-A1) teaches data acquisition, processing and methods of compressing data. 
Wierstra et al. (US-20170230675-A1) teaches image processing through layers of neural networks.
Sydorenko (U.S. Patent No. 6091773-A) teaches audio and video coding/compression techniques using a neural network.
Dony et al. ("Neural Network Approaches to Image Compression", February 1995) teaches image compression using neural networks and codewords.
Gong et al. ("Compressing Deep Convolutional Networks Using Vector Quantization", December 18, 2014) teaches vector quantization methods for compressing convolutional neural network parameters.
Omaima N.A. AL-Allaf ("Improving the Performance of Backpropagation Neural Network Algorithm for Image Compression/Decompression System", 2010) teaches an algorithm for image compression/decompression for a Backpropagation Neural Network.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEON W CHEUNG whose telephone number is (571) 272-9930.  The examiner can normally be reached on 8:30AM-5:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang can be reached on (571) 270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LWC/Examiner, Art Unit 2124
/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124