DETAILED ACTION
This action is in response to the amendments filed 31 August 2022 for application 16/046993 filed 26 July 2018.  Currently claims 1, 3-9, 11-17, and 19-20 are pending.  Claims 2, 10, and 18 have been previously canceled. Rejections under 35 USC 112(b) have been withdrawn in light of the amendments. The provisional non-statutory double patenting rejection relative to pending US Application 16/2231092 has been withdrawn in response to the terminal disclaimer filed 31 August 2022. 

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant's arguments filed 31 August 2022 have been fully considered but they are not persuasive. 

Specifically, the Applicants Argue:
Regarding the analysis for claim 1 under Step 2A Prong 2, the analysis merely concludes "No" and proceeds to analyze additional elements that are recited in claim 1. There are, however, absolutely no details for analysis of Step 2A Prong 2 as to whether claim 1 improves the functioning of a computer or other technology or technological field.  …In this regard, Applicant respectfully submits that one of skill in the art would understand that the claimed encoder circuit of independent claim 1 is involved with a reduction in the size of the memory used by a neural network by being configured to encode at least one block of values independently from other blocks of a tensor using Sparse-Exponential-Golomb lossless compression encoding. … Applicant respectfully submits that claim 1 integrates a judicial exception into a practical application that imposes a meaningful limit on the judicial exception and claim 1 is more than a drafting effort designed to monopolize the judicial exception. That is, the limits imposed by claim 1 require that a system comprise an encoder circuit configured to encode the at least one block of values independently from other blocks of the tensor using Sparse-Exponential-Golomb- RemoveMin lossless compression encoding to reduce the size of the memory used by the neural network. … Further still, claim 1 does not merely recite the words "apply it" (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer; merely use a computer as a tool to perform an abstract idea; adds insignificant extra-solution activity to the judicial exception; and generally links the use of a judicial exception to a particular technological environment or field of use. 
… In response to Applicant's previously submitted arguments that the claims 1, 9 and 17 provide practical applications of providing lossless encoding of activation maps of a neural network to reduce memory requirements, particularly during training of a deep neural network, …This response also appears to bolster the apparent lack of proper determination under Prong Two of Revised Step 2A whether any of claims 1, 9 and 17 recite additional elements that integrate the judicial exception into a practical application. While it is indicated that an evaluation at Step 2A, Prong 2 and at Step 2B has been performed, there is no record of that evaluation other than a conclusory statement of "No" for the analysis under Step 2A Prong 2. As previously pointed out, there is no determination or statement in the present Office Action whether the disclosure of the specification of the present patent application sets forth sufficient details such that one of ordinary skill in the art would recognize the claimed invention as providing an improvement in the functioning of a computer. …Moreover, the statement that "[i]t is further noted that the reduction of size of the memory is an intended and must be given little patentable weight" is irrelevant in the analysis under Step 2A Prong 2, and strongly indicates that a proper analysis has not been done. 
Response to Office Action PAGE 16 OF 19 Attorney Docket No. 1535-406 
Examiner’s Response:
	The Examiner respectfully disagrees. Both the current office action and the previous office actions (viz., the 7 July 2022 NOFA and the 2 November 2022 FOA) specifically address, at Step 2A, Prong 2, the question of whether the claim elements are sufficient to integrate the judicial exception into a practical technological application such as an improvement in computer technology. Specifically, with respect to the claim element “to reduce a size of a memory used by the neural network,” the Examiner indicated that the reduction in the size of the memory through compression is recited at a high level of generality that merely links the judicial exception to the technological environment of compressing neural network data. The corresponding evaluation of this element is based on a carry-over conclusion from step 2A, prong 2 according to MPEP 2106.05(f)(1) (as cited in the previous NOFA). As noted in the Applicant’s argument, this includes the statement “It is also noted that the reduction of the size of a memory is an outcome, the details of which how that solution is accomplished is not recited by the claims.” In other words, one of the reasons that the claim elements merely link to a technological environment is that the details of how that compression is performed are not recited in the claims. For clarity, this statement has been reiterated at Step 2A, Prong 2. It is further recommended that functional details associated with “Sparse-Exponential-Golomb-RemoveMin lossless compression” be added to the independent claims.

The Applicants Further Argue:
Regarding independent claim 1, it has been admitted at page 46, lines 3-5, of the present Office Action that: "Choi does not explicitly teach selected from a group including Sparse- Exponential-Golomb encoding, Sparse-Exponential-Golomb-RemoveMin encoding, Exponent-Mantissa encoding, and Sparse fixed length encoding." Applicant respectfully agrees with this admission. Moreover, in view of this admission it follows that Choi does not disclose or suggest "using Sparse-Exponential-Golomb-RemoveMin lossless encoding." 

Examiner’s Response:
This argument is moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Specifically, as set forth in the current office action, although Choi teaches a “RemoveMin” functionality, he does not disclose this in combination with “sparse-Exponential-Golomb” encoding. However, the new grounds of rejection in view of Choi, Loganathan, and Aziz do teach this claim element.

Claim Objections
Claim 17 is objected to because of the following informalities:  Claim 17 recites “comprising a reduced a size …” which should instead read “comprising a reduced size…”.  Appropriate correction is required.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.



Claims 1, 3-9, 11-17, and 19-20 are rejected under 35 U.S.C. 101 because the claims are directed to an abstract idea and because the claims as a whole, considering all claim elements both individually and in combination, do not amount to significantly more than the abstract idea, see Alice Corporation Pty. Ltd. v. CLS Bank International, et al, 573 U.S. (2014).
As an initial matter, according to the first part of the Alice analysis (Step 1), the claims were determined to be directed to one of the four statutory categories: an article of manufacture, a method/process (claims 9, 11-17, 19-20), a machine/system/product (claims 1, 3-8), and/or a composition of matter.
Secondly, because the claims fall within one of the four categories (i.e., process, machine, manufacture, or composition of matter), it must be determined whether the claims are directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) (Step 2A). This step consists of a two-prong inquiry: (1) Does the claim recite an abstract idea, law of nature, or natural phenomenon? and (2) Does the claim recite additional elements that integrate the judicial exception into a practical application?
Claims 1, 3-9, 11-17, and 19-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite mathematical concepts. This judicial exception is not integrated into a practical application because the generically recited computer elements do not add meaningful limitations to it. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception, as discussed in the following analysis.
Regarding independent claims 1, 9, and 17, the following analysis shows that the limitations recite the judicial exception of an abstract idea in the mathematical concepts and mental processes groups and do not recite additional elements that integrate the judicial exception into a practical application.

Claim 1 does not satisfy the two-Prong Test as explained in the analysis of each limitation below:
Step 2A
Prong 1: 
… losslessly compress …: a formatter … configured to format a tensor corresponding to an … into at least one block of values, the tensor having a size of H x W x C in which H represents a height of the tensor, W represents a width of the tensor, and C represents a number of channels of the tensor; (Yes)  The claim, under its broadest reasonable interpretation, recites mathematical steps of formatting a tensor into blocks of values as part of a system for performing the mathematical steps of lossless compression/encoding.  The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group. 
and an encoder … configured to encode the at least one block independently from other blocks of the tensor using Sparse-Exponential-Golomb-RemoveMin lossless compression encoding (Yes)  The claim, under its broadest reasonable interpretation, recites mathematical steps of encoding a block of values of the tensor using a lossless compression mode. The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group.
Prong 2 (No): The claim recites additional elements:
A system to … the system comprising… circuit … circuit :  The system/processors/circuits in the computer system that perform the mathematical steps of formatting and compression are recited at a high level of generality and are no more than mere instructions to apply the exception using a generic computer and, thereby, do not impose a meaningful limit on the judicial exception.  
… an activation map of a neural network … activation map …- The activation map of a neural network is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment. 
to reduce a size of a memory used by the neural network: The reduction of a size of a memory associated with a neural network through compression is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment (compression of neural network data). It is also noted that the reduction of the size of a memory is an outcome; the details of how that outcome is accomplished are not recited by the claims.
None of these additional elements integrate the judicial exception into a practical application because the computing devices and the compression of neural network data/activation maps are recited at a high level of generality and correspond to generic computer functions.  
In addition, according to the second part of the Alice/Mayo test (Step 2B), it must be determined whether the claim as a whole recites something significantly more than the judicial exception, with the claim elements considered both individually and as an ordered combination. The recitation in the preamble is insufficient to transform a judicial exception into a patentable invention because the preamble elements are recited at a high level of generality and simply link the exception to a field of use, see MPEP 2106.05(h). The examiner further notes that the claim limitation(s) below are deemed insufficient to transform a judicial exception into a patentable invention, as described in the analysis that follows:
The elements in the limitations below are insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
Generic computer implemented method, processing resources as noted above.
… an activation map of a neural network … activation map …– as noted above (see MPEP 2106.05(h)).
and reduce a size of a memory used by the neural network: …– as noted above (see MPEP 2106.05(h)). It is also noted that the reduction of the size of a memory is an outcome; the details of how that outcome is accomplished are not recited by the claims (see MPEP 2106.05(f)(1)).
As discussed in the Step 1, 2A Prongs 1 and 2, and 2B analyses, the limitations of claim 1, examined individually or as an ordered combination, recite no meaningful limitations that amount to significantly more than the exception itself. In particular, there is no indication that the combination of elements improves the functioning of a computer or improves another technology. Therefore, when looking at the claim elements individually or as an ordered combination, claim 1 does not recite elements identified by the courts as “significantly more”.
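For illustration only, and forming no part of the analysis above, the two operations named in the claim 1 limitations (formatting an H x W x C tensor into blocks of values, then encoding each block independently of the other blocks) can be sketched as follows. The sketch rests on stated assumptions: this action does not set out the functional details of Sparse-Exponential-Golomb-RemoveMin encoding, so the code substitutes ordinary order-0 Exponential-Golomb code words applied to each value's offset from the block minimum and omits any sparsity handling. The names format_blocks, exp_golomb, encode_block, and decode_block, and the block size of 4 values, are hypothetical.

```python
def format_blocks(tensor, block_size):
    """Flatten a nested H x W x C list and split it into fixed-size blocks.

    Assumption: the last block is zero-padded when the element count is not
    a multiple of block_size.
    """
    flat = [value for row in tensor for column in row for value in column]
    flat += [0] * (-len(flat) % block_size)
    return [flat[i:i + block_size] for i in range(0, len(flat), block_size)]


def exp_golomb(n):
    """Order-0 Exponential-Golomb code word for a non-negative integer."""
    binary = bin(n + 1)[2:]
    return "0" * (len(binary) - 1) + binary


def encode_block(block):
    """Encode one block using only its own values.

    Assumption: the block minimum is written first, followed by each value's
    offset from that minimum.
    """
    minimum = min(block)
    return exp_golomb(minimum) + "".join(exp_golomb(v - minimum) for v in block)


def decode_block(bits, block_size):
    """Invert encode_block for a single block of block_size values."""
    pos = 0

    def read_value():
        nonlocal pos
        zeros = 0
        while bits[pos] == "0":  # count the leading-zero prefix
            zeros += 1
            pos += 1
        value = int(bits[pos:pos + zeros + 1], 2) - 1
        pos += zeros + 1
        return value

    minimum = read_value()
    return [minimum + read_value() for _ in range(block_size)]


# A 2 x 2 x 2 tensor of non-negative integers, formatted into blocks of 4.
blocks = format_blocks([[[3, 4], [3, 5]], [[0, 0], [7, 0]]], 4)
streams = [encode_block(block) for block in blocks]
restored = [decode_block(stream, 4) for stream in streams]
```

Because each bit string is produced from a single block, any one block can be decoded without reference to the others, which is the sense of "independently" assumed in this sketch.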

Independent claim 9 recites elements similar to those analyzed in claim 1 above and is rejected for the same reasons as claim 1. Specifically, according to the second part of the Alice/Mayo test (Step 2B), it must be determined whether the claim as a whole recites something significantly more than the judicial exception, with the claim elements considered both individually and as an ordered combination. The recitation in the preamble is insufficient to transform a judicial exception into a patentable invention because the preamble elements are recited at a high level of generality and simply link the exception to a field of use, see MPEP 2106.05(h). The examiner further notes that claim 9 is a method implementation of the same subject matter recited in claim 1. As discussed in the Step 1, 2A Prongs 1 and 2, and 2B analyses, the limitations of claim 9, examined individually or as an ordered combination, recite no meaningful limitations that amount to significantly more than the exception itself. In particular, there is no indication that the combination of elements improves the functioning of a computer or improves another technology. Therefore, when looking at the claim elements individually or as an ordered combination, claim 9 does not recite elements identified by the courts as “significantly more”.
Regarding independent claim 17, the following analysis shows that the limitations recite the judicial exception of an abstract idea in the mathematical concepts and mental processes groups and do not recite additional elements that integrate the judicial exception into a practical application.
Claim 17 does not satisfy the two-Prong Test as explained in the analysis of each limitation below:
Step 2A
Prong 1: 
A method to losslessly decompress an …, the method comprising: … at a decoder … a bitstream representing at least one compressed block of values of the …; (Yes)  The claim, under its broadest reasonable interpretation, recites mathematical steps of generating from a decoder a stream of bits corresponding to a compressed block of values as part of a system for performing lossless decompression/decoding.  The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group. 
decompressing by the decoder … the at least one compressed block of values to form at least one decompressed block of values, the decompressed block of values being independently decompressed from other blocks of values of the … using a decompression mode corresponding to a lossless compression mode used to compress the at least one block, the lossless compression mode being Sparse-Exponential-Golomb-RemoveMin lossless compression encoding mode; (Yes)  The claim, under its broadest reasonable interpretation, recites mathematical steps of decoding a block of values independently using a lossless decompression mode corresponding to the (particular) compression mode used to encode it. The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group.
and deformatting by a deformatter … the at least one decompressed block of values into a tensor having a size of H x W x C in which H represents a height of the tensor, W represents a width of the tensor, and C represents a number of channels of the tensor, the tensor being the … that has been decompressed.  (Yes)  The claim, under its broadest reasonable interpretation, recites mathematical steps of deformatting a block of values to form a tensor. The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group.
Prong 2 (No): The claim recites additional elements:
Receiving - The function of receiving data is a mere data gathering step and the computers that perform that function are recited at a high level of generality that does not impose a meaningful limitation on the judicial exception. 
Circuit … circuit … circuit… The system/processors/circuits in the computer system that perform the mathematical steps of deformatting and decoding/decompression are recited at a high level of generality and are no more than mere instructions to apply the exception using a generic computer and, thereby, do not impose a meaningful limit on the judicial exception.  
… an activation map of a neural network … activation map… activation map… neural network … activation map… activation map… activation map… activation map …- The activation map of a neural network is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment. 
the bitstream reducing of the at least one compressed block of values of the activation map comprising a reduced a size of a memory used by the neural network as compared to the bitstream being uncompressed blocks of values from the activation map - The reduction of a size of a memory associated with a neural network (including activation maps) through compression is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment (compression of neural network data). It is also noted that the reduction of the size of a memory is an outcome; the details of how that outcome is accomplished are not recited by the claims.
None of these additional elements integrate the judicial exception into a practical application because the computing devices and the compression of neural network data/activation maps are recited at a high level of generality and correspond to generic computer functions.  
In addition, according to the second part of the Alice/Mayo test (Step 2B), it must be determined whether the claim as a whole recites something significantly more than the judicial exception, with the claim elements considered both individually and as an ordered combination. The recitation in the preamble is insufficient to transform a judicial exception into a patentable invention because the preamble elements are recited at a high level of generality and simply link the exception to a field of use, see MPEP 2106.05(h). The examiner further notes that the claim limitation(s) below are deemed insufficient to transform a judicial exception into a patentable invention, as described in the analysis that follows:
The elements in the limitations below are insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
Generic computer implemented method, processing resources as noted above (with respect to the processors and the computer implemented method).
receiving… It is noted that the claimed extra-solution data gathering is acknowledged to be well-understood, routine, conventional activity (see, e.g., court recognized WURC examples in MPEP 2106.05(d)(II)(i)). Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. 
… activation map of a neural network … activation map … activation map … neural network … activation map …… activation map … activation map … activation map – as noted above (see MPEP 2106.05(h)). 
the at least one compressed bitstream reducing a size of a memory used by the neural network as compared to the bitstream being uncompressed - as noted above (see MPEP 2106.05(h)). It is also noted that the reduction of the size of a memory is an outcome; the details of how that outcome is accomplished are not recited by the claims (see MPEP 2106.05(f)(1)).
As discussed in the Step 1, 2A Prongs 1 and 2, and 2B analyses, the limitations of claim 17, examined individually or as an ordered combination, recite no meaningful limitations that amount to significantly more than the exception itself. In particular, there is no indication that the combination of elements improves the functioning of a computer or improves another technology. Therefore, when looking at the claim elements individually or as an ordered combination, claim 17 does not recite elements identified by the courts as “significantly more”.

Furthermore, regarding dependent claims 3-8, which depend on claim 1, the disclosed limitations do not recite elements identified by the courts as “significantly more”. The examiner notes that the dependent claim elements are deemed insufficient to transform a judicial exception into a patentable invention and are considered part of the abstract idea, as noted below:
Claim 3:
Step 2A
Prong 1 (Yes):
wherein the at least one lossless compression mode selected to encode the at least one block of values is different from a lossless compression mode selected to encode another block of the tensor. (Yes)  The claim, under its broadest reasonable interpretation, recites mathematical steps of encoding another block of values of the tensor using a different selected lossless compression mode. The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group.
Prong 2 (No): The claim does not recite any additional elements
Step 2B: 
The claim does not recite additional elements that the courts have identified as “significantly more” for the same reasons as pointed out in claim 1. 
Claim 4:
Step 2A
Prong 1 (Yes):
wherein the encoder … is further configured to encode the at least one block of values by encoding the at least one block independently from other blocks of the tensor using a plurality of the lossless compression modes.   (Yes) The claim, under its broadest reasonable interpretation, recites mathematical steps of encoding a block of values of the tensor independently. The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group.
Prong 2 (No): The claim does recite additional elements:
Circuit … The system/processors/circuits in the computer system that perform the mathematical steps of encoding/compression are recited at a high level of generality and are no more than mere instructions to apply the exception using a generic computer and, thereby, do not impose a meaningful limit on the judicial exception.
None of these additional elements integrate the judicial exception into a practical application because the computing devices and the compression/decompression of neural network data/activation maps are recited at a high level of generality and correspond to generic computer functions.
Step 2B
The claim does not recite additional elements that the courts have identified as “significantly more” for the same reasons as pointed out in claim 1 – i.e., generic computer system, processing resources/circuits as noted above.
Claim 5:
Step 2A
Prong 1 (Yes): no additional element is recited
Prong 2 (No): The claim recites one additional element:
wherein the at least one block of values comprises 48 bits - The function of representing a block by 48 bits is a mere data selection step and the computers that perform that function are recited at a high level of generality that does not impose a meaningful limitation on the judicial exception. 
Step 2B: 
The element in the limitations below is insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
wherein the at least one block of values comprises 48 bits … It is noted that the claimed extra-solution of data selection/data type selection is acknowledged to be well-understood, routine, conventional activity (see, e.g., court recognized WURC examples in MPEP 2106.05(g)(3)). Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. 
Claim 6:
Step 2A
Prong 1 (Yes):
wherein the encoder … is further configured to … the at least one block of values encoded as a bit stream; (Yes)  The claim, under its broadest reasonable interpretation, recites mathematical steps of encoding a block of values to form a stream of bits. The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group. 
Prong 2 (No): The claim recites additional elements:
To output - The function of outputting data is a mere data outputting step and the computers that perform that function are recited at a high level of generality that does not impose a meaningful limitation on the judicial exception. 
Circuit … The system/processors/circuits in the computer system that perform the mathematical steps of encoding/compression are recited at a high level of generality and are no more than mere instructions to apply the exception using a generic computer and, thereby, do not impose a meaningful limit on the judicial exception.
None of these additional elements integrate the judicial exception into a practical application because the computing devices and the compression/decompression of neural network data/activation maps are recited at a high level of generality and correspond to generic computer functions.
Step 2B: 
The element in the limitations below is insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
To Output… It is noted that the claimed extra-solution data outputting is acknowledged to be well-understood, routine, conventional activity (see, e.g., court recognized WURC examples in MPEP 2106.05(d)(II)(i)). Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. 
Generic computer system, processing resources/circuits as noted above.
Claim 7:
Step 2A
Prong 1 (Yes):
further comprising: a decoder … configured to decode the at least one block independently from other blocks of the tensor using at least one decompression mode corresponding to the at least one compression mode used to compress the at least one block; (Yes) The claim, under its broadest reasonable interpretation, recites mathematical steps of decoding a block of values (independently of other blocks) using a decompression mode. The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group. 
and a deformatter … configured to deformat the at least one block of values into a tensor having the size of HxWxC. (Yes) The claim, under its broadest reasonable interpretation, recites mathematical steps of deformatting a block of values into a tensor.  The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group.  
Prong 2 (No): The claim recites additional elements:
Circuit …circuit… The system/processors/circuits in the computer system that perform the mathematical steps of decoding/decompression and deformatting are recited at a high level of generality and are no more than mere instructions to apply the exception using a generic computer and, thereby, do not impose a meaningful limit on the judicial exception.
None of these additional elements integrate the judicial exception into a practical application because the computing devices and the compression/decompression of neural network data/activation maps are recited at a high level of generality and correspond to generic computer functions.
Step 2B: 
The claim does not recite additional elements that the courts have identified as “significantly more” for the same reasons as pointed out in claim 1 – i.e., generic computer system, processing resources/circuits as noted above.
Claim 8:
Step 2A
Prong 1 (Yes):
wherein the … includes floating-point values, the … further comprising a quantizer … configured to quantize the floating-point values of the … to be integer values; (Yes)  The claim, under its broadest reasonable interpretation, recites mathematical steps of quantizing floating-point values into integer values. The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group. 
Prong 2 (No): The claim recites additional elements:
system …:  The system/processors in the computer system that perform the mathematical steps of quantization are recited at a high level of generality and are no more than mere instructions to apply the exception using a generic computer and, thereby, do not impose a meaningful limit on the judicial exception.  
… activation map …- The activation map of a neural network is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment. 
Circuit … The system/processors/circuits in the computer system that perform the mathematical steps of quantization are recited at a high level of generality and are no more than mere instructions to apply the exception using a generic computer and, thereby, do not impose a meaningful limit on the judicial exception.
None of these additional elements integrate the judicial exception into a practical application because the computing devices and the compression/decompression/quantization of neural network data/activation maps are recited at a high level of generality and correspond to generic computer functions. 
Step 2B: 
The element in the limitations below is insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
Generic computer system, processing resources as noted above (with respect to the processors/circuits and the computer implemented method).
… activation map … as noted above (see MPEP 2106.05(h))


Therefore, as a whole, claims 3-8 do not recite what the courts have identified as “significantly more”.
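For illustration only, and forming no part of the analysis above, the quantization named in the claim 8 limitation (floating-point values converted to integer values) is, in a generic form, arithmetic of the following kind. The uniform min-max scheme, the default of 8 bits, and the name quantize are assumptions made for this sketch; the claim limitation as quoted in this action does not specify a quantization scheme.

```python
def quantize(values, bits=8):
    """Map floating-point values onto integers in the range 0 .. 2**bits - 1.

    Assumption: uniform (linear) quantization between the smallest and the
    largest input value. Returns the integers plus the offset and scale that
    would be needed to approximate the original values again.
    """
    low, high = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (high - low) / levels if high > low else 1.0
    quantized = [round((v - low) / scale) for v in values]
    return quantized, low, scale


# Hypothetical floating-point activations quantized to 2-bit integers.
integers, offset, step = quantize([0.0, 1.0, 3.0], bits=2)
```

Unlike the lossless encoding discussed for claim 1, a quantization step of this kind generally discards precision, so only the integer values it produces are candidates for exact reconstruction by a lossless coder.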

Furthermore, regarding dependent claims 11-16, which depend on claim 9, the disclosed limitations do not recite elements identified by the courts as "significantly more". In particular, claims 11-16 recite elements similar to those analyzed in dependent claims 3-8, respectively, above and are rejected for the same reasons as dependent claims 3-8. 

Furthermore, regarding dependent claims 19-20, which depend on claim 17, the disclosed limitations do not recite elements identified by the courts as "significantly more". The examiner notes that the elements of the dependent claims are deemed insufficient to transform a judicial exception into a patentable invention and are considered part of the abstract idea, as noted below:
Claim 19:
Step 2A
Prong 1 (Yes):
further comprising: … at a formatter … at least one … configured as a tensor having a tensor size of HxWxC; formatting by the formatter … the tensor of the received at least one … into at least one block of values; (Yes)  The claim, under its broadest reasonable interpretation, recites mathematical steps of representing and formatting a tensor as a block of values.  The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group. 
 and compressing by an encoder … the at least one block independently from other blocks of the tensor of the at least one … using the at least one lossless compression mode…. (Yes)  The claim, under its broadest reasonable interpretation, recites mathematical steps of compressing/encoding a block of values of the tensor using a lossless compression mode. The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group.
Prong 2 (No): The claim recites additional elements:
Receiving … received - The function of receiving data is a mere data gathering step and the computers that perform that function are recited at a high level of generality that does not impose a meaningful limitation on the judicial exception.
activation map … activation map … activation map …- The activation map of a neural network is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment. 
Circuit …circuit … The system/processors/circuits in the computer system that perform the mathematical steps of encoding and formatting are recited at a high level of generality and are no more than mere instructions to apply the exception using a generic computer and, thereby, do not impose a meaningful limit on the judicial exception.
to reduce the size of a memory used by the neural network - The reduction of a size of a memory associated with a neural network through compression is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment (compression of neural network data).
None of these additional elements integrate the judicial exception into a practical application because the computing devices and the compression/decompression/quantization of neural network data/activation maps are recited at a high level of generality and correspond to generic computer functions.
Step 2B: 
The element in the limitations below is insufficient to transform a judicial exception to a patentable invention because the recited elements are considered insignificant extra-solution activity, see MPEP 2106.05(g):
Receiving … received … It is noted that the claimed extra-solution data gathering is acknowledged to be well-understood, routine, conventional activity (see, e.g., court recognized WURC examples in MPEP 2106.05(d)(II)(i)). Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. 
Generic computer system, processing resources as noted above (with respect to the processors/circuits and the computer implemented method).
activation map … activation map … activation map …- as noted above (see MPEP 2106.05(h)).
to reduce the size of a memory used by the neural network - as noted above (see MPEP 2106.05(h)). It is also noted that the reduction of the size of a memory is an outcome, and the details of how that outcome is accomplished are not recited by the claims (see MPEP 2106.05(f)(1)). 
Claim 20:
Step 2A
Prong 1 (Yes):
wherein the at least one lossless compression mode selected to compress the at least one block of values is different from a lossless compression mode selected to compress another block of the tensor of the at least one …, (Yes)  The claim, under its broadest reasonable interpretation, recites mathematical steps of encoding/compressing another block of values of the tensor using a different selected lossless compression mode. The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group.
and wherein compressing the at least one block of values further comprises compressing by the encoder … the at least one block independently from other blocks of the tensor of the at least one … using a plurality of the lossless compression modes. (Yes) The claim, under its broadest reasonable interpretation, recites mathematical steps of encoding a block of values of the tensor independently. The mere recitation of a generic computer device/system to perform these mathematical steps does not take the claim limitation out of the mathematical concepts group.
Prong 2 (No): The claim recites additional elements:
activation map … activation map …- The activation map of a neural network is recited at a high level of generality that merely generally links the judicial exception to a particular technological environment. 
Circuit … The system/processors/circuits in the computer system that perform the mathematical steps of encoding are recited at a high level of generality and are no more than mere instructions to apply the exception using a generic computer and, thereby, do not impose a meaningful limit on the judicial exception.
None of these additional elements integrate the judicial exception into a practical application because the computing devices and the compression/decompression/quantization of neural network data/activation maps are recited at a high level of generality and correspond to generic computer functions.
Step 2B
activation map … activation map … activation map …- as noted above.
Generic computer system, processing resources as noted above (with respect to the processors/circuits and the computer implemented method). 

In summary, as shown in the analysis above, claims 1, 3-9, 11-17, and 19-20 do not provide any additional elements that, when considered individually or as an ordered combination, amount to significantly more than the abstract idea identified. Therefore, as a whole, claims 1, 3-9, 11-17, and 19-20 do not recite what the courts have identified as "significantly more". In particular, there is no indication that the combination of elements improves the functioning of a computer or improves another technology when the claims are considered individually or as an ordered combination.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 3-6, 9, and 11-14 are rejected under 35 U.S.C. 103 as being unpatentable over Choi et al. (“Near-Lossless Deep Feature Compression for Collaborative Intelligence”, arxiv.org/pdf/1804.09963.v1.pdf, arXiv:1804.09963v1 [eess.IV] 26 April 2018, pp. 1-6), hereinafter referred to as Choi, in view of Loganathan et al. (“Comparison of encoding techniques for transmission of image data obtained using compressed sensing in wireless sensor networks”, 2013 International Conference on Recent Trends in Information Technology (ICRTIT), 2013, pp. 696-701), hereinafter referred to as Loganathan, and in further view of Aziz et al. (“Implementation of H.264/MPEG-4 AVC for Compound Image Compression Using Histogram based Block Classification Scheme”, (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 6 (4), 2015, pp. 3479-3488), hereinafter referred to as Aziz.

In regards to claim 1, Choi teaches A system to losslessly compress an activation map of a neural network and reduce a size of a memory used by the neural network, the system comprising: ([Abstract, p. 1, Section 1, Figure 1, Figure 6], Collaborative intelligence is a new paradigm for efficient deployment of deep neural networks across the mobile cloud infrastructure. By dividing the network between the mobile and the cloud, it is possible to distribute the computational workload such that the overall energy and/or latency of the system is minimized. However, this necessitates sending deep feature data from the mobile to the cloud in order to perform inference. In this work, we examine the differences between the deep feature data and natural image data, and propose a simple and effective near-lossless deep feature compressor., With a view towards collaborative intelligence, in this work we propose a simple and effective near-lossless compression method tailored to deep feature data., wherein a method for performing near lossless (interpreted as lossless) compression of deep neural network feature data (activation maps) is applied in a system such as one using a mobile-cloud infrastructure such that this compression reduces the size (number of bits) of memory associated with the neural network feature data (memory used by a neural network); it is further noted that the reduction of size of the memory is an intended result and must be given little patentable weight.) a formatter circuit configured to format a tensor corresponding to the activation map into at least one block of values, the tensor having a size of H x W x C in which H represents a height of the tensor, W represents a width of the tensor, and C represents a number of channels of the tensor; ([p. 2, Section II, Figure 2], Feature values are typically quantized using an n-bit uniform quantizer (Q-layer in Fig. 1) prior to lossless [3] or lossy [4] compression. 
Ve = round( ((V − min(V)) / (max(V) − min(V))) · (2^n − 1) )   (1), where V ∈ R^(N×M×C) is the feature tensor with N rows, M columns, and C channels at the point of split, Ve is the quantized feature tensor, and min(V) and max(V) are the minimum and maximum value in V, respectively…. The quantized features Ve are rearranged in a tiled image, as shown in Fig. 2., wherein a tensor with dimension N rows (corresponding to height H) x M columns (corresponding to W width) x C channels is quantized and rearranged (formatted) in a tiled image (Figure 2) such that the resulting tiles are blocks formed from the output of the “formatter”.) and an encoder circuit configured to encode the at least one block independently from other blocks of the tensor using … removemin lossless compression ([p. 2, Section II, pp. 3-4, Section III], Feature values are typically quantized using an n-bit uniform quantizer (Q-layer in Fig. 1) prior to lossless [3] or lossy [4] compression. <equation 1>, Before coding the quantized feature data, the following parameters are encoded directly using fixed-length coding: dimensions of the feature tensor, min(V) and max(V) (32- bit each) and the eight most frequent feature values, mi for i = 0, 1, ...7. A vector of these values, p = (p0, p1, ..., p7), is referred to as the palette vector.  Initially, the palette vector is sorted according to the frequency of these values in the first tile, so that p0 is the most frequent of the mi’s in the first tile, p1 is the next most frequent, etc. As we move to other tiles, the palette vector p = (p0, p1, ..., p7) is re-sorted according to the frequency of occurrence of mi’s up to the previously coded tile, so that p0 is the most frequent mi up to that point, and so on. At the tile boundary, once p is updated, one element of p is chosen to minimize the mean absolute difference (MAD) from the feature values in the to-be-coded tile. … The most frequently used mode among them is considered the mpm. 
If the current block’s mode is the same as mpm, bit 1 is coded by CABAC [15] to indicate it. Otherwise, bit 0 is coded, followed by two bits to indicate the mode. Prediction residuals for each 4 × 4 block are coded by CABAC. The first bit is the SKIP indicator. If the residual is all-zero, the SKIP indicator is set to 1 and the encoder moves to the next block. Otherwise, the SKIP indicator is set to 0 and residuals are coded using one of three scan orders: horizontal, vertical, and zig-zag. For the Ver (Hor) prediction mode, vertical (horizontal) scan order is used. Other modes use the zig-zag scan order. Locations of non-zero residuals are first indicated by binarizing the scanned block, with 1’s. …. . Finally, the non-zero residual values are coded in a manner similar to HEVC [15]: values larger than 1 or 2 are flagged, the flags are CABAC-coded, and the non-flagged values are binarized using exponential Golomb-Rice coding, then coded by CABAC., wherein a current tile/block is separately/independently encoded using CABAC and zero-encoding (corresponding to the condition SKIP =1) if there are no non-zero residuals or through HEVC (High Efficiency Video Coding) with CABAC encoding if the residual values are larger than 1 or 2 or using HEVC with Golomb-Rice coding and CABAC (both lossless/nearly lossless compression modes) if the non-zero residual values are not larger than 1 or 2, wherein it is noted that other encoding modes are also used in this process (e.g., three scan orders or to code residuals) with the independence of the encoding of different blocks also indicated by the association of a most probable mode with that particular block, and wherein it is further noted that various parameters associated with the feature/activation map tensor are encoded using fixed-length coding (also lossless/nearly lossless) to determine a palette vector that is used to perform block-specific encoding by associating an element of that vector with each tile/block, including 
the parameter min(V) which is the minimum of the features determined prior to the removal of that minimum value (removemin operation) for quantization (equation 1).) to reduce the size of the memory used by the neural network ([pp. 3-4, Section III, Figure 6], Prediction residuals for each 4 × 4 block are coded by CABAC. The first bit is the SKIP indicator. If the residual is all-zero, the SKIP indicator is set to 1 and the encoder moves to the next block. Otherwise, the SKIP indicator is set to 0 and residuals are coded using one of three scan orders: horizontal, vertical, and zig-zag. For the Ver (Hor) prediction mode, vertical (horizontal) scan order is used. Other modes use the zig-zag scan order. Locations of non-zero residuals are first indicated by binarizing the scanned block, with 1’s…. Finally, the non-zero residual values are coded in a manner similar to HEVC [15]: values larger than 1 or 2 are flagged, the flags are CABAC-coded, and the non-flagged values are binarized using exponential Golomb-Rice coding, then coded by CABAC., wherein exponential-Golomb encoding (interpreted to be a species of Golomb-Rice encoding) is used to encode each block as previously indicated, wherein CABAC with zero-encoding is employed if there are no residual values (i.e., SKIP =1 is encoded and the encoder moves to the next block if all residuals are zero) such that this compression reduces the size (number of bits) of memory associated with the neural network feature data (i.e., memory used by a neural network), and wherein it is noted that the claim language only requires any particular one of the modes, and wherein it is also noted that the H, W, and C values of the dimension of the tensor matrix (not a block of values of the tensor itself) are encoded using fixed length encoding; it is further noted that the reduction of size of the memory is an intended result and must be given little patentable weight.)
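For illustration only, the n-bit uniform quantization of equation (1) quoted from Choi above, including the subtraction of min(V), can be sketched as follows. The function name quantize_uniform, the flat-list representation of the tensor, and the handling of a constant-valued tensor are assumptions made for this sketch and are not taken from the cited reference or from the claims.

```python
def quantize_uniform(values, n_bits):
    """Sketch of equation (1): map floats to integers in [0, 2**n_bits - 1].

    Returns the quantized values together with min(V) and max(V), which the
    quoted passage says are sent separately using fixed-length coding.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant tensor quantizes to all zeros.
        return [0 for _ in values], lo, hi
    scale = (2 ** n_bits - 1) / (hi - lo)
    # Subtracting lo removes the minimum before scaling.
    # Note: Python's round() rounds ties to the nearest even integer; the
    # quoted passage does not specify a tie-breaking rule.
    return [round((v - lo) * scale) for v in values], lo, hi


quantized, v_min, v_max = quantize_uniform([-1.0, 0.0, 3.0], 2)
print(quantized, v_min, v_max)
```

The quantized values alone are not sufficient to recover the original range, which is why min(V) and max(V) are returned alongside them in this sketch.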
However, Choi does not explicitly teach Sparse-Exponential-Golomb-RemoveMin. In other words, although Choi teaches exponential Golomb encoding, he does not explicitly disclose a “sparse” encoding mode – viz., Sparse-Exponential-Golomb encoding. In addition, although Choi teaches a removemin operation (as noted above with respect to equation 1) and associated fixed length encoding of min(V), the coding associated with min(V) (removeMin operation) is fixed length, not exponential Golomb. Finally, Choi teaches CABAC encoding of tensor blocks but does not disclose sufficient algorithmic details of CABAC encoding to associate it with the recited encoding mode.
However, Loganathan, in the analogous environment of lossless compression of images teaches and an encoder circuit configured to encode the at least one block independently from other blocks … using Sparse-Exponential-Golomb … lossless encoding, ….  ([p. 697, Section II, p. 697, Section IIIB, p. 698, Section IIIE, Figure 2], In order to apply the random matrix more efficiently, the sparse components are needed to be regrouped and a block based sampling strategy is followed [4]. The sparse components are first divided into several groups by scaling and then reordering it into a number of vectors of the same dimension., After applying the 2D wavelet transform to the image, the resultant four bands LL, LH, HL and HH are to be encoded and transmitted. The transmitter block diagram of the proposed scheme is as shown in Fig. 2., In this case, m is determined as m = 2^k. Therefore, only the parameter k must be specified to obtain m. This parameter k will also indicate the length of the suffix for the code. Exponential Golomb codes have three different parts which, once concatenated, produce the code [16]. Two intermediate values are used to build the code, f and w, which are calculated using equations (4) and (5) respectively. w(n) = 1 + n/2^k (4), f(n) = log2(1 + n/2^k) (5). When the k value increases the number of bits needed to encode the non-negative decimal no increases. The practical implementation of the Exponential Golomb coding is dealt in detail in [16]. In the algorithm implementation, the coding of the zero value can be optimized by just writing 0 with k + 1 bits. 
If the value is not zero, then we must continue with the coding process., wherein an entropy-based encoder compresses each block (e.g., LL, LH, HL, and HH in Figure 2) independently (i.e., each block/group within the image is compressed separately/independently in a block based sampling strategy) using  jpeg encoding and sparse measurement encoding in which (for the latter encoding) sparse components of each (image) block to be encoded undergo Exponential Golomb Coding (interpreted as being sparse Exponential Golomb Encoding) followed by a run-length encoding in which a zero value is represented/encoded with k+1 bits (in a BRI sense, this is a “sparse” fixed length encoding scheme since k is pre-specified with this particular fixed length encoding applied to any “0” term).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Choi to incorporate the teachings of Loganathan for the encoder to encode each block of an activation map using Sparse-Exponential-Golomb encoding. The modification would have been obvious because one of ordinary skill would have been motivated to achieve improved computational efficiency  with good performance/accuracy for (sparse) data with exponential probability distributions and large dispersions ([Loganathan, p. 698, Section IIIE, p. 701, Section VI]).
However, Choi and Loganathan do not explicitly teach the combination of Sparse-Exponential-Golomb and RemoveMin. Loganathan does not clearly disclose a remove minimum operation (quantization/binarization), parameters of which (e.g., min(V)) are encoded by Exponential-Golomb encoding.
However, Aziz, in the analogous environment of lossless compression of images teaches and an encoder circuit configured to encode the at least one block independently from other blocks … using …Exponential-Golomb-RemoveMin lossless encoding, ….  ([p. 3483, Section 9.1, p. 3484, Section 9.3, p. 3485, Section 11], Binarization maps the non-binary valued SE into bin string, which is a sequence of binary decision (bin)…. Kth ordered Exp-Golomb binarization (EGk) – a derivative of Golomb coding is proved to be an optimal prefix-free coding for geometrically distributed sources. EGk codeword consists of prefix and suffix bin strings, with total length of 2l+k+1 bits. EGk prefix is a Unary codeword, with l bits of 1 and one terminating bit 0., Because Range and Low of coding interval are represented by finite number of bits (9 bits for Range and 10 bits for Low), it is necessary to renormalize (scale up) the interval to prevent precision degradation, and the upper bits of Low are output as coded bits during renormalization., Because of the high computational complexity of CABAC, another entropy coding tool CAVLC is deployed in the Baseline profile and extended profile of H.264/AVC targeting low bit-rate real-time video coding. It offers compression complexity trade-off with lower complexity, and lower coding efficiency, compared to CABAC. It is employed to encode the quantized transform coefficients of 4×4 residual blocks, while zero-order Exp-Golomb codes (EG0) are used for all other types of non-residual SEs…. 
Since the adoption of CABAC entropy coding in H.264/AVC, CABAC is also applied in many applications of image and video processing including motion mode and residual data of 3D dynamic mesh, prediction residual in lossless 4D medical image compression, SEs of 8×8 transform coefficients of AVS coding standard, motion vector coding of scalable video coder, parameters of depth and correction vectors in multi-view video coding., wherein an entropy-based encoder (either CABAC or CAVLC) compresses (image/video) blocks in which Exponential-Golomb is used to encode renormalized intervals for the features in those blocks (to prevent precision degradation – interpreted as corresponding to a removemin operation in which specifically the “low” parameter is encoded during CABAC using kth order Exp-Golomb binarization) and (alternatively) is also used to specifically encode all of the non-residual syntax elements (SEs) during CAVLC encoding (the non-residual SEs being interpreted as including the minimum value, such as that encoded using fixed length encoding by Choi) and wherein it is noted that these entropy encoders (especially CABAC) are lossless.) 
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Choi and Loganathan to incorporate the teachings of Aziz for the encoder to encode each block of an activation map using a coding scheme using Sparse-Exponential-Golomb-RemoveMin encoding. The modification would have been obvious because one of ordinary skill would have been motivated to achieve improved computational efficiency with optimal performance/accuracy/efficiency for (multi-dimensional) data with geometric distributions including the encoding of normalization/renormalization functionalities such as using CAVLC for low bit rate data (Aziz, [Abstract, p. 3483, Section 9.1, p. 3485, Section 11]).
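For illustration only, the kth order Exp-Golomb (EGk) binarization described in the passage quoted from Aziz (a unary prefix of l ones, a terminating 0, and a suffix, for a total of 2l+k+1 bits) can be sketched as follows. The function names, the string representation of the bins, and the decoder are assumptions made for this sketch; they are not taken from the cited references or from the claims, and the sketch does not include any "sparse" or "RemoveMin" step.

```python
def egk_encode(value, k):
    """Return the kth order Exp-Golomb bin string for a non-negative integer."""
    if value < 0 or k < 0:
        raise ValueError("value and k must be non-negative")
    bins = []
    # Unary prefix: emit a 1 and grow the suffix width while the value
    # does not fit in the current k-bit suffix.
    while value >= (1 << k):
        bins.append("1")
        value -= 1 << k
        k += 1
    bins.append("0")  # terminating bit of the prefix
    # Suffix: the remaining value written with k bits, most significant first.
    for shift in range(k - 1, -1, -1):
        bins.append(str((value >> shift) & 1))
    return "".join(bins)


def egk_decode(bins, k):
    """Decode one EGk codeword; return (value, number of bins consumed)."""
    pos, value = 0, 0
    while bins[pos] == "1":
        value += 1 << k
        k += 1
        pos += 1
    pos += 1  # skip the terminating 0
    if k:
        value += int(bins[pos:pos + k], 2)
    return value, pos + k


for n in range(4):
    print(n, egk_encode(n, 0))
```

Because every codeword decodes back to the value that produced it, the binarization by itself loses no information, which is consistent with the characterization of these entropy coding stages as lossless.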

In regards to claim 3, the rejection of claim 1 is incorporated and Choi further teaches wherein the at least one lossless compression mode selected to encode the at least one block of values is different from a lossless compression mode selected to encode another block of the tensor.  ([pp. 3-4, Section III], Prediction residuals for each 4 × 4 block are coded by CABAC. The first bit is the SKIP indicator. If the residual is all-zero, the SKIP indicator is set to 1 and the encoder moves to the next block. Otherwise, the SKIP indicator is set to 0 and residuals are coded using one of three scan orders: horizontal, vertical, and zig-zag. For the Ver (Hor) prediction mode, vertical (horizontal) scan order is used. Other modes use the zig-zag scan order. Locations of non-zero residuals are first indicated by binarizing the scanned block, with 1’s…. Finally, the non-zero residual values are coded in a manner similar to HEVC [15]: values larger than 1 or 2 are flagged, the flags are CABAC-coded, and the non-flagged values are binarized using exponential Golomb-Rice coding, then coded by CABAC., wherein distinct encoding/compression modes are invoked/selected in the form of exponential-Golomb encoding (interpreted to be a species of Golomb-Rice encoding) or  zero-encoding (with CABAC) depending on if there are any residual values (e.g., SKIP =1 is encoded and the encoder moves to the next block if all residuals are zero).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Choi to incorporate the teachings of Loganathan and Aziz for the same reasons as pointed out for claim 1.
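For illustration only, the block-by-block mode selection discussed above for claim 3 (an all-zero block signalled by a SKIP indicator, any other block coded with an exponential-Golomb binarization) can be sketched as follows. This is a deliberately simplified sketch: it omits the prediction modes, scan orders, value flags, and CABAC stage described in the quoted passage, and the function names and the signed-to-unsigned mapping are assumptions introduced here rather than details of Choi.

```python
def exp_golomb0(value):
    """Zero-order Exp-Golomb bin string for a non-negative integer."""
    k, bins = 0, ""
    while value >= (1 << k):
        bins += "1"
        value -= 1 << k
        k += 1
    bins += "0"
    if k:
        bins += format(value, "0{}b".format(k))
    return bins


def encode_block(residuals):
    """Encode one block without reference to any other block."""
    if all(r == 0 for r in residuals):
        return "1"  # SKIP indicator set to 1: nothing else is coded
    out = "0"  # SKIP indicator set to 0: residual values follow
    for r in residuals:
        # Assumed mapping of signed residuals to non-negative integers.
        mapped = 2 * r - 1 if r > 0 else -2 * r
        out += exp_golomb0(mapped)
    return out


def encode_blocks(blocks):
    """Each block selects its own mode, so different blocks may differ."""
    return [encode_block(block) for block in blocks]


print(encode_blocks([[0, 0, 0, 0], [0, 1, 0, -1]]))
```

In this sketch the first block is coded in the skip mode and the second in the Exp-Golomb mode, so two blocks of the same tensor end up using different modes.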

In regards to claim 4, the rejection of claim 1 is incorporated and Choi further teaches wherein the encoder circuit is further configured to encode the at least one block of values by encoding the at least one block of values independently from other blocks of the tensor using a plurality of the lossless compression modes.  ([pp. 3-4, Section III], Prediction residuals for each 4 × 4 block are coded by CABAC. The first bit is the SKIP indicator. If the residual is all-zero, the SKIP indicator is set to 1 and the encoder moves to the next block. Otherwise, the SKIP indicator is set to 0 and residuals are coded using one of three scan orders: horizontal, vertical, and zig-zag. For the Ver (Hor) prediction mode, vertical (horizontal) scan order is used. Other modes use the zig-zag scan order. Locations of non-zero residuals are first indicated by binarizing the scanned block, with 1’s…. Finally, the non-zero residual values are coded in a manner similar to HEVC [15]: values larger than 1 or 2 are flagged, the flags are CABAC-coded, and the non-flagged values are binarized using exponential Golomb-Rice coding, then coded by CABAC., wherein zero-encoding and exponential-Golomb encoding  are two different (nearly lossless overall but lossless after the quantization) coding techniques/modes applied distinctly and independently to each tile/block as pointed out above.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Choi to incorporate the teachings of Loganathan and Aziz for the same reasons as pointed out for claim 1.

In regards to claim 5, the rejection of claim 1 is incorporated and Choi further teaches wherein the at least one block of values comprises 48 bits.  ([p. 2, Section II], Feature values are typically quantized using an n-bit uniform quantizer (Q-layer in Fig. 1) prior to lossless [3] or lossy [4] compression…. In the studies performed so far [3], [4], [7], this uniform n-bit quantization was shown to have negligible effect on image classification and object detection accuracy, for n ≥ 6., wherein the tensor may be quantized in the compression/encoding system to n bits where n ≥ 6, which includes n = 48, without affecting the performance of the system (i.e., the compression method is applicable to any n equal to or greater than 6) and wherein it is noted that the specification appears to associate the 48 bits with the dimensions of the entire tensor rather than with a particular block which has been formed, through a formatter, from the content of that tensor as required by the claim.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Choi to incorporate the teachings of Loganathan and Aziz for the same reasons as pointed out for claim 1.

In regards to claim 6, the rejection of claim 1 is incorporated and Choi further teaches wherein the encoder circuit is further configured to output the at least one block of values encoded as a bit stream.  ([pp. 3-4, Section III], If the current block’s mode is the same as mpm, bit 1 is coded by CABAC [15] to indicate it. Otherwise, bit 0 is coded, followed by two bits to indicate the mode…. This binary vector is coded using CABAC. Finally, the non-zero residual values are coded in a manner similar to HEVC [15]: values larger than 1 or 2 are flagged, the flags are CABAC-coded, and the non-flagged values are binarized using exponential Golomb-Rice coding, then coded by CABAC., wherein the coding of each block generates a sequence of binary values (e.g., block-specific mode encoding, binary vector encoding) and wherein it is further noted that it is known that CABAC generates a bit stream (see, for example, https://en.wikipedia.org/wiki/Context-adaptive_binary_arithmetic_coding).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Choi to incorporate the teachings of Loganathan and Aziz for the same reasons as pointed out for claim 1.

Claim 9 is also rejected because it is just a method implementation of the same subject matter of claim 1 which can be found in Choi, Loganathan, and Aziz. 

Claim 11/9 is also rejected because it is just a method implementation of the same subject matter of claim 3/1 which can be found in Choi, Loganathan, and Aziz. 

Claim 12/9 is also rejected because it is just a method implementation of the same subject matter of claim 4/1 which can be found in Choi, Loganathan, and Aziz. 

Claim 13/9 is also rejected because it is just a method implementation of the same subject matter of claim 5/1 which can be found in Choi, Loganathan, and Aziz. 

Claim 14/9 is also rejected because it is just a method implementation of the same subject matter of claim 6/1 which can be found in Choi, Loganathan, and Aziz. 

Claims 7-8, 15-17, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Choi, in view of Loganathan, in view of Aziz, and in further view of Luo et al. (“DeepSIC: Deep Semantic Image Compression”, arxiv.org/pdf/1801.09468.v1.pdf, arXiv:1801.09468v1 [cs.CV] 29 January 2018, pp. 1-8), hereinafter referred to as Luo.

In regards to claim 7, the rejection of claim 6 is incorporated and Choi further teaches further comprising: a decoder circuit configured to decode the at least one block of values…; and a deformatter circuit configured to deformat the … into a tensor having the size of HxWxC. ([p. 4, Section IV, Figure 1, Figure 5], To demonstrate this, we construct a mirror model, indicated in the bottom right of Fig. 4, based on the network in the mobile. Specifically, given the network in the mobile, the mirror model consists of the same number of layers, but in reverse order: convolutional layers from the mobile network are mapped to the same convolutional layers in the mirror model, while max-pooling layers from mobile network are mapped to up-sampling layers. The goal of the mirror model is to reconstruct the input image from the deep features transmitted to the cloud., wherein a decoder decodes/decompresses the block-specific encoded/compressed activation maps to form a mirror model of the CNN which incorporates the decoded feature maps such that this decoding process reforms the input feature values/activation map, and thereby the corresponding tensor with size HxWxC, in order to reconstruct the images represented by those feature values and wherein the re-formation of the tensor as represented in the mirror model includes the incorporation/formatting of each block into that model framework.)
However, Choi, Loganathan, and Aziz do not explicitly teach … independently from other blocks of the tensor using at least one decompression mode corresponding to the at least one compression mode used to compress the at least one block of values; … the at least one block of values … In other words, although Choi suggests that the decoding process proceeds as an inverse encoding process such that each of the steps used to form the encoding would be performed in reverse sequence, he does not explicitly disclose this function (such as with respect to particular encoded “blocks”). Although Loganathan teaches a decoding/decompression process that performs a sequence of operations that correspond to the reverse of the sequence of encoding operations (e.g., Figure 4 shows this decoding process that includes Exponential-Golomb decoding for the sparse component), he does not explicitly teach that this image restoration process (at the receiver) is applied to restore a tensor representation of the blocks of data. Although Aziz discloses decoding of (image) information using CABAC (also a mirror process) for restoration, he does not specifically disclose the details of the decoding process, such as the independence of the processing of disparate blocks of information.
However, Luo, in the analogous environment of lossless compression of activation/feature maps teaches further comprising: a decoder circuit configured to decode the at least one block of values independently from other blocks of the tensor using at least one decompression mode corresponding to the at least one compression mode used to compress the at least one block of values; and a deformatter circuit configured to deformat the at least one block into a tensor having the size of HxWxC. ([p. 4, Section IIIB, p. 4, Section IIIC, Figure 2], We exploit this low entropy by lossless compression via entropy coding, to be specific, we implement an entropy coding based on the context-adaptive binary arithmetic coding (CABAC) framework proposed by [20]. Arithmetic entropy codes are designed to compress discrete-valued data to bit rates closely approaching the entropy of the representation, assuming that the probability model used to design the code approximates the data well. We associate each bit location in Q(f(x)) with a context, which comprises a set of features indicating the bit value. These features are based on the position of the bit as well as the values of neighboring bits…. Given y = Q(f(x)) denotes the quantized code, after entropy encoding y into its binary representation ŷ, we retrieve the compression code sequence. During decoding, we decompress the code by performing the inverse operation. Namely, we interleave between computing the context of a particular bit using the values of previously decoded bits. The obtained context is employed to retrieve the activation probability of the bit to be decoded. Note that this constrains the context of each bit to only involve features composed of bits already decoded., Although arithmetic entropy encoding is lossless, the quantization will bring in some loss in accuracy, the result of Q^−1(Q(f(x))) is not exactly the same as the output of feature extraction., wherein an encoder compresses (using CABAC) feature maps represented in a tensor of a CNN one block (a pixel and (features of) its neighbors) at a time to form a code sequence that is received by a decoder which performs decompression and feature map extraction/reconstruction using the inverse operation (i.e., using a decompression mode based on the compression mode) such that this decompression also occurs one block at a time (i.e., a pixel and features of its neighbors), wherein both the compression and decompression are performed independently over different blocks because they are performed distinctly and separately (i.e., each set of pixels is distinct from block to block and evaluated independently based on the context of the reference pixel in that set of pixels), wherein (as with Choi) each decompressed block is used to regenerate the tensor f of the extracted feature values (with dimension CxHxW), and wherein the deformatter is the function that performs a deformatting to integrate each block into the tensor (i.e., the reformulation/reconstruction of the tensor is a deformatting/formatting operation).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Choi, Loganathan, and Aziz to incorporate the teachings of Luo to decode each encoded block of activation map values independently, using a decoding mode corresponding to the encoding mode, and deformat them into a tensor of activation maps. The modification would have been obvious because one of ordinary skill would have been motivated to achieve near state of the art image reconstruction from losslessly decoding block-by-block individual blocks using the inverse process used in the encoding process and reformat the decompressed values into the feature map tensor of values that are incorporated into a deep neural network to facilitate image reconstruction in a multi-computation-platform (e.g., server and client) environment (Luo, [p. 4, Section IIIB, pp. 6-7, Section IVC, Figure 6, Figure 7]). 
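
For illustration only, the formatting and deformatting operations discussed above for claim 7 can be pictured with the following Python sketch. It flattens an H x W x C tensor into fixed-size blocks and later reassembles the blocks into a tensor of the same size. The function names, the row-major flattening order, and the block size are assumptions chosen for this sketch; they are not taken from the claims or from Choi, Loganathan, Aziz, or Luo.

```python
from typing import List

Tensor = List[List[List[int]]]  # indexed as tensor[h][w][c]


def format_blocks(tensor: Tensor, block_size: int) -> List[List[int]]:
    """Flatten an H x W x C tensor (row-major) and cut it into blocks.

    The last block may be shorter than block_size.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    flat = [v for row in tensor for pixel in row for v in pixel]
    return [flat[i:i + block_size] for i in range(0, len(flat), block_size)]


def deformat_blocks(blocks: List[List[int]], h: int, w: int, c: int) -> Tensor:
    """Reassemble blocks into a tensor of size H x W x C (hypothetical inverse)."""
    flat = [v for block in blocks for v in block]
    if len(flat) != h * w * c:
        raise ValueError("block contents do not match H x W x C")
    return [
        [flat[(i * w + j) * c:(i * w + j + 1) * c] for j in range(w)]
        for i in range(h)
    ]
```

Because each block is a self-contained list in this sketch, any per-block encoder or decoder can be applied to one block without reference to the others before the blocks are reassembled.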

In regards to claim 8, the rejection of claim 1 is incorporated and Choi further teaches wherein the activation map includes … values, the system further comprising a quantizer circuit configured to quantize the … values of the activation map to be integer values. ([p. 2, Section II], In this work, the Q-layer performs uniform 8-bit quantization. Note that min(V) and max(V) need to be transferred to the cloud for the inverse Q process., wherein the tensor of feature values (activation map) is quantized to 8 bits prior to data compression such that this quantization is interpreted as a conversion from a tensor with at least 8 bits to integer values represented by 8 bits.)
However, neither Choi nor Loganathan nor Aziz explicitly teaches floating-point … floating-point … Choi does not explicitly disclose what the data/bit format of the tensor is prior to quantization even though he suggests that the compression of 32-bit floating point feature values is wasteful (viz., [p. 1, Section I]).  Although Loganathan discloses various quantization functions including quantization to integer values that are encoded “bit-by-bit” (Section IIIC) , he does not clearly disclose a quantization of floating point block data elements into integer values.  Although Aziz discloses various quantization processes, he does not specifically disclose quantization into integer values.
However, Luo, in the analogous environment of lossless compression of activation/feature maps teaches wherein the activation map includes floating-point values, the system further comprising a quantizer module that quantizes the floating-point values of the activation map to be integer values. ([p. 4, Section IIIB, p. 5, Section IIID, Figure 2a], Given the extracted tensor f(x) ∈ R^{C×H×W}, before entropy coding the tensor, we first perform quantization. The feature tensor is optimally quantized to a lower bit precision B: <equation 1> The quantization bin B we use here is 6 bit., The output of the feature extraction module is the feature map of an image, which contains the significant structure of the image., The input feature maps of semantic analysis module in pre-semantic DeepSIC are under floating point precision.
Wherein the feature/activation map tensor extracted from the CNN consists of floating point values and undergoes a quantization to integer values (6 bits) for the architecture paths leading to target/image reconstruction in the DeepSIC framework.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Choi, Loganathan, and Aziz to incorporate the teachings of Luo to quantize floating point activation map values into integer values. The modification would have been obvious because one of ordinary skill would have been motivated to achieve near state of the art image reconstruction from losslessly encoding feature maps optimally quantized (when those feature maps are floating point) for deep neural network activation maps to facilitate multi-computation-platform (e.g., server and client) application of deep neural networks (Luo, [p. 4, Section IIIB, pp. 6-7, Section IVC, Figure 6, Figure 7]). 
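
As a hedged illustration of the quantizer limitation discussed for claim 8, the following Python sketch applies the uniform n-bit quantization that Choi describes in equation 1, in which min(V) is subtracted before scaling to the range 0 to 2^n − 1. The function names, the flat-list input, and the handling of a constant-valued input are assumptions made for this sketch and are not taken from the references or the claims.

```python
from typing import List, Tuple


def quantize(values: List[float], n_bits: int = 8) -> Tuple[List[int], float, float]:
    """Uniformly quantize floats to integers in [0, 2**n_bits - 1].

    Returns the integer codes together with min and max, which the
    inverse step needs (Choi notes these must be transmitted).
    """
    lo, hi = min(values), max(values)
    if hi == lo:  # assumption: map a constant input to all zeros
        return [0] * len(values), lo, hi
    scale = (1 << n_bits) - 1
    codes = [round((v - lo) / (hi - lo) * scale) for v in values]
    return codes, lo, hi


def dequantize(codes: List[int], lo: float, hi: float, n_bits: int = 8) -> List[float]:
    """Approximate inverse of quantize; the quantization step itself is lossy."""
    if hi == lo:
        return [lo] * len(codes)
    scale = (1 << n_bits) - 1
    return [lo + q / scale * (hi - lo) for q in codes]
```

The integer codes produced this way are what a subsequent lossless entropy coder would operate on; only the quantization step introduces loss.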

Claim 15/14 is also rejected because it is just a method implementation of the same subject matter of claim 7/6 which can be found in Choi, Loganathan, Aziz, and Luo. 

Claim 16/9 is also rejected because it is just a method implementation of the same subject matter of claim 8/1 which can be found in Choi, Loganathan, Aziz, and Luo. 

In regards to claim 17, Choi teaches A method to losslessly decompress an activation map of a neural network, the method comprising: receiving at a decoder circuit, a bitstream representing at least one compressed block of values of the activation map, the bitstream of the at least one compressed block of value of the activation map comprising a reduced size of a memory used by the neural network compared to the bitstream being uncompressed blocks of values of the activation map; ([Abstract, pp. 4, Section IV, Figure 1, Figure 5, Figure 6], Collaborative intelligence is a new paradigm for efficient deployment of deep neural networks across the mobile cloud infrastructure. By dividing the network between the mobile and the cloud, it is possible to distribute the computational workload such that the overall energy and/or latency of the system is minimized. However, this necessitates sending deep feature data from the mobile to the cloud in order to perform inference. In this work, we examine the differences between the deep feature data and natural image data, and propose a simple and effective near-lossless deep feature compressor., To demonstrate this, we construct a mirror model, indicated in the bottom right of Fig. 4, based on the network in the mobile. Specifically, given the network in the mobile, the mirror model consists of the same number of layers, but in reverse order: convolutional layers from the mobile network are mapped to the same convolutional layers in the mirror model, while max-pooling layers from mobile network are mapped to up-sampling layers. 
The goal of the mirror model is to reconstruct the input image from the deep features transmitted to the cloud., wherein a decoder receives the compressed feature data (bit stream of encoded neural network activation map as shown in Figure 4) for the purpose of image reconstruction such that these compressed values correspond to the encoding of the blocks/tiles as pointed out above such that this compression framework creates a bitstream of encoded CNN feature data with reduced size (Figure 6) for transmission from one platform to another.)  decompressing by the decoder circuit the at least one compressed block of values to form … decompressed … values, the decompressed … values … blocks of the activation map using a decompression mode corresponding to a lossless compression mode used to compress the at least one block of values of the activation map, ([p. 4, Section IV, Figure 1, Figure 5], … a good approximation to the input image can be reconstructed from the transmitted features. To demonstrate this, we construct a mirror model, indicated in the bottom right of Fig. 4, based on the network in the mobile. Specifically, given the network in the mobile, the mirror model consists of the same number of layers, but in reverse order: convolutional layers from the mobile network are mapped to the same convolutional layers in the mirror model, while max-pooling layers from mobile network are mapped to up-sampling layers. The goal of the mirror model is to reconstruct the input image from the deep features transmitted to the cloud., wherein the decoder decompresses the compressed tiles to generate the compressed feature data (bit stream of encoded neural network activation map as shown in Figure 4) such that these compressed values correspond to the encoding of the blocks/tiles as pointed out above.)  the lossless compression mode being a …RemoveMin lossless compression encoding mode ([p. 2, Section II, pp. 
3-4, Section III], Feature values are typically quantized using an n-bit uniform quantizer (Q-layer in Fig. 1) prior to lossless [3] or lossy [4] compression. <equation 1>, Prediction residuals for each 4 × 4 block are coded by CABAC. The first bit is the SKIP indicator. If the residual is all-zero, the SKIP indicator is set to 1 and the encoder moves to the next block. Otherwise, the SKIP indicator is set to 0 and residuals are coded using one of three scan orders: horizontal, vertical, and zig-zag. For the Ver (Hor) prediction mode, vertical (horizontal) scan order is used. Other modes use the zig-zag scan order. Locations of non-zero residuals are first indicated by binarizing the scanned block, with 1’s…. Finally, the non-zero residual values are coded in a manner similar to HEVC [15]: values larger than 1 or 2 are flagged, the flags are CABAC-coded, and the non-flagged values are binarized using exponential Golomb-Rice coding, then coded by CABAC., wherein exponential-Golomb encoding (interpreted to be a species of Golomb-Rice encoding) is used to encode each block as previously indicated, wherein CABAC with zero-encoding is employed if there are no residual values (i.e., SKIP =1 is encoded and the encoder moves to the next block if all residuals are zero), and wherein it is further noted that various parameters ( e.g., H,W, C values) associated with the feature/activation map tensor are encoded using fixed-length coding (also lossless/nearly lossless) to determine a palette vector that is used to perform block-specific encoding by associating an element of that vector with each tile/block, including the parameter min(V) which is the minimum of the value of the features determined prior to the removal of that minimum value (removemin operation) for quantization (equation 1).) 
and deformatting, by a deformatter circuit… into a tensor having a size of H x W x C in which H represents a height of the tensor, W represents a width of the tensor, and C represents a number of channels of the tensor, the tensor being the activation map that has been decompressed.  ([p. 2, Section II, p. 4, Section IV, Figure 4, Figure 5], Feature values are typically quantized …. where V ∈ R N×M×C is the feature tensor with N rows, M columns, and C channels at the point of split,  … In this work, the Q-layer performs uniform 8-bit quantization. Note that min(V) and max(V) need to be transferred to the cloud for the inverse Q process. The quantized features Ve are rearranged in a tiled image, as shown in Fig. 2., … a good approximation to the input image can be reconstructed from the transmitted features. To demonstrate this, we construct a mirror model, indicated in the bottom right of Fig. 4, based on the network in the mobile. Specifically, given the network in the mobile, the mirror model consists of the same number of layers, but in reverse order: convolutional layers from the mobile network are mapped to the same convolutional layers in the mirror model, while max-pooling layers from mobile network are mapped to up-sampling layers. The goal of the mirror model is to reconstruct the input image from the deep features transmitted to the cloud., wherein a decoder decodes/decompresses the block-specific encoded/compressed activation maps to form a mirror model of the CNN which incorporates the decoded feature maps such that this decoding process reforms the input feature values/activation map, and thereby the corresponding tensor with size HxWxC, in order to reconstruct the images represented by those feature values and wherein the re-formation of the tensor, as represented in the mirror model, includes the incorporation/formatting of each block into that model framework.) 
However, Choi does not explicitly teach … at least one … block of …, the decompressed block of … being independently decompressed from other blocks of values of …Sparse-Exponential-Golomb …; … by a deformatter circuit the at least one decompressed block of values…. In other words, although Choi suggests that the decoding process proceeds as an inverse encoding process such that each of the steps used to form the encoding (such as block-specific encoding) would be performed in reverse sequence, he does not explicitly disclose the details of the decompression function. Moreover, although Choi teaches exponential Golomb encoding, he does not explicitly disclose a “sparse” encoding mode – viz., Sparse-Exponential-Golomb encoding. In addition, although Choi teaches a removemin operation (as noted above with respect to equation 1) and associated fixed length encoding of min(V), the coding associated with min(V) (removeMin operation) is fixed length, not Exponential-Golomb. Finally, Choi teaches CABAC encoding of tensor blocks but does not disclose sufficient algorithmic details of CABAC encoding to associate it with the recited encoding mode.
However, Loganathan, in the analogous environment of lossless compression of images teaches … a lossless compression mode used to compress the at least one block of values of the activation map, the lossless compression mode being a Sparse-Exponential-Golomb… lossless encoding mode … ([p. 697, Section II, p. 697, Section IIIB, p. 698, Section IIIE, Figure 2], In order to apply the random matrix more efficiently, the sparse components are needed to be regrouped and a block based sampling strategy is followed [4]. The sparse components are first divided into several groups by scaling and then reordering it into a number of vectors of the same dimension., After applying the 2D wavelet transform to the image, the resultant four bands LL, LH, HL and HH are to be encoded and transmitted. The transmitter block diagram of the proposed scheme is as shown in Fig. 2., In this case, m is determined as m = 2^k. Therefore, only the parameter k must be specified to obtain m. This parameter k will also indicate the length of the suffix for the code. Exponential Golomb codes have three different parts which, once concatenated, produce the code [16]. Two intermediate values are used to build the code, f and w, which are calculated using equations (4) and (5) respectively: w(n) = 1 + n/2^k (4), f(n) = log2(1 + n/2^k) (5). When the k value increases the number of bits needed to encode the non-negative decimal no increases. The practical implementation of the Exponential Golomb coding is dealt in detail in [16]. In the algorithm implementation, the coding of the zero value can be optimized by just writing 0 with k + 1 bits. 
If the value is not zero, then we must continue with the coding process., wherein an entropy-based encoder compresses each block (e.g., LL, LH, HL, and HH in Figure 2) independently (i.e., each block/group within the image is compressed separately/independently in a block based sampling strategy) using  jpeg encoding and sparse measurement encoding in which (for the latter encoding) sparse components of each (image) block to be encoded undergo Exponential Golomb Coding (interpreted as being sparse Exponential Golomb Encoding) followed by a run-length encoding in which a zero value is represented/encoded with k+1 bits (in a BRI sense, this is a “sparse” fixed length encoding scheme since k is pre-specified with this particular fixed length encoding applied to any “0” term).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Choi to incorporate the teachings of Loganathan for the encoder to encode each block of an activation map using a Sparse-Exponential-Golomb coding scheme. The modification would have been obvious because one of ordinary skill would have been motivated to achieve improved computational efficiency with good performance/accuracy for (sparse) data with exponential probability distributions and large dispersions (Loganathan, [p. 698, Section IIIE, p. 701, Section VI]).
However, Choi and Loganathan do not explicitly teach the combination of Sparse-Exponential-Golomb and RemoveMin. Loganathan does not clearly disclose a remove minimum operation (quantization/binarization), parameters of which (e.g., min(V)) are encoded by Exponential-Golomb encoding.
However, Aziz, in the analogous environment of lossless compression of images teaches a lossless compression mode used to compress the at least one block of values of the activation map, the lossless compression mode being a …Exponential-Golomb-RemoveMin lossless encoding mode … ([p. 3483, Section 9.1, p. 3484, Section 9.3, p. 3485, Section 11], Binarization maps the non-binary valued SE into bin string, which is a sequence of binary decision (bin)…. Kth ordered Exp-Golomb binarization (EGk) – a derivative of Golomb coding is proved to be an optimal prefix-free coding for geometrically distributed sources. EGk codeword consists of prefix and suffix bin strings, with total length of 2l+k+1 bits. EGk prefix is a Unary codeword, with l bits of 1 and one terminating bit 0., Because Range and Low of coding interval are represented by finite number of bits (9 bits for Range and 10 bits for Low), it is necessary to renormalize (scale up) the interval to prevent precision degradation, and the upper bits of Low are output as coded bits during renormalization., Because of the high computational complexity of CABAC, another entropy coding tool CAVLC is deployed in the Baseline profile and extended profile of H.264/AVC targeting low bit-rate real-time video coding. It offers compression complexity trade-off with lower complexity, and lower coding efficiency, compared to CABAC. It is employed to encode the quantized transform coefficients of 4×4 residual blocks, while zero-order Exp-Golomb codes (EG0) are used for all other types of non-residual SEs…. 
Since the adoption of CABAC entropy coding in H.264/AVC, CABAC is also applied in many applications of image and video processing including motion mode and residual data of 3D dynamic mesh, prediction residual in lossless 4D medical image compression, SEs of 8×8 transform coefficients of AVS coding standard, motion vector coding of scalable video coder, parameters of depth and correction vectors in multi-view video coding., wherein an entropy-based encoder (either CABAC or CAVLC) compresses (image/video) blocks in which Exponential-Golomb is used to encode renormalized intervals for the features in those blocks (to prevent precision degradation – interpreted as corresponding to a removemin operation in which specifically the “low” parameter is encoded during CABAC using kth order Exp-Golomb binarization) and (alternatively) is also used to specifically encode all of the non-residual syntax elements (SEs) during CAVLC encoding (the non-residual SEs being interpreted as including the minimum value, such as that encoded using fixed length encoding by Choi) and wherein it is noted that these entropy encoders (especially CABAC) are lossless.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Choi and Loganathan to incorporate the teachings of Aziz for the encoder to encode each block of an activation map using a Sparse-Exponential-Golomb coding scheme. The modification would have been obvious because one of ordinary skill would have been motivated to achieve improved computational efficiency with optimal performance/accuracy/efficiency for (multi-dimensional) data with geometric distributions, including the encoding of normalization/renormalization functionalities such as using CAVLC for low bit rate data (Aziz, [Abstract, p. 3483, Section 9.1, p. 3485, Section 11]).
However, Choi, Loganathan, and Aziz do not explicitly teach … at least one … block of …, the decompressed block of … being independently decompressed from other … by a deformatter circuit the at least one decompressed block of values…. Although Loganathan teaches a decoding/decompression process that performs a sequence of operations that correspond to the reverse of the sequence of encoding operations (e.g., Figure 4 shows this decoding process that includes Exponential-Golomb decoding for the sparse component), he does not explicitly teach that this image restoration process (at the receiver) is applied to restore a tensor representation of the blocks of data. Although Loganathan teaches the use of sparse Exponential Golomb encoding (as well as quantization), he does not disclose an application in a context that includes a removemin operation (e.g., used as a prelude to quantization/binarization). Although Aziz discusses the encoding process in detail, he does not provide details associated with the decoding process sufficient to discern the independence of the decompression of blocks or a deformatting functionality.
However, Luo, in the analogous environment of lossless compression of activation/feature maps teaches decompressing by the decoder module the at least one compressed block of values to form at least one decompressed block of values, the decompressed block of values being independently decompressed from other blocks of the activation map using at least one decompression mode corresponding to at least one lossless compression mode used to compress the at least one block; and deformatting by a deformatter circuit the at least one block of values into a tensor having a size of H x W x C in which H represents a height of the tensor, W represents a width of the tensor, and C represents a number of channels of the tensor, the tensor being the decompressed activation map. ([p. 4, Section IIIB, p. 4, Section IIIC, Figure 2], We exploit this low entropy by lossless compression via entropy coding, to be specific, we implement an entropy coding based on the context-adaptive binary arithmetic coding (CABAC) framework proposed by [20]. Arithmetic entropy codes are designed to compress discrete-valued data to bit rates closely approaching the entropy of the representation, assuming that the probability model used to design the code approximates the data well. We associate each bit location in Q(f(x)) with a context, which comprises a set of features indicating the bit value. These features are based on the position of the bit as well as the values of neighboring bits…. Given y = Q(f(x)) denotes the quantized code, after entropy encoding y into its binary representation ŷ, we retrieve the compression code sequence. During decoding, we decompress the code by performing the inverse operation. Namely, we interleave between computing the context of a particular bit using the values of previously decoded bits. The obtained context is employed to retrieve the activation probability of the bit to be decoded. Note that this constrains the context of each bit to only involve features composed of bits already decoded., Although arithmetic entropy encoding is lossless, the quantization will bring in some loss in accuracy, the result of Q^−1(Q(f(x))) is not exactly the same as the output of feature extraction., wherein an encoder compresses (using CABAC) feature maps represented in a tensor of a CNN one block (a pixel and (features of) its neighbors) at a time to form a code sequence that is received by a decoder which performs decompression and feature map extraction/reconstruction using the inverse operation (i.e., using a decompression mode based on the compression mode) such that this decompression also occurs one block at a time (i.e., a pixel and features of its neighbors), wherein both the compression and decompression are performed independently over different blocks because they are performed distinctly and separately (i.e., each set of pixels is distinct from block to block and evaluated independently based on the context of the reference pixel in that set of pixels), wherein (as with Choi) each decompressed block is used to regenerate the tensor f of the extracted feature values (with dimension CxHxW), and wherein the deformatter is the function that performs a deformatting to integrate each block (pixel and neighboring pixel features) into the tensor (i.e., the reformulation/reconstruction of the tensor is a deformatting/formatting operation).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Choi, Loganathan, and Aziz to incorporate the teachings of Luo to decode each encoded block of activation map values independently, using a decoding mode corresponding to the encoding mode, and deformat them into a tensor of activation maps. The modification would have been obvious because one of ordinary skill would have been motivated to achieve near state of the art image reconstruction from losslessly decoding block-by-block individual blocks using the inverse process used in the encoding process and reformat the decompressed values into the feature map tensor of values that are incorporated into a deep neural network to facilitate image reconstruction in a multi-computation-platform (e.g., server and client) environment (Luo, [p. 4, Section IIIB, pp. 6-7, Section IVC, Figure 6, Figure 7]). 
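
To make the Exponential-Golomb discussion above more concrete, the following Python sketch implements a kth-order Exp-Golomb binarization in the form quoted from Aziz: a unary prefix of l one-bits terminated by a zero bit, followed by a suffix, for a total length of 2l+k+1 bits. It is offered as a minimal sketch under stated assumptions and is not the claimed Sparse-Exponential-Golomb-RemoveMin mode, which this section does not define in algorithmic detail. The function names and the use of a character string as the bit stream are hypothetical choices made for readability.

```python
from typing import Tuple


def egk_encode(value: int, k: int = 0) -> str:
    """Return the kth-order Exp-Golomb code of a non-negative integer as a bit string."""
    if value < 0 or k < 0:
        raise ValueError("value and k must be non-negative")
    bits = []
    # Unary prefix: emit a 1 for every power-of-two range the value exceeds.
    while value >= (1 << k):
        bits.append("1")
        value -= 1 << k
        k += 1
    bits.append("0")  # terminating bit of the prefix
    if k:
        bits.append(format(value, "0{}b".format(k)))  # suffix in exactly k bits
    return "".join(bits)


def egk_decode(bits: str, k: int = 0, pos: int = 0) -> Tuple[int, int]:
    """Decode one code word starting at pos; return (value, next position)."""
    value = 0
    while bits[pos] == "1":
        value += 1 << k
        k += 1
        pos += 1
    pos += 1  # skip the terminating 0
    if k:
        value += int(bits[pos:pos + k], 2)
        pos += k
    return value, pos
```

In this form a zero value is written with k + 1 bits, which is consistent with the Loganathan passage quoted above. Because each code word is self-delimiting, a decoder can read code words one after another from a concatenated stream by passing the returned position back in as `pos`.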

In regards to claim 19, the rejection of claim 17 is incorporated and Choi further teaches further comprising: receiving at a formatter circuit at least one activation map configured as a tensor having a tensor size of HxWxC; formatting by the formatter circuit, the tensor of the received at least one activation map into at least one block of values; ([p. 2, Section II, Figure 2], Feature values are typically quantized using an n-bit uniform quantizer (Q-layer in Fig. 1) prior to lossless [3] or lossy [4] compression. Ve = round((V − min(V)) / (max(V) − min(V)) · (2^n − 1)) (1) where V ∈ R^{N×M×C} is the feature tensor with N rows, M columns, and C channels at the point of split, Ve is the quantized feature tensor, and min(V) and max(V) are the minimum and maximum value in V, respectively…. The quantized features Ve are rearranged in a tiled image, as shown in Fig. 2., wherein a tensor with dimension N rows (corresponding to height H) x M columns (corresponding to W width) x C channels is quantized and processed (received and acted upon by a formatter) for rearrangement (formatting) in a tiled image (Figure 2) such that the resulting tiles are blocks formed from the output of the “formatter”.) and compressing by an encoder circuit the at least one block of values independently from other blocks of the tensor of the at least one received activation map using the at least one lossless compression mode to reduce the size of the memory used by the neural network. ([pp. 3-4, Section III, Figure 6], Before coding the quantized feature data, the following parameters are encoded directly using fixed-length coding: dimensions of the feature tensor, min(V) and max(V) (32-bit each) and the eight most frequent feature values, mi for i = 0, 1, ...7. A vector of these values, p = (p0, p1, ..., p7), is referred to as the palette vector.  
Initially, the palette vector is sorted according to the frequency of these values in the first tile, so that p0 is the most frequent of the mi’s in the first tile, p1 is the next most frequent, etc. As we move to other tiles, the palette vector p = (p0, p1, ..., p7) is re-sorted according to the frequency of occurrence of mi’s up to the previously coded tile, so that p0 is the most frequent mi up to that point, and so on. At the tile boundary, once p is updated, one element of p is chosen to minimize the mean absolute difference (MAD) from the feature values in the to-be-coded tile. … The most frequently used mode among them is considered the mpm. If the current block’s mode is the same as mpm, bit 1 is coded by CABAC [15] to indicate it. Otherwise, bit 0 is coded, followed by two bits to indicate the mode. Prediction residuals for each 4 × 4 block are coded by CABAC. The first bit is the SKIP indicator. If the residual is all-zero, the SKIP indicator is set to 1 and the encoder moves to the next block. Otherwise, the SKIP indicator is set to 0 and residuals are coded using one of three scan orders: horizontal, vertical, and zig-zag. For the Ver (Hor) prediction mode, vertical (horizontal) scan order is used. Other modes use the zig-zag scan order. Locations of non-zero residuals are first indicated by binarizing the scanned block, with 1’s. …
Finally, the non-zero residual values are coded in a manner similar to HEVC [15]: values larger than 1 or 2 are flagged, the flags are CABAC-coded, and the non-flagged values are binarized using exponential Golomb-Rice coding, then coded by CABAC., wherein a current tile/block is separately/independently encoded using CABAC and zero-encoding (corresponding to the condition SKIP = 1) if there are no non-zero residuals, or through HEVC (High Efficiency Video Coding) with CABAC encoding if the residual values are larger than 1 or 2, or using HEVC with Golomb-Rice coding and CABAC if the non-zero residual values are not larger than 1 or 2, such that this compression reduces the size (number of bits) of the memory associated with the neural network feature data (i.e., memory used by a neural network), wherein it is noted that other encoding modes are also used in this process (e.g., the three scan orders used to code residuals), with the independence of the encoding of different blocks also indicated by the association of a most probable mode with that particular block, and wherein it is further noted that various parameters associated with the feature/activation map tensor are encoded using fixed-length coding to determine a palette vector that is used to perform block-specific encoding by associating an element of that vector with each tile/block; it is further noted that the reduction of the size of the memory is an intended result and is given little patentable weight.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Choi to incorporate the teachings of Loganathan, Aziz, and Luo for the same reasons as pointed out for claim 17.
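As an illustrative sketch only, the n-bit uniform quantization of Eq. (1) and the tiled rearrangement quoted from Choi above can be expressed as follows. The function names quantize and tile_channels, the nested-list tensor layout, the tiles_per_row parameter, and the round-half-up rule are assumptions made for the example and are not taken from Choi.

```python
import math


def quantize(values, n_bits):
    """Uniform n-bit quantization of a flat list of feature values, per Eq. (1).

    Assumption: round-half-up is used; the quoted passage does not say
    which rounding rule applies.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case (all values equal): Eq. (1) is undefined, so map to 0.
        return [0 for _ in values]
    levels = (1 << n_bits) - 1  # 2^n - 1
    return [int(math.floor((v - lo) / (hi - lo) * levels + 0.5)) for v in values]


def tile_channels(tensor, tiles_per_row):
    """Rearrange an N x M x C tensor (nested lists) into a 2-D tiled image.

    Each channel becomes one N x M tile; tiles are placed left to right,
    top to bottom. The tiles_per_row layout choice is hypothetical.
    """
    n_rows, n_cols, n_ch = len(tensor), len(tensor[0]), len(tensor[0][0])
    tile_rows = -(-n_ch // tiles_per_row)  # ceiling division
    image = [[0] * (tiles_per_row * n_cols) for _ in range(tile_rows * n_rows)]
    for c in range(n_ch):
        r0 = (c // tiles_per_row) * n_rows
        c0 = (c % tiles_per_row) * n_cols
        for i in range(n_rows):
            for j in range(n_cols):
                image[r0 + i][c0 + j] = tensor[i][j][c]
    return image
```

In this sketch each channel of the tensor maps to exactly one tile, which corresponds to the reading above in which the tiles are the blocks produced by the "formatter".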

In regards to claim 20, the rejection of claim 19 is incorporated and Choi further teaches wherein the at least one lossless compression mode selected to compress the at least one block of values is different from a lossless compression mode selected to compress another block of the tensor of the received at least one activation map, ([pp. 3-4, Section III], Prediction residuals for each 4 × 4 block are coded by CABAC. The first bit is the SKIP indicator. If the residual is all-zero, the SKIP indicator is set to 1 and the encoder moves to the next block. Otherwise, the SKIP indicator is set to 0 and residuals are coded using one of three scan orders: horizontal, vertical, and zig-zag. For the Ver (Hor) prediction mode, vertical (horizontal) scan order is used. Other modes use the zig-zag scan order. Locations of non-zero residuals are first indicated by binarizing the scanned block, with 1’s…. Finally, the non-zero residual values are coded in a manner similar to HEVC [15]: values larger than 1 or 2 are flagged, the flags are CABAC-coded, and the non-flagged values are binarized using exponential Golomb-Rice coding, then coded by CABAC., wherein distinct encoding/compression modes are invoked/selected in the form of exponential-Golomb encoding (interpreted to be a species of Golomb-Rice encoding) or zero-encoding (with CABAC) depending on whether there are any non-zero residual values (e.g., SKIP = 1 is encoded and the encoder moves to the next block if all residuals are zero).) and wherein compressing the at least one block of values further comprises compressing by the encoder circuit the at least one block independently from other blocks of the tensor of the received at least one activation map using a plurality of the lossless compression modes. ([pp. 3-4, Section III], Prediction residuals for each 4 × 4 block are coded by CABAC. The first bit is the SKIP indicator. If the residual is all-zero, the SKIP indicator is set to 1 and the encoder moves to the next block.
Otherwise, the SKIP indicator is set to 0 and residuals are coded using one of three scan orders: horizontal, vertical, and zig-zag. For the Ver (Hor) prediction mode, vertical (horizontal) scan order is used. Other modes use the zig-zag scan order. Locations of non-zero residuals are first indicated by binarizing the scanned block, with 1’s…. Finally, the non-zero residual values are coded in a manner similar to HEVC [15]: values larger than 1 or 2 are flagged, the flags are CABAC-coded, and the non-flagged values are binarized using exponential Golomb-Rice coding, then coded by CABAC., wherein zero-encoding and exponential-Golomb encoding are two different (nearly lossless overall but lossless after the quantization) coding/compression techniques/modes applied distinctly and independently to each tile/block as pointed out above.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Choi to incorporate the teachings of Loganathan, Aziz, and Luo for the same reasons as pointed out for claim 17.
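As an illustrative sketch only, the per-block choice between zero-encoding (SKIP = 1) and exponential-Golomb binarization discussed above can be expressed as follows. The sketch assumes order-0 exponential-Golomb codes and a signed-to-unsigned mapping of residuals, and it omits the prediction, scan-order, flagging, and CABAC stages that Choi describes; the function names are hypothetical and are not taken from Choi.

```python
def exp_golomb(value):
    """Order-0 exponential-Golomb code word for a non-negative integer."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]  # binary representation of value + 1
    return "0" * (len(binary) - 1) + binary  # unary length prefix + binary


def signed_to_unsigned(residual):
    """Map a signed residual to a non-negative index (assumed mapping):
    0 -> 0, 1 -> 1, -1 -> 2, 2 -> 3, -2 -> 4, ...
    """
    return 2 * residual - 1 if residual > 0 else -2 * residual


def encode_block(residuals):
    """Encode one block of residuals independently of every other block.

    An all-zero block is represented by the single SKIP bit "1".
    Otherwise the SKIP bit is "0", followed by one exponential-Golomb
    code word per residual.
    """
    if all(r == 0 for r in residuals):
        return "1"
    return "0" + "".join(exp_golomb(signed_to_unsigned(r)) for r in residuals)
```

Because encode_block reads only the residuals of the block passed to it, two blocks of the same tile can end up in different modes: an all-zero 4 × 4 block yields the single bit "1", while a block containing any non-zero residual is binarized value by value.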

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Sze et al. (“High Throughput CABAC Entropy Coding in HEVC”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 22, No. 12, December 2012, pp. 1778-1791) teach CABAC entropy encoding including Exp-Golomb-based binarization encoding modes.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ROBERT LEWIS KULP whose telephone number is (571)272-7983. The examiner can normally be reached M, Th, F 8-5:30; Tu 8-3.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached on 571-270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ROBERT LEWIS KULP/Examiner, Art Unit 2124
/LUIS A SITIRICHE/Primary Examiner, Art Unit 2126