DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to the application filed 7/18/2019.
Claims 1-10 are pending and have been examined.

Priority
The Examiner has noted applicant’s claim for foreign priority based on Taiwanese application serial no. 108118062, filed on May 24, 2019. 
The examiner acknowledges that a certified copy of Taiwanese application serial no. 108118062 was received on 9/24/2019, as required by 37 CFR 1.55.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 7/18/2019 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement has been placed in the application file and the information referred to therein has been considered by the examiner.

Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference characters not mentioned in the description: 
Reference character d shown in Figures 3 and 4 is not found in the detailed description.
Reference character cp1 shown in Figure 4 is not found in the detailed description.
Reference characters cp4 shown in Figure 5B are not found in the detailed description.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification
The disclosure is objected to because of the following informalities:
Reference character d shown in Figures 3 and 4 is not described in applicant’s specification with reference to those drawings (see, e.g., paragraphs 22-23 describing FIGs. 3 and 4). Appropriate correction is required.
Reference character cp1 shown in Figure 4 is not described in applicant’s specification with reference to that drawing (see, e.g., paragraph 23 describing FIG. 4). Appropriate correction is required.
Reference characters cp4 shown in Figure 5B are not described in applicant’s specification with reference to that drawing (see, e.g., paragraph 28 describing FIG. 5B). Appropriate correction is required.
The specification is also objected to as failing to provide proper antecedent basis for the claimed subject matter. See 37 CFR 1.75(d)(1) and MPEP § 608.01(o). Correction of the following is required: claim 10 does not appear to have support in the originally filed specification. There does not appear to be any discussion of “selecting one of the compression modes according to the identification code in the final commutative result to decompress the final commutative result through the selected compression mode”. In particular, the specification is silent regarding any “commutative result” or “final commutative result”, let alone the above-noted limitation of claim 10. Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
 (b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


Claims 1-10 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
In independent claims 1 and 6, the respective recitations of “grouping every at least two of the plurality of neural network parameters into one of a plurality of encoding combinations” and “group every at least two of the plurality of neural network parameters into one of a plurality of coding combinations” (see lines 4-5 of claim 1 and lines 6-7 of claim 6) are unclear. These recitations are grammatically incorrect and appear to be missing one or more words and/or to have an extraneous word (e.g., “every”). In particular, it is unclear whether the recitations require grouping “every” [i.e., every one, or all] or only “at least two” “of the plurality of neural network parameters into one of a plurality of coding combinations”. For examination purposes, the recitations of grouping and group “every at least two of the plurality of neural network parameters into one of a plurality of encoding combinations” have been interpreted as grouping “at least two of the plurality of neural network parameters into one of a plurality of encoding combinations”. Appropriate correction is required.
Claim 10 recites “the final commutative result to decompress the final commutative result” in lines 3-4. There is insufficient antecedent basis for this limitation in this claim. Applicant did not previously introduce any “final commutative result” or any other “commutative result” in this claim, its base claim, claim 6, or in intervening claims 7 and 9. The examiner further notes that applicant’s specification is silent regarding any “commutative result”. Applicant previously introduced “the final compression result” in claim 9. For examination purposes, recitations of “the final commutative result” in claim 10 have been interpreted as the previously-introduced “the final compression result”. Appropriate correction is required.
Also, claims 2-5 and 7-10, which each depend directly or indirectly from claims 1 and 6, respectively, are rejected under 35 U.S.C. 112(b) as being indefinite under the same rationale as claims 1 and 6.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-3 and 6-8 are rejected under 35 U.S.C. 103 as being unpatentable over Ahmad (U.S. Patent Application Pub. No. 2019/0044535 A1, hereinafter “Ahmad”) in view of non-patent literature Chen, et al. ("Compressing neural networks with the hashing trick." International conference on machine learning. PMLR, 2015, hereinafter “Chen”).
With respect to claim 1, Ahmad discloses the invention as claimed including a compression method for neural network parameters (see, e.g., Abstract, paragraphs 22, 33 and 48 and claim 1, “A method for implementing compressed parameters”, “the present disclosure may apply to any suitable learned parameter system (e.g., Convolution Neural Networks”, “Deep Neural Networks may have runs of parameter values” [i.e., neural network parameters], “methods relate to embodiments for improving implementation efficiency of learned parameter systems 100 implemented via integrated circuits 202 by efficiently compressing system parameters 308”, “A method for implementing compressed parameters of a learned parameter system” [i.e., a method for compression of neural network parameters]), comprising:
obtaining a plurality of neural network parameters, wherein the plurality of neural network parameters are used for a neural network algorithm (see, e.g., paragraphs 21, 29 and 31 and claim 13, “the present disclosure may apply to any suitable learned parameter system (e.g., Convolution Neural Networks, Neuromorphic systems, Spiking Networks, Deep Learning Systems, and the like).”, “network file server 208A may receive parameters 308”, “The parameters 308 may be transferred to a portion 312 of the FPGA 202A programmed to implement the Deep Neural Network”, “obtain the parameters of the learned parameter system” [i.e., receive/obtain neural network parameters that are used to implement a neural network algorithm/program]);
grouping every at least two of the plurality of neural network parameters into one of a plurality of encoding combinations (as indicated above, “grouping every at least two of the plurality of neural network parameters into one of a plurality of encoding combinations” has been interpreted as grouping “at least two of the plurality of neural network parameters into one of a plurality of encoding combinations”) (see, e.g., paragraphs 19, 31, 40 and 43, “learned parameter system 100 may apply the parameters to inputs … Different sets of parameters may be employed”, “the set of parameters 308 (e.g., parameters associated with identifying a car).” [i.e., grouping at least two sets of learned/neural network parameters into combinations], “determining runs within the parameter sequence that have run lengths greater than a run length threshold (process block 452). The host CPU 204A or a tool chain 310 may then encode and compress the system parameters 308, for example, using run-length encoding”, “a set of parameters 308. The input may include one or more runs of similar parameter values. Standard run-length encoding may designate a run value and run length for each run” [i.e., group at least two sets of the neural network parameters 308 into an encoding combination/run of sets of encoding combinations/runs]) … ; and
compressing the plurality of encoding combinations with a same compression target bit number (paragraphs 25 and 27-28 of the specification state “The compression target bit number is the total bit number of the final compression result of each encoding combination.” and “The compression target bit number of the embodiment is 25 bits, that is, the total bit number of a compression result cp3 after encoding compression of each of the compression combinations G1 and G2 is 25 bits. The computing system 100 may also provide other compression target bit numbers / compression ratios”. Therefore, a “compression target bit number”, under the broadest reasonable interpretation (BRI), in light of the specification, is any desired/target/goal total bit precision/number or compression factor/ratio) (see, e.g., paragraph 45, “a sequence of parameters that is compressed using IEEE 754 run-length encoding … sequence of parameters 308 may be stored in floating point format under IEEE 754, which divides the floating-point value into a sign field, exponent field, and mantissa field. In particular, the sign field may be a high bit value or a low bit value … The mantissa field may store to the precision bits of the floating-point number. Although shown at a 16-bit number in this example, the floating-point bit precision may be 8, 32, or 64 bits.” [i.e., compressing the encoding combinations/sequences of parameters with a same compression target bit number – 8, 16, 32 or 64]).
Although Ahmad substantially discloses the claimed invention, Ahmad is not relied on to explicitly disclose wherein a number of the plurality of neural network parameters is the same in each of the plurality of encoding combinations and 
wherein each of the plurality of encoding combinations is compressed independently and the compression target bit number is not larger than a bit number of each of the plurality of encoding combinations.
In the same field, analogous art Chen teaches wherein a number of the plurality of neural network parameters is the same in each of the plurality of encoding combinations (see, e.g., FIG. 1 – illustrating “a neural network with random weight sharing under compression factor 1/4. The 16+9 = 24 virtual weights are compressed into 6 real weights” in matrices w1 and w2 [i.e., a plurality of compression/encoding combinations of neural network weights/parameters in matrices w1 and w2, where the number of weights/parameters, 3 in FIG. 1, is the same in each of the combinations/matrices], Abstract and pages 2 and 3, sections 1 and 4.1, “group connection weights into hash buckets, and all connections within the same hash bucket share a single parameter value.”, “group network connections into hash buckets uniformly at random such that all connections grouped to the ith hash bucket share the same weight value wi. … tune the hash bucket parameters and take into account the random weight sharing within the neural network architecture.”, “We only allow exactly Kl different weights to occur within Vl, which we store in a weight vector wl ϵ RKl. … Connections are randomly grouped into three categories per layer and their weights are shown in the virtual weight matrices V1 and V2. Connections belonging to the same color share the same weight value, which are stored in w1 and w2, respectively … i.e. the 24 weights stored in the virtual matrices V1 and V2 are reduced to only six real values in w1 and w2.” [i.e., a same/uniform number, Kl, of connection weights/neural network parameters are grouped into buckets/encoding combinations/matrices wl – see matrices w1 and w2 in FIG. 1 with 3 weights/parameters each]) and
wherein each of the plurality of encoding combinations is compressed independently and the compression target bit number is not larger than a bit number of each of the plurality of encoding combinations (as indicated above, the “compression target bit number”, under the BRI, in light of the specification, is any desired/target/goal total bit precision/number or compression factor/ratio) (see, e.g., FIG. 1 – illustrating “a neural network with random weight sharing under compression factor 1/4. The 16+9 = 24 virtual weights are compressed into 6 real weights” in matrices w1 and w2 [i.e., each of the compression/encoding combinations/matrices w1 and w2 is compressed independently, the compression target bit number/factor/ratio is 1/4] and pages 3-4 and 6-7, sections 4.1, 4.3 and 6, “Overall, the entire network is compressed by a factor ¼, i.e. the 24 weights stored in the virtual matrices V1 and V2 are reduced to only six real values in w1 and w2.”, “the technique of HashedNets could naturally be extended to other kinds of neural networks, such as recurrent neural networks … It can also be used in conjunction with other approaches for neural network compression. All weights can be stored with low bit precision”, “for a network with a single hidden layer of 1000 units and a storage compression factor of 1/10, we adopt a size-equivalent baseline with a single hidden layer of 100 units. … The distilled model structure is chosen to be same as the ‘equivalent-sized’ network (NN) at the corresponding compression rate. … We use 32 bit precision throughout but note that the compression rates of all methods may be improved with lower precision” [i.e., the compression target bit number/ratio/factor is not larger than a bit number/precision of the encoding combinations w1 and w2]).
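For illustration only, the weight sharing described in the cited passages of Chen can be sketched as follows. This is a minimal sketch under stated assumptions, not Chen's implementation: the helper names (bucket_index, virtual_weights) are hypothetical, zlib.crc32 stands in for whatever hash function Chen uses, and the layer shapes (4x4 and 2x4) and weight values are assumptions chosen only so that the totals match the quoted 24 virtual weights and 6 real weights.

```python
import zlib


def bucket_index(layer, i, j, num_buckets):
    """Map connection (i, j) of a layer to one of num_buckets shared weights.

    crc32 is an assumed stand-in hash; any deterministic hash would do here.
    """
    key = f"{layer}:{i}:{j}".encode()
    return zlib.crc32(key) % num_buckets


def virtual_weights(real_weights, layer, rows, cols):
    """Expand a small real weight vector into a full virtual weight matrix."""
    k = len(real_weights)
    return [
        [real_weights[bucket_index(layer, i, j, k)] for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical real weight vectors, three shared values per layer.
w1 = [0.5, -1.2, 2.0]
w2 = [1.1, 0.3, -0.7]

# Assumed layer shapes: 4x4 and 2x4, i.e. 24 virtual weights in total.
V1 = virtual_weights(w1, layer=1, rows=4, cols=4)
V2 = virtual_weights(w2, layer=2, rows=2, cols=4)

n_virtual = sum(len(row) for row in V1) + sum(len(row) for row in V2)
n_real = len(w1) + len(w2)
compression_factor = n_real / n_virtual  # 6 / 24
```

Only w1 and w2 are stored in this sketch; V1 and V2 are recomputed from the hash on demand, which is why the storage ratio equals the ratio of real to virtual weights.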
Ahmad and Chen are analogous art because they are both directed to compressing neural network parameters by “implementing compressed parameters” for neural networks “by efficiently compressing the parameters” (see, e.g., Ahmad, Abstract and paragraph 17) and “Compressing Neural Networks with the Hashing Trick” by “parameter hashing” to achieve “model compression for neural networks” (see, e.g., Chen, page 1, title, page 2, section 1 and page 8, section 7).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Chen’s “novel network architecture, HashedNets” in order “to randomly group connection weights into hash buckets, and all connections within the same hash bucket share a single parameter value” where neural network “parameters are tuned to adjust to the HashedNets weight sharing architecture”, as taught by Chen, into the disclosed invention of Ahmad (see, e.g., Chen, Abstract).
One of ordinary skill in the art would have been motivated to make this modification to Ahmad to use Chen’s architecture “that exploits inherent redundancy in neural networks to achieve drastic reductions in model sizes” where Chen’s “hashing procedure introduces no additional memory overhead, and … HashedNets shrink[s] the storage requirements of neural networks substantially while mostly preserving generalization performance”, as suggested by Chen (see, e.g., Chen, Abstract).
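For illustration only, the limitations of claim 1 as interpreted above (equal-sized encoding combinations, each compressed independently to the same compression target bit number, which is not larger than the bit number of the combination) can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce applicant's embodiment or either reference: the function names, the 8-bit unsigned parameters, the group size of 2, the 10-bit target, and the use of simple bit truncation as the compression are all hypothetical.

```python
def group_parameters(params, group_size):
    """Split the parameters into encoding combinations of equal size."""
    if group_size < 2:
        raise ValueError("each combination needs at least two parameters")
    if len(params) % group_size != 0:
        raise ValueError("parameter count must be a multiple of group_size")
    return [params[i:i + group_size] for i in range(0, len(params), group_size)]


def compress_group(group, param_bits, target_bits):
    """Compress one combination, independently of the others, to target_bits.

    Assumed scheme: keep the same number of high-order bits of each
    parameter and pack them into a single integer.
    """
    group_bits = param_bits * len(group)
    if target_bits > group_bits or target_bits % len(group) != 0:
        raise ValueError("target must divide evenly and not exceed the group's bits")
    keep = target_bits // len(group)
    packed = 0
    for p in group:
        packed = (packed << keep) | (p >> (param_bits - keep))
    return packed


params = [200, 13, 77, 255, 0, 128]      # hypothetical 8-bit parameters
groups = group_parameters(params, 2)     # 16 bits per combination
results = [compress_group(g, param_bits=8, target_bits=10) for g in groups]
```

Every entry of results fits in the same 10 bits, and each is computed from its own combination only.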

With respect to independent claim 6, Ahmad discloses the invention as claimed including a computing system for neural network parameters (see, e.g., Abstract and paragraphs 22, 33 and 48, “Systems and methods of the present disclosure may improve operation efficiency of learned parameter systems … implementing compressed parameters, via a processor coupled to the integrated circuit” [i.e., a computing system], “the present disclosure may apply to any suitable learned parameter system (e.g., Convolution Neural Networks”, “Deep Neural Networks may have runs of parameter values” [i.e., neural network parameters], “embodiments for improving implementation efficiency of learned parameter systems 100 implemented via integrated circuits 202 by efficiently compressing system parameters 308”, “systems and methods relate to embodiments for improving implementation efficiency of learned parameter systems” [i.e., a computing system for neural network parameters]), comprising:
a memory; and
a processor coupled to the memory and configured to execute (see, e.g., paragraph 25, “host processor(s) 204 may communicate with the memory and/or storage circuitry 206, which may be a tangible, non-transitory, machine-readable-medium, such as random-access memory (RAM), read-only memory (ROM), …The memory and/or storage circuitry 206 may hold data to be processed by the data processing system 200, such as processor-executable control software, configuration software” [i.e., a memory 206 and processor 204 communicatively coupled to the memory and configured to execute software]):
obtaining a plurality of neural network parameters, wherein the plurality of neural network parameters are used for a neural network algorithm (see, e.g., paragraphs 21, 29 and 31 and claim 13, “the present disclosure may apply to any suitable learned parameter system (e.g., Convolution Neural Networks, Neuromorphic systems, Spiking Networks, Deep Learning Systems, and the like).”, “network file server 208A may receive parameters 308”, “The parameters 308 may be transferred to a portion 312 of the FPGA 202A programmed to implement the Deep Neural Network”, “obtain the parameters of the learned parameter system” [i.e., receive/obtain neural network parameters that are used to implement a neural network algorithm/program]);
group every at least two of the plurality of neural network parameters into one of a plurality of coding combinations, wherein a number of the plurality of neural network parameters is the same in each of the plurality of encoding combinations (as indicated above, “group every at least two of the plurality of neural network parameters into one of a plurality of encoding combinations” has been interpreted as grouping “at least two of the plurality of neural network parameters into one of a plurality of encoding combinations”) (see, e.g., paragraphs 19, 31, 40 and 43, “learned parameter system 100 may apply the parameters to inputs … Different sets of parameters may be employed”, “the set of parameters 308 (e.g., parameters associated with identifying a car).” [i.e., group at least two sets of learned/neural network parameters into combinations], “determining runs within the parameter sequence that have run lengths greater than a run length threshold (process block 452). The host CPU 204A or a tool chain 310 may then encode and compress the system parameters 308, for example, using run-length encoding”, “a set of parameters 308. The input may include one or more runs of similar parameter values. Standard run-length encoding may designate a run value and run length for each run” [i.e., group at least two sets of the neural network parameters 308 into an encoding combination/run of sets of encoding combinations/runs]);
compressing the plurality of encoding combinations with a same compression target bit number (as indicated above, a “compression target bit number”, under the BRI, in light of the specification, is any desired/target/goal total bit precision/number or compression factor/ratio) (see, e.g., paragraph 45, “a sequence of parameters that is compressed using IEEE 754 run-length encoding … sequence of parameters 308 may be stored in floating point format under IEEE 754, which divides the floating-point value into a sign field, exponent field, and mantissa field. In particular, the sign field may be a high bit value or a low bit value … The mantissa field may store to the precision bits of the floating-point number. Although shown at a 16-bit number in this example, the floating-point bit precision may be 8, 32, or 64 bits.” [i.e., compressing the encoding combinations/sequences of parameters with a same compression target bit number – 8, 16, 32 or 64]) … ; and
storing a plurality of compression results of the plurality of encoding combinations in the memory (see, e.g., Abstract and paragraph 38, “the method may include storing the parameters of the run in a compressed form into memory associated with the integrated circuit such that the integrated circuit may retrieve the parameters of the run in the compressed form”, “compressor 408 may use IEEE 754 run-length encoding to encode the results prior to transmitting and storing the results in memory”).
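For illustration only, the run-length encoding with a run length threshold referenced in the cited passages of Ahmad can be sketched as follows. This is a minimal sketch of standard run-length encoding under stated assumptions, not Ahmad's IEEE 754 variant: the function names, the tuple-based output format, the threshold value of 2, and the sample parameters are hypothetical.

```python
def run_length_encode(params, run_length_threshold=2):
    """Encode runs longer than the threshold as ("run", value, length).

    Shorter stretches are kept as individual ("raw", value) entries.
    """
    encoded = []
    i = 0
    while i < len(params):
        j = i
        while j < len(params) and params[j] == params[i]:
            j += 1
        run_len = j - i
        if run_len > run_length_threshold:
            encoded.append(("run", params[i], run_len))
        else:
            encoded.extend(("raw", params[i]) for _ in range(run_len))
        i = j
    return encoded


def run_length_decode(encoded):
    """Invert run_length_encode."""
    out = []
    for entry in encoded:
        if entry[0] == "run":
            out.extend([entry[1]] * entry[2])
        else:
            out.append(entry[1])
    return out


params = [0.0, 0.0, 0.0, 0.0, 1.5, 1.5, 0.25, 0.0, 0.0, 0.0]
encoded = run_length_encode(params)
decoded = run_length_decode(encoded)
```

In this sketch the two runs of zeros collapse to one entry each, while the pair of 1.5 values stays raw because its length does not exceed the threshold.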

Although Ahmad substantially discloses the claimed invention, Ahmad is not relied on to explicitly disclose wherein each of the plurality of encoding combinations is compressed independently and the compression target bit number is not larger than a bit number of each of the plurality of encoding combinations.
In the same field, analogous art Chen teaches wherein each of the plurality of encoding combinations is compressed independently and the compression target bit number is not larger than a bit number of each of the plurality of encoding combinations (as indicated above, the “compression target bit number”, under the BRI, in light of the specification, is any desired/target/goal total bit precision/number or compression factor/ratio) (see, e.g., FIG. 1 – illustrating “a neural network with random weight sharing under compression factor 1/4. The 16+9 = 24 virtual weights are compressed into 6 real weights” in matrices w1 and w2 [i.e., each of the compression/encoding combinations/matrices w1 and w2 is compressed independently, the compression target bit number/factor/ratio is 1/4] and pages 3-4 and 6-7, sections 4.1, 4.3 and 6, “Overall, the entire network is compressed by a factor ¼, i.e. the 24 weights stored in the virtual matrices V1 and V2 are reduced to only six real values in w1 and w2.”, “the technique of HashedNets could naturally be extended to other kinds of neural networks, such as recurrent neural networks … It can also be used in conjunction with other approaches for neural network compression. All weights can be stored with low bit precision”, “for a network with a single hidden layer of 1000 units and a storage compression factor of 1/10, we adopt a size-equivalent baseline with a single hidden layer of 100 units. … The distilled model structure is chosen to be same as the ‘equivalent-sized’ network (NN) at the corresponding compression rate. … We use 32 bit precision throughout but note that the compression rates of all methods may be improved with lower precision” [i.e., the compression target bit number/ratio/factor is not larger than a bit number/precision of the encoding combinations w1 and w2]).
Ahmad and Chen are analogous art because they are both directed to compressing neural network parameters by “implementing compressed parameters” for neural networks “by efficiently compressing the parameters” (see, e.g., Ahmad, Abstract and paragraph 17) and “Compressing Neural Networks with the Hashing Trick” by “parameter hashing” to achieve “model compression for neural networks” (see, e.g., Chen, page 1, title, page 2, section 1 and page 8, section 7).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Chen’s “novel network architecture, HashedNets” in order “to randomly group connection weights into hash buckets, and all connections within the same hash bucket share a single parameter value” where neural network “parameters are tuned to adjust to the HashedNets weight sharing architecture”, as taught by Chen, into the disclosed invention of Ahmad (see, e.g., Chen, Abstract).
One of ordinary skill in the art would have been motivated to make this modification to Ahmad to use Chen’s architecture “that exploits inherent redundancy in neural networks to achieve drastic reductions in model sizes” where Chen’s “hashing procedure introduces no additional memory overhead, and … HashedNets shrink[s] the storage requirements of neural networks substantially while mostly preserving generalization performance”, as suggested by Chen (see, e.g., Chen, Abstract).

Regarding claims 2 and 7, as discussed above, Ahmad in view of Chen teaches the method of claim 1 and the system of claim 6.
Although Ahmad substantially discloses the claimed invention, Ahmad is not relied on to explicitly disclose compressing each of the plurality of encoding combinations through a plurality of different compression modes; and
selecting one of a plurality of compression results of the plurality of compression modes as a final compression result of the encoding combination.
In the same field, analogous art Chen teaches compressing each of the plurality of encoding combinations through a plurality of different compression modes (see, e.g., Tables 1 and 2 – showing “error rates (in %) with a compression factor of 1/8 across all data sets.” and “error rates (in %) with a compression factor of 1/64 across all data sets.” and pages 7-8, section 6, “Results with varying compression. Figures 2 and 3 show the performance of all methods on MNIST and the ROT variant with different compression factors on 3-layer (1 hidden layer) and 5-layer (3 hidden layers) neural networks”, “HashNet and HashNetDK maintain performance for small compression factors.”, “different compression methods perform differently.” [i.e., compressing each of the data sets/encoding combinations through a plurality of different compression modes/methods/factors]); and
selecting one of a plurality of compression results of the plurality of compression modes as a final compression result of the encoding combination (see, e.g., Tables 1 and 2 comparing “error rates (in %) with a compression factor of 1/8 across all data sets.” and “error rates (in %) with a compression factor of 1/64 across all data sets. Best results are printed in blue.” and pages 2 and 7-8, sections 1 and 6, “we also show that for a finite set of parameters it is beneficial to ‘inflate’ the network architecture by reusing each parameter value multiple times. Best results are achieved when networks are inflated by a factor 8-16x.”, “The size-matched NN is consistently the best performing baseline, however its test error is significantly higher than that of HashNet especially at small compression rates. … Of all methods, only HashNet and HashNetDK maintain performance for small compression factors.” [i.e., selecting best results/performance of the compression modes/methods as a final compression result to be subsequently inflated/decompressed]).
Ahmad and Chen are analogous art because they are both directed to compressing neural network parameters by “implementing compressed parameters” for neural networks “by efficiently compressing the parameters” (see, e.g., Ahmad, Abstract and paragraph 17) and “Compressing Neural Networks with the Hashing Trick” by “parameter hashing” to achieve “model compression for neural networks” (see, e.g., Chen, page 1, title, page 2, section 1 and page 8, section 7).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Chen’s “novel network architecture, HashedNets” in order “to randomly group connection weights into hash buckets, and all connections within the same hash bucket share a single parameter value” where neural network “parameters are tuned to adjust to the HashedNets weight sharing architecture”, as taught by Chen, into the disclosed invention of Ahmad (see, e.g., Chen, Abstract).
One of ordinary skill in the art would have been motivated to make this modification to Ahmad to use Chen’s architecture “that exploits inherent redundancy in neural networks to achieve drastic reductions in model sizes” where Chen’s “hashing procedure introduces no additional memory overhead, and … HashedNets shrink[s] the storage requirements of neural networks substantially while mostly preserving generalization performance”, as suggested by Chen (see, e.g., Chen, Abstract).

Regarding claims 3 and 8, as discussed above, Ahmad in view of Chen teaches the method of claim 2 and the system of claim 7.
Although Ahmad substantially discloses the claimed invention, Ahmad is not relied on to explicitly disclose selecting according to compression distortions of the plurality of compression modes.
In the same field, analogous art Chen teaches selecting according to compression distortions of the plurality of compression modes (paragraph 44 of the specification states “The compression distortion of 249 of the quantization compression method is less than the compression distortion of 3265 of the average compression method (the example uses the sum of compression errors as the compression distortion)”. Therefore, “compression distortions”, under the BRI, in light of the specification, are any distortions or errors/error rates associated with compression methods/modes) (see, e.g., FIGs. 1-2 and Tables 1-2 – showing different “error rates under varying compression factors” and “error rates (in %) with a compression factor of 1/8” and “1/64 across all data sets. Best results are printed in blue.” and pages 7-8, section 6, “The size-matched NN is consistently the best performing baseline, however its test error is significantly higher than that of HashNet especially at small compression rates. … Figure 4 shows the test error rate under various expansion rates of a network … methods improve over the fixed-sized neural network. There is a general trend that more expansion decreases the test error until a ‘sweet-spot’ … The test error of the HashNet neural network decreases substantially” [i.e., selecting a sweet-spot according to error rates/distortion of the compression modes/methods]).
The motivation to combine Ahmad and Chen is the same as discussed above with respect to claims 2 and 7.
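For illustration only, the interpretation applied above (selecting among compression modes according to their compression distortions) can be sketched as follows. This is a minimal sketch and is not taken from Ahmad, Chen, or the specification: the two modes, the function names, the step size, and the sample values are all hypothetical, and the sum of absolute errors is assumed as the distortion measure, in the spirit of the "sum of compression errors" mentioned in paragraph 44 of the specification.

```python
from typing import Callable, Dict, List, Tuple


def quantize_mode(values: List[int], step: int = 16) -> List[int]:
    """Hypothetical quantization mode: snap each value to the nearest multiple of `step`."""
    return [((v + step // 2) // step) * step for v in values]


def average_mode(values: List[int]) -> List[int]:
    """Hypothetical averaging mode: replace every value with the integer mean."""
    mean = sum(values) // len(values)
    return [mean] * len(values)


def distortion(original: List[int], reconstructed: List[int]) -> int:
    """Assumed distortion measure: sum of absolute reconstruction errors."""
    return sum(abs(a - b) for a, b in zip(original, reconstructed))


# Hypothetical table of available compression modes.
MODES: Dict[str, Callable[[List[int]], List[int]]] = {
    "quantization": quantize_mode,
    "average": average_mode,
}


def select_mode(values: List[int]) -> Tuple[str, Dict[str, int]]:
    """Score every mode by its distortion and return the mode with the smallest one."""
    scores = {name: distortion(values, mode(values)) for name, mode in MODES.items()}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best, scores = select_mode([3, 40, 200, 97])
    print(best, scores)
```

In this sketch the mode with the lowest distortion is selected, which parallels the comparison in paragraph 44 of the specification, where the mode with the smaller compression distortion is preferred.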

Claims 4-5 and 9-10 are rejected under 35 U.S.C. 103 as being unpatentable over Ahmad in view of Chen as applied to claims 1-2 and 6-7 above, and further in view of Avery et al. (U.S. Patent Application Pub. No. 2001/0038347 A1, hereinafter “Avery”).
Regarding claims 4 and 9, as discussed above, Ahmad in view of Chen teaches the method of claim 2 and the system of claim 7.
Although Ahmad in view of Chen substantially teaches the claimed invention, Ahmad in view of Chen is not relied on to teach appending an identification code corresponding to a selected compression mode to the final compression result. 
In the same field, analogous art Avery teaches appending an identification code corresponding to a selected compression mode to the final compression result (paragraph 33 of the specification states “In order to identify the type of compression mode adopted by the final compression result, the processor 130 appends an identification code corresponding to the selected compression mode to the final compression result. The identification code may be one or more identification bits and is located in the first bit or another specific bit of the final compression result.” Therefore, “an identification code”, under the BRI, in light of the specification, is any identifier, such as one or more bits, located in/appended to a compression result) (see, e.g., Avery, paragraphs 8, 30-31 and 33, “This method reads the data, symbol by symbol, and appends bits to the output code”, “by using this transform, the 9-bit message transmits 58 bits of content value. The 58-bit data segment has been greatly compressed. Each transform and accompanying state information are next packaged 108 into a packet … that provides any other information necessary to allow later unpackaging and recovery of the content value of the packet … can include information that identifies the segment, identifies the transform, or any other necessary information.”, “packet coding is decoded 204 to determine the identity of the transform” [i.e., appending identity information that identifies a selected compression mode/transform to the compression/transform result]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Ahmad in view of Chen with Avery to provide a “system and method for lossless data compression. A mathematical transform equivalent to the content value of the data, and taking fewer bits to represent, is found” and a “compression process” where “by using this transform, the 9-bit message transmits 58 bits of content value. The 58-bit data segment has been greatly compressed. Each transform and accompanying state information are next packaged 108 into a packet … that provides any other information necessary to allow later unpackaging and recovery of the content value of the packet” and “can include information that identifies the segment, identifies the transform, or any other necessary information. The packet typically takes fewer bits to express than the original segment. Thus, the segment has been compressed.” (See, e.g., Avery, Abstract and paragraphs 24 and 30-31). Doing so would have allowed Ahmad in view of Chen to use Avery’s “compression process” to implement a transform where “nine bits are capable of representing an integer over 700,000 decimal digits long, equivalent to more than 2.3 million binary bits” and the compression mode/transform “can represent the content value of 150,094,635,296,999,121 … using only nine bits” and a corresponding decoding/recovery process can “use the state information to recover the original segment. Using the decoded information, the original segment's content value is recalculated 206. Finally, all recomputed segments are put together again in their original order to reconstitute 208 the original input binary data. Thus, the output 210 of the recovery process 200 is identical to the input 102 of the compression process”, as suggested by Avery (See, e.g., Avery, paragraphs 30-33). 

Regarding claims 5 and 10, as discussed above, Ahmad in view of Chen and Avery teaches the method of claim 4 and the system of claim 9.
Although Ahmad in view of Chen substantially teaches the claimed invention, Ahmad in view of Chen is not relied on to teach selecting one of the compression modes according to the identification code in the final compression result so as to decompress the final compression result through the selected compression mode.
In the same field, analogous art Avery teaches selecting one of the compression modes according to the identification code in the final compression result so as to decompress the final compression result through the selected compression mode (as indicated above, recitations of “the final commutative result” in claim 10 have been interpreted as the previously-introduced “the final compression result”) (see, e.g., Avery, paragraph 33 and claim 78, “the packet coding is decoded 204 to determine the identity of the transform and how to use the state information to recover the original segment [i.e., selecting the compression mode/transform according to the identity/identification code in the final compression result/packet]. Using the decoded information, the original segment's content value is recalculated 206. Finally, all recomputed segments are put together again in their original order to reconstitute 208 the original input binary data. Thus, the output 210 of the recovery process 200 is identical to the input 102 of the compression process”, “decompressing a message … receiving the code representing the transform … and calculating the numerical value of the transform and … the numerical value of the message.” [i.e., decompressing the compression result through the selected compression mode/transform]).
The motivation to combine Ahmad, Chen and Avery is the same as discussed above with respect to claims 4 and 9.
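For illustration only, the interpretation applied to claims 4-5 and 9-10 (an identifier attached to the compression result, later read back to choose the matching decompression) can be sketched as follows. This is a minimal sketch and does not reproduce Avery's transform or the applicant's implementation: the one-byte identifier, the two modes (raw storage and zlib), the function names, and the choice of the shorter payload are all assumptions made for the example.

```python
import zlib

# Hypothetical mode identifiers stored in the first byte of the result.
MODE_RAW = 0   # payload is the original bytes, stored as-is
MODE_ZLIB = 1  # payload is zlib-compressed


def compress_with_id(data: bytes) -> bytes:
    """Compress with each mode, keep the shorter payload, and prepend its mode identifier."""
    candidates = {
        MODE_RAW: data,
        MODE_ZLIB: zlib.compress(data),
    }
    # Pick the mode whose payload is smallest (ties go to the lower identifier).
    mode_id = min(candidates, key=lambda m: (len(candidates[m]), m))
    return bytes([mode_id]) + candidates[mode_id]


def decompress_by_id(packed: bytes) -> bytes:
    """Read the identifier from the first byte and dispatch to the matching decompressor."""
    mode_id, payload = packed[0], packed[1:]
    if mode_id == MODE_RAW:
        return payload
    if mode_id == MODE_ZLIB:
        return zlib.decompress(payload)
    raise ValueError(f"unknown compression mode identifier: {mode_id}")


if __name__ == "__main__":
    packed = compress_with_id(b"abc" * 100)
    print(packed[0], decompress_by_id(packed) == b"abc" * 100)
```

The decoder never needs to be told which mode was used; it recovers that information from the identifier carried in the result itself, which is the role the identification code plays under the interpretation set out above.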

Conclusion
The prior art made of record, listed on form PTO-892, and not relied upon, is considered pertinent to applicant's disclosure.
The examiner requests that, in response to this Office action, support be shown for language added to any original claims on amendment and for any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this Office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANDY K BALDWIN whose telephone number is (571)270-5222. The examiner can normally be reached on Mon - Fri 9:00-6:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/R.K.B./Examiner, Art Unit 2125 


/KAMRAN AFSHAR/Supervisory Patent Examiner, Art Unit 2125