Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This Office Action is in response to Application No. 15/877,723 filed on January 23, 2018, and specifically to the amendment and arguments filed by applicant on January 7, 2022.
Information Disclosure Statement
The information disclosure statement (IDS) was submitted on November 19.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Response to Arguments
Applicant’s arguments regarding the §101 rejection, see Pg. 8 of applicant's remarks, have been fully considered and are persuasive.  The §101 rejection for claims 25-30 has been withdrawn.
Applicant’s arguments regarding the §112(b) rejection, see Pg. 8 of applicant's remarks, have been fully considered and are persuasive.  The §112(b) rejection for claims 1-30 has been withdrawn.  
Applicant's arguments regarding the §103 rejection have been fully considered but are not persuasive. Applicant argues on pages 8-10 that the cited references fail, individually or in combination, to teach the emphasized claim elements of claim 1 and of similar independent claims 9, 17 and 24. The emphasized portion is reproduced below for applicant's convenience:
“A method of operating a computational network, comprising: determining a low-rank approximation for one or more layers of the computational network based at least in part on a set of targets respectively corresponding to a set of tensor approximation residuals; compressing at least one layer of the computational network based at least in part on the low-rank approximation; and operating the computational network using the at least one compressed layer.” (Emphasis added)
	Applicant argues that the emphasized portions are not taught by the cited references and further argues that the combination of any of the arts fails to teach it as well. Applicant further argues that Brothers’ neural network analysis is not based on any targets that correspond to a set of tensor approximation residuals.
	Respectfully, the examiner disagrees with the above arguments. In regards to the newly amended claim recitation, Brothers recites:
[ (¶0067) “The decomposition may be performed to exactly replicate the original convolution or to approximate the original convolution within a specified tolerance. In another example, the neural network analyzer can apply a low rank matrix approximation using singular value decomposition to reduce the number of operations required.” ]
This citation from Brothers teaches performing the low-rank approximation to within a specified tolerance, which is equivalent to an approximation residual target. The rest of the citation teaches the low-rank approximation using singular value decomposition (SVD) as taught in the claim. Applicant is reminded 
	Accordingly, the arguments regarding the §103 rejection for claims 1, 9, 17, 24 and their dependents have been fully considered but they are not persuasive. Please see the §103 section below for full claim mapping.
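For illustration only, the relationship between a "specified tolerance" and an approximation residual target discussed above can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of Brothers or of the applicant: the function name `low_rank_within_tolerance`, the use of a relative Frobenius-norm residual, and the choice of the smallest rank meeting the tolerance are all assumptions made for the example.

```python
import numpy as np

def low_rank_within_tolerance(weight, tolerance):
    """Truncate the SVD of `weight` at the smallest rank whose relative
    Frobenius-norm residual does not exceed `tolerance` (hypothetical helper)."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    energy = s ** 2
    total = energy.sum()
    if total == 0.0:
        # An all-zero matrix is already rank 0 with zero residual.
        return np.zeros_like(weight, dtype=float), 0, 0.0
    # tail[k] is the squared residual left after keeping the first k singular values.
    tail = np.concatenate([np.cumsum(energy[::-1])[::-1], [0.0]])
    residuals = np.sqrt(tail / total)
    # The first index whose residual meets the target is the smallest valid rank.
    rank = int(np.argmax(residuals <= tolerance))
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    return approx, rank, float(residuals[rank])
```

In this sketch the tolerance acts as the residual target: a looser tolerance permits a lower rank and therefore fewer operations, while a tolerance of zero reproduces the original matrix up to floating-point error.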


Claim Interpretation
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked. As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:

(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
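Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.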
Claims 17-24 invoke 35 U.S.C. 112(f) through use of the term “means for”.
Regarding claim 17, the claim recites the following limitation: means for determining a low-rank approximation for one or more layers of the computational network based at least in part on a set of targets respectively corresponding to a set of tensor approximation residuals. The corresponding structure is depicted in Fig. 1 and in Fig. 4 of the specification, specifically elements 400, 410, 412, and 414.
Regarding claim 20, the claim recites the limitations from claim 17 and additionally adds the following: means for updating a bias associated with the at least one layer. This structure is depicted in Fig. 4, specifically element 406 and Fig. 1. 
Regarding claim 21, the claim recites the limitations from claim 20 and additionally adds the following: means for applying a vector m to compensate for a mean shift in an output activation of the at least one compressed layer and an output activation of the at least one layer. This structure is depicted in Fig. 4 and Fig. 1 of the specification.
Regarding claim 22, the claim recites the limitations from claim 17 and additionally adds the following: means for determining the low-rank approximation without fine tuning. This structure is depicted in Fig. 4 and Fig. 1 of the specification.
Regarding claim 23, the claim recites the limitations from claim 17 and additionally adds the following: means for determining the low-rank approximation using singular value decomposition. This structure is depicted in Fig. 4 and Fig. 1 of the specification.
Regarding claim 24, the claim recites the limitations from claim 17 and additionally adds the following: means for determining a set of candidate rank vectors that satisfy each residual target of the set of targets respectively corresponding to a set of tensor approximation residuals; means for evaluating each candidate rank vector of the set of candidate rank vectors; and means for selecting a rank vector of the set of candidate rank vectors according to an objective function. This structure is depicted in Fig. 4 and Fig. 1 of the specification.
Claims not specifically mentioned are interpreted by virtue of their dependency.
Rejections

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-3, 6, 7, 9-11, 14, 15, 17-19, 22, 23, 25, 26, 28, and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Brothers (US 20160358070 A1) in view of Lin (US 20160328644 A1).
Regarding claim 1, Brothers teaches determining a low-rank approximation for one or more layers of the computational network 
[“In another example, the neural network analyzer can apply a low rank matrix approximation using singular value decomposition to reduce the number of operations required.” (¶0067) ]
	This citation shows that Brothers teaches low-rank matrix approximation. Applying it to reduce the number of operations required by the neural network is equivalent to applying it to one or more layers of the network.
Based at least in part on the set of targets respectively corresponding to a set of tensor approximation residuals;
[“The modified neural network may be validated to determine whether the modified neural network meets established performance requirements.” (¶0023) ]
[“The decomposition may be performed to exactly replicate the original convolution or to approximate the original convolution within a specified tolerance. In another example, the neural network analyzer can apply a low rank matrix approximation using singular value decomposition to reduce the number of operations required.” (¶0067) ]
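This citation and the one above it from Brothers show that the low-rank approximation is carried out to meet a specific performance requirement. Specifically, the portion which recites “specified tolerance” is equivalent to the approximation residuals. Further, the examiner notes that the matrix cited is equivalent to the tensor, as matrices and tensors can be converted from one format to the other.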
	Operating the computational network using the at least one compressed layer
[a neural network for an Advanced Driver Assistance System (ADAS) application to be run on a particular mobile hardware   (¶0022) ]
	This is an example of operating a network using the at least one compressed layer.
Brothers does not explicitly teach compressing at least one layer of the computational network based at least in part on the low-rank approximation.
However, Lin teaches this limitation, as seen below:
Compressing at least one layer of the computational network based at least in part on the low-rank approximation
[The weight matrices of the compressed layers may be obtained through low-rank approximation methods (¶0071) ] 
	Therefore it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to compress a layer in a computational network using a low-rank approximation based on residual targets and to further operate that computational network with the at least one compressed layer. The combination would be obvious because one of ordinary skill in the art would know to apply a technique of compressing layers of the computational network using a low-rank approximation based at least in part on residual targets, and to further operate the computational network with the compressed layer. One of ordinary skill in the art would appreciate the increased performance and accuracy from using residual targets while training a computational network, and further the improved results that may come from utilizing the compressed network in a product.

Regarding claim 2, the method of claim 1 is taught by Brothers/Lin as in the rejection for claim 1 above. The rest of the claim is taught by Brothers, as seen below:
	Wherein the low-rank approximation is automatically determined
[Example embodiments include a framework that automatically tunes parameters of an input neural network and outputs a modified (e.g., optimized) neural network. (¶0023)]
	Based on a performance metric
[Methods and systems to modify and/or optimize neural networks are described herein that automatically tune neural network parameters to achieve required performance. The term “performance,” may be used herein in describing certain aspects of operation of a neural network such as accuracy of the neural network, runtime of the neural network, computational efficiency of the neural network, throughput, and/or power consumption of the neural network (¶0022)].
	Therefore it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to compress a layer in a computational network using an automatic low-rank approximation based on performance metrics and to further operate that computational network with the at least one compressed layer as taught by Brothers/Lin. The combination would be obvious because one of ordinary skill in the art would appreciate that automating the approximation based on a performance metric would reduce the amount of manual (user) work needed to train each computational network.

Regarding claim 3, the method of claim 2 is taught by Brothers/Lin as in the rejection for claim 2 above. The rest of the claim is taught by Brothers, as seen below:
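	Wherein the performance metric includes at least one of an accuracy metric, a completion time metric and a complexity metric.
[Methods and systems to modify and/or optimize neural networks are described herein that automatically tune neural network parameters to achieve required performance. The term “performance,” may be used herein in describing certain aspects of operation of a neural network such as accuracy of the neural network, runtime of the neural network, computational efficiency of the neural network, throughput, and/or power consumption of the neural network (¶0022)]
	To explain further, this citation from Brothers shows direct performance metrics that relate to accuracy, runtime (completion time metric) and computational efficiency (complexity metric).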
	Therefore it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to compress a layer in a computational network using an automatic low-rank approximation based on residual targets and to further include performance metrics. The combination would be obvious because one of ordinary skill in the art would know that gauging the success of the compression would require metrics and goals to test against.
Regarding claim 6, Brothers/Lin teaches the method of claim 1, as in the rejection for claim 1 above.
wherein the low-rank approximation is determined without fine tuning
Lin teaches the rest of this limitation as seen below:
[The weight matrices of the compressed layers may be obtained through low-rank approximation methods or by an alternating minimization algorithm. (¶0069)]
	To explain further, Lin explicitly teaches that compressed layers may be obtained through low-rank approximation and also teaches that fine tuning is an optional step and not one that is mandatory. 
Therefore it would be obvious to one of ordinary skill in the art, before the effective filing date of the invention, to apply low-rank approximation without fine tuning to compress a layer of the computational network in favor of using low-rank approximation. The reason it would be 

Regarding claim 7, the method of claim 1 is taught by Brothers/Lin as in the rejection for claim 1 above. The rest of the claim is taught by Brothers, as seen below:
wherein the low-rank approximation is determined using singular value decomposition
[In another example, the neural network analyzer can apply a low rank matrix approximation using singular value decomposition to reduce the number of operations required. (¶0067)]
	Therefore it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to compress a layer in a computational network using a low-rank approximation based on residual targets and to further operate that computational network with the at least one compressed layer. The combination would be obvious because one of ordinary skill in the art, prior to the effective filing date, would recognize that singular value decomposition provides a computationally efficient manner in which to perform low-rank approximation.

Regarding claim 9, Brothers teaches an apparatus of operating a computational network comprising memory and at least one processor coupled to the memory;
[Another embodiment includes an apparatus for tuning a neural network. The apparatus includes a memory storing program code and a processor coupled to the memory. The processor is configured to initiate operations responsive to executing the program code.  (¶0006) ]
determining a low-rank approximation for one or more layers of the computational network 
[In another example, the neural network analyzer can apply a low rank matrix approximation using singular value decomposition to reduce the number of operations required. (¶0067) ]
Based at least in part on the set of targets respectively corresponding to a set of tensor approximation residuals;
[“The modified neural network may be validated to determine whether the modified neural network meets established performance requirements.” (¶0023) ]
[“The decomposition may be performed to exactly replicate the original convolution or to approximate the original convolution within a specified tolerance. In another example, the neural network analyzer can apply a low rank matrix approximation using singular value decomposition to reduce the number of operations required.” (¶0067) ]
This citation and the one above it from Brothers show that the low-rank approximation is carried out to meet a specific performance requirement. Specifically, the portion which recites “specified tolerance” is equivalent to the approximation residuals. Further, the examiner notes that the matrix cited is equivalent to the tensor, as matrices and tensors can be converted from one format to the other.
	Operating the computational network using the at least one compressed layer
[a neural network for an Advanced Driver Assistance System (ADAS) application to be run on a particular mobile hardware   (¶0022) ]
Brothers does not explicitly teach compressing at least one layer of the computational network based at least in part on the low-rank approximation.
However, Lin teaches this limitation, as seen below:
[The weight matrices of the compressed layers may be obtained through low-rank approximation methods (¶0071) ]
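For illustration only, the examiner's note in the claim 9 mapping that matrices and tensors can be converted from one format to the other can be sketched as follows. This is a minimal sketch under stated assumptions: the 4-D kernel layout (output channels, input channels, kernel height, kernel width) and the helper names `unfold_kernel` and `fold_kernel` are hypothetical and are not drawn from Brothers, Lin, or the specification.

```python
import numpy as np

def unfold_kernel(kernel):
    """Flatten a 4-D kernel (out, in, kh, kw) into a 2-D matrix (out, in*kh*kw).
    The layout is an assumption made for this example."""
    out_channels = kernel.shape[0]
    return kernel.reshape(out_channels, -1)

def fold_kernel(matrix, kernel_shape):
    """Restore the 4-D kernel from its 2-D matrix form."""
    return matrix.reshape(kernel_shape)
```

Because the reshape only rearranges how the same elements are indexed, any matrix technique (for example, a low-rank approximation) applied to the unfolded form can be folded back into the tensor layout.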


Regarding claim 10, the apparatus of claim 9 is taught by Brothers/Lin as in the rejection for claim 9 above. The rest of the claim is taught by Brothers, as seen below:
	Wherein the processor is configured to determine the low-rank approximation automatically
[Example embodiments include a framework that automatically tunes parameters of an input neural network and outputs a modified (e.g., optimized) neural network. (¶0023)]
	Based on a performance metric
[Methods and systems to modify and/or optimize neural networks are described herein that automatically tune neural network parameters to achieve required performance. The term “performance,” may be used herein in describing certain aspects of operation of a neural network such as accuracy of the neural network, runtime of the neural network, computational efficiency of the neural network, throughput, and/or power consumption of the neural network (¶0022)].
	


Regarding claim 11, the apparatus of claim 10 is taught by Brothers/Lin as in the rejection for claim 10 above. The rest of the claim is taught by Brothers, as seen below:
	Wherein the performance metric includes at least one of an accuracy metric, a completion time metric and a complexity metric.
[Methods and systems to modify and/or optimize neural networks are described herein that automatically tune neural network parameters to achieve required performance. The term “performance,” may be used herein in describing certain aspects of operation of a neural network such as accuracy of the neural network, runtime of the neural network, computational efficiency of the neural network, throughput, and/or power consumption of the neural network (¶0022)]

Regarding claim 14, Brothers/Lin teaches the apparatus of claim 9, as in the rejection for claim 9 above.
wherein the at least one processor is further configured to determine the low-rank approximation without fine tuning
Lin teaches this limitation: [The weight matrices of the compressed layers may be obtained through low-rank approximation methods or by an alternating minimization algorithm. (¶0069)]


Regarding claim 15, the apparatus of claim 9 is taught by Brothers/Lin as in the rejection for claim 9 above. The rest of the claim is taught by Brothers, as seen below:
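wherein the low-rank approximation is determined using singular value decomposition
[In another example, the neural network analyzer can apply a low rank matrix approximation using singular value decomposition to reduce the number of operations required. (¶0067)]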
	

Regarding claim 17, Brothers teaches an apparatus of operating a computational network;
[Another embodiment includes an apparatus for tuning a neural network. The apparatus includes a memory storing program code and a processor coupled to the memory. The processor is configured to initiate operations responsive to executing the program code.  (¶0006)]
Means for determining a low-rank approximation for one or more layers of the computational network 
[In another example, the neural network analyzer can apply a low rank matrix approximation using singular value decomposition to reduce the number of operations required. (¶0067)]
Based at least in part on the set of targets respectively corresponding to a set of tensor approximation residuals;
[“The modified neural network may be validated to determine whether the modified neural network meets established performance requirements.” (¶0023)]
[“The decomposition may be performed to exactly replicate the original convolution or to approximate the original convolution within a specified tolerance. In another example, the neural network analyzer can apply a low rank matrix approximation using singular value decomposition to reduce the number of operations required.” (¶0067) ]
This citation and the one above it from Brothers show that the low-rank approximation is carried out to meet a specific performance requirement. Specifically, the portion which recites “specified tolerance” is equivalent to the approximation residuals. Further, the examiner notes that the matrix cited is equivalent to the tensor, as matrices and tensors can be converted from one format to the other.
	Means for operating the computational network using the at least one compressed layer
[a neural network for an Advanced Driver Assistance System (ADAS) application to be run on a particular mobile hardware   (¶0022)]
Brothers does not explicitly teach:
means for compressing at least one layer of the computational network based at least in part on the low-rank approximation
Lin does however teach this as seen below:
[The functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in hardware, an example hardware configuration may comprise a processing system in a device (¶0092)] and 
[The weight matrices of the compressed layers may be obtained through low-rank approximation methods (¶0071) ]
	For claim 17, the means for determining a low-rank approximation, the means for compressing at least one layer of the computational network, and the means for operating the computational network correspond to the structure shown in Fig. 1 and Fig. 4 of the specification. For claims 17-24, the means are taken to be a processor coupled to memory, which one of ordinary skill in the art would expect to use to achieve the recited functions.
Therefore it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to compress a layer in a computational network using a low-rank approximation based on residual targets and to further operate that computational network with the at least one compressed layer. The combination would be obvious because one of ordinary skill in the art would know to apply a technique of compressing layers of the computational network using a low-rank approximation based at least in part on residual targets, and to further operate the computational network with the compressed layer.
Regarding claim 18, the apparatus of claim 17 is taught by Brothers/Lin as in the rejection for claim 17 above. The rest of the claim is taught by Brothers, as seen below:
	Wherein the low-rank approximation is automatically determined
[Example embodiments include a framework that automatically tunes parameters of an input neural network and outputs a modified (e.g., optimized) neural network. (¶0023)]
	Based on a performance metric
[Methods and systems to modify and/or optimize neural networks are described herein that automatically tune neural network parameters to achieve required performance. The term “performance,” may be used herein in describing certain aspects of operation of a neural network such as accuracy of the neural network, runtime of the neural network, computational efficiency of the neural network, throughput, and/or power consumption of the neural network (¶0022)].
	
Regarding claim 19, the apparatus of claim 18 is taught by Brothers/Lin as in the rejection for claim 18 above. The rest of the claim is taught by Brothers, as seen below:
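	Wherein the performance metric includes at least one of an accuracy metric, a completion time metric and a complexity metric.
[Methods and systems to modify and/or optimize neural networks are described herein that automatically tune neural network parameters to achieve required performance. The term “performance,” may be used herein in describing certain aspects of operation of a neural network such as accuracy of the neural network, runtime of the neural network, computational efficiency of the neural network, throughput, and/or power consumption of the neural network (¶0022)]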
	
Regarding claim 22, Brothers/Lin teaches the apparatus of claim 17, as in the rejection for claim 17 above.
further comprising means for determining the low-rank approximation without fine tuning
Lin teaches this limitation: [The weight matrices of the compressed layers may be obtained through low-rank approximation methods or by an alternating minimization algorithm. (¶0069)]


Regarding claim 23, the apparatus of claim 17 is taught by Brothers/Lin as in the rejection for claim 17 above. The rest of the claim is taught by Brothers, as seen below:
Means for determining the low-rank approximation using singular value decomposition
[In another example, the neural network analyzer can apply a low rank matrix approximation using singular value decomposition to reduce the number of operations required. (¶0067)]
	

Regarding claim 25, Brothers teaches a non-transitory computer readable medium having executable code for operating a computational network, comprising code to:
 [A computer program product includes a computer readable storage medium having program code stored thereon for tuning a neural network. The program code is executable by a processor to perform operations.   (¶0007)]
determine a low-rank approximation for one or more layers of the computational network 
[In another example, the neural network analyzer can apply a low rank matrix approximation using singular value decomposition to reduce the number of operations required. (¶0067)]
Based at least in part on the set of targets respectively corresponding to a set of tensor approximation residuals;
[“The modified neural network may be validated to determine whether the modified neural network meets established performance requirements.” (¶0023)]
[“The decomposition may be performed to exactly replicate the original convolution or to approximate the original convolution within a specified tolerance. In another example, the neural network analyzer can apply a low rank matrix approximation using singular value decomposition to reduce the number of operations required.” (¶0067) ]
This citation and the one above it from Brothers show that the low-rank approximation is carried out to meet a specific performance requirement. Specifically, the portion which recites “specified tolerance” is equivalent to the approximation residuals. Further, the examiner notes that the matrix cited is equivalent to the tensor, as matrices and tensors can be converted from one format to the other.
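	Operating the computational network using the at least one compressed layer
[a neural network for an Advanced Driver Assistance System (ADAS) application to be run on a particular mobile hardware (¶0022)]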
Brothers does not explicitly teach code to compress at least one layer of the computational network based at least in part on the low-rank approximation, but does teach determining a low-rank approximation for one or more layers of the computational network based at least in part on the set of residual targets, and teaches operating the compressed layer(s).
However, Lin teaches compressing at least one layer of the computational network based at least in part on the low-rank approximation:
[The weight matrices of the compressed layers may be obtained through low-rank approximation methods (¶0071)]
	Therefore it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to utilize a computer readable medium with instructions to compress a layer in a computational network using a low-rank approximation based on residual targets and to further operate that computational network with the at least one compressed layer. The combination would be obvious because one of ordinary skill in the art would know to apply a technique of compressing layers of the computational network using a low-rank approximation based at least in part on residual targets, and to further operate the computational network with the compressed layer. One of ordinary skill in the art would appreciate the increased performance and accuracy from using residual targets while training a computational network, and further the improved results that may come from utilizing the compressed network in a product.

Regarding claim 26, the non-transitory computer readable medium of claim 25 is taught by Brothers/Lin as in the rejection for claim 25 above. The rest of the claim is taught by Brothers, as seen below:
	Determine the low-rank approximation automatically
[Example embodiments include a framework that automatically tunes parameters of an input neural network and outputs a modified (e.g., optimized) neural network. (¶0023)]
	Based on a performance metric
[Methods and systems to modify and/or optimize neural networks are described herein that automatically tune neural network parameters to achieve required performance. The term “performance,” may be used herein in describing certain aspects of operation of a neural network such as accuracy of the neural network, runtime of the neural network, computational efficiency of the neural network, throughput, and/or power consumption of the neural network (¶0022)].
	The performance metric comprising at least one of an accuracy metric, a completion time metric and a complexity metric
[Methods and systems to modify and/or optimize neural networks are described herein that automatically tune neural network parameters to achieve required performance. The term “performance,” may be used herein in describing certain aspects of operation of a neural network such as accuracy of the neural network, runtime of the neural network, computational efficiency of the neural network, throughput, and/or power consumption of the neural network (¶0022)]
	
Regarding claim 28, Brothers/Lin teaches the non-transitory computer readable medium of claim 25, as in the rejection for claim 25 above. Lin teaches the following limitation, as seen below:
further comprising code for determining the low-rank approximation without fine tuning
[The weight matrices of the compressed layers may be obtained through low-rank approximation methods or by an alternating minimization algorithm. (¶0069)]
	

Regarding claim 29, the non-transitory computer readable medium of claim 25 is taught by Brothers/Lin as in the rejection for claim 25 above. The rest of the claim is taught by Brothers, as seen below:
wherein the low-rank approximation is determined using singular value decomposition
[In another example, the neural network analyzer can apply a low rank matrix approximation using singular value decomposition to reduce the number of operations required. (¶0067)]
	



Claims 4, 5, 12, 13, 20, 21 and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Brothers/Lin in light of Annapureddy (US20160217369A1).

Regarding claim 4, Brothers/Lin teaches the limitation method of claim 1 as in the rejection for claim 1 above. What Brothers/Lin does not explicitly teach is:
 wherein the compressing at least one layer includes updating a bias with the at least one layer
However, Annapureddy does teach this as seen below:

	To explain further, this citation from Annapureddy and the entire section consisting of (¶0095) explains use of bias values after compressing a layer. 
Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to operate a computational network that determines low-rank approximations based at least in part on residual values, compresses layers of that network based at least in part on said low-rank approximations, and updates a bias associated with the compressed layer. This would be obvious because one of ordinary skill in the art would recognize, prior to the effective filing date of the invention, that compressing a layer within a computational network would most likely affect the output and accuracy of the layer and of the model as a whole, and that updating a bias on the affected layer(s) is one way to counteract that effect and maintain accuracy.

Regarding claim 5, Brothers/Lin along with Annapureddy teaches the method of claim 4 as in the rejection for claim 4 above. Brothers/Lin does not explicitly teach:
wherein the bias is updated by applying a vector ‘m’
However, this is taught by Annapureddy with:
[The number of parameters before compression may be equal to the number of elements in the weight matrix W=nm, plus the bias vector, which is equal to m.  (¶0080)]
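As an arithmetic illustration of the quoted parameter count, the sketch below compares a layer with an n x m weight matrix plus an m-element bias vector against a factored layer. The before-compression count follows the quoted passage; the after-compression formula (n*r + r*m + m, a single rank-r factorization with the bias kept on the output) is an assumption made for this example and is not quoted from Annapureddy. The layer sizes are hypothetical.

```python
def dense_params(n, m):
    """Parameters before compression: n*m weights plus an m-element bias vector."""
    return n * m + m


def factored_params(n, m, r):
    """Assumed count after a rank-r factorization: (n x r) and (r x m) factors plus the bias."""
    return n * r + r * m + m


n, m, r = 1024, 512, 64  # hypothetical layer dimensions and rank
before = dense_params(n, m)
after = factored_params(n, m, r)
print(before, after, round(after / before, 3))
```

Under this assumption the factored layer is smaller whenever r is less than n*m / (n + m).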
	

Regarding claim 12, Brothers/Lin teaches the limitation apparatus of claim 9 as in the rejection for claim 9 above. What Brothers/Lin does not explicitly teach is:
 wherein the at least one processor is further configured to update a bias with the at least one layer
However, Annapureddy does explicitly teach as seen below:

To explain further, this citation from Annapureddy and the entire section consisting of (¶0095) explains use of bias values after compressing a layer.
to update a bias with the at least one layer
 [where Bj is a bias term, and Zk is an activation vector of the added layer of neurons  (¶0095)]
	With respect to Claim 12, it is substantially similar to Claim 4 and is rejected in the same manner, the same art and reasoning applying. Please see the motivation to combine in claim 4.

Regarding claim 13, Brothers/Lin along with Annapureddy teaches the limitation apparatus of claim 12 as in the rejection for claim 12 above. The rest of the claim limitations are also taught by Annapureddy as follows.
Wherein the at least one processor is further configured
[The various operations of methods described above may be performed by any suitable means capable of performing the corresponding functions. The means may include various hardware and/or software component(s) and/or module(s), including, but not limited to, a circuit, an application specific integrated circuit (ASIC), or processor  (¶0141)]
To update the bias by applying a vector ‘m’
[The number of parameters before compression may be equal to the number of elements in the weight matrix W=nm, plus the bias vector, which is equal to m.  (¶0080)]
	To compensate for a mean shift 
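[To make sure that the effective transformation of compressed convolution layers (Equation 5) is a close approximation to the transformation of the original convolution layer, the weight matrices of the compressed layers may be chosen so as to reduce or minimize the approximation error (¶0096)] 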
	To further explain, weight matrices of the compressed layers in this citation are what would be analyzed to determine if a mean shift had occurred. 
	in an output activation of the at least one compressed layer and an output activation of the at least one layer
[After replacing a layer with multiple layers, the neurons in the compressed layer may be configured with identity activations. However, nonlinear layers may be added between compressed layers to improve the representational capacity of the network.   (¶0119)] and
[In one exemplary aspect, a rectifier nonlinearity may be inserted. In this example, a threshold may be applied to the output activations of a compressed layer using max(0, x) nonlinearity before passing them to the next layer.   (¶0120)]
	To further explain, the limitation of compensating for a mean shift in an output activation of the at least one compressed layer would be interpreted, by one of ordinary skill in the art before the effective filing date, to be an operation that could be classified to minimize errors or biases but also readjust the computational network after compression back towards whatever goal the user may wish to achieve with said network. Annapureddy teaches that the computational network (referenced by DCN which stands for deep convolutional network) may have an error or inadvertently reduce accuracy when compressed, which can be adjusted after compression. Annapureddy’s explanation of updating a bias vector to compensate for a mean shift in an output activation of compressed and non-compressed layers would be obvious to combine with the teaching of Brothers/Lin, as in the limitations of this claim, for one of ordinary skill in the art.
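For illustration only, one possible reading of "applying a vector m to compensate for a mean shift" can be sketched as follows: the mean output activation of the original layer and of the compressed layer are measured over a set of sample inputs, and their difference m is added to the compressed layer's bias. This is an assumed interpretation, not a procedure quoted from Annapureddy or from the claims; all weights, inputs, and function names are hypothetical, and W_comp merely stands in for the effective weights of a compressed layer.

```python
def linear(W, b, x):
    """Output activation of a linear layer: W (rows = outputs) times x, plus bias b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def mean_activation(W, b, inputs):
    """Per-output mean activation over a list of sample inputs."""
    outs = [linear(W, b, x) for x in inputs]
    return [sum(col) / len(outs) for col in zip(*outs)]


def compensate_bias(W_orig, b_orig, W_comp, b_comp, inputs):
    """Return (updated bias, shift vector m), where m = mean(original) - mean(compressed)."""
    mu_orig = mean_activation(W_orig, b_orig, inputs)
    mu_comp = mean_activation(W_comp, b_comp, inputs)
    m = [o - c for o, c in zip(mu_orig, mu_comp)]
    return [bc + mi for bc, mi in zip(b_comp, m)], m


# Hypothetical original layer and a slightly perturbed "compressed" stand-in.
W_orig = [[1.0, 2.0], [0.5, -1.0]]
b_orig = [0.1, 0.2]
W_comp = [[1.0, 1.8], [0.5, -0.9]]
b_comp = [0.1, 0.2]
inputs = [[1.0, 2.0], [3.0, 4.0], [0.0, 1.0]]

b_new, m = compensate_bias(W_orig, b_orig, W_comp, b_comp, inputs)
mu_target = mean_activation(W_orig, b_orig, inputs)
mu_after = mean_activation(W_comp, b_new, inputs)
print("shift vector m:", m)
```

Because the layer is linear in its bias, the compressed layer's mean output activation matches the original layer's mean on these sample inputs once m has been applied.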

Regarding claim 20, Brothers/Lin teaches the limitation apparatus of claim 17 as in the rejection for claim 17 above. What Brothers/Lin does not explicitly teach is:
means for updating a bias associated with the at least one layer
However, Annapureddy does explicitly teach means for 
[The various operations of methods described above may be performed by any suitable means capable of performing the corresponding functions. The means may include various hardware and/or software component(s) and/or module(s), including, but not limited to, a circuit, an application specific integrated circuit (ASIC), or processor  (¶0141)]
updating a bias associated with the at least one layer
 [where Bj is a bias term, and Zk is an activation vector of the added layer of neurons   (¶0095)]
To explain further, this citation from Annapureddy and the entire section consisting of (¶0095) explains use of bias values after compressing a layer.
With respect to Claim 20, it is substantially similar to Claim 4 and is rejected in the same manner, the same art and reasoning applying.
Regarding claim 21, Brothers/Lin teaches the limitation apparatus of claim 17 as in the rejection for claim 17 above. What Brothers/Lin do not teach is:
 further comprising means for applying a vector m to compensate for a mean shift in an output activation of the at least one compressed layer and an output activation of the at least one layer. 
However, the rest of the claim limitations are taught by Annapureddy as follows.
further comprising means for 
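[The various operations of methods described above may be performed by any suitable means capable of performing the corresponding functions. The means may include various hardware and/or software component(s) and/or module(s), including, but not limited to, a circuit, an application specific integrated circuit (ASIC), or processor  (¶0141)]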
applying a vector ‘m’
[The number of parameters before compression may be equal to the number of elements in the weight matrix W=nm, plus the bias vector, which is equal to m.  (¶0080)]
	To compensate for a mean shift 
[To make sure that the effective transformation of compressed convolution layers (Equation 5) is a close approximation to the transformation of the original convolution layer, the weight matrices of the compressed layers may be chosen so as to reduce or minimize the approximation error (¶0096)] 
To further explain, weight matrices of the compressed layers in this citation are what would be analyzed to determine if a mean shift had occurred.
	in an output activation of the at least one compressed layer and an output activation of the at least one layer
[After replacing a layer with multiple layers, the neurons in the compressed layer may be configured with identity activations. However, nonlinear layers may be added between compressed layers to improve the representational capacity of the network.   (¶0119)] and
[In one exemplary aspect, a rectifier nonlinearity may be inserted. In this example, a threshold may be applied to the output activations of a compressed layer using max(0, x) nonlinearity before passing them to the next layer.  (¶0120)]
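To further explain, the limitation of compensating for a mean shift in an output activation of the at least one compressed layer would be interpreted, by one of ordinary skill in the art before the effective filing date, to be an operation that could be classified to minimize errors or biases but also readjust the computational network after compression back towards whatever goal the user may wish to achieve with said network. Annapureddy teaches that the computational network (referenced by DCN which stands for deep convolutional network) may have an error or inadvertently reduce accuracy when compressed, which can be adjusted after compression. Annapureddy’s explanation of updating a bias vector to compensate for a mean shift in an output activation of compressed and non-compressed layers would be obvious to combine with the teaching of Brothers/Lin, as in the limitations of this claim, for one of ordinary skill in the art.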
In regards to claim 27, Brothers/Lin teach the limitation non-transitory computer readable medium of claim 25 as in the rejection for claim 25 above. The rest of the claim limitations are also taught by Annapureddy as follows.
Code to update the bias by applying a vector ‘m’
[The number of parameters before compression may be equal to the number of elements in the weight matrix W=nm, plus the bias vector, which is equal to m.  (¶0080)]
	To compensate for a mean shift 
[To make sure that the effective transformation of compressed convolution layers (Equation 5) is a close approximation to the transformation of the original convolution layer, the weight matrices of the compressed layers may be chosen so as to reduce or minimize the approximation error (¶0096)] 
To further explain, weight matrices of the compressed layers in this citation are what would be analyzed to determine if a mean shift had occurred.
	in an output activation of the at least one compressed layer and an output activation of the at least one layer
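[After replacing a layer with multiple layers, the neurons in the compressed layer may be configured with identity activations. However, nonlinear layers may be added between compressed layers to improve the representational capacity of the network.   (¶0119)] and
[In one exemplary aspect, a rectifier nonlinearity may be inserted. In this example, a threshold may be applied to the output activations of a compressed layer using max(0, x) nonlinearity before passing them to the next layer.  (¶0120)]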
	To further explain, the limitation of compensating for a mean shift in an output activation of the at least one compressed layer would be interpreted, by one of ordinary skill in the art before the effective filing date, to be an operation that could be classified to minimize errors or biases but also readjust the computational network after compression back towards whatever goal the user may wish to achieve with said network. Annapureddy teaches that the computational network (referenced by DCN which stands for deep convolutional network) may have an error or inadvertently reduce accuracy when compressed, which can be adjusted after compression. Annapureddy’s explanation of updating a bias vector to compensate for a mean shift in an output activation of compressed and non-compressed layers would be obvious to combine with the teaching of Brothers/Lin, as in the limitations of this claim, for one of ordinary skill in the art.



Claims 8, 16, 24 and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Brothers/Lin in light of Sindhwani (WO 2017151203 A1) henceforth known as JP.
In regards to claim 8, Brothers/Lin teaches the limitation method of claim 1 as in the rejection for claim 1 above. Brothers also teaches the following limitations:
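 determining a set of candidate rank vectors that satisfy each target of the set of targets;
[The modified neural network may be validated to determine whether the modified neural network meets established performance requirements. (¶0023)] and [Example embodiments further include a method of determining tuned parameters for a neural network. (¶0023)]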
	However, Brothers fails to explicitly teach 
evaluating each candidate rank vector of the set of rank vectors;
and selecting a rank vector of the set of candidate rank vectors 
Lin teaches those limitations as follows:
	Evaluating each candidate rank vector of the set of rank vectors and selecting a rank vector of the set of candidate rank vectors
[ (Fig. 5) ] 
	To further explain, figure 5 of Lin depicts “Dynamically selecting between a current configuration and the new configuration based on the current system resources and the performance specification.” This step directly follows the previous step in the figure, where the candidate rank vectors are determined, and continues by selecting between the new configuration(s) and the current configuration, which would mean the process evaluates the rank vectors and selects from the choice of rank vectors.
	What Brothers/Lin together and individually don’t explicitly teach is
 based on the evaluations of a minimization function using the set of candidate rank vectors
Which JP teaches as seen below: 
[“The system optimizes the objective function, i.e., either by maximizing or minimizing, so as to determine the training values of the parameters of the recurrent neural network from the initial values of the parameters”  (¶0063) ]
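	To further explain, Brothers/Lin teach a device for automatic determination of compression parameters and a rank selector module that automatically evaluates and then selects from the available set of candidate rank vectors. One of ordinary skill in the art, before the claimed invention date, would find tensor to be defined as a general term for vectors and matrices within machine learning, hence why it is analogous art. Further, the minimization function which is seen in JP and is used in the context of training parameters for minimizing an error value is equivalent to the minimization function of the claim language. Examiner notes that the candidate rank vectors were taught in the previous citations and the JP reference/citation(s) are used for the minimizing function. 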
	Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method for compressing a computational network using candidate rank vectors as taught by Brothers/Lin with the minimization function as taught by JP. This would be obvious because one of ordinary skill in the art, prior to the effective filing date of the invention, would recognize that using metrics to analyze the performance of layers after compression (in the context of reaching a goal or objective function) would be one of the most obvious ways to ensure that the compression was successful while still maintaining accuracy or other performance and objective functions as defined by the user.
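For illustration only, the combination discussed above (determining candidate rank vectors that satisfy a set of targets, evaluating each candidate, and selecting one based on a minimization function) can be sketched in Python as follows. The per-layer residual table, the targets, the layer shapes, and the cost function (a count of factored weights) are all hypothetical assumptions and are not taken from Brothers, Lin, or JP.

```python
from itertools import product

# Hypothetical table: residuals[layer][rank] = approximation residual at that rank.
residuals = [
    {1: 0.40, 2: 0.15, 4: 0.05},
    {1: 0.30, 2: 0.20, 4: 0.02},
]
targets = [0.25, 0.25]          # hypothetical maximum residual allowed per layer
layer_shapes = [(8, 8), (8, 4)]  # hypothetical (n, m) shape of each layer


def candidate_rank_vectors(residuals, targets):
    """All rank vectors (one rank per layer) whose residual meets every layer's target."""
    per_layer = [
        [rank for rank, res in layer.items() if res <= target]
        for layer, target in zip(residuals, targets)
    ]
    return list(product(*per_layer))


def cost(rank_vector, shapes):
    """Assumed minimization function: total weights in the factored layers."""
    return sum(n * r + r * m for r, (n, m) in zip(rank_vector, shapes))


def select_rank_vector(candidates, shapes):
    """Evaluate every candidate and return the one with the smallest cost."""
    return min(candidates, key=lambda rv: cost(rv, shapes))


cands = candidate_rank_vectors(residuals, targets)
best = select_rank_vector(cands, layer_shapes)
print("candidates:", cands)
print("selected:", best, "cost:", cost(best, layer_shapes))
```

Any other objective, such as measured runtime or validation error, could be substituted for the cost function without changing the evaluate-and-select structure.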

In regards to claim 16, Brothers/Lin teaches the limitation the apparatus of claim 9 as in the rejection for claim 9 above. Brothers also teaches the following limitations:
 	wherein the at least one processor is further configured to:
[Another embodiment includes an apparatus for tuning a neural network. The apparatus includes a memory storing program code and a processor coupled to the memory. The processor is configured to initiate operations responsive to executing the program code.  (¶0006)]
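 determine a set of candidate rank vectors that satisfy each target of the set of targets;
[The modified neural network may be validated to determine whether the modified neural network meets established performance requirements. (¶0023)] and [Example embodiments further include a method of determining tuned parameters for a neural network. (¶0023)]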
	However, Brothers fails to explicitly teach 
evaluate each candidate rank vector of the set of rank vectors;
and select a rank vector of the set of candidate rank vectors 
Lin teaches those limitations as follows:
	Evaluate each candidate rank vector of the set of rank vectors and select a rank vector of the set of candidate rank vectors
[ (Fig. 5) ] 
	To further explain, figure 5 of Lin depicts “Dynamically selecting between a current configuration and the new configuration based on the current system resources and the performance specification.” This step directly follows the previous step in the figure, where the candidate rank vectors are determined, and continues by selecting between the new configuration(s) and the current configuration, which would mean the process evaluates the rank vectors and selects from the choice of rank vectors.
	What Brothers/Lin together and individually don’t explicitly teach is
 	based on the evaluations of a minimization function using the set of candidate rank vectors
Which JP teaches as seen below: 
[“The system optimizes the objective function, i.e., either by maximizing or minimizing, so as to determine the training values of the parameters of the recurrent neural network from the initial values of the parameters”  (¶0063) ]
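	To further explain, Brothers/Lin teach a device for automatic determination of compression parameters and a rank selector module that automatically evaluates and then selects from the available set of candidate rank vectors. One of ordinary skill in the art, before the claimed invention date, would find tensor to be defined as a general term for vectors and matrices within machine learning, hence why it is analogous art. Further, the minimization function which is seen in JP and is used in the context of training parameters for minimizing an error value is equivalent to the minimization function of the claim language. Examiner notes that the candidate rank vectors were taught in the previous citations and the JP reference/citation(s) are used for the minimizing function. 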
	With respect to Claim 16, it is substantially similar to Claim 8 and is rejected in the same manner, the same art and reasoning applying. Please refer to claim 8 to see the motivation to combine.

In regards to claim 24, Brothers/Lin teaches the limitation the apparatus of claim 17 as in the rejection for claim 17 above. Brothers also teaches the following limitations:
 Further comprising: means for determining a set of candidate rank vectors that satisfy each target of the set of targets;
[The modified neural network may be validated to determine whether the modified neural network meets established performance requirements. (¶0023)] and [Example embodiments further include a method of determining tuned parameters for a neural network. (¶0023)]
	However, Brothers fails to explicitly teach 
means for evaluating each candidate rank vector of the set of rank vectors;
and means for selecting a rank vector of the set of candidate rank vectors 
Lin teaches those limitations as follows:
	means for evaluating each candidate rank vector of the set of rank vectors and selecting a rank vector of the set of candidate rank vectors
[ (Fig. 5) ] 

	What Brothers/Lin together and individually don’t explicitly teach is
 based on the evaluations of a minimization function using the set of candidate rank vectors
Which JP teaches as seen below: 
[“The system optimizes the objective function, i.e., either by maximizing or minimizing, so as to determine the training values of the parameters of the recurrent neural network from the initial values of the parameters”  (¶0063) ]
	To further explain, Brothers/Lin teach a device for automatic determination of compression parameters and a rank selector module that automatically evaluates and then selects from the available set of candidate rank vectors. One of ordinary skill in the art, before the claimed invention date, would find tensor to be defined as a general term for vectors and matrices within machine learning, hence why it is analogous art. Further, the minimization function which is seen in JP and is used in the context of training parameters for minimizing an error value is equivalent to the minimization function of the claim language. Examiner notes that the candidate rank vectors were taught in the previous citations and the JP reference/citation(s) are used for the minimizing function. 
	With respect to Claim 24, it is substantially similar to Claim 8 and is rejected in the same manner, the same art and reasoning applying. Please refer to claim 8 to see the motivation to combine.



In regards to claim 30, Brothers/Lin teaches the limitation the non-transitory, computer readable medium of claim 25 as in the rejection for claim 25 above. Brothers also teaches the following limitations:
 	wherein the at least one processor is further configured to:
[Another embodiment includes an apparatus for tuning a neural network. The apparatus includes a memory storing program code and a processor coupled to the memory. The processor is configured to initiate operations responsive to executing the program code.  (¶0006)]
 determine a set of candidate rank vectors that satisfy each target of the set of targets;
[The modified neural network may be validated to determine whether the modified neural network meets established performance requirements. (¶0023)] and [Example embodiments further include a method of determining tuned parameters for a neural network. (¶0023)]
	However, Brothers fails to explicitly teach 
evaluate each candidate rank vector of the set of rank vectors;
and select a rank vector of the set of candidate rank vectors 
Lin teaches those limitations as follows:
	Evaluate each candidate rank vector of the set of rank vectors and select a rank vector of the set of candidate rank vectors
[ (Fig. 5) ] 
	To further explain, figure 5 of Lin depicts “Dynamically selecting between a current configuration and the new configuration based on the current system resources and the performance specification.” This step directly follows the previous step in the figure, where the candidate rank vectors are determined, and continues by selecting between the new configuration(s) and the current configuration, which would mean the process evaluates the rank vectors and selects from the choice of rank vectors.
	What Brothers/Lin together and individually don’t explicitly teach is
 	based on the evaluations of a minimization function using the set of candidate rank vectors
Which JP teaches as seen below: 
[“The system optimizes the objective function, i.e., either by maximizing or minimizing, so as to determine the training values of the parameters of the recurrent neural network from the initial values of the parameters”  (¶0063) ]
	To further explain, Brothers/Lin teach a device for automatic determination of compression parameters and a rank selector module that automatically evaluates and then selects from the available set of candidate rank vectors. One of ordinary skill in the art, before the claimed invention date, would find tensor to be defined as a general term for vectors and matrices within machine learning, hence why it is analogous art. Further, the minimization function which is seen in JP and is used in the context of training parameters for minimizing an error value is equivalent to the minimization function of the claim language. Examiner notes that the candidate rank vectors were taught in the previous citations and the JP reference/citation(s) are used for the minimizing function. 
	With respect to Claim 30, it is substantially similar to Claim 8 and is rejected in the same manner, the same art and reasoning applying. Please refer to claim 8 to see the motivation to combine.
	


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

US 11227213 B2 – Device and method for improving processing speed of neural network which teaches low rank expansion, gradual approximation to ensure accuracy, and column reduction. 
US 20180293758 A – low rank matrix compression which teaches compression with rank comparison in various layers and low-rank approximation done through SVD


THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL MERABI whose telephone number is (571)272-9685. The examiner can normally be reached Mon-Fri 7:30am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov can be reached on (571) 270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/M.A.M./Examiner, Art Unit 2123
/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123