Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


Claim 59 recites the limitation "performs the inferencing" in the last line. There is insufficient antecedent basis for this limitation in the claim.
Claim 61 recites the limitation " " in the last line.  There is insufficient antecedent basis for this limitation in the claim.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 51-54, 58 and 70 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Lin (Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration).
Regarding claim 51, Lin teaches an apparatus comprising: at least one processor; and at least one memory including computer program code which, when executed by the at least one processor, causes the apparatus to: initialize a neural network based on a set of stored model information (e.g. TABLE 1 defines a plurality of orthogonal binary basis vectors which are to be used to implement kernels in one or more hidden layers of the neural network), 
which defines a plurality of orthogonal binary basis vectors which are to be used to implement kernels in one or more hidden layers of the neural network (e.g. Section 3 and Figs. 1-3: which show pairs of independent binary vectors in an SVD filter for BCNN kernels), and 
plural sets of plural coefficients, each set of plural coefficients corresponding to a respective one of the kernels, wherein each of the coefficients in a given set of coefficients is associated with a respective one of the one or more orthogonal binary basis vectors (Section 3.1 to 3.2 and Fig. 2: which show singular coefficient values corresponding to vectors);
pass input data through the neural network such that convolution operations between the kernels and data arriving at the kernels are performed, wherein each of the kernels is implemented using a respective set of coefficients and the orthogonal binary basis vectors with which the coefficients in the set are associated (TABLE 1 and Section 6.1: which show CNN processing and acceleration of layer classification using the coefficients and vectors as shown above in Sections 3, 3.1 and 3.2); and
output data from the neural network, the output data representing an inference corresponding to the input data (Section 6.1: which shows output and storage).
Regarding claim 52, see the rejection of claim 51 above. Lin further teaches
for each of the binary orthogonal basis vectors associated with the coefficients corresponding to the kernel, computing a binary convolution of the orthogonal binary basis vector and the data arriving at the kernel, and multiplying the binary convolution by the coefficient associated with the orthogonal binary basis vector; and adding the results of the multiplications to generate an output from the kernel (Sections 3 and 6.1, Figs. 1-3).
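The operation recited in claim 52 can be pictured with a minimal sketch. It is not taken from Lin or from the application: the 1-D data, the +1/-1 basis vectors, the coefficient values and every function name below are assumptions chosen only for illustration, and the "convolution" is written as the sliding dot product commonly used in CNNs.

```python
# Illustrative sketch only; all names and values here are hypothetical.

def correlate(data, kernel):
    """Slide `kernel` over `data` (no padding) and return each dot product."""
    k = len(kernel)
    return [sum(d * w for d, w in zip(data[i:i + k], kernel))
            for i in range(len(data) - k + 1)]

def kernel_output(data, basis_vectors, coefficients):
    """For each basis vector: binary convolution with the data, scale the
    result by the associated coefficient, then add the scaled results."""
    out_len = len(data) - len(basis_vectors[0]) + 1
    total = [0.0] * out_len
    for basis, coeff in zip(basis_vectors, coefficients):
        partial = correlate(data, basis)          # binary convolution
        total = [t + coeff * p for t, p in zip(total, partial)]
    return total

# Assumed mutually orthogonal binary (+1/-1) basis vectors of length 4.
BASIS = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1]]
COEFFS = [0.5, -0.25, 2.0]   # assumed coefficients, one per basis vector
DATA = [3, 1, 4, 1, 5, 9]    # assumed data arriving at the kernel

result = kernel_output(DATA, BASIS, COEFFS)
```

Under these assumptions the summed output equals what a single real-valued kernel (the coefficient-weighted sum of the basis vectors) would produce, while each individual convolution involves only +1/-1 weights.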
Regarding claim 53, see the rejection of claim 51 above. Lin further teaches wherein the computer program code, when executed, causes the apparatus to initialise the neural network by: for each of the kernels, using the set of coefficients corresponding to that kernel and the associated orthogonal binary basis vectors to generate the kernel (Sections 3.1 to 3.2 and Figs. 2-3: which show singular coefficient values corresponding to vectors applied to kernels).
Regarding claim 54, see the rejection of claim 53 above. Lin further teaches wherein the computer program code, when executed, causes the apparatus to, for each of the kernels, generate the kernel by multiplying each binary basis vector with its associated coefficient and superimposing the results of the multiplications (Sections 3.1 to 3.2 and Figs. 1-3).
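The kernel generation recited in claims 53 and 54 can likewise be sketched as follows. This is an assumption-laden illustration rather than Lin's SVD-based procedure or the applicant's implementation: the basis vectors, coefficients and helper names are hypothetical.

```python
# Illustrative sketch only; all names and values here are hypothetical.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def are_orthogonal(vectors):
    """True when every distinct pair of vectors has a zero inner product."""
    return all(dot(u, v) == 0
               for i, u in enumerate(vectors)
               for v in vectors[i + 1:])

def build_kernel(basis_vectors, coefficients):
    """Multiply each binary basis vector by its coefficient and superimpose
    (element-wise add) the scaled vectors to obtain the kernel."""
    kernel = [0.0] * len(basis_vectors[0])
    for basis, coeff in zip(basis_vectors, coefficients):
        for i, b in enumerate(basis):
            kernel[i] += coeff * b
    return kernel

BASIS = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1]]  # assumed +1/-1 basis
COEFFS = [0.5, -0.25, 2.0]                              # assumed coefficients

kernel = build_kernel(BASIS, COEFFS)
```

In this sketch only the three coefficients need to be stored per kernel, since the four kernel weights are regenerated from the shared basis vectors at initialisation.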
Regarding claim 58, see the rejection of claim 51 above. Lin further teaches wherein the computer program code, when executed, causes the apparatus to: deterministically generate the orthogonal binary basis vectors required for the neural network based on identifiers in the set of stored model information (e.g. Look-up table (LUT) with a second table is the mapping relationship between all possible binary filters to their corresponding binarized separable filters. We design an estimation function to make the tables content-addressable. The key to index the first table can be obtained; Section 3.3).
Claim(s) 70 recite(s) similar limitations as claim(s) 51 above, but in method form. Therefore, the same rationale used in regard to claim(s) 51 is/are incorporated herein. Furthermore, Lin teaches a method to carry out the invention (page 2).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 55-57 is/are rejected under 35 U.S.C. 103 as being unpatentable over Lin (Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration) as applied to claim 51 above, in view of Bruckhaus (US 8417715 B1).
Regarding claim 55, see the rejection of claim 51 above. As can be seen above, Lin teach/es all the limitations of claim 55 except wherein the computer program code, when executed, causes the apparatus to: determine one or more performance requirements for the neural network; and based on the one or more performance requirements, select the set of stored model information from plural sets of stored model information, each corresponding to a different neural network model.
In the same field of AI, Bruckhaus teaches determine one or more performance requirements for the neural network; and based on the one or more performance requirements, select the set of stored model information from plural sets of stored model information, each corresponding to a different neural network model (e.g. The model manager 144 then compares the performance the best currently known model to the performance requirements the translated business task may specify. At any given time, the automatic optimization module 148 may or may not have discovered a model that meets such requirements specified in the translated business task. The model manager 144 component may then either continue constructing more models in the search for an appropriate model, it may use the model currently providing the best performance and indicate that the model does not meet performance requirements, or it may stop, depending on how the configuration of the translated business task and system defaults (col. 47, ll. 44-58); and The model manager 144 then selects the best model meeting all requirements and having the best performance, and facilitates deployment of the best model by making the executable representation of the model available in the model repository. The delivery services component can then retrieve the executable and provide it to a third party software system for deployment into automobiles. In this manner, the invention also supports other similar applications with model deployment needs into electronic devices, other than automobiles (col. 57, ll. 31-40)). One of ordinary skill in the art would have had a reasonable expectation of success in combining two like systems where both systems use neural networks.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to modify the invention of Lin with the features of performance requirements as taught by Bruckhaus. The motivation would have been to provide a high degree of automation that reduces the time and resources required to perform highly complex data mining tasks, while providing the automatic optimization of predictive models (Bruckhaus: col. 4, ll. 44-49).
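The selection step recited in claim 55 can be sketched as below. The sketch is a simplification made up for illustration and is neither Bruckhaus's model manager 144 nor the claimed apparatus: the `StoredModel` record, its fields, the example figures and the tie-breaking rule (highest accuracy among models meeting the requirements) are all assumptions.

```python
# Illustrative sketch only; the record type, fields and values are hypothetical.
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass(frozen=True)
class StoredModel:
    name: str          # identifier of one set of stored model information
    accuracy: float    # assumed validation accuracy of that model
    memory_kb: int     # assumed memory needed to run that model

def select_model(models: Sequence[StoredModel],
                 min_accuracy: float,
                 max_memory_kb: int) -> Optional[StoredModel]:
    """Return the most accurate stored model that meets both performance
    requirements, or None when no stored model satisfies them."""
    candidates = [m for m in models
                  if m.accuracy >= min_accuracy and m.memory_kb <= max_memory_kb]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m.accuracy)

MODELS = [
    StoredModel("small", 0.88, 120),
    StoredModel("medium", 0.93, 480),
    StoredModel("large", 0.96, 2100),
]

chosen = select_model(MODELS, min_accuracy=0.90, max_memory_kb=1000)
```

With these assumed figures the "medium" entry is selected, because "small" falls short of the accuracy requirement and "large" exceeds the memory requirement.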
Regarding claim 56, see the rejection of claim 55 above. Lin as modified by Bruckhaus further teach(es) wherein an indication of an acceptable accuracy for the neural network (e.g. Users of data mining technology require accurate data mining results because inaccurate results may lead to taking actions that could prove harmful... Thus, the invention brings about a high degree of accuracy and a closed-loop business process having tight integration with existing business processes--all of which are easily realized because the invention can be easily installed, configured, implemented, and maintained by a user or service provider (Bruckhaus: col. 4, ll. 15-54)). In a broadest reasonable interpretation, the phrase “high degree of accuracy” is analogous to “acceptable accuracy”. Since the claim states “and/or”, only one condition needs to be met. The same motivation to combine used in claim 55 above is applied herein.
Regarding claim 57, see the rejection of claim 56 above. Lin as modified by Bruckhaus further teach(es) wherein at least one of the one or more performance requirements is determined based on a current usage by the device of one or more computational resources (e.g. resource usage; Lin: Sections 6.1 and 6.2). The same motivation to combine used in claim 55 above is applied herein.

Claim(s) 60 and 62 is/are rejected under 35 U.S.C. 103 as being unpatentable over Lin (Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration) as applied to claim 51 above, in view of Manning (US 20170140260 A1).
Regarding claim 60, see the rejection of claim 51 above. As can be seen above, Lin teach/es all the limitations of claim 60 except wherein the input data is a representation of sensor data.
In the same field of CNNs, Manning teaches wherein the input data is a representation of sensor data (e.g. The computing device 100 may be any suitable device, such as, for example, a computer 20 as described in FIG. 7, for implementing the input converter 105, the convolutional neural networks 110, 120, and 130, and the storage 140. The computing device 100 may be a single computing device, or may include multiple connected computing devices. The input converter 105 may convert input, such as, for example, audio data 150 and video data 160, into an appropriate format to be input into a neural network, such as, for example, the convolutional neural networks 110, 120, and 130. The storage 140 may store the audio data 150, video data 160, vector representations 170, and labels 180 in any suitable manner; para. 28). Sensors such as a microphone are implied with the use of audio data.
Regarding claim 62, see the rejection of claim 60 above. Lin as modified by Manning further teach(es) wherein the sensor data is audio data (e.g. The computing device 100 may be any suitable device, such as, for example, a computer 20 as described in FIG. 7, for implementing the input converter 105, the convolutional neural networks 110, 120, and 130, and the storage 140. The computing device 100 may be a single computing device, or may include multiple connected computing devices. The input converter 105 may convert input, such as, for example, audio data 150 and video data 160, into an appropriate format to be input into a neural network, such as, for example, the convolutional neural networks 110, 120, and 130. The storage 140 may store the audio data 150, video data 160, vector representations 170, and labels 180 in any suitable manner; para. 28). The same motivation to combine used in claim 60 above is applied herein.

Claim(s) 63-66 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bhattacharya (Sparsification and Separation of Deep Learning Layers for Constrained Resource Inference on Wearables) in view of Lin (Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration).
Regarding claim 63, Bhattacharya teaches an apparatus comprising: at least one processor; and at least one memory including computer program code which, when executed by the at least one processor, causes the apparatus to: based on determined one or more performance requirements (e.g. acceptable levels of performance; page 2, last 3 bullet points), determine at least one set of hyperparameters, each set of hyperparameters defining a different neural network model, wherein the hyperparameters in each set include a number of layers for the neural network, a number of kernels in each of the layers and a number of orthogonal basis vectors which are to be used to implement the kernels in each layer (e.g. introducing a hyper parameters… optimizing deep learning model layers… exploit the transformation of the model and realize radical reductions in computation, energy and memory usage; page 2, first para. and last 3 bullet points, Section 2, third para., and Sections 4.4 to 4.5); and for each set of hyperparameters: initialize a respective neural network based on the set of hyperparameters (Algorithm 1); train the neural network using training data, thereby to learn a set of plural coefficients for each kernel, wherein each coefficient is associated with a respective one of the orthogonal basis vectors and wherein each of the kernels is implemented using a respective set of coefficients and the orthogonal binary basis vectors with which the coefficients in the set are associated; and store the learned coefficients in association with the hyperparameters for provision to a user device (Algorithm 1, steps 15-17).
Bhattacharya fails to teach binary basis vectors.
In the same field of CNNs, Lin teaches binary basis vectors (e.g. Section 3 and Figs. 1-3: which show pairs of independent binary vectors in an SVD filter for BCNN kernels). One of ordinary skill in the art would have had a reasonable expectation of success in combining two like systems where both systems use CNNs.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to modify the invention of Bhattacharya with the features of binary basis vectors as taught by Lin. The motivation would have been to improve error rates (page 2, column 2, para. 1-2).
Regarding claim 64, see the rejection of claim 63 above. Bhattacharya as modified by Lin further teach/es wherein the computer program code, when executed, causes the apparatus to: for each of the trained neural networks: use validation data to determine whether the trained neural network satisfies the at least one constraint indicated by the one or more performance requirements (Sections 4.4.2 to 4.4.5 and Algorithm 1); and in response to determining that the trained neural network satisfies the at least one constraint indicated by the one or more performance requirements, provide the hyperparameters and the learned coefficients for storage on a user device (Algorithm 1, step 15). The same motivation to combine used in claim 63 above is applied herein.
Regarding claim 65, see the rejection of claim 64 above. The limitations in claim 65 are the same as those of claim 63 above, except that claim 65 recites “plural sets of hyperparameters” where claim 63 recites “at least one set of hyperparameters”. This is an obvious variation of claim 63, because one of ordinary skill in the art who can produce one set of hyperparameters can certainly produce plural sets of hyperparameters. The same motivation to combine used in claim 63 above is applied herein.
Regarding claim 66, see the rejection of claim 64 above. The limitations in claim 66 are the same as those of claims 63 and 65. This is an obvious variation of claims 63 and 65, because one of ordinary skill in the art who can produce one set of hyperparameters can certainly produce plural sets of hyperparameters. The same motivation to combine used in claim 63 above is applied herein.

Claim(s) 67 and 69 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bhattacharya (Sparsification and Separation of Deep Learning Layers for Constrained Resource Inference on Wearables) in view of Lin (Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration) as applied to claim 63 above, in view of Bruckhaus (US 8417715 B1).
Regarding claim 67, see the rejection of claim 63 above. As can be seen above, Bhattacharya as modified by Lin teach/es all the limitations of claim 67 except wherein the one or more performance requirements include either or both of: a minimum level of accuracy for the neural network; a maximum level of computational resource use by the neural network.
In the same field of AI, Bruckhaus teaches wherein the one or more performance requirements include either or both of: a minimum level of accuracy for the neural network; a maximum level of computational resource use by the neural network (e.g. Users of data mining technology require accurate data mining results because inaccurate results may lead to taking actions that could prove harmful... Thus, the invention brings about a high degree of accuracy and a closed-loop business process having tight integration with existing business processes--all of which are easily realized because the invention can be easily installed, configured, implemented, and maintained by a user or service provider (col. 4, ll. 15-54)). In a broadest reasonable interpretation, the phrase “high degree of accuracy” is analogous to “acceptable accuracy”. Since the claim states “either or both”, only one condition needs to be met. One of ordinary skill in the art would have had a reasonable expectation of success in combining two like systems where both systems use neural networks.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to modify the invention of Bhattacharya as modified by Lin with the features of performance requirements as taught by Bruckhaus. The motivation would have been to provide a high degree of automation that reduces the time and resources required to perform highly complex data mining tasks, while providing the automatic optimization of predictive models (col. 4, ll. 44-49).
Regarding claim 69, see the rejection of claim 67 above. Bhattacharya as modified by Lin and Bruckhaus further teach(es) wherein the maximum level of computational resource use by the neural network comprises one or any combination of: CPU usage when executing the neural network; latency associated with the neural network; energy consumption resulting from executing the neural network; memory usage when executing the neural network; and memory usage required to store model information defining the neural network (e.g. that is able to exploit the transformation of the model and realize radical reductions in computation, energy and memory usage; Bhattacharya: page 2, first para.). Since the claim states “or”, only one condition needs to be met. The same motivation to combine used in claim 67 above is applied herein.

Allowable Subject Matter
Claim(s) 68 is/are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TODD BUTTRAM whose telephone number is (571)270-1540.  The examiner can normally be reached on M-F 9am-7pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, XIAO WU can be reached on 571-272-7761.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/TODD BUTTRAM/
Primary Examiner, Art Unit 2613