DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Specification
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.


Drawings
The applicant’s submitted drawings appear to be acceptable for examination purposes. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the drawings.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 9-13 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter. Specifically, according to the description given in paragraph 0085 of the specification, the broadest reasonable interpretation of “machine-readable storage medium” covers transitory propagating signals, which are non-statutory. To overcome this rejection, applicant should insert --non-transitory-- before “machine-readable storage medium”. Such an amendment is not considered new matter. See the "Subject Matter Eligibility of Computer Readable Media" memo dated January 26, 2010 (OG Cite: 1351 OG 212; OG Date: 23 Feb 2010).


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


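Claims 9-13, 16, and 17 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.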

Claim 9 recites the limitation "the filter" in the final line.  There is insufficient antecedent basis for this limitation in the claim.
Claims 10-13 depend upon claim 9, and thus include the aforementioned limitation(s) (as well as a further recitation of “the filter” found in claim 13).

Claim 16 recites the limitation "the plurality of filters in a layer of the CNN" in line 3.  There is insufficient antecedent basis for this limitation in the claim.
Claim 17 depends upon claim 16, and thus includes the aforementioned limitation(s).
Claim 17 also recites the limitations “the first filter”, and “the third filter” in line 3.  There is insufficient antecedent basis for these limitations in the claim.


Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 14-16 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Aliabadi (US 2018/0096226).

As per claim 14, Aliabadi teaches an apparatus comprising: a multi-processor supporting execution of a Single Instruction Multiple Data (SIMD) instruction set [an implementation of CNNs on a computing device including multiple processors executing SIMD instructions (paras. 0028, 0067, etc.) from a memory (para. 0034, etc.)]; a SIMD register to be used when executing the SIMD instruction set [the SIMD instructions are executed using SIMD registers (para. 0031, etc.)]; a memory coupled to the multi-processor, the memory comprising instructions [an implementation of CNNs on a computing device including multiple processors executing SIMD instructions (paras. 0028, 0067, etc.) from a memory (para. 0034, etc.)] which when executed by the multi-processor cause the multi-processor to: generate a filter based on a receptive field size and a number of learnable parameters [a kernel of a CNN is divided into multiple runnels (filters) (para. 0006, etc.) each based on a number of weights (learnable parameters) that is based on a size of SIMD registers used for SIMD instructions (para. 0031, etc.) and a size of a number of feature maps (receptive field size) (para. 0256, etc.)], the number of learnable parameters being arranged in a first configuration [a kernel of a CNN is divided into multiple runnels (filters) (para. 0006, etc.) each based on a number of weights (learnable parameters) that is based on a size of SIMD registers used for SIMD instructions (para. 0031, etc.) and a size of a number of feature maps (receptive field size) (para. 0256, etc.)]; and embed the filter in a channel of a convolutional neural network (CNN), the CNN comprising a plurality of channels, wherein the CNN is executed by the multi-processor using the filter and the SIMD instruction set [each layer of the CNN including the kernels including the runnels has M input channels to the runnels (para. 0030, etc.) and the CNN is executed using SIMD instructions on a SIMD processor (para. 0067)].
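
Examiner’s Note (illustrative only): the following is a minimal sketch of what "generate a filter based on a receptive field size and a number of learnable parameters" could look like in code. It is not taken from Aliabadi or from the application. The function name make_sparse_filter, the random placement of weights, and the uniform initialization range are all assumptions made for illustration.

```python
import random


def make_sparse_filter(field_size, num_params, seed=None):
    """Place num_params learnable weights at random positions in a
    field_size x field_size receptive field (hypothetical helper).

    Returns (mask, weights): mask is a grid of 0/1 flags marking the
    learnable positions, weights holds one initial value per position.
    """
    total = field_size * field_size
    if not 0 < num_params <= total:
        raise ValueError("num_params must be between 1 and field_size**2")
    rng = random.Random(seed)
    # Choose distinct cell indices; this choice is the "configuration".
    positions = sorted(rng.sample(range(total), num_params))
    mask = [[0] * field_size for _ in range(field_size)]
    for p in positions:
        row, col = divmod(p, field_size)
        mask[row][col] = 1
    # Initial weight values are an arbitrary assumption of this sketch.
    weights = [rng.uniform(-1.0, 1.0) for _ in positions]
    return mask, weights


# Example: a 5 x 5 receptive field holding 8 learnable parameters.
mask, weights = make_sparse_filter(5, 8, seed=0)
```

Under these assumptions, two calls with different seeds would generally yield different arrangements of the same number of parameters, which is one possible reading of "different configurations".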

As per claim 15, Aliabadi teaches the memory further comprising instructions which, when executed by the multi-processor, cause the multiprocessor to: generate a plurality of filters based on the receptive field size and the number of learnable parameters, the number of learnable parameters being arranged in a second configuration in one filter of the plurality of filters [a kernel of a CNN is divided into multiple runnels (filters) (para. 0006, etc.) each based on a number of weights (learnable parameters) that is based on a size of SIMD registers used for SIMD instructions (para. 0031, etc.) and a size of a number of feature maps (receptive field size) (para. 0256, etc.)], wherein the first configuration is different from the second configuration [the CNN may be trained to learn and modify the parameters of the model using training data (para. 0025, etc.) (modifying the sets of weights in the filters during the training causing them to be different)]; and embed the one filter in a second channel of the CNN [each layer of the CNN including the kernels including the runnels includes a number of channels (para. 0030, etc.)].

As per claim 16, Aliabadi teaches the memory further comprising instructions which, when executed by the multi-processor, cause the multiprocessor to: embed a second filter in the plurality of filters in a layer of the CNN, the number of learnable parameters being arranged in a third configuration of the second filter, wherein the third configuration is different from the first configuration and the second configuration [the CNN may be trained to learn and modify the parameters of the model using training data (para. 0025, etc.) (modifying the sets of weights in the filters during the training causing them to be different)].


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was 
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-13, 17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Aliabadi (US 2018/0096226) in view of David (US 2019/0080243) or, alternatively, over David in view of Aliabadi, both as described below.

As per claim 1, Aliabadi teaches an apparatus comprising: a processor [an implementation of CNNs on a computing device including multiple processors executing SIMD instructions (paras. 0028, 0067, etc.)]; and memory coupled to the processor, the memory comprising instructions [an implementation of CNNs on a computing device including multiple processors executing SIMD instructions (paras. 0028, 0067, etc.) from a memory (para. 0034, etc.)] which, when executed by the processor, cause the processor to: generate, based in part on a receptive field size and a number of learnable parameters, a plurality of filters for a convolutional neural network (CNN) [a kernel of a CNN is divided into multiple runnels (filters) (para. 0006, etc.) each based on a number of weights (learnable parameters) that is based on a size of SIMD registers used for SIMD instructions (para. 0031, etc.) and a size of a number of feature maps (receptive field size) (para. 0256, etc.)], wherein the number of learnable parameters is based on a computing characteristic of the apparatus [a kernel of a CNN is divided into multiple runnels (filters) (para. 0006, etc.) each based on a number of weights (learnable parameters) that is based on a size of SIMD registers used for SIMD instructions (para. 0031, etc.)]; and train the CNN on a validation set [the CNN may be trained to learn the parameters of the model using training data (para. 0025, etc.)].
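While Aliabadi teaches selecting and training filters of a CNN (see above) it does not explicitly teach each filter comprising the number of learnable parameters arranged in different random configurations on the filter, select one filter from the plurality of filters based on a convergence speed, for each of the plurality of filters, of the CNN; and train the CNN on a validation set using the one filter from the plurality of filters.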
David teaches an apparatus comprising: a processor [a system including a processor executing instructions from memory (para. 0060, etc.)]; and memory coupled to the processor, the memory comprising instructions [a system including a processor executing instructions from memory (para. 0060, etc.)] which, when executed by the processor, cause the processor to: generate filters, each filter comprising the number of learnable parameters arranged in different random configurations on the filter [filters of a CNN may be modified by selecting and modifying filters, as well as changing random weights with random values (abstract; paras. 0045-47; etc.)], select one filter from the plurality of filters based on a convergence speed, for each of the plurality of filters, of the CNN [the modification of filters and creation of new filters includes selecting filter modifications that produce faster convergence (paras. 0034, 0045-47, etc.)]; and train the CNN on a validation set using the one filter from the plurality of filters [the modification of filters and creation of new filters includes selecting filter modifications that produce faster convergence (paras. 0034, 0045-47, etc.), used to train the CNN on a training set (para. 0009, etc.)].
Aliabadi and David are analogous art, as they are within the same field of endeavor, namely optimizing CNNs.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to include random mutation of filter weights and selecting filters for speed of convergence, in the optimization of the CNN, as taught by David, in the system taught by Aliabadi.
David provides motivation as [choosing filters to improve convergence speed improves and speeds up training of the CNN while randomization allows greater exploration (paras. 0034, 0045-47, etc.)].
Alternatively/additionally it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to include selecting a number of learnable parameters of filters based upon a computing characteristic of the apparatus for SIMD instructions/execution, as taught by Aliabadi, for the selection of filters in the system taught by David.
Aliabadi provides motivation as [by fitting the number of weights in the filters to the size of registers used by the ISA to implement the convolution layers, the system can more efficiently implement the CNN (para. 0031, etc.) while using a SIMD architecture allows for improved parallel computations (para. 0067, etc.)].
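
Examiner’s Note (illustrative only): selecting one filter from a plurality of candidates based on convergence speed can be sketched as below. This is a toy under stated assumptions, not the method of David or of the application: a masked linear model stands in for the CNN, plain gradient descent stands in for training, and the names epochs_to_converge and select_fastest, the learning rate, and the tolerance are hypothetical.

```python
def epochs_to_converge(mask, data, lr=0.1, tol=1e-3, max_epochs=500):
    """Train a masked linear model by gradient descent and return the
    1-based pass on which the mean squared error is first measured
    below tol, or max_epochs + 1 if that never happens."""
    weights = [0.0] * len(mask)
    n = len(data)
    for epoch in range(1, max_epochs + 1):
        grads = [0.0] * len(mask)
        loss = 0.0
        for x, y in data:
            # Only positions enabled by the mask contribute.
            pred = sum(w * m * xi for w, m, xi in zip(weights, mask, x))
            err = pred - y
            loss += err * err
            for i, (m, xi) in enumerate(zip(mask, x)):
                grads[i] += 2.0 * err * m * xi
        if loss / n < tol:
            return epoch
        weights = [w - lr * g / n for w, g in zip(weights, grads)]
    return max_epochs + 1


def select_fastest(masks, data):
    """Return the candidate mask whose model converges in the fewest passes."""
    return min(masks, key=lambda m: epochs_to_converge(m, data))


# Toy data: the target depends only on the first two inputs (y = x0 + 2*x1).
toy_data = [
    ((1, 0, 0, 1), 1.0),
    ((0, 1, 1, 0), 2.0),
    ((1, 1, 0, 0), 3.0),
    ((0, 0, 1, 1), 0.0),
]
candidates = [[0, 0, 1, 1], [1, 1, 0, 0]]
best = select_fastest(candidates, toy_data)
```

In this toy setting the mask covering the informative inputs reaches the tolerance, while the other mask cannot fit the data and hits the pass limit, so the first is selected. A real system would measure convergence of the CNN itself on its training or validation data.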

As per claim 2, Aliabadi/David teaches wherein the computing characteristic is the ability of the processor to execute a Single Instruction, Multiple Data (SIMD) instruction set [each runnel (filter) is based on a number of weights (learnable parameters) that is based on a size of SIMD registers used for SIMD instructions (Aliabadi: para. 0031, etc.)].
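
Examiner’s Note (illustrative only): the relationship between a SIMD register size and a number of weights can be shown with simple arithmetic. The register and element widths below are assumptions chosen for illustration and are not taken from Aliabadi, David, or the application; lanes_per_register is a hypothetical helper.

```python
def lanes_per_register(register_bits, element_bits):
    """Number of equally sized values that fit in one SIMD register."""
    if register_bits <= 0 or element_bits <= 0:
        raise ValueError("widths must be positive")
    if register_bits % element_bits != 0:
        raise ValueError("register width must be a multiple of element width")
    return register_bits // element_bits


# Assumed widths: a 256-bit register holding 32-bit values gives 8 lanes,
# so a filter sized to one register would carry 8 weights.
weights_per_filter = lanes_per_register(256, 32)
```

On these assumed widths the result is 8, which happens to match the number of learnable parameters recited in claim 5; other register or element widths would give other counts.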

As per claim 3, Aliabadi/David teaches wherein the filter is disposed in a channel of the CNN [each layer of the CNN including the kernels including the filters has M input channels to the runnels (Aliabadi: para. 0030, etc.); and each CNN includes a hierarchy of layers, each including one or more channels (David: para. 0010, etc.)].

As per claim 4, Aliabadi/David teaches wherein the filter is disposed in a layer of the CNN [each layer of the CNN including the kernels including the filters has M input channels to the runnels (Aliabadi: para. 0030, etc.); and each CNN includes a hierarchy of layers, each including one or more channels (David: para. 0010, etc.)].

As per claim 5, while Aliabadi/David teaches various receptive field sizes and numbers of parameters (see, e.g., David: para. 0031; Aliabadi: fig. 1; paras. 0084-85; etc.) it does not explicitly teach wherein the receptive field size is 5 x 5 and the number of learnable parameters is 8.  However, it has been held that where the general conditions of a claim are disclosed in the prior art, discovering the optimum or working ranges involves only routine skill in the art. In re Aller, 105 USPQ 233.  Furthermore, it has been held that a change in size is within the level of ordinary skill in the art.  In re Rose, 105 USPQ 237 (CCPA 1955).

As per claim 6, Aliabadi/David teaches the memory comprising instructions which, when executed by the processor, cause the processor to: generate, based in part on the receptive field size and the number of learnable parameters, a second filter comprising the number of learnable parameters, wherein the learnable parameters are [a kernel of a CNN is divided into multiple runnels (filters) (Aliabadi: para. 0006, etc.) each based on a number of weights (learnable parameters) that is based on a size of SIMD registers used for SIMD instructions (Aliabadi: para. 0031, etc.) and a size of a number of feature maps (receptive field size) (Aliabadi: para. 0256, etc.); where filters of a CNN may be modified by selecting and modifying filters, as well as changing random weights with random values (David: abstract; paras. 0045-47; etc.)]; and train a second CNN on the validation set using the second filter [the CNN may be trained to learn the parameters of the model using training data (Aliabadi: para. 0025, etc.) where the modification of filters and creation of new filters includes selecting filter modifications that produce faster convergence (David: paras. 0034, 0045-47, etc.), used to train the CNN on a training set (David: para. 0009, etc.)].

As per claim 7, Aliabadi/David teaches wherein the second CNN converges faster than the CNN [the modification of filters and creation of new filters includes selecting filter modifications that produce faster convergence (David: paras. 0034, 0045-47, etc.)].

As per claim 8, Aliabadi/David teaches instructions which, when executed by the processor, cause the processor to store the second configuration in a database [the chromosomes comprising the filters may be stored in a database (David: para. 0056, etc.)].

As per claim 9, Aliabadi/David teaches at least one machine-readable storage medium comprising instructions [an implementation of CNNs on a computing device including multiple processors executing SIMD instructions (Aliabadi: paras. 0028, 0067, etc.) from a memory (Aliabadi: para. 0034, etc.); and/or a system including a processor executing instructions from memory (David: para. 0060, etc.)] that, when executed by a processor, cause the processor to: define a filter dimension to be used in a convolution layer of a convolutional neural network (CNN), wherein the filter dimension determines a receptive field size of the convolutional layer [a kernel of a CNN is divided into multiple runnels (filters) (Aliabadi: para. 0006, etc.) each based on a number of weights (learnable parameters) that is based on a size of SIMD registers used for SIMD instructions (Aliabadi: para. 0031, etc.) and a size of a number of feature maps (receptive field size) (Aliabadi: para. 0256, etc.); filters of a CNN may be defined in a number of chromosomes and modified by selecting and modifying filters via mutation (David: abstract; paras. 0045-47; etc.)]; specify a number of learnable parameters based on a computing characteristic of the processor [a number of weights (learnable parameters) that is based on a size of SIMD registers used for SIMD instructions (Aliabadi: para. 0031, etc.)]; generate a plurality of filters, each of the plurality of filters comprising the receptive field size and comprising the specified number of learning parameters [a kernel of a CNN is divided into multiple runnels (filters) (Aliabadi: para. 0006, etc.) each based on a number of weights (learnable parameters) that is based on a size of SIMD registers used for SIMD instructions (Aliabadi: para. 0031, etc.) and a size of a number of feature maps (receptive field size) (Aliabadi: para. 0256, etc.)], wherein the arrangement of learning parameters is distinct for each of the plurality of filters [filters of a CNN may be modified by selecting and modifying filters, as well as changing random weights with random values (David: abstract; paras. 0045-47; etc.)]; and execute the CNN using the filter [the CNN may be trained to learn the parameters of the model using training data before execution (Aliabadi: para. 0025, etc.) where the modification of filters and creation of new filters includes selecting filter modifications that produce faster convergence (David: paras. 0034, 0045-47, etc.), used to train the CNN on a training set before execution (David: para. 0009, etc.)].
Examiner’s Note: the reasoning and motivation for the combination is provided above, in the rejection of claim 1.

As per claim 10, Aliabadi/David teaches instructions that further cause the processor to specify the number of learnable parameters based on a Single Instruction, Multiple Data (SIMD) computing characteristic of the processor [a kernel of a CNN is divided into multiple runnels (filters) (Aliabadi: para. 0006, etc.) each based on a number of weights (learnable parameters) that is based on a size of SIMD registers used for SIMD instructions (Aliabadi: para. 0031, etc.) and a size of a number of feature maps (receptive field size) (Aliabadi: para. 0256, etc.)].

As per claim 11, Aliabadi/David teaches instructions that further cause the processor to: use one of the plurality of filters in a channel of the CNN; and use a second of the plurality of filters in a layer of the CNN [each layer of the CNN including the kernels including the filters has M input channels to the runnels (Aliabadi: para. 0030, etc.); and each CNN includes a hierarchy of layers, each including one or more channels (David: para. 0010, etc.)].

As per claim 12, Aliabadi/David teaches instructions that further cause the processor to select one of the plurality of filters based on which one converges the fastest when running the CNN [the modification of filters and creation of new filters includes selecting filter modifications that produce faster convergence (David: paras. 0034, 0045-47, etc.)].

As per claim 13, Aliabadi/David teaches instructions that further cause the processor to use the filter to perform training and inference of the CNN [the CNN may be trained to learn the parameters of the model using training data before execution (Aliabadi: para. 0025, etc.) where the modification of filters and creation of new filters includes selecting filter modifications that produce faster convergence (David: paras. 0034, 0045-47, etc.), used to train the CNN on a training set before execution (David: para. 0009, etc.)].

As per claim 17, Aliabadi teaches the apparatus of claim 16, as described above.
While Aliabadi teaches selecting and training filters of a CNN (see above) it does not explicitly teach the memory further comprising instructions which, when executed by the multi-processor, cause the multiprocessor to: select either the first filter, the second filter or the third filter based on how fast the CNN converges with each filter.
David teaches the memory further comprising instructions which, when executed by the multi-processor, cause the multiprocessor to: select either the first filter, the second filter or the third filter based on how fast the CNN converges with each filter [filters of a CNN may be defined in a number of chromosomes and modified by selecting and modifying filters via mutation (abstract; etc.) where the modification of filters and creation of new filters includes selecting filter modifications that produce faster convergence (paras. 0034, 0045-47, etc.)].
Aliabadi and David are analogous art, as they are within the same field of endeavor, namely optimizing CNNs.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to include random mutation of filter weights and selecting filters for speed of convergence, in the optimization of the CNN, as taught by David, in the system taught by Aliabadi.
David provides motivation as [choosing filters to improve convergence speed improves and speeds up training of the CNN while randomization allows for greater exploration (paras. 0034, 0045-47, etc.)].

As per claim 20, see the rejection of claim 8, above.


Claims 18 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Aliabadi (US 2018/0096226) in view of well-known practices in the art.

As per claim 18, while Aliabadi teaches various receptive field sizes and numbers of parameters (see, e.g., Aliabadi: fig. 1; paras. 0084-85; etc.) it does not explicitly teach wherein the receptive field size is 5 x 5 and the number of learnable parameters is 8.  However, it has been held that where the general conditions of a claim are disclosed in the prior art, discovering the optimum or working ranges involves only routine skill in the art. In re Aller, 105 USPQ 233.  Furthermore, it has been held that a change in size is within the level of ordinary skill in the art.  In re Rose, 105 USPQ 237 (CCPA 1955).

As per claim 19, while Aliabadi/David teaches various receptive field sizes and numbers of parameters (see, e.g., David: para. 0031; Aliabadi: fig. 1; etc.) it does not explicitly teach wherein the receptive field size is 10 x 10 and the number of learnable In re Aller, 105 USPQ 233.  Furthermore, it has been held that a change in size is within the level of ordinary skill in the art.  In re Rose, 105 USPQ 237 (CCPA 1955).


Conclusion
The following is a summary of the treatment and status of all claims in the application as recommended by M.P.E.P. 707.07(i): claims 1-20 are rejected.

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Annapureddy (US 2016/0217369) – discloses a system including compression of filters in a CNN.
Ren (US 2018/0268284) – discloses trimming layers of a CNN including reducing filter sizes.

The examiner requests, in response to this Office action, that support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line number(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.

When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections.  See 37 CFR 1.111(c).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GEORGE GIROUX whose telephone number is (571)272-9769. The examiner can normally be reached M-F 10am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas, can be reached on 571-272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.





/GEORGE GIROUX/Primary Examiner, Art Unit 2128