DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of the Claims
Claims 1, 5-6, 8, 15, and 19-20 are currently amended. Claims 9-10 are canceled. Claims 24-27 are new. Claims 1-3, 5-8, 12, 14-17, 19-20, and 22-27 are pending and have been considered. 

Claim Objections
Claims 24-26 are objected to because of the following informalities: Claim 24 recites “where the programmable set of weights… are sparse.” The subject of the sentence is “set”, which is singular, but the verb is plural. Furthermore, the weights should be sparse, not the set. Claims 25-26 are objected to for the reasons set forth in the objection to claim 24.
	Claim 27 is objected to because of the following informalities: In line 1, “said process” should recite “said processing”. In line 3, “concatenate” should recite “concatenating” and in line 5, “input” should recite “inputting”.
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-3, 5-7, 15-17, 19-20, 22-24, and 26 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. 

NEW MATTER
Claim 1, line 10 recites: “arranging the fixed layer portion and the programmable layer portion in parallel”. The limitation constitutes new matter because the written disclosure lacks a positive step of arranging the fixed layer portion and the programmable layer portion in parallel. For purposes of examination, Examiner interprets this limitation as if it had recited that the portions are arranged in parallel.
Claims 2-3, 5-7, 22, and 24 are rejected for failing to cure the deficiencies of claim 1 upon which they depend.
Claim 15, line 11 recites the same features recited by claim 1. Claim 15 is rejected for the reasons set forth in the rejection of claim 1.
Claims 16-17, 19-20, 23 and 26 are rejected for failing to cure the deficiencies of claim 15 upon which they depend.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-3, 5, 7, 15-17, 19, 24, and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Wen et al. (“Latent Factor Guided Convolutional Neural Networks for Age-Invariant Face Recognition”) in view of Han et al. (“Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”).

	Regarding CLAIM 1, Wen teaches: A method, comprising: 
at a training computer: (The experiments in section 4 from p. 4897, col. 2 to p. 4900 are evidence of a training computer.)
training a neural network on a first dataset to generate a first set of weights, the neural network including a plurality of layers including at least a first layer, a second layer and a third layer (On p. 4895, Fig. 3 shows the architecture of a latent factor guided CNN (LF-CNN). A first layer as claimed includes the boxes “Convolution Unit” and “Convolution Unit (Frozen)”. The convolution architecture is further described in the caption below Fig. 3 and at p. 4895, § 3.1, ¶ 1-2. Since the BRI of a neural network layer includes any mathematical function, a second layer and a third layer include the functions performed by the Age-Invariant Identity Loss which include “Latent Identity Analysis”, “Latent Factor FC Layer (Frozen)”, “Contrastive”, and “Softmax”, where FC means fully-connected. LF-FC is discussed on p. 4895, col. 2, first full paragraph and p. 4897, col. 1, second-to-last paragraph. Latent Identity Analysis is discussed throughout § 3.2 starting on p. 4895. A first set of weights is taught by the convolution weights on p. 4895, col. 1, lines 4-5 from the end, and by the parameter W on p. 4895, col. 2, throughout the first full paragraph.
P. 4897, col. 2, § 4.1, “Training data” discloses the two types of training data $Y_{i}$ and $Y_{ia}$ (Note: in line 4, $Y$ should be $Y_{ia}$). In Fig. 3, “Convolution Unit” receives input $Y_{i}$ and generates a feature map output, and “Convolution Unit (Frozen)” receives input $Y_{ia}$ and generates a different feature map output. The BRI of the limitation “training a neural network on a first dataset to generate a first set of weights” includes training the neural network in Fig. 3 on $Y_{ia}$, which forward-propagates into the Age-Invariant Identity Loss section of Fig. 3, as discussed in p. 4897, col. 1, second-to-last paragraph, lines 1-3.)
training the neural network on a second dataset to generate a second set of weights; (A second set of weights is taught by the convolution weights on p. 4895, col. 1, lines 4-5 from the end, and by the parameter W on p. 4895, col. 2, throughout the first full paragraph. The BRI of this limitation includes training the neural network from Fig. 3 on $Y_{i}$, as discussed in p. 4897, col. 1, last paragraph, lines 3-4.)
identifying a fixed layer portion and a programmable layer portion in the first layer and arranging the fixed layer portion and the programmable layer portion in parallel, the fixed layer portion having a fixed set of weights, the programmable layer portion having a programmable set of weights; (The BRI of this limitation includes identifying a fixed layer portion and a programmable layer portion arranged in parallel. In Fig. 3 on p. 4895, a fixed convolutional layer portion is disclosed by the box “Convolution Unit (Frozen)” and a programmable convolutional layer portion is disclosed by the box “Convolution Unit”. These portions are arranged in parallel, as discussed in the last sentence of the caption.)
training the neural network on the second dataset to generate a final set of weights (P. 4897, col. 2, final paragraph teaches that the total number of epochs is about 12. A final set of weights is generated at the end of the last training epoch.)
at an inference computer: (The experiments in section 4 from p. 4897, col. 2 to p. 4900 are evidence of an inference computer)
processing, using the fixed layer portion of the first layer of the neural network, input data to generate intermediate feature map data; (Fig. 3 on p. 4895 includes “Convolution Unit (Frozen)” which processes $Y_{ia}$; P. 4895, col. 1, last 3 lines teach generating features; input testing data for $Y_{ia}$ is disclosed by p. 4898, col. 1, § 4.2, lines 4-6.)
processing, using the programmable layer portion of the first layer of the neural network, the input data to generate concatenate feature map data; (Fig. 3 on p. 4895 includes “Convolution Unit” which processes $Y_{i}$; P. 4895, col. 1, last 3 lines teach generating features; and the experimental results in section 4 are evidence of testing data for $Y_{i}$.)
Att'y Dkt: P05152US.family, 18.ARM.32; Application Number: 16/054,358
processing, using the second layer of the neural network, the intermediate feature map data and the concatenate feature map data to generate output feature map data; (Fig. 3 on p. 4895 includes “Latent Identity Analysis” and “Latent Factor FC Layer (Frozen)” which process the feature maps from the frozen and not-frozen convolution units, respectively. P. 4895, col. 2, first full paragraph, lines 3-5 discloses matrix multiplication for the LF-FC layer. Both Fig. 3 and p. 4895, col. 2, first full para., lines 3-5 and 9 teach the LF-FC layer outputs age-invariant features with a dimension of 512. The BRI of “output feature map data” includes any data related to feature maps. The output data was generated by processing feature maps.)
processing, using the third layer of the neural network, the output feature map data to generate output data; and (Fig. 3 on p. 4895 includes “Contrastive” and “Softmax” operations that generate contrastive and softmax losses, which is further discussed at p. 4897, col. 1 last 3 lines; and p. 4897, col. 2, lines 3-4.)
outputting the output data. (Experimental results in § 4 starting on p. 4897 are evidence of outputting the data from the neural network.)
However, Wen does not explicitly teach: training the neural network on the second dataset to generate a final set of weights including at least one of: 
quantizing at least a portion of the fixed set of weights, and 
pruning at least a portion of the fixed set of weights; 
But Han teaches: training the neural network on the second dataset to generate a final set of weights including at least one of: 
quantizing at least a portion of the fixed set of weights, and (P. 1, Abstract, lines 7-8; P. 2, middle dotted box in Fig. 1 and Fig. 1 caption, lines 1-2; P. 3, § 3, paragraphs 1-2 teach that weights are quantized to 4 bins denoted with 4 colors as seen in Fig. 3. The online version of this reference contains colors.)
pruning at least a portion of the fixed set of weights; (P. 1, Abstract, lines 6-7; P. 2, left dotted box in Fig. 1 and Fig. 1 caption, lines 1-2; Pages 2-3, all §2.)
	Han is in the same field of endeavor as the claimed invention, namely, pruning and quantizing CNNs. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have quantized and pruned at least some of the fixed set of weights in the network of Wen. A motivation for the combination is to reduce the storage requirements of the neural network without affecting accuracy. (Han, p. 1, Abstract, lines 4-6)
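	For illustration only, the pruning and quantizing operations relied upon above can be sketched as follows. This is a minimal sketch under stated assumptions and is not Han's implementation: the function names `prune_by_magnitude` and `quantize_shared`, the threshold of 0.1, and the example weight values are hypothetical, and the retraining that would normally follow each step is omitted. The default of 4 clusters mirrors the 4 bins cited from Han, Fig. 3.

```python
def prune_by_magnitude(weights, threshold):
    """Zero every weight whose absolute value is below `threshold`."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def quantize_shared(weights, n_clusters=4, n_iters=10):
    """Replace each nonzero weight with one of `n_clusters` shared values.

    Uses plain 1-D k-means with evenly spaced initial centroids.
    Pruned (zero) weights are left at zero.
    """
    nonzero = [w for w in weights if w != 0.0]
    if not nonzero:
        return list(weights), []
    lo, hi = min(nonzero), max(nonzero)
    if n_clusters == 1 or lo == hi:
        centroids = [(lo + hi) / 2.0]
    else:
        step = (hi - lo) / (n_clusters - 1)
        centroids = [lo + i * step for i in range(n_clusters)]

    def nearest(w):
        # Index of the centroid closest to w.
        return min(range(len(centroids)), key=lambda k: abs(w - centroids[k]))

    for _ in range(n_iters):
        groups = [[] for _ in centroids]
        for w in nonzero:
            groups[nearest(w)].append(w)
        # An empty cluster keeps its previous centroid.
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]

    quantized = [centroids[nearest(w)] if w != 0.0 else 0.0 for w in weights]
    return quantized, centroids


# Hypothetical fixed-portion weights, used only to exercise the sketch.
fixed_weights = [0.9, -0.05, 0.4, 0.02, -0.8, 0.45, -0.75, 0.01]
pruned = prune_by_magnitude(fixed_weights, threshold=0.1)
quantized, codebook = quantize_shared(pruned, n_clusters=4)
```

	In this sketch the storage saving comes from two sources: the zeroed weights need not be stored, and each surviving weight can be stored as a short index into `codebook` rather than as a full-precision value.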
	
	Regarding CLAIM 2, the combination of Wen and Han teaches: The method as claimed in claim 1,
Wen teaches: further comprising identifying similarities of the first set of weights and the second set of weights. (The BRI of this limitation includes a contrastive loss. See top right of Fig. 3; P. 4897, col. 1, last 2 lines; P. 4897, col. 2, line 4.)

	Regarding CLAIM 3, the combination of Wen and Han teaches: The method as claimed in claim 1, 
Wen teaches: further comprising determining that the first dataset is the same domain as the second dataset. (According to p. 4897, col. 2, § 4.1, “Training Data”, both datasets contain facial images.)

	Regarding CLAIM 5, the combination of Wen and Han teaches: The method as claimed in claim 1, 
	Wen teaches: where: the neural network is a convolutional neural network (CNN); and (P. 4894, col. 1, last paragraph, lines 1-3; P. 4895, Fig. 3 and first line of its caption)
the first layer is a convolutional layer. (P. 4895, Fig. 3 shows a first layer comprising a frozen convolution unit and an unfrozen convolution unit; P. 4897, col. 1, § 3.3, lines 1-3 state: “In LF-CNN model, the convolution unit maps a raw input image $F_{img}$ to convolutional feature $F_{conv}$ by $F_{conv} = f(F_{img})$”.)

	Regarding CLAIM 7, the combination of Wen and Han teaches: The method as claimed in claim 1, 
Wen teaches: further comprising: identifying selected layers of the neural network having particular connectivity properties; and (The BRI of this limitation includes the connections of the convolutional neural network having any properties.  On p. 4895, Fig. 3 shows “Convolution Unit” being unfrozen meaning its weights are adjustable. The architecture is further described in the caption below Fig. 3 and p. 4895, § 3.1, ¶ 1-2.)
updating the identified selected layers of the neural network. (The BRI of this limitation includes training the neural network from Fig. 3 on $Y_{i}$, as discussed in p. 4897, col. 1, last paragraph, lines 3-4.)

Regarding CLAIM 24, the combination of Wen and Han teaches: The method as claimed in claim 1,
Wen teaches: the programmable convolutional layer portion of the first layer (“Convolution Unit” in Fig. 3 on p. 4895.)
However, Wen does not explicitly teach: where the programmable convolutional layer portion of the first layer are sparse.
	But Han teaches: where the programmable convolutional layer portion of the first layer are sparse. (P. 3, first two paragraphs)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have pruned Wen’s “Convolution Unit” layers and to have stored the sparse structure that results from pruning using compressed sparse row (CSR) or compressed sparse column (CSC) format. A motivation for the combination is to reduce the number of parameters in the network. (Han p. 2, last line)
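	For illustration only, storing a pruned weight matrix in compressed sparse row (CSR) format can be sketched as follows. This is a minimal sketch of plain CSR with absolute column indices; the function names `to_csr` and `csr_matvec` and the example matrix are hypothetical and are not taken from Han or Wen, and any further index encoding described in Han is not reproduced here.

```python
def to_csr(dense):
    """Convert a dense row-major matrix (list of lists) to CSR arrays."""
    values, col_indices, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0.0:
                values.append(v)       # nonzero weight
                col_indices.append(j)  # its column position
        row_ptr.append(len(values))    # where the next row starts
    return values, col_indices, row_ptr


def csr_matvec(values, col_indices, row_ptr, x):
    """Multiply a CSR matrix by a dense vector x, touching only nonzeros."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_indices[k]]
        y.append(acc)
    return y


# Hypothetical pruned weight matrix with three surviving weights.
pruned = [
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0],
    [-1.5, 0.0, 2.0],
]
values, col_indices, row_ptr = to_csr(pruned)
y = csr_matvec(values, col_indices, row_ptr, [1.0, 2.0, 3.0])
```

	Only the three nonzero weights and their positions are stored, so the storage cost scales with the number of weights that survive pruning rather than with the full size of the layer.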

Claims 15-17, 19, and 26 recite the same features as method claims 1-3, 5, and 24, respectively. Independent claim 15 also recites the training computer includes a memory and a processor, coupled to the memory, that executes instructions stored in the memory, and the inference computer includes a memory and a processor, coupled to the memory, that executes instructions stored in the memory. Wen discloses these additional limitations by the experiments in section 4 from p. 4897, col. 2 to p. 4900, which are evidence for a computer including a memory and a processor. Claims 15-17, 19, and 26 are rejected for the reasons set forth in the rejections of claims 1-3, 5, and 24, respectively.

Claims 6 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Wen et al. (“Latent Factor Guided Convolutional Neural Networks for Age-Invariant Face Recognition”) in view of Han et al. (“Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”) and O’Shea et al. (“An Introduction to Convolutional Neural Networks”, see PTO-892 filed 12/16/2021).

Regarding CLAIM 6, the combination of Wen and Han teaches: The method as claimed in claim 5, 
	Wen teaches: where: the third layer is a fully-connected layer. (Fig. 3 on p. 4895 includes a fully-connected layer “Latent Factor FC Layer (Frozen)”. P. 4895, col. 2, first full paragraph, lines 3-5 discloses matrix multiplication for the LF-FC layer.)
	Although Wen teaches both convolution layers and a fully-connected layer, Wen does not explicitly teach two consecutive convolution layers. Neither Wen nor Han explicitly teaches: where: the second layer is a convolutional layer;
	But O’Shea teaches: where: the second layer is a convolutional layer; and (Fig. 5 on p. 9 shows consecutive convolution layers. A “second layer” is interpreted as the combination of the last convolutional layer and the last pooling layer. This second layer in Fig. 5 receives feature maps from the preceding layer and generates an output feature map. Fig. 5 is discussed at the top of p. 9. O’Shea teaches feature maps/activation maps at p. 6, first and third paragraphs; and p. 8, second paragraph.)
	O’Shea is in the same field of endeavor as the claimed invention, namely, convolutional neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have inserted O’Shea’s last convolutional layer and last pooling layer into Wen’s LF-CNN between the “Convolution Unit” and “Latent Factor FC Layer” in Wen’s Fig. 3. A motivation for the combination is that a network with more layers may express stronger features of the input with fewer parameters. (O’Shea, p. 9, paragraph under Fig. 5, line 9)

Claim 20 recites the same features as method claim 6. Independent claim 15, upon which claim 20 depends, also recites the training computer includes a memory and a processor, coupled to the memory, that executes instructions stored in the memory, and the inference computer includes a memory and a processor, coupled to the memory, that executes instructions stored in the memory. Wen discloses these additional limitations by the experiments in section 4 from p. 4897, col. 2 to p. 4900, which are evidence for a computer including a memory and a processor. Claim 20 is rejected for the reasons set forth in the rejection of claim 6.

Claims 8, 12, 14, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Wen et al. (“Latent Factor Guided Convolutional Neural Networks for Age-Invariant Face Recognition”) in view of O’Shea et al. (“An Introduction to Convolutional Neural Networks”, see PTO-892 filed 12/16/2021) and Han et al. (“Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”).

	Regarding CLAIM 8, Wen teaches: An apparatus, comprising: a memory; and a processor, coupled to the memory, configured to: (The experiments in section 4 from p. 4897, col. 2 to p. 4900 are evidence of a computer comprising memory and a processor)
read, from the memory, a convolutional neural network including at least a first layer, a second layer and a third layer (On p. 4895, Fig. 3 shows the architecture of a latent factor guided CNN (LF-CNN). A first layer as claimed includes the boxes “Convolution Unit” and “Convolution Unit (Frozen)”. The architecture is further described in the caption below Fig. 3 and at p. 4895, § 3.1, ¶ 1-2. Since the BRI of a neural network layer includes any mathematical function, a second layer and a third layer include the functions performed by the Age-Invariant Identity Loss which include “Latent Identity Analysis”, “Latent Factor FC Layer (Frozen)”, “Contrastive”, and “Softmax”. LF-FC is discussed on p. 4895, col. 2, first full paragraph and p. 4897, col. 1, second-to-last paragraph. Latent Identity Analysis is discussed throughout § 3.2 starting on p. 4895.)
the first layer including a fixed convolutional layer portion and a programmable convolution layer portion arranged in parallel, the fixed convolutional layer portion configured to receive input data and generate one or more intermediate feature maps and the programmable convolutional layer portion configured to receive input data and generate one or more concatenate feature maps, (In Fig. 3 on p. 4895, fixed convolutional layer portion is interpreted as the box “Convolution Unit (Frozen)” and programmable convolutional layer portion is interpreted as the box “Convolution Unit”. These portions are arranged in parallel as shown in Fig. 3 and as disclosed by the last sentence of the caption. 
P. 4897, col. 1, § 3.3, lines 1-3 state: “In LF-CNN model, the convolution unit maps a raw input image $F_{img}$ to convolutional feature $F_{conv}$ by $F_{conv} = f(F_{img})$”. P. 4897, col. 2, § 4.1, “Training data” discloses the two types of training data $Y_{i}$ and $Y_{ia}$ (Note: in line 4, $Y$ should be $Y_{ia}$). In Fig. 3, “Convolution Unit” receives input $Y_{i}$ and generates a feature map output, and “Convolution Unit (Frozen)” receives input $Y_{ia}$ and generates a different feature map output.)
the second layer, including a programmable convolutional layer, configured to receive the one or more intermediate feature maps and the one or more concatenate feature maps and generate one or more output feature maps, (Fig. 3 on p. 4895 includes “Latent Identity Analysis” and “Latent Factor FC Layer (Frozen)” which process the feature maps from the frozen and not-frozen convolution units, respectively. P. 4895, col. 2, first full paragraph, lines 3-5 discloses matrix multiplication for the LF-FC layer. Both Fig. 3 and p. 4895, col. 2, first full para., lines 3-5 and 9 teach the LF-FC layer outputs age-invariant features with a dimension of 512. The BRI of “output feature map data” includes any data related to feature maps. The output data was generated by processing feature maps.)
the third layer, including at least one programmable fully-connected layer, configured to receive the output feature maps and generate output data, (Fig. 3 on p. 4895 includes “Contrastive” and “Softmax” operations that generate contrastive and softmax losses, which is further discussed at p. 4897, col. 1 last 3 lines; and p. 4897, col. 2, lines 3-4.)
process, using the first layer of the neural network, the input data to generate the intermediate feature maps and the concatenate feature maps, (P. 4897, col. 1, § 3.3, lines 1-3 state: “In LF-CNN model, the convolution unit maps a raw input image $F_{img}$ to convolutional feature $F_{conv}$ by $F_{conv} = f(F_{img})$”. P. 4897, col. 2, § 4.1, “Training data” discloses the two types of training data $Y_i$ and $Y_{ia}$ (Note: in line 4, $Y$ should be $Y_{ia}$). In Fig. 3, “Convolution Unit” receives input $Y_i$ and generates a feature map output, and “Convolution Unit (Frozen)” receives input $Y_{ia}$ and generates a different feature map output. The claim does not require the first layer to generate both the intermediate feature maps and the concatenate feature maps at the same time.)
process, using the second layer of the neural network, the intermediate feature maps and the concatenate feature maps to generate the output feature maps, (Fig. 3 on p. 4895 includes “Latent Identity Analysis” and “Latent Factor FC Layer (Frozen)” which process the feature maps from the frozen and not-frozen convolution units, respectively. P. 4895, col. 2, first full paragraph, lines 3-5 discloses matrix multiplication for the LF-FC layer. Both Fig. 3 and p. 4895, col. 2, first full para., lines 3-5 and 9 teach that the LF-FC layer outputs age-invariant features with a dimension of 512. The BRI of “output feature map data” includes any data related to feature maps. The output data was generated by processing feature maps.)
process, using the third layer of the neural network, the output feature maps to generate the output data, and (Fig. 3 on p. 4895 includes “Contrastive” and “Softmax” operations that generate contrastive and softmax losses, which is further discussed at p. 4897, col. 1 last 3 lines; and p. 4897, col. 2, lines 3-4.)
output the output data, (Experimental results in § 4 starting on p. 4897 are evidence of outputting the data from the neural network.)
where the fixed convolutional layer portion of the first layer has a fixed set of weights, the programmable convolutional layer portion of the first layer, the second layer and the third layer have a programmable set of weights, (In Fig. 3 on p. 4895, fixed convolutional layer portion is interpreted as the box “Convolution Unit (Frozen)” and programmable convolutional layer portion is interpreted as the box “Convolution Unit”. These portions are arranged in parallel as seen in Fig. 3 and as discussed in the last sentence of the caption. P. 4895, col. 1, lines 4-5 from the end teaches convolution weights.) 
	Wen teaches a CNN with at least 3 layers. Wen also teaches a fully-connected layer with a set of weights at P. 4895, col. 2, first full paragraph, lines 3-5.  However, Wen does not teach two consecutive convolutional layers nor a programmable fully-connected layer. 
Wen does not explicitly teach: the second layer, including a programmable convolutional layer, configured to receive the one or more intermediate feature maps and the one or more concatenate feature maps and generate one or more output feature maps, 
the third layer, including at least one programmable fully-connected layer, configured to receive the output feature maps and generate output data,
where the second layer and the third layer have a programmable set of weights, at least a portion of the fixed set of weights are quantized, and at least a portion of the fixed set of weights are pruned.
	But O’Shea teaches: the second layer, including a programmable convolutional layer, (A “second layer” is interpreted as the combination of the last convolutional layer and the last pooling layer in Fig. 5 on p. 9. This “second layer” in Fig. 5 receives feature maps from the preceding layer and generates an output feature map. On p. 5, the last line states the kernels are learnable. Moreover, feature maps, which O’Shea calls activation maps, are taught on p. 6, first and third paragraphs; and p. 8, second paragraph.)
the third layer, including at least one programmable fully-connected layer, (A “third layer” is interpreted as the second-to-last fully-connected layer in Fig. 5 on p. 9. P. 5, item 4 states: “fully connected layers attempt to produce class scores from the activations, to be used for classification.”)
where the second layer and the third layer have a programmable set of weights (Second layer – On p. 5, the last line states the kernels are learnable. Third layer – On p. 5, item 4 states the fully-connected layers perform the same duties found in standard ANNs. On p. 2, the first paragraph teaches supervised training for a standard ANN. Therefore, the fully-connected layer is trainable.)
	O’Shea is in the same field of endeavor as the claimed invention, namely, convolutional neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have inserted O’Shea’s last convolutional layer, last pooling layer, and first fully-connected layer into Wen’s LF-CNN between the “Convolution Unit” and “Latent Factor FC Layer” in Wen’s Fig. 3. A motivation for the combination is that the network with more layers may express stronger features of the input with fewer parameters. (O’Shea, p. 9, paragraph under Fig. 5, line 9)
	However, neither Wen nor O’Shea explicitly teaches: where at least a portion of the fixed set of weights are quantized, and at least a portion of the fixed set of weights are pruned.
	But Han teaches: where at least a portion of the fixed set of weights are quantized, and (P. 1, Abstract, lines 7-8; P. 2, middle dotted box in Fig. 1 and Fig. 1 caption, lines 1-2; P. 3, § 3, paragraphs 1-2 teach that weights are quantized to 4 bins denoted with 4 colors which is shown in Fig. 3. The online version of this reference contains colors.) 
at least a portion of the fixed set of weights are pruned. (P. 1, Abstract, lines 6-7; P. 2, left dotted box in Fig. 1 and Fig. 1 caption, lines 1-2; Pages 2-3, all §2.)
	Han is in the same field of endeavor as the claimed invention, namely, pruning and quantizing CNNs. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have quantized and pruned at least some of the fixed set of weights in the network of Wen/O’Shea. A motivation for the combination is to reduce the storage requirements of the neural network without affecting accuracy. (Han, p. 1, Abstract, lines 4-6) 
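For illustration only, the pruning and quantization relied upon from Han can be sketched as follows. This is a minimal sketch and not Han's implementation: the function names `prune` and `quantize`, the threshold value, and the use of a plain 1-D k-means with linearly initialized centroids are assumptions made for the example. It shows small-magnitude weights being set to zero and the surviving weights being shared across 4 bins.

```python
def prune(weights, threshold):
    """Zero out weights whose magnitude is below the threshold (hypothetical helper)."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def quantize(weights, num_bins=4, iterations=10):
    """Share the non-zero weights across num_bins centroids using 1-D k-means.

    Pruned (zero) weights stay at zero and take no part in the clustering.
    Returns (quantized_weights, centroids).
    """
    if num_bins < 2:
        raise ValueError("num_bins must be at least 2")
    kept = [w for w in weights if w != 0.0]
    if not kept:
        return list(weights), []
    lo, hi = min(kept), max(kept)
    # Assumption: centroids start evenly spaced over the range of kept weights.
    centroids = [lo + (hi - lo) * k / (num_bins - 1) for k in range(num_bins)]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for w in kept:
            nearest = min(range(len(centroids)), key=lambda k: abs(w - centroids[k]))
            groups[nearest].append(w)
        # An empty bin keeps its previous centroid.
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    quantized = [
        min(centroids, key=lambda c: abs(w - c)) if w != 0.0 else 0.0
        for w in weights
    ]
    return quantized, centroids


weights = [0.05, -0.9, 0.8, -0.02, 0.45, -0.5, 1.0, 0.01]
pruned = prune(weights, threshold=0.1)
quantized, centroids = quantize(pruned, num_bins=4)
```

In this sketch each surviving weight is replaced by one of at most 4 shared values, so only a bin index per weight and a 4-entry table of centroids would need to be stored.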

	Regarding CLAIM 12, the combination of Wen, O’Shea, and Han teaches: The apparatus as claimed in claim 8, 
Wen teaches: where the fixed convolutional layer portion, the programmable convolutional layer portion, the programmable convolutional layer and the programmable fully-connected layer are identified during training. (In Fig. 3 on p. 4895, fixed convolutional layer portion is interpreted as the box labeled “Convolution Unit (Frozen)” and programmable convolutional layer portion is interpreted as the box labeled “Convolution Unit”. Training is taught on p. 4897, col. 2, § 4.1, “Training data”.)
However, Wen does not explicitly teach: where the fixed convolutional layer portion, the programmable convolutional layer portion, the programmable convolutional layer and the programmable fully-connected layer are identified during training.
	But O’Shea teaches: where the fixed convolutional layer portion, the programmable convolutional layer portion, the programmable convolutional layer and the programmable fully-connected layer are identified during training. (The BRI of “identified during training” is that the layers are identified as being programmable. Regarding the programmable convolutional layer, on p. 5, the last line states the kernels are learnable. Regarding the programmable fully-connected layer, on p. 5, item 4 states the fully-connected layers perform the same duties found in standard ANNs. On p. 2, the first paragraph teaches supervised training for a standard ANN. Therefore, the fully-connected layers are trained.)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have identified O’Shea’s last convolution layer, last pooling layer, and first fully-connected layer in Fig. 5 as trainable. A motivation for the combination is to reduce the model’s overall classification error through correct calculation of the output value of each training example during training. (O’Shea, p. 2, ¶ 1, lines 5-7)

	Regarding CLAIM 14, the combination of Wen, O’Shea, and Han teaches: The apparatus as claimed in claim 8,
	Wen teaches: where selected layers of the convolutional neural network have particular connectivity properties. (The BRI of this limitation includes the connections of the convolutional neural network having any properties.  On p. 4895, Fig. 3 shows “Convolution Unit” being unfrozen meaning its weights are adjustable and the “Convolution Unit (Frozen)” being frozen meaning its weights are fixed. The architecture is further described in the caption below Fig. 3 and p. 4895, § 3.1, ¶ 1-2.)

Regarding CLAIM 25, the combination of Wen, O’Shea, and Han teaches: The apparatus as claimed in claim 8,
Wen teaches: the programmable convolutional layer portion of the first layer (“Convolution Unit” in Fig. 3 on p. 4895.)
However, neither Wen nor O’Shea explicitly teaches: where the programmable convolutional layer portion of the first layer are sparse.
	But Han teaches: where the programmable convolutional layer portion of the first layer are sparse. (P. 3, first two paragraphs)
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have pruned Wen’s “Convolution Unit” layers and to have stored the sparse structure that results from pruning using compressed sparse row (CSR) or compressed sparse column (CSC) format. A motivation for the combination is to reduce the number of parameters in the network. (Han p. 2, last line)
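As an illustrative sketch of the compressed sparse row (CSR) storage mentioned above, the following shows the textbook CSR layout for a pruned weight matrix. The helper names `to_csr` and `csr_matvec` are hypothetical, and the layout shown uses absolute column indices, which is an assumption and may differ from the exact index encoding used in Han.

```python
def to_csr(matrix):
    """Convert a dense row-major matrix (list of lists) to CSR arrays.

    Returns (values, col_indices, row_ptr), where row r occupies
    values[row_ptr[r]:row_ptr[r + 1]].
    """
    values, col_indices, row_ptr = [], [], [0]
    for row in matrix:
        for j, w in enumerate(row):
            if w != 0.0:
                values.append(w)
                col_indices.append(j)
        row_ptr.append(len(values))
    return values, col_indices, row_ptr


def csr_matvec(values, col_indices, row_ptr, x):
    """Multiply a CSR matrix by a dense vector, touching only stored weights."""
    out = []
    for r in range(len(row_ptr) - 1):
        start, end = row_ptr[r], row_ptr[r + 1]
        out.append(sum(values[k] * x[col_indices[k]] for k in range(start, end)))
    return out


pruned_weights = [
    [0.0, 2.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
values, col_indices, row_ptr = to_csr(pruned_weights)
result = csr_matvec(values, col_indices, row_ptr, [1.0, 2.0, 3.0])
```

Only the two surviving weights are stored in this example, and the all-zero third row costs a single repeated entry in `row_ptr`.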

Claims 22-23 are rejected under 35 U.S.C. 103 as being unpatentable over Wen et al. (“Latent Factor Guided Convolutional Neural Networks for Age-Invariant Face Recognition”) in view of Han et al. (“Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”) and El-Yaniv et al. (US 20170286830 A1, see PTO-892 filed 12/16/2021).

Regarding CLAIM 22, the combination of Wen and Han teaches: The method as claimed in claim 1, 
However, neither Wen nor Han explicitly teaches: where the training computer and the inference computer are the same computer. 
	But El-Yaniv teaches: where the training computer and the inference computer are the same computer. (¶ [0054]-[0055] and Fig. 2 disclose a computing device used for both training and inferencing.)
	El-Yaniv is in the same field of endeavor as the claimed invention, namely, neural networks. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used the same computer for both training and inferencing. A motivation for the combination is to use the same resources for both tasks of training and inferencing. (¶ [0054], lines 1-5 and ¶ [0055], lines 1-5)
	
Claim 23 recites the same features as method claim 22. Independent claim 15, upon which claim 23 depends, also recites the training computer includes a memory and a processor, coupled to the memory, that executes instructions stored in the memory, and the inference computer includes a memory and a processor, coupled to the memory, that executes instructions stored in the memory. Wen discloses these additional limitations by the experiments in section 4 from p. 4897, col. 2 to p. 4900, which are evidence for a computer including a memory and a processor. Claim 23 is rejected for the reasons set forth in the rejection of claim 22.

Claim 27 is rejected under 35 U.S.C. 103 as being unpatentable over Wen et al. (“Latent Factor Guided Convolutional Neural Networks for Age-Invariant Face Recognition”) in view of O’Shea et al. (“An Introduction to Convolutional Neural Networks”, see PTO-892 filed 12/16/2021), Han et al. (“Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”), and Hotson et al. (US 20180211128 A1).

Regarding CLAIM 27, the combination of Wen, O’Shea, and Han teaches: The apparatus as claimed in claim 8,
However, neither Wen nor Han explicitly teaches: where said process using the second layer of the neural network includes: concatenate the intermediate feature maps and the concatenate feature maps to generate one or more input feature maps; and 
input the one or more input feature maps to the programmable convolutional layer of the second layer to generate the one or more output feature maps.
	But O’Shea teaches: input the one or more input feature maps to the programmable convolutional layer of the second layer to generate the one or more output feature maps. (A “second layer” is interpreted as the combination of the last convolutional layer and the last pooling layer in Fig. 5 on p. 9. This “second layer” in Fig. 5 receives input feature maps from the preceding layer and generates an output feature map. Moreover, feature maps, which O’Shea calls activation maps, are taught on p. 6, first and third paragraphs; and p. 8, second paragraph.)
	However, none of Wen, O’Shea, and Han explicitly teaches: where said process using the second layer of the neural network includes: concatenate the intermediate feature maps and the concatenate feature maps to generate one or more input feature maps; and 
	But Hotson teaches: where said process using the second layer of the neural network includes: concatenate the intermediate feature maps and the concatenate feature maps to generate one or more input feature maps; and (Taught by all of ¶ [0020], [0042], [0052] and the concatenation of Image 0 and Depth 0 in Fig. 6. ¶ [0020], line 3 discloses parallel feature maps.)
	Hotson is in the same field of endeavor as the claimed invention, namely, convolutional neural networks. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have used Hotson’s system of concatenating parallel feature maps to concatenate the feature maps output by Wen’s frozen and unfrozen convolutional units. A motivation for the combination is that the use of concatenated feature maps may significantly improve object detection in the case of poor quality for one type of sensor data. (Hotson, ¶ [0022], last 4 lines)
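For illustration only, concatenating parallel feature maps as in the proposed combination can be sketched as below. The representation (each feature map as a 2-D list, each branch output as a list of such maps) and the function name `concat_feature_maps` are assumptions made for the example and are not taken from Hotson or Wen.

```python
def concat_feature_maps(maps_a, maps_b):
    """Concatenate two lists of 2-D feature maps along the channel axis.

    Each argument is a list of channels; each channel is a list of rows.
    All channels must share the same height and width.
    """
    def shape(channel):
        return (len(channel), len(channel[0]) if channel else 0)

    shapes = {shape(ch) for ch in maps_a + maps_b}
    if len(shapes) > 1:
        raise ValueError("feature maps must share spatial dimensions")
    # Channel-wise concatenation: branch A's channels followed by branch B's.
    return maps_a + maps_b


# Hypothetical outputs of a trainable branch (1 channel) and a frozen branch (2 channels).
trainable_out = [[[1, 2], [3, 4]]]
frozen_out = [[[5, 6], [7, 8]], [[9, 10], [11, 12]]]
combined = concat_feature_maps(trainable_out, frozen_out)
```

The combined list has three channels of size 2x2 and could then serve as the input feature maps to a subsequent convolutional layer.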

Response to Arguments
Examiner herein responds to Applicant’s remarks and claim amendments dated 01/24/2022, filed in response to the non-final rejection dated 12/16/2021.

Objections to the Claims: The objection to claim 8 is withdrawn due to the claim amendments.

Claim Rejections Under 35 U.S.C. § 101: Applicant’s arguments with respect to claims 8-10, 12, and 14 have been fully considered and are persuasive.  The rejections of claims 8-10, 12, and 14 have been withdrawn. 

Claim Rejections Under 35 U.S.C. § 103: Applicant’s arguments with respect to claims 1-3, 5-10, 12, 14-17, and 19-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Sarwar et al. (“Gabor Filter Assisted Energy Efficient Fast Learning Convolutional Neural Networks”), on p. 3, col. 2, first paragraph, teaches a blended CNN configuration where the 2nd convolutional layer is partly trained with a combination of fixed Gabor filter kernels and regular trainable weight kernels.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Asher H. Jablon whose telephone number is (571)270-7648. The examiner can normally be reached Monday - Friday, 9:00 am - 6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Al Kawsar, can be reached at (571)270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ASHER H. JABLON/Examiner, Art Unit 2127

/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2127