DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are presented for examination.

Response to Amendment
Applicant’s amendment has obviated most, but not all, of the objections to the specification, drawings, and claims given in the last Office Action.  To the extent that an objection or rejection appears in the previous Office Action(s) but not this Office Action, that objection or rejection is withdrawn.  To the extent that it appears both in the previous Office Action(s) and this Office Action, the objection or rejection is maintained.
Applicant’s amendment has also obviated the rejection under 35 USC § 112(b) of claim 13.  Therefore, that rejection is withdrawn.

Specification
The disclosure is objected to because of the following informalities: 
In paragraph 90, “assign to the PEs” should be “assign them to the PEs”.
In paragraph 118, “example only” should be “examples only”.
Appropriate correction is required.

Claim Objections
Claim 12 is objected to because of the following informalities:  “plurality of processing element” should be “plurality of processing elements”.  Note that Applicant’s amendment is not sufficient to cure this error because additions to claims must be underlined, not enclosed in single brackets.  See 37 CFR § 1.121(c)(2).  Claim 13 is objected to for dependency on claim 12.  Appropriate correction is required.

Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-7, 10-12, 14, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Sze et al., “Efficient Processing of Deep Neural Networks: A Tutorial and Survey,” in 105(12) Proc. IEEE 2295-2329 (2017) (“Sze”) in view of Achlioptas, “Database-Friendly Random Projections,” in Proc. 20th ACM SIGMOD-SIGACT-SIGART Symp. Principles of Database Sys. 274-81 (2001) (“Achlioptas”).
Regarding claim 1, Sze discloses “[a] system for dynamic sparse execution of a neural network, comprising:
at least one global buffer configured to receive inputs for the neural network (Sze Fig. 25 shows a global buffer connected to processing elements; p. 2311, first two paragraphs on right-hand column disclose that the filter weights and input activations are read from the global buffer, processed by MAC units, and the resulting partial sums [inputs] are put back into the global buffer; see also Fig. 31); 
a plurality of processing elements configured to execute activation functions for nodes of the neural network (Sze p. 2311, third paragraph indicates that in one type of neural network accelerator, each processing element handles the processing for each output activation value by fetching the corresponding input activations from neighboring PEs; see also Figs. 25 (showing processing elements connected to the global buffer), 11 (showing various activation functions used in CNNs), 31 (showing that the result of calculations by the PEs is sent to a ReLU (activation function) unit to generate an output feature map (output))); and 
at least one processor (Sze Fig. 31 contains an RLC decoder to decode the input feature map and an RLC encoder to encode the output feature map, which collectively comprise a processor) configured to: 
… reduce at least one dimension of the inputs from the at least one global buffer and generate a corresponding predictable output neuron map1 for use by the plurality of processing elements (in a CNN, a variety of computations that reduce the dimensionality of a feature map are referred to as pooling; a stride of greater than one is typically used so that there is a reduction in the dimension of the representation [feature map] – Sze, paragraph spanning pp. 2302-03; see also Fig. 10 (showing that the output of the pooling layer goes either to another CONV layer or to a fully connected layer – i.e., to another processing element), Figs. 22, 31 (showing that the global buffer exchanges data with the PEs), p. 2312, last paragraph (disclosing that the input fmap decoder unit is a compression unit)), and 
receive outputs from the plurality of processing elements (Sze Fig. 31 shows an RLC encoder that receives output from a ReLU unit, which receives outputs from the PEs via the global buffer), reduce at least one dimension of the outputs (Sze p. 2312, last paragraph, discloses that the fmap units are compression [dimensionality reduction] units; see also paragraph spanning pp. 2302-03 (disclosing that the pooling layer of the CNN reduces the dimension of the feature maps), Fig. 31 (showing the RLC encoder that compresses the output feature map)), and update the corresponding predictable output neuron map for use by the plurality of processing elements based on the reduced outputs (Sze p. 2312, last paragraph and Fig. 31 disclose that the chip that contains the RLC decoder and encoder communicates with an off-chip DRAM using a 64-b bidirectional data bus [i.e., data, including the output feature map, may flow from the RLC encoder to the DRAM and back to the RLC decoder for further decoding/updating of the feature map]; see also p. 2302, first full paragraph (disclosing that the output feature map is calculated by passing a stack of filters over an input feature map [thereby updating the feature map])).”
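For illustration only, the dimensionality reduction attributed to pooling in the citations above can be sketched as follows.  This is a minimal sketch assuming 2x2 max pooling with a stride of 2 on a small feature map held as nested lists; the function name max_pool2d, the window size, and the sample values are hypothetical and are not taken from Sze.

```python
def max_pool2d(fmap, window=2, stride=2):
    """Max-pool a 2-D feature map given as a list of equal-length rows."""
    rows, cols = len(fmap), len(fmap[0])
    pooled = []
    for r in range(0, rows - window + 1, stride):
        out_row = []
        for c in range(0, cols - window + 1, stride):
            # Collect the window anchored at (r, c) and keep its maximum.
            patch = [fmap[r + i][c + j]
                     for i in range(window) for j in range(window)]
            out_row.append(max(patch))
        pooled.append(out_row)
    return pooled


# Hypothetical 4x4 feature map; a stride of 2 halves each spatial dimension.
fmap = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 7],
    [5, 1, 3, 8],
]
pooled = max_pool2d(fmap)
print(pooled)  # [[6, 2], [5, 9]]
```

With a stride greater than one, the 4x4 map becomes a 2x2 map, which is the sense in which the cited passage describes a reduction in the dimension of the representation.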
Sze appears not to disclose explicitly the remaining limitations of the claim.  However, Achlioptas discloses “execut[ing] ternary random projection to reduce at least one dimension of the inputs (given a high-dimensional pointset, the pointset could be embedded into a lower dimensional space without suffering great distortion – Achlioptas, sec. 1, first two paragraphs; one can replace projections onto random hyperplanes with simpler and faster operations, requiring extremely simple probability distributions such as sqrt(3) with probability 1/6, 0 with probability 2/3, and –sqrt(3) with probability 1/6 [ternary random projection] – id. at sec. 1.1, first three paragraphs and Theorem 2)….”
Achlioptas and the instant application both relate to dimensionality reduction of datasets and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Sze to use ternary random projection to reduce the inputs’ dimensionality, as disclosed by Achlioptas, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would render the operation of dimensionality reduction simpler and faster relative to projection onto random hyperplanes without any sacrifice in the quality of the embedding.  See Achlioptas, sec. 1.1, third paragraph.
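For illustration only, the probability distribution quoted from Achlioptas above (sqrt(3) with probability 1/6, 0 with probability 2/3, and -sqrt(3) with probability 1/6) can be sketched as a projection of a single vector.  This is a minimal sketch, not an implementation from Achlioptas or from the instant application; the function names, the dimensions d and k, the 1/sqrt(k) scaling convention, and the fixed random seed are assumptions made for the example.

```python
import math
import random


def ternary_projection_matrix(d, k, rng):
    """Return a k x d matrix whose entries are sqrt(3), 0, or -sqrt(3)
    with probabilities 1/6, 2/3, and 1/6 respectively."""
    s = math.sqrt(3.0)
    values = [s, 0.0, -s]
    weights = [1, 4, 1]  # relative weights for 1/6, 2/3, 1/6
    return [rng.choices(values, weights=weights, k=d) for _ in range(k)]


def project(x, matrix):
    """Map a d-dimensional vector to k dimensions, scaled by 1/sqrt(k)."""
    scale = 1.0 / math.sqrt(len(matrix))
    return [scale * sum(r * v for r, v in zip(row, x)) for row in matrix]


rng = random.Random(0)      # fixed seed (assumption) for repeatability
d, k = 1000, 200            # hypothetical original and reduced dimensions
R = ternary_projection_matrix(d, k, rng)
x = [rng.gauss(0.0, 1.0) for _ in range(d)]
y = project(x, R)

# The squared length of y should be close to that of x.
ratio = sum(v * v for v in y) / sum(v * v for v in x)
print(len(y), round(ratio, 2))
```

Because roughly two thirds of the matrix entries are zero and the rest differ only in sign, the projection needs only additions and subtractions followed by one scaling, which is the simplification the cited passage describes.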
  
Regarding claim 2, Sze, as modified by Achlioptas, discloses that “the at least one processor iteratively receives current outputs from the plurality of processing elements (Sze Fig. 31 shows that the RLC encoder receives outputs from the ReLU unit, which in turn receives output from the global buffer that received partial sums from the processing elements), reduces at least one dimension of the current outputs (Sze p. 2312, last full paragraph and Fig. 31 show that the RLC encoder compresses the output feature map; see also paragraph spanning pp. 2302-03 (disclosing that the pooling layer of a convolutional network reduces the dimensionality of the feature map)), and updates the corresponding predictable output neuron map, based on the reduced current outputs (Sze p. 2312, last paragraph and Fig. 31 disclose that the chip that contains the RLC decoder and encoder communicates with an off-chip DRAM using a 64-b bidirectional data bus [i.e., data, including the output feature map, may flow from the RLC encoder to the DRAM and back to the RLC decoder for further decoding/updating of the feature map]; see also p. 2302, first full paragraph (disclosing that the output feature map is calculated by passing a stack of filters over an input feature map [thereby updating the feature map])), for use by the plurality of processing elements in generating next outputs until the plurality of processing elements have executed each layer of the neural network (Sze Fig. 31 shows that control flow may proceed from the processing elements to the global buffer to the ReLU unit to the RLC encoder and to the DRAM, then from the DRAM to the RLC encoder, back to the global buffer, and back to the processing elements [i.e., the reduced output feature map produced by the RLC encoder and the PEs of the last layer are fed back into the PEs for calculation of a next layer]; see also paragraph spanning pp. 2312-13 (disclosing that the fixed-size PE array may accommodate different layer shapes – i.e., the PE array is iteratively used to process each layer until all layers are processed)).”  
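For illustration only, the compression attributed above to the RLC encoder that follows the ReLU unit can be sketched as a zero run-length code.  This is a simplified sketch under stated assumptions: the (zeros_before, value) pair format, the function names, and the sample partial sums are hypothetical, and the sketch does not reproduce the bit-level format of the chip shown in Sze Fig. 31.

```python
def relu(values):
    """Replace negative values with zero."""
    return [v if v > 0 else 0 for v in values]


def rlc_encode(values):
    """Encode a sequence as (zeros_before, nonzero_value) pairs plus the
    number of zeros that trail the last nonzero value."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    return pairs, run


def rlc_decode(pairs, trailing_zeros):
    """Invert rlc_encode."""
    out = []
    for run, v in pairs:
        out.extend([0] * run)
        out.append(v)
    out.extend([0] * trailing_zeros)
    return out


partial_sums = [-2, 5, -1, -7, 0, 3, -4, -4]   # hypothetical PE outputs
activations = relu(partial_sums)                # [0, 5, 0, 0, 0, 3, 0, 0]
pairs, tail = rlc_encode(activations)
print(pairs, tail)                              # [(1, 5), (3, 3)] 2
```

The eight activations are stored as two pairs and a trailing count, and rlc_decode restores the original sequence before it is handed back for processing of a next layer.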

Regarding claim 3, Sze, as modified by Achlioptas, discloses that “each processing element comprises a control logic and a multiply-accumulate accelerator (fundamental component of both the CONV and the fully connected layers of a CNN are multiply-and-accumulate operations – Sze, p. 2307, first full paragraph on right-hand column; cost of chip depends on area efficiency, which accounts for the amount of control logic – id. at p. 2324, third bullet point; filter weights and input activations may be processed by MAC units, and the resulting sums or output activations are put back into the global buffer (implying the existence of control logic in the PE to perform these calculations) – id. at p. 2311, second full paragraph on right-hand column; see also Figs. 25 (showing that each PE performs a multiply-and-accumulate operation), 31 (showing that each PE has a MAC operation and a control)).”

Regarding claim 4, Sze, as modified by Achlioptas, discloses that “the at least one processor comprises a plurality of adder trees (one example of an accelerator reads input activations and filter weights from a buffer and processes them through MAC units with custom adder trees – Sze, p. 2311, second full paragraph on right-hand column).”  
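For illustration only, the multiply-and-accumulate operation and the adder tree referenced in the rejections of claims 3 and 4 can be sketched in software.  This is a minimal sketch assuming a binary tree that sums products pairwise, level by level; the function names and sample values are hypothetical, and the sketch does not model the custom adder trees of any particular accelerator described in Sze.

```python
def adder_tree_sum(values):
    """Sum values pairwise, level by level, as a binary adder tree would."""
    level = list(values)
    if not level:
        return 0
    while len(level) > 1:
        # Each adder at this level combines two neighbouring inputs.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])   # an odd element passes through unchanged
        level = nxt
    return level[0]


def mac_with_adder_tree(weights, activations):
    """Multiply weights by activations, then accumulate through the tree."""
    products = [w * a for w, a in zip(weights, activations)]
    return adder_tree_sum(products)


weights = [1, -2, 3, 4, 0]       # hypothetical filter weights
activations = [2, 1, 0, 5, 9]    # hypothetical input activations
result = mac_with_adder_tree(weights, activations)
print(result)  # 20
```

The multiplications are independent of one another, and the tree reduces n products in roughly log2(n) addition levels rather than n - 1 sequential additions.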

Regarding claim 5, Sze, as modified by Achlioptas, discloses that “the global buffer is further configured to transmit the predictable output neuron map from the at least one processor to the plurality of processing elements and to transmit the outputs from the plurality of processing elements to the at least one processor (in one example of a neural network accelerator, an output of an input feature map compression unit is fed into the global buffer, which is then sent to the PE array [processing elements]; the global buffer then sends the output to a ReLU unit, which is then sent to an output feature map compression unit [the compression units and ReLU collectively comprise a processor] – Sze, last full paragraph on p. 2312 and Fig. 31).”  

Regarding claim 6, Sze, as modified by Achlioptas, discloses that “the plurality of processing elements are organized in an array along a first dimension and a second dimension (Sze Fig. 31 shows that at least one neural network accelerator has a processing element array arranged in two dimensions).”  

Regarding claim 7, Sze, as modified by Achlioptas, discloses that “the plurality of processing elements share a first bus along the first dimension and communicate with the global buffer using a second bus along the second dimension (one neural network accelerator chip communicates with off-chip DRAM using a 64-b bidirectional data bus to fetch data into the global buffer – Sze, p. 2312, last full paragraph; Fig. 31 shows that each PE in a row communicates with other PEs in the same row via another set of horizontal buses).”  

Regarding claim 10, Sze, as modified by Achlioptas, discloses that “the plurality of processing elements and the at least one processor are configured to execute instructions in parallel (multiply-and-accumulate operations can be easily parallelized; highly-parallel compute paradigms are commonly used, including both spatial and temporal architectures – Sze, p. 2307, first full paragraph on right-hand column).”  

Regarding claim 11, Sze, as modified by Achlioptas, discloses that “the at least one processor reduces at least one dimension of the outputs and updates the corresponding predictable output neuron map concurrently2 with execution of one or more of the activation functions by the plurality of processing elements (Sze Fig. 31 shows that the ReLU unit [here considered one of the processing elements] applies a ReLU function to the partial sums and passes the result to the RLC encoder [part of the processor] which compresses the output, thereby updating the output feature map; since the RLC encoder performs the compression directly on the results of the ReLU operation, the two operations occur concurrently).”  

Regarding claim 12, Sze, as modified by Achlioptas, discloses that “the at least one processor re-assigns the nodes to the plurality of processing element[s] whenever the predictable output neuron map is updated (each CONV layer in a CNN is composed of high-dimensional convolutions; the input activations of a layer are structured as a set of input feature maps that are convolved with a 2-D filter; the result of this computation is output activations that comprise one channel of an output feature map [i.e., once the input feature map is updated to become the output feature map, the active nodes/neurons become those of the next layer] – Sze, p. 2302, first full paragraph; a fixed-size PE array can be used to accommodate different layer shapes [i.e., the same PEs are used to calculate each layer] – id. at paragraph spanning pp. 2312-13).”  

Regarding claim 14, Sze, as modified by Achlioptas, discloses “a quantizer configured to truncate the inputs before reducing at least one dimension of the inputs (Sze Fig. 39 shows that each MAC contains a quantizer after the multiply-and-accumulate operation [since this operation occurs in each PE, where computation for all neural network layers takes place, quantization in the PEs performing the operations of the CONV layer may occur before the processing by the PEs that perform the operations of the max pooling layer]; see also p. 2317, first paragraph (disclosing that the quantization may be fixed or variable)).”  

Regarding claim 16, Sze, as modified by Achlioptas, discloses that “the global buffer receives the inputs from a memory that is on a different chip from the global buffer (Sze Fig. 31 and p. 2312, last full paragraph disclose that in at least one neural network accelerator, an off-chip DRAM [memory] communicates with the chip using a 64-b bidirectional data bus to fetch data into the global buffer).”  

Regarding claim 17, Sze, as modified by Achlioptas, discloses that “the global buffer is further configured to transmit final outputs to the memory (Sze Fig. 31 and p. 2312, last full paragraph disclose that in at least one neural network accelerator, an off-chip DRAM [memory] communicates with the chip using a 64-b bidirectional data bus to fetch data into the global buffer; Fig. 31 also shows that the global buffer sends the output to a ReLU unit, whose output is sent to an RLC encoder, which is then sent to the off-chip DRAM).”

Regarding claim 18, Sze, as modified by Achlioptas, discloses that “the plurality of processing elements further comprise local buffers for storing inputs and outputs (Sze p. 2310, last paragraph discloses that the weights [inputs] may be stored in a register file (RF) [local buffer] in the PE; p. 2311, second full paragraph discloses that the accumulation of partial sums for the same output activation value is kept local in the RF).”

Regarding claim 20, Sze discloses “[a] non-transitory computer-readable storage medium storing a set of instructions that is executable by a computing device to cause the computing device to perform a method for dynamic sparse execution of a neural network (Sze Fig. 31 and p. 2312, last full paragraph disclose a neural network accelerator consisting of a processing element (PE) array, a global buffer, and ReLU and feature map compression units connected to an off-chip DRAM [non-transitory computer-readable medium]), the method comprising: 
providing, via a buffer, inputs for a neural network to at least one processor (Sze Fig. 31 shows that a global buffer provides a filter, an input feature map, and partial sums [inputs] to the PEs [the PEs, RLC decoder, RLC encoder, and ReLU unit collectively comprise a processor]); …
generating, via the at least one processor, a corresponding predictable output neuron map (in a CNN, a variety of computations that reduce the dimensionality of a feature map are referred to as pooling; a stride of greater than one is typically used so that there is a reduction in the dimension of the representation [output map] – Sze, paragraph spanning pp. 2302-03; see also p. 2312, last paragraph (disclosing that the input fmap decoder unit is a compression unit)); 
executing, via a plurality of processing elements, one or more first activation functions of the neural network using the reduced inputs to generate first outputs (Sze p. 2311, third paragraph indicates that each processing element may handle the processing for each output activation value by fetching the corresponding input activations from neighboring PEs; see also Figs. 25 (showing processing elements connected to the global buffer), 11 (showing various activation functions used in CNNs), 31 (showing that the result of calculations by the PEs is sent to a ReLU (activation function) unit to generate an output feature map (output))); 
providing, via the buffer, the first outputs to the at least one processor (Sze Fig. 31 shows that the partial sum outputs of the PE array are sent to the global buffer, which then provides those outputs to the ReLU unit and the RLC encoder [part of the processor]); 
reducing, via the at least one processor, at least one dimension of the first outputs (Sze p. 2312, last paragraph and Fig. 31 show that the outputs, after passing through the ReLU unit, pass through an RLC encoder that compresses [reduces a dimension of] the output feature map; see also paragraph spanning pp. 2302-03 (disclosing that pooling is used to reduce the dimensionality of the feature map)); 
updating, via the at least one processor, the corresponding predictable output neuron map based on the reduced first outputs (Sze p. 2312, last paragraph and Fig. 31 disclose that the chip that contains the RLC decoder and encoder communicates with an off-chip DRAM using a 64-b bidirectional data bus [i.e., data, including the output feature map, may flow from the RLC encoder to the DRAM and back to the RLC decoder for further decoding/updating of the feature map]; see also p. 2302, first full paragraph (disclosing that the output feature map is calculated by passing a stack of filters over an input feature map [thereby updating the feature map])); and
executing, via the plurality of processing elements, one or more second activation functions of the neural network using the reduced first outputs to generate second outputs (Sze p. 2311, third paragraph indicates that each processing element may handle the processing for each output activation value by fetching the corresponding input activations from neighboring PEs [i.e., execute the activation function]; see also Figs. 25 (showing processing elements connected to the global buffer), 11 (showing various activation functions used in CNNs), 31 (showing that the (reduced) input feature map is passed from the RLC decoder to the global buffer and from the global buffer to the PE array and that the outputs are passed through a ReLU (activation function) layer to produce second outputs)).”
Sze appears not to disclose explicitly the further limitations of the claim.  However, Achlioptas discloses “executing, via the at least one processor, ternary random projection to reduce at least one dimension of the inputs (given a high-dimensional pointset, the pointset could be embedded into a lower dimensional space without suffering great distortion – Achlioptas, sec. 1, first two paragraphs; one can replace projections onto random hyperplanes with simpler and faster operations, requiring extremely simple probability distributions such as sqrt(3) with probability 1/6, 0 with probability 2/3, and –sqrt(3) with probability 1/6 [ternary random projection] – id. at sec. 1.1, first three paragraphs and Theorem 2)….”
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Sze to use ternary random projection to reduce the inputs’ dimensionality, as disclosed by Achlioptas, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would render the operation of dimensionality reduction simpler and faster relative to projection onto random hyperplanes without any sacrifice in the quality of the embedding.  See Achlioptas, sec. 1.1, third paragraph.

Claim 19 is a method claim corresponding to non-transitory computer-readable medium claim 20 and is rejected for the same reasons as given in the rejection of that claim.

Claims 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Sze in view of Achlioptas and further in view of Lee et al. (US 20200285944) (“Lee”).
Regarding claim 8, Sze, as modified by Achlioptas, discloses “reduced inputs (Sze Fig. 31 and p. 2312, last full paragraph disclose an RLC decoder and RLC encoder that function as compression units; paragraph spanning pp. 2302-03 discloses that the pooling layer of a CNN reduces the dimensionality of the feature map [input to next layer]).”  
Neither Sze nor Achlioptas appears to disclose explicitly the further limitations of the claim.  However, Lee discloses that “the at least one processor further comprises a systolic array configured to reduce at least one dimension of a set of weights for the neural network based on the … inputs (neural network may be implemented by processing element arrays (PE arrays) [systolic arrays] – Lee, paragraph 41; the operation of each graph convolutional layer is a propagation function of a feature matrix for the neural network’s previous layer [i.e., input into subsequent layer]; the weight matrix in the propagation function may be reduced in dimensionality to correspond to the number of features at the next layer [i.e., the dimensionality of the layer into which the matrix is being input determines how the matrix is reduced] – id. at paragraph 63).”  
Lee and the instant application both relate to dimensionality reduction in neural networks and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Sze and Achlioptas to reduce the dimensionality of the weights based on the inputs, as disclosed by Lee, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would reduce the number of operations that the system would need to perform.  See Lee, paragraph 63.

Regarding claim 9, Sze, as modified by Achlioptas, discloses “reduced outputs (Sze Fig. 31 and p. 2312, last full paragraph disclose an RLC decoder and RLC encoder that function as compression units; paragraph spanning pp. 2302-03 discloses that the pooling layer of a CNN reduces the dimensionality of the feature map [output of present layer]).”
Neither Sze nor Achlioptas appears to disclose explicitly the further limitations of the claim.  However, Lee discloses that “the systolic array is further configured to reduce at least one dimension of the set of weights for the neural network based on the … outputs (neural network may be implemented by processing element arrays (PE arrays) [systolic arrays] – Lee, paragraph 41; the operation of each graph convolutional layer is a propagation function of a feature matrix for the neural network’s previous layer [i.e., output of previous layer]; the weight matrix in the propagation function may be reduced in dimensionality to correspond to the number of features at the next layer [i.e., the initial dimensionality of the output determines how the matrix is reduced] – id. at paragraph 63).” 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Sze and Achlioptas to reduce the dimensionality of the network weights based on an output, as disclosed by Lee, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would reduce the number of operations that the system would need to perform.  See Lee, paragraph 63.

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Sze in view of Achlioptas and further in view of Proshin et al. (US 10783433) (“Proshin”).
Regarding claim 13, Sze, as modified by Achlioptas, discloses that “re-assigning comprises grouping the activation functions based on the predictable output neuron map (in a CNN, a nonlinear activation function is typically applied after each CONV or FC layer [i.e., the activation functions are grouped by layer] – Sze, p. 2302, last full paragraph; see also Fig. 10 (showing that each CONV layer contains convolution, non-linearity, normalization, and pooling operations, so that the activation function of the next layer is based on the feature map produced by pooling in the previous layer))….” 
Proshin discloses “grouping the activation functions … such that each group has the same number of calculations (method for training and self-organization of a neural network includes dividing a set of adjustable parameters p, which includes activation function parameters, into several groups of n parameters for each factor that have to be calculated simultaneously in m points [since each group has n parameters, the same number of calculations is involved for each activation function parameter group] – Proshin, col. 8, ll. 47-53).”
Proshin and the instant application both relate to neural networks and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Sze and Achlioptas to group the activation functions so that the same number of calculations is performed in each group, as disclosed by Proshin, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would allow for multiple activation functions to be computed in parallel, thereby saving processing time.  See Proshin, col. 6, ll. 23-28 (disclosing that a subset of parameters from the entire parameter set may be calculated in parallel).
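For illustration only, the grouping discussed in the rejection of claim 13 can be sketched as a partition of active output neurons into groups of equal size.  This sketch rests on assumptions that appear in neither reference: that the map can be read as a bitmask with 1 marking a neuron to be computed, and that every active neuron costs the same number of calculations.  The names pon_map and group_active_neurons are hypothetical.

```python
def group_active_neurons(pon_map, group_size):
    """Split the indices of active neurons (map entry == 1) into
    consecutive groups of group_size indices each.

    The final group is shorter when the number of active neurons is not
    a multiple of group_size.
    """
    active = [i for i, bit in enumerate(pon_map) if bit]
    return [active[i:i + group_size]
            for i in range(0, len(active), group_size)]


pon_map = [1, 0, 1, 1, 0, 0, 1, 1, 0, 1]   # hypothetical bitmask
groups = group_active_neurons(pon_map, group_size=2)
print(groups)  # [[0, 2], [3, 6], [7, 9]]
```

Under the equal-cost assumption, each group of two indices represents the same amount of work, so the groups could be dispatched to separate processing elements and finish at about the same time.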

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Sze in view of Achlioptas and further in view of Yao (US 20180046894) (“Yao”).
Regarding claim 15, the rejection of claim 14 is incorporated.  Sze further discloses truncation from fixed-point values (using dynamic fixed point, the bitwidth can be reduced [truncated] to 8 b for the weights and 10 b for the activations without any fine tuning of the weights; both can reach 8 b with fine tuning of the weights – Sze, paragraph spanning pp. 2317-18).
Neither Sze nor Achlioptas appears to disclose explicitly the further limitations of the claim.  However, Yao discloses that “the truncation comprises a truncation from 16-bit … values to 4-bit fixed-point values (fixed-point quantizing may comprise converting 16-bit floating point numbers into 4-bit fixed-point numbers – Yao, claim 6).”  
Yao and the instant application both relate to hardware acceleration of neural networks and are analogous.  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Sze and Achlioptas to truncate the values from 16-bit to 4-bit, as disclosed by Yao, and an ordinary artisan could reasonably expect to have done so successfully.  Doing so would significantly reduce memory footprint and computation resources.  See Yao, paragraph 94.
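For illustration only, a truncation from a 16-bit value to a 4-bit value can be sketched as keeping the four most significant bits.  This is a minimal sketch assuming the 16-bit input is a signed two's-complement integer; the function name and sample values are hypothetical.  The conversion from 16-bit floating-point numbers that Yao claim 6 describes would require a different procedure, which is not shown here.

```python
def truncate_16_to_4(value16):
    """Keep the 4 most significant bits of a signed 16-bit integer.

    Python's >> on integers is an arithmetic shift, so the sign is
    preserved and the result lies in the signed 4-bit range [-8, 7].
    """
    if not -32768 <= value16 <= 32767:
        raise ValueError("expected a signed 16-bit value")
    return value16 >> 12


samples = [32767, 4096, 4095, 0, -1, -4096, -32768]   # hypothetical inputs
truncated = [truncate_16_to_4(v) for v in samples]
print(truncated)  # [7, 1, 0, 0, -1, -1, -8]
```

Discarding the twelve low-order bits cuts the storage per value by a factor of four, at the cost of mapping every input within the same 4096-wide interval to one output level.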

Response to Arguments
Applicant's arguments filed May 23, 2022 (“Remarks”) have been fully considered but they are, except insofar as rendered moot by the withdrawal of a rejection, not persuasive.
Applicant’s sole substantive argument is that the Sze/Achlioptas combination allegedly does not disclose the generation of a predictable output neuron map for use by the processing elements and the updating of the predictable output neuron map for use by the processing elements based on reduced outputs because the reduced-dimensionality feature map of Sze should allegedly be regarded as a reduced input or output rather than as a predictable output neuron map.  Remarks at 12-14.  However, Applicant appears to be conflating two items that were mapped to different entities of Sze.  Note first of all that the flow of data in Figure 31 of Sze is cyclical; data flow from the off-chip DRAM to an RLC decoder that compresses an input feature map, to a PE array, then back to the global buffer, to a ReLU unit, to an RLC encoder that compresses an output feature map, and back to the off-chip DRAM and/or the RLC decoder.  The RLC decoder receives the output feature map of the previous layer, which the RLC encoder derived from data stored in the global buffer, and compresses it as the input feature map of the current layer; furthermore, by the time the feature map reaches the RLC decoder, it has already had its dimensionality reduced by pooling in the previous layer.  This reduced-dimensionality input map is the “predictable output neuron map” of the claim.  After the input map is processed by the PEs, the result is passed through the ReLU unit to produce the “outputs” of the claim.  Then, the output feature map is produced by passing the result through the RLC encoder, which compresses the output feature map, and another pooling layer may reduce the dimensionality of this output feature map further.  The result then becomes the input feature map for the next layer, which is then compressed further by the RLC decoder.  This compressed next-layer input map is the claimed “updated predictable output neuron map”.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN C VAUGHN whose telephone number is (571)272-4849. The examiner can normally be reached M-R 7a-5:30p ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/R.C.V./             Examiner, Art Unit 2125

/KAMRAN AFSHAR/             Supervisory Patent Examiner, Art Unit 2125                                                                                                                                                                                           

1 While the specification uses the term “predictable output neuron map” frequently, it does not appear to define the term explicitly, and Examiner can find no evidence that it was an accepted term of art before the effective filing date.  Paragraphs 54-55 appear to suggest, albeit not explicitly state, that the PON map is used to sparsify the weights, and Figure 3 appears to show that it is then applied to the original inputs to generate a full, non-sparse output.  Therefore, for purposes of examination, any feature map that has reduced dimensionality relative to the full input space and is used to produce a non-sparse output will be deemed a “predictable output neuron map”.
2 “Concurrently” is not defined in the specification (it appears only once in paragraph 81); thus, Examiner interprets it according to its ordinary dictionary definition, namely “acting in conjunction; cooperating”.  Dictionary.com, definition 2 of “concurrent”, https://www.dictionary.com/browse/concurrent.  See also MPEP § 2111.01(I).