DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claim 6 is objected to because of the following informalities:  the term “LIDAR” should be changed to “light imaging, detection, and ranging (LIDAR)” in line 3 of claim 6.  Appropriate correction is required.
Claim 13 is objected to because of the following informalities:  the term “LIDAR” should be changed to “light imaging, detection, and ranging (LIDAR)” in line 3 of claim 13.  Appropriate correction is required.
Claim 18 is objected to because of the following informalities:  the term “CNN” should be changed to “convolutional neural network (CNN)” in line 1 of claim 18.  Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. §112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. §112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-10 are rejected under 35 U.S.C. §112(a) or 35 U.S.C. §112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or joint inventor, or for applications subject to pre-AIA 35 U.S.C. §112, the inventors, at the time the application was filed, had possession of the claimed invention.
Claim 1 recites the limitation “grouping the received input data into a plurality of input data units.” However, the specification merely recites the claim language and does not provide an explanation as to how the input data is grouped into input data units. In addition, under the broadest reasonable interpretation, “group” and “batch” are synonyms.
Claim 1 recites the limitation “dividing a computational graph of a deep learning workload into a plurality of processing pipeline stages.”  However, the specification merely recites this claim language and does not provide any explanation as to how the operation of dividing a computational graph of a deep learning workload into a plurality of pipeline stages is performed.
Claims 2-10 are rejected under 35 U.S.C. §112(a) or 35 U.S.C. §112 (pre-AIA), first paragraph, by virtue of their dependency on claim 1.


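The following is a quotation of 35 U.S.C. 112(b):
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
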
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claim 1 recites the limitation “a processor” in line 1 of claim 1 and “a processor” in line 3 of claim 1.  It is unclear as to whether “a processor” in line 1 of claim 1 and “a processor” in line 3 of claim 1 are the same processor.  Therefore, claim 1 is unclear and indefinite. For the purposes of prior art examination, the Examiner is interpreting “a processor” in line 3 of claim 1 to be “the processor” in line 3 of claim 1.
Claim 1 recites the limitations: “a plurality of input data, wherein each input data” in line 3 of claim 1; “input data units” in line 5 of claim 1; and “as a data unit arrives at the processor, processing the input data unit” in line 8 of claim 1. It is unclear as to whether “input data units” in line 5 of claim 1 correspond to “as a data unit arrives at the processor, processing the input data unit” in line 8 of claim 1. Therefore, claim 1 is unclear and indefinite. For the purposes of prior art examination, the Examiner is interpreting “as a data unit arrives at the processor, processing the input data unit” in line 8 of claim 1 as “as each input data unit arrives at the processor, processing the input data unit of the plurality of input data units” in line 8 of claim 1.
Claim 1 recites the limitation “the processed data unit” in line 10 of claim 1.  There is insufficient antecedent support in independent claim 1 for the limitation “the processed data unit” in line 10 of claim 1.  Therefore, claim 1 is unclear and indefinite. For the purposes of prior art examination, the Examiner is interpreting “the processed data unit” as “a processed data unit.”
Claim 2, which depends from independent claim 1, recites the limitation “the input data” in line 3 of claim 2 and recites the limitation “an input data” in line 4 of claim 2. Claim 1 recites the limitation “as a data unit arrives at the processor, processing the input data unit” in line 8 of claim 1. It is unclear as to how “the input data” in line 3 of claim 2 and “an input data” in line 4 of claim 2 correspond to “as a data unit arrives at the processor, processing the input data unit” in line 8 of claim 1. Therefore, claim 2 is unclear and indefinite. For the purposes of prior art examination, the Examiner is interpreting “the input data” in line 3 of claim 2 as “the input data unit” in line 3 of claim 2; and the Examiner is interpreting “an input data” in line 4 of claim 2 as “the input data unit” in line 4 of claim 2.
Claim 2 further recites the limitation “wherein processing the input data in the plurality of pipeline stages comprises:…the final pipeline stage processing the intermediary activation map and outputting the processed data unit” in lines 2-8 of claim 2, which depends from claim 1.  Claim 1 recites the limitations “the method comprising:...processing the input data in the plurality of pipeline stages… to a next pipeline stage; and outputting the processed data unit.”  It is unclear as to whether “outputting the processed data unit” is part of the operation “processing the input data in the plurality of pipeline stages” or a separate operation. Therefore, claim 2 is unclear 
Claim 4, which depends from independent claim 1, recites “each data unit” in line 1 of claim 4. Claim 1 recites the limitations: (1) “grouping the received input data into a plurality of input data units” in line 5 of claim 1; and (2) “as a data unit arrives at the processor, processing the input data unit” in line 8 of claim 1. It is unclear as to how “each data unit” in line 1 of claim 4 relates to the limitation (1) “grouping the received input data into a plurality of input data units” in line 5 of claim 1 and the limitation (2) “as a data unit arrives at the processor, processing the input data unit” in line 8 of claim 1. Therefore, claim 4 is unclear and indefinite. For the purposes of prior art examination, the Examiner is interpreting “each data unit” in line 1 of claim 4 as “each input data unit” in line 1 of claim 4.
Claim 5, which depends from independent claim 1, recites the limitation “the outputted processed unit” in line 2 of claim 5. There is insufficient antecedent support in dependent claim 5 and independent claim 1 for “the outputted processed unit” in line 2 of claim 5. Therefore, claim 5 is unclear and indefinite. For the purposes of prior art examination, the Examiner is interpreting “the outputted processed unit” in line 2 of claim 5 as “the outputted processed data unit” in line 2 of claim 5.
Claim 10, which depends from claim 1 recites the limitations: “assigning computations of a pipeline stage to a sub-processor of the processor near or adjacent to a sub-processor preforming computations of a next pipeline stage” in lines 4-5 of claim 10. It is unclear as to whether the first occurrence of “a sub-processor” in line 4 of claim 
Claims 3 and 6-9 are rejected under 35 U.S.C. §112(b) by virtue of their dependency on claim 1.
Claim 11 recites the limitations “receive a plurality of input data units at different times; as a data unit arrives at a processor core, process the data unit, process the data unit in a pipeline stage assigned to the processor core and output the processed data to a next pipeline stage and associated processor core until the data unit is processed through the plurality of the pipeline stages; and generate an output based at least partly on output of the processing of the input data through the plurality of pipeline stages” in lines 6-12 of independent claim 11.  It is unclear as to how (1) “a plurality of input data units” in line 6 of claim 11; (2) “as a data unit” in line 7 of claim 11; (3) “the processed data” in line 8 of claim 11; (4) “the data unit” in line 9 of claim 11; and (5) “the input data” 
Claim 13, which depends from claim 11, recites the limitation “the plurality of input data” in line 1 of claim 13. There is insufficient antecedent support in independent claim 11 and dependent claim 13 for the limitation “the plurality of input data” in line 1 of claim 13. Therefore, claim 13 is unclear and indefinite.  For the purposes of prior art examination, the Examiner is interpreting “the plurality of input data” in line 1 of claim 13 as “the plurality of input data units” in line 1 of claim 13.
Claim 19, which depends from independent claim 11, recites the limitation “an input data unit” in line 1 of claim 19 and independent claim 11 recites “a plurality of input 
Claims 12, 14-18, and 20 are rejected under 35 U.S.C. §112(b) by virtue of their dependency on claim 11. 
Although claims 1-20 have been rejected and have been interpreted by the Examiner as indicated above for the purposes of examination of the prior art, Applicant must review claims 1-20 to further clarify the subject matter that the Applicant wishes to claim in order to satisfy 35 U.S.C. §112.


Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. §102 and §103 (or as subject to pre-AIA 35 U.S.C. §102 and §103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. §103 which forms the basis for all obviousness rejections set forth in this Office action:
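A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.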

Claims 1, 3-5, 8-12, 14-15, and 17-19 are rejected under 35 U.S.C. §103 as being unpatentable over US-20200042287 Chalamalasetti et al. (hereinafter, “Chalamalasetti”) in view of “Deep Neural Networks as Computational Graphs” by Tyler Elliot Bettilyon (hereinafter, “Bettilyon”).
As per claim 1, Chalamalasetti teaches: a method of processing deep learning inference workloads in a processor (Chalamalasetti, FIG. 1A and paragraphs [0016], [0025], [0030], [0031], and [0050] disclose neural networks processing images and techniques for classification of neural network inference by an integrated circuit shown in FIG. 1A, which teaches a method of processing deep learning inference workloads in a processor), the method comprising:
receiving at a processor a plurality of input data, wherein each input data is received at different times (Chalamalasetti, Paragraph [0055] and block 540 in FIG. 5A disclose a neural network receiving live data as input, which teaches receiving at a processor a plurality of input data, wherein each input data is received at different times);
grouping the received input data into a plurality of input data units (Chalamalasetti, Paragraph [0055] and block 540 in FIG. 5A disclose a neural network receiving live data as input and paragraph [0025] discloses a batch of input data (i.e., a group of received input data) for some neural networks, which teaches grouping the received input data into a plurality of input data units);
as a data unit arrives at the processor, processing the input data unit in the plurality of pipeline stages from one pipeline stage to a next pipeline stage (Chalamalasetti, FIG. 5A and paragraph [0055] disclose: “…Block 540 indicates that,…a neural network may go “live” in a production environment to receive live data and dynamically select precision at run-time to process input data into results for the production neural network. For example, using techniques disclosed herein to automatically select an appropriate precision level at run-time for a particular stage of a multi-stage process being executed…,” which teaches as a data unit arrives at the processor, processing the input data unit in the plurality of pipeline stages from one pipeline stage to a next pipeline stage); and
outputting the processed data unit (Chalamalasetti, paragraph [0056] and FIG. 4B disclose results collector 475 outputting the results of a neural network, which teaches outputting the processed data unit).
However, Chalamalasetti by itself fails to explicitly teach: dividing a computational graph of a deep learning workload into a plurality of processing pipeline stages. However, the combination of Chalamalasetti and Bettilyon teaches: dividing a computational graph of a deep learning workload into a plurality of processing pipeline stages (Bettilyon, Page 2, lines 13-18 disclose: “A computational graph is a way to represent a math function…In a computational graph nodes are either input values or functions for combining values.”; Bettilyon, page 7, lines 15-17 disclose: “Recall two facts about deep neural networks: 1. DNNs are a special kind of graph, a “computational graph”. 2. DNNs are made up of a series of “fully connected” layers of nodes.”; and Bettilyon, Figures showing nodes on pages 2-4 disclose a computational graph of a deep learning workload divided into layers of nodes, and paragraph [0016] of Chalamalasetti discloses stages as parts of a pipeline and discloses that these stages may be referred to as layers of a neural network, which teaches dividing a computational graph of a deep learning workload into a plurality of processing pipeline stages).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the invention to incorporate the computational graph of a deep learning workload of Bettilyon into the neural network having a plurality of layers (stages) as taught by Chalamalasetti to show that a computational graph of a deep learning workload is divided into a plurality of pipeline stages to improve the efficiency of neural networks.
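For illustration only, the layer-as-stage reading applied above can be sketched in a few lines of Python. The sketch assumes a purely sequential graph, and every name in it (`scale`, `relu`, `divide_into_stages`, `run_pipeline`) is hypothetical; none is taken from Chalamalasetti, Bettilyon, or the application.

```python
from typing import Callable, List, Sequence

# A "layer" here is any function from a list of values to a list of values.
Layer = Callable[[List[float]], List[float]]


def scale(factor: float) -> Layer:
    """Hypothetical stand-in for one layer of nodes in a computational graph."""
    return lambda xs: [factor * x for x in xs]


def relu(xs: List[float]) -> List[float]:
    """Element-wise rectifier, used as a second hypothetical layer type."""
    return [max(0.0, x) for x in xs]


def divide_into_stages(layers: Sequence[Layer], num_stages: int) -> List[List[Layer]]:
    """Split an ordered list of layers into contiguous pipeline stages."""
    if not 1 <= num_stages <= len(layers):
        raise ValueError("num_stages must be between 1 and the number of layers")
    base, extra = divmod(len(layers), num_stages)
    stages, start = [], 0
    for i in range(num_stages):
        size = base + (1 if i < extra else 0)  # earlier stages absorb the remainder
        stages.append(list(layers[start:start + size]))
        start += size
    return stages


def run_pipeline(stages: List[List[Layer]], data_unit: List[float]) -> List[float]:
    """Pass one data unit from each stage to the next, in order."""
    for stage in stages:
        for layer in stage:
            data_unit = layer(data_unit)
    return data_unit


layers = [scale(2.0), relu, scale(0.5), relu, scale(3.0)]
stages = divide_into_stages(layers, 3)
stage_sizes = [len(stage) for stage in stages]
result = run_pipeline(stages, [1.0, -2.0])
```

Under these assumptions, five layers divided into three stages are grouped as two, two, and one layers, and the output of each stage is the input of the next.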
As per claim 3, the combination of Chalamalasetti and Bettilyon as shown in claim 1 teaches: the method of claim 1,  wherein each pipeline stage comprises a layer of a neural network and processing the input data unit comprises performing operations of the layer on the input data unit (Chalamalasetti, Paragraphs [0016], [0050], and [0055] disclose neural networks providing a multi-stage compute process in which each stage is a layer of a neural network forming a pipeline to receive and process live data, which teaches each pipeline stage comprises a layer of a neural network and processing the input data unit comprises performing operations of the layer on the input data unit).
As per claim 4, the combination of Chalamalasetti and Bettilyon as shown in claim 1 teaches: the method of claim 1, wherein each data unit comprises a received input data (Chalamalasetti, Paragraphs [0016], [0050], and [0055] and block 540 of FIG. 5A disclose a neural network receiving live data as input data and paragraph [0025] discloses a batch of input data, which teaches each data unit comprises a received input data).
As per claim 5, the combination of Chalamalasetti and Bettilyon as shown in claim 1 teaches: the method of claim 1, wherein the deep learning workload comprises an inference deep learning workload and the outputted processed unit is used in an inference application (Chalamalasetti, FIG. 1A, Abstract, paragraphs [0016], [0025], [0029]-[0031], and [0049]-[0050] disclose neural networks processing images and techniques for classification of neural network inference by an integrated circuit shown in FIG. 1A, which teaches wherein the deep learning workload comprises an inference deep learning workload and the outputted processed unit is used in an inference application).
As per claim 8, the combination of Chalamalasetti and Bettilyon as shown in claim 1 teaches: the method of claim 1, wherein each pipeline stage comprises one or more layers of a neural network (Chalamalasetti, Paragraphs [0016], [0050], and [0055] disclose neural networks providing a multi-stage compute process in which each stage is a layer of a neural network forming a pipeline, which teaches each pipeline stage comprises one or more layers of a neural network).

As per claim 9, the combination of Chalamalasetti and Bettilyon as shown in claim 1 teaches: the method of claim 1 further comprising: 
performing the computations of each pipeline stage in a sub-processor of the processor assigned to that pipeline stage (Chalamalasetti, FIG. 1A and paragraphs [0029]-[0032] disclose circuitry for implementing the neural network forming a pipeline as taught by paragraphs [0016], [0050], and [0055], which teaches performing the computations of each pipeline stage in a sub-processor of the processor assigned to that pipeline stage); and
storing in adjacent or physically close memory regions data associated with the performing of the computations of each pipeline stage (Chalamalasetti, FIG. 1A and paragraphs [0029]-[0032] disclose circuitry for implementing the neural network forming a pipeline as taught by paragraphs [0016], [0050], and [0055], which teaches storing in adjacent or physically close memory regions data associated with the performing of the computations of each pipeline stage). 
As per claim 10, the combination of Chalamalasetti and Bettilyon as shown in claim 1 teaches: the method of claim 1, further comprising:
storing data associated with the plurality of pipeline stages in adjacent or physically close memory regions (Chalamalasetti, FIG. 1A and paragraphs [0029]-[0032] disclose circuitry for implementing the neural network forming a pipeline as taught by paragraphs [0016], [0050], and [0055],  which teaches storing data associated with the plurality of pipeline stages in adjacent or physically close memory regions); and
assigning computations of a pipeline stage to a sub-processor of the processor near or adjacent to a sub-processor performing computations of a next pipeline stage, wherein the assigned sub-processors are near or adjacent to memory regions where the sub-processor’s pipeline stage data is stored (Chalamalasetti, FIG. 1A and paragraphs [0029]-[0032] disclose circuitry for implementing the neural network forming a pipeline as taught by paragraphs [0016], [0050], and [0055], which teaches assigning computations of a pipeline stage to a sub-processor of the processor near or adjacent to a sub-processor performing computations of a next pipeline stage, wherein the assigned sub-processors are near or adjacent to memory regions where the sub-processor’s pipeline stage data is stored).
As per claim 11, Chalamalasetti teaches: a deep learning inference accelerator (Chalamalasetti, FIG. 1A and paragraphs [0025] and [0054] disclose an accelerator for a CNN for desired classification accuracy of neural network inference to increase accelerator performance), comprising:
a plurality of processor cores, each assigned to a pipeline stage of a plurality of pipeline stages and configured to process the pipeline stage (Chalamalasetti, FIG. 1A and paragraph [0031] discloses multiple processor cores to perform functions on an integrated circuit in parallel and paragraph [0016] discloses a multi-stage compute process referring to computer processing where outputs from a previous stage may be used as inputs to a next stage to form a pipeline referred to as layers of a neural network, which teaches a plurality of processor cores, each assigned to a pipeline stage of a plurality of pipeline stages and configured to process the pipeline stage), wherein the plurality of processor cores are configured to:
receive a plurality of input data units at different times (Chalamalasetti, Paragraph [0055] and block 540 in FIG. 5A disclose a neural network receiving live data as input, which teaches receive a plurality of input data units at different times); 
as a data unit arrives at a processor core, process the data unit in a pipeline stage assigned to the processor core and output the processed data to a next pipeline stage and associated processor core until the data unit is processed through the plurality of the pipeline stages (Chalamalasetti, FIG. 5A and paragraph [0055] disclose: “…Block 540 indicates that,…a neural network may go “live” in a production environment to receive live data and dynamically select precision at run-time to process input data into results for the production of the neural network. For example, using techniques disclosed herein to automatically select an appropriate precision level at run-time for a particular stage of a multi-stage process being executed…,”; paragraph [0031] discloses multiple processor cores to perform functions (corresponding to stages) on an integrated circuit in parallel, which discloses a plurality of processor cores, each assigned to a pipeline stage of a plurality of pipeline stages and configured to process the pipeline stage; and paragraph [0016] discloses a multi-stage compute process referring to computer processing where outputs from a previous stage may be used as inputs to a next stage to form a pipeline referred to as layers of a neural network, which teaches as a data unit arrives at a processor core, process the data unit in a pipeline stage assigned to the processor core and output the processed data to a next pipeline stage and associated processor core until the data unit is processed through the plurality of the pipeline stages); and
generate an output based at least partly on output of the processing of the input data through the plurality of pipeline stages (Chalamalasetti, Paragraph [0016] discloses a multi-stage compute process referring to computer processing where outputs from a previous stage may be used as inputs to a next stage to form a pipeline referred to as layers of a neural network; and paragraph [0056] and FIG. 4B disclose a results collector 475 outputting the results of a neural network, which teaches generate an output based at least partly on output of the processing of the input data through the plurality of pipeline stages).
However, Chalamalasetti by itself fails to specifically teach: the plurality of pipeline stages together comprise the computational graph of a deep learning inference neural network. However, the combination of Chalamalasetti and Bettilyon teaches: the plurality of pipeline stages together comprise the computational graph of a deep learning inference neural network (Bettilyon, Page 2, lines 13-18 disclose: “A computational graph is a way to represent a math function…In a computational graph nodes are either input values or functions for combining values.”; Bettilyon, page 7, lines 15-17 disclose: “Recall two facts about deep neural networks: 1. DNNs are a special kind of graph, a “computational graph”. 2. DNNs are made up of a series of “fully connected” layers of nodes.”; and Bettilyon, Figures showing nodes on pages 2-4 disclose a computational graph of a deep learning workload divided into layers of nodes, and paragraph [0016] of Chalamalasetti discloses stages as parts of a pipeline and discloses that these stages may be referred to as layers of a neural network, which teaches the plurality of pipeline stages together comprise the computational graph of a deep learning inference neural network).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the invention to incorporate the computational graph of a deep learning workload of Bettilyon into the neural network having a plurality of stages as taught by Chalamalasetti to show that a computational graph of a deep learning workload is divided into a plurality of pipeline stages to improve the efficiency of neural networks.
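For illustration only, the claim 11 arrangement as interpreted above (one processor core per pipeline stage, data units arriving at different times) can be sketched as a simple schedule. The sketch assumes that each stage takes one time tick, that a core handles one data unit at a time, and that arrival times are given in non-decreasing order; the function name `pipeline_schedule` and the tick model are hypothetical and are not drawn from either reference.

```python
from typing import Dict, List, Tuple


def pipeline_schedule(arrival_times: List[int], num_stages: int) -> Dict[Tuple[int, int], int]:
    """Return {(unit, stage): start_tick} for units streamed through the stages.

    Assumptions (hypothetical): one core per stage, one tick per stage,
    and arrival_times sorted in non-decreasing order.
    """
    core_free_at = [0] * num_stages  # tick at which each stage's core is next idle
    schedule: Dict[Tuple[int, int], int] = {}
    for unit, arrival in enumerate(arrival_times):
        ready = arrival  # tick at which this unit is ready for its next stage
        for stage in range(num_stages):
            start = max(ready, core_free_at[stage])
            schedule[(unit, stage)] = start
            core_free_at[stage] = start + 1
            ready = start + 1  # output is handed to the next stage's core
    return schedule


# Three data units arriving at ticks 0, 1, and 1, processed by three stages.
schedule = pipeline_schedule([0, 1, 1], num_stages=3)
last_finish = schedule[(2, 2)] + 1
```

In this model a later data unit enters the first stage while an earlier one is still in a later stage, which is the overlap that distinguishes pipelined processing from processing each data unit to completion before starting the next.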
As per claim 12,  the combination of Chalamalasetti and Bettilyon as shown in claim 11 teaches: the accelerator of claim 11, wherein each pipeline stage comprises a layer of the neural network and the processing in the pipeline stage comprises performing operations of the layer on the input data unit (Chalamalasetti, Paragraphs [0016], [0050], and [0055] disclose neural networks providing a multi-stage compute process in which each stage is a layer of a neural network forming a pipeline to receive and process live data, which teaches each pipeline stage comprises a layer of the neural network and the processing in the pipeline stage comprises performing operations of the layer on the input data unit).
As per claim 14, the combination of Chalamalasetti and Bettilyon as shown in claim 11 teaches: the accelerator of claim 11, wherein adjacent or nearby processors are assigned to adjacent or nearby pipeline stages (Chalamalasetti, FIG. 1A and paragraphs [0029]-[0032] disclose circuitry for implementing the neural network forming a pipeline as taught by paragraphs [0016], [0050], and [0055], which teaches wherein adjacent or nearby processors are assigned to adjacent or nearby pipeline stages).
As per claim 15, the combination of Chalamalasetti and Bettilyon as shown in claim 11 teaches: the accelerator of claim 11 further comprising a memory circuit configured to store data associated with the processing of each pipeline stage, wherein the data is stored in adjacent or nearby memory regions for near or adjacent pipeline stages (Chalamalasetti, FIG. 1A and paragraphs [0029]-[0032] disclose circuitry for implementing the neural network forming a pipeline as taught by paragraphs [0016], [0050], and [0055], which teaches a memory circuit configured to store data associated with the processing of each pipeline stage, wherein the data is stored in adjacent or nearby memory regions for near or adjacent pipeline stages).
As per claim 17, the combination of Chalamalasetti and Bettilyon as shown in claim 11 teaches: the accelerator of claim 11, wherein one or more pipeline stages are skipped (Chalamalasetti, Paragraph [0050] discloses inputs flowing both forward to a next layer and looping back to another layer, which teaches one or more pipeline stages are skipped).
As per claim 18, the combination of Chalamalasetti and Bettilyon as shown in claim 11 teaches: the accelerator of claim 11, wherein the neural network comprises a CNN (Chalamalasetti, paragraph [0025] discloses network layers with high throughput in a convolutional neural network, which teaches wherein the neural network comprises a CNN).
As per claim 19, the combination of Chalamalasetti and Bettilyon as shown in claim 11 teaches: the accelerator of claim 11, wherein an input data unit comprises an image (Chalamalasetti, FIG. 3, Paragraphs [0016], [0048], and [0049] disclose image processing in a neural network to process raw pixel data 305, which teaches an input data unit comprises an image).
Claims 2 and 16 are rejected under 35 U.S.C. §103 as being unpatentable over Chalamalasetti in view of Bettilyon, and further in view of “An Introduction to Convolutional Neural Networks” to O’Shea et al. (hereinafter, “O’Shea”).
As per claim 2, the combination of Chalamalasetti and Bettilyon as shown in claim 1 teaches:  the method of claim 1, 
wherein the plurality of pipeline stages comprise: a first pipeline stage, one or more intermediary pipeline stages, and a final pipeline stage (Chalamalasetti, Paragraphs [0016], [0050], and [0055] disclose neural networks providing a multi-stage compute process in which each stage is a layer of a neural network forming a pipeline, which teaches wherein the plurality of pipeline stages comprise: a first pipeline stage, one or more intermediary pipeline stages, and a final pipeline stage), and 
wherein processing the input data in the plurality of pipeline stages (Chalamalasetti, FIG. 1A, FIG. 4A, and paragraphs [0016], [0025], [0030], [0031], [0050], and [0055] disclose neural networks providing a multi-stage compute process receiving live data as input in which each stage is a layer of a neural network forming a pipeline, which teaches wherein processing the input data in the plurality of pipeline stages) comprises: 
the first pipeline stage receiving an input data (Chalamalasetti, FIG. 1A, FIG. 4A, and paragraphs [0016], [0025], [0030], [0031], [0050], and [0055] disclose neural networks providing a multi-stage compute process receiving live data as input (e.g., one or more input streams 410) in which each stage is a layer of a neural network forming a pipeline, which teaches the first pipeline stage receiving an input data); and 
outputting the processed data unit  (Chalamalasetti, FIG. 1A, FIG. 4A, and paragraphs [0016], [0025], [0030], [0031], [0050], and [0055] disclose neural networks providing a multi-stage compute process outputting output 415 representing output of the overall network (output of neural network processing), which teaches outputting the processed data unit).  
However, the combination of Chalamalasetti and Bettilyon by itself fails to explicitly teach: outputting an activation map (from the first pipeline stage); the intermediary pipeline stages receiving the activation map and processing the activation map from one intermediary pipeline stage to a next intermediary pipeline stage and outputting an intermediary activation map to the final pipeline stage, and the final pipeline stage processing the intermediary activation map.
However, the combination of Chalamalasetti, Bettilyon, and O’Shea teaches: outputting an activation map (from the first pipeline stage); the intermediary pipeline stages receiving the activation map and processing the activation map from one intermediary pipeline stage to a next intermediary pipeline stage and outputting an intermediary activation map to the final pipeline stage, and the final pipeline stage processing the intermediary activation map (Chalamalasetti, Paragraphs [0016] and [0055] disclose neural networks provide a multi-stage compute process in which each stage is a layer of a neural network forming a pipeline; O’Shea, page 4, section 2.1, first paragraph discloses a CNN with fully connected layers; and O’Shea, pages 5-6, section 2.2, first through fourth paragraphs and FIG. 2 disclose layer parameters focused around learnable kernels, an activation map being generated each time data hits a convolutional layer, and every kernel having a corresponding activation map, which teaches outputting an activation map (from the first pipeline stage); the intermediary pipeline stages receiving the activation map and processing the activation map from one intermediary pipeline stage to a next intermediary pipeline stage and outputting an intermediary activation map to the final pipeline stage, and the final pipeline stage processing the intermediary activation map).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the invention to incorporate the activation maps of O’Shea into the method provided by the combination of Chalamalasetti and Bettilyon to provide the benefits of parameter sharing as taught in the seventh paragraph on page 7 (page 7, lines 30-34) of O’Shea.
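For illustration only, the passing of an activation map from one convolutional stage to the next, as relied upon above, can be sketched as follows. The sketch assumes a single-channel input, a stride of one, no padding, and a rectifier after each convolution; the function name `activation_map` and the sample values are hypothetical and are not taken from O’Shea.

```python
from typing import List

Matrix = List[List[float]]


def activation_map(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide one kernel over the input (stride 1, no padding) and rectify the result."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            max(0.0, sum(image[r + i][c + j] * kernel[i][j]
                         for i in range(kh) for j in range(kw)))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 4x4 input and a 2x2 kernel that responds to a diagonal pattern.
image = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
kernel = [[1.0, 0.0], [0.0, 1.0]]

first_map = activation_map(image, kernel)       # output of a first stage (3x3)
second_map = activation_map(first_map, kernel)  # a later stage consumes that map (2x2)
```

The second call takes the first stage's activation map as its input, which mirrors the mapping of the intermediary and final pipeline stages set forth above.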
As per claim 16, the combination of Chalamalasetti and Bettilyon as shown in claim 15 teaches:  the accelerator of claim 15, wherein the data associated with the processing of each pipeline comprises weights (Chalamalasetti, FIG. 5A and paragraphs [0016] and [0052] disclose block 520 to dynamically adjust weights for each layer or a neural network (each layer of the neural network corresponding to a stage), which teaches wherein the data associated with the processing of each pipeline stage comprises weights).
However, Chalamalasetti and Bettilyon fail to explicitly teach: activation maps.
However, O’Shea teaches: activation maps (O’Shea, Page 7, lines 30-34 disclose “Parameter sharing works on the assumption that if one region feature is useful to compute at a set spatial region, then it is likely to be useful in another region.  If we constrain each activation map within the output volume to the same weights and bias, then we will see a massive reduction in the number of parameters being produced by the convolutional layer.”).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the invention to apply the relationship between weights and activation maps as taught by O’Shea to the accelerator as taught by the combination of Chalamalasetti and Bettilyon to improve performance of the accelerator as taught by Chalamalasetti and Bettilyon.
Claims 6, 13, and 20 are rejected under 35 U.S.C. §103 as being unpatentable over Chalamalasetti in view of Bettilyon, and further in view of U.S. Patent Application Publication 2018/0314941 to Lie et al. (hereinafter, “Lie”).
As per claim 6, the combination of Chalamalasetti and Bettilyon as shown in claim 1 teaches:  the method of claim 1.
However, the combination of Chalamalasetti and Bettilyon fails to explicitly teach: wherein the input data is data received from one or more of: a sensor measuring or detecting a physical parameter, a rolling shutter camera, a radar detector, a LIDAR scanning and detection mechanism, and a server storing high frequency day trading data.
However, Lie teaches:  wherein the input data is data received from one or more of: a sensor measuring or detecting a physical parameter, a rolling shutter camera, a radar detector, a LIDAR scanning and detection mechanism, and a server storing high frequency day trading data (Lie, FIG. 1, FIG. 2, and paragraphs [0469], [0475], and [0484] disclose a camera 135 and a video camera 232 inputting video into one or more inference engines (e.g., inference engine 233) of an autonomous vehicle 130 or 230 wherein the inference engines are implemented by using techniques such as deep learning accelerator 120, which teaches wherein the input data is data received from one or more of: a sensor measuring or detecting a physical parameter, a rolling shutter camera, a radar detector, a LIDAR scanning and detection mechanism, and a server storing high frequency day trading data).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the invention to supply the input data received from a sensor (camera) in Lie to the accelerator as taught by the combination of Chalamalasetti and Bettilyon, so that the input data is obtained without delay to improve the method of processing deep learning inference workloads.
As per claim 13, the combination of Chalamalasetti and Bettilyon as shown in claim 11 teaches:  the accelerator of claim 11.
However, Chalamalasetti and Bettilyon fail to explicitly teach: wherein the plurality of input data is received from one or more of: a sensor measuring or detecting a physical parameter, a rolling shutter camera, a radar detector, a LIDAR scanning and detection mechanism, and a server storing high frequency day trading data.
However, Lie teaches: wherein the plurality of input data is received from one or more of: a sensor measuring or detecting a physical parameter, a rolling shutter camera, a radar detector, a LIDAR scanning and detection mechanism, and a server storing high frequency day trading data (Lie, FIG. 1, FIG. 2, and paragraphs [0469], [0475], and [0484] disclose a camera 135 and a video camera 232 inputting video into one or more inference engines (e.g., inference engine 233) of an autonomous vehicle 130 or 230 wherein the inference engines are implemented by using techniques such as deep learning accelerator 120, which teaches wherein the plurality of input data is received from one or more of: a sensor measuring or detecting a physical parameter, a rolling shutter camera, a radar detector, a LIDAR scanning and detection mechanism, and a server storing high frequency day trading data).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the invention to supply the input data received from a sensor (camera) in Lie to the accelerator as taught by the combination of Chalamalasetti and Bettilyon, so that the input data is obtained without delay to improve the method of processing deep learning inference workloads.
As per claim 20, the combination of Chalamalasetti and Bettilyon as shown in claim 11 teaches:  the accelerator of claim 11.
However, the combination of Chalamalasetti and Bettilyon fails to explicitly teach: an autonomous vehicle.
However, Lie teaches: an autonomous vehicle (Lie, FIG. 1 and paragraphs [0469] and [0475] disclose an autonomous vehicle 130 including a deep learning accelerator 120, which taken in combination with Chalamalasetti and Bettilyon teaches an autonomous vehicle comprising the accelerator of claim 11 as taught by Chalamalasetti and Bettilyon).
Because Lie teaches the incorporation of a deep learning accelerator into an autonomous vehicle in FIG. 1 and paragraphs [0469] and [0475] of Lie, it would have been obvious to one having ordinary skill in the art before the effective filing date of the invention to incorporate the accelerator as taught by Chalamalasetti and Bettilyon into an autonomous vehicle of Lie to improve the accelerator of Lie.
Claim 7 is rejected under 35 U.S.C. §103 as being unpatentable over Chalamalasetti in view of Bettilyon, and further in view of “Semantic3D.NET: A New Large-Scale Point Cloud Classification Benchmark” to Hackel et al. (hereinafter, “Hackel”).
As per claim 7, the combination of Chalamalasetti and Bettilyon as shown in claim 1 teaches:  the method of claim 1.
However, the combination of Chalamalasetti and Bettilyon fails to explicitly teach: wherein input data comprises a portion of a point cloud.
However, Hackel teaches: wherein input data comprises a portion of a point cloud (Hackel, Abstract discloses point cloud classification benchmark data sets as input to a convolutional neural network, which teaches wherein input data comprises a portion of a point cloud).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the invention to apply the point cloud dataset of Hackel to the method of processing deep learning inference workloads in order to learn richer, more general 3D representations (Hackel, Abstract).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL FREYMUTH DAEBELER whose telephone number is (571) 272-8315.  The examiner can normally be reached on 8:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, LI B. ZHEN, can be reached on (571) 272-3768.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PAUL FREYMUTH DAEBELER/
Examiner, Art Unit 2121

/Li B. Zhen/
Supervisory Patent Examiner, Art Unit 2121