Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claim 13 is objected to because of the following informalities:  The claim ends with a comma as shown: “The calculation method of the neural network according to claim 12, wherein the second weight parameter is a set of predetermined lower digits of the weight parameter whose absolute value is equal to or less than a predetermined threshold value,”.  Appropriate correction is required.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan et al. (“Significance Driven Hybrid 8T-6T SRAM for Energy-Efficient Synaptic Storage in Artificial Neural Networks”) and Lee et al. (EP 3098762 A1).
	Regarding claim 1,
Srinivasan teaches a calculation system in which a neural network performing calculation using input data and a weight parameter is implemented in a calculation device including a calculation circuit (Srinivasan, in page 151, in the text for figure 1 “Fig. 1. Feedforward ANN with an input and two hidden layers, followed by an output layer. The artificial neurons accumulate the product of the inputs and the interconnecting synaptic weights, and apply a sigmoid activation function to the resulting sum”) and an internal memory and an external memory, wherein the weight parameter is divided into two, i.e., a first weight parameter and a second weight parameter, the first weight parameter is stored in the internal memory of the calculation device, and the second weight parameter (Srinivasan, in the abstract, recites in part “On the contrary, the on-chip synaptic storage designed using a conventional 6T SRAM is susceptible to bitcell failures at reduced voltages.”  Srinivasan, on the left column of page 152, recites in part “We, therefore propose a significance driven hybrid 8T-6T SRAM, wherein the sensitive MSBs of the synaptic weights are stored in 8T bitcells while the relatively resilient LSBs are stored in 6T bitcells.”  Where the MSB of the synaptic weight is the first weight parameter and the LSB of the synaptic weight is the second weight parameter.)
  Srinivasan does not teach that the second weight parameter is stored in the external memory.
Lee teaches that the second weight parameter is stored in the external memory (Lee, in paragraph 4, recites in part “External memory can also be used to store a large number of weights used in the feature classification layers.”).
  It would have been obvious to anyone of ordinary skill in the art at the time of the claimed invention to combine the teachings of Srinivasan with the teachings of Lee with the motivation of being able to store a large number of weights (see the quote from paragraph 4 of Lee cited above).
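For illustration only, and not as part of any cited reference, the division of a weight into a most-significant part and a least-significant part discussed for claim 1 can be sketched as follows. The 8-bit unsigned weight width, the 4-bit split point, and the names `split_weight`, `join_weight`, `internal_memory`, and `external_memory` are assumptions chosen for the sketch; neither Srinivasan nor Lee specifies them.

```python
# Illustrative sketch only. Bit widths and all names are assumptions,
# not taken from Srinivasan, Lee, or the claims.
WEIGHT_BITS = 8  # assumed width of an unsigned quantized weight
LSB_BITS = 4     # assumed number of low-order bits in the second part


def split_weight(w):
    """Split an unsigned weight into (msb_part, lsb_part)."""
    if not 0 <= w < (1 << WEIGHT_BITS):
        raise ValueError("weight out of range")
    return w >> LSB_BITS, w & ((1 << LSB_BITS) - 1)


def join_weight(msb_part, lsb_part):
    """Rejoin the two parts into the original weight."""
    return (msb_part << LSB_BITS) | lsb_part


weights = [0, 37, 200, 255]
internal_memory = []  # stands in for on-chip storage (first weight parameter)
external_memory = []  # stands in for off-chip storage (second weight parameter)
for w in weights:
    msb, lsb = split_weight(w)
    internal_memory.append(msb)
    external_memory.append(lsb)

# Reading both parts back and rejoining them recovers the original weights.
restored = [join_weight(m, l) for m, l in zip(internal_memory, external_memory)]
```

In this sketch the two lists merely stand in for the two storage locations; how the parts are physically stored and transferred is outside its scope.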
Regarding claim 6,
Claim 6 is directed towards a method that is substantially identical to what is recited in claim 1.  Therefore the rejections to claim 1 apply equally here.
In addition, the Srinivasan/Lee combination teaches the additional limitations of wherein the neural network contains an intermediate layer that performs processing including inner product calculation, (in the caption of figure 1, Srinivasan recites in part “Feedforward ANN with an input and two hidden layers, followed by an output layer. The artificial neurons accumulate the product of the inputs and the interconnecting synaptic weights” where hidden layer and intermediate layer are equivalent terms and where ‘accumulate the product of the inputs’ is equivalent to ‘processing inner product calculation’).
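As a minimal sketch of the operation described in the quoted caption of Srinivasan's figure 1 (accumulating the products of inputs and synaptic weights, then applying a sigmoid), the following may help. The function name `neuron_output` and the sample values are assumptions made for illustration and do not come from the reference.

```python
import math


def neuron_output(inputs, weights):
    """Accumulate input*weight products (an inner product), then apply a sigmoid."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-total))


# Assumed sample values: 1.0*0.5 + 2.0*(-0.25) = 0.0, and sigmoid(0.0) = 0.5.
example = neuron_output([1.0, 2.0], [0.5, -0.25])
```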
Claims 4, 8-9, 11, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over the Srinivasan/Lee combination and in further view of Xu et al. (US 8131659 B2).
Regarding claim 4,
The Srinivasan/Lee combination has taught the calculation system according to claim 1 (see the discussion of claim 1 above), but this combination does not teach wherein the calculation circuit is constituted by an FPGA (Field-Programmable Gate Array), and the internal memory is at least one of a memory storing configuration data for setting the calculation circuit and a memory storing an intermediate result of calculation executed by the calculation circuit.
Xu, in the same field of artificial neural networks, teaches wherein the calculation circuit is constituted by an FPGA (Field-Programmable Gate Array), (Xu, in column 12 lines 8 to 24 recites in part (and references figure 6) “At least some of the training data is written directly to an internal memory (such as internal memory 424) directly accessible by the FPGA computation logic blocks such as processing units or processing engines.”)
and the internal memory is at least one of a memory storing configuration data for setting the calculation circuit and a memory storing an intermediate result of calculation executed by the calculation circuit (Xu, in column 12, lines 20 to 24, recites in part “At (hh), the hardware polls the prepare register until the register is pulled up by software, and then sends training results to the software. The training results may be intermediate data or final results.”  Where the register is a type of memory storage and where “intermediate data” is equivalent to an intermediate result of calculation).
 It would have been obvious to anyone of ordinary skill in the art at the time of the claimed invention to combine the teachings of the Srinivasan/Lee combination with the teachings of Xu with the motivation of using internal memories within the same device as the calculation circuits in order to achieve both high bandwidth and low latency (Xu col 15, lines 30 to 66, recites in part “Temporary data structures, such as intermediate variables, parameters, and so forth, and results, e.g., the learned model, could be stored in the onboard memory (such as the onboard memory 108) or registers inside the FPGA, which would act as high bandwidth, low latency cache. The data could be utilized without needing to access memory off of the FPGA, which would enhance the access speed of the cache.”)
Regarding claim 8,
The Srinivasan/Lee combination has taught the calculation system according to claim 6, but does not teach wherein the calculation circuit is constituted by an FPGA (Field-Programmable Gate Array), the storage area is constituted by an SRAM (Static Random Access Memory), the calculation circuit and the storage area are embedded in a single chip semiconductor device.
Xu, in the same field of neural networks, teaches wherein the calculation circuit is constituted by an FPGA (Field-Programmable Gate Array), the storage area is constituted by an SRAM (Static Random Access Memory), the calculation circuit and the storage area are embedded in a single chip semiconductor device (Xu, in column 16 lines 19-21, recites “The accelerator system supports hierarchical memory organization and access methods using SDRAM, SRAM and RAM/registers within the FPGA.” Where an FPGA is a single chip semiconductor device which inherently has calculation circuits).
It would have been obvious to anyone of ordinary skill in the art at the time of the claimed invention to combine the teachings of the Srinivasan/Lee combination with the teachings of Xu with the motivation of using internal memories within the same device as the calculation circuits in order to achieve both high bandwidth and low latency (Xu col 15, lines 30 to 66, recites in part “Temporary data structures, such as intermediate variables, parameters, and so forth, and results, e.g., the learned model, could be stored in the onboard memory (such as the onboard memory 108) or registers inside the FPGA, which would act as high bandwidth, low latency cache. The data could be utilized without needing to access memory off of the FPGA, which would enhance the access speed of the cache.”)
Regarding claim 9,
 The Srinivasan/Lee/Xu combination teaches the calculation system according to claim 8, wherein the one chip semiconductor device has a temporary storage area storing intermediate results of calculations executed in the calculation circuit, a part of the weight parameter for calculating the inner product is further stored in the temporary storage area. (Xu col 15, lines 30 to 66, recites in part “Temporary data structures, such as intermediate variables, parameters, and so forth, and results, e.g., the learned model, could be stored in the onboard memory (such as the onboard memory 108) or registers inside the FPGA, which would act as high bandwidth, low latency cache. The data could be utilized without needing to access memory off of the FPGA, which would enhance the access speed of the cache.”)
Regarding claim 11,
The Srinivasan/Lee combination has taught a calculation method of a neural network (see the discussion of claim 1), but does not teach wherein the neural network is implemented on a calculation system including a calculation device including a calculation circuit and an internal memory,
Xu, in the same field of artificial neural networks, teaches wherein the neural network is implemented on a calculation system including a calculation device including a calculation circuit and an internal memory, (Xu, col 15, lines 30 to 66, recites in part “Temporary data structures, such as intermediate variables, parameters, and so forth, and results, e.g., the learned model, could be stored in the onboard memory (such as the onboard memory 108) or registers inside the FPGA, which would act as high bandwidth, low latency cache. The data could be utilized without needing to access memory off of the FPGA, which would enhance the access speed of the cache.” Where the registers inside the FPGA are an internal memory, and the FPGA is a calculation system.)
	and a bus connecting the calculation device and the external memory, (Xu, Col 4, 60-63 “The PCI could be replaced by other computer buses, including but not limited to PCI-X, PCI-Express, HyperTransport, Universal Serial Bus (USB) and Front-Side Bus (FSB).” , column 10, lines 49-53, “Altera Stratix-II FPGAs further support various high-speed external memory interfaces, including double data rate (DDR) SDRAM and DDR2 SDRAM, RLDRAM II, quad data rate (QDR) II SRAM, and single data rate (SDR) SDRAM.”)
	and the calculation method of the neural network performs calculation using input data and a weight parameter with the neural network, the calculation method comprising: storing a first weight parameter, which is a part of the weight parameter, to the internal memory; storing a second weight parameter, which is a part of the weight parameter, (Srinivasan, in the abstract recites in part: “On the contrary, the on-chip synaptic storage designed using a conventional 6T SRAM is susceptible to bitcell failures at reduced voltages. However, the intrinsic error resiliency of neural networks to small synaptic weight perturbations enables us to scale the operating voltage of the 6T SRAM. Our analysis on a widely used digit recognition dataset indicates that the voltage can be scaled by 200 mV from the nominal operating voltage (950 mV) for practically no loss (less than 0.5%) in accuracy (22 nm predictive technology). Scaling beyond that causes substantial performance degradation owing to increased probability of failures in the MSBs of the synaptic weights. We, therefore propose a significance driven hybrid 8T-6T SRAM, wherein the sensitive MSBs are stored in 8T bitcells that are robust at scaled voltages due to decoupled read and write paths. In an effort to further minimize the area penalty, we present a synaptic-sensitivity driven hybrid memory architecture consisting of multiple 8T-6T SRAM banks” Where the original weight parameter is divided into the stated MSB (which is the first weight parameter) and the unstated LSB which is the remainder of the original weight parameter after the MSB portion is removed (which is the second weight parameter).
 to the external memory; (Lee in paragraph 4 recites in part “External memory can also be used to store a large number of weights used in the feature classification layers.”) reading the first weight parameter from the internal memory (Lee in paragraph 67, recites in part “For example, if there are 16 million 16-bit parameters for layer 755 and 32 KB of storage for weights 750 in internal memory 705, the NN engine reads a subset of weights […]”)
and reading the second weight parameter from the external memory when the calculation is performed; (Lee, in claim 6, recites in part “the method of claim 5, wherein the processing the plurality of output feature maps of the plurality of images through the feature classification layer comprises: loading a first plurality of weights of the feature classification layer from an external memory into the internal memory of the processor;” Where loading the entire weight parameter into internal memory inherently loads the second weight parameter into internal memory.)
and preparing the weight parameter required for the calculation in the calculation device and performing the calculation.  (Figure 3 of Srinivasan, shown below, shows how the weights for each layer of the neural network are stored in different memories (see configuration 1 or 2). Preparing the weight parameter is just rejoining the two halves of the weight after reading them from the arrays, and thus it is inherent.)
[Figure 3 of Srinivasan, media_image1.png]

It would have been obvious to anyone of ordinary skill in the art at the time of the claimed invention to combine the teachings of the Srinivasan/Lee combination with the teachings of Xu with the motivation of using internal memories within the same device as the calculation circuits in order to achieve both high bandwidth and low latency (Xu col 15, lines 30 to 66, recites in part “Temporary data structures, such as intermediate variables, parameters, and so forth, and results, e.g., the learned model, could be stored in the onboard memory (such as the onboard memory 108) or registers inside the FPGA, which would act as high bandwidth, low latency cache. The data could be utilized without needing to access memory off of the FPGA, which would enhance the access speed of the cache.”)
Regarding claim 14,
The Srinivasan/Lee/Xu combination teaches the calculation method of the neural network according to claim 11, wherein the external memory stores the entire weight parameter including both of the first weight parameter and the second weight parameter, and among them, a part corresponding to the first weight parameter is transferred to the internal memory.  (Lee, in claim 6, recites in part “the method of claim 5, wherein the processing the plurality of output feature maps of the plurality of images through the feature classification layer comprises: loading a first plurality of weights of the feature classification layer from an external memory into the internal memory of the processor;” Where loading the entire weight parameter into internal memory inherently loads the first weight parameter into internal memory.)
Claims 2, 7, 12 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over the Srinivasan/Lee combination as applied to claim 1 above, and further in view of Yu et al. (US 20140257803 A1).  
Regarding claim 2,
The Srinivasan/Lee combination teaches the calculation system according to claim 1, wherein the first weight parameter is a set of predetermined lower digits (Srinivasan, in section 1, on page 152, recites in part “We, therefore propose a significance driven hybrid 8T-6T SRAM, wherein the sensitive MSBs of the synaptic weights are stored in 8T bitcells while the relatively resilient LSBs are stored in 6T bitcells.” Where the LSBs are the predetermined lower digits of the weight values).
The Srinivasan/Lee combination does not teach the weight parameter whose absolute value is equal to or less than a predetermined threshold value (see the teachings of Yu, discussed below).
The Srinivasan/Lee combination teaches and the second weight parameter is a set of part of the weight parameter other than the first weight parameter. (As shown above, Srinivasan on page 152 recites in part “We, therefore propose a significance driven hybrid 8T-6T SRAM, wherein the sensitive MSBs of the synaptic weights are stored in 8T bitcells while the relatively resilient LSBs are stored in 6T bitcells.”  Which shows that the weights are divided into two parts, a LSB part and a MSB part. )
Yu, in the same field of artificial neural networks, teaches the weight parameter whose absolute value is equal to or less than a predetermined threshold value, (Yu, paragraph 0043, recites in part “In other examples, the adapter component 114 may be configured to only adapt weights of synapses with absolute values below a predefined threshold.”  Where ‘below a predefined threshold’ is equivalent to ‘equal to or less than a predetermined threshold value’ if the predetermined threshold is simply increased by the smallest possible amount, which would have been obvious to anyone of ordinary skill in the art at the time of the claimed invention.  Where the system of Yu can be implemented on devices such as SOCs, which inherently have internal memories as referenced in claim 1.  Yu, in paragraph 0064, recites in part “Alternatively, or in addition, the functionally described herein can be performed, at least in part, by one or more hardware logic components. For example, and without limitation, illustrative types of hardware logic components that can be used include Field-programmable Gate Arrays (FPGAs), Program-specific Integrated Circuits (ASICs), Program-specific Standard Products (ASSPs), System-on-a-chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), etc.”)
 It would have been obvious to anyone of ordinary skill in the art at the time of the claimed invention to combine the teachings of the Srinivasan/Lee combination with the teachings of Yu with the motivation of being able to limit how many nodes are adapted in order to ease time constraints (Yu, in paragraph 0034, recites in part “In other embodiments, the adapter component 114 can adapt a subset of parameters of the DNN 106. For instance, the adapter component 114 can cause parameters of a single hidden layer to be adapted, can cause parameters corresponding to certain nodes to be adapted, etc. Selectively updating a subset of parameters of the DNN 106 may be beneficial in situations where the computing device 102 has received a relatively large amount of speech data from the user 104, and there is a time constraint on the adapting of the DNN 106.” Where the quote used earlier from paragraph 0043 describes using a threshold on absolute value as a basis for choosing which weight values to adapt.).
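The relationship between “below a threshold” and “equal to or less than a threshold” relied on above can be illustrated with a short sketch. It assumes integer-valued (quantized) weights, for which the smallest possible threshold increase is 1; the function names and sample values are hypothetical and do not come from Yu.

```python
# Illustrative sketch only, assuming integer-valued weights.
# Function names and sample data are hypothetical.

def at_or_below(weights, threshold):
    """Select weights whose absolute value is equal to or less than the threshold."""
    return [w for w in weights if abs(w) <= threshold]


def strictly_below(weights, threshold):
    """Select weights whose absolute value is strictly below the threshold."""
    return [w for w in weights if abs(w) < threshold]


weights = [-9, -3, 0, 2, 3, 7]
inclusive = at_or_below(weights, 3)     # keeps -3 and 3
exclusive = strictly_below(weights, 3)  # drops -3 and 3
shifted = strictly_below(weights, 4)    # strict test with threshold raised by 1
```

Under the integer assumption, the strict comparison with the threshold raised by one selects the same weights as the inclusive comparison; for non-integer weights the step size would depend on the numeric representation.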
Regarding claim 7,
Claim 7 is directed towards a method that is substantially identical to what is recited in claim 2.  Therefore the rejections to claim 2 apply equally here.
Regarding claim 12,
Claim 12 is directed towards a method that is substantially identical to what is recited in claim 2.  Therefore the rejections to claim 2 apply equally here.
Regarding claim 13,
Claim 13 is directed towards a method that is substantially identical to what is recited in claim 2.  Therefore the rejections to claim 2 apply equally here.
Claims 3 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over the Srinivasan/Lee combination as applied to claim 1 above, and further in view of Xu and Liaw et al. (US 7,307,871 B2).
Regarding claim 3,
The Srinivasan/Lee combination teaches the calculation system according to claim 1, (see the discussion of claim 1 above) but it does not teach wherein the calculation circuit is constituted by an FPGA (Field-Programmable Gate Array), the internal memory is an SRAM (Static Random Access Memory), and the external memory
Xu, in the same field of artificial neural networks, teaches wherein the calculation circuit is constituted by an FPGA (Field-Programmable Gate Array), the internal memory is an SRAM (Static Random Access Memory), (Xu, in column 15 lines 20 to 22, recites “The accelerator system supports hierarchical memory organization and access methods using SDRAM, SRAM and RAM/registers within the FPGA.”) and the external memory (Xu in col 10, lines 49-53, recites in part “Altera Stratix-II FPGAs further Support various high-speed external memory interfaces, including double data rate (DDR) SDRAM and DDR2 SDRAM, RLDRAM II, quad data rate (QDR) II SRAM, and single data rate (SDR) SDRAM.”)
Neither the Srinivasan/Lee combination nor Xu teaches that the external memory is a memory superior to the SRAM in a soft error resistance.
Liaw, in the applicant mentioned field of soft error resistant memories, teaches is a memory superior to the SRAM in a soft error resistance.  (Liaw recites, in column 3 lines 19 to 21 “The present invention describes an apparatus and method to reduce soft error rate of a SRAM memory cell.”  Where the SRAM with the inferior soft error resistance is the conventional internal SRAM, as opposed to the soft error resistant SRAM taught by Liaw).
It would have been obvious to anyone of ordinary skill in the art at the time of the claimed invention to combine the teachings of the Srinivasan/Lee combination with the teachings of Liaw in order to store the more sensitive MSBs of the weights in a more soft error resistant SRAM memory that could be external, with the motivation that soft error rates in SRAM are a known problem (see Liaw, col 1, lines 63-64, “Thus, a need exists to provide a memory cell that offers improved protection against soft errors.”).
Regarding claim 15,
Claim 15 is directed towards a method that is substantially identical to what is recited in claim 3.  Therefore the rejections to claim 3 apply equally here.
Claims 5 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over the Srinivasan/Lee combination as applied to claim 1 above, and further in view of Shafiee et al. (“ISAAC: A convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars”).
Regarding claim 5,
While the Srinivasan/Lee combination has taught the calculation system according to claim 1 (see the discussion of claim 1 above), the combination has not taught wherein the neural network includes at least one of a convolution layer and a full connection layer performing sum-of-products calculation, and the weight parameter is data for performing the sum-of-products calculation on the input data.
Shafiee, in the same field of artificial neural networks, teaches the calculation system according to claim 1, wherein the neural network includes at least one of a convolution layer and a full connection layer (Shafiee, in section II A, recites in part “Convolutional neural networks (CNNs) are deep neural networks primarily seen in the context of computer vision, and consist of four different types of layers: convolutional, classifier, pooling, and local response/contrast normalization (LRN/LCN). […] The classifier layer can be viewed as a special case of a convolution, with many output feature maps, each using the largest possible kernel size, i.e., a fully connected network...”  Where the ‘fully connected network’ of the classifier contains at least one full connection layer.) performing sum-of-products calculation, and the weight parameter is data for performing the sum-of-products calculation on the input data. (Shafiee, in section V, recites in part “Having addressed the input values and the DACs, we now turn our attention to the synaptic weights and the ADCs. It is impractical to represent a 16-bit synaptic weight in a single memristor cell [26]. We therefore represent one 16-bit synaptic weight with 16/w w-bit cells located in the same row. For the rest of this discussion, we assume w = 2 because it emerges as a sweet spot in our design space exploration. When an input is provided, the cells in a column perform their sum of products operations.”)
It would have been obvious to anyone of ordinary skill in the art at the time of the claimed invention to combine the teachings of the Srinivasan/Lee combination with the teachings of Shafiee in order to have the benefits of convolutional neural networks (CNNs) at image recognition (Shafiee, at the start of section II B, recites in part “We first summarize CNNs targeted at image detection and classification, such as the winners of the annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC) [58].”).
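The weight representation in the quoted passage of Shafiee (one 16-bit weight held in 16/w cells of w bits each, with w = 2) can be sketched digitally as follows. This is an assumption-laden software analogy only: Shafiee describes analog memristor crossbars, and the names `slice_weight` and `sliced_dot`, the little-endian slice order, and the shift-and-add recombination are choices made for this sketch rather than details taken from the reference.

```python
# Illustrative digital analogy only; names and slice ordering are assumptions.
WEIGHT_BITS = 16
W = 2  # bits per cell, per the quoted passage (w = 2)


def slice_weight(weight):
    """Split an unsigned 16-bit weight into 16/W slices, least significant first."""
    if not 0 <= weight < (1 << WEIGHT_BITS):
        raise ValueError("weight out of range")
    mask = (1 << W) - 1
    return [(weight >> (k * W)) & mask for k in range(WEIGHT_BITS // W)]


def sliced_dot(inputs, weights):
    """Sum of products computed per slice column, then recombined by shift-and-add."""
    sliced = [slice_weight(w) for w in weights]
    total = 0
    for k in range(WEIGHT_BITS // W):
        column_sum = sum(x * s[k] for x, s in zip(inputs, sliced))
        total += column_sum << (k * W)
    return total


inputs = [3, 5, 7]
weights = [1000, 65535, 12345]
direct = sum(x * w for x, w in zip(inputs, weights))
via_slices = sliced_dot(inputs, weights)
```

Computing the sum of products slice by slice and then weighting each partial sum by its bit position gives the same result as multiplying by the full 16-bit weights directly.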
Regarding claim 10,
While the Srinivasan/Lee combination has taught the calculation system according to claim 6, (see the discussion of claim 6 above), it does not teach wherein the intermediate layer is a convolution layer or a full connection layer.
Shafiee, in the same field of artificial neural networks teaches wherein the intermediate layer is a convolution layer or a full connection layer.  (Shafiee, in section II Background A. CNNs and DNNs, recites in part “Deep neural networks (DNNs) are a broad class of classifiers consisting of cascading layers of neural networks. Convolutional neural networks (CNNs) are deep neural networks primarily seen in the context of computer vision, and consist of four different types of layers: convolutional, classifier, pooling, and local response/contrast normalization (LRN/LCN). […] A typical algorithm in the image processing domain starts with multiple convolutional layers […] The classifier layer can be viewed as a special case of a convolution, with many output feature maps, each using the largest possible kernel size, i.e., a fully connected network.”  Which shows that the system described by Shafiee uses primarily convolution layers but also has at least one fully connected layer.  Since there are multiple convolutional layers and the classification layer comes after the convolutional layers but before the pooling layer, both the classification layer (which is fully connected) and at least one convolutional layer has to be an intermediate layer).
It would have been obvious to anyone of ordinary skill in the art at the time of the claimed invention to combine the teachings of the Srinivasan/Lee combination with the teachings of Shafiee in order to have the benefits of convolutional neural networks (CNNs) at image recognition (Shafiee, at the start of section II B, recites in part “We first summarize CNNs targeted at image detection and classification, such as the winners of the annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC) [58].”).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL EDWARD SHIPLEY whose telephone number is (408)918-7530.  The examiner can normally be reached on Monday-Thursday and alternate Fridays 7:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang can be reached on 571-270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/P.E.S./Examiner, Art Unit 2124                                                                                                                                                                                                        



/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124