DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments with respect to claim(s) 1 and 10 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-11 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 10-18 of copending Application No. 15618906 (reference application). Although the claims at issue are not identical, they are not patentably distinct from each other because the only difference appears to be that claims from the copending application 15618906 are the apparatus claims 10-18, 
This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.
Corresponding Application: 15/618,906
Instant Application: 15/812,608
Claim 10: A method, comprising:
forming an analog integrated circuit chip having a Convolutional Neural Network (CNN), the CNN including a two-dimensional (2D) array of analog elements arranged in columns and rows and being configured to simultaneously provide a plurality of outputs by duplicating, using a controllable connection weight allocation duplication factor such that a larger duplication factor results in faster execution versus a smaller duplication factor, a same final connection weight corresponding to a single one of the analog elements on a plurality of the analog elements in 
Claim 1: An apparatus, comprising:
an analog integrated circuit chip having a Convolutional Neural Network (CNN), the CNN including a two-dimensional (2D) array of analog elements arranged in columns and rows and being configured to simultaneously provide a plurality of outputs by duplicating, using a controllable connection weight allocation duplication factor such that a larger duplication factor results in faster execution versus a smaller duplication factor, a same final connection weight corresponding to a single one of the analog elements on a plurality of the analog elements in different ones of the 
Claim 11: The method of claim 10, further comprising configuring the CNN to perform a pooling operation by arranging connection weights produced by a duplication in a single column.
Claim 2: The apparatus of claim 1, wherein connection weights produced by a duplication are arranged in a single column for a pooling operation.
Claim 12: The method of claim 11, wherein the pooling operation is equivalent to a sum pooling operation.
Claim 3: The apparatus of claim 2, wherein the pooling operation is equivalent to a sum pooling operation.
Claim 13: The method of claim 10, wherein connection weights of the CNN are represented by respective electric conductances of the analog elements of the 2D array.
Claim 4: The apparatus of claim 1, wherein connection weights of the CNN are represented by respective electric conductances of the analog elements of the 2D array.
Claim 14: The method of claim 10, wherein respective voltages provided to the analog elements of the 2D array form respective inputs to the 2D array.
Claim 5: The apparatus of claim 1, wherein respective voltages provided to the analog elements of the 2D array form respective inputs to the 2D array.
Claim 15: The method of claim 14, wherein said forming step forms the analog integrated circuit chip such that the CNN further includes a set of Digital to Analog Converters for converting the respective voltages from a digital domain to an analog domain.
Claim 6: The apparatus of claim 5, further comprising a set of Digital to Analog Converters for converting the 
Claim 16: The method of claim 10, wherein respective currents, read from the columns in which the analog elements of the 2D array are arranged, form respective outputs from the 2D array.
Claim 7: The apparatus of claim 1, wherein respective currents, read from the columns in which the analog elements of the 2D array are arranged, form respective outputs from the 2D array.
Claim 17: The method of claim 16, wherein said forming step forms the analog integrated circuit chip such that the CNN further includes a set of Analog to Digital Converters for converting the respective currents from an analog domain to a digital domain.
Claim 8: The apparatus of claim 7, further comprising a set of Analog to Digital Converters for converting the respective currents from an analog domain to a digital domain.
Claim 18: The method of claim 10, wherein the 2D array of analog elements is formed in a fully connected layer of the CNN.
Claim 9: The apparatus of claim 1, wherein the 2D array of analog elements is comprised in a fully connected layer of the CNN.
Claim 10: A method, comprising:
forming an analog integrated circuit chip having a Convolutional Neural Network (CNN), the CNN including a two-dimensional (2D) array of analog elements arranged in columns and rows and being configured to simultaneously provide a plurality of outputs by duplicating, using a controllable connection weight allocation duplication factor such that a larger duplication factor results in faster execution versus a smaller duplication factor, a same final connection weight corresponding to a single one of the analog elements on a plurality of the analog elements in different ones of the columns of the 2D array, wherein the outputs are provided from the columns.
Claim 10: A system, comprising:
an integrated circuit manufacturing system configured to convert an input specification into an analog integrated circuit chip having a Convolutional Neural Network (CNN), the CNN including a two-dimensional (2D) array of analog elements arranged in columns and rows and being configured to simultaneously provide a plurality of outputs by duplicating, using a controllable connection weight allocation duplication factor such that a larger duplication factor results in faster execution versus a smaller duplication factor, a same final connection weight corresponding to a single one of the analog elements on a plurality of the analog elements in different ones of the columns of the 2D array, wherein the outputs are provided from the columns.
Claim 18: The method of claim 10, wherein the 2D array of analog elements is formed in a fully connected layer of the CNN.
Claim 11: The system of claim 10, wherein the 2D array of analog elements is comprised in a fully connected layer of the CNN.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2 and 4-11 are rejected under 35 U.S.C. 103 as being unpatentable over Xia et al. (“Switched by input: Power efficient structure for RRAM-based convolutional neural network”; hereinafter Xia) and Shafiee et al. (“ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars”; hereinafter Shafiee).
Regarding Claim 1,
Xia teaches an apparatus, comprising: 
an analog integrated circuit chip having a Convolutional Neural Network (CNN) (pg. 1, col. 2; Since RRAM is used as an analog computing device, the large amounts of analog intermediate results are difficult to be directly stored, while some other functions in CNN like max pooling are difficult to implement in analog circuits.), the CNN including a two-dimensional (2D) array of analog elements arranged in columns and rows (pg. 2, section 2.2; For example, for the Conv layer containing 64 kernels in 3×3×3 size, we can use 27 × 64 RRAM crossbar to store all the 64 kernels, where each RRAM column stores the weights of a specific 3 × 3 × 3 Conv kernel.) and being configured to simultaneously provide a plurality of outputs by duplicating, …, a same final connection weight corresponding to a single one of the analog elements on a plurality of the analog elements in different ones of the columns of the 2D array (pg. 4, Section 4.2; To deal with this problem, we use an additional RRAM column, which is also selected by input data, to implement the dynamic threshold, as Fig. 4 shows. If we map k on the extra “input” port and store the bias $w_0$ into the cells of additional RRAM column, the output signal of the rightmost column is the dynamic part of threshold $k \sum_{in_j=1} w_0$.), wherein the outputs are provided from the columns (pg. 2, section 2.2; If the “matrix” is stored by the conductivity of the RRAM devices and the “vector” by the input voltage signals, the RRAM crossbar is able to perform analog matrix-vector multiplication (or vector-vector inner product). The relationship between the input and output voltage can be expressed as in Equ. (3) [3]: $i_{out,k} = \sum_{j=1}^{N} g_{k,j} \cdot v_{in,j}$ (3), where $v_{in}$ is the input voltage (denoted by j = 1, 2, ..., N), $i_{out}$ is the output current (denoted by k = 1, 2, ..., M), and $g_{k,j}$ is the conductance of RRAM device representing the matrix data.).


Xia does not explicitly disclose
using a controllable connection weight allocation duplication factor such that a larger duplication factor results in faster execution versus a smaller duplication factor.
However, Shafiee teaches
using a controllable connection weight allocation duplication factor such that a larger duplication factor results in faster execution versus a smaller duplication factor (pg. 24; Thus, on early layers, ISAAC has a far higher CE than early layers of DaDianNao – the actual speedups vary depending on the degree of replication for each layer. And pg. 18; In essence, the synaptic weights for layer i−1 are replicated in a different crossbar array so that two different input vectors can be processed in parallel to produce two output values in one cycle. And pg. 18; One way to reduce the sequential 16-cycle delay is to replicate the synaptic weights on (say) two IMAs… In essence, if half the IMAs on a chip are not utilized, we can replicate all the weights and roughly double system throughput.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine Xia’s implementation of a CNN using a crossbar array with Shafiee’s implementation of a CNN using a crossbar array.
Doing so would allow for improved computational efficiency and power efficiency (Abs.; On a suite of CNN and DNN workloads, the proposed ISAAC architecture yields improvements of 14.8×, 5.5×, and 7.5× in throughput, energy, and computational density (respectively), relative to the state-of-the-art DaDianNao architecture.).
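For illustration only, the two quoted teachings can be sketched together in Python: the column-current relationship of Xia's Equ. (3) and the replication of one set of weights across several columns described in Shafiee. The sketch is not taken from either reference. The function names, the one-dimensional convolution, and the shifted-column layout are assumptions chosen to keep the example small; the point is only that a larger duplication factor yields more outputs per crossbar read and therefore fewer reads.

```python
def crossbar_read(g, v):
    """Column currents per Equ. (3): i_out[k] = sum_j g[j][k] * v[j].

    g[j][k] is the conductance at row j, column k (hypothetical layout);
    v[j] is the input voltage applied to row j.
    """
    rows, cols = len(g), len(g[0])
    return [sum(g[j][k] * v[j] for j in range(rows)) for k in range(cols)]


def duplicated_kernel_array(kernel, factor):
    """Place `factor` copies of the same kernel weights in different columns.

    Column c holds the kernel shifted down by c rows, so one read of the
    array produces `factor` neighbouring convolution outputs at once.
    """
    rows = len(kernel) + factor - 1
    g = [[0.0] * factor for _ in range(rows)]
    for col in range(factor):
        for i, w in enumerate(kernel):
            g[col + i][col] = w
    return g


def conv1d_on_crossbar(signal, kernel, factor):
    """Stride-1 convolution; returns (outputs, number of crossbar reads)."""
    n_out = len(signal) - len(kernel) + 1
    g = duplicated_kernel_array(kernel, factor)
    rows = len(g)
    outputs, reads = [], 0
    for start in range(0, n_out, factor):
        window = signal[start:start + rows]
        window = window + [0.0] * (rows - len(window))  # zero-pad last read
        outputs.extend(crossbar_read(g, window))
        reads += 1
    return outputs[:n_out], reads


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    kernel = [1.0, 0.0, -1.0]
    for factor in (1, 2, 4):
        print(factor, conv1d_on_crossbar(signal, kernel, factor))
```

In this toy setting the outputs are identical for every duplication factor, while the number of reads drops as the factor grows, which is the sense in which replication trades array area for execution speed.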
Regarding Claim 2,
Xia and Shafiee teach the apparatus of claim 1. Xia further teaches wherein connection weights produced by a duplication are arranged in a single column for a pooling operation (pg. 2; Spatial Pooling merges the neighbor area of input feature map, which chooses the maximum value of input blocks. And pg. 4, Section 4.2; To deal with this problem, we use an additional RRAM column, which is also selected by input data, to implement the dynamic threshold, as Fig. 4 shows. If we map k on the extra “input” port and store the bias $w_0$ into the cells of additional RRAM column, the output signal of the rightmost column is the dynamic part of threshold $k \sum_{in_j=1} w_0$.).


Regarding Claim 4,
Xia and Shafiee teach the apparatus of claim 1. Xia further teaches wherein connection weights of the CNN are represented by respective electric conductances of the analog elements of the 2D array (pg. 2, section 2.2; The relationship between the input and output voltage can be expressed as in Equ. (3) [3]: $i_{out,k} = \sum_{j=1}^{N} g_{k,j} \cdot v_{in,j}$ (3), where $v_{in}$ is the input voltage (denoted by j = 1, 2, ..., N), $i_{out}$ is the output current (denoted by k = 1, 2, ..., M), and $g_{k,j}$ is the conductance of RRAM device representing the matrix data.).
Regarding Claim 5,
Xia and Shafiee teach the apparatus of claim 1. Xia further teaches wherein respective voltages provided to the analog elements of the 2D array form respective inputs to the 2D array (pg. 2, section 2.2; If the “matrix” is stored by the conductivity of the RRAM devices and the “vector” by the input voltage signals, the RRAM crossbar is able to perform analog matrix-vector multiplication (or vector-vector inner product).).
Regarding Claim 6,
Xia and Shafiee teach the apparatus of claim 5. Xia further teaches a set of Digital to Analog Converters for converting the respective voltages from a digital domain to an analog domain (pg. 3, section 3.2; As a result, the input layer of whole CNN still needs DACs to transfer input pictures into analog signals for RRAM-based design.).
Regarding Claim 7,
Xia and Shafiee teach the apparatus of claim 1. Xia further teaches wherein respective currents, read from the columns in which the analog elements of the 2D array are arranged, form respective outputs from the 2D array (pg. 2, section 2.2; The relationship between the input and output voltage can be expressed as in Equ. (3) [3]: $i_{out,k} = \sum_{j=1}^{N} g_{k,j} \cdot v_{in,j}$ (3), where $v_{in}$ is the input voltage (denoted by j = 1, 2, ..., N), $i_{out}$ is the output current (denoted by k = 1, 2, ..., M), and $g_{k,j}$ is the conductance of RRAM device representing the matrix data.).
Regarding Claim 8,
Xia and Shafiee teach the apparatus of claim 7. Xia further teaches a set of Analog to Digital Converters for converting the respective currents from an analog domain to a digital domain (fig. 2 (b); pg. 3, section 4; However, the ADCs are still demanded when the output signals of RRAM crossbars need to be merged with other results instead of directly quantized by threshold processing.).
Regarding Claim 9,
Xia and Shafiee teach the apparatus of claim 1. Xia further teaches wherein the 2D array of analog elements is comprised in a fully connected layer of the CNN (pg. 2; Fully-Connected Layers are the final layers that all inputs and outputs are connected by weights like Artificial Neural Network (ANN). Generally, the function of FC layer can be regarded as: $output_i = f(\sum_j w_{i,j} \times input_j + b_i)$ (2), where $\overrightarrow{input} = \{input_1, input_2, ..., input_n\}$ is the input vector of layer denoted by j, $\overrightarrow{output} = \{output_1, output_2, ..., output_m\}$ is the output vector of layer denoted by i. b is the bias vector that is only used in FC layer, and $W = (w_{i,j})_{m \times n}$ is the weight matrix).
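As a purely illustrative aid, the fully connected layer relationship quoted from Xia as Equ. (2) can be written out in a few lines of Python. The function name is hypothetical, and the choice of ReLU for the activation f is an assumption, since the quoted passage does not specify f.

```python
def fully_connected(weights, inputs, bias, activation=lambda x: max(0.0, x)):
    """Evaluate output_i = f(sum_j w[i][j] * input_j + b_i) for every i.

    weights is an m x n list of lists, inputs has length n, bias has length m.
    The default activation (ReLU) is an assumption made for this sketch.
    """
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, bias)
    ]


if __name__ == "__main__":
    W = [[1.0, 2.0], [3.0, -4.0]]
    print(fully_connected(W, [1.0, 1.0], [0.5, 0.0]))
```

Each output is one weighted sum over all inputs, which is the same inner-product form as a single crossbar column in Equ. (3).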
Regarding Claim 10,
Xia teaches a system, comprising: 
an integrated circuit manufacturing system configured to convert an input specification into an analog integrated circuit chip having a Convolutional Neural Network (CNN) (pg. 1, col. 2; Since RRAM is used as an analog computing device, the large amounts of analog intermediate results are difficult to be directly stored, while some other functions in CNN like max pooling are difficult to implement in analog circuits.), the CNN including a two-dimensional (2D) array of analog elements arranged in columns and rows (pg. 2, section 2.2; For example, for the Conv layer containing 64 kernels in 3×3×3 size, we can use 27 × 64 RRAM crossbar to store all the 64 kernels, where each RRAM column stores the weights of a specific 3 × 3 × 3 Conv kernel.) and being configured to simultaneously provide a plurality of outputs by duplicating, …, a same final connection weight corresponding to a single one of the analog elements on a plurality of the analog elements in different ones of the columns of the 2D array (pg. 4, Section 4.2; To deal with this problem, we use an additional RRAM column, which is also selected by input data, to implement the dynamic threshold, as Fig. 4 shows. If we map k on the extra “input” port and store the bias $w_0$ into the cells of additional RRAM column, the output signal of the rightmost column is the dynamic part of threshold $k \sum_{in_j=1} w_0$.), wherein the outputs are provided from the columns (pg. 2, section 2.2; If the “matrix” is stored by the conductivity of the RRAM devices and the “vector” by the input voltage signals, the RRAM crossbar is able to perform analog matrix-vector multiplication (or vector-vector inner product). The relationship between the input and output voltage can be expressed as in Equ. (3) [3]: $i_{out,k} = \sum_{j=1}^{N} g_{k,j} \cdot v_{in,j}$ (3), where $v_{in}$ is the input voltage (denoted by j = 1, 2, ..., N), $i_{out}$ is the output current (denoted by k = 1, 2, ..., M), and $g_{k,j}$ is the conductance of RRAM device representing the matrix data.).



Xia does not explicitly disclose
…using a controllable connection weight allocation duplication factor such that a larger duplication factor results in faster execution versus a smaller duplication factor.
However, Shafiee teaches
…using a controllable connection weight allocation duplication factor such that a larger duplication factor results in faster execution versus a smaller duplication factor (pg. 24; Thus, on early layers, ISAAC has a far higher CE than early layers of DaDianNao – the actual speedups vary depending on the degree of replication for each layer. And pg. 18; In essence, the synaptic weights for layer i−1 are replicated in a different crossbar array so that two different input vectors can be processed in parallel to produce two output values in one cycle. And pg. 18; One way to reduce the sequential 16-cycle delay is to replicate the synaptic weights on (say) two IMAs… In essence, if half the IMAs on a chip are not utilized, we can replicate all the weights and roughly double system throughput.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine Xia’s implementation of a CNN using a crossbar array with Shafiee’s implementation of a CNN using a crossbar array.
Doing so would allow for improved computational efficiency and power efficiency (Abs.; On a suite of CNN and DNN workloads, the proposed ISAAC architecture yields improvements of 14.8×, 5.5×, and 7.5× in throughput, energy, and computational density (respectively), relative to the state-of-the-art DaDianNao architecture.).

Regarding Claim 11,
Xia and Shafiee teach the system of claim 10. Xia further teaches wherein the 2D array of analog elements is comprised in a fully connected layer of the CNN (pg. 2; Fully-Connected Layers are the final layers that all inputs and outputs are connected by weights like Artificial Neural Network (ANN). Generally, the function of FC layer can be regarded as: $output_i = f(\sum_j w_{i,j} \times input_j + b_i)$ (2), where $\overrightarrow{input} = \{input_1, input_2, ..., input_n\}$ is the input vector of layer denoted by j, $\overrightarrow{output} = \{output_1, output_2, ..., output_m\}$ is the output vector of layer denoted by i. b is the bias vector that is only used in FC layer, and $W = (w_{i,j})_{m \times n}$ is the weight matrix).

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Xia et al. (“Switched by input: Power efficient structure for RRAM-based convolutional neural network”; hereinafter Xia) in view of Shafiee et al. (“ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars”; hereinafter Shafiee) and Ji et al. (“NEUTRAMS: Neural Network Transformation and Co-design under Neuromorphic Hardware Constraints”; hereinafter Ji).
Regarding Claim 3,
Xia and Shafiee teach the apparatus of claim 2. 
Xia and Shafiee do not explicitly disclose
wherein the pooling operation is equivalent to a sum pooling operation.
However, Ji teaches
The apparatus of claim 2, wherein the pooling operation is equivalent to a sum pooling operation (pg. 8; the 3rd is sum pooling and the output is 2 × 12 × 12;).

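It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the crossbar-based CNN of Xia and Shafiee with Ji’s sum pooling operation.
Doing so would allow for a faster processing speed (pg. 13; Compared with other types with the same configuration, this type owns the highest speed and the lowest power dissipation: its speed can reach 2 to 2.66 times as much as that of the other two, while the core power dissipation is only 1/13 to 1/25 of the others.).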
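For illustration only, a sum pooling operation of the kind named in the passage quoted from Ji can be sketched in Python as follows. The function name, the non-overlapping square window, and the 4 × 4 example input are assumptions made for the sketch; they are not taken from Ji, Xia, or Shafiee.

```python
def sum_pool2d(feature_map, size):
    """Sum pooling over non-overlapping size x size windows.

    Each output value is the plain sum of the inputs in its window, which is
    the same result as a weighted sum in which every weight equals 1.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            sum(
                feature_map[r + dr][c + dc]
                for dr in range(size)
                for dc in range(size)
            )
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]


if __name__ == "__main__":
    fm = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16],
    ]
    print(sum_pool2d(fm, 2))
```

Under these assumptions a 2 × 2 window halves each spatial dimension, so a 24 × 24 map would pool to 12 × 12; the input size in that remark is likewise an assumption and is not stated in the quoted passage.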
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Nestler et al. “Convolutional Neural Network” (US 20170169327 A1) – This prior art discloses an analog circuit comprising an array of capacitors for storing weights.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HENRY K NGUYEN whose telephone number is (571)272-0217.  The examiner can normally be reached on Mon - Fri 7:00am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/HENRY NGUYEN/Examiner, Art Unit 2121
/BABOUCARR FAAL/Primary Examiner, Art Unit 2184