DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 03/01/2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Terminal Disclaimer
The terminal disclaimer filed on 07/28/2022 disclaiming the terminal portion of any patent granted on this application which would extend beyond the expiration date of Patent Number 10,936,891 has been reviewed and is accepted.  The terminal disclaimer has been recorded.
EXAMINER'S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

Authorization for this examiner’s amendment was given in an interview with Jay Beale (Reg# 50,901) on 07/28/2022.

The application has been amended as follows: 
1. (Currently Amended) A memory device, comprising: 
a buffer die communicating with an external graphic processor; and 
a plurality of memory dies to store learning data, weights, and a learned object recognition model received from the external graphic processor, the plurality of memory dies communicating with the buffer die,
wherein the buffer die includes a processor-in-memory circuit configured to receive the learning data and the weights from the plurality of memory dies, to divide a feature vector extracted from an input data into a first sub feature vector and a second sub feature vector, to provide the first sub feature vector to the external graphic processor, to receive the learned object recognition model from the external graphic processor, to perform a first calculation to apply the second sub feature vector and the weights to the learned object recognition model to generate a second object recognition result and to merge the second object recognition result and a first object recognition result generated by the external graphic processor to provide a merged object recognition result to a user.

22. (Currently Amended) The memory device of claim 1, wherein: 
the plurality of memory dies are stacked on the buffer die, 
the external graphic processor includes an artificial neural network engine configured to perform a second calculation to apply the first sub feature vector and the weights to the learned object recognition model to generate the first object recognition result, and 
the memory device further includes: a plurality of through silicon vias (TSVs) extending through the plurality of memory dies to connect to the buffer die.

23. (Currently Amended) The memory device of claim 22, wherein the processor-in-memory circuit includes: 
a data distributor to receive the feature vector from at least some of the plurality of memory dies, to divide the feature vector into the first sub feature vector and the second sub feature vector, and to provide the first sub feature vector to the external graphic processor; 
a multiplication and accumulation (MAC) circuit to receive the second sub feature vector from the data distributor, to apply the weights to the second sub feature vector from the data distributor, and to perform the second calculation to output the second object recognition result; and 
a controller to control the MAC circuit.

27. (Currently Amended) The memory device of claim 1, wherein: 
the first sub feature vector and the second sub feature vector include at least some duplicate data, 
the external graphic processor is to provide the memory device with an intermediate operation result on the at least some duplicate data, and 
the buffer die is to perform the first calculation using the intermediate operation result on the at least some duplicate data.

30. (Currently Amended) The semiconductor package of claim 28, wherein each of the one or more stacked memory devices includes: 
a buffer die to communicate with the graphic processor and an external device; 
a plurality of memory dies stacked on the buffer die; and 
a plurality of through silicon vias (TSVs) extending through the plurality of memory dies to connect to the buffer die, 
wherein each of the plurality of memory dies includes a memory cell array which includes a plurality of dynamic memory cells coupled to a plurality of word-lines and a plurality of bit-lines, and the plurality of dynamic memory cells store the learning data, the weights and the feature vector, and 
wherein the buffer die includes a processor-in-memory circuit connected to the plurality of memory dies through the plurality of TSVs, and the processor-in-memory circuit divides the feature vector into the first sub feature vector and the second sub feature vector, and performs the first calculation.

32. (Currently Amended) The semiconductor package of claim 31, further comprising: 
a central processing unit (CPU) to communicate with the graphic processor and the one or more stacked memory devices through a bus, 
wherein the CPU includes a system software to control the data distributor and the controller, and 
wherein the system software is to determine a division ratio of the first sub feature vector and the second sub feature vector.

34. (Currently Amended) A method of operating a memory device including a buffer die and a plurality of memory dies, the method comprising: 
storing, in the plurality of memory dies, learning data, weights and a learned object recognition model received from an external graphic processor; 
dividing, by a data distributor in the buffer die, a feature vector associated with an input data into a first sub feature vector and a second sub feature vector to provide the first sub feature vector to the external graphic processor; 
performing, by a multiplication and accumulation (MAC) circuit in the buffer die, a first calculation to apply the second sub feature vector and the weights to the learned object recognition model to provide a second object recognition result; and 
merging, by a pooler in the buffer die, the second object recognition result and a first object recognition result received from the external graphic processor to provide a merged object recognition result to a user, the first object recognition result being generated in the external graphic processor by performing a second calculation, 
wherein the first calculation and the second calculation are performed in parallel with each other.
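For illustration only, the data flow recited in claim 34 (divide, two calculations performed in parallel, merge) can be sketched in software as below. This is a minimal sketch under stated assumptions, not the claimed hardware: the function names `divide`, `mac`, and `recognize`, the `ratio` parameter, the use of a dot product as the "calculation", and the use of summation as the merge step are all hypothetical choices made for the example. Two worker threads stand in for the external graphic processor and the MAC circuit in the buffer die.

```python
from concurrent.futures import ThreadPoolExecutor


def divide(feature_vector, ratio):
    """Split a feature vector into a first and a second sub feature vector.

    `ratio` is a hypothetical division ratio (fraction assigned to the first part).
    """
    cut = int(len(feature_vector) * ratio)
    return feature_vector[:cut], feature_vector[cut:]


def mac(sub_vector, weights):
    """Multiply-and-accumulate: sum of element-wise products."""
    acc = 0.0
    for x, w in zip(sub_vector, weights):
        acc += x * w
    return acc


def recognize(feature_vector, weights, ratio=0.5):
    """Divide, run both partial calculations concurrently, then merge."""
    first_sub, second_sub = divide(feature_vector, ratio)
    first_w, second_w = divide(weights, ratio)
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Stand-in for the second calculation on the external graphic processor.
        first = pool.submit(mac, first_sub, first_w)
        # Stand-in for the first calculation on the MAC circuit in the buffer die.
        second = pool.submit(mac, second_sub, second_w)
        # Assumed merge step: add the two partial results.
        return first.result() + second.result()
```

With these assumptions, the merged result equals the result of one multiply-and-accumulate pass over the undivided vector, whatever division ratio is chosen.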
Allowable Subject Matter
Claims 1 and 21-35 are allowed.
The following is an examiner’s statement of reasons for allowance: 
With regard to claim 1, Chatterjee et al. (US 2020/0117700) discloses dividing an input vector into a plurality of sub vectors and then performing calculations to apply each sub vector to the model and provide respective result sub vectors, where the result sub vectors are concatenated into a result vector; however, there is no indication that a buffer die divides the input vector into the plurality of sub vectors, provides the first sub feature vector to a graphic processor, and performs a first calculation to apply a second sub feature vector and the weights to the model to generate a second object recognition result, that the graphic processor generates a first object recognition result, and that the buffer die merges the second object recognition result and the first object recognition result to provide a merged object recognition result to the user.  Marcus et al. (US 2016/0162779) discloses dividing a plurality of vectors into a training set and a validating set and using the training set to generate the predictive model and the validating set to validate the predictive model; however, it does not disclose that a buffer die generates a second object recognition result and a graphic processor generates a first object recognition result, where the first and second object recognition results are merged and provided to a user.  Goyal et al. (US 2017/0316312) discloses dividing a large size image into smaller image portions and inputting the smaller image portions into tensor engines that perform a portion/sub-task of the neural network processing task in parallel. However, Goyal et al. does not disclose or suggest that a buffer die performs a first calculation to apply a second sub feature vector and the weights to the neural network to generate a second object recognition result and a graphic processor generates a first object recognition result, where the object recognition results are merged.
Lea (US 2018/0276539) discloses a processor-in-memory circuit to train and generate recognition results; however, there is no mention of the remaining limitations of the claim.  Thus, while different prior art references disclose parts of the claim, none of them discloses, or provides reasonable motivation to combine with the others to disclose, all of the limitations of the claim as a whole.
With regard to claims 21-27, they are dependent on allowed claim 1.
With regard to claim 28, Marcus et al. (US 2016/0162779) discloses dividing a plurality of vectors into a training set and a validating set and using the training set to generate the predictive model and the validating set to validate the predictive model; however, the calculations to apply the training set and to apply the validating set are not performed in parallel. Goyal et al. (US 2017/0316312) discloses dividing a large size image into smaller image portions and inputting the smaller image portions into tensor engines that perform a portion/sub-task of the neural network processing task in parallel. However, Goyal et al. does not disclose or suggest that at least one of one or more stacked memory devices, which store learning data and weights, performs a first calculation to apply a second sub feature vector and the weights to the neural network to provide a second object recognition result while the artificial neural network engine of a graphic processor performs a second calculation to apply a first sub feature vector and the weights to the neural network to provide a first object recognition result, where the second calculation is performed in parallel with the first calculation. Xie et al. (US 2018/0157969) discloses that each input data is divided into sub-blocks and that the convolution and pooling unit of a convolutional neural network performs the convolution and pooling operations on the sub-blocks in parallel; however, that is not the same as at least one of the one or more stacked memory devices dividing a feature vector into a first sub feature vector and a second sub feature vector and performing a first calculation to apply the second sub feature vector and weights to the neural network, and an artificial neural network engine of a graphic processor performing a second calculation to apply the first sub feature vector and weights to the neural network, where the first and second calculations are performed in parallel. Chatterjee et al. (US 2020/0117700) discloses dividing an input vector into a plurality of sub vectors and then performing calculations to apply each sub vector to the model and provide respective result sub vectors, where the result sub vectors are concatenated into a result vector; however, there is no indication that one or more stacked memory devices, which store learning data and weights of a learned object recognition model, perform a first calculation to apply a second sub vector and the weights to the learned object recognition model to provide a second result sub vector, and that an artificial neural network engine of a graphic processor performs a second calculation to apply a first sub vector and the weights to the learned object recognition model to provide a first result sub vector, where the second calculation is performed in parallel with the first calculation. Thus, while different prior art references disclose parts of the claim, none of them discloses, or provides reasonable motivation to combine with the others to disclose, all of the limitations of the claim as a whole.
With regard to claims 29-33, they are dependent on allowed claim 28.
With regard to claim 34, Marcus et al. (US 2016/0162779) discloses dividing a plurality of vectors into a training set and a validating set and using the training set to generate the predictive model and the validating set to validate the predictive model; however, the calculations to apply the training set and to apply the validating set are not performed in parallel. Goyal et al. (US 2017/0316312) discloses dividing a large size image into smaller image portions and inputting the smaller image portions into tensor engines that perform a portion/sub-task of the neural network processing task in parallel. However, Goyal et al. does not disclose or suggest a multiplication and accumulation circuit in a buffer die to perform a first calculation to apply a second sub feature vector and the weights to the neural network to provide a second object recognition result while the graphic processor performs a second calculation to provide a first object recognition result, where the second calculation is performed in parallel with the first calculation. Xie et al. (US 2018/0157969) discloses that each input data is divided into sub-blocks and that the convolution and pooling unit of a convolutional neural network performs the convolution and pooling operations on the sub-blocks in parallel; however, that is not the same as a data distributor in the buffer die dividing a feature vector into a first sub feature vector and a second sub feature vector, a multiplication and accumulation circuit in the buffer die performing a first calculation to apply the second sub feature vector and weights to the neural network, and a graphic processor performing a second calculation, where the first and second calculations are performed in parallel. Chatterjee et al. (US 2020/0117700) discloses dividing an input vector into a plurality of sub vectors and then performing calculations to apply each sub vector to the model and provide respective result sub vectors, where the result sub vectors are concatenated into a result vector; however, there is no indication that a multiplication and accumulation circuit in a buffer die performs a first calculation to apply a second sub vector and the weights to the learned object recognition model to generate a second result sub vector, and that a graphic processor performs a second calculation to generate a first result sub vector, where the second calculation is performed in parallel with the first calculation. Thus, while different prior art references disclose parts of the claim, none of them discloses, or provides reasonable motivation to combine with the others to disclose, all of the limitations of the claim as a whole.
With regard to claim 35, it is dependent on allowed claim 34.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CAROL WANG whose telephone number is (571)272-5766. The examiner can normally be reached 9:30-3:30 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz, can be reached at (571) 272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/CAROL WANG/Primary Examiner, Art Unit 2662