Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claims 1-2, 5-10, and 13-16 are pending. Claims 3-4 and 11-12 are canceled by Applicant.
Examiner Notes
Examiner cites particular paragraphs and/or columns and lines in the references as applied to Applicant’s claims for the convenience of the Applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the Applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner. The prompt development of a clear issue requires that the replies of the Applicant meet the objections to and rejections of the claims. Applicant should also specifically point out the support for any amendments made to the disclosure. See MPEP § 2163.06.

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Authorization for Internet Communications in a Patent Application
Applicant may consider filing an Authorization for Internet Communications in a Patent Application form (http://www.uspto.gov/sites/default/files/documents/sb0439.pdf) along with the response to this office action to facilitate and expedite future communication between Applicant and the examiner. If the form is submitted then Applicant is requested to provide a contact email address in the signature block at the conclusion of the official reply.

USPTO Automated Interview Request (AIR)
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Claim Rejections - 35 USC § 103
	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1 and 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Seide et al. (US 9,477,925) (hereinafter Seide as previously cited), Son et al. (US 2018/0032867) (hereinafter Son as previously cited), Syeda-Mahmood et al. (US 2020/0184252) (hereinafter Syeda as previously cited), Hazeghi et al. (US 2019/0108678) (hereinafter Hazeghi as previously cited) and Ling et al. (US 2019/0130265) (hereinafter Ling as previously cited).

As per claim 1, the combination of references above teaches a method for accelerating deep learning, the method comprising: 
	invoking an entire deep learning architecture (Syeda [0085] and [0092] invoke operation of deep learning network), the entire deep learning architecture comprising a data operation program of a convolutional layer (Syeda [0089] training logic determines modifications of the operational parameters of convolutional layers of deep learning network) and a data operation program of a fully connecting layer (Son [0105] fully connected layers classify features transferred from previous layers); 
	obtaining the data operation program of the convolutional layer (Syeda fig. 6, block 636 and [0089] optimize objective function using training logic), discarding the data operation program of the fully connecting layer (Hazeghi [0175] remove fully connected layers), determining whether the convolutional layer needs to be divided, according to an amount of data of the convolutional layer (Seide col. 10, ll. 1-22 load balance amount of data processed equally among the number of processors for assigned layers and Ling [0073] layer may be split according to each partial input data such that the operational parameter array obtained by the splitting has a number of columns equal to the number of received plurality of partial input data) and a memory capacity of a first processor of a user terminal (Ling [0028] and [0112] before actually performing operations of a certain layer, determine which layers need to be split based on memory capacity and current processor performance), when the amount of data of the convolutional layer exceeds a maximum memory capacity of the first processor, dividing the convolutional layer according to a number of layers of the convolutional layer (Ling [0080] when an operational parameter, e.g., maximum memory capacity, is exceeded, subdivide the operational parameters), and respectively loading the divided convolutional layers to at least two first processors (Seide col. 10, ll. 1-19 assign multiple layers to corresponding multi-core processors);
	obtaining the data operation program of the fully connecting layer (Son [0105] fully connected layers classify features transferred from previous layers), loading the data operation program of the fully connecting layer to a second processor of the user terminal (Seide col. 10, ll. 1-19 assign multiple layers to a corresponding multi-core processor); and 
	inputting a result to the second processor to continue performing an operation on the fully connecting layer (Son [0105] results are used from previous, same, or subsequent fully connected layers), thereby completing the entire deep learning architecture and training on the user terminal (Son [0105] training the deep convolutional neural network using training data and multiple iterations); wherein the result is obtained by the first processor performing convolution processing on the convolutional layer (Son [0105] results are used from previous, same, or subsequent convolutional layers).
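
The sequence of steps recited in claim 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code taken from the application or from any cited reference. The names `Layer`, `plan_placement`, `size_bytes`, and the processor labels are hypothetical, and the sketch adopts one possible reading of "dividing the convolutional layer according to a number of layers", namely partitioning a stack of convolutional layers into groups of whole layers that each fit in one processor's memory.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Layer:
    name: str
    kind: str        # "conv" or "fc" (hypothetical labels)
    size_bytes: int  # hypothetical measure of the layer's amount of data


def plan_placement(layers: List[Layer], conv_capacity_bytes: int) -> Dict[str, List[str]]:
    """Assign convolutional layers to one or more 'first' processors and
    fully connected layers to a 'second' processor."""
    conv = [l for l in layers if l.kind == "conv"]
    fc = [l for l in layers if l.kind == "fc"]

    # Decide whether the convolutional stack needs to be divided at all.
    if sum(l.size_bytes for l in conv) <= conv_capacity_bytes:
        groups = [conv]
    else:
        # Divide by whole layers: start a new group whenever adding the next
        # layer would exceed the capacity of a single first processor.
        # A single layer larger than the capacity still gets its own group;
        # splitting inside one layer is outside the scope of this sketch.
        groups, current, used = [], [], 0
        for layer in conv:
            if current and used + layer.size_bytes > conv_capacity_bytes:
                groups.append(current)
                current, used = [], 0
            current.append(layer)
            used += layer.size_bytes
        groups.append(current)

    plan = {f"first_processor_{i}": [l.name for l in g] for i, g in enumerate(groups)}
    plan["second_processor"] = [l.name for l in fc]
    return plan


if __name__ == "__main__":
    net = [Layer("conv1", "conv", 40), Layer("conv2", "conv", 40),
           Layer("conv3", "conv", 40), Layer("fc1", "fc", 10)]
    print(plan_placement(net, conv_capacity_bytes=100))
```

With the hypothetical sizes above, the three convolutional layers total 120 units against a capacity of 100, so the plan uses two first processors; raising the capacity to 200 keeps all three on a single first processor.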

Son and Seide are both concerned with neural networks. Seide teaches load balancing within a neural network environment while Son teaches fully connected and convolutional layers of a neural network. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Seide in view of Son because it would provide for the existence of various values in parameters of the neural network which are advantageous in terms of a performance of a recognizer that uses the neural network. Thus, the range of values of such parameters may be limited through the example cutoff operation, which may lead to an increase in the performance of the recognizer that uses a neural network with the selectively cut off parameters. Also, the size or amount of data necessary to represent the original or the quantized parameters may be reduced by limiting values of the original or quantized parameters, and thus it is possible to achieve lightening of the original or quantized parameters through the cutoff operation. To enhance or at least maintain the performance of the recognizer while reducing the size of the data, values of such parameters may desirably be cut off to an appropriate maximum value and/or an appropriate minimum value.

Syeda and Seide are both concerned with neural networks. Seide teaches load balancing within a neural network environment while Syeda teaches invoking a deep learning architecture. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Seide and Son in view of Syeda because it would provide for a deep learning network to learn faster, implementing a simpler deep learning network architecture, e.g., a relatively smaller number of convolutional layers than would be needed to discern salient features in a deep learning network. The benefits of faster learning and reduced complexity of the deep learning network are achieved by augmenting the deep learning network and its machine learning process to process, in addition to raw image (non-annotated image) data, a pre-processed input.

Hazeghi and Seide are both concerned with neural networks. Seide teaches load balancing within a neural network environment while Hazeghi teaches removing fully connected layers. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Seide, Son, and Syeda in view of Hazeghi because a pipelined algorithm may be configured to process input data sample batches having a size that is defined to optimize a tradeoff between computation accuracy and execution efficiency. In other words, the size may maximize both computation accuracy and execution efficiency of the pipelined algorithm, and it may be more efficient to devote a particular multi-core processor to process the largest layer, while processing two or more of the smallest layers on another multi-core processor. Such grouping may further eliminate some pipeline roundtrip delays and improve efficiency.

Ling and Seide are both concerned with neural networks. Seide teaches load balancing within a neural network environment while Ling teaches splitting layers. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Seide, Son, Syeda, and Hazeghi in view of Ling because it would provide a beneficial way to improve operation parallelism and/or execution efficiency. In addition, limitation of hardware (such as the dedicated hardware accelerator) may be avoided, and thus the hardware may be used for convolution operations of weight parameters with any size. In addition, by splitting a large weight parameter into several smaller weight parameters, the high-speed memory can be ensured to completely cache the weight parameter for each operation, so that correctness of the operations may be ensured and data transportation may be reduced, both of which are beneficial for improving execution efficiency of the hardware.

As per claim 8, Seide further teaches wherein the user terminal comprises any one of a smart phone, a tablet computer, a laptop convenient computer, and a desktop computer (col. 5, ll. 46-50 and col. 6, ll. 21-24).

As per claim 9, it has similar limitations as claim 1 and is therefore rejected using the same rationale. 

Claims 2, 5, 10, and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Seide, Son, Syeda, Hazeghi, Ling, and Yang et al. (US 2019/0042840) (hereinafter Yang as previously cited).

As per claim 2, Yang teaches wherein the deep learning architecture is a neural network architecture based on VGG16 ([0104]).

Yang and Seide are both concerned with neural networks. Seide teaches load balancing within a neural network environment while Yang teaches a VGG16 architecture and dedicated convolutional hardware. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Seide, Son, Syeda, Hazeghi, and Ling in view of Yang because it would provide a way to achieve faster computations via computational performance improvement techniques using a CNN processing block, a representation of imagery data using as few bits as practical (e.g., 5-bit representation), each filter coefficient is represented as an integer with a radix point using as few bits as practical (e.g., 12-bit representation), and as a result, convolutions can thereby be performed using fixed-point arithmetic for faster computations.

As per claim 5, the combination of references above teaches wherein the first processor is a dedicated processor for convolutional layer data calculation (Yang [0013]), the dedicated processor is one of a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), and an Application Specific Integrated Circuit (ASIC) (Seide col. 5, ll. 57-64).

Yang and Seide are both concerned with neural networks. Seide teaches load balancing within a neural network environment while Yang teaches a VGG16 architecture and dedicated convolutional hardware. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Seide, Son, Syeda, Hazeghi, and Ling in view of Yang because it would provide a way to achieve faster computations via computational performance improvement techniques using a CNN processing block, a representation of imagery data using as few bits as practical (e.g., 5-bit representation), each filter coefficient is represented as an integer with a radix point using as few bits as practical (e.g., 12-bit representation), and as a result, convolutions can thereby be performed using fixed-point arithmetic for faster computations.

As per claim 10, it has similar limitations as claim 2 and is therefore rejected using the same rationale. 

As per claim 13, it has similar limitations as claim 5 and is therefore rejected using the same rationale. 

Claims 6, 14, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Seide, Son, Syeda, Hazeghi, Ling, and Fu et al. (US 2021/0158087) (hereinafter Fu as previously cited).

As per claim 6, Fu teaches wherein the data operation program of the fully connecting layer corresponds to an application, different applications correspond to different data operation programs of the fully connecting layer ([0055] different fully connected layers can be trained for different applications to obtain corresponding results).

Fu and Seide are both concerned with neural networks. Seide teaches load balancing within a neural network environment while Fu teaches different layers for different applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Seide, Son, Syeda, Hazeghi, and Ling in view of Fu because, due to the target feature map extracted by the selective pooling, the detection sub-network may be simplified into a lightweight fully connected layer. The fully connected layer may have substantially fewer parameters than a conventional classifier. Moreover, the design can also have such advantages as reduced operating time and accelerated detection. Accordingly, the detection sub-network is a very simple and efficient detection sub-network, which significantly improves the computation efficiency compared with the conventional detection sub-network that requires more layers. With the effective feature representation and detection sub-network, the architecture can achieve the effects of dimension reduction and region selection to fulfill satisfactory performance while fewer parameters and rapid test speed are ensured. In this way, the effect of dimension reduction may be achieved to lower the demands on computing resources and increase computational speed. Because only a portion of the space-related channel groups is extracted from the plurality of channel groups, the target feature map greatly improves the computational performance without missing any information.

As per claim 14, it has similar limitations as claim 6 and is therefore rejected using the same rationale. 

As per claim 16, it has similar limitations as claim 8 and is therefore rejected using the same rationale. 

Claims 7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Seide, Son, Syeda, Hazeghi, Fu, Ling, and Cleve et al. (US 2014/0201118) (hereinafter Cleve as previously cited).

As per claim 7, Cleve teaches wherein different applications corresponding to different data operation programs of the fully connecting layer comprises the number of layers in the fully connecting layer and/or the number of neurons, corresponding to different applications, are different ([0019] the number of layers and neurons can be selected differently depending upon the application).

Cleve and Seide are both concerned with neural networks. Seide teaches load balancing within a neural network environment while Cleve teaches different layers and neurons for different applications. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Seide, Son, Syeda, Hazeghi, Fu, and Ling in view of Cleve because it would provide for an inventive use of additional output clusters which are connected to lower hidden layers, while the output layer is supplied with additional error information, and as a result, the disappearance of error information for small weights is avoided.

As per claim 15, it has similar limitations as claim 7 and is therefore rejected using the same rationale. 

Response to Arguments
Applicant's arguments have been fully considered but they are not persuasive. 

In the Remarks on pg. 8, Applicant alleges that Ling does not disclose how to split the layers. The examiner respectfully disagrees. Ling in at least [0026] and [0028]-[0029] teaches determining that the weight parameters of layers in the convolution neural network (CNN) need to be split, or in other words, which layers in the CNN need to be selected as the selected layer to be split. Thus, in Ling the selected layer to be split is determined based on the weight parameters of the layers.

Applicant also argues that dividing the convolutional layer according to the number of layers of the convolutional layer, as recited in instant claim 1, is different from Ling, in which the operational parameter may be subdivided in at least one of the dimensions of depth and number of kernels. The examiner respectfully disagrees. Firstly, it should be noted that Applicant’s claimed language “dividing the convolutional layer according to a number of layers of the convolutional layer” can be considered ambiguous and self-referencing because the convolutional layer is divided according to a number of layers of the convolutional layer (i.e. itself). Furthermore, if the number of layers of the convolutional layer is one, then the dividing step would not have any effect on the claim because dividing a singular convolutional layer by one would result in the same singular convolutional layer. For at least the reasons stated above, the “dividing the convolutional layer according to a number of layers of the convolutional layer” language is not given patentable weight that further limits the claims while also being interpreted to read exactly upon the cited references.

Finally, Applicant alleges that the way of dividing the convolutional layer by number of layers is totally different from subdividing the operational parameter by number of kernels. The examiner respectfully disagrees. Ling in at least [0080] teaches that when an operational parameter, e.g., maximum memory capacity, is exceeded, the operational parameters are subdivided. As discussed above, the operational parameters (e.g. weight parameters) are associated with the layers such that the dividing or splitting of the layers is based on the splitting or dividing of the associated parameters of the layers. For at least the rationale provided above, Applicant’s arguments are respectfully traversed, and the rejections are maintained.

Relevant Art Not Cited
The prior art made of record and not relied upon is considered pertinent to Applicant’s disclosure:

Zhang et al. (US 2018/0260665) in at least [0074]-[0075] disclose splitting one convolutional layer into two convolutional layers.

Savic et al. (US 2019/0325302) disclose reducing communication bandwidth and latency for performing communication synchronization operations using a deep learning model.

Lee et al. (US 2019/0122106) disclose calculating individual update values for a weight assigned to a connection relationship between nodes included in a neural network and training the neural network.

Cui et al. (US 2020/0042362) disclose a self-adaptive batch dataset partitioning for distributed deep learning using a hybrid set of accelerators.

Bhattacharjee et al. (US 2020/0257980) disclose training optimization for neural networks with batch norm layers.

Conclusion
THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
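
The timing rule stated above can be worked through with a short Python sketch. It is an illustration only, under stated assumptions: the function names `add_months` and `shortened_period_end` and all dates in the example are hypothetical, a month is added by keeping the day of the month (clamped to the last day of a shorter month), and weekend or holiday adjustments are not modeled.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Return the same day of the month `months` later, clamped to month end."""
    idx = d.month - 1 + months
    year = d.year + idx // 12
    month = idx % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def shortened_period_end(mailing: date,
                         first_reply: Optional[date] = None,
                         advisory_mailed: Optional[date] = None) -> date:
    """End of the shortened statutory period under the rule quoted above."""
    three_months = add_months(mailing, 3)
    six_months = add_months(mailing, 6)
    end = three_months
    # If a first reply was filed within two months and the advisory action
    # was mailed after the three-month date, the period ends on the date
    # the advisory action was mailed.
    if (first_reply is not None and advisory_mailed is not None
            and first_reply <= add_months(mailing, 2)
            and advisory_mailed > three_months):
        end = advisory_mailed
    # In no event later than six months from the mailing date.
    return min(end, six_months)


if __name__ == "__main__":
    mailing = date(2023, 1, 10)  # hypothetical mailing date
    print(shortened_period_end(mailing))
    print(shortened_period_end(mailing, date(2023, 2, 20), date(2023, 4, 25)))
```

For the hypothetical mailing date of January 10, 2023, the period ends April 10, 2023 by default; with a first reply on February 20, 2023 and an advisory action mailed April 25, 2023, it ends April 25, 2023 instead.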
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Adam Lee whose telephone number is (571)270-3369.  The examiner can normally be reached on M-TH 8AM-5PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chat Do, can be reached at (571)272-3721.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




	
/Adam Lee/
Primary Examiner, Art Unit 2193
October 4, 2022