DETAILED ACTION
This action is in response to the Applicant Response filed 22 April 2022 for application 16/025,546 filed 02 July 2022.
Claims 1, 10, 13, 16-19 are currently amended.
Claims 1-20 are pending.
Claims 1-20 are rejected.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Examiner’s Remarks
The Examiner notes that the instant application was previously examined by a different examiner. As such, the Examiner proceeds with prosecution giving full faith and credit to the search and action of the previous examiner per MPEP § 704.01:
When an examiner is assigned to act on an application which has received one or more actions by some other examiner, full faith and credit should be given to the search and action of the previous examiner unless there is a clear error in the previous action or knowledge of other prior art. In general the second examiner should not take an entirely new approach to the application or attempt to reorient the point of view of the previous examiner, or make a new search in the mere hope of finding something. See MPEP § 719.05.

Response to Arguments
Applicant’s arguments regarding the 35 U.S.C. 112(b) rejections of claims 10, 13, 16-20 have been fully considered and, in light of the amendments to the claims, are persuasive. The 35 U.S.C. 112(b) rejections of claims 10, 13, 16-20 have been withdrawn.
Applicant’s arguments regarding the 35 U.S.C. 103 rejections of claims 1-20 have been fully considered and are partially moot and partially not persuasive.
It is noted that, while the Examiner may appreciate differences between the applied art and features described in the originally filed specification, any such features must be explicitly recited in the claims themselves and/or definitively and comprehensively defined in the specification in order to be considered and to affect the broadest reasonable interpretation (BRI) of the metes and bounds of the claim terms. Applicant is respectfully reminded that, during examination, the BRI of the claim terms consistent with the specification applies. Applicant is therefore encouraged to amend the claims or to point to the portion(s) of the originally filed specification that preclude the BRI of the claim terms (MPEP 2173.01) that enables correspondence to the applied art.
Applicant argues that the cited references do not teach the amended limitations of claim 1 (similarly claim 16), particularly the following:
when an early-stop event for a first layer among the plurality of layers is generated based the hyper parameter while the plurality of depth-wise jobs are executed: 
determining a first subset of computing operations, among a plurality of computing operations for at least one second layer arranged at a prior stage than the first layer, and a second subset of computing operations, among the plurality of computing operations for the at least one second layer, 
continuing execution of the first subset of computing operations after the early-stop event is generated, and 
stopping execution of the second subset of computing operations after the early-stop event is generated.
Applicant’s arguments regarding the Hara reference have been fully considered but are moot because the arguments do not rely on any reference in the prior rejection of record for any teachings or matters specifically challenged in the argument.
Applicant next argues that Teerapittayanon discloses exiting a branch and not performing future layers, but does not teach or suggest continuing execution of a first subset of computing operations after the early-stop event is generated, and stopping execution of a second subset of computing operations after the early-stop event is generated. With respect to applicant’s arguments, Examiner respectfully disagrees. First, Examiner notes that “layers” and “computing operations” are not the same. As noted in the claim language above at least one second layer, which can include a single second layer, contains two subsets of a plurality of computing operations. Additionally, as noted below, Teerapittayanon teaches a DNN which contains multiple layers.

[Image: media_image1.png (greyscale)]
[Image: media_image2.png (greyscale)]

As can be seen from the image and algorithm above, there is a first layer, which is shown only virtually in the image but comprises steps 6 and 7 of the algorithm. This layer must take the entropy and classification from at least the Exit 1 and Exit 2 blocks and compare the entropy value for the given layer with a threshold (i.e., a hyperparameter) to determine whether an early-stop event is generated. If so, the output is provided and additional steps are stopped from being performed; if not, no output is provided and the next steps of the model proceed. Moreover, as shown in the image, the blue (top) box represents the second subset of computing operations of the second layer, the red (bottom) box represents the first subset of computing operations of the second layer, and the combination of the boxes represents the second layer, which is arranged prior to the output layer (i.e., the first layer). In addition to the computing operations included in each box, the computing operations for each subset include at least the output operation for the classification as noted in step 7 of the algorithm. Therefore, if the early-stop event occurs for the first subset, i.e., the entropy from the Exit 1 block is less than the threshold (hyperparameter), the computing operations of the second subset will not run (i.e., stop), while the computing operations of the first subset will continue, i.e., the output will be provided using the return statement in step 7 of the algorithm.
Therefore, the combination of Hara and Teerapittayanon does, in fact, teach when an early-stop event for a first layer among the plurality of layers is generated based the hyper parameter while the plurality of depth-wise jobs are executed: determining a first subset of computing operations, among a plurality of computing operations for at least one second layer arranged at a prior stage than the first layer, and a second subset of computing operations, among the plurality of computing operations for the at least one second layer, continuing execution of the first subset of computing operations after the early-stop event is generated, and stopping execution of the second subset of computing operations after the early-stop event is generated.
Therefore, claim 1 is rejected under 35 U.S.C. 103 as unpatentable over Hara in view of Teerapittayanon. For similar reasons, claim 16 is also rejected as unpatentable over Hara in view of Teerapittayanon. Additionally, the rejections of claims 1, 16 apply to all dependent claims which are dependent on claims 1, 16, including claims 13, 15, 19 which are also unpatentable over Hara in view of Teerapittayanon; claims 2, 14, 17-18, 20 which are unpatentable over Hara in view of Teerapittayanon and further in view of Savvides; claims 3-10 which are unpatentable over Hara in view of Teerapittayanon and further in view of Raveane; and claims 11-12 which are unpatentable over Hara in view of Teerapittayanon and further in view of Lockett.
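For illustration only, the entropy-versus-threshold comparison described above can be sketched as follows. This is a minimal sketch based solely on the description of the algorithm in this action, not a reproduction of Teerapittayanon's code. The names `early_exit_inference`, `stages`, and `thresholds`, and the toy trunk and exit functions, are assumptions introduced here.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; low entropy means a confident prediction."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def early_exit_inference(x, stages, thresholds):
    """Run stages in order and return (label, exit_index).

    stages: list of (trunk_fn, exit_fn) pairs, shallowest first (hypothetical).
    thresholds: one entropy threshold per non-final exit (the hyperparameter).
    """
    features = x
    for i, (trunk_fn, exit_fn) in enumerate(stages):
        features = trunk_fn(features)        # operations on the main path
        probs = softmax(exit_fn(features))   # operations of this exit
        is_last = i == len(stages) - 1
        if is_last or entropy(probs) < thresholds[i]:
            # Early-stop event: return the output; later stages never run.
            return probs.index(max(probs)), i

# Toy two-exit model (assumed values, for demonstration only).
stages = [
    (lambda v: v, lambda v: [5.0 * v, 0.0]),
    (lambda v: v + 1.0, lambda v: [0.0, v]),
]
thresholds = [0.3]

easy = early_exit_inference(2.0, stages, thresholds)  # confident at the first exit
hard = early_exit_inference(0.0, stages, thresholds)  # falls through to the last exit
```

In this sketch, the return statement plays the role attributed to step 7 of the algorithm: it continues to execute for the exit that triggered the event, while the operations of every later stage are skipped.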

Claim Objections
Claims 1-20 are objected to because of the following informalities:
Claim 1, lines 11-12: “based the hyper parameter” should read “based on the hyper parameter”.
Claim 16, line 15: “based the hyper parameter” should read “based on the hyper parameter”.
Claims 2-15 and 17-20 are objected to due to their dependence, either directly or indirectly, on claims 1 and 16.
Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 13, 15, 16, 19 are rejected under 35 U.S.C. 103 as being unpatentable over Hara et al. (US 2017/0228639 – Efficient Determination of Optimized Learning Settings of Neural Networks, hereinafter referred to as “Hara”) in view of Teerapittayanon et al. (BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks, hereinafter referred to as “Teerapittayanon”).

Regarding claim 1 (Currently Amended), Hara teaches a method of controlling a plurality of computing operations in a deep neural network (DNN) (Hara, ¶0028 – teaches training and implementing [computer operations] deep neural networks), the method comprising: 
analyzing a network structure of the DNN, the network structure comprising a plurality of layers (Hara, ¶0047 – teaches a learning setting that requires analyzing the number of layers in a deep neural network; see also Hara, ¶0028); 
setting a hyper parameter, based on the network structure and real-time context information of a system configured to drive the DNN, the hyper parameter being used for performing an early-stop function (Hara, ¶0040 – teaches terminating training [early-stop function] based on an evaluation value of the network [real-time context information of a system configured to drive the DNN] not satisfying a given criterion [hyperparameter]; see also Hara, ¶¶0038, 0068-0069); 
assigning a plurality of depth-wise jobs to a plurality of resources included in the system based on the hyper parameter to execute the plurality of depth-wise jobs, each of the plurality of depth-wise jobs including at least a part of the plurality of computing operations (Hara, ¶0069 – teaches training the networks [depth-wise jobs]; Hara, ¶0101 – teaches various resources, e.g., processor and memory, performing various computer operations and/or processing; see also, Hara, ¶0047 – teaches various hyperparameters used for training).
While Hara teaches performing early-stop operations on DNNs, Hara does not explicitly teach when an early-stop event for a first layer among the plurality of layers is generated based the hyper parameter while the plurality of depth-wise jobs are executed: determining a first subset of computing operations, among a plurality of computing operations for at least one second layer arranged at a prior stage than the first layer, and a second subset of computing operations, among the plurality of computing operations for the at least one second layer, continuing execution of the first subset of computing operations after the early-stop event is generated, and stopping execution of the second subset of computing operations after the early-stop event is generated.
Teerapittayanon teaches when an early-stop event (Teerapittayanon, section 3, Figs. 1-2 – teaches comparing entropy [stop-event] to threshold [hyperparameter]) for a first layer among the plurality of layers (Teerapittayanon, Figs. 1-2 – image (fig. 1) below demonstrates the output layer [first layer] virtually but algorithm shows the output layer in steps 6 and 7) is generated based the hyper parameter (Teerapittayanon, section 3, Figs. 1-2 – teaches comparing entropy [stop-event] to threshold [hyperparameter]) while the plurality of depth-wise jobs (Teerapittayanon, Fig. 2 – teaches convolutional layers which perform depth-wise jobs) are executed:
determining a first subset of computing operations (Teerapittayanon, Figs. 1-2 – teaches a first subset of computing operations which include the computing operations of the red (bottom) box in image (fig. 1) below as well as the output return statement of algorithm (fig. 2) in step 7 below), among a plurality of computing operations for at least one second layer arranged at a prior stage than the first layer (Teerapittayanon, Figs. 1-2 – image (fig. 1) below shows the second layer represented by the combination of the blue (top) box and red (bottom) box, each of which contain computing operations, which is arranged prior to the output layer [first layer] of the image), and a second subset of computing operations (Teerapittayanon, Figs. 1-2 – teaches a second subset of computing operations which include the computing operations of the blue (top) box in image (fig. 1) below as well as the output return statement of algorithm (fig. 2) in step 7 below), among the plurality of computing operations for the at least one second layer (Teerapittayanon, Figs. 1-2 – image (fig. 1) below shows the second layer represented by the combination of the blue (top) box and red (bottom) box, each of which contain computing operations, which is arranged prior to the output layer [first layer] of the image), 
continuing execution of the first subset of computing operations after the early-stop event is generated (Teerapittayanon, Fig. 2 – algorithm (fig. 2) teaches that if the entropy returned from Exit 1 block is less than the threshold [early-stop event] then the first subset of computing operations continue and the return output operation of algorithm step 7 is run for the given subset), and 
stopping execution of the second subset of computing operations after the early-stop event is generated (Teerapittayanon, Figs. 1-2 – algorithm (fig. 2) teaches that if the entropy returned from Exit 1 block is less than the threshold [early-stop event] then the second subset of computing operations represented by blue (top) box in image (fig. 1) below is not run [stopped]). 

[Image: media_image1.png (greyscale)]
[Image: media_image2.png (greyscale)]

It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Hara with the teachings of Teerapittayanon in order to improve accuracy and reduce inference time in the field of early-stop operations for deep neural networks (Teerapittayanon, Abstract – “Deep neural networks are state of the art methods for many learning tasks due to their ability to extract increasingly better features at each network layer. However, the improved performance of additional layers in a deep network comes at the cost of added latency and energy usage in feedforward inference. As networks continue to get deeper and larger, these costs become more prohibitive for real-time and energy-sensitive applications. To address this issue, we present BranchyNet, a novel deep network architecture that is augmented with additional side branch classifiers. The architecture allows prediction results for a large portion of test samples to exit the network early via these branches when samples can already be inferred with high confidence. BranchyNet exploits the observation that features learned at an early layer of a network may often be sufficient for the classification of many data points. For more difficult samples, which are expected less frequently, BranchyNet will use further or all network layers to provide the best likelihood of correct prediction. We study the BranchyNet architecture using several well-known networks (LeNet, AlexNet, ResNet) and datasets (MNIST, CIFAR10) and show that it can both improve accuracy and significantly reduce the inference time of the network.”).
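For illustration only, the Hara teaching relied on above for the early-stop function (terminating training when an evaluation value of the network does not satisfy a given criterion) can be pictured with the following minimal sketch. It reflects only the characterization of Hara given in this action. The function `train_with_termination`, its parameters, and the toy loss values and bounds are assumptions introduced here and are not taken from Hara.

```python
def train_with_termination(train_step, evaluate, criterion, max_epochs):
    """Train for up to max_epochs, terminating once the criterion is not met.

    train_step(epoch): performs one epoch of training (hypothetical callback).
    evaluate(epoch): returns an evaluation value for the network.
    criterion(epoch, value): True while the value satisfies the given criterion.
    """
    history = []
    for epoch in range(max_epochs):
        train_step(epoch)
        value = evaluate(epoch)
        history.append(value)
        if not criterion(epoch, value):
            # Evaluation value fails the criterion: terminate training early.
            return {"terminated_early": True, "epochs_run": epoch + 1, "history": history}
    return {"terminated_early": False, "epochs_run": max_epochs, "history": history}

# Toy run (assumed numbers): the loss must stay at or below a per-epoch bound.
losses = [0.9, 0.7, 0.65, 0.64]
bounds = [1.0, 0.8, 0.6, 0.5]

result = train_with_termination(
    train_step=lambda epoch: None,             # placeholder for a real update
    evaluate=lambda epoch: losses[epoch],
    criterion=lambda epoch, value: value <= bounds[epoch],
    max_epochs=len(losses),
)
```

Here the criterion plays the role mapped to the claimed hyper parameter, and the evaluation value plays the role mapped to the real-time context information.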

Regarding claim 13 (Currently Amended), Hara in view of Teerapittayanon teaches all of the limitations of the method of claim 1 as noted above. Teerapittayanon further teaches wherein, after the second subset of computing operations are performed (Teerapittayanon, Figs. 1-2 – teaches performing steps for the Exit 2 block, returning output and stopping all remaining computing operations if early-stop event is generated) and a remainder of the plurality of computing operations are stopped (Teerapittayanon, Figs. 1-2 – teaches performing steps for the Exit 2 block, returning output and stopping all remaining computing operations if early-stop event is generated), remaining jobs among the plurality of depth-wise jobs are continuously executed based on a result of the first subset of the plurality of computing operations (Teerapittayanon, Section IV – teaches inputting multiple images to the model and when one image is classified after the second subset, remaining images are continuously processed through CNNs [depth-wise jobs] based on the results of the first subset).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Hara and Teerapittayanon in order to process remaining images based on the result of the first subset in order to improve accuracy and reduce inference time (Teerapittayanon, Abstract).

Regarding claim 15 (Original), Hara in view of Teerapittayanon teaches all of the limitations of the method of claim 1 as noted above. Hara further teaches wherein the plurality of resources include at least one of a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), an image signal processor (ISP), dedicated hardware and a neural processing unit (NPU) (Hara, ¶0101 – teaches the resources include a CPU).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Hara and Teerapittayanon for the same reasons as disclosed in claim 1 above.

Regarding claim 16 (Currently Amended), Hara teaches a system of controlling a plurality of computing operations in a deep neural network (DNN) (Hara, ¶0028 – teaches training and implementing [computer operations] deep neural networks), the system comprising: 
a memory storing one or more instructions (Hara, ¶0101 – teaches memory storing instructions); and 
a processor configured to execute the one or more instructions (Hara, ¶0101 – teaches processor executing instructions) to implement: 
a first model parser configured to analyze a network structure of the DNN, the network structure comprising a plurality of layers (Hara, ¶0047 – teaches a learning setting that requires analyzing the number of layers in a deep neural network; see also Hara, ¶0028); 
a first early-stopper configured to set a hyper parameter based on the network structure and real-time context information of the system configured to drive the DNN, the hyper parameter being used for performing an early-stop function (Hara, ¶0040 – teaches terminating training [early-stop function] based on an evaluation value of the network [real-time context information of a system configured to drive the DNN] not satisfying a given criterion [hyperparameter]; see also Hara, ¶¶0038, 0068-0069); 
a first job assigner configured to assign a plurality of depth-wise jobs based on the hyper parameter, each of the plurality of depth-wise jobs including at least a part of the plurality of computing operations (Hara, ¶0069 – teaches training the networks [depth-wise jobs]; Hara, ¶0101 – teaches various resources, e.g., processor and memory, performing various computer operations and/or processing; see also, Hara, ¶0047 – teaches various hyperparameters used for training); and 
a plurality of resources configured to execute the plurality of depth-wise jobs (Hara, ¶0069 – teaches training the networks [depth-wise jobs]; Hara, ¶0101 – teaches various resources, e.g., processor and memory, performing various computer operations and/or processing; see also, Hara, ¶0047 – teaches various hyperparameters used for training). 
While Hara teaches performing early-stop operations on DNNs, Hara does not explicitly teach a plurality of resources configured to execute the plurality of depth-wise jobs, wherein when an early-stop event for a first layer among the plurality of layers is generated based the hyper parameter while the plurality of depth-wise jobs are executed, the plurality of resources being configured to: determine a first subset of computing operations, among a plurality of computing operations for at least one second layer arranged at a prior stage than the first layer, and a second subset of computing operations, among the plurality of computing operations for the at least one second layer, continue execution of the first subset of computing operations after the early-stop event is generated, and stop execution of the second subset of computing operations after the early-stop event is generated.
Teerapittayanon teaches a plurality of resources (Teerapittayanon, section IV – teaches performing methods using GPU and CPU) configured to execute the plurality of depth-wise jobs (Teerapittayanon, Fig. 1 – teaches operating multiple convolutional layers [depth-wise jobs]), wherein when an early-stop event (Teerapittayanon, section 3, Figs. 1-2 – teaches comparing entropy [stop-event] to threshold [hyperparameter]) for a first layer among the plurality of layers (Teerapittayanon, Figs. 1-2 – image (fig. 1) below demonstrates the output layer [first layer] virtually but algorithm shows the output layer in steps 6 and 7) is generated based the hyper parameter (Teerapittayanon, section 3, Figs. 1-2 – teaches comparing entropy [stop-event] to threshold [hyperparameter]) while the plurality of depth-wise jobs are executed (Teerapittayanon, Fig. 2 – teaches convolutional layers which perform depth-wise jobs), the plurality of resources being configured to: 
determine a first subset of computing operations (Teerapittayanon, Figs. 1-2 – teaches a first subset of computing operations which include the computing operations of the red (bottom) box in image (fig. 1) below as well as the output return statement of algorithm (fig. 2) in step 7 below), among a plurality of computing operations for at least one second layer arranged at a prior stage than the first layer (Teerapittayanon, Figs. 1-2 – image (fig. 1) below shows the second layer represented by the combination of the blue (top) box and red (bottom) box, each of which contain computing operations, which is arranged prior to the output layer [first layer] of the image), and a second subset of computing operations (Teerapittayanon, Figs. 1-2 – teaches a second subset of computing operations which include the computing operations of the blue (top) box in image (fig. 1) below as well as the output return statement of algorithm (fig. 2) in step 7 below), among the plurality of computing operations for the at least one second layer (Teerapittayanon, Figs. 1-2 – image (fig. 1) below shows the second layer represented by the combination of the blue (top) box and red (bottom) box, each of which contain computing operations, which is arranged prior to the output layer [first layer] of the image), 
continue execution of the first subset of computing operations after the early-stop event is generated (Teerapittayanon, Fig. 2 – algorithm (fig. 2) teaches that if the entropy returned from Exit 1 block is less than the threshold [early-stop event] then the first subset of computing operations continue and the return output operation of algorithm step 7 is run for the given subset), and 
stop execution of the second subset of computing operations after the early-stop event is generated (Teerapittayanon, Figs. 1-2 – algorithm (fig. 2) teaches that if the entropy returned from Exit 1 block is less than the threshold [early-stop event] then the second subset of computing operations represented by blue (top) box in image (fig. 1) below is not run [stopped]).

[Image: media_image1.png (greyscale)]
[Image: media_image2.png (greyscale)]

It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Hara with the teachings of Teerapittayanon in order to improve accuracy and reduce inference time in the field of early-stop operations for deep neural networks (Teerapittayanon, Abstract – “Deep neural networks are state of the art methods for many learning tasks due to their ability to extract increasingly better features at each network layer. However, the improved performance of additional layers in a deep network comes at the cost of added latency and energy usage in feedforward inference. As networks continue to get deeper and larger, these costs become more prohibitive for real-time and energy-sensitive applications. To address this issue, we present BranchyNet, a novel deep network architecture that is augmented with additional side branch classifiers. The architecture allows prediction results for a large portion of test samples to exit the network early via these branches when samples can already be inferred with high confidence. BranchyNet exploits the observation that features learned at an early layer of a network may often be sufficient for the classification of many data points. For more difficult samples, which are expected less frequently, BranchyNet will use further or all network layers to provide the best likelihood of correct prediction. We study the BranchyNet architecture using several well-known networks (LeNet, AlexNet, ResNet) and datasets (MNIST, CIFAR10) and show that it can both improve accuracy and significantly reduce the inference time of the network.”).

Regarding claim 19 (Currently Amended), Hara in view of Teerapittayanon teaches all of the limitations of the system of claim 16 as noted above. Hara further teaches 
a second model parser configured to analyze the network structure (Hara, ¶0047 – teaches a learning setting that requires analyzing the number of layers in a deep neural network; see also Hara, ¶0028); 
a second early-stopper configured to set the hyper parameter based on the network structure and the real-time context information (Hara, ¶0040 – teaches terminating training [early-stop function] based on an evaluation value of the network [real-time context information of a system configured to drive the DNN] not satisfying a given criterion [hyperparameter]; see also Hara, ¶¶0038, 0068-0069); and 
a second job assigner configured to assign the plurality of depth-wise jobs based on the hyper parameter (Hara, ¶0069 – teaches training the networks [depth-wise jobs]; Hara, ¶0101 – teaches various resources, e.g., processor and memory, performing various computer operations and/or processing; see also, Hara, ¶0047 – teaches various hyperparameters used for training).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Hara and Teerapittayanon for the same reasons as disclosed in claim 16 above.

Claims 2, 14, 17-18, 20 are rejected under 35 U.S.C. 103 as being unpatentable over Hara in view of Teerapittayanon and further in view of Savvides et al. (US 2018/0053091 A1 – System and Method for Model Compression of Neural Networks for Use in Embedded Platforms, hereinafter referred to as “Savvides”).

Regarding claim 2 (Original), Hara in view of Teerapittayanon teaches all of the limitations of the method of claim 1 as noted above. However, Hara in view of Teerapittayanon does not explicitly teach updating the real-time context information based on an operating status of the system; and updating the hyper parameter based on the real-time context information.
Savvides teaches 
updating the real-time context information based on an operating status of the system (Savvides, ¶0027 – teaches, based on the operating status of the system, updating thresholds [real-time context information] in real-time); and 
updating the hyper parameter based on the real-time context information (Savvides, ¶0027 – teaches using the updated thresholds to update hyperparameters of the system).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Hara in view of Teerapittayanon with the teachings of Savvides in order to develop smaller, less resource intensive networks in the field of optimizing deep neural networks (Savvides, ¶0003 – “Neural networks, such as convolutional neural networks (CNNs) or fully connected networks (FNCs) may be used in machine learning applications for a variety of tasks, including classification and detection. These networks are often large and resource intensive in order to achieve desired results. As a result, the networks are typically limited to machines having the components capable of handling such resource intensive tasks. It is now recognized that smaller, less resource intensive networks are desired.”).
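For illustration only, the two limitations of claim 2 as mapped above (updating real-time context information from an operating status, then updating the hyper parameter from that context) can be sketched as follows. This is not Savvides' method and not the Applicant's implementation. The `OperatingStatus` fields, the `load` measure, and the linear scaling rule are all assumptions chosen for this sketch.

```python
from dataclasses import dataclass

@dataclass
class OperatingStatus:
    """Hypothetical snapshot of the system's operating status."""
    cpu_utilization: float       # 0.0 (idle) to 1.0 (saturated)
    free_memory_fraction: float  # 0.0 (none free) to 1.0 (all free)

def update_context(context, status):
    """Return a new context dict whose 'load' reflects the current status."""
    updated = dict(context)
    updated["load"] = max(status.cpu_utilization, 1.0 - status.free_memory_fraction)
    return updated

def update_threshold(base_threshold, context, max_scale=2.0):
    """Scale the early-stop threshold linearly with load (assumed rule).

    A higher entropy threshold lets more inputs stop early, so a loaded
    system does less work per input.
    """
    load = min(max(context["load"], 0.0), 1.0)
    return base_threshold * (1.0 + (max_scale - 1.0) * load)

idle = update_context({}, OperatingStatus(cpu_utilization=0.0, free_memory_fraction=1.0))
busy = update_context({}, OperatingStatus(cpu_utilization=1.0, free_memory_fraction=0.5))

idle_threshold = update_threshold(0.3, idle)  # unchanged when there is no load
busy_threshold = update_threshold(0.3, busy)  # doubled at full load
```

The split into two functions mirrors the two claim steps: the first produces the context information, and the second consumes only that context to produce the hyper parameter.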
Regarding claim 14 (Original), Hara in view of Teerapittayanon teaches all of the limitations of the method of claim 1 as noted above. However, Hara in view of Teerapittayanon does not explicitly teach wherein the real-time context information includes at least one of performance and utilization of the plurality of resources, accuracy of the plurality of depth-wise jobs and power status of the system.
Savvides teaches wherein the real-time context information includes at least one of performance and utilization of the plurality of resources, accuracy of the plurality of depth-wise jobs and power status of the system (Savvides, ¶0027 – teaches real-time operating status of the system including processor speed, available memory, etc. [real-time context information]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Hara in view of Teerapittayanon with the teachings of Savvides in order to develop smaller, less resource intensive networks in the field of optimizing deep neural networks (Savvides, ¶0003 – “Neural networks, such as convolutional neural networks (CNNs) or fully connected networks (FNCs) may be used in machine learning applications for a variety of tasks, including classification and detection. These networks are often large and resource intensive in order to achieve desired results. As a result, the networks are typically limited to machines having the components capable of handling such resource intensive tasks. It is now recognized that smaller, less resource intensive networks are desired.”).

Regarding claim 17 (Currently Amended), Hara in view of Teerapittayanon teaches all of the limitations of the system of claim 16 as noted above. However, Hara in view of Teerapittayanon does not explicitly teach a context manager configured to update the real-time context information based on an operating status of the system, and wherein the first early-stopper is configured to update the hyper parameter based on the real-time context information.
Savvides teaches 
a context manager configured to update the real-time context information based on an operating status of the system (Savvides, ¶0027 – teaches, based on the operating status of the system, updating thresholds [real-time context information] in real-time), and 
wherein the first early-stopper is configured to update the hyper parameter based on the real-time context information (Savvides, ¶0027 – teaches using the updated thresholds to update hyperparameters of the system).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Hara in view of Teerapittayanon with the teachings of Savvides in order to develop smaller, less resource intensive networks in the field of optimizing deep neural networks (Savvides, ¶0003 – “Neural networks, such as convolutional neural networks (CNNs) or fully connected networks (FNCs) may be used in machine learning applications for a variety of tasks, including classification and detection. These networks are often large and resource intensive in order to achieve desired results. As a result, the networks are typically limited to machines having the components capable of handling such resource intensive tasks. It is now recognized that smaller, less resource intensive networks are desired.”).

Regarding claim 18 (Currently Amended), Hara in view of Teerapittayanon teaches all of the limitations of the system of claim 16 as noted above. However, Hara in view of Teerapittayanon does not explicitly teach a resource manager configured to generate resource status information that represents performance and utilization of the plurality of resources.
Savvides teaches a resource manager configured to generate resource status information that represents performance and utilization of the plurality of resources (Savvides, ¶0027 – teaches real-time operating status of the system including processor speed, available memory, etc. [real-time context information]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Hara in view of Teerapittayanon with the teachings of Savvides in order to develop smaller, less resource intensive networks in the field of optimizing deep neural networks (Savvides, ¶0003 – “Neural networks, such as convolutional neural networks (CNNs) or fully connected networks (FNCs) may be used in machine learning applications for a variety of tasks, including classification and detection. These networks are often large and resource intensive in order to achieve desired results. As a result, the networks are typically limited to machines having the components capable of handling such resource intensive tasks. It is now recognized that smaller, less resource intensive networks are desired.”).

Regarding claim 20 (Original), Hara in view of Teerapittayanon teaches all of the limitations of the system of claim 19 as noted above. Hara further teaches 
wherein the second model parser, the second early-stopper and the second job assigner are used for setting an initial value of the hyper parameter (Hara, ¶¶0038, 0068 – teaches using new settings for training second neural networks where the new settings include new initial hyper parameters).
However, Hara in view of Teerapittayanon does not explicitly teach wherein the first model parser, the first early-stopper and the first job assigner are used for updating the hyper parameter in real-time.
Savvides teaches wherein the first model parser, the first early-stopper and the first job assigner are used for updating the hyper parameter in real-time (Savvides, ¶0027 – teaches updating hyper parameters in real time).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Hara in view of Teerapittayanon with the teachings of Savvides in order to develop smaller, less resource intensive networks in the field of optimizing deep neural networks (Savvides, ¶0003 – “Neural networks, such as convolutional neural networks (CNNs) or fully connected networks (FNCs) may be used in machine learning applications for a variety of tasks, including classification and detection. These networks are often large and resource intensive in order to achieve desired results. As a result, the networks are typically limited to machines having the components capable of handling such resource intensive tasks. It is now recognized that smaller, less resource intensive networks are desired.”).

Claims 3-10 are rejected under 35 U.S.C. 103 as being unpatentable over Hara in view of Teerapittayanon and further in view of Raveane et al. (WO 2015/083199 A1 – Computer Device and Method Executed by the Computer Device, hereinafter referred to as “Raveane”).

Regarding claim 3 (Original), Hara in view of Teerapittayanon teaches all of the limitations of the method of claim 1 as noted above. However, Hara in view of Teerapittayanon does not explicitly teach wherein the first layer is a maximum pooling layer that selects a maximum value among a plurality of characteristic values, and wherein the plurality of characteristic values are included in a predetermined region of first volume data that is input to the first layer.
Raveane teaches 
wherein the first layer is a maximum pooling layer that selects a maximum value among a plurality of characteristic values (Raveane, ¶0034 – teaches the first layer being a max-pooling layer which selects the max value from a plurality of characteristic values), and 
wherein the plurality of characteristic values are included in a predetermined region of first volume data that is input to the first layer (Raveane, ¶0034 – teaches that the characteristic values are based on a high level set of features in the image data in YUV color space [predetermined region of first volume data that is input to the first layer]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Hara in view of Teerapittayanon with the teachings of Raveane in order to extract higher and lower level features that can be used in input classification in the field of deep neural networks (Raveane, ¶0034 – “FIG. 4 shows a possible architecture of the convolutional neural networks used by the system. The actual architecture used may vary according to the particular implementation details, and is chosen to better accommodate the required recognition task and the target shapes. However, there are common elements to all possible architectures. The input layer ... receives the image data in YUV color space (native to most mobile computer device cameras) and prepares it for further analysis through a contrast normalization process. In the case of devices equipped with a depth sensor, the neural network architecture is modified to provide one additional input channel for the depth information, which is then combined to the rest of the network in a manner similar to the U and V color channels. The first convolutional layer ... extracts a high level set of features through alternating convolutional and max-pooling layers. The second convolutional layer ... extracts lower level features through a similar set of neurons. The classification layer ... finally processes the extracted features and classifies them into a set of output neurons corresponding to each of the recognition target classes.”).

Regarding claim 4 (Original), Hara in view of Teerapittayanon and further in view of Raveane teaches all of the limitations of the method of claim 3 as noted above. Raveane further teaches wherein, before the plurality of depth-wise jobs are executed, a position of the maximum value is predetermined based on a training operation (Raveane, ¶¶0033-0035 – before extracting lower level features and processing the extracted features [before depth-wise jobs are executed], the first layer determines a position of maximum value [position of maximum value is predetermined based on training operation]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Hara, Teerapittayanon and Raveane in order to use max-pooling to extract higher and lower level features for improved image classification (Raveane, ¶0034).

Regarding claim 5 (Original), Hara in view of Teerapittayanon and further in view of Raveane teaches all of the limitations of the method of claim 3 as noted above. Raveane further teaches wherein, while the plurality of depth-wise jobs are executed, a position of the maximum value is determined by tracking the at least one second layer in real-time (Raveane, ¶0034 – teaches the max-pooling layers track the position of a maximum value while training jobs are executed).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Hara, Teerapittayanon and Raveane in order to use max-pooling to extract higher and lower level features for improved image classification (Raveane, ¶0034).

Regarding claim 6 (Original), Hara in view of Teerapittayanon and further in view of Raveane teaches all of the limitations of the method of claim 5 as noted above. Raveane further teaches 
wherein second volume data that is input to the at least one second layer includes first through N depths, where N is a natural number greater than or equal to two (Raveane, Fig. 4, ¶0034 – teaches input image data in YUV color space therefore creating at least 3 channels [depths]), and 
wherein only K consecutive depths among the first through N depths are computed, where K is a natural number greater than or equal to two and less than or equal to N (Raveane, Fig. 4, ¶0034 – teaches computing convolutions for the Y, U, V channels [consecutive depths]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Hara, Teerapittayanon and Raveane in order to use multiple channels to extract higher and lower level features for improved image classification (Raveane, ¶0034).

Regarding claim 7 (Original), Hara in view of Teerapittayanon and further in view of Raveane teaches all of the limitations of the method of claim 6 as noted above. Raveane further teaches wherein the at least one second layer is a convolutional layer that performs a convolutional operation on the second volume data (Raveane, ¶0034, Fig. 4 – teaches a second convolutional layer that performs convolutions on the input data).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Hara, Teerapittayanon and Raveane in order to use multiple convolutional layers to extract higher and lower level features for improved image classification (Raveane, ¶0034).

Regarding claim 8 (Original), Hara in view of Teerapittayanon and further in view of Raveane teaches all of the limitations of the method of claim 5 as noted above. Raveane further teaches 
wherein second volume data that is input to the at least one second layer includes first through N depths, where N is a natural number greater than or equal to two (Raveane, Fig. 4, ¶0034 – teaches input image data in YUV color space therefore creating at least 3 channels [depths]), 
wherein only M depths among the first through N depths are computed, where M is a natural number greater than or equal to two and less than or equal to N (Raveane, Fig. 4, ¶0034 – teaches computing convolutions for the Y, U, V channels [M depths]), and 
wherein any two depths among the M depths are arranged spaced apart from each other (Raveane, Fig. 4 – teaches that the channels [depths] Y, U and V are spaced apart).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Hara, Teerapittayanon and Raveane in order to use multiple channels to extract higher and lower level features for improved image classification (Raveane, ¶0034).

Regarding claim 9 (Original), Hara in view of Teerapittayanon and further in view of Raveane teaches all of the limitations of the method of claim 8 as noted above. Raveane further teaches wherein the M depths are selected based on at least one of a predetermined interval, a predetermined number of times and a predetermined ratio (Raveane, Fig. 4, ¶0034 – teaches selecting the depths based on an interval).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Hara, Teerapittayanon and Raveane in order to use multiple channels to extract higher and lower level features for improved image classification (Raveane, ¶0034).

Regarding claim 10 (Currently Amended), Hara in view of Teerapittayanon and further in view of Raveane teaches all of the limitations of the method of claim 3 as noted above. Raveane further teaches wherein the maximum value is selected by tracking X characteristic values among the plurality of characteristic values, where X is a natural number (Raveane, ¶0034 – teaches using max pooling to select max characteristic values from a plurality of values).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Hara, Teerapittayanon and Raveane in order to use max-pooling to extract higher and lower level features for improved image classification (Raveane, ¶0034).

Claims 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Hara in view of Teerapittayanon and further in view of Lockett, Alan (U.S. Pat. No. 10,108,902 B1 – Methods and Apparatus for Asynchronous and Interactive Machine Learning Using Attention Selection Techniques, hereinafter referred to as “Lockett”).

Regarding claim 11 (Original), Hara in view of Teerapittayanon teaches all of the limitations of the method of claim 1 as noted above. However, Hara in view of Teerapittayanon does not explicitly teach wherein the first layer is an average pooling layer that obtains an average value of a plurality of characteristic values, and wherein the plurality of characteristic values are included in a predetermined region of first volume data that is input to the first layer.
Lockett teaches 
wherein the first layer is an average pooling layer that obtains an average value of a plurality of characteristic values (Lockett, Fig. 10, col. 16:30-37 – teaches attention pooling which is regional average pooling weighted according to attention gates), and 
wherein the plurality of characteristic values are included in a predetermined region of first volume data that is input to the first layer (Lockett, Fig. 10, col. 16:30-37 – teaches attention pooling which is regional [predetermined region] average pooling weighted according to attention gates).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to modify Hara in view of Teerapittayanon with the teachings of Lockett in order to rapidly train a neural network in the field of deep neural networks (Lockett, Background – “Therefore, a need exists for methods and apparatus to rapidly train and identify logic used by machine learning models to generate outputs.”).

Regarding claim 12 (Original), Hara in view of Teerapittayanon and further in view of Lockett teaches all of the limitations of the method of claim 11 as noted above. Lockett further teaches wherein the average value is obtained by selecting Y flows among a plurality of flows used for obtaining the average value, where Y is a natural number (Lockett, Fig. 10, col. 16:30-37 – teaches selecting characteristics on which to perform average pooling based on weights [selecting flows for average values]).
It would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention to combine the teachings of Hara, Teerapittayanon and Lockett in order to perform regional average pooling to rapidly train neural networks (Lockett, Background).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARSHALL WERNER whose telephone number is (469) 295-9143. The examiner can normally be reached Monday – Thursday, 7:30 AM – 4:30 PM ET.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at (571) 272-7796. The fax number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/MARSHALL L WERNER/
Examiner, Art Unit 2125

/BRIAN M SMITH/
Primary Examiner, Art Unit 2122