DETAILED ACTION
This Office Action is in response to Application No. 16/189,206 filed on November 13, 2018. Claims 1-20 are presented for examination and are currently pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The claimed priority date is November 15, 2017.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on November 13, 2018 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.


Claim Objections
Claim 17 is objected to because of the following informalities:
The claim recites “a neural network system, comprising: a plurality of processor”. Because the claim refers to a plurality of processing devices, “processor” should read “processors”. Appropriate correction is kindly requested.
The claim recites “providing a shared identification (ID) and a resource ID to a operation group that is found”. The grammatically correct phrasing is “and a resource ID to an operation group that is found”. Appropriate correction is kindly requested.

Claim 5 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.



Claims 1-4, 6, 7, 9-12, 16, 17, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Gupta (US 8984515 B2) in view of Yokono (US 5295227 A), and further in view of Lee (US 20200412799 A1).


In regard to claim 1, Gupta teaches the following:
A method of operating a … system, the method comprising: merging, by a processor, a first operation group in a first … and a second operation group in a second …, including identical operations, as a shared operation group;

[ (Col. 1, Lines 44-47) “The plurality of parallel tasks may include multiple clustering tasks or a clustering task and multiple grouping tasks, using one or more common data inputs in the plurality of parallel tasks”
	This citation and the paragraphs surrounding it teach grouping operations that are common between the parallel tasks. ]
[ (Col. 1, Lines 52-55) “a task that requires computation of highest number of clusters, a task that requires maximum number of iterations to converge, a task with fewest or containing maximum shared clustering attributes across the tasks,”
	This citation teaches the grouping of tasks each having identical operations (shared tasks among clusters). ]
 selecting, by the processor, a first hardware to execute the shared operation group, from among a plurality of hardware; 
 [ (Col. 2, Lines 17-20) “When executed by a processor, the instructions cause the processor to perform operations comprising identifying one or more resource sharing opportunities across a plurality of parallel tasks.” 
Examiner notes that the selection of a first hardware from the plurality of hardware is not distinctly disclosed by Gupta and is instead taught by Lee, as seen below. The limitation is kept together for clarity. Gupta does, however, teach an embodiment containing a plurality of processors (Col. 6, Lines 37-41), but does not select from among them. This citation and the one below it from Gupta do teach the processor executing the shared operation group. ]
[ (Col. 2, Lines 45-50) “Sharing one or more resources may include at least one of sharing data reads, sharing computations, sharing intermediate results, sharing at least one of map and reduce computations, sharing data processing resources” ]
and executing the shared operation group by using the first hardware.

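[ (Col. 2, Lines 22-24) “The plurality of parallel tasks involving zero or more relational operations and at least one non-relational operation are executed.”
	This citation teaches the processor identifying the shared opportunities in the task and then executing those tasks. ]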
	Gupta does not distinctly disclose that the operation groups are specifically coming from respective neural networks. However, Yokono teaches that the operations are merged from neural networks as seen below:
	Neural network
[ (Fig. 5-7) and (Col. 3, Lines 15-31)
	Yokono teaches the merging of two distinct neural networks into one structure as seen in the above figures with the additional texts as support. Yokono teaches the environment (two distinct neural networks) where the operations Gupta teaches would be performed. ]
	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method for reducing repeated operations taught by Gupta with the merging of two or more distinct neural networks taught by Yokono. One of ordinary skill in the art would have recognized, before the effective filing date, that the combination saves the neural network system from repeating any shared processes [ Yokono (Col. 3, Lines 4-10) ]. This yields the recognized benefit of increased overall system efficiency and requires fewer computational resources for training the machine learning networks.
	What is not distinctly disclosed by Gupta or Yokono and is instead taught by Lee is seen below:
	selecting, by the processor, a first hardware… from among a plurality of hardware;
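[ (¶0085) “An application server may be selected based on the determined application requirements and based on the determined application server conditions (block 840). For example, server selection mechanism 520 may compare the determined application requirements with the determined application server conditions to identify a particular application server 150 that best matches the application requirements.”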
	This citation from Lee teaches a system where the application server(s) (equivalent to the plurality of hardware) can be selected based on the application requirements and application server conditions. The application server conditions include various processing resources such as those seen in (¶0086) which include examples like memory requirements or memory capacity among other possible limitations.]
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method for reducing repeated operations in a neural network taught by Gupta/Yokono with the hardware resource management system taught by Lee. One of ordinary skill in the art would have recognized, before the effective filing date, that combining the resource management system of Lee with the teachings of Gupta/Yokono provides a more efficient system for reducing redundant operations across a plurality of operation groups. This yields the recognized benefit of increased overall system efficiency, because removing redundant operations shortens processing times and allocating resources allows processing times to be better managed.
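For illustration only, the combination relied upon in the rejection of claim 1 (merging identical operation groups from two networks, then selecting one hardware unit to execute the shared group) can be pictured with the following minimal sketch. It is not taken from Gupta, Yokono, Lee, or the application. Every name in it (OperationGroup, Hardware, merge_identical_groups, select_hardware) and the free-memory selection criterion are hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationGroup:
    network: str        # hypothetical name of the owning neural network
    operations: tuple   # ordered operation names in the group


@dataclass
class Hardware:
    name: str
    free_memory_mb: int  # assumed selection criterion, for illustration only


def merge_identical_groups(first, second):
    """Return the operation sequences that appear in both networks."""
    second_ops = {group.operations for group in second}
    return [group.operations for group in first if group.operations in second_ops]


def select_hardware(candidates, required_memory_mb):
    """Pick the candidate with the most free memory that meets the requirement."""
    eligible = [h for h in candidates if h.free_memory_mb >= required_memory_mb]
    if not eligible:
        raise ValueError("no hardware satisfies the requirement")
    return max(eligible, key=lambda h: h.free_memory_mb)


net_a = [OperationGroup("A", ("conv", "relu")), OperationGroup("A", ("fc_a",))]
net_b = [OperationGroup("B", ("conv", "relu")), OperationGroup("B", ("fc_b",))]

shared = merge_identical_groups(net_a, net_b)
target = select_hardware([Hardware("cpu", 512), Hardware("gpu", 4096)], 1024)
```

In this sketch the shared group is computed once and dispatched to the single selected unit, rather than once per network.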







In regard to claim 2, the method of claim 1 is taught by Gupta/Yokono/Lee as seen in the rejection of claim 1 above. Gupta further teaches the following:
wherein the merging of the first operation group and the second operation group comprises: obtaining the first operation group and the second operation group from among a plurality of operations in the first … and a plurality of operations in the second … respectively;
[ (Col. 1, Lines 50-55) “The primary task selection criteria may include selection of a task that requires computation of highest number of clusters, a task that requires maximum number of iterations to converge, a task with fewest or containing maximum shared clustering attributes across the tasks, or a combination thereof.”
	This citation from Gupta teaches selecting operations (tasks) from the operation groups (clusters) based on arbitrary selection data that the user is free to modify. ]
and assigning, to the first operation group and the second operation group, a shared identification (ID) indicating the shared operation group.
[ (Col. 1, Lines 61-64) “A map output of the merged task may include a combination of the cluster-id and a job-id as map-key, and a data value as the map-value may enable, at least in part, sharing of the map-output for multiple tasks.”
	This citation teaches the use of IDs for the cluster, which is equivalent to the operation group ID. ]
Gupta does not distinctly disclose that the operation groups are specifically coming from respective neural networks. However, Yokono teaches that the operations are merged from neural networks as seen below:
Neural network
[ (Fig. 5-7) and (Col. 3, Lines 15-31) ]

	Please refer to the motivation to combine from claim 1.
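For illustration only, the composite map-key quoted from Gupta in the rejection of claim 2 (a cluster-id combined with a job-id as the key, and a data value as the value) can be pictured with the following minimal sketch. It is not code from Gupta. The function names and the averaging step in the reduce phase are hypothetical assumptions.

```python
from collections import defaultdict


def map_phase(records):
    """Emit ((cluster_id, job_id), value) pairs using a composite key."""
    for cluster_id, job_id, value in records:
        yield (cluster_id, job_id), value


def reduce_phase(pairs):
    """Group values by composite key and average them (assumed reduction)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: sum(values) / len(values) for key, values in grouped.items()}


records = [(0, "job1", 2.0), (0, "job1", 4.0), (0, "job2", 1.0)]
centers = reduce_phase(map_phase(records))
```

Because the key carries both identifiers, a single pass over the shared input serves several jobs while their results remain separable.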




In regard to claim 3, the method of claim 2 is taught by Gupta/Yokono/Lee as seen in the rejection of claim 2 above. Gupta further teaches the following:
wherein the obtaining of the first operation group and the second operation group comprises: obtaining the first operation group and the second operation group, based on at least one of, operation group IDs, or a sub neural network ID set for each of the first … and the second … 
[ (Col. 1, Lines 61-64) “A map output of the merged task may include a combination of the cluster-id and a job-id as map-key, and a data value as the map-value may enable, at least in part, sharing of the map-output for multiple tasks.”
	This citation teaches the cluster ID being used to enable the sharing of the map-output for multiple tasks. ]
	What is not distinctly disclosed by Gupta and is instead taught by Yokono is seen below:
Neural network
[ (Fig. 5-7) and (Col. 3, Lines 15-31)
	Yokono teaches the merging of two distinct neural networks into one structure as seen in the above figures with the additional texts as support. Yokono teaches the environment (two distinct neural networks) where the operations Gupta teaches would be performed. ]
	Please refer to the motivation to combine from claim 1.




In regard to claim 4, the method of claim 2 is taught by Gupta/Yokono/Lee as seen in the rejection of claim 2 above. Yokono further teaches the following:
wherein the obtaining of the first operation group and the second operation group comprises: analyzing a layer topology of at least one of the first neural network and the second neural network,
[ (Fig. 5) and (Col. 3, Lines 24-27) “or a learning parameter share their input layer or input and intermediate layers so that the optimum neural network can be obtained by providing the same pattern to the plurality of networks. FIG. 5 shows an example of a learning system configured by two networks each having a different structure.”
	This citation and the corresponding figure from Yokono teach the merging of neural network layers. ]
	What is not distinctly disclosed by Yokono and is instead taught by Gupta is seen below:
and obtaining the first operation group and the second operation group, based on a result from the analyzing.
[ (Col. 1, Lines 44-47) “The plurality of parallel tasks may include multiple clustering tasks or a clustering task and multiple grouping tasks, using one or more common data inputs in the plurality of parallel tasks”
	This citation and the paragraphs surrounding it teach grouping operations that are common between the parallel tasks. ]
[ (Col. 1, Lines 52-55) “a task that requires computation of highest number of clusters, a task that requires maximum number of iterations to converge, a task with fewest or containing maximum shared clustering attributes across the tasks,” ]

	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method for reducing repeated operations taught by Gupta with the analysis of a layer’s topology taught by Yokono, so that operation groups are obtained based on the result of the topological analysis. One of ordinary skill in the art would have recognized, before the effective filing date, that the combination simplifies the learning process for neural networks [ Yokono (Col. 4, Lines 58-66) ]. This yields the recognized benefit of increased overall system efficiency and requires fewer computational resources for training the machine learning networks.




In regard to claim 6, the method of claim 2 is taught by Gupta/Yokono/Lee as seen in the rejection of claim 2 above. Gupta further teaches the following:
wherein the obtaining of the first operation group and the second operation group comprises: generating an operation history by tracing an operation process of at least one of the first … and the second … during runtime of at least one of the first … and the second … and obtaining the first operation group and the second operation group, based on the operation history.
[ (Col. 2, Lines 6-14) “A series of map and reduce functions may be called until a cluster termination condition for various tasks are obtained. In the merged task, the cluster-id of the primary task may be used as the map output key, and wherein a data value may be used as map output values. In a second reduce function, a new cluster center may be calculated, and 
	This citation from Gupta teaches the map-reduce calls, which break down the parallel tasks and assign keys and values. Because the keys represent the operations being performed (cluster IDs), the reduce function merges the results based on key values. This is equivalent to the tracing of the claim, where each operation is carried out (with its operation output assigned as its map output) and the outputs are then sorted and merged based on identical keys. ]
What is not distinctly disclosed by Gupta and is instead taught by Yokono is seen below:
Neural network
[ (Fig. 5-7) and (Col. 3, Lines 15-31)
	Yokono teaches the merging of two distinct neural networks into one structure as seen in the above figures with the additional texts as support. Yokono teaches the environment (two distinct neural networks) where the operations Gupta teaches would be performed. ]
	Please refer to the motivation to combine from claim 1.




In regard to claim 7, the method of claim 1 is taught by Gupta/Yokono/Lee as seen in the rejection of claim 1 above. Gupta further teaches the following:
wherein the executing of the shared operation group comprises: storing, by the first hardware, an output of a last operation of the shared operation group in a shared buffer.

	This citation teaches the shared storage resources. ]
[ (Col. 7, Lines 5-10) “The instruction sets and subroutines of client applications 22, 24, 26, 28, which may be stored on storage devices 30, 32, 34, 36 coupled to client electronic devices 38, 40, 42, 44, may be executed by one or more processors (not shown) and one or more memory architectures” 
This citation teaches that the operations will be stored on the storage devices (including the output of the last operation). ]
[ (Col. 6, Lines 41-44) “Storage device 16 may include but is not limited to: a hard disk drive; a flash drive, a tape drive; an optical drive; a RAID array; a random access memory (RAM); and a read-only memory (ROM).”
	This citation teaches the variety of different storage solutions, one of which is RAM which is a functional equivalent to a data buffer. ]





In regard to claim 9, the method of claim 2 is taught by Gupta/Yokono/Lee as seen in the rejection of claim 2 above. Gupta further teaches the following:
wherein the assigning of the shared ID comprises: assigning, to the first operation group and the second operation group, a shared count indicating a number of operation groups having the shared ID.

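[ (Col. 1, Lines 50-55) “The primary task selection criteria may include selection of a task that requires computation of highest number of clusters, a task that requires maximum number of iterations to converge, a task with fewest or containing maximum shared clustering attributes across the tasks, or a combination thereof”
	This citation from Gupta mentions the hierarchy for selecting which task(s) are completed in which order when there are a plurality of operation groups and shared tasks. Specifically, the citation mentions that one of the criteria could be which task has the fewest or the maximum shared clustering attributes across the tasks. This implicitly shows that the cluster-IDs, job-IDs, and other data recorded as taught by Gupta would keep track of how many times each task is shared between the clusters and how many clusters share each task. ]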
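For illustration only, the shared ID and shared count discussed in the rejection of claim 9 can be pictured with the following minimal sketch. It is not taken from Gupta or from the application. The function name assign_shared_ids, the table layout, and the integer ID scheme are hypothetical assumptions.

```python
from collections import defaultdict


def assign_shared_ids(groups):
    """Map each operation sequence found in more than one group to a shared
    ID and a shared count (the number of groups that contain it).

    groups: iterable of (network_name, operations_tuple) pairs.
    """
    owners = defaultdict(list)
    for network, operations in groups:
        owners[operations].append(network)

    table = {}
    next_id = 0  # hypothetical ID scheme: consecutive integers
    for operations, networks in owners.items():
        if len(networks) > 1:
            table[operations] = {"shared_id": next_id, "shared_count": len(networks)}
            next_id += 1
    return table


table = assign_shared_ids([
    ("A", ("conv", "relu")),
    ("B", ("conv", "relu")),
    ("B", ("fc",)),
])
```

Here the count is derived directly from how many groups carry the same operations, which is the bookkeeping the rejection treats as implicit in Gupta's selection criteria.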





In regard to claim 10, the method of claim 9 is taught by Gupta/Yokono/Lee as seen in the rejection of claim 9 above. Lee further teaches the following:
wherein the selecting of the first hardware comprises: selecting the first hardware based on at least one of preferred information of applications, amounts of available processing resources, or the shared count;
[ (¶0085) “An application server may be selected based on the determined application requirements and based on the determined application server conditions (block 840). For example, server selection mechanism 520 may compare the determined application requirements with the determined application server conditions to identify a particular application server 150 that best matches the application requirements.” ]

and assigning a resource ID of the first hardware to the shared operation group.
[ (Fig. 11) and (¶0113) 
	The tailored session activation request explained in the above-cited figure and paragraph is equivalent to the resource ID of the claim limitation. The session activation is tied to the application itself (which would be the operation group that is taught by Gupta above) and follows the application wherever it goes in the system. The session activation may be updated (tailored) whenever the application server associated with the application session is assigned. ]
	Please refer to the motivation to combine from claim 1.



In regard to claim 11, the method of claim 10 is taught by Gupta/Yokono/Lee as seen in the rejection of claim 10 above. Gupta further teaches the following:
wherein the selecting of the first hardware comprises: setting a priority with respect to the shared operation group, such that the first hardware executes the shared operation group prior to other operations.
[ (Col. 1, Lines 50-55) “The primary task selection criteria may include selection of a task that requires computation of highest number of clusters, a task that requires maximum number of iterations to converge, a task with fewest or containing maximum shared clustering attributes across the tasks, or a combination thereof” ]

	



In regard to claim 12, the method of claim 1 is taught by Gupta/Yokono/Lee as seen in the rejection of claim 1 above. Gupta further teaches the following:
wherein the merging of the first operation group and the second operation group
[ (Col. 1, Lines 44-47) “The plurality of parallel tasks may include multiple clustering tasks or a clustering task and multiple grouping tasks, using one or more common data inputs in the plurality of parallel tasks”
	This citation and the paragraphs surrounding it teach grouping up operations that are common between the parallel tasks which were previously taught in the combination of Gupta/Yokono of claim 1 to be operation groups within neural networks. ]
[ (Col. 1, lines 52-55) “The plurality of parallel tasks involving zero or more relational operations and at least one non-relational operation are executed. In response to executing the plurality of parallel tasks, one or more resources of the identified resource sharing opportunities are shared across tasks involving zero or more relational operations and at least one non-relational operation.” (emphasis added)
	This citation teaches that the merge operation happens during runtime, because it clarifies that the merge operation occurs in response to the execution of the parallel tasks. ]

and the selecting of the first hardware are performed during runtime of at least one of the first neural network and the second neural network
[ (¶0085) “An application server may be selected based on the determined application requirements and based on the determined application server conditions (block 840). For example, server selection mechanism 520 may compare the determined application requirements with the determined application server conditions to identify a particular application server 150 that best matches the application requirements.” 
	This citation from Lee teaches a system where the application server(s) (equivalent to the plurality of hardware) can be selected based on the application requirements and application server conditions. The application server conditions include various processing resources such as those seen in (¶0086) which include examples like memory requirements or memory capacity among other possible limitations. ]
[ (Fig. 11) and (¶0110) “FIGS. 11 and 12 describe processes relating to an implementation where UE 110 and resource information server 140 negotiate with respect to application requirements associated with UE applications.”
	In general, this figure and the corresponding paragraph detail the process of the system. Reference number 1110 in Fig. 11 shows that the application process is started before the application server is selected, which shows that the process occurs during runtime. ]
	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method for reducing repeated operations in a neural network taught by Gupta/Yokono with the runtime operation and hardware selection taught by Lee. One of ordinary skill in the art would have recognized, before the effective filing date, that combining the runtime operation and hardware selection system of Lee with the teachings of Gupta/Yokono would provide a more 





In regard to claim 16, the processor of claim 13 is taught by Gupta/Yokono as seen in the rejection of claim 13 below. Gupta further teaches the following:
wherein the processor, with respect to the common … models, generates a shared ID, a shared count,
[ (Col. 1, Lines 61-64) “A map output of the merged task may include a combination of the cluster-id and a job-id as map-key, and a data value as the map-value may enable, at least in part, sharing of the map-output for multiple tasks.” ]
[ (Col. 1, Lines 50-55) “The primary task selection criteria may include selection of a task that requires computation of highest number of clusters, a task that requires maximum number of iterations to converge, a task with fewest or containing maximum shared clustering attributes across the tasks, or a combination thereof”
	This citation and the one above it teach the use of IDs attached to both the clusters and jobs. The cluster-ID would be equivalent to the shared-ID as it denotes which tasks are shared within the cluster. Examiner notes that the applicant’s specification points the shared-ID as having the same function/use as the operation group ID with the only difference being that the shared-ID is directed towards the neural networks rather than the shared operations within a neural network like the operation 
Gupta does not distinctly disclose that the operation groups are specifically coming from respective neural networks. However, Yokono teaches that the operations are merged from neural networks as seen below:
	Neural network
[ (Fig. 5-7) and (Col. 3, Lines 15-31)
	Yokono teaches the merging of two distinct neural networks into one structure as seen in the above figures with the additional texts as support. Yokono teaches the environment (two distinct neural networks) where the operations Gupta teaches would be performed. ]
	What is not distinctly disclosed by Gupta or Yokono and is instead taught by Lee is seen below:
 	and a resource ID indicating hardware by which the shared neural network model is executed, and adds the shared ID, the shared count, and the resource ID to model information regarding the neural network model.
[ (Fig. 11) and (¶0113) 
	The tailored session activation request explained in the above-cited figure and paragraph is equivalent to the resource ID of the claim limitation. The session activation is tied to the application itself (which would be the operation group that is taught by Gupta above) and follows the application wherever it goes in the system. The session activation may be updated (tailored) whenever the application server associated with the application session is assigned. ]
	Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method for reducing repeated operations in a neural network taught by Gupta/Yokono with the hardware resource management system taught by Lee. One of ordinary skill in the art would have recognized, before the effective filing date, that combining the resource management system of Lee with the teachings of Gupta/Yokono provides a more efficient system for reducing redundant operations across a plurality of operation groups. This yields the recognized benefit of increased overall system efficiency, because removing redundant operations shortens processing times and allocating resources allows processing times to be better managed.





In regard to claim 17, Gupta teaches the following:
a plurality of processor;
[ (Col. 6, Lines 37-39) “The instruction sets and subroutines of resource sharing process 10, which may be stored on storage device 16 coupled to computer 12, may be executed by one or more processors” ]
and a … merge module searching for an operation group included in common in a plurality of … performing different tasks,
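[ (Col. 3, Lines 5-10) “In another implementation, a computing system includes a processor and memory configured to perform operations comprising identifying one or more resource sharing opportunities across a plurality of parallel tasks. The plurality of parallel tasks includes zero or more relational operations and at least one non-relational operation.”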
	This citation teaches the system being able to identify tasks and select from the plurality of tasks, ones which are common (resource sharing opportunities) or otherwise. Examiner is interpreting the tasks of Gupta to be equivalent to the operation groups of the claim limitation being merged. Examiner also notes that the specification recites the merge module capable of being a software, hardware, or an embodiment that includes a combination (¶0034). The specification further adds (¶0035) that general circuitry along with memory and specific computer instructions could be used to carry out said operations. In light of the specification, examiner notes that the citations above that teach the processor and memory from Gupta, along with the operations being taught here teach the neural network merge module. ]
providing a shared identification (ID) 
[ (Col. 1, Lines 61-64) “A map output of the merged task may include a combination of the cluster-id and a job-id as map-key, and a data value as the map-value may enable, at least in part, sharing of the map-output for multiple tasks.” ]
[ (Col. 1, Lines 50-55) “The primary task selection criteria may include selection of a task that requires computation of highest number of clusters, a task that requires maximum number of iterations to converge, a task with fewest or containing maximum shared clustering attributes across the tasks, or a combination thereof”
	This citation and the one above it teach the use of IDs attached to both the clusters and jobs. The cluster-ID would be equivalent to the shared-ID as it denotes which tasks are shared within the cluster. Examiner notes that the applicant’s specification points the shared-ID as having the same function/use as the operation group ID with the only difference being that the shared-ID is directed towards the neural 
thereby setting the operation group to be computed in one of the plurality of processors during execution of the plurality of ...
[ (Col. 2, Lines 22-24) “The plurality of parallel tasks involving zero or more relational operations and at least one non-relational operation are executed.”
	This citation and the one above it teach the processor identifying the shared opportunities in the task and then executing those tasks. ]
[ (Col. 2, Lines 45-50) “Sharing one or more resources may include at least one of sharing data reads, sharing computations, sharing intermediate results, sharing at least one of map and reduce computations, sharing data processing resources” ]
Gupta does not distinctly disclose that the operation groups are specifically coming from respective neural networks. However, Yokono teaches that the operations are merged from neural networks as seen below:
A neural network system, comprising:
Neural network
[ (Abstract) “The neural network learning system operates on, for example, a plurality of neural networks” ]
[ (Fig. 5-7) and (Col. 3, Lines 15-31)
	Yokono teaches the merging of two distinct neural networks into one structure as seen in the above figures with the additional texts as support. Yokono teaches the environment (two distinct neural networks) where the operations Gupta teaches would be performed. ]
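Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method for reducing repeated operations taught by Gupta with the merging of two or more distinct neural networks taught by Yokono. One of ordinary skill in the art would have recognized, before the effective filing date, that the combination saves the neural network system from repeating any shared processes [ Yokono (Col. 3, Lines 4-10) ]. This yields the recognized benefit of increased overall system efficiency and requires fewer computational resources for training the machine learning networks.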
	What is not distinctly disclosed by Gupta or Yokono and is subsequently taught by Lee is seen below:
providing a shared identification (ID) and a resource ID to a operation group that is found,
[ (Fig. 11) and (¶0113) 
	The tailored session activation request explained in the above-cited figure and paragraph is equivalent to the resource ID of the claim limitation. The session activation is tied to the application itself (which would be the operation group that is taught by Gupta above) and follows the application wherever it goes in the system. The session activation may be updated (tailored) whenever the application server associated with the application session is assigned. Examiner notes that Lee teaches assigning multiple pieces of data to the application, which would include the shared ID as taught by Gupta above. ]
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method for reducing repeated operations in a neural network taught by Gupta/Yokono with the hardware resource management system taught by Lee. One of ordinary skill in the art would have recognized, before the effective filing date, that combining the resource management system of Lee with the teachings of Gupta/Yokono provides a more efficient system for reducing redundant operations across a plurality of operation groups. This yields the recognized benefit of increased overall system efficiency, because removing redundant operations shortens processing times and allocating resources allows processing times to be better managed.




In regard to claim 19, the neural network system of claim 17 is taught by Gupta/Yokono/Lee as seen in the rejection of claim 17 above. Gupta further teaches the following:
wherein the … merge module comprises: a merger searching for the operation group from the plurality of …
[ (Col. 3, Lines 5-10) “In another implementation, a computing system includes a processor and memory configured to perform operations comprising identifying one or more resource sharing opportunities across a plurality of parallel tasks. The plurality of parallel tasks includes zero or more relational operations and at least one non-relational operation.”
	This citation teaches the system being able to identify tasks and select, from the plurality of tasks, those which are common (resource sharing opportunities) or otherwise. Examiner is interpreting the tasks of Gupta to be equivalent to the operation groups of the claim limitation being merged. Examiner also notes that the specification recites the merge module as capable of being software, hardware, or an embodiment that includes a combination (¶0034). The specification further adds (¶0035) that general circuitry along with memory and specific computer instructions could be used to carry out said operations. However, there is no specific description for the merger. In light of the specification, examiner notes that the citations that teach the processor and memory from Gupta, along with the operations being taught here, teach the neural network merge module and the merger. ]
and setting the shared ID and a shared count to the operation group;

This citation and the one below it teach the use of IDs attached to both the clusters and jobs. The cluster-ID would be equivalent to the shared-ID as it denotes which tasks are shared within the cluster.]
[ (Col. 1, Lines 50-55) “The primary task selection criteria may include selection of a task that requires computation of highest number of clusters, a task that requires maximum number of iterations to converge, a task with fewest or containing maximum shared clustering attributes across the tasks, or a combination thereof”
	Further, this citation from Gupta mentions the hierarchy for selecting which task(s) are going to be completed in which order when there are a plurality of operation groups and shared tasks. Specifically, the citation mentions that one of the criteria could be which one of the tasks is shared between the fewest or maximum shared clusters. This implicitly shows that the cluster-IDs, Job-IDs and other data recorded as taught by Gupta would keep track of how many times each task is shared between the clusters and how many clusters share each task which examiner interprets to be equivalent to the shared count. ]
Gupta does not distinctly disclose that the operation groups are specifically coming from respective neural networks. However, Yokono teaches that the operations are merged from neural networks as seen below:
Neural network
[ (Abstract) “The neural network learning system operates on, for example, a plurality of neural networks” ]
[ (Fig. 5-7) and (Col. 3, Lines 15-31) ]

	Please see the motivation to combine from the rejection of claim 17 above.
	What is not distinctly disclosed by Gupta or Yokono and is subsequently taught by Lee is seen below:
 and an assignor selecting a processor to compute the operation group from among the plurality of processors, based on an execution policy, computing resources, and the shared count set with respect to the tasks,
[ (¶0085) “An application server may be selected based on the determined application requirements and based on the determined application server conditions (block 840). For example, server selection mechanism 520 may compare the determined application requirements with the determined application server conditions to identify a particular application server 150 that best matches the application requirements.” 
	This citation from Lee teaches a system where the application server(s) (equivalent to the plurality of hardware) can be selected based on the application requirements and application server conditions. The server selection mechanism (equivalent to the assignor) is able to select the hardware to perform the operation based on the server conditions and other parameters. The application server conditions include various processing resources such as those seen in (¶0086), which include examples like memory requirements or memory capacity among other possible limitations. Examiner notes that the shared count set is taught by Gupta, as seen above, and would be passed by the server selection mechanism to the application server through the combination of their teachings. ]
and adding the resource ID of the processor to information regarding the plurality of neural networks.
[ (Fig. 11) and (¶0113) ]

	Please see the motivation to combine from the rejection of claim 17 above.




In regards to claim 20, The neural network system of claim 19, is taught by Gupta/Yokono/Lee as seen in the rejection for claim 19 above. Gupta continues to teach the following:
wherein the merger sets a priority with respect to the processor to the operation group.
[ (Col. 1, Lines 50-55) “The primary task selection criteria may include selection of a task that requires computation of highest number of clusters, a task that requires maximum number of iterations to converge, a task with fewest or containing maximum shared clustering attributes across the tasks, or a combination thereof”
	This citation teaches that the operation groups in Gupta will have a priority hierarchy in terms of which tasks are selected to be processed first. Although not explicitly mentioning priority, it is implicit that the system will select (primary task being the priority) which tasks based 




Claims 13 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Gupta (US8984515B2), and further in view of Yokono (US 5295227 A).

In regards to claim 13, Gupta teaches the following:
An application processor, comprising: a memory storing programs; a processor configured to execute the programs stored in the memory;
[ (Col. 5, Lines 50-57) “These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified” 
This citation teaches the processor and the memory within the system. Further it teaches the processor running instructions stored in the memory to execute the operations. ]
and a … merge module comprising programs loadable in the memory, wherein the processor, by executing the … merge module, identifies common … models from among a plurality of … performing different tasks,
[ (Col. 3, Lines 5-10) “In another implementation, a computing system includes a processor and memory configured to perform operations comprising identifying one or more resource sharing opportunities across a plurality of parallel tasks. The plurality of parallel tasks includes zero or more relational operations and at least one non-relational operation.”
	This citation teaches the system being able to identify tasks and select, from the plurality of tasks, those which are common (resource sharing opportunities) or otherwise. Examiner notes that applicant’s specification, (¶0032) through (¶0035), explains the operation of the neural network merge module and elaborates that the merge module combines common operation groups of the sub-neural networks. The sub-neural networks (according to the depiction within figure 3) seem to be performing the same pipeline processing, where an input is passed along a series of operations and redundant operations could be merged between two parallel streams of operations. Examiner is interpreting the tasks of Gupta to be equivalent to the models of the claim limitation being merged. Examiner also notes that the specification recites the merge module as capable of being software, hardware, or an embodiment that includes a combination (¶0034). The specification further adds (¶0035) that general circuitry along with memory and specific computer instructions could be used to carry out said operations. In light of the specification, examiner notes that the citations above that teach the processor and memory from Gupta, along with the operations being taught here, teach the neural network merge module. ]
and merges the common … models included in the plurality of … to be executed by a single process.
[ (Col. 2, Lines 45-50) “Sharing one or more resources may include at least one of sharing data reads, sharing computations, sharing intermediate results, sharing at least one of map and reduce computations, sharing data processing resources” ]
[ (Col. 1, Lines 44-47) “The plurality of parallel tasks may include multiple clustering tasks or a clustering task and multiple grouping tasks, using one or more common data inputs in the plurality of parallel tasks” ]

[ (Col. 1, Lines 52-55) “a task that requires computation of highest number of clusters, a task that requires maximum number of iterations to converge, a task with fewest or containing maximum shared clustering attributes across the tasks,”
	This citation teaches the grouping of tasks each having identical operations (shared tasks among clusters) ]
Gupta does not distinctly disclose that the operation groups are specifically coming from respective neural networks. However, Yokono teaches that the operations are merged from neural networks as seen below:
	Neural network
[ (Fig. 5-7) and (Col. 3, Lines 15-31)
	Yokono teaches the merging of two distinct neural networks into one structure as seen in the above figures with the additional texts as support. Yokono teaches the environment (two distinct neural networks) where the operations Gupta teaches would be performed. ]
	Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine a method for reducing repeated operations as taught by Gupta with the merging of two or more distinct neural networks as taught by Yokono. The reason it would be obvious is one of ordinary skill in the art would recognize, prior to the effective filing date, that combining the two would save the neural network system from having to repeat any shared processes [ Yokono (Col. 3, Lines 4-10) ]. This would facilitate the recognized benefit of an increased efficiency for the system overall and require fewer computational resources for training the machine learning networks.



In regards to claim 14, The processor of claim 13, is taught by Gupta/Yokono as seen in the rejection for claim 13 above. Gupta continues to teach the following:
wherein the processor obtains the common … model, based on … models identifications (IDs) preset with respect to each of the plurality of ....
[ (Col. 1, Lines 61-64) “A map output of the merged task may include a combination of the cluster-id and a job-id as map-key, and a data value as the map-value may enable, at least in part, sharing of the map-output for multiple tasks.”
	This citation teaches the use of IDs for the cluster, which is equivalent to the operation group ID. As in the rejection for claim 13, examiner is interpreting the model(s) of the claim language to be equivalent to the tasks of Gupta. Therefore, the shared IDs being used to identify common sub-models or operation groups within the overall models are equivalent to the teachings of Gupta, which teach finding the common tasks within the clusters. ]
	Gupta does not distinctly disclose that the operation groups are specifically coming from respective neural networks. However, Yokono teaches that the operations are merged from neural networks as seen below:
	Neural network
[ (Fig. 5-7) and (Col. 3, Lines 15-31)
	Yokono teaches the merging of two distinct neural networks into one structure as seen in the above figures with the additional texts as support. Yokono teaches the environment (two distinct neural networks) where the operations Gupta teaches would be performed. ]
	Please refer to the motivation to combine from the rejection of claim 13 above. 






Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Gupta/Yokono/Lee as applied above, and further in view of CN (CN 108780406 B).


In regards to claim 8, The method of claim 7, is taught by Gupta/Yokono as seen in the rejection for claim 7 above. CN continues to teach the following:
further comprising: accessing, by second hardware, the shared buffer in response to a shared buffer ready signal; 
[ (Pg. 7, Paragraph 20) “As yet another example of using RDMA in the context of database management, a situation may arise where a replicated database server is newly designated as the primary replication server. When this occurs, the buffer pool of the newly designated master server is "cold" and may impair the performance of the database before filling the buffer pool with relevant data. However, when RDMA is available, RDMA memory accesses may be used to fill the buffer pool of the newly designated primary server by copying the existing contents of the buffer pool from the previous primary server and the now secondary replica server using RDMA memory transfers.”
	This citation from CN teaches the multiple computing systems with a shared buffer that allows any of the computing systems connected to access said buffer. Examiner is interpreting the designation of the server as the shared buffer ready signal. As seen in (Pg. 9, Section “N”), whenever the system designates one of the computing servers to be the master replication server, the action(s) consisting of accessing the shared buffer begin. ]
and executing, by the second hardware, operations that follow the second operation group.

	This citation shows that in some embodiments the computing system from CN will be able to execute the same operations as one another. ]
	Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine a method for reducing repeated operations in a neural network as taught by Gupta/Yokono with the shared memory and processing system as taught by CN. The reason it would be obvious is one of ordinary skill in the art would recognize, prior to the effective filing date, that combining the two would provide faster memory access speeds [ CN (Abstract) ]. This would facilitate the recognized benefit of an increased efficiency for the system overall by speeding up the processing time and reducing the memory load times.






Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Gupta/Yokono as applied above, and further in view of Keskin (US 20180314524 A1).


In regards to claim 15, The processor of claim 13, is taught by Gupta/Yokono as seen in the rejection for claim 13 above. Keskin continues to teach the following:
wherein the processor traces an operation process in runtime with respect to at least one neural network from among the plurality of neural networks,
[ (¶0023) “In operation, the branch predictor circuit may identify branch instructions in a given application program using a sample trace of that program. In some embodiments, a neural network, such as a convolutional neural network is then trained “offline”, using a branch history data for these particular branches”
	This citation from Keskin teaches tracing a neural network to go through all of the operations within all of its branches. Examiner notes that the training is done “offline” post-trace with the tracing operation still being performed during runtime. Further, the multiple neural networks are taught by Yokono.]
thereby generating an operation history with respect to the at least one neural network.
[ (¶0055) “The resulting decision tree may be a flow chart like structure stored on a memory device (e.g., memory 201 of FIG. 2) that may be used to classify the portion of features.”
	This citation teaches the operation history of the neural network being saved (in the form of a decision tree). ]
	Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine a method for reducing repeated operations in a plurality of neural networks as taught by Gupta/Yokono with the neural network operation tracing as taught by Keskin. The reason it would be obvious is one of ordinary skill in the art would recognize, prior to the effective filing date, that combining the two would eliminate the need for the processing system to wait for an outcome to continue down multiple branches [ Keskin (Abstract) ]. This would facilitate the recognized benefit of an increased efficiency for the system overall by speeding up the processing time and reducing the waiting time for different threads/branches throughout the different networks.





Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Gupta/Yokono/Lee as applied above, and further in view of Keskin (US 20180314524 A1).


In regards to claim 18, The neural network system of claim 17, is taught by Gupta/Yokono/Lee as seen in the rejection for claim 17 above. Keskin continues to teach the following:
wherein the neural network merge module traces operation processes in runtime with respect to at least one neural network from among the plurality of neural networks,
[ (¶0023) “In operation, the branch predictor circuit may identify branch instructions in a given application program using a sample trace of that program. In some embodiments, a neural network, such as a convolutional neural network is then trained “offline”, using a branch history data for these particular branches”
	This citation from Keskin teaches tracing a neural network to go through all of the operations within all of its branches. Examiner notes that the training is done “offline” post-trace with the tracing operation still being performed during runtime. Further, the multiple neural networks are taught by Yokono.]
thereby generating an operation history with respect to the at least one neural network.
[ (¶0055) “The resulting decision tree may be a flow chart like structure stored on a memory device (e.g., memory 201 of FIG. 2) that may be used to classify the portion of features.”
	This citation teaches the operation history of the neural network being saved (in the form of a decision tree). ]
	Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine a method for reducing repeated operations in a plurality of neural networks as taught by Gupta/Yokono/Lee with the neural network operation tracing as taught by Keskin. The reason it would be obvious is one of ordinary skill in the art would recognize, prior to the effective filing date, that combining the two would eliminate the need for the processing system to wait for an outcome to continue down multiple branches [ Keskin (Abstract) ]. This would facilitate the recognized benefit of an increased efficiency for the system overall by speeding up the processing time and reducing the waiting time for different threads/branches throughout the different networks. 





Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US 6968335 B2 – Method and system for parallel processing of database queries, which teaches portions of data being split and passed between different nodes.
US 11269643 B2 – data operations and finite state machine for machine learning via bypass of computational tasks based on frequent use which teaches merging of operations and identifying repeated operations. 
US 11087206 B2 – Smart memory handling and data management for machine learning networks which teaches multiple networks being conjoined together, tables for organizing data, operation IDs and parallel operation execution. 




Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL MERABI whose telephone number is (571)272-9685. The examiner can normally be reached Mon-Fri 7:30am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov can be reached on (571) 270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about 





/M.A.M./Examiner, Art Unit 2123
/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123