DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Allowable Subject Matter
Claim 17 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

The following is a statement of reasons for the indication of allowable subject matter: the prior art references do not explicitly teach all the elements of claim 17 as explained below. Thus, the claim has allowable subject matter.
Bergstra (U.S. Pat. App. Pre-Grant Pub. No. 2017/0140259): describing a first agent (i.e. a write agent) and a second agent (i.e. a read agent), wherein “the first agent executes with a first frequency and the second agent executes with a second frequency. For example, a write agent writes every 20 milliseconds and a read agent reads every 5 milliseconds.” (Bergstra [0229]-[0230]). While Bergstra teaches that two agents operate at two different frequencies and that the frequencies are multiples of each other, Bergstra does not explicitly teach that the agents are respectively a first NN and a second NN as recited in the claim. Accordingly, the claim is distinguishable from the reference.
Yang et al. (U.S. Pat. App. Pre-Grant Pub. No. 2020/0251119): describing that “[t]he first neural network [NN] may be a filter for performing a neural network operation for .

Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference character(s) not mentioned in the description:
Fig. 2: reference label 201.
Fig. 4: reference label 201.
Corrected drawing sheets in compliance with 37 CFR 1.121(d), or amendment to the specification to add the reference character(s) in the description in compliance with 37 CFR 1.121(b) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

The drawings are objected to because the figure labels and page numbers in Figs. 1, 2, and 4 are not oriented in the same direction as the reference labels. All labels, numbers, and reference characters must share the same orientation (see MPEP § 608.02(V), 37 CFR 1.84(p)(1)).
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification
Applicant is reminded of the proper language and format for an abstract of the disclosure.
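The abstract should be in narrative form and generally limited to a single paragraph on a separate sheet within the range of 50 to 150 words in length. The abstract should describe the disclosure sufficiently to assist readers in deciding whether there is a need for consulting the full patent text for details.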
The language should be clear and concise and should not repeat information given in the title. It should avoid using phrases which can be implied, such as, “The disclosure concerns,” “The disclosure defined by this invention,” “The disclosure describes,” etc.  In addition, the form and legal phraseology often used in patent claims, such as “means” and “said,” should be avoided.
The Abstract currently exceeds 150 words and also contains the title. The Abstract should be amended to contain no more than 150 words, and the title should be removed from the Abstract. Appropriate correction is required.

Claim Objections
Claim 20 is objected to because of the following informalities: there should be a colon after “comprising” on line 1, and the comma on line 2 should be changed to a semicolon. The claim should be amended as follows: “The method of claim 19, comprising: merging a first layer serving the first neural network and a second layer serving the second neural network to form the merged layer[[,]]; and replacing the first and second layers….” Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 5-7, 9, and 16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

The following elements in claims 5-7, 9, and 16 lack sufficient antecedent basis:
Claim 5: “the form” on line 2. 
Claim 6: “the form” on line 2.
Claim 7: “the same dimensions” on line 2.
Claim 9: “the same activation function” on line 2. 
Claim 16: “the form” on line 2 and “the merged layer” on line 13.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 5-7, 15, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Fukuda et al. (U.S. Pat. App. Pre-Grant Pub. No. 2018/0053087, hereinafter Fukuda) in view of He et al., “Multi-Task Zipping via Layer-wise Neuron Sharing” (hereinafter He).

Regarding claim 1, Fukuda teaches:
A neural network system executable on a processor, the neural network system, when executed on the processor ([0095]-[0096] and [0099]: describing that a system, corresponding to a computer system, “reads a computer program for executing the trained front-end NN [neural network] and a back-end NN from the storage and then combines the back-end NN with the trained front-end NN” to generate a combined NN. 
See also [0022], [0024] and [0136]-[0137]: further describing the computer and its operating system, as well as describing a processor for executing computer readable program instructions. Wherein performing the instructions comprises implementing the combined NN as previously described.), 
comprising a merged layer shareable between a first neural network and a second neural network, wherein the merged layer is configured to ([0048]-[0050] and [0066]: describing a combined NN comprising a front-end NN (i.e. a first NN) and a back-end NN (i.e. a second NN) which are joined via a joint layer that is shared between the two NNs. The NNs are depicted in Fig. 2.): 
receive input data from a prior layer of at least one of the first and second neural networks ([0030]: describing the input layer of the front-end NN, wherein the input layer denotes a prior layer from the joint layer of the combined NN as shown in Fig. 2. The input layer being able to receive input data ([0031]).); 
… to the input data to generate intermediate feature data representative of at least one feature of the input data ([0031]-[0033]: describing that the front-end NN determines clean feature data from noisy input feature data that is inputted into the front-end NN via its input layer. See also [0106]-[0109]: describing that the system comprises a “feature extraction section” and “feature mapping section” to obtain features derived from received, e.g. training data or input data. Wherein input data was previously described.)…; and 
([0030], [0035], [0040]-[0041]: describing that the clean feature data is outputted from the front-end NN via one or more hidden layers and an output layer, which are subsequent layers of the front-end NN. It is commonly known that hidden layers and an output layer occur at locations after an initial layer location and thus denote subsequent layers), the at least one subsequent layer serving the first and second neural networks ([0041], [0044], [0048], and [0050]: describing that the output layer of the front-end NN acts as an input into the input layer of the back-end NN and is integrable as a hidden layer in the combined NN. That is, the output layer of the front-end NN is a subsequent layer that serves the front-end and back-end NNs. In addition, the hidden layers of the front-end NN help to generate the output that is then outputted into the output layer of the front-end NN ([0030] and [0037]), wherein the output layer serves both NNs as described above. Thus, the hidden layers also provide a service to both the NNs by virtue of helping to generate the output in the front-end NN.).

While the cited reference Fukuda teaches the above limitations of claim 1, it does not explicitly teach: “apply a superset of weights …, the superset of weights being combined from a first set of weights associated with the first neural network and a second set of weights associated with the second neural network” on lines 7-10. He teaches: two neural networks (NNs), MA and MB, that have weight matrices $W_l^A$ and $W_l^B$, respectively, with corresponding respective weight vectors for l-th layers of the respective NNs (He Sections 3.1 and 3.2). Wherein the layers can be zipped, i.e. combined, and the corresponding respective weights and weight vectors are merged (He Section 3.2). The merged weights denoting a superset of weights. See also Sections 3.3-3.7: describing additional details about this process as it applies to additional layers and networks.
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system in the cited reference to include the merged weights in He. Doing so would enable “Multi-Task Zipping (MTZ), a framework to automatically merge correlated, pre-trained deep neural networks for cross-model compression. Central in MTZ is a layer-wise neuron sharing and incoming weight updating scheme that induces a minimal change in the error function. MTZ inherits information from each model and demands light retraining to re-boost the accuracy of individual tasks.” (He Abstract).
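For illustration only, the claim language mapped above (a merged layer applying a superset of weights combined from a first set and a second set of weights) can be sketched as follows. This is a minimal sketch under stated assumptions and is neither the neuron-sharing scheme of He nor the implementation of the application: the superset is formed here by simple column-wise concatenation, and the names `merge_weights` and `apply_merged_layer`, the matrix shapes, and the values are hypothetical.

```python
# Illustrative sketch only. Column-wise concatenation is an ASSUMED way of
# forming a "superset of weights"; it is not the merging scheme of He.

def merge_weights(w_a, w_b):
    """Combine two weight matrices with the same number of rows (inputs)."""
    if len(w_a) != len(w_b):
        raise ValueError("both weight matrices must accept the same input size")
    return [row_a + row_b for row_a, row_b in zip(w_a, w_b)]

def apply_merged_layer(x, w_merged):
    """Apply the superset of weights to one input vector (computes x @ W)."""
    n_out = len(w_merged[0])
    return [sum(x[i] * w_merged[i][j] for i in range(len(x))) for j in range(n_out)]

# Hypothetical weights: network A maps 2 inputs to 2 outputs, network B to 1.
W_A = [[1.0, 0.0], [0.0, 1.0]]
W_B = [[2.0], [3.0]]
W_MERGED = merge_weights(W_A, W_B)

# One pass over the input produces feature data for both networks.
features = apply_merged_layer([4.0, 5.0], W_MERGED)
features_a, features_b = features[:2], features[2:]
```

In this sketch the input is read once and the output is split into one region per network, which is the practical point of sharing a single layer between the two networks.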

Regarding claim 5, the rejection of claim 1 is incorporated. He further teaches:
The neural network system of claim 1, wherein the superset of weights is applicable to the input data in the form of a kernel (He Section 3.2: describing that the respective weights of the two NNs and their corresponding merged weights are representable in matrix form. Whereby it is commonly known that a kernel is represented by matrices. Thus, the weights in matrix form represent a kernel. The weight matrices being applicable to the input data as shown in Algorithm 1. Similarly, see also Sections 3.4-3.7: describing additional details about the various weight matrices as they apply to additional layers and networks.).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system in the cited reference to include the merged weights in He. A motivation to combine the cited reference with He was previously given.

Regarding claim 6, the rejection of claim 1 is incorporated. He further teaches:
The neural network system of claim 1, wherein the first and second sets of weights are applicable to the input data in the form of respective first and second kernels (He Section 3.2: describing that the respective weights of the two NNs are representable in matrix form. Whereby it is commonly known that a kernel is represented by matrices. Thus, the weights in matrix form represent a kernel, with weight matrices $W_l^A$ and $W_l^B$ representing kernels of the respective NN. The weight matrices being applicable to the input data as shown in Algorithm 1. Similarly, see also Sections 3.4-3.7: describing additional details about the various weight matrices as they apply to additional layers and networks.).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system in the cited reference to include the merged weights in He. A motivation to combine the cited reference with He was previously given.

Regarding claim 7, the rejection of claim 6 is incorporated. He further teaches:
The neural network system of claim 6, wherein the first and second kernels have the same dimensions (He Section 3.6: describing that weight vectors for the corresponding weight matrices of the respective NN models MA and MB have the same dimension. Wherein the weight matrices represent kernels.).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system in the cited reference to include the same dimensions in He. Doing so would enable “Multi-Task Zipping (MTZ), a framework to automatically merge correlated, pre-trained deep neural networks for cross-model compression. Central in MTZ is a layer-wise neuron sharing and incoming weight updating scheme that induces a minimal change in the error function. MTZ inherits information from each model and demands light retraining to re-boost the accuracy of individual tasks.” (He Abstract).

Regarding claim 15, the rejection of claim 1 is incorporated. Fukuda teaches:
The neural network system of claim 1, wherein the neural network system comprises the prior layer of at least one of the first and second neural networks ([0030] and [0037]: describing a prior layer, e.g. an input layer, of the front-end NN, i.e. the first NN.), 
the prior layer comprising a same layer serving both of the first and second neural networks ([0041], [0044], [0048], and [0050]: describing that the input layer of the front-end NN receives input data which is computed and then outputted at the output layer of the front-end NN. Wherein the output layer of the front-end NN acts as an input into the input layer of the back-end NN, i.e. second NN, and is integrable as a hidden layer in the combined NN. Thus the input layer serves both NNs.) and being configured to generate the input data for the merged layer ([0048]-[0049] and [0051]: describing that the input layer of the front-end NN receives input data that is then computed to generate input data for the joint layer in the combined NN as shown in Fig. 2, wherein the joint layer is derived from combining layers from the front-end NN and back-end NN to become the combined NN.).

Regarding independent claim 19, claim 19 is substantially similar to independent claim 1 and therefore is rejected on the same grounds as claim 1. Claim 19 is a method claim that corresponds to system claim 1.
The following addresses the limitations of claim 19 that differ from claim 1.
Fukuda teaches:
A method of processing data using a processor configured to execute a neural network system, the method comprising ([0095]-[0096] and [0099]: describing a data process that is operated on a system, corresponding to a computer system, that “reads a computer program for executing the trained front-end NN [neural network] and a back-end NN from the storage and then combines the back-end NN with the trained front-end NN” to generate a combined NN. 
See also [0022], [0024] and [0136]-[0137]: further describing the computer and its operating system, as well as describing a processor for executing computer readable program instructions. Wherein performing the instructions comprises implementing the combined NN as previously described.): 
receiving input data at a merged layer, shared between a first neural network and a second neural network of the neural network system, from a prior layer of at least one of the first and second neural networks ([0048]-[0050] and [0066]: describing a combined NN comprising a front-end NN (i.e. a first NN) and a back-end NN (i.e. a second NN) which are joined via a joint layer that is shared between the two NNs. The input layer of the front-end NN receives input data for the joint layer, wherein the input layer denotes a prior layer from the joint layer of the combined NN as shown in Fig. 2. The input layer being able to receive input data ([0031]).); ….



Regarding claim 20, the rejection of claim 19 is incorporated. Fukuda teaches:
The method of claim 19, comprising 
merging a first layer serving the first neural network and a second layer serving the second neural network to form the merged layer ([0048]-[0050]: describing a merging of the output layer from the front-end NN, i.e. first layer serving the first NN, with the input layer from the back-end NN, i.e. second layer from the second NN, to form the joint layer in the combined NN. This is shown in Fig. 2.), and 
replacing the first and second layers with the merged layer shared between the first and second neural networks ([0048]-[0050]: describing that the joint layer replaces both the first and second layers in the respective NN, wherein the joint layer is now shared between the two NNs in the combined NN. This is shown in Fig. 2.), 
…, stored in storage accessible to the processor ([0024], [0131], and [0136]: describing “computer readable storage” for storing data/instructions that is accessible to the processor for execution by the processor.), ….

While the cited reference Fukuda teaches the above limitations of claim 20, it does not explicitly teach: “wherein the merging comprises combining the first and second sets of weights … to form the superset of weights”. He further teaches: two neural networks (NNs), MA and MB, that have weight matrices $W_l^A$ and $W_l^B$, respectively, with corresponding respective weight vectors for l-th layers of the respective NNs (He Sections 3.1 and 3.2). Wherein the layers can be zipped, i.e. combined, and the corresponding respective weights and weight vectors are merged (He Section 3.2). The merged weights denoting a superset of weights. See also Sections 3.3-3.7: describing additional details about this process as it applies to additional layers and networks.
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system in the cited reference to include the merged weights in He. Doing so would enable “Multi-Task Zipping (MTZ), a framework to automatically merge correlated, pre-trained deep neural networks for cross-model compression. Central in MTZ is a layer-wise neuron sharing and incoming weight updating scheme that induces a minimal change in the error function. MTZ inherits information from each model and demands light retraining to re-boost the accuracy of individual tasks.” (He Abstract).

Claims 2-4 are rejected under 35 U.S.C. 103 as being unpatentable over Fukuda et al. (U.S. Pat. App. Pre-Grant Pub. No. 2018/0053087, hereinafter Fukuda) and He et al., “Multi-Task Zipping via Layer-wise Neuron Sharing” (hereinafter He) in view of Levi et al. (U.S. Pat. App. Pre-Grant Pub. No. 2020/0151465, hereinafter Levi).

Regarding claim 2, the rejection of claim 1 is incorporated. The cited references in combination do not explicitly teach: “wherein the input data comprises sensor-originated data”.
Levi teaches: a sensor system comprising a plurality of sensor devices (Levi [0027]-[0028]) for gathering image data. Wherein the image data can be input into the NN (Levi [0038]-[0039] and [0055]).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system and merged weights in the combined cited references to include the sensor data in Levi. Doing so would enable “the system includ[ing] a sensor and a multi-layer convolutional neural network. The sensor generates an image indicative of a road scene of the vehicle. The multi-layer convolutional neural network generates a plurality of feature maps from the image via a first processing pathway, projects at least one of the plurality of feature maps onto a defined plane relative to a defined coordinate system of the road scene to obtain at least one projected feature map, applies a convolution to the at least one projected feature map in a second processing pathway to obtain a final feature map, and determines lane information from the final feature map.” (Levi Abstract).

Regarding claim 3, the rejection of claim 2 is incorporated. Levi further teaches:
The neural network system of claim 2, wherein the sensor-originated data comprises feature data representative of one or more features of the sensor-originated data (Levi [0031]-[0032]: describing that “a first feature map 204 [is] obtained from the image 202”. That is, “[t]he plurality of images can be obtained from sensors having different sensor modalities. At block 1004, a multi-layer convolution neural network is applied to the image, wherein a first processing pathway of the neural network generates a plurality of feature maps from the image.” (Levi [0055]). The sensors for obtaining image data are previously described.).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system and merged weights in the combined cited references to include the sensor data in Levi. A motivation to combine the cited references with Levi was previously given.

Regarding claim 4, the rejection of claim 2 is incorporated. Levi further teaches:
The neural network system of claim 2, wherein the sensor-originated data comprises one or more of: image data (Levi [0031], [0038]-[0039], [0041], and [0055]: describing that the sensor data comprises image data.); video data; and audio data.
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system and merged weights in the combined cited references to include the sensor data in Levi. A motivation to combine the cited references with Levi was previously given.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Fukuda et al. (U.S. Pat. App. Pre-Grant Pub. No. 2018/0053087, hereinafter Fukuda) and He et al., “Multi-Task Zipping via Layer-wise Neuron Sharing” (hereinafter He) in view of Chou et al., “Unifying and Merging Well-trained Deep Neural Networks for Inference Stage” (hereinafter Chou).

Regarding claim 8, the rejection of claim 6 is incorporated. The cited references in combination do not explicitly teach: wherein the first and second kernels are applicable to the input data using at least one of: a same stride; a same processing frequency; and a same processing resolution. Chou teaches: a similar processing resolution as denoted by processing of input volumes for respective neural networks A and B, wherein the two NNs have similar resolution input volumes of N x M x d (with subscripts of A or B to denote the respective NNs) that are convolved with similar respective pA or pB convolution kernels of resolution size n x m x d (with subscripts of A or B to denote the respective NNs)  (Chou Section 3.1). 
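Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system and merged weights in the combined cited references to include the processing resolution in Chou. Doing so would enable “merging two CNNs [convolutional neural networks], our approach aligns the same-type layers (convolution; full-connection) into pairs. The layers in a pair are merged into a single layer that shares a common weight codebook through the proposed encoding scheme. The codebooks in the merged single model can be further trained via back-propagation algorithm; it thus can be fine-tuned to seek for performance improvement.” (Chou Section 1).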
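For illustration only, the relationship relied upon above (input volumes of size N x M x d convolved with p kernels of size n x m x d) can be sketched with the standard output-size arithmetic for a convolution without padding. This is a minimal sketch under stated assumptions, not code from Chou or from the application: the function name `conv_output_shape`, the absence of padding, and all numeric values are hypothetical.

```python
# Illustrative sketch only: standard no-padding convolution arithmetic.
# The function name and every numeric value below are hypothetical.

def conv_output_shape(N, M, d, n, m, kernel_depth, p, stride=1):
    """Output volume for an N x M x d input and p kernels of n x m x kernel_depth."""
    if kernel_depth != d:
        raise ValueError("kernel depth must match the input depth")
    out_h = (N - n) // stride + 1
    out_w = (M - m) // stride + 1
    return (out_h, out_w, p)

# Two networks with the same input resolution, kernel size, and stride,
# differing only in the number of kernels (p_A = 8, p_B = 16).
SHAPE_A = conv_output_shape(32, 32, 3, 5, 5, 3, p=8, stride=1)
SHAPE_B = conv_output_shape(32, 32, 3, 5, 5, 3, p=16, stride=1)
```

With these hypothetical values both networks produce 28 x 28 outputs and differ only in channel count, which shows why a same stride and a same processing resolution keep the two layers spatially aligned.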

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Fukuda et al. (U.S. Pat. App. Pre-Grant Pub. No. 2018/0053087, hereinafter Fukuda) and He et al., “Multi-Task Zipping via Layer-wise Neuron Sharing” (hereinafter He) in view of Vukotic et al., “Bidirectional Joint Representation Learning with Symmetrical Deep Neural Networks for Multimodal and Crossmodal Applications” (hereinafter Vukotic).

Regarding claim 9, the rejection of claim 6 is incorporated. The cited references in combination do not explicitly teach: “wherein the first and second kernels are associated with the same activation function”. Vukotic teaches: two modalities of NNs that are tied together such that the variables between the two NNs are the same, including the weight matrices (Vukotic Section 2.2). Whereby it is commonly known that a kernel is represented by matrices. Thus, the weights in matrix form represent a kernel. The matrices of the two modalities having associated activation functions that are similar since the variables in the tied modalities are the same (Vukotic Section 2.2).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system and merged weights in the combined cited references to include the activation functions in Vukotic. Doing so would enable “[o]ur proposed deep neural network architecture with tied weights is trained bidirectionally from the first modality to the second and from the second modality to the first and creates a crossmodal mapping between the two representation spaces that can be successfully used in multimodal query expansion. Due to its enforced symmetry, a joint multimodal embedding is also created that further improves multimodal data representation.” (Vukotic Section 4).

Claims 10-14 are rejected under 35 U.S.C. 103 as being unpatentable over Fukuda et al. (U.S. Pat. App. Pre-Grant Pub. No. 2018/0053087, hereinafter Fukuda) and He et al., “Multi-Task Zipping via Layer-wise Neuron Sharing” (hereinafter He) in view of Alwani et al., “Fused-Layer CNN Accelerators” (hereinafter Alwani).

Regarding claim 10, the rejection of claim 1 is incorporated. The cited references in combination do not explicitly teach: “wherein the intermediate feature data comprises a feature map having first and second regions corresponding respectively to the first and second sets of weights”. Alwani teaches: a process for fusing layers in a NN that includes determining intermediate feature data comprising intermediate feature maps that show first and second tile region locations based on data convolution using respective weight filter matrices from layers 1 and 2 (Alwani Sections III(A) and (B)) as depicted in Fig. 3. The weight filter matrices from the two layers being fused denote the first and second sets of weights. See also Alwani Section II and Fig. 1: giving a general overview of convolution in a NN.
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system and merged weights in the combined cited references to include the intermediate feature data in Alwani. Doing so would enable one “to fuse the processing of multiple CNN [convolutional neural network] layers by modifying the order in which the input data are brought on chip, enabling caching of intermediate data between the evaluation of adjacent CNN layers” (Alwani Abstract).

Regarding claim 11, the rejection of claim 10 is incorporated. Alwani further teaches:
The neural network system of claim 10, wherein the merged layer is configured to output metadata indicating the first and second regions (Alwani Sections III(B) and (D): describing a pyramid approach as shown in Fig. 4 for determining the fused layer and dimensions, i.e. metadata, corresponding to tile region locations of the feature maps (as previously described and shown in Fig. 3). Wherein the merged layer as determined via the pyramid approach provides dimensions, i.e. metadata, describing the tile region locations as shown in Fig. 3. See also Section V: describing an exploration tool and a tradeoff evaluation for performing an exploration and cost-benefit analysis of the fused layer using the pyramid approach to determine which layers should be fused and the dimensions, i.e. metadata, corresponding to the tile region locations as previously described and shown in Fig. 3. That is, the pyramid approach enables a process to “analyze the effect of fusing two or more layers by starting from the output and working backwards to calculate the dimensions of the pyramid at each level” (Alwani Section III(B)). The dimensions relating to the layers being fused and the related tile region locations and feature maps as previously described.).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system and merged weights in the combined cited references to include the output indicating the regions in the feature map in Alwani. Doing so would enable the cost-benefit evaluation described in Alwani: “[b]ased on the pyramid’s dimensions, we evaluate the costs in terms of the required storage and arithmetic operations, as well as the benefit in the amount of off-chip data transfer avoided. Given a set of layers to fuse, we start from the final layer and work backwards to find the dimensions of the pyramid.” (Alwani Section III(B)). Alwani further explains that “[q]uantifying the benefits of these methods is straightforward. For each intermediate feature map within the fused-layer pyramid, we can count the data transfer saved by avoiding writing and reading intermediate feature maps to off-chip memory. For example, by fusing layers 1 and 2 in Figure 3, we avoid writing and reading back the intermediate feature map (3 × 3 ×M points) for each CNN evaluation.” (Alwani Section III(B)).
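For illustration only, the backwards calculation of pyramid dimensions quoted above can be sketched as follows. This is a minimal sketch under stated assumptions, not Alwani's tool: it uses the standard convolution relation that an output tile of width D needs an input tile of width stride × (D − 1) + kernel, and it assumes two 3 × 3, stride-1 layers, which is consistent with the 3 × 3 intermediate tile in the quoted passage. The function names are hypothetical.

```python
def input_tile_size(output_size, kernel_size, stride):
    """Width of the input tile needed to produce an output tile of the given width."""
    return stride * (output_size - 1) + kernel_size

def pyramid_dimensions(layers, final_output_size=1):
    """Start from the final layer's output tile and work backwards.

    layers: list of (kernel_size, stride) tuples, ordered from first to last layer.
    Returns the tile widths ordered from the network input down to the final output.
    """
    sizes = [final_output_size]
    for kernel_size, stride in reversed(layers):
        sizes.append(input_tile_size(sizes[-1], kernel_size, stride))
    sizes.reverse()
    return sizes

# Assumed configuration: two fused 3x3 convolution layers with stride 1.
two_fused_layers = [(3, 1), (3, 1)]
print(pyramid_dimensions(two_fused_layers))  # [5, 3, 1]
```

In this assumed configuration a single output point depends on a 3 × 3 tile of the intermediate feature map, which in turn depends on a 5 × 5 tile of the input.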

Regarding claim 12, the rejection of claim 10 is incorporated. Fukuda teaches:
The neural network system of claim 10, wherein the neural network system comprises the at least one subsequent layer ([0095]-[0096] and [0099]: describing that a system, corresponding to a computer system, “reads a computer program for executing the trained front-end NN [neural network] and a back-end NN from the storage and then combines the back-end NN with the trained front-end NN” to generate a combined NN. Wherein the NN system comprises various subsequent layers of the NN, e.g. one or more hidden layers or an output layer of the front-end NN ([0030] and [0037]) or of the back-end NN ([0043] and [0046]) as shown in Fig. 2.)….

While the cited reference Fukuda teaches the above limitations of claim 12, it does not explicitly teach: “the at least one subsequent layer being configured to obtain a corresponding region of the first and second regions of the feature map.” Alwani further teaches: that the fused layer, derived from fusing the first and second layers, obtains first and second tile region locations of the feature map as shown in Fig. 3 as part of the fusion process (Alwani Sections III(A) and (B)).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system in the cited reference to include the fused layer and feature map in Alwani. Doing so would enable one “to fuse the processing of multiple CNN [convolutional neural network] layers by modifying the order in which the input data are brought on chip, enabling caching of intermediate data between the evaluation of adjacent CNN layers” (Alwani Abstract).

Regarding claim 13, the rejection of claim 1 is incorporated. Alwani further teaches:
The neural network system of claim 1, wherein the intermediate feature data comprises first and second feature maps (Alwani Section III(A): describing the intermediate feature map comprising first and second feature maps, such as the input feature maps and the output feature maps as shown in Fig. 3.), 
(Alwani Sections III(A) and (B): describing respective weight filter matrices from layers 1 and 2 as depicted in Fig. 3. Wherein the weight filter matrices from the two layers being fused denote the first and second sets of weights and the.),
 which are separately output from the merged layer (Alwani Section IV(B): describing the computation for implementing the fused layer, wherein the computation involves calculating the corresponding weights. The weights comprising the weight matrix filters that relate to the tile region locations in the feature maps as previously described in Sections III(A) and (B) and shown in Fig. 3.).
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system and merged weights in the combined cited references to include the intermediate data with feature maps in Alwani. Doing so would enable the management of intermediate data described in Alwani: “[t]he key to the fused-layer strategy for CNN evaluation is in the management of intermediate data. In our implementation, this functionality is provided by the intermediate data buffers and the reuse module that manages them.” (Alwani Section IV(C)).

Regarding claim 14, the rejection of claim 13 is incorporated. Fukuda teaches:
The neural network system of claim 13, wherein the neural network system comprises the at least one subsequent layer ([0095]-[0096] and [0099]: describing that a system, corresponding to a computer system, “reads a computer program for executing the trained front-end NN [neural network] and a back-end NN from the storage and then combines the back-end NN with the trained front-end NN” to generate a combined NN. Wherein the NN system comprises various subsequent layers of the NN, e.g. one or more hidden layers or an output layer of the front-end NN ([0030] and [0037]) as shown in Fig. 2.), 
the at least one subsequent layer comprising a first subsequent layer of the first neural network and a second subsequent layer of the first neural network ([0030] and [0037]: describing that the one or more hidden layers of the front-end NN, i.e. first NN, denote a first subsequent layer and a second subsequent layer of the front-end NN. Alternatively, one of the one or more hidden layers and the output layer can also denote the first and second subsequent layers ([0030] and [0037]). The hidden layers and the output layer are subsequent layers since they occur at subsequent locations in the NN as shown in Fig. 2. It is also commonly known that hidden layers and the output layer occur at locations after an initial layer location and thus denote subsequent layers.), ….

While the cited reference Fukuda teaches the above limitations of claim 14, it does not explicitly teach: “the merged layer being configured to output the first and second feature maps to the first and second subsequent layers respectively.” Alwani further teaches: a pyramid approach for determining the fused layer that outputs the first and second feature maps corresponding to the first and second convolutional layers, wherein the fused layer is generated from fusing the first and second convolutional layers (Alwani Section III(A) and Fig. 3). It is commonly known that convolutional layers are a type of hidden layer and thus are subsequent layers since they occur at subsequent locations after an initial layer location in the NN.  
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the neural network system in the cited reference as taught in Alwani. Doing so would enable one “to fuse the processing of multiple CNN [convolutional neural network] layers by modifying the order in which the input data are brought on chip, enabling caching of intermediate data between the evaluation of adjacent CNN layers” (Alwani Abstract).

Claims 16 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Fukuda et al. (U.S. Pat. App. Pre-Grant Pub. No. 2018/0053087, hereinafter Fukuda) in view of Xu et al. (U.S. Pat. App. Pre-Grant Pub. No. 2020/0301739, hereinafter Xu) and He et al., “Multi-Task Zipping via Layer-wise Neuron Sharing” (hereinafter He).

Regarding claim 16, Fukuda teaches:
A data processing method comprising ([0136]-[0137]: describing a data process via a “data processing apparatus”, e.g. computer with executable instructions for performing the data process. Wherein the data process includes training NNs (Abstract and [0099]).): 
receiving data in the form of sequential data frames ([0041]: describing input data in the form of sequential frames, i.e. “neighboring left and right frames” as shown in Fig. 3.); 
… 
wherein the first and second neural networks together form a neural network system in accordance with claim 1 ([0095]-[0096] and [0099]: describing that a system, corresponding to a computer system, “reads a computer program for executing the trained front-end NN [neural network] and a back-end NN from the storage and then combines the back-end NN with the trained front-end NN” to generate a combined NN. The rejection of claim 1 is also incorporated, which includes rejections using Fukuda and He.), 
([0048]-[0050]: describing a combined NN comprising a front-end NN (i.e. a first NN) and a back-end NN (i.e. a second NN) which are joined via a joint layer that is shared between the two NNs. The NNs are depicted in Fig. 2.).

While the cited reference Fukuda teaches the above limitations of claim 16, it does not explicitly teach: “performing data processing in processing cycles, wherein in a given processing cycle, data processing is performed using one or more neural networks in a selected data frame; and configuring a first processing cycle and a second processing cycle by: executing a first neural network of the one or more neural networks in the first processing cycle; and executing the first neural network and a second neural network of the one or more neural networks together in the second processing cycle,” on lines 6-10. Xu teaches:
“performing data processing in processing cycles (Xu [0017] and [0037]-[0038]: describing cores that include “processing elements” and a workload analyzer for processing NN data. Wherein the processing occurs via execution cycles in a pipeline time divisional manner (Xu [0042]-[0045] and [0056]).), wherein in a given processing cycle, data processing is performed using one or more neural networks in a selected data frame (Xu [0042]-[0043]: describing respective execution cycles of the respective first and second NNs that occur with particular pipeline time divisionals, i.e. selected data frame, as shown in Figs. 5B and 5C.); and
configuring a first processing cycle and a second processing cycle by (Xu [0037]-[0038]: describing a “[w]orkload analyzer 301 [that] can determine an amount of resources for executing each neural network of the received two or more neural networks”. Wherein the workload analyzer operates in conjunction with a scheduler, a “resource evaluator”, and a “resource usage optimizer” to determine execution cycles in a pipeline time divisional manner (Xu [0042]-[0045] and [0056] and Fig. 3).): 
executing a first neural network of the one or more neural networks in the first processing cycle (Xu [0042] and [0052]: describing execution of the first NN at a “first execution cycle”.); and 
executing the first neural network and a second neural network of the one or more neural networks together in the second processing cycle (Xu [0050], [0055]-[0056], and [0074]: describing a concurrent processing of the first and second NNs. Wherein the concurrent processing occurs at a different cycle of execution than the first execution cycle of the first NN, e.g. after the first execution cycle of the first NN (Xu [0052]).),”. 
Thus, it would have been obvious to a Person Having Ordinary Skill in the Art (PHOSITA) before the effective filing date (EFD) to modify the data process in the cited reference to include the execution cycles of the NNs in Xu. Doing so would enable “a method for allocating resources of an accelerator to two or more neural networks for execution. The two or more neural networks may include a first neural network and a second neural network. The method comprises analyzing workloads of the first neural network and the second neural network, wherein the first neural network and second neural network each includes multiple computational layers, evaluating computational resources of the accelerator for executing each computational layer of the first and second neural networks, and scheduling computational resources of the accelerator to execute one computational layer of the multiple computation layers of the first neural network and to execute one or more computational layers of the multiple computational layers of the second neural network.” (Xu Abstract).
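For illustration only, the cycle configuration recited in claim 16 (the first neural network alone in a first processing cycle, then the first and second neural networks together in a second processing cycle) can be sketched as follows. This is a minimal sketch, not the method of Xu or of the application: the two "networks" are stand-in functions, the frame values are made up, and running both networks "together" is modeled simply as evaluating both within the same cycle rather than as concurrent hardware execution.

```python
def run_processing_cycles(frames, networks, schedule):
    """Run one processing cycle per data frame.

    schedule[i] is a tuple naming the networks executed in cycle i;
    each named network is applied to the frame selected for that cycle.
    """
    results = []
    for frame, names in zip(frames, schedule):
        results.append({name: networks[name](frame) for name in names})
    return results

# Stand-in "networks" (hypothetical): simple functions of a frame.
networks = {
    "first": lambda frame: sum(frame),
    "second": lambda frame: max(frame),
}

# Two sequential data frames (made-up values).
frames = [[1, 2, 3], [4, 5, 6]]

# Cycle 1 runs only the first network; cycle 2 runs both networks.
schedule = [("first",), ("first", "second")]

results = run_processing_cycles(frames, networks, schedule)
print(results)  # [{'first': 6}, {'first': 15, 'second': 6}]
```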

Regarding claim 18, the rejection of claim 16 is incorporated. Fukuda teaches:
The data processing method of claim 16, wherein the data in the form of sequential data frames comprises one or more of: image data; video data; and audio data ([0041]: describing acoustic (i.e. a type of audio) data comprising “acoustic features [that] may also include neighboring left and right frames as the acoustic context” that occur sequentially as shown in Fig. 3.).

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant's disclosure:
Baker et al. (U.S. Pat. App. Pre-Grant Pub. No. 2020/0401869): describing a self-organizing process for neural networks (NNs). The process enables merging of neural networks using a “learning coach” to compute the objective of the various networks and an improvement of the networks with the merger. The NNs can be merged via “node soft-tying”, i.e. creation of arcs for the nodes of the various NNs to be merged as well as determining if the nodes have similar arc directions and parameters, e.g. weights or activations.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SELENE A HAEDI whose telephone number is (571)270-5762. The examiner can normally be reached M-F 11 AM - 7 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, OMAR FERNANDEZ RIVAS, can be reached at (571)272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SELENE A. HAEDI/Examiner, Art Unit 2128