Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The present application was filed on 01/19/2018. This action is in response to the amendments and remarks filed on 12/31/2021. In the current amendments, claims 12 and 19 are amended. Claims 8-21 are pending and have been examined.
In response to the amendments and remarks filed on 12/31/2021, the 35 U.S.C. 112(b) rejection of claims 12 and 19 and the 35 U.S.C. 101 rejection of claims 10, 12-14, 17 and 19-21 have been withdrawn.
Claim Interpretation
The term “computer readable storage medium” in claims 8-14 is interpreted as “non-transitory computer readable storage medium” in view of [00111], which recites “A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se.”
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 8-21 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-7 of copending Application No. 16/440,064 (reference application). Although the claims at issue are not identical, they are not patentably distinct from each other because all the limitations recited in the claims of the present application are found in the claims of copending Application No. 16/440,064 with obvious wording variations.
This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.
Regarding Claims 8 – 14
Instant Application No. 15/875,575 
Application No. 16/440,064 (reference application) 
Claim 8: 
     A computer program product for generating a neural network, the computer program product comprising a computer readable storage medium having program code embodied therewith, the program code comprising the programming instructions for:

      preparing a plurality of initial neural networks, each of which comprises an input layer containing one or more input nodes, a middle layer containing one or more middle nodes, and an output layer containing one or more output nodes; and 


  generating a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks.
Claim 1: 
A method for generating a neural network, the method comprising:

     preparing, by a processor, a plurality of initial neural networks, each of which comprises an input layer containing one or more input nodes, a middle layer containing one or more middle nodes, and an output layer containing one or more output nodes; and

     generating, by the processor, a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks.
Claim 9:  
    The computer program product as recited in claim 8, wherein the plurality of initial neural networks comprises N initial neural networks, N being an integer larger than 1, and wherein the generating of the new neural network comprises the programming instructions for: 


     selecting one or more of the middle nodes of the N initial neural networks; and

  including the selected one or more middle nodes in the new middle layer of the new neural network.
Claim 2: 
The method as recited in claim 1, wherein the plurality of initial neural networks comprises N initial neural networks, N being an integer larger than 1, and wherein the generating of the new neural network comprises:

     selecting one or more of the middle nodes of the N initial neural networks; and

     including the selected one or more middle nodes in the new middle layer of the new neural network.
Claim 10: 
    The computer program product as recited in claim 9, wherein the selecting of the one or more of the middle nodes of the N initial neural networks comprises the programming instructions for: 

     obtaining K different sets of training data, K being an integer more than 1; 

      performing supervised training on the N initial neural networks with each of the K different sets of training data to obtain K training results for each of the N initial neural networks; and 

     selecting at least one of the middle nodes in the middle layer of the N initial neural networks using the K training results, such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes.
Claim 3: 
The method as recited in claim 2, wherein the selecting of the one or more of the middle nodes of the N initial neural networks comprises:

     obtaining K different sets of training data, K being an integer more than 1;

     performing supervised training on the N initial neural networks with each of the K different sets of training data to obtain K training results for each of the N initial neural networks; and

     selecting at least one of the middle nodes in the middle layer of the N initial neural networks using the K training results, such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes.
Claim 11: 
     The computer program product as recited in claim 9, wherein the middle layer of each of the plurality of initial neural network comprises L middle nodes, L being an integer larger than 2, and wherein the number of the middle nodes in the new middle layer is equal to or less than L.


Claim 4:
The method as recited in claim 2, wherein the middle layer of each of the plurality of initial neural network comprises L middle nodes, L being an integer larger than 2, and wherein the number of the middle nodes in the new middle layer is equal to or less than L.
Claim 12:
     The computer program product as recited in claim 9, wherein the generating of the new neural network further comprises the programming instructions for: 

      performing unsupervised training on the selected middle nodes, the unsupervised training comprising biasing the middle nodes such that certain middle nodes are avoided.

Claim 5:
The method as recited in claim 2, wherein the generating of the new neural network further comprises:

     performing unsupervised training on the selected middle nodes, the unsupervised training comprising biasing the middle nodes such that middle nodes that are similar in relation to connections to the input nodes are avoided.
Claim 13:  
         The computer program product as recited in claim 8, wherein the preparing of the plurality of initial neural networks comprises the programming instructions for: 

         obtaining N initial conditions, N being an integer larger than 1, each condition corresponding to one of the initial neural networks; 

     performing unsupervised training of the middle layer of each initial neural network using the corresponding initial condition; and

    performing supervised training of the output layer of each initial neural network using a set of training data.


Claim 6: 
The method as recited in claim 1, wherein the preparing of the plurality of initial neural networks comprises:

     obtaining N initial conditions, N being an integer larger than 1, each condition corresponding to one of the initial neural networks;

     performing unsupervised training of the middle layer of each initial neural network using the corresponding initial condition; and

     performing supervised training of the output layer of each initial neural network using a set of training data.
Claim 14: 
     The computer program product as recited in claim 8, wherein the preparing of the plurality of initial neural networks comprises the programming instructions for: 

    obtaining M initial conditions, M being an integer larger than 2, each condition corresponding to one of M candidate neural networks; 

     performing unsupervised training of the middle layer of each candidate neural network using the corresponding initial condition;  


     performing supervised training of the output layer of each candidate neural network using a set of training data; 

    evaluating a performance of each candidate neural network; and 

    selecting N initial neural networks from among the M candidate neural networks using the performances, N being an integer larger than 1 and smaller than M.
Claim 7: 
The method as recited in claim 1, wherein the preparing of the plurality of initial neural networks comprises:

     obtaining M initial conditions, M being an integer larger than 2, each condition corresponding to one of M candidate neural networks;

     performing unsupervised training of the middle layer of each candidate neural network using the corresponding initial condition;

     performing supervised training of the output layer of each candidate neural network using a set of training data;

     evaluating a performance of each candidate neural network; and

     selecting N initial neural networks from among the M candidate neural networks using the performances, N being an integer larger than 1 and smaller than M.
Claim 15: 
     A system, comprising: a memory unit for storing a computer program for generating a neural network; and 

    a processor coupled to the memory unit, wherein the processor is configured to execute the program instructions of the computer program comprising: 

   preparing a plurality of initial neural networks, each of which comprises an input layer containing one or more input nodes, a middle layer containing one or more middle nodes, and an output layer containing one or more output nodes; and 

    generating a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks.
Claim 1:
A method for generating a neural network, the method comprising:

     preparing, by a processor, a plurality of initial neural networks, each of which comprises an input layer containing one or more input nodes, a middle layer containing one or more middle nodes, and an output layer containing one or more output nodes; and

     generating, by the processor, a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks.
Claim 16:  
   The system as recited in claim 15, wherein the plurality of initial neural networks comprises N initial neural networks, N being an integer larger than 1, and wherein the generating of the new neural network comprises: 

    selecting one or more of the middle nodes of the N initial neural networks; and 

   including the selected one or more middle nodes in the new middle layer of the new neural network.
Claim 2: 
The method as recited in claim 1, wherein the plurality of initial neural networks comprises N initial neural networks, N being an integer larger than 1, and wherein the generating of the new neural network comprises:

     selecting one or more of the middle nodes of the N initial neural networks; and

     including the selected one or more middle nodes in the new middle layer of the new neural network.
Claim 17:  
    The system as recited in claim 16, wherein the selecting of the one or more of the middle nodes of the N initial neural networks comprises:

     obtaining K different sets of training data, K being an integer more than 1; 

    performing supervised training on the N initial neural networks with each of the K different sets of training data to obtain K training results for each of the N initial neural networks; and 

    selecting at least one of the middle nodes in the middle layer of the N initial neural networks using the K training results, such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes.
Claim 3: 
The method as recited in claim 2, wherein the selecting of the one or more of the middle nodes of the N initial neural networks comprises:

     obtaining K different sets of training data, K being an integer more than 1;

     performing supervised training on the N initial neural networks with each of the K different sets of training data to obtain K training results for each of the N initial neural networks; and

     selecting at least one of the middle nodes in the middle layer of the N initial neural networks using the K training results, such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes.
Claim 18: 
The system as recited in claim 16, wherein the middle layer of each of the plurality of initial neural network comprises L middle nodes, L being an integer larger than 2, and wherein the number of the middle nodes in the new middle layer is equal to or less than L.
Claim 4:
The method as recited in claim 2, wherein the middle layer of each of the plurality of initial neural network comprises L middle nodes, L being an integer larger than 2, and wherein the number of the middle nodes in the new middle layer is equal to or less than L.
Claim 19:  
The system as recited in claim 16, wherein the generating of the new neural network further comprises: 

   performing unsupervised training on the selected middle nodes, the unsupervised training comprising biasing the middle nodes such that certain middle nodes are avoided.
Claim 5:
The method as recited in claim 2, wherein the generating of the new neural network further comprises:

     performing unsupervised training on the selected middle nodes, the unsupervised training comprising biasing the middle nodes such that middle nodes that are similar in relation to connections to the input nodes are avoided.
Claim 20:  
  The system as recited in claim 15, wherein the preparing of the plurality of initial neural networks comprises:

  obtaining N initial conditions, N being an integer larger than 1, each condition corresponding to one of the initial neural networks; 

   performing unsupervised training of the middle layer of each initial neural network using the corresponding initial condition; and

     performing supervised training of the output layer of each initial neural network using a set of training data.
Claim 6: 
The method as recited in claim 1, wherein the preparing of the plurality of initial neural networks comprises:

     obtaining N initial conditions, N being an integer larger than 1, each condition corresponding to one of the initial neural networks;

     performing unsupervised training of the middle layer of each initial neural network using the corresponding initial condition; and

     performing supervised training of the output layer of each initial neural network using a set of training data.
Claim 21: 
    The system as recited in claim 15, wherein the preparing of the plurality of initial neural networks comprises: 

   obtaining M initial conditions, M being an integer larger than 2, each condition corresponding to one of M candidate neural networks; 
   
      performing unsupervised training of the middle layer of each candidate neural network using the corresponding initial condition; 


   performing supervised training of the output layer of each candidate neural network using a set of training data; 

   evaluating a performance of each candidate neural network; and 

   selecting N initial neural networks from among the M candidate neural networks using the performances, N being an integer larger than 1 and smaller than M.
Claim 7: 
The method as recited in claim 1, wherein the preparing of the plurality of initial neural networks comprises:

     obtaining M initial conditions, M being an integer larger than 2, each condition corresponding to one of M candidate neural networks;

     performing unsupervised training of the middle layer of each candidate neural network using the corresponding initial condition;

     performing supervised training of the output layer of each candidate neural network using a set of training data;

     evaluating a performance of each candidate neural network; and

     selecting N initial neural networks from among the M candidate neural networks using the performances, N being an integer larger than 1 and smaller than M.


Claim 8 of the instant application differs from claim 1 of the reference application in that claim 8 (instant) recites “A computer program product for generating a neural network, the computer program product comprising a computer readable storage medium having program code embodied therewith, the program code comprising the programming instructions for:” whereas claim 1 (reference) recites “A method for generating a neural network, the method comprising”. It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to implement the method of claim 1 of the reference application as a computer program product by performing the steps of the method using a computer with generic computer components.
Claim 12 of the instant application differs from claim 5 of the reference application in that claim 12 (instant) recites “middle nodes such that certain middle nodes are avoided” whereas claim 5 (reference) recites “the middle nodes such that middle nodes that are similar in relation to connections to the input nodes are avoided”. The differences between the recitations are minor and do not patentably distinguish one claim from the other; the middle nodes avoided in reference claim 5 are “certain middle nodes” as recited in instant claim 12.
Dependent claims 9-14 of the instant application are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 2-7 of copending Application No. 16/440,064 (reference application) for the same rationale as discussed with respect to instant application claim 8. 
Claim 15 of the instant application differs from claim 1 of the reference application in that claim 15 (instant) recites “A system, comprising: a memory unit for storing a computer program for generating a neural network; and a processor coupled to the memory unit, wherein the processor is configured to execute the program instructions of the computer program comprising” whereas claim 1 (reference) recites “A method for generating a neural network, the method comprising”. It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to implement the method of claim 1 of the reference application as a computer system by performing the steps of the method using a computer with generic computer components.
Claim 19 of the instant application differs from claim 5 of the reference application in that claim 19 (instant) recites “middle nodes such that certain middle nodes are avoided” whereas claim 5 (reference) recites “the middle nodes such that middle nodes that are similar in relation to connections to the input nodes are avoided”. The differences between the recitations are minor and do not patentably distinguish one claim from the other; the middle nodes avoided in reference claim 5 are “certain middle nodes” as recited in instant claim 19.
Dependent claims 16-21 of the instant application are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 2-7 of copending Application No. 16/440,064 (reference application) for the same rationale as discussed with respect to instant application claim 15. 
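For illustration only, the steps common to instant claims 8, 9 and 11 and reference claims 1, 2 and 4 (preparing N initial neural networks, selecting middle nodes from them, and including the selected nodes in a new middle layer of at most L nodes) may be sketched as follows. This sketch is not taken from either application. The names MiddleNode, Network, prepare_initial_networks and generate_new_network are hypothetical, the random initialization is an assumption, and ranking nodes by the magnitude of their output weight is merely an assumed stand-in for whatever selection criterion an implementation would use.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class MiddleNode:
    """Hypothetical middle node: weights from each input node, one weight to the output."""
    input_weights: List[float]
    output_weight: float


@dataclass
class Network:
    """Hypothetical network reduced to its middle layer for this sketch."""
    middle_layer: List[MiddleNode]


def prepare_initial_networks(n: int, num_inputs: int, num_middle: int,
                             seed: int = 0) -> List[Network]:
    """Prepare n initial networks, each with num_middle randomly weighted middle nodes."""
    rng = random.Random(seed)
    networks = []
    for _ in range(n):
        layer = [
            MiddleNode(
                input_weights=[rng.uniform(-1.0, 1.0) for _ in range(num_inputs)],
                output_weight=rng.uniform(-1.0, 1.0),
            )
            for _ in range(num_middle)
        ]
        networks.append(Network(middle_layer=layer))
    return networks


def generate_new_network(networks: List[Network], max_nodes: int) -> Network:
    """Select middle nodes from all initial networks and include them in a new middle layer."""
    # Pool the middle nodes of every initial network.
    pool = [node for net in networks for node in net.middle_layer]
    # Assumed criterion: a larger |output_weight| means a larger contribution to the output.
    ranked = sorted(pool, key=lambda node: abs(node.output_weight), reverse=True)
    # Keep at most max_nodes nodes, so the new layer is no larger than an initial layer.
    return Network(middle_layer=ranked[:max_nodes])
```

Calling prepare_initial_networks(n=3, num_inputs=4, num_middle=5) and passing the result to generate_new_network with max_nodes=5 yields a new middle layer of five nodes drawn from the fifteen middle nodes of the three initial networks.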

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 8, 9, 11, 15, 16 and 18 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Regarding Claim 8:
Claim 8 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 8 is directed to a computer program product, which is an article of manufacture, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a computer program product for generating a neural network and updating a middle layer. Each of the following limitations:
preparing … an input layer containing one or more input nodes, a middle layer containing one or more middle nodes, and an output layer containing one or more output nodes …
generating a new neural network … a new middle layer containing one or more middle nodes …
as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind (concepts performed in the human mind, including an observation, evaluation, judgment, or opinion) but for the recitation of generic computer components. For example, but for the generic computer component language (“Computer program product” and “a computer readable storage medium having program code embodied therewith, the program code comprising the programming instructions for”), the above limitations in the context of this claim encompass preparing neural networks, each containing one or more input, middle, and output nodes (corresponding to evaluation with the assistance of pen and paper; for example, a neural network can be drawn on paper with input, middle, and output layers and nodes), and generating a new neural network … a new middle layer (corresponding to evaluation with the assistance of pen and paper; for example, a new middle layer can be drawn on paper with multiple layers and nodes).
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites only additional elements that are mere instructions to implement an abstract idea on a computer, or that merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional element(s) of “Computer program product” and “a computer readable storage medium having program code embodied therewith, the program code comprising the programming instructions for”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
Regarding Claim 9:
Claim 9 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 9 is directed to a computer program product, which is an article of manufacture, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a computer program product for generating a neural network and updating a middle layer. Each of the following limitations:
selecting one or more of the middle nodes of the N initial neural networks; and
including the selected one or more middle nodes in the new middle layer of the new neural network.
as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind (concepts performed in the human mind, including an observation, evaluation, judgment, or opinion) but for the recitation of generic computer components. For example, but for the generic computer component language (“Computer program product” and “a computer readable storage medium having program code embodied therewith, the program code comprising the programming instructions for”), the above limitations in the context of this claim encompass selecting one or more middle nodes (corresponding to judgment and observation), and including the selected one or more middle nodes … new neural network (corresponding to evaluation with the assistance of pen and paper; for example, adding the selected nodes to a drawing of multiple layers and nodes).
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites only additional elements that are mere instructions to implement an abstract idea on a computer, or that merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional element(s) of “Computer program product” and “a computer readable storage medium having program code embodied therewith, the program code comprising the programming instructions for”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
Regarding Claim 11:
Claim 11 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 11 is directed to a computer program product, which is an article of manufacture, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a computer program product for generating a neural network and updating a middle layer. The following limitation:
wherein the middle layer of each of the plurality of initial neural network comprises L middle nodes, L being an integer larger than 2, and wherein the number of the middle nodes in the new middle layer is equal to or less than L. 
as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind (concepts performed in the human mind, including an observation, evaluation, judgment, or opinion) but for the recitation of generic computer components. For example, but for the generic computer component language (“Computer program product” and “a computer readable storage medium having program code embodied therewith, the program code comprising the programming instructions for”), the above limitation in the context of this claim encompasses a middle layer with L middle nodes (corresponding to observation and evaluation with the assistance of pen and paper; for example, a middle layer with more than two middle nodes can be drawn on paper).
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites only additional elements that are mere instructions to implement an abstract idea on a computer, or that merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional element(s) of “Computer program product” and “a computer readable storage medium having program code embodied therewith, the program code comprising the programming instructions for”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
Regarding Claim 15:
Claim 15 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 15 is directed to a system, which is a machine, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a computer product for generating a neural network and updating a middle layer. Each of the following limitations:
preparing …. an input layer containing one or more input nodes, a middle layer containing one or more middle nodes, and an output layer containing one or more output nodes…
generating a new neural network….a new middle layer containing one or more middle nodes….. 
as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. For example, but for the generic computer components language (“A system”, “a memory unit for storing a computer program for generating a neural network”, and “a processor coupled to the memory unit, wherein the processor is configured to execute the program instructions of the computer program”), the above limitations in the context of this claim encompass preparing neural networks that each contain input, middle, and output layers with one or more nodes (corresponding to evaluation with the assistance of pen and paper; for example, a neural network with input, output, and middle layers and nodes can be drawn on paper), and generating a new neural network…..a new middle layer (corresponding to evaluation with the assistance of pen and paper; for example, a new middle layer can be drawn on paper together with multiple layers and nodes).
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, or that merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional elements of “A system”, “a memory unit for storing a computer program for generating a neural network”, and “a processor coupled to the memory unit, wherein the processor is configured to execute the program instructions of the computer program”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
Regarding Claim 16:
Claim 16 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 16 is directed to a system, which is a machine, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a computer product for generating a neural network and updating a middle layer. Each of the following limitations:
selecting one or more of the middle nodes of the N initial neural networks.
including the selected one or more middle nodes in the new middle layer of the new neural network.  
as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. For example, but for the generic computer components language (“A system”, “a memory unit for storing a computer program for generating a neural network”, and “a processor coupled to the memory unit, wherein the processor is configured to execute the program instructions of the computer program”), the above limitations in the context of this claim encompass selecting one or more middle nodes (corresponding to judgment and observation), and including the selected one or more middle nodes…… new neural network (corresponding to evaluation with the assistance of pen and paper; for example, adding multiple layers and nodes on paper).
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, or that merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional elements of “A system”, “a memory unit for storing a computer program for generating a neural network”, and “a processor coupled to the memory unit, wherein the processor is configured to execute the program instructions of the computer program”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
Regarding Claim 18:
Claim 18 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 18 is directed to a system, which is a machine, one of the statutory categories.
Step 2A Prong One Analysis: The claim is directed to a computer product for generating a neural network and updating a middle layer. Each of the following limitations:
wherein the middle layer of each of the plurality of initial neural networks comprises L middle nodes, L being an integer larger than 2, and wherein the number of the middle nodes in the new middle layer is equal to or less than L.
as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. For example, but for the generic computer components language (“A system”, “a memory unit for storing a computer program for generating a neural network”, and “a processor coupled to the memory unit, wherein the processor is configured to execute the program instructions of the computer program”), the above limitations in the context of this claim encompass a middle layer with L middle nodes (corresponding to observation and evaluation with the assistance of pen and paper; for example, a middle layer with more than two middle nodes can be drawn on paper).
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, or that merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional elements of “A system”, “a memory unit for storing a computer program for generating a neural network”, and “a processor coupled to the memory unit, wherein the processor is configured to execute the program instructions of the computer program”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 8 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Cao et al. (US 2016/0328388 A1) in view of Suganuma et al. “A Genetic Programming Approach to Designing Convolutional Neural Network Architectures”.
Regarding Claim 8:
Cao et al. teaches a computer program product for generating a neural network (Page 7 Column 2 “The computer program product may include a computer readable storage medium (or media) having computer readable program instructions” and Page 8 Column 3 “These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks” and Page 12 Column 11 “As shown in FIG. 5, the neural network includes input layer 502, output layer 506, and hidden layers 504. In this embodiment, there may be between zero and “n” hidden layers, where “n” is a real number greater than or equal to one. Input layer 502, output layer 506, and each hidden layer 504 includes a plurality of nodes (or “neurons”), designed as 502 a through 502 n for input layer 502, 506-1 a through 506-1 n for pseudo labels portion 508 of output layer 506, 506-2 a through 506-2 n for actual labels portion 510 of output layer 506, 504 a-a through 504 a-n for the first hidden layer 504, and 504 n-a through 504 n-n for the last hidden layer 504.” teaches a computer program product that generates a neural network).
the computer program product comprising a computer readable storage medium having program code embodied therewith, the program code comprising the programming instructions for (Page 7 Column 2 “The computer program product may include a computer readable storage medium (or media) having computer readable program instructions” teaches a computer storage device comprising program instructions).
preparing a plurality of initial neural networks, each of which comprises an input layer containing one or more input nodes, a middle layer containing one or more middle nodes, and an output layer containing one or more output nodes (Page 12 Column 11 “As shown in FIG. 5, the neural network includes input layer 502, output layer 506, and hidden layers 504. In this embodiment, there may be between zero and “n” hidden layers, where “n” is a real number greater than or equal to one. Input layer 502, output layer 506, and each hidden layer 504 includes a plurality of nodes (or “neurons”), designed as 502 a through 502 n for input layer 502, 506-1 a through 506-1 n for pseudo labels portion 508 of output layer 506, 506-2 a through 506-2 n for actual labels portion 510 of output layer 506, 504 a-a through 504 a-n for the first hidden layer 504, and 504 n-a through 504 n-n for the last hidden layer 504.” teaches a neural network containing input, output, and middle layers, each having one or more nodes).
Cao et al. does not teach generating a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks.
However, Suganuma et al. teaches generating a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks (Page 499 Section 3.2 Evolutionary Algorithm “we want to evaluate some candidate solutions in parallel at each generation. Therefore, we apply the mutation operator” and Page 499 Figure 1, which teaches evaluating middle nodes of the middle layers; Figure 1 also teaches a middle layer (white square in Figure 1) containing one or more nodes).
Cao et al. and Suganuma et al. are analogous art because they are directed to classification using neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate generating a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks, as taught by Suganuma et al., into the disclosed invention of Cao et al.
One of ordinary skill in the art would have been motivated to make this modification because of the following: “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method.” (Suganuma et al., Page 498 Section 3 CNN Architecture Design Using Cartesian Genetic Programming).
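For orientation only, the two limitations mapped above (preparing several initial networks, then generating a new network whose middle layer is based on their middle nodes) can be pictured with a minimal Python sketch. It is not taken from Cao et al., Suganuma et al., or the application. The names `make_network` and `merge_middle_layers`, the representation of a middle node as a pair of weight lists, and the round-robin pooling rule are all assumptions chosen purely for illustration.

```python
import random

def make_network(n_in, n_mid, n_out, seed):
    """Prepare one initial network (hypothetical representation):
    each middle node stores its incoming and outgoing weights."""
    rng = random.Random(seed)
    middle = [
        {"w_in": [rng.uniform(-1, 1) for _ in range(n_in)],
         "w_out": [rng.uniform(-1, 1) for _ in range(n_out)]}
        for _ in range(n_mid)
    ]
    return {"n_in": n_in, "n_out": n_out, "middle": middle}

def merge_middle_layers(networks, max_nodes):
    """Generate a new network whose middle layer is built from the
    middle nodes of the initial networks.
    Assumption: nodes are taken round-robin across the networks and the
    pool is truncated to max_nodes; no selection criterion is applied."""
    depth = max(len(net["middle"]) for net in networks)
    pooled = [net["middle"][j]
              for j in range(depth)
              for net in networks
              if j < len(net["middle"])]
    new_middle = [dict(node) for node in pooled[:max_nodes]]
    return {"n_in": networks[0]["n_in"],
            "n_out": networks[0]["n_out"],
            "middle": new_middle}

# Three initial networks with identical input/output sizes (assumed).
initial = [make_network(n_in=4, n_mid=3, n_out=2, seed=s) for s in range(3)]
new_net = merge_middle_layers(initial, max_nodes=3)
```

With three initial networks of three middle nodes each and `max_nodes=3`, the new middle layer holds the first middle node of each initial network, so its size does not exceed that of any initial middle layer.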
Regarding Claim 15:
Cao et al. teaches a system, comprising: a memory unit for storing a computer program for generating a neural network (Page 7 Column 2 “the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick” and Page 12 Column 11 “As shown in FIG. 5, the neural network includes input layer 502, output layer 506, and hidden layers 504. In this embodiment, there may be between zero and “n” hidden layers, where “n” is a real number greater than or equal to one. Input layer 502, output layer 506, and each hidden layer 504 includes a plurality of nodes (or “neurons”), designed as 502 a through 502 n for input layer 502, 506-1 a through 506-1 n for pseudo labels portion 508 of output layer 506, 506-2 a through 506-2 n for actual labels portion 510 of output layer 506, 504 a-a through 504 a-n for the first hidden layer 504, and 504 n-a through 504 n-n for the last hidden layer 504” and Page 12 Column 11 “the present invention include a system” teaches a computer system that generates a neural network).
and a processor coupled to the memory unit (Page 3 Figure 1 teaches processor 204 and memory 208 being connected).
wherein the processor is configured to execute the program instructions of the computer program comprising (Page 7 Column 2 “The computer program product may include a computer readable storage medium (or media) having computer readable program instructions” and Page 8 Column 3 “These computer readable program instructions may be provided to a processor of a general purpose computer,…. such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, ” teaches program instructions stored on a computer storage device being provided to a processor for execution).
preparing a plurality of initial neural networks, each of which comprises an input layer containing one or more input nodes, a middle layer containing one or more middle nodes, and an output layer containing one or more output nodes (Page 12 Column 11 “As shown in FIG. 5, the neural network includes input layer 502, output layer 506, and hidden layers 504. In this embodiment, there may be between zero and “n” hidden layers, where “n” is a real number greater than or equal to one. Input layer 502, output layer 506, and each hidden layer 504 includes a plurality of nodes (or “neurons”), designed as 502 a through 502 n for input layer 502, 506-1 a through 506-1 n for pseudo labels portion 508 of output layer 506, 506-2 a through 506-2 n for actual labels portion 510 of output layer 506, 504 a-a through 504 a-n for the first hidden layer 504, and 504 n-a through 504 n-n for the last hidden layer 504.” teaches a neural network containing input, output, and middle layers, each having one or more nodes).
Cao et al. does not teach generating a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks.
However, Suganuma et al. teaches generating a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks (Page 499 Section 3.2 Evolutionary Algorithm “we want to evaluate some candidate solutions in parallel at each generation. Therefore, we apply the mutation operator” and Page 499 Figure 1, which teaches evaluating middle nodes of the middle layers; Figure 1 also teaches a middle layer (white square in Figure 1) containing one or more nodes).
Cao et al. and Suganuma et al. are analogous art because they are directed to classification using neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate generating a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks, as taught by Suganuma et al., into the disclosed invention of Cao et al.
One of ordinary skill in the art would have been motivated to make this modification because of the following: “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method.” (Suganuma et al., Page 498 Section 3 CNN Architecture Design Using Cartesian Genetic Programming).
Claims 9, 10, 11 and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Cao et al. (US 2016/0328388 A1) in view of Suganuma et al. “A Genetic Programming Approach to Designing Convolutional Neural Network Architectures” and further in view of Kim et al. (US 2016/003050 A1). 
Regarding Claim 9: 
Cao et al. in view of Suganuma et al. teaches the computer program product as recited in claim 8.
Cao et al. further teaches wherein the plurality of initial neural networks comprises N initial neural networks (Page 7 Column 2 “The computer program product may include a computer readable storage medium (or media) having computer readable program instructions” and Page 12 Column 11 “As shown in FIG. 5, the neural network includes input layer 502, output layer 506, and hidden layers 504. In this embodiment, there may be between zero and “n” hidden layers, where “n” is a real number greater than or equal to one. Input layer 502, output layer 506, and each hidden layer 504 includes a plurality of nodes (or “neurons”), designed as 502 a through 502 n for input layer 502, 506-1 a through 506-1 n for pseudo labels portion 508 of output layer 506, 506-2 a through 506-2 n for actual labels portion 510 of output layer 506, 504 a-a through 504 a-n for the first hidden layer 504, and 504 n-a through 504 n-n for the last hidden layer 504.” Teaches initial neural networks more than 1).
and wherein the generating of the new neural network comprises the programming instructions for (Page 8 Column 3 “These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks” and Page 12 Column 11 “As shown in FIG. 5, the neural network includes input layer 502, output layer 506, and hidden layers 504. In this embodiment, there may be between zero and “n” hidden layers, where “n” is a real number greater than or equal to one. Input layer 502, output layer 506, and each hidden layer 504 includes a plurality of nodes (or “neurons”), designed as 502 a through 502 n for input layer 502, 506-1 a through 506-1 n for pseudo labels portion 508 of output layer 506, 506-2 a through 506-2 n for actual labels portion 510 of output layer 506, 504 a-a through 504 a-n for the first hidden layer 504, and 504 n-a through 504 n-n for the last hidden layer 504.” teaches a computer program product that generates a neural network).
Suganuma et al. further teaches selecting one or more of the middle nodes of the N initial neural networks; and including the selected one or more middle nodes in the new middle layer of the new neural network (Page 500 Section 3.2 Evolutionary Algorithm “Select an elite individual from the set of P and C, and then replace P with the elite individual” and Page 499 Figure 1; as shown in Figure 1, middle nodes are selected from the initial middle layer (white square box) and the selected middle nodes are included in the neural network).
Cao et al. and Suganuma et al. are analogous art because they are directed to classification using neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate selecting one or more of the middle nodes of the N initial neural networks, and including the selected one or more middle nodes in the new middle layer of the new neural network, as taught by Suganuma et al., into the disclosed invention of Cao et al.
One of ordinary skill in the art would have been motivated to make this modification because of the following: “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method.” (Suganuma et al., Page 498 Section 3 CNN Architecture Design Using Cartesian Genetic Programming).
Cao et al. in view of Suganuma et al. does not explicitly teach N being an integer larger than 1.
However, Kim et al. teaches N being an integer larger than 1 (Page 3 Paragraph 0076 “Although FIG. 1 illustrates three artificial neural networks 51, 52, and 53” teaches more than one neural network).
Cao et al., Suganuma et al., and Kim et al. are analogous art because they are directed to using neural networks in which a correlation between features is learned.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate N being an integer larger than 1, as taught by Kim et al., into the disclosed invention of Cao et al. in view of Suganuma et al.
One of ordinary skill in the art would have been motivated to make this modification because of the following: “The image registration device 1 may perform image registration using a plurality of pre-trained artificial neural networks 50 (51,52, and 53). Although FIG. 1 illustrates three artificial neural networks 51, 52, and 53, it is understood that one or more other exemplary embodiments are not limited thereto, and the number of pre-trained artificial neural networks may be greater or less than three” and “The first image and the second image have different features. Therefore, in order to extract features from the first image and the second image, the different artificial neural networks 51 and 52 may be used” (Kim et al. Page 3 Paragraph 0076 and Page 4 Paragraph 0085).
Regarding Claim 10: 
Cao et al. in view of Suganuma et al. and further in view of Kim et al. teaches the computer program product as recited in claim 9.
Cao et al. further teaches comprises the programming instructions for (Page 8 Column 3 “These computer readable program instructions may be provided to a processor of a general purpose computer,” teaches program instructions for a general purpose computer).
Suganuma et al. further teaches wherein the selecting of the one or more of the middle nodes of the N initial neural networks (Page 500 Section 3.2 Evolutionary Algorithm “Select an elite individual from the set of P and C, and then replace P with the elite individual” and Page 499 Figure 1; as shown in Figure 1, middle nodes are selected from the initial middle layer (white square box) and the selected middle nodes are included in the neural network).
and selecting at least one of the middle nodes in the middle layer (Page 500 Section 3.2 Evolutionary Algorithm “(3) Train the λ CNNs represented by offsprings C in parallel, and assign the validation accuracies as the fitness. (4) Apply the neutral mutation to parent P. (5) Select an elite individual from the set of P and C, and then replace P with the elite individual” and Page 499 Figure 1; as shown in Figure 1, the best middle nodes are picked from the initial middle layer (white square box)).
such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes (Page 599 “not all of the nodes are connected to the output nodes. Node No. 5 on the left side of Figure 2 is an inactive node” and Page 499 – 500 Section 3.2 Evolutionary Algorithm “To efficiently use the computational resource, we want to evaluate some candidate solutions in parallel at each generation. Therefore, we apply the mutation operator until at least one active node changes for reproducing the candidate solution…Select an elite individual from the set of P and C, and then replace P with the elite individual. (6) Return to step 2 until a stopping criterion is satisfied.” and Page 499 Figure 1, which teaches connections between the output layer and the middle layers (white square box as shown in Figure 1); mutation is applied to the active nodes, and the selected active nodes provide the output).
Cao et al. and Suganuma et al. are analogous art because they are directed to classification using neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate wherein the selecting of the one or more of the middle nodes of the N initial neural networks… and selecting at least one of the middle nodes in the middle layer ….such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes, as taught by Suganuma et al., into the disclosed invention of Cao et al.
One of ordinary skill in the art would have been motivated to make this modification because of the following: “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method.” (Suganuma et al., Page 498 Section 3 CNN Architecture Design Using Cartesian Genetic Programming).
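The elite-selection steps quoted above from Suganuma et al. (generate offspring by mutation, assign a fitness, keep the elite among parent and offspring, repeat until a stopping criterion) follow the general shape of a (1 + λ) evolutionary strategy. The sketch below illustrates that general shape only and is not the reference's implementation. The bit-string genome, the `fitness` stand-in (the reference assigns validation accuracy after training a CNN), the tie-breaking rule, and the fixed generation count are assumptions; the neutral-mutation step is omitted.

```python
import random

def fitness(genome):
    # Stand-in for "train the network and measure validation accuracy"
    # (assumption): here, simply the fraction of ones in the genome.
    return sum(genome) / len(genome)

def mutate(genome, rng, rate=0.2):
    """Flip each gene with probability `rate`; force at least one change,
    loosely mirroring 'until at least one active node changes'."""
    child = [1 - g if rng.random() < rate else g for g in genome]
    if child == genome:
        i = rng.randrange(len(genome))
        child[i] = 1 - child[i]
    return child

def one_plus_lambda(genome_len=12, lam=2, generations=50, seed=0):
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(genome_len)]
    history = [fitness(parent)]
    for _ in range(generations):
        # Produce lambda offspring from the single parent.
        offspring = [mutate(parent, rng) for _ in range(lam)]
        # Select the elite among parent and offspring.
        # Assumption: ties are resolved in favour of the offspring.
        best = max(offspring, key=fitness)
        if fitness(best) >= fitness(parent):
            parent = best
        history.append(fitness(parent))
    return parent, history

parent, history = one_plus_lambda()
```

Because the parent is replaced only by an offspring that is at least as fit, the recorded fitness never decreases from one generation to the next.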
Kim et al. further teaches obtaining K different sets of training data, K being an integer more than 1 (Page 4 paragraph [0093] “In this manner, training data 41 and 42 used for learning of the first artificial neural network 51 and the second artificial neural network 52 are different. Even when the first artificial neural network 51 and the second artificial neural network 52 have the same structure, the connection strengths W11 to W14 of the first artificial neural network 51 and the connection strengths W21 to W24 of the second artificial neural network 52 are different” and Figure 1 teaches a learning device (40) containing more than one set of training data (41, 42, 43)).
performing supervised training on the N initial neural networks with each of the K different sets of training data (Page 7 Paragraph [0157] “The third artificial neural network 53 may have a multilayer perceptron structure. For example, as illustrated in FIG. 11, the third artificial neural network 53 may include a plurality of conversion layers L31 to L35” and Page 7 paragraph [0159] “converted connection strengths W31 to W36 of units are determined by supervised learning” and Figure 14 and Figure 1 teach training neural networks (for example, 53 in Figure 1) using supervised learning with more than one set of training data (for example, 43 in Figure 1)).
to obtain K training results for each of the N initial neural networks (Page 8 paragraphs [0166]-[0167] “The learning device 40 performs multilayer learning of the artificial neural network 50 (operation S523)...... In the multilayer learning operation, the converted connection strengths W31 to W36 of the third artificial neural network 53 are adjusted” and Page 4 Paragraph [0093] “of updating the connection strengths W11 to W14 using training data 41 and 42 and” and Page 4 paragraph [0093] “In this manner, training data 41 and 42 used for learning of the first artificial neural network 51 and the second artificial neural network 52 are different.” teaches multiple neural networks receiving multiple sets of training data).
of the N initial neural networks using the K training results (Page 4 paragraph [0093] “In this manner, training data 41 and 42 used for learning of the first artificial neural network 51 and the second artificial neural network 52 are different. Even when the first artificial neural network 51 and the second artificial neural network 52 have the same structure, the connection strengths W11 to W14 of the first artificial neural network 51 and the connection strengths W21 to W24 of the second artificial neural network 52 are different” and Page 4 Paragraph [0093] “of updating the connection strengths W11 to W14 using training data 41 and 42” and Figure 1 teaches using training data to update neural network connections).
Cao et al., Suganuma et al. and Kim et al. are analogous art because they are directed to using neural networks in which a correlation between features is learned.  
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, obtaining K different sets of training data, K being an integer more than 1; performing supervised training on the N initial neural networks with each of the K different sets of training data to obtain K training results for each of the N initial neural networks ….. of the N initial neural networks using the K training results as taught by Kim et al. to the disclosed invention of Cao et al. in view of Suganuma et al. 
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “The image registration device 1 may perform image registration using a plurality of pre-trained artificial neural networks 50 (51,52, and 53). Although FIG. 1 illustrates three artificial neural networks 51, 52, and 53, it is understood that one or more other exemplary embodiments are not limited thereto, and the number of pre-trained artificial neural networks may be greater or less than three” and “The first image and the second image have different features. Therefore, in order to extract features from the first image and the second image, the different artificial neural networks 51 and 52 may be used” (Kim et al. Page 3 Paragraph 0076 and Page 4 Paragraph 0085).
Regarding claim 11: 
Cao et al. in view of Suganuma et al. and further in view of Kim et al. teaches the computer program product as recited in claim 9.
Suganuma et al. further teaches wherein the middle layer of each of the plurality of initial neural network comprises L middle nodes, L being an integer larger than 2, and wherein the number of the middle nodes in the new middle layer is equal to or less than L (Page 499 Figure 1. Teaches initial middle layer (white square box) nodes larger than 2 and as shown new middle layers less than initial middle nodes).
 Cao et al. and Suganuma et al. are analogous art because they are directed to classification using neural network.
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, wherein the middle layer of each of the plurality of initial neural network comprises L middle nodes, L being an integer larger than 2, and wherein the number of the middle nodes in the new middle layer is equal to or less than L as taught by Suganuma et al. to the disclosed invention of Cao et al. 
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method.”(Suganuma et al., Page 498 Section 3 Cnn Architecture Design Using Cartesian Genetic Programming).
Regarding Claim 16: 
Cao et al. in view of Suganuma et al. teaches the system as recited in claim 15. 
Cao et al. further teaches wherein the plurality of initial neural networks comprises N initial neural networks (Page 7 Column 2 “The computer program product may include a computer readable storage medium (or media) having computer readable program instructions” and Page 12 Column 11 “As shown in FIG. 5, the neural network includes input layer 502, output layer 506, and hidden layers 504. In this embodiment, there may be between zero and “n” hidden layers, where “n” is a real number greater than or equal to one. Input layer 502, output layer 506, and each hidden layer 504 includes a plurality of nodes (or “neurons”), designed as 502 a through 502 n for input layer 502, 506-1 a through 506-1 n for pseudo labels portion 508 of output layer 506, 506-2 a through 506-2 n for actual labels portion 510 of output layer 506, 504 a-a through 504 a-n for the first hidden layer 504, and 504 n-a through 504 n-n for the last hidden layer 504.” Teaches initial neural networks more than 1).
and wherein the generating of the new neural network comprises the programming instructions for (page 8 Column 3 “These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks” and Page 12 Column 11 “ As shown in FIG. 5, the neural network includes input layer 502, output layer 506, and hidden layers 504. In this embodiment, there may be between zero and “n” hidden layers, where “n” is a real number greater than or equal to one. Input layer 502, output layer 506, and each hidden layer 504 includes a plurality of nodes (or “neurons”), designed as 502 a through 502 n for input layer 502, 506-1 a through 506-1 n for pseudo labels portion 508 of output layer 506, 506-2 a through 506-2 n for actual labels portion 510 of output layer 506, 504 a-a through 504 a-n for the first hidden layer 504, and 504 n-a through 504 n-n for the last hidden layer 504.” teaches computer program product that generate neural network).
Suganuma et al. further teaches selecting one or more of the middle nodes of the N initial neural networks; and including the selected one or more middle nodes in the new middle layer of the new neural network (Page 500 Section 3.2 Evolutionary Algorithm “Select an elite individual from the set of P and C, and then replace P with the elite individual” and Page 499 Figure 1; as shown in Figure 1, middle nodes are selected from the initial middle layer (white square boxes), and the selected middle nodes are included in the neural network).
Cao et al. and Suganuma et al. are analogous art because they are directed to classification using neural network.
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, selecting one or more of the middle nodes of the N initial neural networks; and including the selected one or more middle nodes in the new middle layer of the new neural network as taught by Suganuma et al. to the disclosed invention of Cao et al. 
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method.”(Suganuma et al., Page 498 Section 3 Cnn Architecture Design Using Cartesian Genetic Programming).
Cao in view of Suganuma does not explicitly teach N being integer larger than 1. 
However, Kim et al. teaches N being integer larger than 1 (Page 3 Paragraph 0076 “Although FIG. 1 illustrates three artificial neural networks 51, 52, and 53” teaches neural network more than 1). 
Cao et al., Suganuma et al. and Kim et al. are analogous art because they are directed to using neural networks in which a correlation between features is learned.  
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, N being integer larger than 1 as taught by Kim et al. to the disclosed invention of Cao et al. 
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “The image registration device 1 may perform image registration using a plurality of pre-trained artificial neural networks 50 (51,52, and 53). Although FIG. 1 illustrates three artificial neural networks 51, 52, and 53, it is understood that one or more other exemplary embodiments are not limited thereto, and the number of pre-trained artificial neural networks may be greater or less than three” and “The first image and the second image have different features. Therefore, in order to extract features from the first image and the second image, the different artificial neural networks 51 and 52 may be used” (Kim et al. Page 3 Paragraph 0076 and Page 4 Paragraph 0085).
Regarding Claim 17: 
Cao et al. in view of Suganuma et al. and further in view of Kim et al. teaches the system as recited in claim 16.
Suganuma et al. further teaches wherein the selecting of the one or more of the middle nodes of the N initial neural networks comprises (Page 500 Section 3.2 Evolutionary Algorithm “Select an elite individual from the set of P and C, and then replace P with the elite individual” and Page 499 Figure 1; as shown in Figure 1, middle nodes are selected from the initial middle layer (white square boxes), and the selected middle nodes are included in the neural network).
and selecting at least one of the middle nodes in the middle layer (Page 500 Section 3.2 Evolutionary Algorithm “(3) Train the λ CNNs represented by offsprings C in parallel, and assign the validation accuracies as the fitness. (4) Apply the neutral mutation to parent P. (5) Select an elite individual from the set of P and C, and then replace P with the elite individual” and Page 499 Figure 1; as shown in Figure 1, the best middle nodes are picked from the initial middle layer (white square boxes)).
such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes (Page 499 “not all of the nodes are connected to the output nodes. Node No. 5 on the left side of Figure 2 is an inactive node” and Page 499 – 500 Section 3.2 Evolutionary Algorithm “To efficiently use the computational resource, we want to evaluate some candidate solutions in parallel at each generation. Therefore, we apply the mutation operator until at least one active node changes for reproducing the candidate solution…Select an elite individual from the set of P and C, and then replace P with the elite individual. (6) Return to step 2 until a stopping criterion is satisfied.” and Page 499 Figure 1. Teaches connections between the output layer and the middle layers (white square boxes as shown in Figure 1); mutation is applied to active nodes, and the selected active nodes provide the output).
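For illustration only, and not as a characterization of Suganuma et al. or of the claimed invention, the recited selection can be sketched as follows. The contribution measure (the absolute value of a middle node's weight to a single output node), the function name select_middle_nodes, and the example weights are all assumptions introduced for this sketch; neither the claims nor the cited references specify them.

```python
from typing import List, Tuple


def select_middle_nodes(
    output_weights: List[List[float]], num_selected: int
) -> List[Tuple[int, int]]:
    """Pick the middle nodes with the largest assumed contribution.

    output_weights[n][j] is the weight from middle node j of initial
    network n to a single output node. Contribution is ASSUMED, for
    illustration, to be the absolute value of that weight.
    Returns (network_index, node_index) pairs, largest contribution first.
    """
    if num_selected < 0:
        raise ValueError("num_selected must be non-negative")
    scored = [
        (abs(w), n, j)
        for n, weights in enumerate(output_weights)
        for j, w in enumerate(weights)
    ]
    # Descending contribution; ties broken by network index, then node index.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [(n, j) for _, n, j in scored[:num_selected]]


# Hypothetical example: N = 2 initial networks, L = 3 middle nodes each.
weights = [[0.9, -0.1, 0.4], [-0.7, 0.05, 0.2]]
selected = select_middle_nodes(weights, num_selected=3)
```

In this hypothetical example the new middle layer holds three nodes, which is no more than L = 3, and every selected node has a larger assumed contribution than every non-selected node.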
Cao et al. and Suganuma et al. are analogous art because they are directed to classification using neural network.
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, wherein the selecting of the one or more of the middle nodes of the N initial neural networks comprises… and selecting at least one of the middle nodes in the middle layer ….such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes  as taught by Suganuma et al. to the disclosed invention of Cao et al. 
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method.”(Suganuma et al., Page 498 Section 3 Cnn Architecture Design Using Cartesian Genetic Programming).
Kim et al. further teaches obtaining K different sets of training data, K being an integer more than 1 (Page 4 paragraph [0093] “In this manner, training data 41 and 42 used for learning of the first artificial neural network 51 and the second artificial neural network 52 are different. Even when the first artificial neural network 51 and the second artificial neural network 52 have the same structure, the connection strengths W11 to W14 of the first artificial neural network 51 and the connection strengths W21 to W24 of the second artificial neural network 52 are different” and Figure 1 Teaches learning device (40) containing more than 1 training data (41, 42, 43)).
performing supervised training on the N initial neural networks with each of the K different sets of training data (Page 7 Paragraph [0157] “The third artificial neural network 53 may have a multilayer perceptron structure. For example, as illustrated in FIG. 11, the third artificial neural network 53 may include a plurality of conversion layers L31 to L35” and Page 7 paragraph [0159] “converted connection strengths W31 to W36 of units are determined by supervised learning” and Figure 14 and Figure 1, teaches train neural networks (for example 53 in figure 1) using supervised learning with more than 1 training data (for example 43 in figure 1)). 
to obtain K training results for each of the N initial neural networks (Page 8 paragraph [0166-0167] “The learning device 40 performs multilayer learning of the artificial neural network 50 (operation S523)...... In the multilayer learning operation, the converted connection strengths W31 to W36 of the third artificial neural network 53 are adjusted” and Page 4 Paragraph [0093] “of updating the connection strengths W11 to W14 using training data 41 and 42 and” and Page 4 paragraph [0093] “In this manner, training data 41 and 42 used for learning of the first artificial neural network 51 and the second artificial neural network 52 are different.” Teaches that multiple neural networks receive multiple training data).
of the N initial neural networks using the K training results (Page 4 paragraph [0093] “In this manner, training data 41 and 42 used for learning of the first artificial neural network 51 and the second artificial neural network 52 are different. Even when the first artificial neural network 51 and the second artificial neural network 52 have the same structure, the connection strengths W11 to W14 of the first artificial neural network 51 and the connection strengths W21 to W24 of the second artificial neural network 52 are different” and Page 4 Paragraph [0093] “of updating the connection strengths W11 to W14 using training data 41 and 42” and Figure 1. Teaches using training data to update neural network connections).
Cao et al., Suganuma et al. and Kim et al. are analogous art because they are directed to using neural networks in which a correlation between features is learned.  
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, obtaining K different sets of training data, K being an integer more than 1; performing supervised training on the N initial neural networks with each of the K different sets of training data to obtain K training results for each of the N initial neural networks ….. of the N initial neural networks using the K training results as taught by Kim et al. to the disclosed invention of Cao et al. in view of Suganuma et al. 
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “The image registration device 1 may perform image registration using a plurality of pre-trained artificial neural networks 50 (51,52, and 53). Although FIG. 1 illustrates three artificial neural networks 51, 52, and 53, it is understood that one or more other exemplary embodiments are not limited thereto, and the number of pre-trained artificial neural networks may be greater or less than three” and “The first image and the second image have different features. Therefore, in order to extract features from the first image and the second image, the different artificial neural networks 51 and 52 may be used” (Kim et al. Page 3 Paragraph 0076 and Page 4 Paragraph 0085).
Regarding claim 18: 
Cao et al. in view of Suganuma et al. and further in view of Kim et al. teaches the system as recited in claim 16.
Suganuma et al. further teaches wherein the middle layer of each of the plurality of initial neural network comprises L middle nodes, L being an integer larger than 2, and wherein the number of the middle nodes in the new middle layer is equal to or less than L (Page 499 Figure 1. Teaches initial middle layer (white square box) nodes larger than 2 and as shown new middle layers less than initial middle nodes).
 Cao et al. and Suganuma et al. are analogous art because they are directed to classification using neural network.
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, wherein the middle layer of each of the plurality of initial neural network comprises L middle nodes, L being an integer larger than 2, and wherein the number of the middle nodes in the new middle layer is equal to or less than L as taught by Suganuma et al. to the disclosed invention of Cao et al. 
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method” (Suganuma et al., Page 498 Section 3 Cnn Architecture Design Using Cartesian Genetic Programming).
Claims 12 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Cao et al. (US 2016/0328388 A1) in view of Suganuma et al. “A Genetic Programming Approach to Designing Convolutional Neural Network Architectures" and further in view of Kim et al. (US 2016/003050 A1) and Wang et al. "A Comparison among three neural networks for text classification".
Regarding Claim 12: 
Cao et al. in view of Suganuma et al. and further in view of Kim et al. teaches the computer program product as recited in claim 9.
Cao et al. teaches  wherein the generating of the new neural network further comprises the programming instructions for (page 8 Column 3 “These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks” and Page 12 Column 11 “ As shown in FIG. 5, the neural network includes input layer 502, output layer 506, and hidden layers 504. In this embodiment, there may be between zero and “n” hidden layers, where “n” is a real number greater than or equal to one. Input layer 502, output layer 506, and each hidden layer 504 includes a plurality of nodes (or “neurons”), designed as 502 a through 502 n for input layer 502, 506-1 a through 506-1 n for pseudo labels portion 508 of output layer 506, 506-2 a through 506-2 n for actual labels portion 510 of output layer 506, 504 a-a through 504 a-n for the first hidden layer 504, and 504 n-a through 504 n-n for the last hidden layer 504.” teaches computer program product that generate neural network).
Suganuma et al. further teaches such that certain middle nodes are avoided (Page 500 Section 3.2 Evolutionary Algorithm “Apply the neutral mutation to parent P. (5) Select an elite individual from the set of P and C” teaches selecting set of p and c (middle nodes) after mutation which means certain middle nodes are avoided). 
Cao et al. and Suganuma et al. are analogous art because they are directed to classification using neural network.
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, such that certain middle nodes are avoided as taught by Suganuma et al. to the disclosed invention of Cao et al. 
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method” (Suganuma et al., Page 498 Section 3 Cnn Architecture Design Using Cartesian Genetic Programming).
Cao in view of Suganuma and further in view of Kim does not explicitly teach performing unsupervised training on the selected middle nodes, the unsupervised training comprising biasing the middle nodes.
However, Wang et al. teaches performing unsupervised training on the selected middle nodes (Page 2 Section 4.2 “are input to the first layer and fanned out to the hidden layer. ….. functions turn the input to output, adjusting the weight of the input to the hidden layer” and Page 3 Section 4.2 Training “first is unsupervised learning, which adjusts the weight vector between the input and hidden layer” teaches hidden nodes (middle nodes) updated using unsupervised training).
 the unsupervised training comprising biasing the middle nodes (Page 3 Section 4.2 Training “first is unsupervised learning, which adjusts the weight vector between the input and hidden layer” teaches unsupervised learning applied to the hidden nodes (middle nodes)).  
Cao et al., Suganuma et al., Kim et al. and Wang et al. are analogous art because they are directed to neural network which is used for solving classifying problems.  
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, performing unsupervised training on the selected middle nodes, the unsupervised training comprising biasing the middle nodes as taught by Wang et al. to the disclosed invention of Cao et al. in view of Suganuma et al. and Kim et al. 
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “network is a three-layer feed-forward neural network, between the input and the output layers there is a “hidden layer”” and “The RBF network shows its quickness in training….RBF nodes can be joined with competitive nodes to constitute a Probabilistic network which is also used in solving classifying problems” (Wang et al., Page 2 Section 4.1 Structure and Page 4 Section 7 Conclusions).
Regarding Claim 19: 
Cao et al. in view of Suganuma et al. and further in view of Kim et al. teaches the system as recited in claim 16.
Cao et al. further teaches wherein the generating of the new neural network further comprises (page 8 Column 3 “These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks” and Page 12 Column 11 “ As shown in FIG. 5, the neural network includes input layer 502, output layer 506, and hidden layers 504. In this embodiment, there may be between zero and “n” hidden layers, where “n” is a real number greater than or equal to one. Input layer 502, output layer 506, and each hidden layer 504 includes a plurality of nodes (or “neurons”), designed as 502 a through 502 n for input layer 502, 506-1 a through 506-1 n for pseudo labels portion 508 of output layer 506, 506-2 a through 506-2 n for actual labels portion 510 of output layer 506, 504 a-a through 504 a-n for the first hidden layer 504, and 504 n-a through 504 n-n for the last hidden layer 504.” teaches computer program product that generate neural network).
Suganuma et al. further teaches such that certain middle nodes are avoided (Page 500 Section 3.2 Evolutionary Algorithm “Apply the neutral mutation to parent P. (5) Select an elite individual from the set of P and C” teaches selecting set of p and c (middle nodes) after mutation which means certain middle nodes are avoided). 
Cao et al. and Suganuma et al. are analogous art because they are directed to classification using neural network.
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, such that certain middle nodes are avoided as taught by Suganuma et al. to the disclosed invention of Cao et al. 
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method” (Suganuma et al., Page 498 Section 3 Cnn Architecture Design Using Cartesian Genetic Programming).
Cao in view of Suganuma and further in view of Kim does not explicitly teach performing unsupervised training on the selected middle nodes, the unsupervised training comprising biasing the middle nodes. 
However, Wang et al. teaches performing unsupervised training on the selected middle nodes (Page 2 Section 4.2 “are input to the first layer and fanned out to the hidden layer. ….. functions turn the input to output, adjusting the weight of the input to the hidden layer” and Page 3 Section 4.2 Training “first is unsupervised learning, which adjusts the weight vector between the input and hidden layer” teaches hidden nodes (middle nodes) updated using unsupervised training).
 the unsupervised training comprising biasing the middle nodes (Page 3 Section 4.2 Training “first is unsupervised learning, which adjusts the weight vector between the input and hidden layer” teaches unsupervised learning applied to the hidden nodes (middle nodes)).  
Cao et al. and Wang et al. are analogous art because they are directed to neural network which is used for solving classifying problems.  
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, performing unsupervised training on the selected middle nodes, the unsupervised training comprising biasing the middle nodes as taught by Wang et al. to the disclosed invention of Cao et al.
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “network is a three-layer feed-forward neural network, between the input and the output layers there is a “hidden layer”.” and “The RBF network shows its quickness in training….RBF nodes can be joined with competitive nodes to constitute a Probabilistic network which is also used in solving classifying problems” (Wang et al., Page 2 Section 4.1 Structure and Page 4 Section 7 Conclusions).
Claims 13-14 and 20-21 are rejected under 35 U.S.C. 103 as being unpatentable over Cao et al. (US 2016/0328388 A1) in view of Suganuma et al. “A Genetic Programming Approach to Designing Convolutional Neural Network Architectures" and further in view of  Wang et al. "A Comparison among three neural networks for text classification”.
Regarding Claim 13: 
Cao et al. in view of Suganuma et al. teaches the computer program product as recited in claim 8.
Cao et al. further teaches wherein the preparing of the plurality of initial neural networks comprises the programming instructions for (Page 12 Column 11 “As shown in FIG. 5, the neural network includes input layer 502, output layer 506, and hidden layers 504. In this embodiment, there may be between zero and “n” hidden layers, where “n” is a real number greater than or equal to one. Input layer 502, output layer 506, and each hidden layer 504 includes a plurality of nodes (or “neurons”), designed as 502 a through 502 n for input layer 502, 506-1 a through 506-1 n for pseudo labels portion 508 of output layer 506, 506-2 a through 506-2 n for actual labels portion 510 of output layer 506, 504 a-a through 504 a-n for the first hidden layer 504, and 504 n-a through 504 n-n for the last hidden layer 504.” Teaches that the neural network contains input, output and middle layers, each having one or more nodes).
obtaining N initial conditions, N being an integer larger than 1, each condition corresponding to one of the initial neural networks (Page 12 column 11-12 “provided inputs (for example, words, phrases, and/or documents)…….each input word or phrase is initially provided with an input vector which…..is randomly initialized. For longer inputs (such as documents), the input is divided into multiple parts, where each part includes its own vector representation” teaches input (initial neural network) provide own vector representations (conditions)). 
Cao in view of Suganuma does not explicitly teach performing unsupervised training of the middle layer of each initial neural network using the corresponding initial condition; and performing supervised training of the output layer of each initial neural network using a set of training data.
However, Wang et al. teaches  performing unsupervised training of the middle layer of each initial neural network using the corresponding initial condition (Page 3 Section 4.2 Training “first is unsupervised learning, which adjusts the weight vector between the input and hidden layer.” Teaches unsupervised training performed between middle layer and input layer).
and performing supervised training of the output layer of each initial neural network using a set of training data (Page 3 Section 4.2 Training “The training of the RBF network should be divided into two processes……The other is supervised learning, which adjusts the weight vector between the hidden and output layer” teaches performing supervised training on the output layer of the initial neural networks).
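For illustration only, the two training processes quoted from Wang et al. (an unsupervised stage for the input-to-hidden weights and a supervised stage for the hidden-to-output weights) can be sketched as below. This is a minimal sketch under stated assumptions and is not Wang et al.'s implementation: k-means is assumed for the unsupervised stage, a delta-rule (LMS) update is assumed for the supervised stage, and the function names, the Gaussian width, the learning rate, and the toy data are all hypothetical.

```python
import math
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centers, iterations=10):
    """Unsupervised stage (ASSUMED k-means): fit hidden-unit centers, no labels used."""
    centers = [list(c) for c in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def hidden_activations(x, centers, width):
    """Gaussian radial basis activations of the hidden (middle) layer."""
    return [math.exp(-sq_dist(x, c) / (2 * width ** 2)) for c in centers]


def predict(x, centers, weights, width):
    """Network output: weighted sum of hidden activations."""
    h = hidden_activations(x, centers, width)
    return sum(w * a for w, a in zip(weights, h))


def train_output(points, labels, centers, width, lr=0.5, epochs=200) -> List[float]:
    """Supervised stage (ASSUMED delta rule): only hidden-to-output weights change."""
    weights = [0.0] * len(centers)
    for _ in range(epochs):
        for x, y in zip(points, labels):
            h = hidden_activations(x, centers, width)
            err = y - sum(w * a for w, a in zip(weights, h))
            weights = [w + lr * err * a for w, a in zip(weights, h)]
    return weights


# Hypothetical toy data: two one-dimensional clusters with labels 0 and 1.
points = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
labels = [0, 0, 0, 1, 1, 1]
centers = kmeans(points, centers=[[0.0], [1.2]])          # labels not used
weights = train_output(points, labels, centers, width=0.3)  # labels used
```

The sketch keeps the two stages in separate functions so that the hidden layer is fitted without labels and is then held fixed while the output weights are trained with labels.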
Cao et al., Suganuma et al. and Wang et al. are analogous art because they are directed to neural network which is used for solving classifying problems.  
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, performing unsupervised training of the middle layer of each initial neural network using the corresponding initial condition; and performing supervised training of the output layer of each initial neural network using a set of training data as taught by Wang et al. to the disclosed invention of Cao et al. in view of Suganuma et al. 
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “network is a three-layer feed-forward neural network, between the input and the output layers there is a “hidden layer”.” and “The RBF network shows its quickness in training….RBF nodes can be joined with competitive nodes to constitute a Probabilistic network which is also used in solving classifying problems” (Wang et al., Page 2 Section 4.1 Structure and Page 4 Section 7 Conclusions).
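For illustration only, the two-process training quoted from Section 4.2 of Wang et al. (unsupervised learning for the input-to-hidden weights, supervised learning for the hidden-to-output weights) can be sketched as follows. The sketch is not taken from Wang et al.: the use of k-means for the unsupervised phase, the delta rule for the supervised phase, the Gaussian width, the toy data, and every function name are assumptions chosen to keep the example small.

```python
import math
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def fit_hidden_centers(inputs, k, iterations=20, seed=0):
    """Unsupervised phase (assumed k-means): place k hidden-node centers; labels are not used."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(inputs, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in inputs:
            nearest = min(range(k), key=lambda j: squared_distance(p, centers[j]))
            groups[nearest].append(p)
        for j, members in enumerate(groups):
            if members:  # keep the old center if no input was assigned to it
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def hidden_outputs(x, centers, width=1.0):
    """Gaussian radial basis activation of each hidden node (width is an assumed constant)."""
    return [math.exp(-squared_distance(x, c) / (2 * width ** 2)) for c in centers]


def fit_output_layer(inputs, targets, centers, learning_rate=0.2, epochs=200):
    """Supervised phase (assumed delta rule): only hidden-to-output weights and bias change."""
    weights = [0.0] * len(centers)
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(inputs, targets):
            h = hidden_outputs(x, centers)
            error = target - (sum(w * a for w, a in zip(weights, h)) + bias)
            weights = [w + learning_rate * error * a for w, a in zip(weights, h)]
            bias += learning_rate * error
    return weights, bias


def predict(x, centers, weights, bias):
    h = hidden_outputs(x, centers)
    return sum(w * a for w, a in zip(weights, h)) + bias


# Hypothetical toy data: two well-separated groups labeled 0 and 1.
inputs = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (3.0, 3.0), (3.1, 2.9), (2.8, 3.2)]
targets = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
centers = fit_hidden_centers(inputs, k=2)
weights, bias = fit_output_layer(inputs, targets, centers)
```

In this sketch the hidden-node centers are fixed before any label is consulted, and the supervised phase never modifies them, which mirrors the division into two processes described in the quoted passage.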
Regarding claim 14: 
Cao et al. in view of Suganuma et al. teaches the computer program product as recited in claim 8.
Cao et al. further teaches wherein the preparing of the plurality of initial neural networks comprises the programming instructions for (Page 12 Column 11 “ As shown in FIG. 5, the neural network includes input layer 502, output layer 506, and hidden layers 504. In this embodiment, there may be between zero and “n” hidden layers, where “n” is a real number greater than or equal to one. Input layer 502, output layer 506, and each hidden layer 504 includes a plurality of nodes (or “neurons”), designed as 502 a through 502 n for input layer 502, 506-1 a through 506-1 n for pseudo labels portion 508 of output layer 506, 506-2 a through 506-2 n for actual labels portion 510 of output layer 506, 504 a-a through 504 a-n for the first hidden layer 504, and 504 n-a through 504 n-n for the last hidden layer 504.” teaches a neural network containing input, output, and middle layers, each having one or more nodes).
obtaining M initial conditions, M being an integer larger than 2, each condition corresponding to one of M candidate neural networks (Page 12 column 11-12 “provided inputs (for example, words, phrases, and/or documents)…….each input word or phrase is initially provided with an input vector which…..is randomly initialized. For longer inputs (such as documents), the input is divided into multiple parts, where each part includes its own vector representation” teaches that each input (initial neural network) is provided with its own vector representation (condition)).
Suganuma et al. further teaches and selecting N initial neural networks from among the M candidate neural networks using the performances (Page 500 Section 3.2 Evolutionary Algorithm “(3) Train the λ CNNs represented by offsprings C in parallel, and assign the validation accuracies as the fitness. (4) Apply the neutral mutation to parent P. (5) Select an elite individual from the set of P and C, and then replace P with the elite individual.” teaches that the best middle nodes are selected based on the training results).
N being an integer larger than 1 and smaller than M (Page 499 Figure 1 teaches initial neural nodes more than 1 and less than candidate neural network).
Cao et al. and Suganuma et al. are analogous art because they are directed to classification using neural network.
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, and selecting N initial neural networks from among the M candidate neural networks using the performances, N being an integer larger than 1 and smaller than M as taught by Suganuma et al. to the disclosed invention of Cao et al. 
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method.”(Suganuma et al., Page 498 Section 3 Cnn Architecture Design Using Cartesian Genetic Programming).
Cao in view of Suganuma does not explicitly teach performing unsupervised training of the middle layer of each candidate neural network using the corresponding initial condition; performing supervised training of the output layer of each candidate neural network using a set of training data; evaluating a performance of each candidate neural network.
However, Wang et al. teaches performing unsupervised training of the middle layer of each candidate neural network using the corresponding initial condition (Page 3 Section 4.2 Training “first is unsupervised learning, which adjusts the weight vector between the input and hidden layer.” Teaches unsupervised training performed between middle layer and input layer).
performing supervised training of the output layer of each candidate neural network using a set of training data; evaluating a performance of each candidate neural network (Page 3 Section 4.2 Training “The training of the RBF network should be divided into two processes……The other is supervised learning, which adjusts the weight vector between the hidden and output layer.” Teaches performing supervised training on initial neural network and update the weight using supervised learning).
Cao et al., Suganuma et al. and Wang et al. are analogous art because they are directed to neural networks used for solving classification problems.
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, performing unsupervised training of the middle layer of each candidate neural network using the corresponding initial condition; performing supervised training of the output layer of each candidate neural network using a set of training data; evaluating a performance of each candidate neural network as taught by Wang et al. to the disclosed invention of Cao et al. in view of Suganuma et al.  
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “network is a three-layer feed-forward neural network, between the input and the output layers there is a “hidden layer”.” and “The RBF network shows its quickness in training….RBF nodes can be joined with competitive nodes to constitute a Probabilistic network which is also used in solving classifying problems” (Wang et al., Page 2 Section 4.1 Structure and Page 4 Section 7 Conclusions).
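For illustration only, the limitation of obtaining M initial conditions, evaluating a performance for each of M candidates, and selecting N of them with 1 < N < M can be sketched as a generic top-N selection. This sketch is not the evolutionary algorithm of Suganuma et al.; the function names are hypothetical, and `evaluate_candidate` is a placeholder that returns a pseudo-random score in place of actually training a network and measuring its validation accuracy.

```python
import random


def evaluate_candidate(initial_condition):
    """Hypothetical stand-in for 'train the candidate, then measure its performance'."""
    rng = random.Random(initial_condition)
    return rng.random()


def select_top_n(initial_conditions, n, evaluate=evaluate_candidate):
    """Return the n initial conditions whose candidates scored best, best first."""
    m = len(initial_conditions)
    if not 1 < n < m:
        raise ValueError("n must be larger than 1 and smaller than the number of candidates")
    scored = [(evaluate(condition), condition) for condition in initial_conditions]
    # Sort on the score only, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [condition for _, condition in scored[:n]]


# M = 5 candidate initial conditions (here, random seeds), N = 2 selected.
selected = select_top_n([11, 12, 13, 14, 15], n=2)
```

Passing a different `evaluate` callable (for example, one that trains a real network from the given initial condition) would leave the selection logic unchanged.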
Regarding Claim 20: 
Cao et al. in view of Suganuma et al. teaches the system as recited in claim 15.
Cao et al. further teaches wherein the preparing of the plurality of initial neural networks comprises (Page 12 Column 11 “ As shown in FIG. 5, the neural network includes input layer 502, output layer 506, and hidden layers 504. In this embodiment, there may be between zero and “n” hidden layers, where “n” is a real number greater than or equal to one. Input layer 502, output layer 506, and each hidden layer 504 includes a plurality of nodes (or “neurons”), designed as 502 a through 502 n for input layer 502, 506-1 a through 506-1 n for pseudo labels portion 508 of output layer 506, 506-2 a through 506-2 n for actual labels portion 510 of output layer 506, 504 a-a through 504 a-n for the first hidden layer 504, and 504 n-a through 504 n-n for the last hidden layer 504.” teaches a neural network containing input, output, and middle layers, each having one or more nodes).
obtaining N initial conditions, N being an integer larger than 1, each condition corresponding to one of the initial neural networks (Page 12 column 11-12 “provided inputs (for example, words, phrases, and/or documents)…….each input word or phrase is initially provided with an input vector which…..is randomly initialized. For longer inputs (such as documents), the input is divided into multiple parts, where each part includes its own vector representation” teaches that each input (initial neural network) is provided with its own vector representation (condition)).
Cao in view of Suganuma does not explicitly teach performing unsupervised training of the middle layer of each initial neural network using the corresponding initial condition; and performing supervised training of the output layer of each initial neural network using a set of training data.
However, Wang et al. teaches performing unsupervised training of the middle layer of each initial neural network using the corresponding initial condition (Page 3 Section 4.2 Training “first is unsupervised learning, which adjusts the weight vector between the input and hidden layer.” teaches unsupervised training performed between the middle layer and the input layer).
and performing supervised training of the output layer of each initial neural network using a set of training data (Page 3 Section 4.2 Training “The training of the RBF network should be divided into two processes……The other is supervised learning, which adjusts the weight vector between the hidden and output layer.” teaches performing supervised training on the output layer of the initial neural networks).
Cao et al., Suganuma et al. and Wang et al. are analogous art because they are directed to neural networks used for solving classification problems.
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, performing unsupervised training of the middle layer of each initial neural network using the corresponding initial condition; and performing supervised training of the output layer of each initial neural network using a set of training data as taught by Wang et al. to the disclosed invention of Cao et al. in view of Suganuma et al.  
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “network is a three-layer feed-forward neural network, between the input and the output layers there is a “hidden layer”.” and “The RBF network shows its quickness in training….RBF nodes can be joined with competitive nodes to constitute a Probabilistic network which is also used in solving classifying problems” (Wang et al., Page 2 Section 4.1 Structure and Page 4 Section 7 Conclusions).
Regarding claim 21: 
Cao et al. in view of Suganuma et al. teaches the system as recited in claim 15.
Cao et al. further teaches wherein the preparing of the plurality of initial neural networks comprises (Page 12 Column 11 “ As shown in FIG. 5, the neural network includes input layer 502, output layer 506, and hidden layers 504. In this embodiment, there may be between zero and “n” hidden layers, where “n” is a real number greater than or equal to one. Input layer 502, output layer 506, and each hidden layer 504 includes a plurality of nodes (or “neurons”), designed as 502 a through 502 n for input layer 502, 506-1 a through 506-1 n for pseudo labels portion 508 of output layer 506, 506-2 a through 506-2 n for actual labels portion 510 of output layer 506, 504 a-a through 504 a-n for the first hidden layer 504, and 504 n-a through 504 n-n for the last hidden layer 504.” teaches a neural network containing input, output, and middle layers, each having one or more nodes).
obtaining M initial conditions, M being an integer larger than 2, each condition corresponding to one of M candidate neural networks (Page 12 column 11-12 “provided inputs (for example, words, phrases, and/or documents)…….each input word or phrase is initially provided with an input vector which…..is randomly initialized. For longer inputs (such as documents), the input is divided into multiple parts, where each part includes its own vector representation” teaches that each input (initial neural network) is provided with its own vector representation (condition)).
Suganuma et al. further teaches and selecting N initial neural networks from among the M candidate neural networks using the performances (Page 500 Section 3.2 Evolutionary Algorithm “(3) Train the λ CNNs represented by offsprings C in parallel, and assign the validation accuracies as the fitness. (4) Apply the neutral mutation to parent P. (5) Select an elite individual from the set of P and C, and then replace P with the elite individual.” teaches that the best middle nodes are selected based on the training results).
N being an integer larger than 1 and smaller than M (Page 499 Figure 1 teaches initial neural nodes more than 1 and less than candidate neural network). 
Cao et al. and Suganuma et al. are analogous art because they are directed to classification using neural network.
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, and selecting N initial neural networks from among the M candidate neural networks using the performances, N being an integer larger than 1 and smaller than M as taught by Suganuma et al. to the disclosed invention of Cao et al. 
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method.”(Suganuma et al., Page 498 Section 3 Cnn Architecture Design Using Cartesian Genetic Programming).
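Cao in view of Suganuma does not explicitly teach performing unsupervised training of the middle layer of each candidate neural network using the corresponding initial condition; performing supervised training of the output layer of each candidate neural network using a set of training data; evaluating a performance of each candidate neural network.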
However, Wang et al. teaches performing unsupervised training of the middle layer of each candidate neural network using the corresponding initial condition (Page 3 Section 4.2 Training “first is unsupervised learning, which adjusts the weight vector between the input and hidden layer.” Teaches unsupervised training performed between middle layer and input layer).
performing supervised training of the output layer of each candidate neural network using a set of training data; evaluating a performance of each candidate neural network (Page 3 Section 4.2 Training “The training of the RBF network should be divided into two processes……The other is supervised learning, which adjusts the weight vector between the hidden and output layer.” Teaches performing supervised training on initial neural network and update the weight using supervised learning).
Cao et al., Suganuma et al. and Wang et al. are analogous art because they are directed to neural networks used for solving classification problems.
It would have been obvious for one of ordinary skill in the arts before the effective filing date of the claimed invention to incorporate, performing unsupervised training of the middle layer of each candidate neural network using the corresponding initial condition; performing supervised training of the output layer of each candidate neural network using a set of training data; evaluating a performance of each candidate neural network as taught by Wang et al. to the disclosed invention of Cao et al. in view of Suganuma et al.  
One of ordinary skill in the arts would have been motivated to make this modification because of the following, “network is a three-layer feed-forward neural network, between the input and the output layers there is a “hidden layer”.” and “The RBF network shows its quickness in training….RBF nodes can be joined with competitive nodes to constitute a Probabilistic network which is also used in solving classifying problems” (Wang et al., Page 2 Section 4.1 Structure and Page 4 Section 7 Conclusions).
Response to Arguments
Applicant's arguments filed on 12/31/2021 with respect to the double patenting rejection of Claims 8-21 have been fully considered but are not persuasive.
Regarding Claims 8-21, Applicant asserts “Since none of the claims have been allowed in the present Application, Applicant defers responding to this rejection.” (Remarks Pg. 7).
Examiner response: 
The examiner respectfully disagrees. First, according to MPEP § 804.01, “A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985). In determining whether a nonstatutory basis exists for a double patenting rejection, the first question to be asked is: is any invention claimed in the application anticipated by, or an obvious variation of, an invention claimed in the patent? If the answer is yes, then a nonstatutory double patenting rejection may be appropriate. Nonstatutory double patenting requires rejection of an application claim when the claimed subject matter is not patentably distinct from the subject matter claimed in a commonly owned patent, or a non-commonly owned patent but subject to a joint research agreement as set forth in 35 U.S.C. 102(c) or pre-AIA 35 U.S.C. 103(c)(2) and (3), when the issuance of a second patent would provide unjustified extension of the term of the right to exclude granted by a patent. See Eli Lilly & Co. v. Barr Labs., Inc., 251 F.3d 955, 58 USPQ2d 1869 (Fed. Cir. 2001); Ex parte Davis, 56 USPQ2d 1434, 1435-36 (Bd. Pat. App. & Inter. 2000).” (emphasis added). Second, Applicant did not provide specific arguments as to the double patenting rejection. Therefore, the nonstatutory double patenting rejection of claims 8-21 is maintained.
Applicant's arguments filed on 12/31/2021 with respect to the 35 U.S.C. 101 rejection of Claims 8, 9, 11, 15, 16 and 18 have been fully considered but are not persuasive.
Applicant asserts that “Furthermore, Applicant's claimed invention does indeed include limitations requiring computer implementation of the methods. For example, independent claim 8 (and similarly independent claim 15) of Applicant's claimed invention is directed to "preparing a plurality of initial neural networks, each of which comprises an input layer containing one or more input nodes, a middle layer containing one or more middle nodes, and an output layer containing one or more output nodes; and generating a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks." These steps cannot be performed mentally or with a pencil and paper. 
For example, how can one prepare a plurality of initial neural networks mentally or with a pencil and paper? In another example, how can one generate a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks mentally or with a pencil and paper?” and “Furthermore, Applicant's claimed invention cannot be performed by humans without a computer. For example, as discussed above, independent claim 8 (and similarly independent claim 15) of Applicant's claimed invention is directed to "preparing a plurality of initial neural networks, each of which comprises an input layer containing one or more input nodes, a middle layer containing one or more middle nodes, and an output layer containing one or more output nodes; and generating a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks." 
How can one prepare a plurality of initial neural networks without the use of a computing device? In another example, how can one generate a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks without the use of a computing device?” (Remarks Pg. 9 and Pg. 11)
Examiner response:
The examiner respectfully disagrees. First according to MPEP 2106.04(a)(2) III “The courts do not distinguish between mental processes that are performed entirely in the human mind and mental processes that require a human to use a physical aid (e.g., pen and paper or a slide rule) to perform the claim limitation. See, e.g., Benson, 409 U.S. at 67, 65, 175 USPQ at 674-75, 674 (noting that the claimed "conversion of [binary-coded decimal] numerals to pure binary numerals can be done mentally," i.e., "as a person would do it by head and hand."); Synopsys, Inc. v. Mentor Graphics Corp., 839 F.3d 1138, 1139, 120 USPQ2d 1473, 1474 (Fed. Cir. 2016) (holding that claims to a mental process of "translating a functional description of a logic circuit into a hardware component description of the logic circuit" are directed to an abstract idea, because the claims "read on an individual performing the claimed steps mentally or with pencil and paper"). Mental processes performed by humans with the assistance of physical aids such as pens or paper are explained further below with respect to point B.
Nor do the courts distinguish between claims that recite mental processes performed by humans and claims that recite mental processes performed on a computer. As the Federal Circuit has explained, "[c]ourts have examined claims that required the use of a computer and still found that the underlying, patent-ineligible invention could be performed via pen and paper or in a person’s mind." Versata Dev. Group v. SAP Am., Inc., 793 F.3d 1306, 1335, 115 USPQ2d 1681, 1702 (Fed. Cir. 2015). See also Intellectual Ventures I LLC v. Symantec Corp., 838 F.3d 1307, 1318, 120 USPQ2d 1353, 1360 (Fed. Cir. 2016) (‘‘[W]ith the exception of generic computer-implemented steps, there is nothing in the claims themselves that foreclose them from being performed by a human, mentally or with pen and paper.’’); Mortgage Grader, Inc. v. First Choice Loan Servs. Inc., 811 F.3d 1314, 1324, 117 USPQ2d 1693, 1699 (Fed. Cir. 2016) (holding that computer-implemented method for "anonymous loan shopping" was an abstract idea because it could be "performed by humans without a computer"). Mental processes recited in claims that require computers are explained further below with respect to point C.” (emphasis added).
The Final Rejection, in accordance with the guidance of MPEP 2106.04(a)(2) III, established that the claim limitations in claims 8 and 15, for example, “as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components.” (Section 6 of this Final Rejection). With the assistance of pen and paper, a human can evaluate and make judgments about the structure of a neural network and can establish (generate) a neural network by drawing its structure (layers and nodes) on paper and writing down the weight and bias values. Through such evaluation, a human can construct (prepare) a neural network and make judgments about its structure (layers and nodes).
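For illustration only, the kind of bookkeeping described above (a layer and node structure together with written-down weight and bias values) can be set out as a short sketch. The layer sizes, the numeric values, and the ReLU activation are all hypothetical and were chosen so that each step is simple arithmetic; none of them comes from the claims or from the cited references.

```python
def relu(value):
    """Assumed activation: negative sums become zero, positive sums pass through."""
    return max(0.0, value)


def forward(inputs, middle_weights, middle_biases, output_weights, output_bias):
    """Compute the middle-node values and the single output value of a tiny network."""
    middle = [
        relu(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(middle_weights, middle_biases)
    ]
    output = sum(w * m for w, m in zip(output_weights, middle)) + output_bias
    return middle, output


# Two input nodes, two middle nodes, one output node (hypothetical values).
middle, output = forward(
    inputs=[1.0, 2.0],
    middle_weights=[[1.0, -1.0], [0.5, 0.5]],
    middle_biases=[0.0, 1.0],
    output_weights=[2.0, 1.0],
    output_bias=-0.5,
)
# Middle node 1: 1*1 + (-1)*2 + 0 = -1, which ReLU maps to 0.
# Middle node 2: 0.5*1 + 0.5*2 + 1 = 2.5.
# Output: 2*0 + 1*2.5 - 0.5 = 2.0.
```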
Applicant asserts that “Assuming arguendo that Applicant's claimed invention can be interpreted as reciting a judicial exception (abstract idea), Applicant's claimed invention is not directed to the judicial exception since the judicial exception is integrated into a practical application of the judicial exception. 
For example, the combination of steps in the claimed invention, such as "preparing a plurality of initial neural networks, each of which comprises an input layer containing one or more input nodes, a middle layer containing one or more middle nodes, and an output layer containing one or more output nodes; and generating a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks" as recited in independent claim 8 (and similarly in independent claims 15), reflect an improvement to another technology or technical field.” (Remarks Pg. 14)
Examiner response:
The examiner respectfully disagrees. The limitations of claims 8 and 15 amount to a mental process and are not additional elements; therefore, they cannot reflect an improvement. See MPEP 2106.05(a) (“It is important to note, the judicial exception alone cannot provide the improvement. The improvement can be provided by one or more additional elements. See the discussion of Diamond v. Diehr, 450 U.S. 175, 187 and 191-92, 209 USPQ 1, 10 (1981)) in subsection II, below.”).
Applicant asserts that “As discussed above, by generating a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of an initial set of neural networks, less computational resources need to be utilized thereby improving the functioning of the computer. See, e.g., paragraph [0095] of Applicant's specification.” (Remarks Pg. 19)
Examiner response: 
The examiner respectfully disagrees. The limitations of claims 8 and 15 amount to a mental process and are not additional elements; therefore, they cannot reflect an improvement. See MPEP 2106.05(a) (“It is important to note, the judicial exception alone cannot provide the improvement. The improvement can be provided by one or more additional elements. See the discussion of Diamond v. Diehr, 450 U.S. 175, 187 and 191-92, 209 USPQ 1, 10 (1981)) in subsection II, below.”).
Applicant asserts that “Furthermore, the Examiner asserts that the additional elements or combination of elements are directed to routine, conventional and well-understood activities previously engaged in by those in the field of the present invention, and therefore, the claimed invention is directed to non-statutory subject matter…. The Examiner must perform a factual determination as to whether the claim limitations of Applicant's claimed invention are routine, conventional and well-understood that were previously engaged in by those in the field of the present invention” (Remarks, pg. 19) and “How are these "additional elements" simply mere instructions to apply the exception using a generic computer component?” (Remarks, Pg. 20)
Examiner response: 
The Office Action indicates that the claims recite additional elements that are mere instructions to implement an abstract idea on a computer, or that merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f) (“In Alice Corp., the claim recited the concept of intermediated settlement as performed by a generic computer. The Court found that the recitation of the computer in the claim amounted to mere instructions to apply the abstract idea on a generic computer”). For example, claim 8 recites additional elements that are merely components of a generic computer, including the following: “the computer program product comprising a computer readable storage medium having program code embodied therewith, the program code comprising the programming instructions for...”
Applicant asserts that “How are the limitations of "selecting one or more of the middle nodes of the N initial neural networks; and including the selected one or more middle nodes in the new middle layer of the new neural network" as recited in claims 9 and 16 simply mere instructions to apply the exception using a generic computer component? How are the limitations of "obtaining K different sets of training data, K being an integer more than 1; performing supervised training on the N initial neural networks with each of the K different sets of training data to obtain K training results for each of the N initial neural networks; and selecting at least one of the middle nodes in the middle layer of the N initial neural networks using the K training results, such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes" as recited in claims 10 and 17 simply mere instructions to apply the exception using a generic computer component? How are the limitations of "wherein the middle layer of each of the plurality of initial neural network comprises L middle nodes, L being an integer larger than 2, and wherein the number of the middle nodes in the new middle layer is equal to or less than L" as recited in claims 11 and 18 simply mere instructions to apply the exception using a generic computer component?” (Remarks, Pg. 20).
Examiner response: 
The Office Action does not assert that these claim limitations are additional elements amounting to mere instructions to apply the exception. Please see the Office Action for additional information.
Applicant's arguments filed on 12/31/2021 with respect to the 35 U.S.C. 103 rejection of Claims 8 and 15 have been fully considered but are not persuasive.
Applicant asserts that “There is no language in Suganuma that teaches generating a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks. Instead, Suganuma is simply focused on presenting the CNN architectures based on Cartesian genetic programming. In particular, Suganuma teaches automatically constructing CNN architectures for an image classification task based on Cartesian genetic programming. See, e.g., page 497. There is no discussion in Suganuma regarding such CNN architecture comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the initial neural networks.” (Remarks Pg. 24).
Examiner response: 
The examiner respectfully disagrees. As discussed in the rejection above, Suganuma et al. teaches “generating a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks.” Fig. 1 of Suganuma et al. teaches a CNN architecture in which the white square boxes correspond to the middle layer, and these boxes are evaluated based on the white square boxes (middle layer) of the initial neural network. Further, Pg. 499 Section 3.2 Evolutionary Algorithm recites “To efficiently use the computational resource, we want to evaluate some candidate solutions in parallel at each generation. Therefore, we apply the mutation operator until at least one active node changes for reproducing the candidate solution”, which teaches that mutation is applied to the active node (corresponding to a middle node). Therefore, the 35 U.S.C. 103 rejection made in the previous office action is maintained.
Applicant asserts that “There is no language in Suganuma (and in particular page 498 of Suganuma) that makes any suggestion to generate a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks (missing claim limitation) in order to optimize the CNN architecture to maximize the validation accuracy (Examiner's reasoning). Instead, the Examiner's source of reasoning simply discusses that the invention of Suganuma focuses on optimizing the CNN architecture defined by Cartesian genetic programming. How does optimizing the CNN architecture defined by Cartesian genetic programming provide motivation for one skilled in the art to modify Cao to generate a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks (missing claim limitation) in order to optimize the CNN architecture to maximize the validation accuracy (Examiner's reasoning)? The Examiner's source of reasoning fails to provide such a rational underpinning. Hence, the Examiner's source of reasoning fails to provide motivation for modifying the teachings of Cao to include the above-cited missing claim limitation.”, and “Why would the reason to modify Cao (whose purpose is to address the need for learning word embeddings for determining similarity between words and phrases) to generate a new neural network comprising a new middle layer containing one or more middle nodes based on the middle nodes of the middle layers of the plurality of initial neural networks (missing claim limitation) be to optimize the CNN architecture to maximize the validation accuracy (Examiner's reasoning)?” (remarks Pg. 25-26).
Examiner response: 
The examiner respectfully disagrees. According to MPEP 2144,
“The strongest rationale for combining references is a recognition, expressly or impliedly in the prior art or drawn from a convincing line of reasoning based on established scientific principles or legal precedent, that some advantage or expected beneficial result would have been produced by their combination. In re Sernaker, 702 F.2d 989, 994-95, 217 USPQ 1, 5-6 (Fed. Cir. 1983). See also Dystar Textilfarben GmbH & Co. Deutschland KG v. C.H. Patrick, 464 F.3d 1356, 1368, 80 USPQ2d 1641, 1651 (Fed. Cir. 2006) ("Indeed, we have repeatedly held that an implicit motivation to combine exists not only when a suggestion may be gleaned from the prior art as a whole, but when the ‘improvement’ is technology-independent and the combination of references results in a product or process that is more desirable, for example because it is stronger, cheaper, cleaner, faster, lighter, smaller, more durable, or more efficient. Because the desire to enhance commercial opportunities by improving a product or process is universal—and even common-sensical—we have held that there exists in these situations a motivation to combine prior art references even absent any hint of suggestion in the references themselves.").” (emphasis added)
In Cao et al., Fig. 5 teaches generating a neural network. Further, Cao et al. Pg. 7 Column 2 discusses constructing a neural network that contains input, middle, and output layers. Moreover, Suganuma et al. Pg. 497, Abstract, states “we attempt to automatically construct CNN architectures for an image classification task based on Cartesian genetic programming (CGP).” Further, Suganuma et al. Fig. 1 teaches a middle layer of a neural network, with an elite solution selected from the set. Therefore, it would be reasonable to combine Cao et al. and Suganuma et al. to improve the neural network of Cao et al. because of the following: “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method” (Suganuma et al., Page 498 Section 3). Further, Applicant is reminded that “[The] reason or motivation to modify the reference may often suggest what the inventor has done, but for a different purpose or to solve a different problem. It is not necessary that the prior art suggest the combination to achieve the same advantage or result discovered by applicant. See, e.g., In re Kahn, 441 F.3d 977, 987, 78 USPQ2d 1329, 1336 (Fed. Cir. 2006).” See MPEP 2144.
Applicant's arguments filed on 12/31/2021 with respect to the 35 U.S.C. 103 rejection of claims 9 and 16 have been fully considered but are not persuasive.
Applicant asserts that “There is no language in the cited passages of Cao, Suganuma and Kim that teaches or suggests selecting one or more of the middle nodes of the N initial neural networks.
Instead, Suganuma simply teaches selecting an elite individual from the set of P and C, and then replacing P with the elite individual. As understood by Applicant, the "P" corresponds to the parent and "C" corresponds to the offsprings which is generated by applying the forced mutation to P. See, e.g., page 500 of Suganuma. The Examiner has not explained how such a set of parents and offsprings corresponds to selecting one or more of the middle nodes of the initial neural networks.”, (remarks Pg. 29).
Examiner response: 
The examiner respectfully disagrees. As discussed in the rejection above, Suganuma et al. teaches “selecting one or more of the middle nodes of the N initial neural networks.” Suganuma et al. Fig. 1 teaches a CNN architecture in which the white square boxes correspond to middle nodes. Further, Suganuma et al. Pg. 500 Section 3.2, Evolutionary Algorithm, “Select an elite individual from the set of P and C, and then replace P with the elite individual” teaches selecting an elite individual (corresponding to selecting one of the middle nodes) from the set of P and C (corresponding to the initial neural networks). Therefore, the 35 U.S.C. 103 rejection made in the previous office action is maintained.
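For illustration only, the quoted selection step can be sketched in Python as below. This is a minimal sketch of a generic elite-selection step with forced mutation, not code from Suganuma et al.; the list-of-integers genotype, the active-gene index list, the mutation rate, the number of offspring, and the fitness function are all hypothetical stand-ins, and ties are resolved in favor of the parent purely as a simplifying assumption.

```python
import random


def forced_mutation(parent, active, rng, rate=0.1, n_funcs=4):
    """Resample genes at random until at least one ACTIVE gene differs from the parent.

    `active` is a hypothetical list of indices of genes that affect the phenotype.
    """
    while True:
        child = [rng.randrange(n_funcs) if rng.random() < rate else g
                 for g in parent]
        if any(child[i] != parent[i] for i in active):
            return child


def evolve_step(parent, active, fitness, rng, n_offspring=2):
    """One generation: build offspring C from parent P, then keep the elite of P and C."""
    offspring = [forced_mutation(parent, active, rng) for _ in range(n_offspring)]
    # max() returns the first maximal element, so the parent is kept on ties
    # (an assumption of this sketch only).
    return max([parent] + offspring, key=fitness)


if __name__ == "__main__":
    rng = random.Random(0)
    active = [0, 2, 4]
    # Hypothetical fitness: stands in for validation accuracy.
    fitness = lambda g: sum(g[i] for i in active)
    p = [0, 0, 0, 0, 0, 0]
    for _ in range(5):
        p = evolve_step(p, active, fitness, rng)
    print(p, fitness(p))
```

In this sketch the unit being selected is a whole individual (a full genotype); how that unit relates to the claim terms is the question argued in the surrounding text, and the sketch takes no position on it.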
b.	Applicant asserts that “There is no language in Suganuma (and in particular page 498 of Suganuma) that makes any suggestion to: (1) select one or more of the middle nodes of the N initial neural networks; and (2) include the selected one or more middle nodes in the new middle layer of the new neural network (missing claim limitations) in order to optimize the CNN architecture to maximize the validation accuracy (Examiner's reasoning). Instead, the Examiner's source of reasoning simply discusses that the invention of Suganuma focuses on optimizing the CNN architecture defined by Cartesian genetic programming. How does optimizing the CNN architecture defined by Cartesian genetic programming provide motivation for one skilled in the art to modify Cao to: (1) select one or more of the middle nodes of the N initial neural networks; and (2) include the selected one or more middle nodes in the new middle layer of the new neural network (missing claim limitations) in order to optimize the CNN architecture to maximize the validation accuracy (Examiner's reasoning)? The Examiner's source of reasoning fails to provide such a rational underpinning. Hence, the Examiner's source of reasoning fails to provide motivation for modifying the teachings of Cao to include the above-cited missing claim limitations.” (remarks Pg.36).
Examiner response: 
The examiner respectfully disagrees. According to MPEP 2144,
“The strongest rationale for combining references is a recognition, expressly or impliedly in the prior art or drawn from a convincing line of reasoning based on established scientific principles or legal precedent, that some advantage or expected beneficial result would have been produced by their combination. In re Sernaker, 702 F.2d 989, 994-95, 217 USPQ 1, 5-6 (Fed. Cir. 1983). See also Dystar Textilfarben GmbH & Co. Deutschland KG v. C.H. Patrick, 464 F.3d 1356, 1368, 80 USPQ2d 1641, 1651 (Fed. Cir. 2006) ("Indeed, we have repeatedly held that an implicit motivation to combine exists not only when a suggestion may be gleaned from the prior art as a whole, but when the ‘improvement’ is technology-independent and the combination of references results in a product or process that is more desirable, for example because it is stronger, cheaper, cleaner, faster, lighter, smaller, more durable, or more efficient. Because the desire to enhance commercial opportunities by improving a product or process is universal—and even common-sensical—we have held that there exists in these situations a motivation to combine prior art references even absent any hint of suggestion in the references themselves.").” (emphasis added)
In Cao et al., Fig. 5 teaches generating a neural network. Further, Cao et al. Pg. 7 Column 2 discusses constructing a neural network that contains input, middle, and output layers. Moreover, Suganuma et al. Pg. 497, Abstract, states “we attempt to automatically construct CNN architectures for an image classification task based on Cartesian genetic programming (CGP).” Further, Suganuma et al. Fig. 1 teaches a middle layer of a neural network, with an elite solution selected from the set. Therefore, it would be reasonable to combine Cao et al. and Suganuma et al. to improve the neural network of Cao et al. because of the following: “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method” (Suganuma et al., Page 498 Section 3). Further, Applicant is reminded that “[The] reason or motivation to modify the reference may often suggest what the inventor has done, but for a different purpose or to solve a different problem. It is not necessary that the prior art suggest the combination to achieve the same advantage or result discovered by applicant. See, e.g., In re Kahn, 441 F.3d 977, 987, 78 USPQ2d 1329, 1336 (Fed. Cir. 2006).” See MPEP 2144.
Applicant's arguments filed on 12/31/2021 with respect to the 35 U.S.C. 103 rejection of claims 10 and 17 have been fully considered but are not persuasive.
Applicant asserts that “There is no language in the cited passages of Cao, Suganuma and Kim that teaches or suggests performing supervised training on the N initial neural networks with each of the K different sets of training data to obtain K training results for each of the N initial neural networks. 
Instead, Kim simply teaches updating the connection strengths using training data. While Kim discusses training data, there is no language in Kim that teaches or suggests performing supervised training on the N initial neural networks with each of the K different sets of training data to obtain K training results for each of the N initial neural networks. 
Furthermore, Cao simply discusses that these computer readable program instructions may be provided to a processor of a general purpose computer. Additionally, Suganuma teaches applying the mutation operator until at least one active node changes for reproducing the candidate solution. Such disclosures do not teach or suggest performing supervised training on the N initial neural networks with each of the K different sets of training data to obtain K training results for each of the N initial neural networks” (Remarks, Pg. 32-33). 
Examiner response: 
The examiner respectfully disagrees. As discussed in the rejection above, Kim et al. teaches “performing supervised training on the N initial neural networks with each of the K different sets of training data to obtain K training results for each of the N initial neural networks.” Kim et al. Fig. 1 illustrates artificial neural networks 51 and 52 (corresponding to the N initial neural networks), which obtain different feature vectors (corresponding to the K different results). Further, Kim et al. Pg. 7 para [0159] “converted connection strengths W31 to W36 of units are determined by supervised learning” teaches supervised learning on the neural network. Kim et al. Pg. 4 para [0093] “training data 41 and 42 used for learning of the first artificial neural network 51 and the second artificial neural network 52 are different” teaches obtaining training data 41 and 42 (corresponding to the K different sets of training data). Further, Kim et al. Fig. 1 shows that different feature vectors (corresponding to the K different results) are obtained for each of neural networks 51 and 52. Therefore, the 35 U.S.C. 103 rejection made in the previous office action is maintained.
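For illustration only, the shape of the recited training step (N networks, K training sets, K results per network) can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the method of Kim et al. or of the application: each "network" is a hypothetical one-weight model y = w * x, and the data sets, learning rate, and epoch count are invented for the example.

```python
def train(w, data, lr=0.1, epochs=50):
    """Supervised training of a one-weight model y = w * x by gradient descent.

    `data` is a list of labeled (x, y) pairs. Returns (trained weight, mean squared error).
    """
    for _ in range(epochs):
        for x, y in data:
            # Gradient of the squared error (w * x - y) ** 2 with respect to w.
            w -= lr * 2 * (w * x - y) * x
    loss = sum((w * x - y) ** 2 for x, y in data) / len(data)
    return w, loss


# N = 3 hypothetical initial networks, each represented by its starting weight.
initial_weights = [0.0, 1.0, -1.0]

# K = 2 hypothetical labeled training sets.
training_sets = [
    [(1.0, 2.0), (2.0, 4.0)],    # consistent with y = 2x
    [(1.0, -1.0), (2.0, -2.0)],  # consistent with y = -x
]

# results[n][k] is the training result of network n on training set k,
# so every network ends up with K results.
results = [[train(w0, data) for data in training_sets] for w0 in initial_weights]

if __name__ == "__main__":
    for n, per_network in enumerate(results):
        print(n, per_network)
```

The nested list makes the bookkeeping explicit: the outer index runs over the N networks and the inner index over the K training sets.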
Applicant asserts that “Neither is there any language in the cited passages of Cao, Suganuma and Kim that teaches or suggests selecting at least one of the middle nodes in the middle layer of the N initial neural networks using the K training results, such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes. 
Instead, Suganuma simply teaches selecting an elite individual from the set of P and C, and then replacing P with the elite individual. As understood by Applicant, the "P" corresponds to the parent and "C" corresponds to the offsprings which is generated by applying the forced mutation to P. See, e.g., page 500 of Suganuma. The Examiner has not explained how such a set of parents and offsprings corresponds to selecting at least one of the middle nodes in the middle layer of the N initial neural networks using the K training results, such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes. 
Additionally, Kim simply teaches that even when the first artificial neural network and the second artificial neural network have the same structure, the connection strengths W11 to W14 of the first artificial neural network and the connection strengths W21 to W24 of the second artificial neural network are different. Having different connection strengths in the artificial neural networks does not imply selecting at least one of the middle nodes in the middle layer of the N initial neural networks using the K training results, such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes. 
Furthermore, Cao simply teaches that the neural network includes an input layer, an output layer and hidden layers. Such a disclosure does not teach or suggest selecting at least one of the middle nodes in the middle layer of the N initial neural networks using the K training results, such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes.” (Remarks Pg. 33-34)
Examiner response: 
The examiner respectfully disagrees. As discussed in the rejection above, Suganuma teaches “selecting at least one of the middle nodes in the middle layer of the N initial neural networks using the K training results, such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes.” Suganuma et al. Pg. 500 Section 3.2 “Select an elite individual from the set of P and C, and then replace P with the elite individual” teaches selecting an elite individual (corresponding to picking the best middle nodes) from the set. Suganuma et al. Fig. 1 shows the initial white square boxes (corresponding to the middle layer). Further, Suganuma et al. Pg. 499 Section 3.2 “To efficiently use the computational resource, we want to evaluate some candidate solutions in parallel at each generation. Therefore, we apply the mutation operator until at least one active node changes for reproducing the candidate solution” teaches applying mutation until at least one active node (corresponding to contributing to a greater degree than non-selected middle nodes) changes for reproducing the candidate solution (corresponding to the output). Further, Kim et al. Page 4 para [0093] “In this manner, training data 41 and 42 used for learning of the first artificial neural network 51 and the second artificial neural network 52 are different” teaches neural networks trained using different training results. Therefore, the 35 U.S.C. 103 rejection made in the previous office action is maintained.
Applicant asserts that “There is no language in Suganuma (and in particular page 498 of Suganuma) that makes any suggestion to select at least one of the middle nodes in the middle layer... such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes (missing claim limitation) in order to optimize the CNN architecture to maximize the validation accuracy (Examiner's reasoning). Instead, the Examiner's source of reasoning simply discusses that the invention of Suganuma focuses on optimizing the CNN architecture defined by Cartesian genetic programming. How does optimizing the CNN architecture defined by Cartesian genetic programming provide motivation for one skilled in the art to modify Cao to select at least one of the middle nodes in the middle layer... such that selected middle nodes contribute to an output from the output layer to a greater degree than non-selected middle nodes (missing claim limitation) in order to optimize the CNN architecture to maximize the validation accuracy (Examiner's reasoning)? The Examiner's source of reasoning fails to provide such a rational underpinning.” (remarks Pg.39).
Examiner response: 
The examiner respectfully disagrees. According to MPEP 2144,
“The strongest rationale for combining references is a recognition, expressly or impliedly in the prior art or drawn from a convincing line of reasoning based on established scientific principles or legal precedent, that some advantage or expected beneficial result would have been produced by their combination. In re Sernaker, 702 F.2d 989, 994-95, 217 USPQ 1, 5-6 (Fed. Cir. 1983). See also Dystar Textilfarben GmbH & Co. Deutschland KG v. C.H. Patrick, 464 F.3d 1356, 1368, 80 USPQ2d 1641, 1651 (Fed. Cir. 2006) ("Indeed, we have repeatedly held that an implicit motivation to combine exists not only when a suggestion may be gleaned from the prior art as a whole, but when the ‘improvement’ is technology-independent and the combination of references results in a product or process that is more desirable, for example because it is stronger, cheaper, cleaner, faster, lighter, smaller, more durable, or more efficient. Because the desire to enhance commercial opportunities by improving a product or process is universal—and even common-sensical—we have held that there exists in these situations a motivation to combine prior art references even absent any hint of suggestion in the references themselves.").” (emphasis added)
In Cao et al., Fig. 5 teaches generating a neural network. Further, Cao et al. Pg. 7 Column 2 discusses constructing a neural network that contains input, middle, and output layers. Moreover, Suganuma et al. Pg. 497, Abstract, states “we attempt to automatically construct CNN architectures for an image classification task based on Cartesian genetic programming (CGP).” Further, Suganuma et al. Fig. 1 teaches a middle layer of a neural network, with an elite solution selected from the set. Therefore, it would be reasonable to combine Cao et al. and Suganuma et al. to improve the neural network of Cao et al. because of the following: “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method” (Suganuma et al., Page 498 Section 3). Further, Applicant is reminded that “[The] reason or motivation to modify the reference may often suggest what the inventor has done, but for a different purpose or to solve a different problem. It is not necessary that the prior art suggest the combination to achieve the same advantage or result discovered by applicant. See, e.g., In re Kahn, 441 F.3d 977, 987, 78 USPQ2d 1329, 1336 (Fed. Cir. 2006).” See MPEP 2144.
Applicant asserts that “There is no language in Kim (and in particular paragraphs [0076 and 0085] of Kim) that makes any suggestion to: (1) obtain K different sets of training data, K being an integer more than 1; (2) perform supervised training on the N initial neural networks with each of the K different sets of training data to obtain K training results for each of the N initial neural networks; and (3) select at least one of the middle nodes in the middle layer of the N initial neural networks using the K training results (missing claim limitations) in order to extract features from the images (Examiner's reasoning). Instead, the Examiner's source of reasoning simply discusses that the artificial neural networks used for feature extraction are pre-trained by the learning device. How does pre-training artificial neural networks provide motivation for one skilled in the art to modify Cao to: (1) obtain K different sets of training data, K being an integer more than 1; (2) perform supervised training on the N initial neural networks with each of the K different sets of training data to obtain K training results for each of the N initial neural networks; and (3) select at least one of the middle nodes in the middle layer of the N initial neural networks using the K training results (missing claim limitations)? The Examiner's source of reasoning fails to provide such a rational underpinning. Hence, the Examiner's source of reasoning fails to provide motivation for modifying the teachings of Cao to include the above-cited missing claim limitation. Accordingly, the Examiner has not presented a prima facie case of obviousness for rejecting claims 10 and 17.” (remarks Pg. 42-43).
Examiner response: 
The examiner respectfully disagrees. According to MPEP 2144,
“The strongest rationale for combining references is a recognition, expressly or impliedly in the prior art or drawn from a convincing line of reasoning based on established scientific principles or legal precedent, that some advantage or expected beneficial result would have been produced by their combination. In re Sernaker, 702 F.2d 989, 994-95, 217 USPQ 1, 5-6 (Fed. Cir. 1983). See also Dystar Textilfarben GmbH & Co. Deutschland KG v. C.H. Patrick, 464 F.3d 1356, 1368, 80 USPQ2d 1641, 1651 (Fed. Cir. 2006) ("Indeed, we have repeatedly held that an implicit motivation to combine exists not only when a suggestion may be gleaned from the prior art as a whole, but when the ‘improvement’ is technology-independent and the combination of references results in a product or process that is more desirable, for example because it is stronger, cheaper, cleaner, faster, lighter, smaller, more durable, or more efficient. Because the desire to enhance commercial opportunities by improving a product or process is universal—and even common-sensical—we have held that there exists in these situations a motivation to combine prior art references even absent any hint of suggestion in the references themselves.").” (emphasis added)
In Cao et al., Fig. 5 teaches generating a neural network. Further, Cao et al. Pg. 7 Column 2 discusses constructing a neural network that contains input, middle, and output layers. Moreover, Suganuma et al. Pg. 497, Abstract, states “we attempt to automatically construct CNN architectures for an image classification task based on Cartesian genetic programming (CGP).” Further, Suganuma et al. Fig. 1 teaches a middle layer of a neural network, with an elite solution selected from the set. Moreover, Kim et al. Pg. 4 para [0093] states “In this manner, training data 41 and 42 used for learning of the first artificial neural network 51 and the second artificial neural network 52 are different”. Therefore, it would be reasonable to combine Kim et al. and Cao et al. in view of Suganuma et al. to improve the neural network of Cao et al. because of the following: “The first image and the second image have different features. Therefore, in order to extract features from the first image and the second image, the different artificial neural networks 51 and 52 may be used” (Kim et al., Page 4 Paragraph [0085]). Further, Applicant is reminded that “[The] reason or motivation to modify the reference may often suggest what the inventor has done, but for a different purpose or to solve a different problem. It is not necessary that the prior art suggest the combination to achieve the same advantage or result discovered by applicant. See, e.g., In re Kahn, 441 F.3d 977, 987, 78 USPQ2d 1329, 1336 (Fed. Cir. 2006).” See MPEP 2144.
Applicant's arguments filed on 12/31/2021 with respect to the 35 U.S.C. 103 rejection of claims 11 and 18 have been fully considered but are not persuasive.
Applicant asserts that “There is no language in the cited passage that teaches or suggests that the middle layer of each of the plurality of initial neural network comprises L middle nodes, L being an integer larger than 2, and where the number of the middle nodes in the new middle layer is equal to or less than L. Instead, Suganuma simply teaches constructing CNN architectures for an image classification task based on Cartesian genetic programming.” (Remarks, Pg. 33-34). 
Examiner response: 
The examiner respectfully disagrees. As discussed in the rejection above, Suganuma et al. teaches “the middle layer of each of the plurality of initial neural network comprises L middle nodes, L being an integer larger than 2, and where the number of the middle nodes in the new middle layer is equal to or less than L”. Suganuma et al. Fig. 1 teaches a CNN architecture in which the white square boxes (corresponding to the middle layer) are evaluated based on the white square boxes (corresponding to the middle layer) of the initial neural network. Therefore, the 35 U.S.C. 103 rejection made in the previous office action is maintained.
Applicant asserts that “There is no language in Suganuma (and in particular page 498 of Suganuma) that makes any suggestion to: (1) have the middle layer of each of the plurality of initial neural network comprises L middle nodes, L being an integer larger than 2, and (2) have the number of the middle nodes in the new middle layer be equal to or less than L (missing claim limitations) in order to optimize the CNN architecture to maximize the validation accuracy (Examiner's reasoning). Instead, the Examiner's source of reasoning simply discusses that the invention of Suganuma focuses on optimizing the CNN architecture defined by Cartesian genetic programming. How does optimizing the CNN architecture defined by Cartesian genetic programming provide motivation for one skilled in the art to modify Cao to: (1) have the middle layer of each of the plurality of initial neural network comprises L middle nodes, L being an integer larger than 2, and (2) have the number of the middle nodes in the new middle layer be equal to or less than L (missing claim limitations)? The Examiner's source of reasoning fails to provide such a rational underpinning. Hence, the Examiner's source of reasoning fails to provide motivation for modifying the teachings of Cao to include the above-cited missing claim limitation.” (remarks Pg.46).
Examiner response: 
The examiner respectfully disagrees. According to MPEP 2144,
“The strongest rationale for combining references is a recognition, expressly or impliedly in the prior art or drawn from a convincing line of reasoning based on established scientific principles or legal precedent, that some advantage or expected beneficial result would have been produced by their combination. In re Sernaker, 702 F.2d 989, 994-95, 217 USPQ 1, 5-6 (Fed. Cir. 1983). See also Dystar Textilfarben GmbH & Co. Deutschland KG v. C.H. Patrick, 464 F.3d 1356, 1368, 80 USPQ2d 1641, 1651 (Fed. Cir. 2006) ("Indeed, we have repeatedly held that an implicit motivation to combine exists not only when a suggestion may be gleaned from the prior art as a whole, but when the ‘improvement’ is technology-independent and the combination of references results in a product or process that is more desirable, for example because it is stronger, cheaper, cleaner, faster, lighter, smaller, more durable, or more efficient. Because the desire to enhance commercial opportunities by improving a product or process is universal—and even common-sensical—we have held that there exists in these situations a motivation to combine prior art references even absent any hint of suggestion in the references themselves.").” (emphasis added)
In Cao et al., Fig. 5 teaches generating a neural network. Further, Cao et al. Pg. 7 Column 2 discusses constructing a neural network that contains input, middle, and output layers. Moreover, Suganuma et al. Pg. 497, Abstract, states “we attempt to automatically construct CNN architectures for an image classification task based on Cartesian genetic programming (CGP).” Further, Suganuma et al. Fig. 1 teaches a middle layer of a neural network, with an elite solution selected from the set. Therefore, it would be reasonable to combine Cao et al. and Suganuma et al. to improve the neural network of Cao et al. because of the following: “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method” (Suganuma et al., Page 498 Section 3). Further, Applicant is reminded that “[The] reason or motivation to modify the reference may often suggest what the inventor has done, but for a different purpose or to solve a different problem. It is not necessary that the prior art suggest the combination to achieve the same advantage or result discovered by applicant. See, e.g., In re Kahn, 441 F.3d 977, 987, 78 USPQ2d 1329, 1336 (Fed. Cir. 2006).” See MPEP 2144.
Applicant's arguments filed on 12/31/2021 with respect to the 35 U.S.C. 103 rejection of claims 12 and 19 have been fully considered but are not persuasive.
Applicant asserts that “There is no language in the cited passages of Cao, Suganuma and Wang that teaches or suggests performing unsupervised training on the selected middle nodes. Instead, Wang simply teaches adjusting the weight vector between the input and hidden layer by unsupervised learning.
Neither is there any language in the cited passages of Cao, Suganuma and Wang that teaches or suggests that the unsupervised training comprising biasing the middle nodes such that certain middle nodes are avoided. 
Instead, Wang simply teaches adjusting the weight vector between the input and hidden layer by unsupervised learning. 
Furthermore, Suganuma simply teaches selecting an elite individual from the set of P and C, and then replacing P with the elite individual. As understood by Applicant, the "P" corresponds to the parent and "C" corresponds to the offsprings which is generated by applying the forced mutation to P. See, e.g., page 500 of Suganuma. The Examiner has not explained how replacing P with the elite individual corresponds to biasing the middle nodes such that certain middle nodes are avoided.” (Remarks, Pg. 50). 
Examiner response: 
The examiner respectfully disagrees. As discussed in the rejection above, Wang et al. teaches “performing unsupervised training on the selected middle nodes”. Wang et al. Page 2 Section 4.2 “are input to the first layer and fanned out to the hidden layer. ….. functions turn the input to output, adjusting the weight of the input to the hidden layer” teaches that hidden nodes (corresponding to middle nodes) are updated using unsupervised training. Therefore, the 35 U.S.C. 103 rejection made in the previous office action is maintained.
Applicant asserts that “There is no language in Suganuma (and in particular page 498 of Suganuma) that makes any suggestion to have the unsupervised training comprise biasing the middle nodes such that certain middle nodes are avoided (missing claim limitation) in order to optimize the CNN architecture to maximize the validation accuracy (Examiner's reasoning).  Instead, the Examiner's source of reasoning simply discusses that the invention of Suganuma focuses on optimizing the CNN architecture defined by Cartesian genetic programming.  How does optimizing the CNN architecture defined by Cartesian genetic programming provide motivation for one skilled in the art to modify Cao to have the unsupervised training comprise biasing the middle nodes such that certain middle nodes are avoided (missing claim limitation)? The Examiner's source of reasoning fails to provide such a rational underpinning. Hence, the Examiner's source of reasoning fails to provide motivation for modifying the teachings of Cao to include the above-cited missing claim limitation. Accordingly, the Examiner has not presented a prima facie case of obviousness for rejecting claims 12 and 19.” (remarks Pg.50).
Examiner response: 
The examiner respectfully disagrees. According to MPEP 2144,
“The strongest rationale for combining references is a recognition, expressly or impliedly in the prior art or drawn from a convincing line of reasoning based on established scientific principles or legal precedent, that some advantage or expected beneficial result would have been produced by their combination. In re Sernaker, 702 F.2d 989, 994-95, 217 USPQ 1, 5-6 (Fed. Cir. 1983). See also Dystar Textilfarben GmbH & Co. Deutschland KG v. C.H. Patrick, 464 F.3d 1356, 1368, 80 USPQ2d 1641, 1651 (Fed. Cir. 2006) ("Indeed, we have repeatedly held that an implicit motivation to combine exists not only when a suggestion may be gleaned from the prior art as a whole, but when the ‘improvement’ is technology-independent and the combination of references results in a product or process that is more desirable, for example because it is stronger, cheaper, cleaner, faster, lighter, smaller, more durable, or more efficient. Because the desire to enhance commercial opportunities by improving a product or process is universal—and even common-sensical—we have held that there exists in these situations a motivation to combine prior art references even absent any hint of suggestion in the references themselves.").” (emphasis added)
In Cao et al., Fig. 5 teaches generating a neural network. Further, Cao et al. Pg. 7 Column 2 discusses constructing a neural network that contains input, middle, and output layers. Moreover, Suganuma et al. Pg. 497 Abstract states: “we attempt to automatically construct CNN architectures for an image classification task based on Cartesian genetic programming (CGP).” Further, Suganuma et al. Fig. 1 teaches a middle layer of a neural network that selects an elite solution from the set. Therefore, it would be reasonable to combine Cao et al. and Suganuma et al. to improve the neural network of Cao et al. for the following reason: “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method” (Suganuma et al., Page 498 Section 3). Further, applicant is reminded that “[The] reason or motivation to modify the reference may often suggest what the inventor has done, but for a different purpose or to solve a different problem. It is not necessary that the prior art suggest the combination to achieve the same advantage or result discovered by applicant. See, e.g., In re Kahn, 441 F.3d 977, 987, 78 USPQ2d 1329, 1336 (Fed. Cir. 2006).” See MPEP 2144.
Applicant asserts that “There is no language in Wang (and in particular pages 2 and 4 of Wang) that makes any suggestion to perform unsupervised training on the selected middle nodes, where the unsupervised training comprises biasing the middle nodes (missing claim limitations) in order to solve classifying problems (Examiner's reasoning). Instead, the Examiner's source of reasoning simply discusses that the invention of Wang includes a radial basis function (RBF) network that is a three-layer feed-forward neural network. How does using a radial basis function (RBF) network that is a three-layer feed-forward neural network provide motivation for one skilled in the art to modify Cao to perform unsupervised training on the selected middle nodes, where the unsupervised training comprises biasing the middle nodes (missing claim limitations)? The Examiner's source of reasoning fails to provide such a rational underpinning. Hence, the Examiner's source of reasoning fails to provide motivation for modifying the teachings of Cao to include the above-cited missing claim limitations. Accordingly, the Examiner has not presented a prima facie case of obviousness for rejecting claims 12 and 19.” (remarks Pg.55).
Examiner response: 
The examiner respectfully disagrees. According to MPEP 2144,
“The strongest rationale for combining references is a recognition, expressly or impliedly in the prior art or drawn from a convincing line of reasoning based on established scientific principles or legal precedent, that some advantage or expected beneficial result would have been produced by their combination. In re Sernaker, 702 F.2d 989, 994-95, 217 USPQ 1, 5-6 (Fed. Cir. 1983). See also Dystar Textilfarben GmbH & Co. Deutschland KG v. C.H. Patrick, 464 F.3d 1356, 1368, 80 USPQ2d 1641, 1651 (Fed. Cir. 2006) ("Indeed, we have repeatedly held that an implicit motivation to combine exists not only when a suggestion may be gleaned from the prior art as a whole, but when the ‘improvement’ is technology-independent and the combination of references results in a product or process that is more desirable, for example because it is stronger, cheaper, cleaner, faster, lighter, smaller, more durable, or more efficient. Because the desire to enhance commercial opportunities by improving a product or process is universal—and even common-sensical—we have held that there exists in these situations a motivation to combine prior art references even absent any hint of suggestion in the references themselves.").” (emphasis added)
In Cao et al., Fig. 5 teaches generating a neural network. Further, Cao et al. Pg. 7 Column 2 discusses constructing a neural network that contains input, middle, and output layers. Moreover, Suganuma et al. Pg. 497 Abstract states: “we attempt to automatically construct CNN architectures for an image classification task based on Cartesian genetic programming (CGP).” Further, Suganuma et al. Fig. 1 teaches a middle layer of a neural network that selects an elite solution from the set. Wang et al. Pg. 3 Section 4.2 states: “first is unsupervised learning, which adjusts the weight… The other is supervised learning, which adjusts the weight vector between the hidden and output layer”. Therefore, it would be reasonable to combine Wang et al. and Cao et al. in view of Suganuma et al. to improve the neural network of Cao et al. for the following reasons: “network is a three-layer feed-forward neural network, between the input and the output layers there is a “hidden layer”.” and “The RBF network shows its quickness in training” (Wang et al., Page 2 Section 4.1 and Page 4 Section 7). Further, applicant is reminded that “[The] reason or motivation to modify the reference may often suggest what the inventor has done, but for a different purpose or to solve a different problem. It is not necessary that the prior art suggest the combination to achieve the same advantage or result discovered by applicant. See, e.g., In re Kahn, 441 F.3d 977, 987, 78 USPQ2d 1329, 1336 (Fed. Cir. 2006).” See MPEP 2144.
Applicant's arguments filed on 12/31/2021 with respect to the 35 U.S.C. 103 rejection of claims 13 and 20 have been fully considered but are not persuasive.
Applicant asserts that “There is no language in the cited passages of Cao and Wang that teaches or suggests obtaining N initial conditions, N being an integer larger than 1, each condition corresponding to one of the initial neural networks. Instead, Cao simply teaches that each input word or phrase is initially provided with an input vector which, in many cases, is randomly initialized. The Examiner has not explained how each word or phrase corresponds to a condition corresponding to one of the initial neural networks. 
Neither is there any language in the cited passages of Cao and Wang that teaches or suggests performing unsupervised training of the middle layer of each initial neural network using the corresponding initial condition. Instead, Wang simply teaches adjusting the weight vector between the input and hidden layer using unsupervised learning. 
Neither is there any language in the cited passages of Cao and Wang that teaches or suggests performing supervised training of the output layer of each initial neural network using a set of training data. Instead, Wang simply teaches adjusting the weight vector between the hidden and output layer using supervised learning.” (Remarks, Pg. 58). 
Examiner response: 
The examiner respectfully disagrees. As discussed in the rejection above, Cao et al. teaches “obtaining N initial conditions, N being an integer larger than 1, each condition corresponding to one of the initial neural networks”. Cao et al. Pg. 12 Column 11-12, “provided inputs (for example, words, phrases, and/or documents) … each input word or phrase is initially provided with an input vector which … is randomly initialized. For longer inputs (such as documents), the input is divided into multiple parts, where each part includes its own vector representation”, teaches inputs (corresponding to the initial neural networks), because a neural network contains an input layer and each input provides its own vector representation (corresponding to the conditions). Further, Wang et al. teaches “performing supervised training of the output layer of each initial neural network using a set of training data”. Wang et al. Page 3 Section 4.2 Training, “The training of the RBF network should be divided into two processes … The other is supervised learning, which adjusts the weight vector between the hidden and output layer”, teaches performing supervised training on the output layer of the initial neural networks. Therefore, the 35 U.S.C. 103 rejection made in the previous office action is maintained.
Applicant asserts that “There is no language in Wang (and in particular pages 2 and 4 of Wang) that makes any suggestion to: (1) perform unsupervised training of the middle layer of each initial neural network using the corresponding initial condition; and (2) perform supervised training of the output layer of each initial neural network using a set of training data (missing claim limitations) in order to solve classifying problems (Examiner's reasoning). Instead, the Examiner's source of reasoning simply discusses that the invention of Wang includes a radial basis function (RBF) network that is a three-layer feed-forward neural network. How does using a radial basis function (RBF) network that is a three-layer feed-forward neural network provide motivation for one skilled in the art to modify Cao to: (1) perform unsupervised training of the middle layer of each initial neural network using the corresponding initial condition; and (2) perform supervised training of the output layer of each initial neural network using a set of training data (missing claim limitations)? The Examiner's source of reasoning fails to provide such a rational underpinning. Hence, the Examiner's source of reasoning fails to provide motivation for modifying the teachings of Cao to include the above-cited missing claim limitations. Accordingly, the Examiner has not presented a prima facie case of obviousness for rejecting claims 13 and 20.” (remarks Pg. 62).
Examiner response: 
The examiner respectfully disagrees. According to MPEP 2144,
“The strongest rationale for combining references is a recognition, expressly or impliedly in the prior art or drawn from a convincing line of reasoning based on established scientific principles or legal precedent, that some advantage or expected beneficial result would have been produced by their combination. In re Sernaker, 702 F.2d 989, 994-95, 217 USPQ 1, 5-6 (Fed. Cir. 1983). See also Dystar Textilfarben GmbH & Co. Deutschland KG v. C.H. Patrick, 464 F.3d 1356, 1368, 80 USPQ2d 1641, 1651 (Fed. Cir. 2006) ("Indeed, we have repeatedly held that an implicit motivation to combine exists not only when a suggestion may be gleaned from the prior art as a whole, but when the ‘improvement’ is technology-independent and the combination of references results in a product or process that is more desirable, for example because it is stronger, cheaper, cleaner, faster, lighter, smaller, more durable, or more efficient. Because the desire to enhance commercial opportunities by improving a product or process is universal—and even common-sensical—we have held that there exists in these situations a motivation to combine prior art references even absent any hint of suggestion in the references themselves.").” (emphasis added)
In Cao et al., Fig. 5 teaches generating a neural network. Further, Cao et al. Pg. 7 Column 2 discusses constructing a neural network that contains input, middle, and output layers. Moreover, Suganuma et al. Pg. 497 Abstract states: “we attempt to automatically construct CNN architectures for an image classification task based on Cartesian genetic programming (CGP).” Further, Suganuma et al. Fig. 1 teaches a middle layer of a neural network that selects an elite solution from the set. Wang et al. Pg. 3 Section 4.2 states: “first is unsupervised learning, which adjusts the weight… The other is supervised learning, which adjusts the weight vector between the hidden and output layer”. Therefore, it would be reasonable to combine Wang et al. and Cao et al. in view of Suganuma et al. to improve the neural network of Cao et al. for the following reasons: “network is a three-layer feed-forward neural network, between the input and the output layers there is a “hidden layer”.” and “The RBF network shows its quickness in training” (Wang et al., Page 2 Section 4.1 and Page 4 Section 7). Further, applicant is reminded that “[The] reason or motivation to modify the reference may often suggest what the inventor has done, but for a different purpose or to solve a different problem. It is not necessary that the prior art suggest the combination to achieve the same advantage or result discovered by applicant. See, e.g., In re Kahn, 441 F.3d 977, 987, 78 USPQ2d 1329, 1336 (Fed. Cir. 2006).” See MPEP 2144.
Applicant's arguments filed on 12/31/2021 with respect to the 35 U.S.C. 103 rejection of claims 14 and 21 have been fully considered but are not persuasive.
Applicant asserts that “There is no language in the cited passages of Cao, Suganuma and Wang that teaches or suggests obtaining M initial conditions, M being an integer larger than 2, each condition corresponding to one of M candidate neural networks. 
Instead, Cao simply teaches each input word or phrase is initially provided with an input vector which, in many cases, is randomly initialized. The Examiner has not explained how each word or phrase corresponds to one of the M candidate neural networks. 
Neither is there any language in the cited passages of Cao, Suganuma and Wang that teaches or suggests performing unsupervised training of the middle layer of each candidate neural network using the corresponding initial condition. Instead, Wang simply teaches adjusting the weight vector between the input and hidden layer using unsupervised learning. 
Neither is there any language in the cited passages of Cao, Suganuma and Wang that teaches or suggests performing supervised training of the output layer of each candidate neural network using a set of training data. Instead, Wang simply teaches adjusting the weight vector between the hidden and output layer using supervised learning. 
Neither is there any language in the cited passages of Cao, Suganuma and Wang that teaches or suggests evaluating a performance of each candidate neural network. Instead, as discussed above, Wang simply teaches adjusting the weight vector between the hidden and output layer using supervised learning. The Examiner has not explained how adjusting such a weight vector corresponds to evaluating a performance of each candidate neural network. 
Neither is there any language in the cited passages of Cao, Suganuma and Wang that teaches or suggests selecting N initial neural networks from among the M candidate neural networks using the performances, N being an integer larger than 1 and smaller than M. Instead, Suganuma simply teaches selecting an elite individual from the set of P and C, and then replacing P with the elite individual. The Examiner has not explained how selecting an elite individual, as discussed in Suganuma, corresponds to selecting N initial neural networks from among the M candidate neural networks using the performances.” (Remarks, Pg. 60-61). 
Examiner response: 
The examiner respectfully disagrees. As discussed in the rejection above, Cao et al. teaches “obtaining M initial conditions, M being an integer larger than 2, each condition corresponding to one of M candidate neural networks”. Cao et al. Pg. 12 Column 11-12, “provided inputs (for example, words, phrases, and/or documents) … each input word or phrase is initially provided with an input vector which … is randomly initialized. For longer inputs (such as documents), the input is divided into multiple parts, where each part includes its own vector representation”, teaches inputs (corresponding to the initial neural networks), because a neural network contains an input layer and each input provides its own vector representation (corresponding to the conditions). Wang et al. teaches “performing unsupervised training of the middle layer of each candidate neural network using the corresponding initial condition”. Wang et al. Pg. 3 Section 4.2, “The first is unsupervised learning, which adjusts the weight vector between the input and hidden layer”, teaches unsupervised learning on the middle layer using each input sample (corresponding to the initial conditions). Wang et al. teaches “performing supervised training of the output layer of each candidate neural network using a set of training data”. Wang et al. Pg. 3 Section 4.2, “The other is supervised learning, which adjusts the weight vector between the hidden and output layer”, teaches supervised training on the output layer using each sample. Wang et al. teaches “evaluating a performance of each candidate neural network”. Wang et al. Pg. 2 Section 4.1, “the input vectors and the weight vectors, which have been adjusted by training process, is calculated. Each input sample is sorted to a class”, teaches evaluating each input sample to a class (corresponding to a solution). Suganuma et al. teaches “selecting N initial neural networks from among the M candidate neural networks using the performances, N being an integer larger than 1 and smaller than M.” Suganuma et al. Pg. 500 Section 3.2, “Select an elite individual from the set of P and C, and then replace P with the elite individual”, teaches selection of a solution. Suganuma et al. Figure 1 teaches initial neural nodes numbering more than 1 and fewer than the candidate neural networks. Therefore, the 35 U.S.C. 103 rejection made in the previous office action is maintained.
Applicant asserts that “There is no language in Suganuma (and in particular page 498 of Suganuma) that makes any suggestion to select N initial neural networks from among the M candidate neural networks using the performances, N being an integer larger than 1 and smaller than M (missing claim limitation) in order to optimize the CNN architecture to maximize the validation accuracy (Examiner's reasoning). Instead, the Examiner's source of reasoning simply discusses that the invention of Suganuma focuses on optimizing the CNN architecture defined by Cartesian genetic programming. How does optimizing the CNN architecture defined by Cartesian genetic programming provide motivation for one skilled in the art to modify Cao to select N initial neural networks from among the M candidate neural networks using the performances, N being an integer larger than 1 and smaller than M (missing claim limitation)? The Examiner's source of reasoning fails to provide such a rational underpinning. Hence, the Examiner's source of reasoning fails to provide motivation for modifying the teachings of Cao to include the above-cited missing claim limitation. Accordingly, the Examiner has not presented a prima facie case of obviousness for rejecting claims 14 and 21.” (remarks Pg. 65-66).
Examiner response: 
The examiner respectfully disagrees. According to MPEP 2144,
“The strongest rationale for combining references is a recognition, expressly or impliedly in the prior art or drawn from a convincing line of reasoning based on established scientific principles or legal precedent, that some advantage or expected beneficial result would have been produced by their combination. In re Sernaker, 702 F.2d 989, 994-95, 217 USPQ 1, 5-6 (Fed. Cir. 1983). See also Dystar Textilfarben GmbH & Co. Deutschland KG v. C.H. Patrick, 464 F.3d 1356, 1368, 80 USPQ2d 1641, 1651 (Fed. Cir. 2006) ("Indeed, we have repeatedly held that an implicit motivation to combine exists not only when a suggestion may be gleaned from the prior art as a whole, but when the ‘improvement’ is technology-independent and the combination of references results in a product or process that is more desirable, for example because it is stronger, cheaper, cleaner, faster, lighter, smaller, more durable, or more efficient. Because the desire to enhance commercial opportunities by improving a product or process is universal—and even common-sensical—we have held that there exists in these situations a motivation to combine prior art references even absent any hint of suggestion in the references themselves.").” (emphasis added)
In Cao et al., Fig. 5 teaches generating a neural network. Further, Cao et al. Pg. 7 Column 2 discusses constructing a neural network that contains input, middle, and output layers. Moreover, Suganuma et al. Pg. 497 Abstract states: “we attempt to automatically construct CNN architectures for an image classification task based on Cartesian genetic programming (CGP).” Further, Suganuma et al. Fig. 1 teaches a middle layer of a neural network that selects an elite solution from the set. Therefore, it would be reasonable to combine Cao et al. and Suganuma et al. to improve the neural network of Cao et al. for the following reason: “Our method directly encodes the CNN architectures based on CGP [8, 21, 22] and uses the highly functional modules as the node functions. The CNN architecture defined by CGP is trained using a training dataset, and the validation accuracy is assigned as the fitness of the architecture. Then, the architecture is optimized to maximize the validation accuracy by the evolutionary algorithm. Figure 1 illustrates an overview of our method” (Suganuma et al., Page 498 Section 3). Further, applicant is reminded that “[The] reason or motivation to modify the reference may often suggest what the inventor has done, but for a different purpose or to solve a different problem. It is not necessary that the prior art suggest the combination to achieve the same advantage or result discovered by applicant. See, e.g., In re Kahn, 441 F.3d 977, 987, 78 USPQ2d 1329, 1336 (Fed. Cir. 2006).” See MPEP 2144.
Applicant asserts that “There is no language in Wang (and in particular pages 2 and 4 of Wang) that makes any suggestion to: (1) perform unsupervised training of the middle layer of each candidate neural network using the corresponding initial condition; (2) perform supervised training of the output layer of each candidate neural network using a set of training data; and (3) evaluate a performance of each candidate neural network (missing claim limitations) in order to solve classifying problems (Examiner's reasoning). Instead, the Examiner's source of reasoning simply discusses that the invention of Wang includes a radial basis function (RBF) network that is a three-layer feed-forward neural network. How does using a radial basis function (RBF) network that is a three-layer feed-forward neural network provide motivation for one skilled in the art to modify Cao to: (1) perform unsupervised training of the middle layer of each candidate neural network using the corresponding initial condition; (2) perform supervised training of the output layer of each candidate neural network using a set of training data; and (3) evaluate a performance of each candidate neural network (missing claim limitations)? The Examiner's source of reasoning fails to provide such a rational underpinning. Hence, the Examiner's source of reasoning fails to provide motivation for modifying the teachings of Cao to include the above-cited missing claim limitations. Accordingly, the Examiner has not presented a prima facie case of obviousness for rejecting claims 14 and 21.” (remarks Pg. 68-69).
Examiner response: 
The examiner respectfully disagrees. According to MPEP 2144,
“The strongest rationale for combining references is a recognition, expressly or impliedly in the prior art or drawn from a convincing line of reasoning based on established scientific principles or legal precedent, that some advantage or expected beneficial result would have been produced by their combination. In re Sernaker, 702 F.2d 989, 994-95, 217 USPQ 1, 5-6 (Fed. Cir. 1983). See also Dystar Textilfarben GmbH & Co. Deutschland KG v. C.H. Patrick, 464 F.3d 1356, 1368, 80 USPQ2d 1641, 1651 (Fed. Cir. 2006) ("Indeed, we have repeatedly held that an implicit motivation to combine exists not only when a suggestion may be gleaned from the prior art as a whole, but when the ‘improvement’ is technology-independent and the combination of references results in a product or process that is more desirable, for example because it is stronger, cheaper, cleaner, faster, lighter, smaller, more durable, or more efficient. Because the desire to enhance commercial opportunities by improving a product or process is universal—and even common-sensical—we have held that there exists in these situations a motivation to combine prior art references even absent any hint of suggestion in the references themselves.").” (emphasis added)
In Cao et al., Fig. 5 teaches generating a neural network. Further, Cao et al. Pg. 7 Column 2 discusses constructing a neural network that contains input, middle, and output layers. Moreover, Suganuma et al. Pg. 497 Abstract states: “we attempt to automatically construct CNN architectures for an image classification task based on Cartesian genetic programming (CGP).” Further, Suganuma et al. Fig. 1 teaches a middle layer of a neural network that selects an elite solution from the set. Wang et al. Pg. 3 Section 4.2 states: “first is unsupervised learning, which adjusts the weight… The other is supervised learning, which adjusts the weight vector between the hidden and output layer”. Therefore, it would be reasonable to combine Wang et al. and Cao et al. in view of Suganuma et al. to improve the neural network of Cao et al. for the following reasons: “network is a three-layer feed-forward neural network, between the input and the output layers there is a “hidden layer”.” and “The RBF network shows its quickness in training” (Wang et al., Page 2 Section 4.1 and Page 4 Section 7). Further, applicant is reminded that “[The] reason or motivation to modify the reference may often suggest what the inventor has done, but for a different purpose or to solve a different problem. It is not necessary that the prior art suggest the combination to achieve the same advantage or result discovered by applicant. See, e.g., In re Kahn, 441 F.3d 977, 987, 78 USPQ2d 1329, 1336 (Fed. Cir. 2006).” See MPEP 2144.
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LOKESHA G PATEL whose telephone number is (571)272-6267. The examiner can normally be reached Monday-Friday 8am-5pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at (571) 272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LOKESHA G PATEL/Examiner, Art Unit 2125 

/KAMRAN AFSHAR/Supervisory Patent Examiner, Art Unit 2125