DETAILED ACTION
This final rejection is responsive to amendments and remarks filed 07 January 2020.
Claims 1-3, 5-7, and 9-11 are amended. No claims are added, cancelled, or withdrawn. Therefore, claims 1-12 are presently pending.

Response to Arguments
In view of the amendments, the claim objections and the claim rejections under 35 U.S.C. § 112 are withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of the amendments.
Applicant’s arguments with respect to the rejection of the claims under 35 U.S.C. § 102(a)(1) have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Objections
Claim 1 is objected to because of the following informalities: claim 1 recites the limitation “to learn coupling weight between the units in each layer”; it is unclear whether this “coupling weight” is a single weight or multiple weights that couple the units in each layer. Appropriate correction is required.
Claims 5 and 9 recite the same limitation and are objected to for the same reasons.

Claim Rejections - 35 USC § 112
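The following is a quotation of 35 U.S.C. 112(b):
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
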
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 2-3, 6-7, and 10-11 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 2 recites the limitation “the neural networks randomly selected from the parent group.” There is insufficient antecedent basis for this limitation in the claim.
Claims 6 and 10 recite similar limitations as claim 2 and are rejected for the same reasons.
Claims 3, 7, and 11 are rejected for their dependency on indefinite claims. 
Claim 3 recites the limitation “the plurality of learning durations for the next-generation neural networks.”  There is insufficient antecedent basis for this limitation in the claim.
Further, claim 3 recites the limitations “equal to or more than a threshold” and “less than the threshold.” Claim 3 ultimately depends on claim 1, which recites the limitation “determining whether the second learning using the specific algorithm is to be completed depending on whether the variance value of the predicted error of the neural networks, that have trained for the number of epochs corresponding to the termination epoch number, is less than a threshold.” It is unclear whether “the threshold” recited in claim 3 refers to “a threshold,” as recited in claim 1, or “a threshold,” as recited in claim 3. 
Claims 7 and 11 recite similar limitations as claim 3 and are rejected for the same reasons.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 5, and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Guha et al. (US 5,140,530) (“Guha”) in view of L’Ecuyer (“Uniform random number generation,” 1994, Annals of Operations Research 53, pp. 77-120) (“L’Ecuyer”), Kee et al. (“An Adaptive Genetic Algorithm,” July 2001, Proceedings of the 3rd Annual Conference on Genetic and Evolutionary Computation, pp. 391-397) (“Kee”), and Kalderstam et al. (“Training artificial neural networks directly on the concordance index for censored data using genetic algorithms,” 2013, Artificial Intelligence in Medicine 58, pp. 125-132) (“Kalderstam”).
Regarding claim 1, Guha teaches a non-transitory computer-readable recording medium having stored therein a learning program that causes a computer to execute a process (Guha does not explicitly teach these components, but they would have been required to perform the disclosed methods.) comprising: 
training a plurality of neural networks, having a number of units in each layer …, by learning duration of [at least 1] epoch to learn coupling weight between the units in each layer (Guha, col. 2, lines 49-66, FIGS. 1-2, “The network 10 is illustrated as having three layers (or areas) 12, 14 and 16 but could have more than three layers or as few as one layer if desired. Each of the layers has computational units 18 [a number of units in each layer] joined by connections 19 [coupling weights between the units in each layer] which have variable weights associated. … FIG. 2 illustrates schematically how a population of blueprints 20 (i.e. bit string designs for different neural networks) [a plurality of neural networks] are cyclically updated by a genetic algorithm based on their fitness.” Guha, col. 5, lines 33-35, “The ‘total size’ parameter determines how many computational units 18 the area will have. It ranges from 0 to 7, and is interpreted as the logarithm (base 2) of the actual number of units.” Guha, col. 6, lines 42-43, “The weights are adjusted by a learning rule during the training of the network.” Guha, col. 12, lines 66-68 and col. 13, lines 1-3, “Learning is halted under the first criterion when rms error during the previous epoch [disclosing training of at least 1 epoch] was lower than a given threshold. The learning phase is terminated under the second criterion after a fixed number of epochs [disclosing training of at least 1 epoch] has been counted; this threshold is set by the experimenter according to the problem.”);
determining a termination epoch number from a variance value of a predicted error of the neural networks that have learnt through a first learning of the learning duration of [at least] 1 epoch (Guha, col. 12, lines 66-68, “Learning is halted under the first criterion when rms error during the previous epoch was lower than a given threshold.” If learning is not halted, the learning continues for another epoch, thereby increasing and determining a termination epoch number from a variance value of a predicted error of the neural networks that have learnt through a first learning of the learning duration of at least 1 epoch.); 
performing a second learning for the number of epochs corresponding to the termination epoch number on the neural networks that have learnt through the first learning (Guha, col. 12, lines 66-68, “Learning is halted under the first criterion when rms error during the previous epoch was lower than a given threshold.” If learning is not halted, the learning continues for another epoch, thereby increasing and updating the number of epochs corresponding to the termination epoch number.), 
the second learning using a specific algorithm to optimize the number of units in each layer … in the neural networks that have learnt through the first learning (Guha, “FIG. 2 illustrates schematically how a population of blueprints 20 (i.e. bit string designs for different neural networks) are cyclically updated [including second learning] by a genetic algorithm [a specific algorithm] based on their fitness.” Guha, col. 2, lines 63-68 and col. 3, lines 1-3, “Each new ‘generation’ of the population is created by first sampling the previous generation according to fitness; the method used for differential selection is known to be a near-optimal method of sampling the search space. Novel strings are created by altering selected individuals with genetic operators. Prominent among these is the crossover operator which synthesizes new strings by splicing together segments of two sampled individuals.” Guha, col. 5, lines 33-35, “The ‘total size’ parameter [included in each substring] determines how many computational units 18 the area will have.”); 
determining whether the second learning using the specific algorithm is to be completed depending on whether the variance value of the predicted error of the neural networks, that have trained for the number of epochs corresponding to the termination epoch number, is less than a threshold (Guha, col. 12, lines 64-68 and col. 13, lines 1-3, “Our compromise is to employ two criteria for halting the learning phase. Learning is halted under the first criterion when rms error during the previous epoch was lower than a given threshold. The learning phase is terminated under the second criterion after a fixed number of epochs has been counted; this threshold is set by the experimenter according to the problem.”); and 
in a case where the variance value is less than the threshold, completing the second learning using the specific algorithm, and in a case where the variance value is greater than the threshold, determining the termination epoch number from the variance value and performing the second learning using the specific algorithm for the number of epochs corresponding to the termination epoch number (Guha, col. 12, lines 64-68 and col. 13, lines 1-3, “Our compromise is to employ two criteria for halting the learning phase. Learning is halted under the first criterion when rms error during the previous epoch was lower than a given threshold [in a case where the variance value is less than the threshold]. The learning phase is terminated under the second criterion [in a case where the variance value is greater than the threshold] after a fixed number of epochs has been counted; this threshold is set by the experimenter according to the problem.”).
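For illustration only, the two halting criteria quoted from Guha (col. 12, line 64 to col. 13, line 3) can be sketched as below. This is a minimal sketch, not Guha's implementation: the function name `train_until_halted`, the callback `run_epoch`, and the parameters `rms_threshold` and `max_epochs` are hypothetical names introduced here, and the error sequence in the usage lines is made up for demonstration.

```python
def train_until_halted(run_epoch, rms_threshold, max_epochs):
    """Train epoch by epoch until one of two halting criteria is met.

    run_epoch(epoch) is a hypothetical callback that trains for one epoch
    and returns the rms error observed during that epoch.
    """
    errors = []
    for epoch in range(1, max_epochs + 1):
        rms_error = run_epoch(epoch)
        errors.append(rms_error)
        # First criterion: rms error of the epoch just run is below the threshold.
        if rms_error < rms_threshold:
            return epoch, "rms_below_threshold", errors
    # Second criterion: the fixed number of epochs has been counted.
    return max_epochs, "epoch_limit_reached", errors


# Usage with a made-up error curve (1/epoch), for demonstration only.
epochs_run, reason, _ = train_until_halted(lambda e: 1.0 / e, 0.3, 10)
print(epochs_run, reason)
```

Under either criterion the number of epochs actually run is only known once training stops, which is the sense in which the rejection reads the halting test as determining a termination epoch number.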
Guha does not disclose the (italicized portion of the) non-transitory computer-readable recording medium comprising: 
… having a number of units in each layer determined by using a uniform random number, by learning duration of 1 epoch …
the second learning using a specific algorithm to optimize the number of units in each layer by fixing the number of layers ….
Guha teaches randomly generating a first population of neural networks, each of which parametrizes the number of units in a layer (Guha; col. 3, lines 3-4; col. 4, lines 57-65; and col. 5, lines 33-35), but does not teach using a uniform random number.
However, L’Ecuyer teaches having a number of units in each layer determined by using a uniform random number (L’Ecuyer, p. 79, Section 1.3, “For pseudorandom number generators, one would expect the observations to behave from the outside as if they were the values of i.i.d. random variables, uniformly distributed over U. The set U is often a set of integers of the form {0, …, m - 1}.”).
Both Guha and L’Ecuyer are directed to random generation. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the random generation in Guha to use a uniform random number, as disclosed in L’Ecuyer, to yield predictable results of choosing a uniformly distributed random number of units in a layer.
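The proposed combination can be sketched as follows, for illustration only. The sketch assumes Guha's “total size” parameter (an integer from 0 to 7 interpreted as the base-2 logarithm of the number of units) is drawn uniformly from a set of integers of the form {0, …, m - 1} with m = 8. The function name `sample_units_per_layer` and its parameters are hypothetical, and Python's standard `random` module stands in for whichever generator would actually be used.

```python
import random


def sample_units_per_layer(num_layers, rng, max_exponent=7):
    """Draw a unit count for each layer from a uniform random integer.

    Assumption: the exponent is uniform over {0, ..., max_exponent} and the
    unit count is 2 ** exponent, following the "total size" encoding quoted
    from Guha. Neither reference states this exact procedure.
    """
    units = []
    for _ in range(num_layers):
        exponent = rng.randrange(max_exponent + 1)  # uniform over {0, ..., 7}
        units.append(2 ** exponent)
    return units


# Usage: a seeded generator makes the draw reproducible.
rng = random.Random(0)
print(sample_units_per_layer(3, rng))
```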
Neither Guha nor L’Ecuyer teaches the (italicized portion of the) non-transitory computer-readable recording medium comprising: 
… learning duration of 1 epoch …
the second learning using a specific algorithm to optimize the number of units in each layer by fixing the number of layers ….
However, Kee teaches a learning duration of 1 epoch (Kee, p. 395, Section 4.2, “In the rule-based approach, a training epoch consisted of one generation.”).
Guha teaches learning duration of at least 1 epoch but is not explicit in teaching a learning duration of only 1 epoch. However, Kee teaches a learning duration where one epoch trains one generation. Both Guha and Kee are directed to genetic algorithms. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the learning duration in Guha to be of 1 epoch, as disclosed in Kee, to yield predictable results of assigning one training epoch per generation, as opposed to multiple training epochs per generation. 
None of Guha, L’Ecuyer, or Kee teaches the (italicized portion of the) non-transitory computer-readable recording medium comprising: 
…
the second learning using a specific algorithm to optimize the number of units in each layer by fixing the number of layers ….
However, Kalderstam teaches fixing the number of layers (Kalderstam, p. 127, Section 2.6.1, “To keep the procedure as simple as possible, the actual architecture of the ANNs [artificial neural networks] is fixed” during training using genetic algorithms.).
Both Guha and Kalderstam are directed to training neural networks using genetic algorithms. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Guha to fix the architecture (and therefore the number of layers) during training, as disclosed in Kalderstam. One would be motivated to do so, since it keeps “the procedure as simple as possible” (Kalderstam, p. 127, Section 2.6.1).

Regarding claim 5, claim 5 is directed to a learning method corresponding to claim 1. Therefore the rejection made to claim 1 is applied to claim 5.

Regarding claim 9, claim 9 is directed to an information processing apparatus comprising a processor that executes the process recited in claim 1. Therefore the rejection made to claim 1 is applied to claim 9.

Claims 2-4, 6-8, and 10-12 are rejected under 35 U.S.C. 103 as being unpatentable over Guha in view of L’Ecuyer, Kee, and Kalderstam; further in view of Arifovic et al. (“Using genetic algorithms to select architecture of a feedforward artificial neural network,” 2001, Physica A, pp. 574-594) (“Arifovic”).
Regarding claim 2, Guha in view of L’Ecuyer, Kee, and Kalderstam teaches the non-transitory computer-readable recording medium according to claim 1.
Guha further teaches the non-transitory computer-readable recording medium, further comprising, 
generating a plurality of next-generation neural networks whose number is identical to the number of the plurality of neural networks (Guha, col. 7, lines 36-38 and FIG. 15, “The basic plan for generating each new generation [a plurality of next-generation neural networks] is given in FIG. 15.” The method iteratively adds individuals into the new generation until it is full, and it is suggested that the population size remains constant.), wherein 
the training includes setting the neural networks for a parent group and executing the learning duration of [at least] 1 epoch on the neural networks (Guha, col. 2, lines 63-66 and col. 3, lines 3-4, “FIG. 2 illustrates schematically how a population of blueprints 20 (i.e. bit string designs for different neural networks) are cyclically updated by a genetic algorithm based on their fitness. … The method begins with a population of randomly generated bit strings 20 [a parent group].” Guha, col. 12, lines 64-68 and col. 13, lines 1-3, “Our compromise is to employ two criteria for halting the learning phase. Learning is halted under the first criterion when rms error during the previous epoch was lower than a given threshold. The learning phase is terminated under the second criterion after a fixed number of epochs has been counted; this threshold is set by the experimenter according to the problem.”), 
the performing includes generating neural networks of child individuals through crossover by using the neural networks … selected from the parent group to generate a child group (Guha, col. 2, lines 63-68 and col. 3, lines 1-3, “Each new ‘generation’ of the population is created by first sampling the previous generation according to fitness; the method used for differential selection is known to be a near-optimal method of sampling the search space. Novel strings are created by altering selected individuals with genetic operators. Prominent among these is the crossover operator which synthesizes new strings by splicing together segments of two sampled individuals.”), and 
performing, on each neural network of the child individuals included in the child group, the second learning using the specific algorithm for the number of epochs corresponding to the determined termination epoch number (Guha, col. 2, lines 63-66 and FIG. 2, “FIG. 2 illustrates schematically how a population of blueprints 20 (i.e. bit string designs for different neural networks) are cyclically updated by a genetic algorithm based on their fitness.” Guha, col. 12, lines 64-68 and col. 13, lines 1-3, “Our compromise is to employ two criteria for halting the learning phase. Learning is halted under the first criterion when rms error during the previous epoch was lower than a given threshold. The learning phase is terminated under the second criterion after a fixed number of epochs has been counted; this threshold is set by the experimenter according to the problem.”), 
the generating includes selecting, from the parent group and the child group, a top predetermined number of neural networks having a small predicted error and whose number is identical to the number of the neural networks of the parent group, and generating the plurality of next-generation neural networks (Guha, col. 2, lines 63-68, “Each new ‘generation’ of the population is created by first sampling the previous generation according to fitness; the method used for differential selection is known to be a near-optimal method of sampling the search space. Novel strings are created by altering selected individuals with genetic operators.” Guha, col. 7, lines 36-43 and FIG. 15, “The basic plan for generating each new generation [a plurality of next-generation neural networks] is given in FIG. 15. … A final step was added to insure that the best individual from generation i was always retained in generation i+1.” The method iteratively adds individuals into the new generation until it is full, and it is suggested that the population size remains constant. Guha, col. 8, lines 26-61, “Suitable improvements over generations can only be accomplished if the evaluation function used to measure the fitness of a network is appropriate. … if accuracy and noise tolerance is more crucial, then the performance on noisy input patterns would be given a higher weight.”), and 
the determining includes determining whether the second learning using the specific algorithm is to be completed for the selected next-generation neural networks (Guha, col. 12, lines 64-68 and col. 13, lines 1-3, “Our compromise is to employ two criteria for halting the learning phase. Learning is halted under the first criterion when rms error during the previous epoch was lower than a given threshold. The learning phase is terminated under the second criterion after a fixed number of epochs has been counted; this threshold is set by the experimenter according to the problem.”).
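For illustration only, the selection step recited in claim 2 (choosing, from the parent group and the child group together, the networks with the smallest predicted error so that the next generation has the same size as the parent group) can be sketched as below. This follows the claim language as recited and is not presented as Guha's procedure, which is quoted above as retaining the best individual. The function name `select_next_generation` and the `predicted_error` callback are hypothetical.

```python
def select_next_generation(parent_group, child_group, predicted_error):
    """Keep the lowest-error networks from parents and children combined.

    predicted_error is a hypothetical callback mapping a network to its
    predicted error. The returned list has the same length as parent_group.
    """
    combined = list(parent_group) + list(child_group)
    ranked = sorted(combined, key=predicted_error)  # smallest error first
    return ranked[:len(parent_group)]


# Usage with made-up network labels and error values.
errors = {"p1": 0.40, "p2": 0.10, "c1": 0.25, "c2": 0.05}
print(select_next_generation(["p1", "p2"], ["c1", "c2"], errors.get))
```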
Guha teaches learning duration of at least 1 epoch but is not explicit in teaching a learning duration of only 1 epoch. 
However, Kee teaches a learning duration of 1 epoch (Kee, p. 395, Section 4.2, “In the rule-based approach, a training epoch consisted of one generation.”).
Both Guha and Kee are directed to genetic algorithms. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the learning duration in Guha to be of 1 epoch, as disclosed in Kee, to yield predictable results of assigning one training epoch per generation, as opposed to multiple training epochs per generation. 
Guha does not disclose the (italicized portion of the) non-transitory computer-readable recording medium, further comprising,
… 
the performing includes generating neural networks of child individuals through crossover by using the neural networks randomly selected from the parent group to generate a child group 
…. 
However, Arifovic teaches crossover by using the neural networks randomly selected from the parent group to generate a child group (Arifovic, p. 581, Section 3, “Crossover exchanges parts of randomly selected binary strings. First, two binary strings are selected from the mating pool at random.” Arifovic, p. 579, Section 3, “Each binary string … encodes a neural network architecture.”).
Guha teaches crossover using neural networks selected from a parent group to generate a child group but does not explicitly disclose randomly selecting parent neural networks to perform crossover. However, Arifovic is also directed to crossover using neural networks selected from a parent group to generate a child group and teaches randomly selecting parent neural networks to perform crossover. It would have been obvious to modify the parent selection in crossover in Guha to utilize random parent selection, as disclosed in Arifovic, to yield predictable results of selecting neural networks from a parent group to generate a child group through crossover.
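The random parent selection quoted from Arifovic can be sketched as follows, for illustration only. The sketch assumes equal-length binary strings of at least two bits and a single crossover point, which is one common way to “exchange parts” of two strings; the quoted passage does not specify the number of crossover points. The function name `crossover_random_parents` is hypothetical.

```python
import random


def crossover_random_parents(mating_pool, rng):
    """Pick two parents at random and exchange their tails at a random point.

    Assumptions: all strings in mating_pool have the same length (>= 2) and
    a single crossover point is used.
    """
    parent_a, parent_b = rng.sample(mating_pool, 2)  # two distinct pool entries
    point = rng.randrange(1, len(parent_a))          # cut strictly inside the string
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


# Usage with made-up bit strings standing in for encoded architectures.
pool = ["00000000", "11111111", "01010101"]
print(crossover_random_parents(pool, random.Random(1)))
```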

Regarding claim 3, Guha in view of L’Ecuyer, Kee, Kalderstam, and Arifovic teaches the non-transitory computer-readable recording medium according to claim 2.
Guha further teaches the non-transitory computer-readable recording medium, wherein the determining includes, 
in a case where the variance value of each of a plurality of previous-generation neural networks, which are previous execution targets, is equal to or more than a threshold, determining that the plurality of learning durations for the next-generation neural networks is a value that is obtained by subtracting a predetermined number from a plurality of previous learning durations (Guha, col. 12, lines 64-68 and col. 13, lines 1-3, “Our compromise is to employ two criteria for halting the learning phase. Learning is halted under the first criterion when rms error during the previous epoch was lower than a given threshold. The learning phase is terminated under the second criterion after a fixed number of epochs has been counted [in a case where the variance value is equal to or more than a threshold]; this threshold is set by the experimenter according to the problem.” When learning is halted, the plurality of learning durations for the next-generation neural networks is a value that is obtained by subtracting a predetermined number of zero from a plurality of previous learning durations.) and,
in a case where the variance value of the accuracy of each of the previous-generation neural networks, which are the previous execution targets, is less than the threshold, determining that the plurality of learning durations is a value that is obtained by adding a predetermined number to the plurality of previous learning durations (Guha, col. 12, lines 64-68 and col. 13, lines 1-3, “Our compromise is to employ two criteria for halting the learning phase. Learning is halted under the first criterion when rms error during the previous epoch was lower than a given threshold [in a case where the variance value is less than the threshold]. The learning phase is terminated under the second criterion after a fixed number of epochs has been counted; this threshold is set by the experimenter according to the problem.” When learning is halted in a learning phase, a plurality of learning durations has accumulated, and when the next generation is learned, at least a predetermined number of one epoch is added to the plurality of learning durations.).
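For illustration only, the adjustment recited in claim 3 can be sketched as below. The sketch follows the claim language as recited rather than any cited reference. It assumes a population variance over the predicted errors and a floor of one epoch, neither of which is stated in the claim, and the function name `next_learning_duration` and its parameters are hypothetical.

```python
from statistics import pvariance


def next_learning_duration(previous_duration, predicted_errors, threshold, step=1):
    """Shorten or lengthen the learning duration based on error variance.

    Assumptions: population variance is used, and the duration is never
    reduced below one epoch.
    """
    variance = pvariance(predicted_errors)
    if variance >= threshold:
        # Variance equal to or more than the threshold: subtract the step.
        return max(1, previous_duration - step)
    # Variance less than the threshold: add the step.
    return previous_duration + step


# Usage with made-up predicted errors.
print(next_learning_duration(5, [0.1, 0.5, 0.9], 0.05))
print(next_learning_duration(5, [0.2, 0.2, 0.2], 0.05))
```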

Regarding claim 4, Guha in view of L’Ecuyer, Kee, Kalderstam, and Arifovic teaches the non-transitory computer-readable recording medium according to claim 1.
Guha further teaches the non-transitory computer-readable recording medium, wherein the specific algorithm is a genetic algorithm (Guha, col. 2, lines 63-66 and FIG. 2, “FIG. 2 illustrates schematically how a population of blueprints 20 (i.e. bit string designs for different neural networks) are cyclically updated by a genetic algorithm based on their fitness.”).

Regarding claims 6-8, claims 6-8 are directed to a learning method corresponding to claims 2-4, respectively. Therefore the rejections made to claims 2-4 are applied to claims 6-8.

Regarding claims 10-12, claims 10-12 are directed to an information processing apparatus comprising a processor that executes the process recited in claims 2-4, respectively. Therefore the rejections made to claims 2-4 are applied to claims 10-12.




Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CATHERINE F LEE whose telephone number is (571)270-7487.  The examiner can normally be reached on Monday thru Friday, 10:00AM-6:00PM EDT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/C.F.L./ Examiner, Art Unit 2124
/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124