DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Status
The present application is being examined under the claims filed on 06/20/2019.
Claims 1-16 are rejected.
Claims 1-16 are pending.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 06/20/2019 was filed.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement(s) is/are being considered by the examiner.

Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The certified copy has been filed in parent Application No. JP2018-124226, filed on 06/29/2018.

Drawings
The drawings are objected to because reference characters S241 – S249 in Figure 27 should read S141 – S149, consistent with pp. 76-78 of the Specification.  Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 
The following title is suggested: Evaluation and training method for learning models using a Bayesian statistical model checking method.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-4 and 10-15 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception.

Step 1: This part of the eligibility analysis evaluates whether the claim falls within any statutory category. MPEP 2106.03:
Claims 1-4 and 10-15 recite a method/process; therefore, they fall into one of the statutory categories of invention.

Analysis of Claim 1
Step 2A Prong One: This part of the eligibility analysis evaluates whether the claim recites a judicial exception. As explained in MPEP 2106.04(II) and the 2019 PEG, a claim “recites” a judicial exception when the judicial exception is “set forth” or “described” in the claim. 
The claim recites a judicial exception (i.e., an abstract idea) without significantly more. For example, the applicant’s claim limitations, under the broadest reasonable interpretation, cover activities classified as mathematical concepts. Abstract ideas classified as mathematical concepts include mathematical relationships, mathematical formulas or equations, and mathematical calculations, see MPEP 2106.04(a)(2), as highlighted in the claim analysis below.
The claim recites, inter alia: 
(C) determining whether or not the first and second execution results satisfy a predetermined logical formula … (Mental process. For example, a person can reasonably determine whether or not the first and second execution results satisfy a predetermined logical formula (values).); and 
(D) comparing …, using a Bayesian statistical model checking method, respective behaviors of the first and second learning models with each other on the basis of a result of the determination in the step (C). (Mental process. For example, a person can reasonably compare respective behaviors (outputs) of the first and second learning models.) (Mathematical calculation. For example, a person can calculate Bayes factor and related steps used in Bayesian statistical model checking method.)
The claim falls within the “mathematical concepts” and/or “mental processes” grouping of abstract ideas as discussed above. Therefore, the claim recites an abstract idea.

Step 2A Prong Two: This part of the eligibility analysis evaluates whether the claim recites additional elements that integrate the judicial exception into a practical application. 
The claim recites, inter alia:
(A) generating a first execution result of a first learning model as an exemplar model using checking data …; (merely generating a result using a processor as a tool.), 
(B) generating a second execution result of a second learning model using the checking data …; (merely generating a result using a processor as a tool.),
using a memory and a processor; by the processor; by the computer (Generic components, recited at a high level.)
In this case, after considering those additional elements individually and in combination, it is determined that those additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.

Step 2B: This part of the eligibility analysis evaluates whether the claim as a whole amounts to significantly more than the recited exception, i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. MPEP 2106.05.
The claim recites, inter alia: 
(A) generating a first execution result of a first learning model as an exemplar model using checking data …; (merely generating a result using a processor as a tool.  This limitation, as broadly recited is similar to “generating a menu … as performed by generic computer components”, Apple, Inc. v. Ameranth, Inc., 842 F.3d 1229, 1243-44, 120 USPQ2d 1844, 1855-57 (Fed. Cir. 2016) and set forth in MPEP 2106.05(f).), 
(B) generating a second execution result of a second learning model using the checking data …; (merely generating a result using a processor as a tool.  This limitation, as broadly recited is similar to “generating a menu … as performed by generic computer components”, Apple, Inc. v. Ameranth, Inc., 842 F.3d 1229, 1243-44, 120 USPQ2d 1844, 1855-57 (Fed. Cir. 2016) and set forth in MPEP 2106.05(f).),
using a memory and a processor; by the processor; by the computer (mere physical or tangible implementation of an exception is not in itself an inventive concept and does not guarantee eligibility, Alice Corp. in MPEP 2106.05.I.B.)
Those additional elements do not add significantly more to the exception when considered separately and in combination and/or do not amount to an inventive concept to the claim. 
In summary, the claim recites an abstract idea that is not integrated into a practical application and does not provide additional elements that amount to significantly more. As such, taken as a whole, the claim is ineligible under 35 U.S.C. 101.

Analysis of Claim 2
Step 2A Prong One: The claim recites, inter alia:
calculating a Bayes factor for a hypothesis associated with an establishment probability of the logical formula; (Mathematical calculation)
performing Bayesian hypothetical testing to determine whether or not the establishment probability is not less than a probability threshold (Mental process); and 
evaluating, based on a result of the Bayesian hypothetical testing, behavioral equivalence between the first learning model and the second learning model (Mental process) 
The claim falls within the “mathematical concepts” and/or “mental processes” groupings of abstract ideas as discussed above. Therefore, the claim recites an abstract idea.
Step 2A Prong Two: No additional element integrates the abstract idea into a practical application.
Step 2B: No additional element adds an inventive concept to the claim. 
In summary, the claim recites an abstract idea that is not integrated into a practical application and does not provide additional elements that amount to significantly more. As such, taken as a whole, the claim is ineligible under 35 U.S.C. 101.

Analysis of Claim 3
Step 2A Prong One: The claim recites, inter alia:
calculating a confidence interval which satisfies an establishment probability of the logical formula; (Mathematical calculation)
calculating a posterior probability on the basis of the confidence interval (Mathematical calculation); and 
evaluating, on the basis of the posterior probability, behavioral equivalence between the first learning model and the second learning model (Mental process)
The claim falls within the “mathematical concepts” and/or “mental processes” groupings of abstract ideas as discussed above. Therefore, the claim recites an abstract idea.
Step 2A Prong Two: No additional element integrates the abstract idea into a practical application.
Step 2B: No additional element adds an inventive concept to the claim. 
In summary, the claim recites an abstract idea that is not integrated into a practical application and does not provide additional elements that amount to significantly more. As such, taken as a whole, the claim is ineligible under 35 U.S.C. 101.

Analysis of Claim 4
Step 2A Prong One: The claim recites, inter alia:  
determining n-th (n is an integer of not less than 1) and lower order differences in output sequence data corresponding to the sequence data (Mental process)
wherein the step (C) includes determining whether or not the first and second execution results including one or more of the n- th and lower order differences satisfy the logical formula (Mental process)
The claim falls within the “mathematical concepts” and/or “mental processes” groupings of abstract ideas as discussed above. Therefore, the claim recites an abstract idea.
Step 2A Prong Two: The claim recites, inter alia:  
sequentially inputting sequence data as the checking data 
In this case, after considering those additional elements individually and in combination, it is determined that those additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
Step 2B: The claim recites, inter alia:  
sequentially inputting sequence data as the checking data (Typically, “inputting data” is treated under MPEP 2106.05(d)(II) as well-understood, routine, conventional (WURC) activity)
Those additional elements do not add significantly more to the exception when considered separately and in combination and/or do not amount to an inventive concept to the claim. 
In summary, the claim recites an abstract idea that is not integrated into a practical application and does not provide additional elements that amount to significantly more. As such, taken as a whole, the claim is ineligible under 35 U.S.C. 101.
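
For illustration only, the “n-th and lower order differences in output sequence data” recited in claim 4 can be sketched as successive finite differences of an output sequence. The sketch below is an assumption about how such differences might be computed (forward differences of a numeric list); the function name differences_up_to and the sample values are hypothetical and are not taken from the instant specification.

```python
def differences_up_to(seq, n):
    """Return the 1st through n-th order forward differences of seq.

    Assumption: "order difference" means the forward difference
    d[i] = x[i + 1] - x[i], applied repeatedly.
    """
    result = []
    current = list(seq)
    for _ in range(n):
        # Each pass shortens the sequence by one element.
        current = [b - a for a, b in zip(current, current[1:])]
        result.append(current)
    return result


# Hypothetical output sequence data.
outputs = [1, 4, 9, 16]
first_and_second = differences_up_to(outputs, 2)
```

Each returned list is one element shorter than the previous one, so the order n cannot usefully exceed the sequence length minus one.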

Analysis of Claim 10
Step 2A Prong One: This part of the eligibility analysis evaluates whether the claim recites a judicial exception. As explained in MPEP 2106.04(II) and the 2019 PEG, a claim “recites” a judicial exception when the judicial exception is “set forth” or “described” in the claim. 
The claim recites a judicial exception (i.e., an abstract idea) without significantly more. For example, the applicant’s claim limitations, under the broadest reasonable interpretation, cover activities classified as mathematical concepts. Abstract ideas classified as mathematical concepts include mathematical relationships, mathematical formulas or equations, and mathematical calculations, see MPEP 2106.04(a)(2), as highlighted in the claim analysis below.
The claim recites, inter alia: 
(4) comparing respective behaviors of the first learning model and the pre-trained second learning model with each other (Mental process. For example, a person can reasonably compare respective behaviors (outputs) of the first learning model and the pre-trained second learning model with each other.) 
The claim falls within the “mathematical concepts” and/or “mental processes” grouping of abstract ideas as discussed above. Therefore, the claim recites an abstract idea.

Step 2A Prong Two: This part of the eligibility analysis evaluates whether the claim recites additional elements that integrate the judicial exception into a practical application. 
The claim recites, inter alia: 
(1) obtaining, using training data, a first output result based on a first learning model as a teacher model; 
(2) obtaining, using the training data, a second output result based on a second learning model as a student model; 
(3) performing, using an evaluation parameter based on the first and second output results, training of the second learning model; 
In this case, after considering those additional elements individually and in combination, it is determined that those additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.

Step 2B: This part of the eligibility analysis evaluates whether the claim as a whole amounts to significantly more than the recited exception, i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. MPEP 2106.05.
The claim recites, inter alia: 
(1) obtaining, using training data, a first output result based on a first learning model as a teacher model (Typically, “receiving data” or “obtaining data” is treated under MPEP 2106.05(d)(II) as well-understood, routine, conventional (WURC) activity.); 
(2) obtaining, using the training data, a second output result based on a second learning model as a student model (Typically, “receiving data” or “obtaining data” is treated under MPEP 2106.05(d)(II) as well-understood, routine, conventional (WURC) activity.); 
(3) performing, using an evaluation parameter based on the first and second output results, training of the second learning model (The limitation of “performing, using an evaluation parameter … training of the second learning model” is an act or action which is beyond mental.  As broadly recited, however, this limitation can be treated as mere instructions to apply an exception (MPEP 2106.05(f)) since no details as to how the “training” occurs are recited.); 
Those additional elements do not add significantly more to the exception when considered separately and in combination and/or do not amount to an inventive concept to the claim. 
In summary, the claim recites an abstract idea that is not integrated into a practical application and does not provide additional elements that amount to significantly more. As such, taken as a whole, the claim is ineligible under 35 U.S.C. 101.

Analysis of Claim 11
Step 2A Prong One: The claim recites, inter alia:
(4-3) determining whether or not the first and second execution results satisfy a logical formula; and (4-4) evaluating, using a Bayesian statistical model checking method, behavioral equivalence between the first learning model and the second learning model on the basis of a result of the determination in the step (4-3) (Mental process)
The claim falls within the “mathematical concepts” and/or “mental processes” groupings of abstract ideas as discussed above. Therefore, the claim recites an abstract idea.
Step 2A Prong Two: The claim recites, inter alia:  
(4-1) obtaining, using checking data, a first execution result based on the first learning model; (4-2) obtaining, using the checking data, a second execution result based on the second learning model; 
In this case, after considering those additional elements individually and in combination, it is determined that those additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
Step 2B: The claim recites, inter alia:  
(4-1) obtaining, using checking data, a first execution result based on the first learning model; (4-2) obtaining, using the checking data, a second execution result based on the second learning model (WURC activity, MPEP 2106.05(d)(II)); 

In summary, the claim recites an abstract idea that is not integrated into a practical application and does not provide additional elements that amount to significantly more. As such, taken as a whole, the claim is ineligible under 35 U.S.C. 101.

Analysis of Claim 12
Step 2A Prong One: The claim recites, inter alia:  
wherein the step (1) includes adjusting, using an adjustment parameter, the first output data on the basis of the label, wherein the step (3) includes performing, using an evaluation parameter based on the adjusted first output data and the second output result, training of the second learning model, and wherein the step (4) includes changing the adjustment parameter when the behaviors do not satisfy a predetermined criterion (Mental process)
The claim falls within the “mathematical concepts” and/or “mental processes” groupings of abstract ideas as discussed above. Therefore, the claim recites an abstract idea.
Step 2A Prong Two: No additional element integrates the abstract idea into a practical application.
Step 2B: No additional element adds an inventive concept to the claim. 
In summary, the claim recites an abstract idea that is not integrated into a practical application and does not provide additional elements that amount to significantly more. As such, taken as a whole, the claim is ineligible under 35 U.S.C. 101.

Analysis of Claim 13
Step 2A Prong One: The claim recites, inter alia:  
calculating (n+k)-th (each of n and k is an integer of not less than 0) and lower order differences in output sequence data corresponding to the sequence data, (Mathematical calculation)
(4-3) determining whether or not the first and second execution results including the n-th and lower order differences satisfy a logical formula; (4-4) evaluating, using a Bayesian statistical model checking method, behavioral equivalence between the first learning model and the second learning model on the basis of a result of the determination in the step (4-3); and (4-5) selecting, when the behavioral equivalence satisfies a predetermined criterion in the step (4-4), a most accurate learning model from among (n+1) learning models (Mental process)
The claim falls within the “mathematical concepts” and/or “mental processes” groupings of abstract ideas as discussed above. Therefore, the claim recites an abstract idea.
Step 2A Prong Two: The claim recites, inter alia:
wherein the steps (1) and (2) include: sequentially inputting sequence data as the training data 
wherein the step (3) includes performing, using an evaluation parameter based on the first and second output results including the (n+k)-th and lower order differences, training of the second learning model to construct (n+k+1) pre-trained models, 
wherein the step (4) includes the steps of: acquiring, from among the (n+k+1) pre-trained models, partial models including n-th and lower order differences; 
(4-1) obtaining, using checking data as sequence data, a first execution result based on the first learning model; (4-2) obtaining, using the checking data, a second execution result based on the partial models; 
In this case, after considering those additional elements individually and in combination, it is determined that those additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
Step 2B: The claim recites, inter alia:  
wherein the steps (1) and (2) include: sequentially inputting sequence data as the training data (WURC activity, MPEP 2106.05(d)(II))
wherein the step (3) includes performing, using an evaluation parameter based on the first and second output results including the (n+k)-th and lower order differences, training of the second learning model to construct (n+k+1) pre-trained models, (Mere instructions to apply an exception (MPEP 2106.05(f)))
wherein the step (4) includes the steps of: acquiring, from among the (n+k+1) pre-trained models, partial models including n-th and lower order differences; (Mere instructions to apply an exception (MPEP 2106.05(f)))
(4-1) obtaining, using checking data as sequence data, a first execution result based on the first learning model; (4-2) obtaining, using the checking data, a second execution result based on the partial models; (WURC activity, MPEP 2106.05(d)(II))
Those additional elements do not add significantly more to the exception when considered separately and in combination and/or do not amount to an inventive concept to the claim. 
In summary, the claim recites an abstract idea that is not integrated into a practical application and does not provide additional elements that amount to significantly more. As such, taken as a whole, the claim is ineligible under 35 U.S.C. 101.

Analysis of Claim 14
Step 2A Prong One: The claim recites, inter alia:  
when the behavioral equivalence does not satisfy the predetermined criterion in the step (4-4), at least one of the n and k is changed (Mental process)

Step 2A Prong Two: No additional element integrates the abstract idea into a practical application.
Step 2B: No additional element adds an inventive concept to the claim. 
In summary, the claim recites an abstract idea that is not integrated into a practical application and does not provide additional elements that amount to significantly more. As such, taken as a whole, the claim is ineligible under 35 U.S.C. 101.

Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claim 10 recites the limitation "the pre-trained second learning model". There is insufficient antecedent basis for this limitation in the claim.  

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claim(s) 10 and 16 is/are rejected under 35 U.S.C. 102(a)(1) as anticipated by Li et al. (US 10643602 B2, hereinafter Li).

Regarding claim 10, Li teaches: A training method which is implemented by using a memory and a processor (Fig. 9 e.g., elements 902 and 904), the training method comprising the steps of: 
(1) obtaining, using training data, a first output result based on a first learning model as a teacher model; (2) obtaining, using the training data, a second output result based on a second learning model as a student model ([Col. 11 Ln. 55-56] e.g., “the results from the teacher model 204 and the student model” Examiner notes that the “teacher model” is mapped to the “first learning model”, and the “student model” is mapped to the “second learning model”.);  
(3) performing, using an evaluation parameter based on the first and second output results, training of the second learning model ([Col. 10 ln. 4-9] e.g., “Over successive epochs of training of the student model 208, the weights applied to various inputs are adjusted to minimize the divergence score between the two speech recognition models 204, 208. As will be appreciated, only the parameters of the student model 208 are adjusted during the student model training.” Examiner notes that the “student model” is mapped to the “second learning model”.); and 
(4) comparing respective behaviors of the first learning model and the pre-trained second learning model with each other (Fig. 3 e.g., element 304, [Col. 11 ln. 29-32] e.g., “At operation 410, a check is made to determine if the behavior of the student model 208 converges with the behavior of the teacher model 204.”).   
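
For illustration only, the teacher/student training quoted from Li above (adjusting only the student's parameters to minimize a divergence score between the two models) can be sketched as follows. This is a minimal sketch under stated assumptions: the divergence score is taken to be a Kullback-Leibler divergence between output distributions, the student is reduced to a vector of logits, and the names softmax, kl_divergence and train_student are hypothetical. It is not Li's implementation.

```python
import math


def softmax(logits):
    """Convert logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions (assumed divergence score)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def train_student(teacher_probs, student_logits, lr=0.5, epochs=100):
    """Gradient descent on the student's logits only; the teacher is fixed."""
    logits = list(student_logits)
    for _ in range(epochs):
        q = softmax(logits)
        # Gradient of KL(teacher || student) w.r.t. student logit k is q_k - p_k.
        logits = [z - lr * (qk - pk)
                  for z, qk, pk in zip(logits, q, teacher_probs)]
    return logits


# Hypothetical teacher output for one input; the student starts uniform.
teacher = [0.7, 0.2, 0.1]
before = kl_divergence(teacher, softmax([0.0, 0.0, 0.0]))
after = kl_divergence(teacher, softmax(train_student(teacher, [0.0, 0.0, 0.0])))
```

A convergence check of the kind quoted from Li at operation 410 would then compare the value of after against a chosen tolerance.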

Regarding claim 16, A computer readable storage medium for causing a computer to implement the method of evaluating the learning models according to claim 10 (see claim 10 for the citation as the rationale for this limitation. Examiner notes that the term “computer readable storage medium”, under the broadest reasonable interpretation (BRI), covers an ineligible signal per se and the instant specification is silent, but claim 16 is dependent on claim 10 which recites a processor.).

Claim Rejections - 35 USC § 103
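In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  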
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1 and 9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (“Consistency of Bayes factor for nonnested model selection when the model dimension grows” 2016, hereinafter Wang) in view of Corani et al. ("Statistical comparison of classifiers through Bayesian hierarchical modelling " 2017, hereinafter Corani).

Regarding claim 1, Wang teaches: A method of evaluating [learning] models [by using a memory and a processor], the method comprising the steps of: 
(A) generating a first execution result of a first learning model as an exemplar model using checking data by the processor; (B) generating a second execution result of a second learning model using the checking data by the processor ([p. 2 ¶ 2] e.g., “In the class of linear regression models, we often assume that there is an unknown subset of the important predictors which contributes to the prediction of Y or has an impact on the response variable Y. This is by natural a model selection problem where we would like to select a linear model by identifying the important predictors in this subset. Suppose that we have two such linear regression models Mj and Mi, with dimensions j and i,

[equation image omitted]” Examiner notes that model Mj and model Mi each generate a result Y.); 
(C) determining whether or not the first and second execution results satisfy a predetermined logical formula by the processor ([p. 9 ¶ 1] e.g., “Second, Theorem 4 can be extended to the case of nested model comparisons (i.e., Mi is nested in Mj) by assuming that M0 = Mi. Third, the Bayes factor depends on the choice of the base model through the value of δ∗j0, and therefore, to enlarger the consistency region in (3.4), we need to make δ∗j0 be as large as possible. This justifies that the null model M0 would be the best choice as the base model. Fourth, the lower bound of δji, denoted by k(r, δj0), is a bounded decreasing function in r and satisfies that for any δj0 > 0” [p. 9 ¶ 3] e.g., “It is interesting to observe that the asymptotic behaviors of the two Bayes factors depend on the pseudo-distance between models δji bounded by δj0.” Examiner notes that Wang teaches determining whether results of two nested models satisfy a predetermined logical formula (the pseudo-distance δji between models bounded by δj0).); and 
(D) comparing by the processor, using a Bayesian statistical model checking method, respective behaviors of the first and second [learning] models with each other on the basis of a result of the determination in the step (C) ([p. 9 ¶ 1] e.g., “Some of the interesting findings can be drawn from the theorem as follows. First, the lower bound of δ∗j0, denoted by δ(r), is exactly the same as the one in Theorem 2 of [22] for comparing nested linear models.” [p. 6 § 3 ¶ 1] e.g., “we consider the model selection consistency of Bayes factor for comparing nonnested models under the three asymptotic scenarios.” [p. 9 ¶ 3] e.g., “It is interesting to observe that the asymptotic behaviors of the two Bayes factors depend on the pseudo-distance between models δ∗ji bounded by δ∗j0.” Examiner notes that Wang teaches comparing asymptotic behaviors, using Bayes factors, of nested linear models with each other on the basis of a result of a bounded logical formula.).  
Wang does not explicitly teach: learning models by using a memory and a processor.
However, Corani teaches: learning models by using a memory and a processor ([p. 1818 ¶ 1] e.g., “two learning algorithms for classification (referred to as classifiers in the following).” [p. 1824 § 3.4 ¶ 1] e.g., “We implemented the hierarchical model in Stan (Carpenter et al. 2017), a language for Bayesian inference… Inferring the hierarchical model on the results of ten runs of tenfolds cross-validation on 50 data sets (a total of 5000 observations) takes about three minutes on a standard laptop.”)
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of model comparison using Bayesian statistical model checking of Wang to incorporate the method of model comparison using Bayesian statistical model checking for two learning algorithms of Corani. The motivation/suggestion for doing this would be for the purpose of comparing the accuracy of two learning algorithms for classification (Corani [p. 1818 ¶ 1]).

Regarding claim 9, A computer readable storage medium for causing a computer to implement the method of evaluating the learning models according to claim 1 (see claim 1 for the citation as the rationale for this limitation.).

Claim(s) 2-8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang in view of Corani, further in view of Zuliani et al. ("Bayesian Statistical Model Checking with Application to Stateflow/Simulink Verification" 2010, hereinafter Zuliani).

Regarding claim 2, Wang in view of Corani teaches: the method of evaluating the learning models.   
Wang further teaches: wherein the step (D) includes: calculating a Bayes factor for a hypothesis associated with an establishment probability of the logical formula ([p. 6 § 3 ¶ 1] e.g., “we can calculate the Bayes factor” [p. 9 ¶ 3 – p. 10 ¶ 1] e.g., “we may conclude that as δj0 increases, the proposed Bayes factor outperforms the intrinsic Bayes factor from a theoretical viewpoint. It deserves mentioning that the existence of an inconsistency region around the null hypothesis is quite reasonable from a practical point of view.” [p. 9 ¶ 3] e.g., “It is interesting to observe that the asymptotic behaviors of the two Bayes factors depend on the pseudo-distance between models δji bounded by δj0.”); 
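Wang in view of Corani does not explicitly teach: performing Bayesian hypothetical testing to determine whether or not the establishment probability is not less than a probability threshold; and evaluating, based on a result of the Bayesian hypothetical testing, behavioral equivalence between the first learning model and the second learning model.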

However, Zuliani teaches: performing Bayesian hypothetical testing to determine whether or not the establishment probability is not less than a probability threshold ([§ 4.1, ¶ 2] e.g., "To test H0 vs. H1, we compute the Bayes factor B of the available data d and then compare it against a fixed threshold T >= 1: we shall accept H0 iff B > T. Jeffreys interprets the value of the Bayes factor as a measure of the evidence in favor of H0 (dually, 1/B is the evidence in favor of H1)."  [§ 1, ¶ 1] e.g., "In this paper, property ϕ is expressed in Bounded Linear Temporal Logic (BLTL)" Examiner notes that Zuliani teaches performing Bayesian hypothetical testing to determine whether the Bayes factor is greater than a threshold.); and 
	evaluating, based on a result of the Bayesian hypothetical testing, behavioral equivalence between the first [learning] model and the second [learning] model ([§ 3, ¶ 2] e.g., “Recall that the PMC problem is to decide whether M |=P>= θ (ϕ), where θ ∈ (0, 1) and ϕ is a BLTL formula. Let p be the (unknown but fixed) probability of the model satisfying ϕ: thus, the PMC problem can now be stated as deciding between two hypotheses: H0: p >= θ   H1: p < θ.”  Examiner notes that Zuliani teaches evaluating behavioral equivalence between two models “deciding between two hypotheses” as a Probabilistic Model Checking (PMC) problem.  The "first learning model and the second learning model" are taught by Corani in [p. 1818 ¶ 1].).  
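It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of model comparison using Bayes factor associated with Zellner’s g-prior of Wang in view of Corani to incorporate the method of hypothesis testing using intrinsic Bayes factor of Zuliani. The motivation/suggestion for doing this would be for the purpose of solving the verification problem by combining randomized sampling of system traces with hypothesis testing or estimation (Zuliani [Abstract]).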
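
For illustration only, the Bayes factor test quoted from Zuliani above (accept H0 iff B > T) can be sketched as follows. The sketch assumes a uniform Beta(1, 1) prior on p. The function names (`beta_cdf_int`, `bayes_factor`, `sequential_test`), the symmetric rejection bound 1/T, and the `max_samples` cap are assumptions introduced here and are not taken from the cited references.

```python
import math


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for positive integers a, b (binomial-sum identity)."""
    m = a + b - 1
    return sum(math.comb(m, j) * x**j * (1 - x)**(m - j) for j in range(a, m + 1))


def bayes_factor(successes, n, theta):
    """Bayes factor of H0: p >= theta against H1: p < theta, uniform prior (assumed)."""
    f = beta_cdf_int(theta, successes + 1, n - successes + 1)  # posterior P(p < theta)
    if f <= 0.0:
        return math.inf
    if f >= 1.0:
        return 0.0
    prior_odds = (1 - theta) / theta      # P(H0) / P(H1) under the uniform prior
    posterior_odds = (1 - f) / f          # P(H0 | d) / P(H1 | d)
    return posterior_odds / prior_odds


def sequential_test(draw, theta, threshold, max_samples=10_000):
    """Draw Boolean outcomes until the Bayes factor crosses T or 1/T (assumed rule)."""
    successes = 0
    for n in range(1, max_samples + 1):
        successes += 1 if draw() else 0
        b = bayes_factor(successes, n, theta)
        if b > threshold:
            return "H0", n
        if b < 1 / threshold:
            return "H1", n
    return "undecided", max_samples
```

Here `draw` stands for one check of whether a sampled trace satisfies the formula ϕ; with no data the Bayes factor equals 1, so neither hypothesis is favored.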

Regarding claim 3, Wang in view of Corani teaches: The method of evaluating the learning models according to claim 1.
Wang does not explicitly teach: evaluating, on the basis of the posterior probability, behavioral equivalence between the first learning model and the second learning model.
However, Corani teaches: evaluating, on the basis of the posterior probability, behavioral equivalence between the first learning model and the second learning model ([p. 1818 § 1 ¶ 8] e.g., “we compute the posterior probability of the two classifiers being practically equivalent or significantly different.”).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of model comparison using Bayesian statistical model checking of Wang to incorporate the method of model comparison using Bayesian statistical model checking for two learning algorithms of Corani. The motivation/suggestion for doing this would be for the purpose of comparing the accuracy of two learning algorithms for classification (Corani [p. 1818 ¶ 1]).	

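Wang in view of Corani does not explicitly teach: wherein the step (D) includes: calculating a confidence interval which satisfies an establishment probability of the logical formula; and calculating a posterior probability on the basis of the confidence interval.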
However, Zuliani teaches: wherein the step (D) includes: calculating a confidence interval which satisfies an establishment probability of the logical formula ([§ 3, ¶ 1] e.g., "we are interested in estimating p, the (unknown) probability that a random execution trace of M satisfies a fixed BLTL property. The estimate will be in the form of a confidence interval, i.e., an interval which will contain p with arbitrarily high probability."  [§ 3, ¶ 2] e.g., "Recall that the PMC problem is to decide whether M |=P>= θ (ϕ), where θ ∈ (0, 1) and ϕ is a BLTL formula." Examiner notes that Zuliani teaches calculating a confidence interval (estimating p) which satisfies a fixed BLTL property “logical formula”.  The instant specification describes "the logical formula is a BLTL (Bounded Linear Temporal Logic) formula." in [p. 9 ln. 8-9].).   
calculating a posterior probability on the basis of the confidence interval ([§ 3, ¶ 1] e.g., "we are interested in estimating p, the (unknown) probability that a random execution trace of M satisfies a fixed BLTL property. The estimate will be in the form of a confidence interval, i.e., an interval which will contain p with arbitrarily high probability."  Examiner notes that Wikipedia describes the posterior probability of a random event or an uncertain proposition as the conditional probability that is assigned after the relevant evidence or background is taken into account. Hence, a posterior probability is an unknown probability that is being estimated.).  
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of model comparison using Bayes factor associated with Zellner’s g-prior of Wang in view of Corani to incorporate the method of hypothesis testing using intrinsic Bayes factor of Zuliani. The motivation/suggestion for doing this would be for the purpose of solving the verification problem by combining randomized sampling of system traces with hypothesis testing or estimation (Zuliani [Abstract]).
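
For illustration only, the relationship between an interval estimate of p and a posterior probability discussed above can be sketched as follows. The sketch assumes a uniform Beta(1, 1) prior and an interval of half-width `delta` centered on the posterior mean; the names `beta_cdf_int` and `interval_posterior` and the parameter `delta` are hypothetical and do not appear in the cited references.

```python
import math


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for positive integers a, b (binomial-sum identity)."""
    m = a + b - 1
    return sum(math.comb(m, j) * x**j * (1 - x)**(m - j) for j in range(a, m + 1))


def interval_posterior(successes, n, delta):
    """Return an interval around the posterior mean and the posterior mass it contains."""
    a, b = successes + 1, n - successes + 1   # Beta posterior under a uniform prior
    mean = a / (a + b)
    lo, hi = max(0.0, mean - delta), min(1.0, mean + delta)
    mass = beta_cdf_int(hi, a, b) - beta_cdf_int(lo, a, b)
    return (lo, hi), mass
```

With no observations the posterior is uniform, so an interval of half-width 0.1 carries posterior mass 0.2; the mass grows as more traces are checked.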

Regarding claim 4, Wang in view of Corani teaches: The method of evaluating the learning models according to claim 1.
Wang in view of Corani does not explicitly teach: wherein the steps (A) and (B) include: sequentially inputting sequence data as the checking data; and determining n-th (n is an integer of not less than 1) and lower order differences in output sequence data corresponding to the sequence data, and wherein the step (C) includes determining whether or not the first and second execution results including one or more of the n-th and lower order differences satisfy the logical formula.
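However, Zuliani teaches: wherein the steps (A) and (B) include: sequentially inputting sequence data as the checking data ([§ 4 ¶ 1] e.g., “Let X1,…,Xn be a sequence of Bernoulli random variables” [§ 4.2 ¶ 1] e.g., “we can model this procedure as independent sampling from a Bernoulli distribution X of unknown parameter p”); and 
determining n-th (n is an integer of not less than 1) and lower order differences in output sequence data corresponding to the sequence data, and wherein the step (C) includes determining whether or not the first and second execution results including one or more of the n-th and lower order differences satisfy the logical formula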
([§ 4.2 ¶ 1] “Remember we want to establish whether M |=P>= θ (ϕ), where θ ∈ (0, 1) and ϕ is a BLTL formula. The algorithm iteratively draws independent and identically distributed sample traces σ1, σ2, …, and checks whether they satisfy ϕ. Again, we can model this procedure as independent sampling from a Bernoulli distribution X of unknown parameter p - the actual probability of the model satisfying ϕ.”  Examiner notes that Zuliani teaches sample X1,…,Xn “sequence data”, and determining whether execution results “traces σ1, σ2, …,” satisfy the logical formula ϕ.  The instant specification describes "the logical formula is a BLTL (Bounded Linear Temporal Logic) formula." in [p. 9 ln. 8-9]. The instant specification further describes “In the logical formulae ϕ in Examples 9 to 12, d1, d2, or d3 is used, but it is also possible to use an n-th (where n is an arbitrary integer of not less than 1) order difference. It is sufficient to use one or more of the n-th and lower order differences.” in [p. 46 ln. 20-23] and “the training device sequentially inputs sequence data as the checking test data and determines the n-th (where n is an integer of not less than 1) and lower order differences in the output sequence data corresponding to the sequence data. Then, the training device determines whether or not the first and second execution results including the n-th and lower order differences satisfy the logical formula” in [p. 53 ln. 13-19].).
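It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of model comparison using Bayes factor associated with Zellner’s g-prior of Wang in view of Corani to incorporate the method of hypothesis testing using intrinsic Bayes factor of Zuliani. The motivation/suggestion for doing this would be for the purpose of solving the verification problem by combining randomized sampling of system traces with hypothesis testing or estimation (Zuliani [Abstract]).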
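
For illustration only, the n-th and lower order differences of an output sequence referred to in claim 4 can be sketched as follows. The property checked by `satisfies_bound` (every difference level of the two outputs agrees within a tolerance `eps`) is a hypothetical stand-in for a logical formula; it is not the formula of the instant specification or of any cited reference, and all names are assumptions.

```python
def differences(seq, order):
    """Return [seq, d1, ..., d_order], where d_k is the k-th order forward difference."""
    levels = [list(seq)]
    for _ in range(order):
        prev = levels[-1]
        levels.append([b - a for a, b in zip(prev, prev[1:])])
    return levels


def satisfies_bound(out_a, out_b, order, eps):
    """Hypothetical property: outputs and their differences up to `order` differ by <= eps."""
    levels_a = differences(out_a, order)
    levels_b = differences(out_b, order)
    return all(
        abs(x - y) <= eps
        for da, db in zip(levels_a, levels_b)
        for x, y in zip(da, db)
    )
```

Each call to `satisfies_bound` yields one Boolean outcome of the kind that a statistical model checking procedure would treat as a Bernoulli sample.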

Regarding claim 5, the claim recites a device for evaluating learning models which performs the steps of: claim 1, and is similarly analyzed.

Regarding claim 6, the claim recites the device for evaluating the learning models of claim 2, and is similarly analyzed.

Regarding claim 7, the claim recites the device for evaluating the learning models of claim 3, and is similarly analyzed.

Regarding claim 8, the claim recites the device for evaluating the learning models of claim 4, and is similarly analyzed.

Claim(s) 11 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li in view of Wang.

Regarding claim 11, Li teaches: The training method according to claim 10. 
Li further teaches: wherein the step (4) includes the steps of: (4-1) obtaining, using checking data, a first execution result based on the first learning model; (4-2) obtaining, using the checking data, a second execution result based on the second learning model ([Col. 11 Ln. 55-56] e.g., “the results from the teacher model 204 and the student model” Examiner notes that the “teacher model” is mapped to the “first learning model”, and the “student model” is mapped to the “second learning model”. The “training data” is used as the “checking data”.); 
Li does not explicitly teach: (4-3) determining whether or not the first and second execution results satisfy a logical formula; and (4-4) evaluating, using a Bayesian statistical model checking method, behavioral equivalence between the first learning model and the second learning model on the basis of a result of the determination in the step (4-3).  
However, Wang teaches: (4-3) determining whether or not the first and second execution results satisfy a logical formula ([p. 9 ¶ 1] e.g., “Second, Theorem 4 can be extended to the case of nested model comparisons (i.e., Mi is nested in Mj) by assuming that M0 = Mi. Third, the Bayes factor depends on the choice of the base model through the value of δ∗j0, and therefore, to enlarger the consistency region in (3.4), we need to make δ∗j0 be as large as possible. This justifies that the null model M0 would be the best choice as the base model. Fourth, the lower bound of δji, denoted by k(r, δj0), is a bounded decreasing function in r and satisfies that for any δj0 > 0” [p. 9 ¶ 3] e.g., “It is interesting to observe that the asymptotic behaviors of the two Bayes factors depend on the pseudo-distance between models δji bounded by δj0.” Examiner notes that determining whether results of two nested models satisfy a predetermined logical formula (models δji bounded by δj0).); and 
(4-4) evaluating, using a Bayesian statistical model checking method, behavioral equivalence between the first learning model and the second learning model on the basis of a result of the determination in the step (4-3) ([p. 9 ¶ 1] e.g., “Some of the interesting findings can be drawn from the theorem as follows. First, the lower bound of δ∗j0, denoted by δ(r), is exactly the same as the one in Theorem 2 of [22] for comparing nested linear models.” [p. 6 § 3¶ 1] e.g., “we consider the model selection consistency of Bayes factor for comparing nonnested models under the three asymptotic scenarios.” [p. 9 ¶ 3] e.g., “It is interesting to observe that the asymptotic behaviors of the two Bayes factors depend on the pseudo-distance between models δ∗ji bounded by δ∗j0.” Examiner notes that evaluating/comparing asymptotic behaviors, using Bayes factors, of nested linear models with each other on the basis of a result of a bounded logical formula.).  
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of training two learning models (teacher/student) of Li to incorporate the method of model comparison using Bayesian statistical model checking of Wang. The motivation/suggestion for doing this would be for the purpose of obtaining nice theoretical properties and good performance in practical applications (Wang [p. 5 ¶ 4]).

Claim(s) 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li in view of Wang et al. (US 2019/0059719 A1, hereinafter Wang2).

Regarding claim 12, Li teaches: The training method according to claim 10.
Li further teaches: wherein the step (3) includes performing, using an evaluation parameter based on the adjusted first output data and the second output result, training of the second learning model ([Col. 3 ln. 15-19] e.g., “Training the student model with adversarial teacher-student learning further includes minimizing a teacher-student loss that measures a divergence of outputs between the teacher model and the student model”).
Li does not explicitly teach: wherein the training data is labeled data with a label,
wherein the step (1) includes adjusting, using an adjustment parameter, the first output data on the basis of the label, and
wherein the step (4) includes changing the adjustment parameter when the behaviors do not satisfy a predetermined criterion.
However, Wang2 teaches: wherein the training data is labeled data with a label ([0066] e.g., “the initial deep learning network automatically compares the prediction data with the labeled data to train parameters of the initial deep learning network.”),
wherein the step (1) includes adjusting, using an adjustment parameter, the first output data on the basis of the label ([0008] e.g., “comparing the labeled data with the prediction data based on the loss function to obtain a comparison result; adjusting parameters in the deep learning network according to the comparison result”),  
wherein the step (4) includes changing the adjustment parameter when the behaviors do not satisfy a predetermined criterion ([0008] e.g., “adjusting parameters in the deep learning network according to the comparison result; and repeating the above steps until the comparison result reaches a preset threshold, so as to obtain the trained deep learning network”).  
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of training two learning models (teacher/student) of Li to incorporate the method of training a neural network by adjusting parameters of Wang2. The motivation/suggestion for doing this would be for the purpose of obtaining the trained deep learning network (Wang2 [0008]).

Claim(s) 13 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li in view of Wang and Zuliani, and further in view of Ogilvie et al. (US 2015/0006442 A1, hereinafter Ogilvie).  

Regarding claim 13, Li teaches: The training method according to claim 10.
Li further teaches: wherein the steps (1) and (2) include: sequentially inputting sequence data as the training data ([Col. 9 ln. 48-51] e.g., “an input sequence of source-domain speech frames to the teacher model XT = {x1T, . . . , xNT} and an input sequence of target-domain speech frames to the student model XS = (x1S, . . . , xNS).” [Col. 7 ln. 45-50] e.g., “the training dataset inputs are provided from target domain data 206 to train the student model 208 during its learning phase, and the parallel source domain data 202 are analyzed by the teacher model 204 to compute the KL divergence between the teacher and student output distributions.”); and
calculating (n+k)-th (each of n and k is an integer of not less than 0) and lower order differences in output sequence data corresponding to the sequence data ([Col. 18 ln. 47-51] e.g., " the teacher-student loss is calculated by calculating a teacher senone posterior, calculating a student senone posterior for the deep feature, and calculating the teacher-student loss as a difference between the teacher senone posterior and the student senone posterior.” [Col. 2 ln. 58-61] e.g., “training the student model with adversarial teacher-student learning based on the teacher speech data and student speech data” Examiner notes that the number of the “teacher speech data” is mapped to “n”, the number of the “student speech data” is mapped to “k”.), 
wherein the step (3) includes performing, using an evaluation parameter based on the first and second output results including the (n+k)-th and lower order differences, training of the second learning model [to construct (n+k+1) pre-trained models] ([Col. 10 ln. 4-9] e.g., “Over successive epochs of training of the student model 208, the weights applied to various inputs are adjusted to minimize the divergence score between the two speech recognition models 204, 208. As will be appreciated, only the parameters of the student model 208 are adjusted during the student model training.” Examiner notes that the “student model” is mapped to the “second learning model”.), and 
([Col. 10 ln. 10-18] e.g., “One goal is for the student network to behave the same as the teacher network by having the student and the teacher network produce similar probability distributions. If the models behave the same, then the distributions will be the same, or similar. If the distributions are identical, then the result of the log operation will be zero. The goal is to change θS (the parameters of the student network) to obtain a KL as small as possible.”);
(4-1) obtaining, using checking data as sequence data, a first execution result based on the first learning model; (4-2) obtaining, using the checking data, a second execution result based on the partial models ([Col. 11 Ln. 55-56] e.g., “the results from the teacher model 204 and the student model” Examiner notes that the “teacher model” is mapped to the “first learning model”, and the “student model” is mapped to the “second learning model”.).
Li does not explicitly teach: (4-3) determining whether or not the first and second execution results including the n-th and lower order differences satisfy a logical formula.
However, Wang teaches: (4-3) determining whether or not the first and second execution results including the n-th and lower order differences satisfy a logical formula ([§4.2 ¶1] “Remember we want to establish whether M |=P>= θ (ϕ), where θ ∈ (0, 1) and ϕ is a BLTL formula. The algorithm iteratively draws independent and identically distributed sample traces σ1, σ2, …, and checks whether they satisfy ϕ. Again, we can model this procedure as independent sampling from a Bernoulli distribution X of unknown parameter p - the actual probability of the model satisfying ϕ.”  Examiner notes that Zuliani teaches determining whether each sample trace "first and second execution results" satisfies the Bounded Linear Temporal Logic (BLTL) property ϕ "predetermined logical formula".  The instant specification describes "the logical formula is a BLTL (Bounded Linear Temporal Logic) formula." in [p. 9 ln. 8-9]. The instant specification further describes “In the logical formulae ϕ in Examples 9 to 12, d1, d2, or d3 is used, but it is also possible to use an n-th (where n is an arbitrary integer of not less than 1) order difference. It is sufficient to use one or more of the n-th and lower order differences.” in [p. 46 ln. 20-23] and “the training device sequentially inputs sequence data as the checking test data and determines the n-th (where n is an integer of not less than 1) and lower order differences in the output sequence data corresponding to the sequence data. Then, the training device determines whether or not the first and second execution results including the n-th and lower order differences satisfy the logical formula” in [p. 53 ln. 13-19].).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of training two learning models (teacher/student) of Li to incorporate the method of model comparison using Bayesian statistical model checking of Wang. The motivation/suggestion for doing this would be for the purpose of obtaining nice theoretical properties and good performance in practical applications (Wang [p. 5 ¶ 4]).
Li in view of Wang does not explicitly teach: (4-4) evaluating, using a Bayesian statistical model checking method, behavioral equivalence between the first learning model and the second learning model on the basis of a result of the determination in the step (4-3).
However, Zuliani teaches: (4-4) evaluating, using a Bayesian statistical model checking method, behavioral equivalence between the first [learning] model and the second [learning] model on the basis of a result of the determination in the step (4-3) ([§3, ¶2] e.g., “Recall that the PMC problem is to decide whether M |=P>= θ (ϕ), where θ ∈ (0, 1) and ϕ is a BLTL formula. Let p be the (unknown but fixed) probability of the model satisfying ϕ: thus, the PMC problem can now be stated as deciding between two hypotheses: H0: p >= θ   H1: p < θ.”  Examiner notes that Zuliani teaches evaluating behavioral equivalence between the first model and the second model “deciding between two hypotheses” as a Probabilistic Model Checking (PMC) problem.  The "first learning model and the second learning model" are taught by Li in [0042] and [0064].).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of training two learning models (teacher/student) of Li in view of Wang to incorporate the method of hypothesis testing using intrinsic Bayes factor of Zuliani. The motivation/suggestion for doing this would be for the purpose of solving the verification problem by combining randomized sampling of system traces with hypothesis testing or estimation (Zuliani [Abstract]).
Li in view of Wang and Zuliani does not explicitly teach: (4-5) selecting, when the behavioral equivalence satisfies a predetermined criterion in the step (4-4), a most accurate learning model from among (n+1) learning models.
However, Ogilvie teaches: (4-5) selecting, when the behavioral equivalence satisfies a predetermined criterion in the step (4-4), a most accurate learning model from among (n+1) learning models ([0013] e.g., "the system uses a machine-learning technique to train a set of models and to select the best model based on one or more evaluation metrics using the training set. The system then evaluates the performance of the best model on the test set. If the performance of the best model satisfies a performance criterion, the system uses the best model to predict responses for the online social network.").  
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of training two learning models (teacher/student) of Li in view of Wang and Zuliani to incorporate the method of selecting the best model of Ogilvie. The motivation/suggestion for doing this would be for the purpose of selecting the best model based on the accuracy of prediction (Ogilvie [0041]).
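
For illustration only, the divergence between teacher and student output distributions quoted from Li above can be sketched as a Kullback-Leibler divergence between two discrete distributions. The function name `kl_divergence` and the list-based representation are assumptions introduced here; the sketch also assumes that the student assigns nonzero probability wherever the teacher does.

```python
import math


def kl_divergence(teacher, student):
    """KL(teacher || student) for two discrete distributions over the same classes."""
    return sum(p * math.log(p / q) for p, q in zip(teacher, student) if p > 0)
```

When the two distributions are identical every logarithm is zero, so the divergence is zero, which matches the quoted statement that identical distributions give a zero result.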

Claim(s) 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li in view of Wang, Zuliani and Ogilvie, and further in view of Acuna Agost et al. (US 2019/0080362 A1, hereinafter Acuna).

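Regarding claim 14, Li in view of Wang, Zuliani and Ogilvie teaches: The training method according to claim 13.
Li in view of Wang, Zuliani and Ogilvie does not explicitly teach: wherein, when the behavioral equivalence does not satisfy the predetermined criterion in the step (4-4), at least one of the n and k is changed.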

	However, Acuna teaches: wherein, when the behavioral equivalence does not satisfy the predetermined criterion in the step (4-4), at least one of the n and k is changed ([0054] e.g., "At decision block 412, the results of the test block 410 may be evaluated to determine whether they satisfy a suitable criterion of quality (examples of which are described below with reference to FIGS. 6 and 7). If not, then at block 414 the model parameters/hyperparameters may be updated and the model reinitialized for retraining at block 408."  Examiner notes that the “n and k” are interpreted as parameters/hyperparameters).  
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of training two learning models (teacher/student) of Li in view of Wang, Zuliani and Ogilvie to incorporate the method of Acuna for adjusting parameters when test results do not satisfy a predetermined criterion. The motivation/suggestion for doing this would be for the purpose of minimizing an objective function which reflects an accuracy of the trained model in classifying the feature vectors (Acuna [0053]).

Claim(s) 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li in view of Lee et al. (US 2019/0130110 A1, hereinafter Lee).

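Regarding claim 15, Li teaches: The training method according to claim 10.
Li does not explicitly teach: wherein the step (4) includes using checking data including labeled data forming an adversarial example.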

	However, Lee teaches: wherein the step (4) includes using checking data including labeled data forming an adversarial example ([0057] e.g., "The result is a misclassified labeled data set 290 that is input to the cognitive system 250 which in turn performs an incorrect cognitive operation due to the misclassification by the neural network 230, due to the adversarial input 270, which is reflected in the misclassified labeled data set 290.").  
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of training two learning models (teacher/student) of Li to incorporate the method of Lee to classify with adversarial input. The motivation/suggestion for doing this would be for the purpose of training the output nodes to properly classify the adversarial data set (Lee [0028]).

Prior Art
The prior art made of record and not relied upon, which is considered pertinent to applicant's disclosure, is listed below:
Xiong et al. (US 10885435 B2): teaches methods for training a neural network or an ensemble of neural networks.  The method comprises training outputs by repeatedly applying the neural network to the at least one training data item while disabling at least one of the hidden units or input units randomly with the predetermined probability.
Li et al. (US 20180025721 A1): teaches an acoustic model that includes a neural network having first memory blocks for time information and second memory blocks for frequency information.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAEYONG J PARK whose telephone number is (571) 272-3898. The examiner can normally be reached on M-F 9:00 a.m. - 6:00 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo can be reached at (571) 272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JAEYONG J PARK/Examiner, Art Unit 2126
/MICHAEL J HUNTLEY/Primary Examiner, Art Unit 2116