Notice of Pre-AIA  or AIA  Status
         The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .
 
Status of Claims
            The amendment filed on December 30, 2021 in response to the October 18, 2021 non-final Office action has been entered. The status of the claims is as follows:
Claims 1-20 remain pending in the application.
Claims 1-20 have been amended. 
 
Information Disclosure Statement
            The information disclosure statement (IDS) submitted on 12/30/2021 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
 

Response to Arguments
            The amendment and arguments filed on 11/15/2021 have been fully considered. The examiner’s response is delineated as follows.
(a)       Response to Arguments Regarding Objections to the Specification: The objection to the specification is hereby withdrawn in view of Applicant’s amendment to the specification.
(b)       Response to Arguments Regarding Objections to the Claims: The objection to claims 6 and 10-16 is hereby withdrawn in view of Applicant’s amendment to claims 6 and 10-16.
(c)        Response to Arguments Regarding Rejection of Claims under 35 U.S.C. § 112(b):
	The rejections of claims under 35 U.S.C. § 112(b) in the previous non-final Office action are withdrawn with the exception of what is delineated in the rejection of claims under 35 U.S.C. § 112(b) below.
In addition, the claim amendment necessitates new ground(s) of rejection to the claims under 35 U.S.C. § 112(b).
(d)       Response to Arguments Concerning Rejections of Claims under 35 U.S.C. § 103: Applicant’s arguments concern newly amended claim language, which is addressed in the rejections of claims under 35 U.S.C. § 103 below.

 
Claim Objections
Claim 6 stands objected to because of the following informalities: The limitation “applying, by the computer system, one or more calibration rules to separate score to calibrate the threshold to assess the likelihood that the service provided by the application programming interface is running the particular machine learning model.” appears to contain a minor informality because the noun “score” is countable and thus requires a determiner.  The examiner suggests amending the limitation to recite “the separate score to calibrate the threshold to assess the likelihood that the service provided by the application programming interface is running the particular machine learning model.”

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
 
The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
 
            Claims 1-20 stand rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
(a)      Claims 1, 9, and 17: Claim 1, as amended, recites the following limitations: “wherein the separate samples are distorted, based on inverting a portion of bits comprising the separate samples, to induce the particular machine learning model to misclassify the separate samples as different classes from among the plurality of classes …”.
More specifically, the present disclosure does not provide adequate written description to support the amended claim language. Further, the present disclosure describes, inter alia, “In one example, a characteristic of machine learning models, such as proprietary model 112, may be that it is relatively sensitive to minor distortions of a few bits in images that cause misclassification, even after significant amounts of data are used in training data 220 and other robustness safeguards are applied. 
For example, for an image that includes a cat, and should be classified as a cat image, due to the sensitivity of machine learning models, the image may be slightly distorted by a few bits or a bit pattern in a way that will induce the classifier of proprietary model 112 to misclassify the image under the class of dog images, rather than under the class of cat images, 100% of the time.” ¶ [0028].  ¶ [0035] further describes that “[i]n one example, the minimal distortion applied by adversarial transform 234 may include a few bits or a pattern of bits that are distorted in sample 230.” That is, the present disclosure merely describes minor distortions of a few bits, an image that may be slightly distorted by a few bits or a bit pattern, and a few bits or a pattern of bits that are distorted in a sample. Nowhere, however, does the present disclosure explicitly describe inverting a portion of bits as recited in amended claims 1, 9, and 17. Therefore, claims 1, 9, and 17 are rejected under 35 U.S.C. 112(a).

 
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
 
The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
 
            Claims 1-20 stand rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
(a)       Claims 1, 9, and 17: 
(1)	Claim 1, as amended, recites the following limitations: “wherein the separate samples are distorted, based on inverting a portion of bits comprising the separate samples, to induce the particular machine learning model to misclassify the separate samples as different classes from among the plurality of classes as different classes from among the plurality of classes” that are indefinite because it is unclear how “a portion of bits” may possibly “compris[e] the separate samples”.  
A sample comprises bits, and according to the claimed limitation, a smaller portion of these bits is inverted.  Nonetheless, according to the limitation again, this smaller portion of these bits comprises the separate sample.  That is, the claimed limitation recites 
(2)	Further, claims 1, 9, and 17 respectively recite a series of consecutive participle phrases without clearly pointing out which nouns or pronouns are respectively modified by these consecutive participle phrases.  For example, the claimed limitations “each of the plurality of synthetic samples representing a separate sample assigned an original class from among a plurality of classes classified by a particular machine learning model and distorted to induce …” contain one present participle phrase “representing a separate sample” immediately followed by three past participle phrases “assigned an original class …”, “classified by a particular machine learning model”, and “distorted to induce …”. 
For the purpose of examination, the examiner interprets the limitations as follows: “each synthetic sample represents a separate sample”; the aforementioned “separate sample” is assigned an “original class”; the “original class” is classified by a machine learning model; and the “separate sample” is “distorted” to induce misclassification.  Applicant is requested to state on record what nouns/pronouns are respectively modified by these four consecutive participle phrases. 
(b)       Claims 2-8, 10-16, and 18-20 respectively depend from independent claims 1, 9, and 17 and thus inherit the aforementioned deficiencies.  Claims 2-8, 10-16, and 18-20 are thus also rejected under 35 U.S.C. § 112(b) for at least the foregoing reasons.
 
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. 
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
 
       Claims 1-4, 9-12, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al., U.S. Patent No. 10,839,291, filed Jul. 1, 2017, and issued Nov. 17, 2020 (hereinafter Chen) in view of Moore, U.S. Pat. App. Pub. No. 2011/0289027, filed May 20, 2010 (hereinafter Moore) and further in view of Goodfellow et al., Explaining and Harnessing Adversarial Examples (20 March 2015) (hereinafter Goodfellow), which has already been placed on record in the Oct. 18, 2021 non-final Office action.
 
With respect to claim 1 as amended, Chen teaches:
A method comprising: querying, by a computer system, an application programming interface with a plurality of synthetic samples, (FIG. 8: “computing architecture 800” having “processing unit 804, a system memory 806 and a system bus 808.”  Col. 15, ll. 8-24: “[v]arious embodiments may be implemented using hardware elements, software elements, or a combination of both”; and “Examples of software may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API)”.  Col. 10, ll. 9-16: “At block 524, the adversarial images may be provided as an input to block 506 ‘determine current training set’. Accordingly, the current training set may be updated with the adversarial images”.  The examiner notes that Chen’s computer architecture, which provides adversarial images via software implemented with an API to a deep neural network (DNN) for classification (or misclassification) of the adversarial images, teaches the limitation.) 
the plurality of synthetic samples representing separate samples assigned original classes from among a plurality of classes, classified by a particular machine learning model and, (col. 4, ll. 26-28: “each piece of training data may be associated with a class (e.g. class 210-1, 210-2, 210-n or classes 210)”; col. 6, ll. 52-55: “altering one or more portions of a base image (e.g., sample image 212-n) to generate a test image; providing the test image to the trained DNN (e.g., trained DNN 304-1) for classification”; col. 9, ll. 58-59: “Continuing to block 516 “test image properly classified?” it may be determined if the test image was properly classified by the DNN. For instance, if the classification of the test image matches a target classification, the test image is properly classified”.  The examiner notes that Chen’s DNN teaches a particular machine learning model, that Chen’s sample images teach the claimed separate samples, that Chen’s classes (210) respectively teach the plurality of classes, that Chen’s target classification of the sample images teaches the claimed original classes, that Chen’s test images teach the plurality of synthetic samples, and that Chen thus teaches the above limitation in its entirety.) 
wherein the separate samples are distorted to induce the particular machine learning model to misclassify the separate samples as different classes from among the plurality of classes as different classes from among the plurality of classes; (col. 6, ll. 52-55: “altering one or more portions of a base image (e.g., sample image 212-n) to generate a test image; providing the test image to the trained DNN (e.g., trained DNN 304-1) for classification”; col. 9, ll. 60-62 and col. 6, l. 59: “the classification of the test image does not match the target classification”; col. 10, ll. 1-3: “when the test image is not properly classified, the test image maybe identified as an adversarial image”. The examiner further notes that Chen was previously cited to teach the claimed separate samples and the plurality of classes.  See citations, supra. The examiner further notes Chen’s test images teach a plurality of synthetic samples, and that Chen’s altering a portion of sample images into corresponding test images so that the classifications of the test images do not match the target classifications of the sample images teaches the above claimed limitation.) 
accumulating, by the computer system, a number of results returned by the application programming interface that match expected class label assignments of the different classes for the plurality of synthetic samples; and (col. 9, ll. 56-60: “516 ‘test image properly classified?’ it may be determined if the test image was properly classified by the DNN. For instance, if the classification of the test image matches a target classification, the test image is properly classified”.  The examiner notes that Chen’s determining whether a plurality of synthetic samples (e.g., Chen’s “test images” in col. 6, ll. 52-55, supra) match the target classifications teaches the above limitation.) 
Chen does not appear to explicitly teach: 
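wherein the separate samples are distorted, based on inverting a portion of bits comprising the separate samples, to induce the particular machine learning model to misclassify the separate samples as different classes from among the plurality of classes as different classes from among the plurality of classes and wherein the distortion is not visible to the human eye;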
accumulating, by the computer system, a score of a number of results returned by the application programming interface that match expected class label assignments of the different classes for the plurality of synthetic samples; and
in response to the score exceeding a threshold, verifying, by the computer system, that a service provided by the application programming interface is running the particular machine learning model.
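
For illustration only, the accumulating and verifying limitations recited above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from any cited reference: the function name `verify_service`, the `query_api` callable standing in for the application programming interface, and the use of a match fraction as the score are all hypothetical.

```python
from typing import Callable, Sequence


def verify_service(
    query_api: Callable[[bytes], str],   # hypothetical stand-in for the API under test
    synthetic_samples: Sequence[bytes],  # distorted samples sent as queries
    expected_labels: Sequence[str],      # labels the particular model is expected to return
    threshold: float,                    # hypothetical cutoff on the match fraction
) -> bool:
    """Return True when the fraction of matching results exceeds the threshold."""
    if len(synthetic_samples) != len(expected_labels):
        raise ValueError("each synthetic sample needs exactly one expected label")

    # Accumulate the number of returned results that match the expected labels.
    matches = 0
    for sample, expected in zip(synthetic_samples, expected_labels):
        if query_api(sample) == expected:
            matches += 1

    # Assumption: the score is the match fraction; an empty query set scores 0.
    score = matches / len(synthetic_samples) if synthetic_samples else 0.0
    return score > threshold
```

In this sketch the comparison is strict, so a score exactly equal to the threshold does not verify the service.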

Lin, however, does teach: 
accumulating, by the computer system, a score of a number of results returned by the application programming interface that match expected class label assignments of the different classes for the plurality of synthetic samples; and (¶ [0003]: “a computer-implemented system”. ¶ [0006]: “Determining the updated accuracy score for a particular trained predictive model can include:” “adding the sum of correct predictive outputs to previously determined sums of correct predictive outputs that were determined when the initial training data and other training data sets in the series of training data sets were received to determine a total number of correct predictive outputs”.  The examiner further notes that Lin’s determining a total of number of correct predictive outputs by adding the sum of correct predictive outputs to the previously determined sum, when combined with Chen’s teaching, supra teaches the above claimed limitation.) 
in response to the score exceeding a threshold, (¶ [0047]: “the top ranking trained model is chosen as the selected predictive model”; ¶ [0107]: “the new accuracy scores associated with the available trained predictive models can be compared, and the most accurate trained predictive model selected”. ¶ [0127]: “the trained predictive models with the top n accuracy scores are selected from among the total available predictive models”.  The examiner notes that Lin’s selecting models having the “top n scores,” “top ranking score,” or “the most accurate trained predictive model” teaches a cumulative score exceeding a threshold.) 
verifying, by the computer system, that a service provided by the application programming interface is running the particular machine learning model. (¶ [0044]: “In some implementations, cross-validation is used to estimate the accuracy of each trained predictive model.” ¶ [0046]: “In some implementations, the predictive modeling server system 206 operates independently from the client computing system 202 and selects and provides the trained predictive model 218 as a specialized service.”  The examiner notes that Lin’s performing a cross-validation for each trained predictive model provided as a service teaches the above limitation.) 
	Chen and Lin are analogous art as both pertain to classification using machine learning.  
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to combine Chen’s querying with distorted, synthetic samples and determining whether the classification of distorted, synthetic samples matches expected class label assignments (Chen, supra) with Lin’s accumulating a score of matched expected class label assignments and verifying a service is running a particular machine learning model (Lin, supra). The modification not only helps select the most accurate trained predictive model, which may change over time, but also helps determine the data samples that are the most information-rich to be retained in memory to address memory limitations (Lin, ¶ [0011], “Accuracy scores can be determined that are reflective of more recently received data samples. As input data to be input into a trained predictive model to generate a predictive output changes over time, the accuracy of the trained predictive model may also change. Determining the accuracy score based on data samples that are representative of current input data can help to select the most accurate trained predictive model at a given time. Memory space can limit the Volume of data samples that can be retained. Determining which data samples are the most information-rich can be useful in selecting a set of test data and/or training data to be used and/or retained in memory.”).

	Chen modified by Lin does not appear to explicitly teach: 
wherein the separate samples are distorted, based on inverting a portion of bits comprising the separate samples, to induce the particular machine learning model to misclassify the separate samples as different classes from among the plurality of classes as different classes from among the plurality of classes and wherein the distortion is not visible to the human eye;
Goodfellow does, however teach:
(Goodfellow at ¶ 2, § 3, p. 2: “For example, digital images often use only 8 bits per pixel”.  ¶ 1, p. 6: “As control experiments, we trained training a maxout network with noise based on randomly adding ±ε to each pixel, or adding noise in U(-ε, ε) to each pixel.” The examiner notes that Goodfellow’s adding ±ε to each pixel that uses 8 bits teaches inverting a portion of bits in a sample because a bit stores a binary value (e.g., 0 or 1) so changing a pixel’s value by adding ±ε changes at least the binary value (e.g., “0”) of at least one bit to the other binary value (e.g., “1”) and thus teaches inverting a portion of bits as claimed.)
wherein the distortion is not visible to the human eye; (Goodfellow at Caption of Figure 1, p. 3: “A demonstration of fast adversarial example generation applied to GoogLeNet (Szegedy et al., 2014a) on ImageNet. By adding an imperceptibly small vector whose elements are equal to the sign of the elements of the gradient of the cost function with respect to the input, we can change GoogLeNet’s classification of the image. Here our ε of .007 corresponds to the magnitude of the smallest bit of an 8 bit image encoding after GoogLeNet’s conversion to real numbers.” ¶ 3, § 2, p. 1: “On some datasets, such as ImageNet (Deng et al., 2009), the adversarial examples were so close to the original examples that the differences were indistinguishable to the human eye.”)
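
The relationship between a small per-pixel perturbation and bit inversion noted above can be illustrated with a short sketch. It rests on two assumptions that are not drawn from Goodfellow: the perturbation (written here as ±ε) is taken to equal one integer step of an 8-bit pixel value, and perturbations that would leave the range 0 to 255 are rejected rather than clipped. The helper name `flipped_bits` is hypothetical.

```python
def flipped_bits(value: int, delta: int) -> int:
    """Return a bitmask of the bits that differ after adding delta to an 8-bit value."""
    if not 0 <= value <= 255:
        raise ValueError("value must fit in 8 bits")
    perturbed = value + delta
    if not 0 <= perturbed <= 255:
        # Assumption: out-of-range results are rejected, not clipped or wrapped.
        raise ValueError("perturbed value leaves the 8-bit range")
    # XOR marks exactly the bit positions whose binary value was inverted.
    return value ^ perturbed


# Every in-range perturbation by +1 or -1 inverts at least one bit.
assert all(flipped_bits(v, 1) != 0 for v in range(0, 255))
assert all(flipped_bits(v, -1) != 0 for v in range(1, 256))

# A single step can invert many bits at once: 127 -> 128 inverts all eight.
print(format(flipped_bits(0b01111111, 1), "08b"))  # prints 11111111
```

Under these assumptions, any nonzero change to a pixel value inverts at least one bit, although the number of inverted bits depends on the starting value.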
Chen, Lin, and Goodfellow are analogous art because all three references pertain to training neural networks with adversarial examples. 
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to have modified Chen in view of Lin to incorporate Goodfellow’s adversarial examples, one or more bits of which are inverted with an imperceptibly small perturbation (see Goodfellow, Abstract and ¶ 2, § 1, p. 1, supra). The modification addresses the shortcoming of conventional approaches that cause the activation to grow, and thus “accidental steganography,” by subtracting the penalty from the activation, rather than adding the penalty to the activation as in conventional approaches, so as to make confident enough predictions that the error in prediction has saturated at a minimum value (Goodfellow, p. 2, § 3, ¶ 4: “The adversarial perturbation causes the activation to grow by $\omega^{T}\eta$”; and p. 4, § 5, ¶ 2: “This is somewhat similar to L1 regularization. However, there are some important differences. Most significantly, the L1 penalty is subtracted off the model’s activation during training, rather than added to the training cost. This means that the penalty can eventually start to disappear if the model learns to make confident enough predictions that ζ saturates.”)
 
With respect to claim 2, Chen modified by Lin and Goodfellow teaches the method according to claim 1 from which claim 2 depends, and Chen further teaches: 
sending, by the computer system, a separate query call to the application programming interface for the plurality of synthetic samples, (col. 10, ll. 10-16: “the current training set may be updated with the adversarial images when logic flow 500 returns to block 506 from block 526. In various embodiments, this may enable the next iteration of DNN hardening to begin”. Col. 9, ll. 35-37: “Continuing to block 508 “train DNN-i on current training set” the current iteration of DNN may be trained on the current training set.”  Chen was previously cited to teach querying the application programming interface for each of the plurality of synthetic samples (see claim 1 above). The examiner notes that Chen’s training a neural network with adversarial examples in the next iteration teaches the above limitation.)
wherein a user requesting to query the application programming interface with the plurality of synthetic samples (col. 13, ll. 5-7: “A user can enter commands and information into the computer 802 through one or more wire/wireless input devices”.  The examiner notes that Chen was previously cited to teach querying the application programming interface with the plurality of synthetic samples.  See claim 1 above.)
receiving, by the computer system, an output from the application programming interface for separate query calls, the output comprising result labels of (col. 5, ll. 6-8: “adversarial images 252-1, 252-2, 252-n may each be associated with a class 253-1, 253-2, 253-n (i.e., classes 253), respectively”; col. 4, ll. 26-28: “each piece of training data may be associated with a class”; col. 7, ll. 3-5: “find a similar x’ such that the classification output C(x) ≠C(x’), but x and x’ are close”.  Chen was previously cited to teach each separate query call (claim 2 above) and an API (claim 1 above). The examiner notes that for each adversarial image, Chen’s determining a “class” (C(x’) below) that does not match the target classification (C(x) below) teaches this limitation.)
Chen does not appear to explicitly teach: 
a user is only able to access the service through queries to the application programming interface. 
Lin does, however, teach: 
a user is only able to access the service through queries to the application programming interface; and (¶ [0046]: “the predictive modeling server system 206 operates independently from the client computing system 202 and selects and provides the trained predictive model 218 as a specialized service”; ¶ [0052]: “the trained model can be made accessible to the client computing system or other computer platforms by an API through a hosted development and execution platform”.  The examiner notes that a user being only able to access the service of predictive model(s) selected and provided by Lin as a specialized service teaches the claimed limitations.)
Chen, Lin, and Goodfellow are analogous art because all three references pertain to training neural networks with adversarial examples. 
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to have modified Chen in view of Lin and Goodfellow to further incorporate Lin’s allowing a user to access a service through queries to an API (Lin, supra). The modification incurs expenditures for computing resources and human resources (Lin, ¶ [0046]: “The expenditure of both computing resources and human resources and expertise to select the untrained predictive models to include in the training function repository 216, the training functions to use for the various types of available predictive models, the hyper-parameter configurations to apply to the training functions and the feature-inductors all occurs server-side. Once these selections have been completed, the training and model selection can occur in an automated fashion with little or no human intervention, unless changes to the server system 206 are desired. The client computing system 202 thereby benefits from access to a trained predictive model 218 that otherwise might not have been available to the client computing system 202, due to limitations on client-side resources.”)

With respect to claim 3, Chen modified by Lin and Goodfellow teaches the “method according to claim 1, wherein the accumulating in claim 1 further comprises”.
Chen, Lin, and Goodfellow combined thus teaches all the limitations of claim 1 from which claim 3 depends, and Chen further teaches: 
accumulating, by the computer system, the number of results returned by the application programming interface that match an expected class label assignment associated with the plurality of synthetic samples (Chen, col. 9, ll. 56-60 cited for claim 1, supra.  The examiner notes that Chen’s determining whether a plurality of synthetic samples (e.g., Chen’s “test images” in col. 6, ll. 52-55, supra) match the target classifications teaches the above limitation.)
in a matrix of expected class labels, the matrix of expected class labels created from a plurality of results of applying the plurality of synthetic samples to the particular machine learning model prior to deployment.  (Chen, col. 12, ll. 59-65 and col. 2, ll. 26-30 cited for claim 1, supra. The examiner notes that Chen’s DNN, test images, and target classification respectively teach a particular machine learning model, the plurality of synthetic samples, and expected class labels. The examiner further notes that Chen’s use of “data structures” to store “program data” such as target classification and classification of test images teaches this claimed limitation. The examiner also notes that Chen’s populating its data structure with prior training data occurs prior to the “future DNN iteration” or “trained DNN” that “perform[s] malware classification” and thus teaches that Chen accumulates results returned by an API prior to deployment and hence the above limitation.)
Chen does not appear to explicitly teach the following claimed limitations: 
accumulating, by the computer system, the score of the number of results.

Lin does, however, teach: 
accumulating, by the computer system, the score of the number of results. (Lin, ¶¶ [0003] and [0006] cited for claim 1, supra.  The examiner notes that Lin’s determining a total of number of correct predictive outputs by adding the sum of correct predictive outputs to the previously determined sum, when combined with Chen’s teaching, supra teaches the above claimed limitation.) 
Chen, Lin, and Goodfellow are analogous art because all three references pertain to training neural networks with adversarial examples. 
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to have modified Chen in view of Lin and Goodfellow to further incorporate Lin’s accumulating the score of the number of returned results (Lin, supra). The modification maintains updated accuracy of a plurality of predictive models by accumulating and keeping the scores indicative of the accuracy of the plurality of predictive models up to date so that selection of a predictive model is based on updated scores and hence accuracy (Lin, ¶ [0003]: “The predictive model repository includes multiple updateable trained predictive models which are each associated with an accuracy score that represents an estimation of the accuracy of the trained predictive model”; and “A first trained predictive model is selected from among the plurality of trained predictive models and retrained predictive models included in the predictive model repository based on the determined updated accuracy scores. Access is provided to the first trained predictive model over the network.”)

With respect to claim 4, Chen modified by Lin and Goodfellow teaches the “method according to claim 3, wherein the accumulating in claim 3 further comprises:”. 
Chen teaches a number of results returned by the application programming interface that match an expected class label assignment of the different class for each of the plurality of synthetic samples (see Chen, col. 9, ll. 56-60 for claim 1, supra) in a matrix of expected class labels associated with a selection of the plurality of synthetic samples (see Chen, col. 6, ll. 52-55 cited for claim 1, supra and at col. 12, ll. 59-65 and col. 9, ll. 54 and 60-62 cited for claim 3, supra).
Chen further teaches each result returned by the application programming interface that does not match the expected class label (see Chen, col. 10, ll. 1-3 cited for claim 1, supra) in the matrix of expected class labels associated with an additional selection of the plurality of synthetic samples (see Chen, col. 6, ll. 52-55 cited for claim 1, supra and col. 12, ll. 59-65 and col. 9, ll. 54 and 60-62 cited for claim 3, supra). 

Lin further teaches: 
in response to results returned by the application programming interface that matches expected class labels in the matrix of expected class labels associated with a selection of the plurality of synthetic samples, updating, by the computer system, the cumulative score with a success; (¶ [0006]: “Determining the updated accuracy score for a particular trained predictive model can include:” “adding the sum of correct predictive outputs to previously determined sums of correct predictive outputs that were determined when the initial training data and other training data sets in the series of training data sets were received to determine a total number of correct predictive outputs”.  The examiner notes that Lin’s adding correct predictive outputs to a previously determined sum of correct predictive outputs teaches this limitation.)
in response to results returned by the application programming interface that does not match the expected class labels in the matrix of expected class labels associated with an additional selection of the plurality of synthetic samples, updating, by the computer system, the cumulative score with lack of success. (¶ [0006]: “Determining the updated accuracy score for a particular trained predictive model can include:” “adding the sum of correct predictive outputs to previously determined sums of correct predictive outputs that were determined when the initial training data and other training data sets in the series of training data sets were received to determine a total number of correct predictive outputs”.  The examiner further notes that Lin adds only the correct predictive outputs, which exclude the incorrect predictive outputs, to the previously determined sums, while still updating the total number of training samples to include each training sample for which an incorrect predictive output is generated. The accumulated score, i.e., the total number of correct predictive outputs divided by the number of training samples (see, e.g., ¶ [0006]), is thereby updated. Lin’s not adding the incorrect predictive outputs while updating the total number of samples for the aforementioned division, in its data structure(s) as program data updates, teaches the limitation as claimed.)
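The success / lack-of-success update described in the two notes above can be illustrated with a minimal sketch. It assumes that the cumulative score is the ratio of correct outputs to total samples, as the examiner reads Lin ¶ [0006]; the class name `CumulativeScore`, its fields, and the sample labels are hypothetical and appear in none of the cited references.

```python
from dataclasses import dataclass


@dataclass
class CumulativeScore:
    """Running accuracy score (hypothetical names, for illustration only)."""
    correct: int = 0  # total number of correct predictive outputs so far
    total: int = 0    # total number of samples scored so far

    def update(self, returned_label: str, expected_label: str) -> None:
        # Every returned result grows the denominator.
        self.total += 1
        # Only a match grows the numerator ("success"); a mismatch adds
        # nothing to `correct` and so lowers the ratio ("lack of success").
        if returned_label == expected_label:
            self.correct += 1

    @property
    def accuracy(self) -> float:
        # Guard against division by zero before any result is scored.
        return self.correct / self.total if self.total else 0.0


score = CumulativeScore()
for returned, expected in [("cat", "cat"), ("dog", "cat"), ("bird", "bird")]:
    score.update(returned, expected)
print(score.correct, score.total)
```

In this sketch a mismatch is never subtracted; it is recorded only through the growing denominator, which is the reading the examiner relies on.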
Chen, Lin, and Goodfellow are analogous art because all three references pertain to training neural networks with adversarial examples. 
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to have modified Chen in view of Lin and Goodfellow to further incorporate Lin’s updating the cumulative score with a success for a correct predictive output of a synthetic sample and further updating the cumulative score with a lack of success (Lin, supra). The modification keeps the accuracy of a plurality of predictive models up to date by updating the cumulative accuracy score for each successful prediction and further by updating the ratio of correctly predicted samples to the total number of samples so that the selection of a predictive model is based on updated scores and hence accuracy (Lin, ¶ [0003]: “The predictive model repository includes multiple updateable trained predictive models which are each associated with an accuracy score that represents an estimation of the accuracy of the trained predictive model”; and “A first trained predictive model is selected from among the plurality of trained predictive models and retrained predictive models included in the predictive model repository based on the determined updated accuracy scores. Access is provided to the first trained predictive model over the network.”)

With respect to claim 9, it is substantially similar to claim 1 and is rejected in the same manner, the same art and reasoning applying.  Further, Chen teaches:
a computer system comprising one or more processors, one or more computer-readable memories, one or more computer-readable storage devices, and program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, the stored program instructions comprising: program instructions (col. 11, ll. 22-25: “computer architecture 800 may be representative, for example, of a computer system …”; col. 12, ll. 10-11: “system memory”; col. 10, ll. 64-66: “storage medium 700 may store computer-executable instructions, such as computer-executable instructions to implement one or more logic flows or operations described herein”; col. 11, ll. 33-34: “a processor”).
program instructions to query an application programming interface with a plurality of synthetic samples, (FIG. 8: “computing architecture 800” having “processing unit 804, a system memory 806 and a system bus 808.”  Col. 15, ll. 8-24: “[v]arious embodiments may be implemented using hardware elements, software elements, or a combination of both”; and “Examples of software may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API)”.  Col. 10, ll. 9-16: “At block 524, the adversarial images may be provided as an input to block 506 ‘determine current training set’. Accordingly, the current training set may be updated with the adversarial images”.  The examiner notes that Chen’s computer architecture providing adversarial images, via software implemented with an API, to a deep neural network (DNN) for classification (or misclassification) of the adversarial images teaches the limitation.)

the plurality of synthetic samples representing separate samples assigned original classes from a plurality of classes classified by a particular machine learning model and (col. 4, ll. 26-28: “each piece of training data may be associated with a class (e.g. class 210-1, 210-2, 210-n or classes 210)”; col. 6, ll. 52-55: “altering one or more portions of a base image (e.g., sample image 212-n) to generate a test image; providing the test image to the trained DNN (e.g., trained DNN 304-1) for classification”; col. 9, ll. 58-59: “Continuing to block 516 “test image properly classified?” it may be determined if the test image was properly classified by the DNN. For instance, if the classification of the test image matches a target classification, the test image is properly classified”.  The examiner notes that Chen’s DNN teaches a particular machine learning model, that Chen’s sample images teach the claimed separate samples, that Chen’s classes (210) respectively teach the plurality of classes, that Chen’s target classification of the sample images teaches the claimed original classes, that Chen’s test images teach the plurality of synthetic samples, and that Chen thus teaches the above limitation in its entirety.)
wherein the separate samples are distorted to induce the particular machine learning model to misclassify the separate samples as different classes from among the plurality of classes; (col. 6, ll. 52-55: “altering one or more portions of a base image (e.g., sample image 212-n) to generate a test image; providing the test image to the trained DNN (e.g., trained DNN 304-1) for classification”; col. 9, ll. 60-62 and col. 6, l. 59: “the classification of the test image does not match the target classification”; col. 10, ll. 1-3: “when the test image is not properly classified, the test image maybe identified as an adversarial image”. The examiner further notes that Chen was previously cited to teach the claimed separate samples and the plurality of classes.  See citations, supra. The examiner further notes that Chen’s test images teach a plurality of synthetic samples, and that Chen’s altering a portion of sample images into corresponding test images so that the classifications of the test images do not match the target classifications of the sample images teaches the above claimed limitation.)

program instructions to accumulate a number of results returned by the application programming interface that match expected class label assignments of the different classes for the plurality of synthetic samples; and (col. 9, ll. 56-60: “516 ‘test image properly classified?’ it may be determined if the test image was properly classified by the DNN. For instance, if the classification of the test image matches a target classification, the test image is properly classified”.  The examiner notes that Chen’s determining whether a plurality of synthetic samples (e.g., Chen’s “test images” in col. 6, ll. 52-55, supra) match the target classifications teaches the above limitation.)

Chen does not appear to explicitly teach the following claimed limitations: 
wherein the separate samples are distorted, based on inverting a portion of bits comprising the separate samples, to induce the particular machine learning model to misclassify the separate samples as different classes from among the plurality of classes; 
program instructions to accumulate a score of a number of results returned by the application programming interface that match expected class label assignments of the different classes for the plurality of synthetic samples; and 
program instructions, in response to the score exceeding a threshold, to verify that a service provided by the application programming interface is running the particular machine learning model.  
Lin, however, does teach: 
program instructions to accumulate a score of a number of results returned by the application programming interface that match expected class label assignments of the different classes for the plurality of synthetic samples; and (¶ [0003]: “a computer-implemented system”. ¶ [0006]: “Determining the updated accuracy score for a particular trained predictive model can include:” “adding the sum of correct predictive outputs to previously determined sums of correct predictive outputs that were determined when the initial training data and other training data sets in the series of training data sets were received to determine a total number of correct predictive outputs”.  The examiner further notes that Lin’s determining a total number of correct predictive outputs by adding the sum of correct predictive outputs to the previously determined sum, when combined with Chen’s teaching, supra, teaches the above claimed limitation.)

program instructions, in response to the score exceeding a threshold, to verify that a service provided by the application programming interface is running the particular machine learning model.  (¶ [0047]: “the top ranking trained model is chosen as the selected predictive model”; ¶ [0107]: “the new accuracy scores associated with the available trained predictive models can be compared, and the most accurate trained predictive model selected”. ¶ [0127]: “the trained predictive models with the top n accuracy scores are selected from among the total available predictive models”.  ¶ [0044]: “In some implementations, cross-validation is used to estimate the accuracy of each trained predictive model.” ¶ [0046]: “In some implementations, the predictive modeling server system 206 operates independently from the client computing system 202 and selects and provides the trained predictive model 218 as a specialized service.”  The examiner notes that Lin’s selecting models having the “top n scores,” “top ranking score,” or “the most accurate trained predictive model” teaches a cumulative score exceeding a threshold. The examiner further notes that Lin’s performing a cross-validation for each trained predictive model provided as a service teaches verifying that a service provided by the application programming interface is running the particular machine learning model.)
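The accumulate-then-compare flow recited in the two limitations above can be sketched as follows. This is only an illustration of the claim language under stated assumptions: `verify_service`, `fake_api`, the probe samples, and the label strings are all hypothetical, and the stand-in classifier is not taken from Chen or Lin.

```python
from typing import Callable, Sequence, Tuple


def verify_service(
    query_api: Callable[[bytes], str],
    probes: Sequence[Tuple[bytes, str]],
    threshold: int,
) -> bool:
    """Count API results that match the expected labels, then compare.

    `probes` pairs each synthetic sample with the (mis)classification label
    the particular model is expected to return for it.
    """
    score = 0
    for sample, expected_label in probes:
        if query_api(sample) == expected_label:
            score += 1  # accumulate one point per matching result
    # Verification succeeds only if the score exceeds the threshold.
    return score > threshold


def fake_api(sample: bytes) -> str:
    # Hypothetical stand-in for a remote classification service.
    return "stop_sign" if sample.startswith(b"\x01") else "unknown"


probes = [(b"\x01a", "stop_sign"), (b"\x01b", "stop_sign"), (b"\x02c", "stop_sign")]
result_low = verify_service(fake_api, probes, threshold=1)
result_high = verify_service(fake_api, probes, threshold=2)
print(result_low, result_high)
```

Two of the three probes match, so the service is verified at a threshold of 1 but not at a threshold of 2.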

	Chen and Lin are analogous art as both pertain to classification using machine learning.  
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to combine Chen’s querying with distorted, synthetic samples and determining whether the classification of distorted, synthetic samples matches expected class assignments (Chen, supra) with Lin’s accumulating a score of matched expected class label assignments and verifying a service is running a particular machine learning model (Lin, supra). The modification not only helps select the most accurate trained predictive model, which may change over time, but also determines the data samples that are the most information-rich to be retained in memory to address memory limitation (Lin, ¶ [0011]: “Accuracy scores can be determined that are reflective of more recently received data samples. As input data to be input into a trained predictive model to generate a predictive output changes over time, the accuracy of the trained predictive model may also change. Determining the accuracy score based on data samples that are representative of current input data can help to select the most accurate trained predictive model at a given time. Memory space can limit the volume of data samples that can be retained. Determining which data samples are the most information-rich can be useful in selecting a set of test data and/or training data to be used and/or retained in memory.”)
	Chen modified by Lin does not appear to explicitly teach: 
wherein the separate samples are distorted, based on inverting a portion of bits comprising the separate samples, to induce the particular machine learning model to misclassify the separate samples as different classes from among the plurality of classes; 

Goodfellow does, however, teach:
wherein the separate samples are distorted, based on inverting a portion of bits comprising the separate samples, to induce the particular machine learning model to misclassify the separate samples as different classes from among the plurality of classes; (Goodfellow at ¶ 2, § 3, p. 2: “For example, digital images often use only 8 bits per pixel”.  ¶ 1, p. 6: “As control experiments, we trained training a maxout network with noise based on randomly adding ±ε to each pixel, or adding noise in U(−ε, ε) to each pixel.” The examiner notes that Goodfellow’s adding ±ε to each pixel that uses 8 bits teaches inverting a portion of bits in a sample because a bit stores a binary value (e.g., 0 or 1), so changing a pixel’s value by adding ±ε changes the binary value (e.g., “0”) of at least one bit to the other binary value (e.g., “1”) and thus teaches inverting a portion of bits as claimed.)
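The bit-level reasoning in the examiner’s note above can be checked with a short sketch. It assumes that ε is a whole number of 8-bit intensity levels and that the perturbed value stays within 0 to 255; both are simplifying assumptions made here, and the function name `flipped_bits` is hypothetical rather than drawn from Goodfellow.

```python
def flipped_bits(pixel: int, delta: int) -> int:
    """Return a mask of the bits that differ after adding `delta` to `pixel`.

    Both the original and the perturbed value must fit in 8 bits; the
    sketch deliberately does not model clipping or real-valued epsilon.
    """
    perturbed = pixel + delta
    if not (0 <= pixel <= 255 and 0 <= perturbed <= 255):
        raise ValueError("pixel and perturbed value must both fit in 8 bits")
    # XOR leaves a 1 in every bit position whose value was inverted.
    return pixel ^ perturbed


one_bit = flipped_bits(0b10010110, 1)    # 150 -> 151 inverts only the lowest bit
all_bits = flipped_bits(0b01111111, 1)   # 127 -> 128 inverts all eight bits
print(bin(one_bit), bin(all_bits))
```

Under these assumptions any nonzero `delta` yields a nonzero mask, since two different integers differ in at least one bit of their binary representation.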
Chen, Lin, and Goodfellow are analogous art because all three references pertain to training neural networks with adversarial examples. 
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to have modified Chen in view of Lin to incorporate Goodfellow’s adversarial examples one or more bits of which are inverted with imperceptibly small perturbation (see Goodfellow, Abstract and ¶ 2, § 1, p. 1, supra). The modification addresses the shortcoming of conventional approaches that cause the activation to grow and thus “accidental steganography” by subtracting the penalty from the activation, rather than adding the penalty to the activation as in conventional approaches, so as to make confident enough predictions that the error in prediction has saturated at a minimum value (Goodfellow, p. 2, § 3, ¶ 4: “The adversarial perturbation causes the activation to grow by $\omega^{T}\eta$”; and p. 4, § 5, ¶ 2: “This is somewhat similar to L1 regularization. However, there are some important differences. Most significantly, the L1 penalty is subtracted off the model’s activation during training, rather than added to the training cost. This means that the penalty can eventually start to disappear if the model learns to make confident enough predictions that ζ saturates.”)

With respect to claim 10, it is substantially similar to claim 2 and is rejected in the same manner, the same art and reasoning applying.

With respect to claim 11, it is substantially similar to claim 3 and is rejected in the same manner, the same art and reasoning applying.

With respect to claim 12, it is substantially similar to claim 4 and is rejected in the same manner, the same art and reasoning applying.

With respect to claim 17, Chen teaches:
A computer program product comprises a computer readable storage medium having program instructions embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, the program instructions executable by a computer to cause the computer to: (Chen, col. 11, ll. 22-25: “computer architecture 800 may be representative, for example, of a computer system …”; col. 10, ll. 64-66: “storage medium 700 may store computer-executable instructions, such as computer-executable instructions to implement one or more logic flows or operations described herein”.)
query, by the computer, an application programming interface with a plurality of synthetic samples, the plurality of synthetic samples representing separate samples assigned original classes from among a plurality of classes classified by a particular machine learning model and (col. 4, ll. 26-28: “each piece of training data may be associated with a class (e.g. class 210-1, 210-2, 210-n or classes 210)”; col. 6, ll. 52-55: “altering one or more portions of a base image (e.g., sample image 212-n) to generate a test image; providing the test image to the trained DNN (e.g., trained DNN 304-1) for classification”; col. 9, ll. 58-59: “Continuing to block 516 “test image properly classified?” it may be determined if the test image was properly classified by the DNN. For instance, if the classification of the test image matches a target classification, the test image is properly classified”.  The examiner notes that Chen’s DNN teaches a particular machine learning model, that Chen’s sample images teach the claimed separate samples, that Chen’s classes (210) respectively teach the plurality of classes, that Chen’s target classification of the sample images teaches the claimed original classes, that Chen’s test images teach the plurality of synthetic samples, and that Chen thus teaches the above limitation in its entirety.)

wherein the separate samples are distorted to induce the particular machine learning model to misclassify the separate samples as different classes from among the plurality of classes; (col. 6, ll. 52-55: “altering one or more portions of a base image (e.g., sample image 212-n) to generate a test image; providing the test image to the trained DNN (e.g., trained DNN 304-1) for classification”; col. 9, ll. 60-62 and col. 6, l. 59: “the classification of the test image does not match the target classification”; col. 10, ll. 1-3: “when the test image is not properly classified, the test image maybe identified as an adversarial image”. The examiner further notes that Chen was previously cited to teach the claimed separate samples and the plurality of classes.  See citations, supra. The examiner further notes Chen’s test images teach a plurality of synthetic samples, and that Chen’s altering a portion of sample images into corresponding test images so that the classifications of the test images do not match the target classifications of the sample images teaches the above claimed limitation.)

accumulate, by the computer, a number of results returned by the application programming interface that match expected class label assignments of the different classes for the plurality of synthetic samples; and (col. 9, ll. 56-60: “516 ‘test image properly classified?’ it may be determined if the test image was properly classified by the DNN. For instance, if the classification of the test image matches a target classification, the test image is properly classified”.  The examiner notes that Chen’s determining whether a plurality of synthetic samples (e.g., Chen’s “test images” in col. 6, ll. 52-55, supra) match the target classifications teaches the above limitation.)
Chen does not appear to explicitly teach the following claimed limitations: 
wherein the separate samples are distorted, based on inverting a portion of bits comprising the separate samples, to induce the particular machine learning model to misclassify the separate samples as different classes from among the plurality of classes; 
accumulate, by the computer, a score of a number of results returned by the application programming interface that match expected class label assignments of the different classes for the plurality of synthetic samples; and 

Lin, however, does teach: 
accumulate, by the computer, a score of a number of results returned by the application programming interface that match expected class label assignments of the different classes for the plurality of synthetic samples; and (¶ [0003]: “a computer-implemented system”. ¶ [0006]: “Determining the updated accuracy score for a particular trained predictive model can include:” “adding the sum of correct predictive outputs to previously determined sums of correct predictive outputs that were determined when the initial training data and other training data sets in the series of training data sets were received to determine a total number of correct predictive outputs”.  The examiner further notes that Lin’s determining a total number of correct predictive outputs by adding the sum of correct predictive outputs to the previously determined sum, when combined with Chen’s teaching, supra, teaches the above claimed limitation.)
in response to the score exceeding a threshold, verify, by the computer, that a service provided by the application programming interface is running the particular machine learning model.  (¶ [0047]: “the top ranking trained model is chosen as the selected predictive model”; ¶ [0107]: “the new accuracy scores associated with the available trained predictive models can be compared, and the most accurate trained predictive model selected”. ¶ [0127]: “the trained predictive models with the top n accuracy scores are selected from among the total available predictive models”.  ¶ [0044]: “In some implementations, cross-validation is used to estimate the accuracy of each trained predictive model.” ¶ [0046]: “In some implementations, the predictive modeling server system 206 operates independently from the client computing system 202 and selects and provides the trained predictive model 218 as a specialized service.”  The examiner notes that Lin’s selecting models having the “top n scores,” “top ranking score,” or “the most accurate trained predictive model” teaches a cumulative score exceeding a threshold. The examiner further notes that Lin’s performing a cross-validation for each trained predictive model provided as a service teaches verifying that a service provided by the application programming interface is running the particular machine learning model.)
	Chen and Lin are analogous art as both pertain to classification using machine learning.  
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to combine Chen’s querying with distorted, synthetic samples and determining whether the classification of distorted, synthetic samples matches expected class assignments (Chen, supra) with Lin’s accumulating a score of matched expected class label assignments and verifying a service is running a particular machine learning model (Lin, supra). The modification not only helps select the most accurate trained predictive model, which may change over time, but also determines the data samples that are the most information-rich to be retained in memory to address memory limitation (Lin, ¶ [0011]: “Accuracy scores can be determined that are reflective of more recently received data samples. As input data to be input into a trained predictive model to generate a predictive output changes over time, the accuracy of the trained predictive model may also change. Determining the accuracy score based on data samples that are representative of current input data can help to select the most accurate trained predictive model at a given time. Memory space can limit the volume of data samples that can be retained. Determining which data samples are the most information-rich can be useful in selecting a set of test data and/or training data to be used and/or retained in memory.”)
	Chen modified by Lin does not appear to explicitly teach: 
wherein the separate samples are distorted, based on inverting a portion of bits comprising the separate samples, to induce the particular machine learning model to misclassify the separate samples as different classes from among the plurality of classes; 
Goodfellow does, however, teach:
wherein the separate samples are distorted, based on inverting a portion of bits comprising the separate samples, to induce the particular machine learning model to misclassify the separate samples as different classes from among the plurality of classes; (Goodfellow at ¶ 2, § 3, p. 2: “For example, digital images often use only 8 bits per pixel”.  ¶ 1, p. 6: “As control experiments, we trained training a maxout network with noise based on randomly adding ±ε to each pixel, or adding noise in U(−ε, ε) to each pixel.” The examiner notes that Goodfellow’s adding ±ε to each pixel that uses 8 bits teaches inverting a portion of bits in a sample because a bit stores a binary value (e.g., 0 or 1), so changing a pixel’s value by adding ±ε changes the binary value (e.g., “0”) of at least one bit to the other binary value (e.g., “1”) and thus teaches inverting a portion of bits as claimed.)
Chen, Lin, and Goodfellow are analogous art because all three references pertain to training neural networks with adversarial examples. 
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to have modified Chen in view of Lin to incorporate Goodfellow’s adversarial examples one or more bits of which are inverted with imperceptibly small perturbation (see Goodfellow, Abstract and ¶ 2, § 1, p. 1, supra). The modification addresses the shortcoming of conventional approaches that cause the activation to grow and thus “accidental steganography” by subtracting the penalty from the activation, rather than adding the penalty to the activation as in conventional approaches, so as to make confident enough predictions that the error in prediction has saturated at a minimum value (Goodfellow, p. 2, § 3, ¶ 4: “The adversarial perturbation causes the activation to grow by $\omega^{T}\eta$”; and p. 4, § 5, ¶ 2: “This is somewhat similar to L1 regularization. However, there are some important differences. Most significantly, the L1 penalty is subtracted off the model’s activation during training, rather than added to the training cost. This means that the penalty can eventually start to disappear if the model learns to make confident enough predictions that ζ saturates.”)

With respect to claim 18, it is substantially similar to claim 2 and is rejected in the same manner, the same art and reasoning applying.

With respect to claim 19, it is substantially similar to claim 3 and is rejected in the same manner, the same art and reasoning applying.

With respect to claim 20, it is substantially similar to claim 4 and is rejected in the same manner, the same art and reasoning applying.

Claims 5, 8, 13, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al., U.S. Patent Number 10839291 with the effective filing date of July 6, 2018 (hereinafter Chen) in view of Lin et al., U.S. Pat. App. Pub. No. 2012/0284212 published on November 8, 2012 (hereinafter Lin) and Goodfellow et al., Explaining and Harnessing Adversarial Examples (20 March 2015) (hereinafter Goodfellow) and further in view of Moore et al., U.S. Pat. App. Pub. No. 2011/0289027 (hereinafter Moore) and/or Reed et al., U.S. Pat. App. Pub. No. 2005/0192992 (hereinafter Reed).
With respect to claim 5, Chen modified by Lin teaches the method according to claim 1.  Lin was also cited to teach the threshold requiring the score to reach a level (see Lin at ¶ [0047] cited for claim 1, supra).  
Chen modified by Lin does not appear to explicitly teach: 
receiving, by the computer system, a selection from a user of a percentage probability of certainty requested; and 
dynamically adjusting, by the computer system, the threshold to a value that requires the score to reach a level of certainty that the service provided by the application programming interface is running the particular machine learning model reaches the percentage probability of certainty requested.  
Reed does, however, teach: 
receiving, by the computer system, a selection from a user of a percentage probability of certainty requested; and (Reed, ¶ [0007]: “systems and methods that respond to received data (e.g., email, voice, graphics …) based on intent of the data”; ¶ [0033]: “the intent can be provided as a binary indicator, a gray scale value, a percentage, confidence level, and/or a probability”.  The examiner notes that Reed’s allowing a user to specify a percentage and a probability and to manually define the threshold teaches this limitation.)

dynamically adjusting, by the computer system, the threshold to a value that requires the score to reach a level of certainty (Reed, ¶ [0033]: “the decision-making component 230 can utilize a threshold to compare with the intent”; and “[t]he threshold can be user defined, default and/or automatically set based on past user responses. In addition, the threshold can be manually and/or automatically adjusted in real-time (dynamically) to adapt to various users and/or circumstances”.  The examiner notes that Reed’s comparing a level of percentage probability of certainty to a threshold for decision making and Reed’s dynamically adjusting a threshold that corresponds to a probability, a percentage, a confidence, etc. teach this limitation.)
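One simple way to picture a threshold that tracks a user-requested percentage is sketched below. It assumes, purely for illustration, that the requested certainty is treated as the fraction of probe samples that must match; neither Reed nor the claims specify this mapping, and `threshold_for` and its parameters are hypothetical names.

```python
import math


def threshold_for(requested_certainty_pct: float, num_probes: int) -> int:
    """Map a requested certainty percentage to a required match count.

    Assumption: certainty is approximated by the share of matching probes.
    A statistically grounded mapping would need a model of the classifier.
    """
    if not 0 < requested_certainty_pct <= 100:
        raise ValueError("certainty must be in (0, 100]")
    if num_probes <= 0:
        raise ValueError("at least one probe is required")
    # Multiply before dividing so round percentages give exact results.
    return math.ceil(requested_certainty_pct * num_probes / 100)


# The threshold is recomputed whenever the user changes the requested value.
t_90 = threshold_for(90, 20)
t_95 = threshold_for(95, 20)
t_75 = threshold_for(75, 10)
print(t_90, t_95, t_75)
```

Recomputing the threshold on each new user selection is what makes the adjustment dynamic in this sketch.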

Chen, Lin, Goodfellow, and Reed are analogous art because all four references pertain to training neural networks with adversarial examples. 
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to have modified Chen in view of Lin and Goodfellow to further incorporate Reed’s dynamically adjusting a threshold to a value that requires the score to reach a level of percentage probability certainty received from a user (Reed, supra). The modification allows representing a user’s intent in any known format such as a percentage probability that is not only further processed by a decision-making component but also dynamically adjusted in order to adapt to various different users and/or circumstances (Reed, ¶ [0033]: “For example, the decision making component 230 can utilize a threshold to compare with the intent. The threshold can be user defined, default and/or automatically set based on past user responses. In addition, the threshold can be manually and/or automatically adjusted in real-time (dynamically) to adapt to various users and/or circumstances. Moreover, the threshold can be set based on inferences, predictions, probabilities, etc.”)
Lin was previously cited to teach verifying, by the computer system, that a service provided by the application programming interface is running the particular machine learning model (see Lin, ¶¶ [0044] and [0046] cited for claim 1, supra).  Reed was further cited to teach a threshold value that requires the score to reach a level of percentage probability certainty (see Reed, supra).
Chen modified by Lin, Goodfellow, and Reed does not appear to explicitly teach: 
a level of certainty that the service provided by the application programming interface is running the particular machine learning model reaches the percentage probability of certainty requested.  
Moore does, however, teach: 
(Moore, ¶ [0021]: “[l]earn[] the behavior of a legacy system” of “undocumented or poorly documented software components”; ¶ [0043]: “[t]hese rules, as discussed above, represent behavior [of the undocumented process] observed by monitor”; ¶ [0095]: “[a]n output 130 generated by legacy process 110 may then be compared to an expected output 130 predicted by an applicable rule”; and “[i]f the actual output is equivalent to a first predicted output by the rule, then the rule is legitimate for at least a correctly-formatted input. A further check of the same rule may entail submitting to legacy process 110 a purposely incorrectly formatted document and comparing a corresponding output 130 to a second predicted output. If the actual and second predicted outputs are equivalent, then the rule is legitimate at least for the particular incorrect format of the test document.”  
The examiner notes that Moore’s determining whether the behaviors of an undocumented or poorly documented process are legitimate by comparing expected outputs to predicted outputs of the process to determine whether these outputs are equivalent teaches a service is running the particular machine learning model, and that Moore’s determination of equivalence based on a level of confidence (see, e.g., ¶ [0043]) teaches a level of certainty. Therefore, the examiner asserts that Moore, when combined with Lin and Reed as delineated immediately above, teaches the above limitation.)
Chen, Lin, Goodfellow, Reed, and Moore are analogous art because each of these references pertains to training neural networks with adversarial examples. 
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to have modified Chen in view of Lin, Goodfellow, and Reed to further incorporate Moore’s verifying that the service provided by the application programming interface is running the particular machine learning model based on a percentage probability (Moore, supra). The modification learns the behavior of an undocumented or poorly documented software component and effectively models such a software component by a set of rules that represent the legitimacy of the software component’s behavior (Moore, ¶ [0043]: “Over a sufficient amount of time, the behavior of legacy process 110, or of a sub-process of legacy process 110, will be effectively modeled by the set of rules used by legacy interface/probe 140. These rules, as discussed above, represent behavior observed by monitor 155, checked by rule checker 150, and incorporated into the function of translator 145 and bypasser 160. Once a level of confidence is reached that the rules are acceptably accurate, legacy process 110, or one or more of its sub-processes, may be replaced with a legacy mimic 190.”)

With respect to claim 8, Chen modified by Lin and Goodfellow teaches the method according to claim 1, including verifying, by the computer system, by the threshold, that the service provided by the application programming interface is running the particular machine learning model (see citations and rationale, supra). 
Chen modified by Lin and Goodfellow teaches verifying, by the computer system, by a threshold, that the service provided by the application programming interface is running the particular machine learning model (see citations for claim 1, supra) but does not appear to explicitly teach: 
verifying, by the computer system, by a percentage of probability associated with the threshold, that the service provided by the application programming interface is running the particular machine learning model.  

Reed does, however, teach: 
verifying, by the computer system, by a percentage of probability associated with the threshold, (Reed, ¶ [0033]: “the intent can be provided as a binary indicator, a gray scale value, a percentage, confidence level, and/or a probability”; and “the decision-making component 230 can utilize a threshold to compare with the intent”.  The examiner further notes that Reed’s teaching of a threshold that may be expressed in various known formats such as a percentage probability teaches the above limitation.)  
Chen, Lin, Goodfellow, and Reed are analogous art because each of these references pertains to training neural networks with adversarial examples. 
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to have modified Chen in view of Lin and Goodfellow to further incorporate Reed’s dynamically adjusting a threshold to a value that requires the score to reach a level of percentage probability certainty received from a user (Reed, supra). The modification allows representing a user’s intent in any known format such as a percentage probability that is not only further processed by a decision-making component but also dynamically adjusted in order to adapt to various different users and/or circumstances (Reed, ¶ [0033]: “For example, the decision making component 230 can utilize a threshold to compare with the intent. The threshold can be user defined, default and/or automatically set based on past user responses. In addition, the threshold can be manually and/or automatically adjusted in real-time (dynamically) to adapt to various users and/or circumstances. Moreover, the threshold can be set based on inferences, predictions, probabilities, etc.”)

Chen modified by Lin, Goodfellow, and Reed does not appear to explicitly teach: 
verifying, by the computer system, by a confidence level, that the service provided by the application programming interface is running the particular machine learning model.  
Moore does, however, teach: 
verifying, by the computer system, by a confidence level, that the service provided by the application programming interface is running the particular machine learning model.  (Moore, ¶ [0021]: “[l]earn[] the behavior of a legacy system” of “undocumented or poorly documented software components”; ¶ [0043]: “[t]hese rules, as discussed above, represent behavior [of the undocumented process] observed by monitor 155, checked by rule checker 150, and incorporated into the function of translator 145 and bypasser 160. Once a level of confidence is reached that the rules are acceptably accurate, legacy process 110, or one or more of its sub-processes, may be replaced with a legacy mimic 190.”; ¶ [0095]: “[a]n output 130 generated by legacy process 110 may then be compared to an expected output 130 predicted by an applicable rule”; and “[i]f the actual output is equivalent to a first predicted output by the rule, then the rule is legitimate for at least a correctly-formatted input. A further check of the same rule may entail submitting to legacy process 110 a purposely incorrectly formatted document and comparing a corresponding output 130 to a second predicted output. If the actual and second predicted outputs are equivalent, then the rule is legitimate at least for the particular incorrect format of the test document.”  
The examiner notes that Moore’s determining whether the behaviors of an undocumented or poorly documented process are legitimate by comparing expected outputs to predicted outputs of the process to determine whether these outputs are equivalent teaches a service is running the particular machine learning model, and that Moore’s determination of equivalence based on a level of confidence (see, e.g., ¶ [0043]) teaches a level of certainty. Therefore, the examiner asserts that Moore, when combined with Lin and Reed as delineated immediately above, teaches the above limitation.)
Chen, Lin, Goodfellow, Reed, and Moore are analogous art because each of these references pertains to training neural networks with adversarial examples. 
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to have modified Chen in view of Lin, Goodfellow, and Reed to further incorporate Moore’s verifying that the service provided by the application programming interface is running the particular machine learning model based on a percentage probability (Moore, supra). The modification learns, to a level of confidence, the behavior of an undocumented or poorly documented software component and effectively models such a software component by a set of rules that represent the legitimacy of the software component’s behavior (Moore, ¶ [0043]: “Over a sufficient amount of time, the behavior of legacy process 110, or of a sub-process of legacy process 110, will be effectively modeled by the set of rules used by legacy interface/probe 140. These rules, as discussed above, represent behavior observed by monitor 155, checked by rule checker 150, and incorporated into the function of translator 145 and bypasser 160. Once a level of confidence is reached that the rules are acceptably accurate, legacy process 110, or one or more of its sub-processes, may be replaced with a legacy mimic 190.”)

With respect to claim 13, it is substantially similar to claim 5 and is rejected in the same manner, the same art and reasoning applying.

With respect to claim 16, it is substantially similar to claim 8 and is rejected in the same manner, the same art and reasoning applying.

Claims 6 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al., U.S. Patent Number 10839291 with the effective filing date of July 6, 2018 (hereinafter Chen) in view of Lin et al., U.S. Pat. App. Pub. No. 2012/0284212 published on November 8, 2012 (hereinafter Lin) and Goodfellow et al., Explaining and Harnessing Adversarial Examples (20 March 2015) (hereinafter Goodfellow) and further in view of Reed et al., U.S. Pat. App. Pub. No. 2005/0192992 (hereinafter Reed).
Chen modified by Lin teaches the method according to claim 1.
Chen further teaches: 
running, by the computer system, the plurality of synthetic samples on the plurality of additional machine learning models; (Abstract: “a deep neural network (DNN) training system that generates a hardened DNN by iteratively training DNNs with images that were misclassified by previous iterations of the DNN”.  The examiner notes that Chen’s running misclassified images through multiple deep neural networks (“DNNs”) teaches this limitation.)

for the plurality of additional machine learning models, accumulating, by the computer system, a separate number of results that match the expected class label assignment of the different classes for the plurality of synthetic samples; and (Abstract: “Some embodiments are particularly directed to a deep neural network (DNN) training system that generates a hardened DNN by iteratively training DNNs with images that were misclassified by previous iterations of the DNN.”  Col. 6, ll. 51-53: “Generally, the analytical attacks may include, for instance, altering one or more portions of a base image (e.g., sample image 212-n) to generate a test image”.  Col. 9, ll. 53-58: “Proceeding to block 515 “provide test image to DNN for classification” the test image may be provided to the DNN for classification. Continuing to block 516 “test image properly classified?” it may be determined if the test image was properly classified by the DNN. For instance, if the classification of the test image matches a target classification, the test image is properly classified.”  The examiner notes that Chen’s running test images, which are misclassified in a previous iteration and are altered, through multiple deep neural networks (DNNs) and at least some of the classification of the test image matches the target classification teaches the above limitation.)

Chen does not appear to explicitly teach: 
creating, by the computer system, a cohort set of a plurality of additional machine learning models of one or more configurations that classify the same plurality of classes as the particular machine learning model;
for the plurality of additional machine learning models, accumulating, by the computer system, a separate score of a separate number of results that match the expected class label assignment of the different classes for the plurality of synthetic samples; and
applying, by the computer system, one or more calibration rules to separate score to calibrate the threshold to assess the likelihood that the service provided by the application programming interface is running the particular machine learning model.

Lin does, however, teach: 
creating, by the computer system, a cohort set of a plurality of additional machine learning models of one or more configurations that classify the same plurality of classes as the particular machine learning model; (¶ [0003]: “a computer-implemented system”.  ¶ [0008]: “a repository of trained predictive models that were all trained using the same initial training data set”; and “[e]ach of the trained predictive models in the repository can be tested for accuracy using the set of test data”.  ¶ [0042]: “For a given training function, multiple different hyper-parameter configurations can be applied to the training function, again generating multiple different trained predictive models.”  The examiner notes that Lin’s generating multiple different models with different hyperparameters teaches creating a cohort set of a plurality of additional machine learning models. The examiner further notes that Lin’s embodiment in which all the models in the repository are trained with the same initial training data set teaches that the cohort set classifies the same plurality of classes as the particular machine learning model because the same training data set has the same expected classes.)

for the plurality of additional machine learning models, accumulating, by the computer system, a separate score of a separate number of results that match the expected class label assignment of the different class for each of the plurality of predictive outputs; and (¶ [0003]: “multiple updateable trained predictive models which are each associated with an accuracy score that represents an estimation of the accuracy of the trained predictive model.” ¶ [0006]: “adding the sum of correct predictive outputs to previously determined sums of correct predictive outputs that were determined when the initial training data and other training data sets in the series of training data sets were received to determine a total number of correct predictive outputs”.  The examiner notes that Lin’s adding the sum of correct predictive outputs for a predictive model to update its score for each predictive model teaches the above limitation.)

applying, by the computer system, one or more calibration rules to separate score to assess that the service provided by the application programming interface is running the particular machine learning model. (FIG. 6, reference numeral 608 and Abstract: “A system includes a computer(s) coupled to a data storage device(s) that stores a training function repository and a predictive model repository that includes includes [sic] updateable trained predictive models each associated with an accuracy score” and “[b]ased on the comparison and previous comparisons determined from the initial training data and from previously received training data sets, an updated accuracy score for each predictive model is determined.”  ¶ [0044]: “cross-validation is used to estimate the accuracy of each trained predictive model”. ¶ [0046]: “the predictive modeling server system 206 operates independently from the client computing system 202 and selects and provides the trained predictive model 218 as a specialized service.”  ¶ [0047]: “In some implementations, the trained models are ranked based on the value of their respective scores, and the top ranking trained model is chosen as the selected predictive model.” ¶ [0107]: “the new accuracy scores associated with the available trained predictive models”; and ¶ [0127]: “the trained predictive models with the top n accuracy scores are selected from among the total available predictive models”.  
The examiner notes that Lin’s selecting models having the “top n scores,” “top ranking score,” or “the most accurate trained predictive model” teaches a threshold.  The examiner further notes that Lin’s updating the accuracy score for each predictive model based on comparison and previous comparisons teaches applying one or more rules to each separate score. The examiner also notes that Lin’s performing a cross-validation for each trained predictive model provided as a service teaches assessing the likelihood that the service is running a correctly selected and hence a particular machine learning model (e.g., a model being among the top-ranking models).)
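
For illustration only, the accumulation and ranking described in the cited passages of Lin can be sketched as follows. This is a minimal sketch under stated assumptions, not Lin's implementation: the function names, the model names, and the sample labels are all hypothetical, and the score is taken to be a simple count of predictions that match the expected class labels.

```python
from collections import Counter


def accumulate_scores(predictions_by_model, expected_labels):
    """Count, per model, how many predictions match the expected labels."""
    scores = Counter()
    for model_name, predictions in predictions_by_model.items():
        scores[model_name] = sum(
            1
            for predicted, expected in zip(predictions, expected_labels)
            if predicted == expected
        )
    return scores


def top_n_models(scores, n):
    """Return the n model names with the highest accumulated scores."""
    # Sort by descending score; break ties by name so the order is stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:n]]


# Hypothetical cohort of three models scored on four labeled samples.
expected = ["cat", "dog", "cat", "bird"]
cohort_predictions = {
    "model_a": ["cat", "dog", "cat", "bird"],
    "model_b": ["cat", "cat", "cat", "bird"],
    "model_c": ["dog", "cat", "dog", "bird"],
}
cohort_scores = accumulate_scores(cohort_predictions, expected)
best = top_n_models(cohort_scores, 2)
```

Each model keeps its own separate count, and the "top n" selection is the point at which a cutoff on those counts is applied.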

	Chen and Lin are analogous art as both pertain to classification using machine learning.  
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to combine Chen’s querying with distorted, synthetic samples and determining whether the classification of distorted, synthetic samples matches expected class assignments (Chen, supra) with Lin’s accumulating scores of additional machine learning models that are created as the particular machine learning model and applying calibration rule(s) to such scores (Lin, supra). The modification not only provides multiple predictive models for selection of the top-ranking, most accurate trained model based on the respective accuracy scores of these models but also helps select, based on an up-to-date accuracy score, a trained predictive model that may change over time (Lin, ¶ [0011], “Accuracy scores can be determined that are reflective of more recently received data samples. As input data to be input into a trained predictive model to generate a predictive output changes over time, the accuracy of the trained predictive model may also change. Determining the accuracy score based on data samples that are representative of current input data can help to select the most accurate trained predictive model at a given time. Memory space can limit the volume of data samples that can be retained. Determining which data samples are the most information-rich can be useful in selecting a set of test data and/or training data to be used and/or retained in memory.” ¶ [0047]: “In some implementations, the trained models are ranked based on the value of their respective scores, and the top ranking trained model is chosen as the selected predictive model.”)

Lin teaches applying, by the computer system, one or more calibration rules to each separate score to assess that the service provided by the application programming interface is running the particular machine learning model. Chen modified by Lin and Goodfellow does not, however, appear to explicitly teach: 
applying, by the computer system, one or more calibration rules to each separate score to calibrate the threshold to assess the likelihood.  
Reed does, however, teach: 
applying, by the computer system, one or more calibration rules to each separate score to calibrate the threshold to assess the likelihood.  (¶ [0008]: “The system includes a data manager that can employ various techniques to determine an associated intent of the data” and “the data manager utilizes information such as metadata, properties, content, context, keywords, history, heuristics, inferences, rules, demarcations, extrinsic information such the source of the data, the time of day and/or day of week the data was transmitted and/or received, cost/benefit of handling the data, etc. to group data into one or more sets of data with similar characteristics”; ¶ [0028]: “The data manager 110, upon receiving data via the interface component 120, can employ various techniques to determine an associated intent of the data”; ¶ [0033]: “The decision-making component 230 can determine whether the intent warrants a response. For example, the decision-making component 230 can utilize a threshold to compare with the intent. The threshold can be user defined, default and/or automatically set based on past user responses. In addition, the threshold can be manually and/or automatically adjusted in real-time (dynamically) to adapt to various users and/or circumstances. Moreover, the threshold can be set based on inferences, predictions, probabilities, etc.”   
The examiner notes that Reed’s applying rules to input information such as inferences and demarcations (e.g., each separate score of a separate number of results) to determine intent that is then compared to a dynamically and/or manually adjustable threshold to determine whether an action by Reed’s classification model is warranted teaches applying calibration rule(s) to input (e.g., each separate score).  The examiner further notes that Reed’s manually, dynamically, or in real-time adjusting a threshold based on inferences, predictions, probabilities, etc. teaches calibrating the threshold to assess probability and hence likelihood.  Therefore, Reed teaches the above limitation.)
Chen, Lin, Goodfellow, and Reed are analogous art because each of these references pertains to training neural networks with adversarial examples. 
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to have modified Chen in view of Lin and Goodfellow to further incorporate Reed’s calibrating a threshold by applying rule(s) to inference results such as accuracy scores to assess the likelihood (e.g., probability) pertaining to inferences or predictions (Reed, supra). The modification adjusts a threshold value upon which inferences or predictions are based and allows for adaptation to different circumstances (Reed, ¶ [0033]: “For example, the decision making component 230 can utilize a threshold to compare with the intent. The threshold can be user defined, default and/or automatically set based on past user responses. In addition, the threshold can be manually and/or automatically adjusted in real-time (dynamically) to adapt to various users and/or circumstances. Moreover, the threshold can be set based on inferences, predictions, probabilities, etc.”)

With respect to claim 14, it is substantially similar to claim 6 and is rejected in the same manner, the same art and reasoning applying.

Claims 7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al., U.S. Patent Number 10839291 with the effective filing date of July 6, 2018 (hereinafter Chen) in view of Lin et al., U.S. Pat. App. Pub. No. 2012/0284212 published on November 8, 2012 (hereinafter Lin) and Goodfellow et al., Explaining and Harnessing Adversarial Examples (20 March 2015) (hereinafter Goodfellow) and further in view of Hu et al., Generating Adversarial Malware Examples for Black-Box Attacks Based on GAN (February 20, 2017) (hereinafter Hu).
Chen modified by Lin and Goodfellow teaches the method according to claim 1, and Chen further teaches: 
wherein querying, by the computer system, an application programming interface with a plurality of synthetic samples, (Chen, FIG. 8: “computing architecture 800” having “processing unit 804, a system memory 806 and a system bus 808.”  Col. 15, ll. 8-24: “[v]arious embodiments may be implemented using hardware elements, software elements, or a combination of both”; and “Examples of software may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API)”.  Col. 10, ll. 9-16: “At block 524, the adversarial images may be provided as an input to block 506 ‘determine current training set’. Accordingly, the current training set may be updated with the adversarial images”.  The examiner notes that Chen’s computer architecture provides adversarial images via software implemented with an API to a deep neural network (DNN) for classification (or misclassification) of the adversarial images teaches the limitation.)

the plurality of synthetic samples representing separate samples assigned an original class from a plurality of classes classified by a particular machine learning model and (Chen, col. 4, ll. 26-28: “each piece of training data may be associated with a class (e.g. class 210-1, 210-2, 210-n or classes 210)”; col. 6, ll. 52-55: “altering one or more portions of a base image (e.g., sample image 212-n) to generate a test image; providing the test image to the trained DNN (e.g., trained DNN 304-1) for classification”; col. 9, ll. 58-59: “Continuing to block 516 “test image properly classified?” it may be determined if the test image was properly classified by the DNN. For instance, if the classification of the test image matches a target classification, the test image is properly classified”.  The examiner notes that Chen’s DNN teaches a particular machine learning model, that Chen’s sample images teach the claimed separate samples, that Chen’s classes (210) respectively teach the plurality of classes, that Chen’s target classification of the sample images teaches the claimed original classes, that Chen’s test images teach the plurality of synthetic samples, and that Chen thus teaches the above limitation in its entirety.)

distorted to induce the particular machine learning model to misclassify the separate sample as a different class from among the plurality of classes. (Chen, col. 6, ll. 52-55: “altering one or more portions of a base image (e.g., sample image 212-n) to generate a test image; providing the test image to the trained DNN (e.g., trained DNN 304-1) for classification”; col. 9, ll. 60-62 and col. 6, l. 59: “the classification of the test image does not match the target classification”; col. 10, ll. 1-3: “when the test image is not properly classified, the test image may be identified as an adversarial image”. The examiner further notes that Chen was previously cited to teach the claimed separate samples and the plurality of classes.  See citations, supra. The examiner further notes that Chen’s test images teach a plurality of synthetic samples, and that Chen’s altering a portion of sample images into corresponding test images so that the classifications of the test images do not match the target classifications of the sample images teaches the above claimed limitation.)
querying, by the computer system, the application programming interface with the plurality of synthetic samples (col. 7, ll. 1-5: “embodiments, causative image generator 402 may carry out a causative attack as follows. Given a valid input x (e.g., sample image 212-1), find a similar x' such that the classification output C(x) ≠ C(x'), but x and x' are close according to some distance metric.”   The examiner notes that Chen’s misclassifying a test image (x’ above) that is provided as “a valid input” to its DNN teaches that the test image (x’ above) is classified as a different classification output (C(x’) above) and is thus not detectable as test inputs as claimed.)
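
For illustration only, the condition quoted from Chen (a nearby input x' for which C(x) ≠ C(x')) can be sketched with a toy classifier. This is a minimal sketch under stated assumptions and is not Chen's image generator: the linear classifier, the step size, the L-infinity distance metric, and all names below are hypothetical stand-ins.

```python
def classify(x):
    """Toy stand-in for C: label a 2-D point by the sign of a linear score."""
    weights = (1.0, -1.0)
    score = sum(w * v for w, v in zip(weights, x))
    return 1 if score >= 0 else 0


def l_inf_distance(a, b):
    """Largest per-coordinate difference between two points."""
    return max(abs(p - q) for p, q in zip(a, b))


def find_adversarial(x, step=0.125, max_steps=100):
    """Nudge x in small steps until the toy classifier's label flips."""
    original = classify(x)
    # Move against the weight vector if the label is 1, along it otherwise.
    direction = (-1.0, 1.0) if original == 1 else (1.0, -1.0)
    candidate = list(x)
    for _ in range(max_steps):
        candidate = [v + step * d for v, d in zip(candidate, direction)]
        if classify(candidate) != original:
            return tuple(candidate)
    return None  # no label flip found within the step budget


x = (0.75, 0.5)
x_adv = find_adversarial(x)
```

Here `x_adv` differs from `x` by at most 0.25 in any coordinate, yet the toy classifier assigns the two points different labels, which is the relationship the quoted passage describes.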
Chen modified by Lin and Goodfellow does not appear to explicitly teach: 
querying, by the computer system, the application programming interface with the plurality samples as normal, valid inputs to the application programming interface that are not detectable by the application programming interface as test inputs to verify an identity of the particular machine learning model deployed and running behind the application programming interface.  

Hu teaches this claimed limitation of claim 7 in its entirety: 
querying, by the computer system, the application programming interface with the plurality samples (Hu, § 2.1, ¶ 2: “In this paper we only generate adversarial examples for binary features, because binary features are widely used by malware detection researchers and are able to result in high detection accuracy. Here we take API feature as an example to show how to represent a program. If M APIs are used as features, an M-dimensional feature vector is constructed for a program. If the program calls the d-th API, the d-th value is set to 1, otherwise it is set to 0.”  § 2.2, ¶ 1: “The generator is used to transform a malware feature vector into its adversarial version. It takes the concatenation of a malware feature vector m and a noise vector z as input.” The examiner notes that Hu’s querying an API with adversarial examples obtained from a malware and a noise vector teaches the above limitation.)
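
For illustration only, the M-dimensional binary feature vector described in the quoted passage of Hu can be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the feature list, and the call trace are hypothetical placeholders and are not taken from Hu.

```python
def api_feature_vector(called_apis, feature_apis):
    """Build an M-dimensional 0/1 vector: entry d is 1 if the d-th API is called."""
    called = set(called_apis)
    return [1 if api in called else 0 for api in feature_apis]


# Hypothetical feature list (M = 4) and a hypothetical program call trace.
FEATURE_APIS = ["CreateFile", "WriteFile", "RegSetValue", "Connect"]
program_calls = ["WriteFile", "Connect", "WriteFile"]
vector = api_feature_vector(program_calls, FEATURE_APIS)
```

Repeated calls to the same API do not change the vector, since each entry records only whether that API is called at all.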

the plurality of samples as normal, valid inputs to the application programming interface that are not detectable by the application programming interface as test inputs (Hu, § 1, Last paragraph: “Experimental results show that almost all of the adversarial examples generated by MalGAN successfully bypass the detection algorithms and MalGAN is very flexible to fool further defensive methods of detection algorithms.”  The examiner notes that the adversarial examples that successfully bypass the malware detection algorithms are normal, valid inputs that are not detectable as test inputs to the API because these adversarial examples successfully bypass the malware detection algorithms. Therefore, these adversarial examples are not classified as malware and hence are not detectable by the API as test inputs.)

(Abstract: “This paper proposes a generative adversarial network (GAN) based algorithm named MalGAN to generate adversarial malware examples, which are able to bypass black-box machine learning based detection models. MalGAN uses a substitute detector to fit the black-box malware detection system.” § 1, ¶¶ 7-8: “It is hard for malware authors to know which classifier a malware detection system uses and the parameters of the classifier.  However, it is possible to figure out what features a malware detection algorithm uses by feeding some carefully designed test cases to the black-box algorithm.”  § 1, last paragraph: “A substitute detector is trained to fit the black-box malware detection algorithm, and a generative network is used to transform malware samples into adversarial examples. Experimental results show that almost all of the adversarial examples generated by MalGAN successfully bypass the detection algorithms and MalGAN is very flexible to fool further defensive methods of detection algorithms.”  
The examiner notes that Hu’s determining a substitute model that fits the target black-box neural network, where the substitute model is used to generate adversarial examples almost 100% of which successfully bypass detection (or classification) as malware by the black-box neural network, teaches that the substitute model fits the target black-box neural network and thus verifies the identity of the target black-box neural network.)
Chen, Lin, Goodfellow, and Hu are analogous art because each of these references pertains to training neural networks with adversarial examples. 
It would have been obvious for a person of ordinary skill in the art prior to the effective filing date to have modified Chen in view of Lin and Goodfellow to further incorporate Hu’s querying an API with samples that are not detectable by the API as test inputs to verify the identity of a particular model running behind the API (Hu, supra). The modification learns a substitute detection module that not only fits the detector module of the black-box neural network but also allows generated adversarial examples to bypass the substitute detection module in order to train the black-box neural network and to minimize the generated adversarial examples’ malicious probabilities of maliciously attacking the black-box neural network (Hu, Abstract: “This paper proposes a generative adversarial network (GAN) based algorithm named MalGAN to generate adversarial malware examples, which are able to bypass black-box machine learning based detection models. MalGAN uses a substitute detector to fit the black-box malware detection system. A generative network is trained to minimize the generated adversarial examples’ malicious probabilities predicted by the substitute detector.”)

With respect to claim 15, it is substantially similar to claim 7 and is rejected in the same manner, the same art and reasoning applying.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERICH C. TZOU whose telephone number is (571)272-9852. The examiner can normally be reached Monday-Friday 6:00AM-5:30PM PST with alternate Fridays off.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann J. Lo, can be reached at 571-272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-


/E.C.T./Examiner, Art Unit 2126 
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126