Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Detailed Action
Claims 1-10 and 13-20 are pending and are being considered.
Claims 1 and 10 have been amended.
Claims 11-12 have been cancelled.
The rejection under 35 U.S.C. 112 is withdrawn in view of the amendments.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 04/06/2022 was filed after the mailing date of the application no. 17/499353 on 10/12/2021.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Response to Arguments under 35 U.S.C. 103
	Applicant’s arguments filed on 04/06/2022 have been fully considered but are not persuasive.
	Arguments regarding claims 1 and 15:
In response to applicant’s argument in the last paragraph of page 7 of the remarks that Sharma (i.e. the primary reference) fails to teach the limitation “selecting datapoints from the training dataset that are predicted correctly by a source model as a group of adversarial candidates”:
The applicant argues that Sharma does not teach selecting datapoints that are predicted correctly by the source model. The examiner acknowledges applicant’s point of view but respectfully disagrees, because Sharma on [0037] teaches a machine learning model (i.e. equivalent to the source model) taking training data (i.e. equivalent to the training dataset) for determining or predicting an output based on input data (i.e. the input data is equivalent to the datapoints, which are selected for predicting an output by the machine learning model). See also Fig. 3 and the text on [0117-0119], which teach that the privacy attack 314 may provide the first input 328a to the machine learning model 308 and may provide the second input 328b to the machine learning model 308 (i.e. the first and second inputs are selected as datapoints for the machine learning model). Furthermore, the specification of the instant application on [0042] discloses that the source model M takes a datapoint as an input to predict an output, just as the machine learning model of Sharma takes inputs 328a and 328b for predicting outputs 330a and 330b, respectively, as shown in Fig. 3 of Sharma.
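For illustration only, the claimed selection step discussed above can be sketched in Python. This is a minimal sketch under stated assumptions and is not taken from Sharma or from the instant application: the function `select_correctly_predicted`, the toy model, and the (features, label) data layout are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical layout: each datapoint is (features, true label).
Datapoint = Tuple[Sequence[float], int]


def select_correctly_predicted(
    dataset: List[Datapoint],
    source_model: Callable[[Sequence[float]], int],
) -> List[Datapoint]:
    """Keep only the datapoints whose true label the source model predicts."""
    return [(x, y) for x, y in dataset if source_model(x) == y]


# Toy stand-in for a source model (assumption): class 1 when the first feature is positive.
def toy_source_model(x: Sequence[float]) -> int:
    return 1 if x[0] > 0 else 0


training_dataset = [([0.9], 1), ([-0.4], 0), ([0.2], 0), ([-0.7], 1)]
candidates = select_correctly_predicted(training_dataset, toy_source_model)
```

In this toy run, only the first two datapoints are kept, since the toy model mislabels the other two.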
In response to applicant’s argument on page 8 1st para that Sharma fails to teach the limitation 
“adding noise to each candidate of the sub-group of adversarial candidates to yield a noisy group of adversarial examples” 
	The applicant argues that Sharma only teaches adding noise to a model at the input level, which is not equivalent to what is claimed, namely “adding noise to each candidate of the sub-group of adversarial candidates”. The examiner acknowledges applicant’s point of view but respectfully disagrees, because Sharma on [0043] teaches that noise can be added to a model at the input level (such as by adding randomized inputs to the training data). This portion of Sharma explicitly teaches that adding noise to a model is equivalent to adding randomized inputs to the training data. Hence the noise is added as randomized input to the training data, and the input data 328a is selected from the training data as a group of adversarial candidates as explained above; therefore the noise is added to each input selected as part of the group of adversarial candidates. Similarly, the instant application on [0051-0053] discloses adding random noise to an image, and the source model takes the image as input; therefore random noise is added to the model at the input level, on the input it takes. Further, Sharma on [0066] teaches that noise 110 can be added at an input level of the causal model 108 (such as by adding randomized inputs to the training data 102). See also [0069], which teaches sending a first test input and a second test input to the causal model with noise 112 (i.e. it also teaches noise added to the first and second inputs).
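As a purely illustrative aid, adding noise to each candidate at the input level might look like the following Python sketch. It is an assumption-laden example, not the method of Sharma or of the instant application: the function `add_noise`, the uniform noise distribution, and the `scale` and `seed` parameters are hypothetical choices.

```python
import random
from typing import List, Sequence


def add_noise(
    candidates: List[Sequence[float]], scale: float, seed: int = 0
) -> List[List[float]]:
    """Return a noisy copy of every candidate.

    Each feature receives independent uniform noise in [-scale, scale].
    The uniform distribution is an assumption made for this sketch.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [[v + rng.uniform(-scale, scale) for v in x] for x in candidates]


sub_group = [[0.5, 0.1], [0.3, 0.8]]
noisy_examples = add_noise(sub_group, scale=0.01)
```

Every candidate in the sub-group yields exactly one noisy example, and the originals are left unmodified.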
	In response to applicant’s argument in the last paragraph of page 8 that Sharma fails to teach the limitation
“testing the set of source model output against the reference model to yield a set of reference model outputs; testing the set of source model output against the surrogate model to yield a set of surrogate model outputs”: in particular, the applicant argues that “the first input and the second input” that are tested against the reference model and the surrogate model are not the same as “testing the set of source model output”, because the set of source model outputs is obtained by testing the noisy group of adversarial examples that the source model predicts correctly. The examiner acknowledges applicant’s point of view but respectfully disagrees. First, the examiner did not cite Sharma to teach
“testing the noisy group of adversarial examples against the source model to obtain a set of source model outputs that the source model predicts correctly”, because Sharma does not teach “testing the noisy group of adversarial examples against the source model”. In other words, Sharma teaches a set of source model outputs to be tested, but does not teach that this set of source model outputs is obtained by testing the noisy group of adversarial examples against the source model. Therefore the examiner relied upon XIE (i.e. the third reference) to teach the above limitation. XIE on [0023] teaches that the system first performs DNN perception modeling 380 via a series of probes 375 to a target DNN 345. The probes 375 are performed offline prior to performing any real-time image compression on images received at the edge gateway device. The system injects a small amount of noise into a test image to obtain an adversarial example (i.e. the noisy group of adversarial examples in this case), transmits the adversarial example in a probe message to a target DNN 345 (i.e. the source model in this case), and checks the response from the target DNN 345 to determine whether DNN 345 can detect objects in the adversarial example with the desired DNN inference accuracy (i.e. the source model output in this case).
	Returning to Sharma to determine whether it teaches the argued limitation of testing the set of source model outputs against the reference model and the surrogate model: Sharma on [0049] teaches that a causal model (i.e. the reference model) may be successful on roughly 50% of the test inputs (i.e. the first input and the second input as the set of source model outputs), whereas the same attack on a correlational model (i.e. the surrogate model) may be successful for 80% of the test inputs. Sharma further on [0069] teaches that one example of a membership inference attack may involve sending a first test input and a second test input (i.e. the set of source model outputs) to the causal model with noise 112. The one or more outputs 116 may include a first output responsive to the first test input and a second output responsive to the second test input. The first output may include a first confidence value, and the second output may include a second confidence value. An adversary may attempt to infer whether the first test input is a member of the training data 102 based on a difference in the first confidence value and the second confidence value.
The applicant argues that the test inputs of Sharma are not the same as the set of source model outputs of the claim, without giving any reasonable explanation of why the test inputs of Sharma are not equivalent to the set of source model outputs of the claim. The set of source model outputs is broadly interpreted as test inputs against the reference model and the surrogate model. A detailed explanation of how the set of source model outputs is obtained is given above.
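For illustration, the membership inference attack that Sharma describes on [0069] (inferring membership from a difference between two confidence values) can be sketched as follows. This is a hedged sketch only: the function `infer_membership` and the `margin` parameter are hypothetical, and Sharma does not specify a particular decision rule.

```python
def infer_membership(
    first_confidence: float, second_confidence: float, margin: float = 0.1
) -> bool:
    """Guess that the first test input was a training member.

    Assumption for this sketch: membership is inferred when the first
    confidence exceeds the second by more than `margin`.
    """
    return (first_confidence - second_confidence) > margin


# A large confidence gap suggests the first input was seen during training.
likely_member = infer_membership(0.97, 0.62)
# A small gap gives the adversary little to go on.
unlikely_member = infer_membership(0.71, 0.69)
```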
	In response to applicant’s argument in the last paragraph of page 9 of the remarks that KRUTHIVETI fails to teach “selecting, from the group of adversarial candidates, a sub-group of candidates that each have a low confidence score according to a threshold to yield a sub-group of adversarial candidates”: more specifically, the applicant argues that KRUTHIVETI teaches selecting an “original input” as an adversarial candidate based on a low confidence score. The examiner acknowledges applicant’s point of view but respectfully disagrees, because KRUTHIVETI explicitly teaches the argued limitation on [0050-0054], as cited in the previous action: the ML system 200 determines whether the adversarial score output by the adversarial detection module 210 satisfies a predetermined threshold for raising an adversarial flag (i.e. equivalent to the selecting argued by the applicant). If the ML system 200 determines that the adversarial score satisfies the threshold, then at step 510 the ML system 200 flags the original input data as adversarial (i.e. selecting the original input data as adversarial based on the score satisfying the threshold).
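The threshold-based selection at issue can be illustrated with a short Python sketch. It is offered under stated assumptions and is not taken from KRUTHIVETI or the instant application: the function `select_low_confidence`, the candidate names, and the choice of a strict "less than" comparison are hypothetical.

```python
from typing import List, Tuple


def select_low_confidence(
    scored: List[Tuple[str, float]], threshold: float
) -> List[str]:
    """Return the candidates whose confidence score falls below the threshold.

    Treating "low confidence" as strictly below the threshold is an
    assumption made for this sketch.
    """
    return [name for name, confidence in scored if confidence < threshold]


# Hypothetical (candidate, confidence score) pairs.
scored_candidates = [("x1", 0.95), ("x2", 0.55), ("x3", 0.61), ("x4", 0.99)]
sub_group = select_low_confidence(scored_candidates, threshold=0.7)
```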
	In response to applicant’s argument in the second-to-last paragraph of page 10 that KRUTHIVETI fails to teach “and identifying a set of fingerprints based on which ones from the set of source model outputs, the set of reference model outputs and the set of surrogate model outputs pass as adversarial examples against the source model and the surrogate model, but not the reference model”: the applicant argues that the reference fails to teach a fingerprint passing as adversarial against two models but not the third model. KRUTHIVETI on Fig. 2 and the text on [0029-0031] teaches an input perturbation as a “fingerprint”, and that the detection model 212 may include a neural fingerprinting ML model (i.e. the source model) that is fed versions of the input data x perturbed with predefined random perturbations, as well as the output i of the ML model 204, and in such a case the detection model 212 may output an adversarial score indicating whether output perturbations generated for the perturbed input data by the neural fingerprinting ML model match expected output perturbations for the output i of the ML model 204 (i.e. passing as an adversarial example against the source model if the score is greater than the threshold, in view of [0028]). In other embodiments, the detection model 212 may include a surrogate ML model that takes the same input as the ML model 204 and is used to extract features that are compared with an expected feature distribution for the output i of the ML model 204 to determine an adversarial score indicating whether the extracted features match the expected feature distribution (i.e. the surrogate model classifying or passing the input as adversarial if it meets the threshold, as shown in Fig. 2).
Furthermore, KRUTHIVETI on [0050-0051] teaches the adversarial detection module 210, which includes the fingerprinting ML module and the surrogate module as explained above with respect to [0029]: the ML system 200 determines whether the adversarial score output by the adversarial detection module 210 satisfies a predetermined threshold for raising an adversarial flag. If the ML system 200 determines that the adversarial score satisfies the threshold, then at step 510 the ML system 200 flags the original input data as adversarial (i.e. the adversarial detection module, containing the fingerprinting ML module and the surrogate module, passes the input as adversarial if it meets the threshold). Fig. 2 and the text on [0025] teach that the ML system 200 includes a ML model 204 (i.e. the reference model, which produces an output 206 when the score is less than the threshold score and does not flag the input as adversarial) that receives an input 202, denoted by x, and produces an output 206, denoted by i.
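For clarity of the record, the claimed identification criterion (an example passes as adversarial against the source model and the surrogate model, but not against the reference model) can be sketched in Python. This is an illustrative sketch only; the function `identify_fingerprints` and the per-model boolean results are hypothetical and do not come from KRUTHIVETI or the instant application.

```python
from typing import Dict, List


def identify_fingerprints(passes: Dict[str, Dict[str, bool]]) -> List[str]:
    """Keep examples that pass against source and surrogate but not reference.

    `passes[example][model]` is True when the example passes as an
    adversarial example against that model (hypothetical bookkeeping).
    """
    return [
        example
        for example, result in passes.items()
        if result["source"] and result["surrogate"] and not result["reference"]
    ]


# Hypothetical per-model outcomes for three noisy examples.
outcomes = {
    "e1": {"source": True, "surrogate": True, "reference": False},
    "e2": {"source": True, "surrogate": True, "reference": True},
    "e3": {"source": True, "surrogate": False, "reference": False},
}
fingerprints = identify_fingerprints(outcomes)
```

Only `e1` satisfies all three conditions: `e2` also passes against the reference model, and `e3` fails against the surrogate model.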
Arguments regarding claim 10:
	In response to applicant’s argument in the last paragraph of page 11 of the remarks that Rouhani (i.e. the cited reference) fails to teach wherein the fingerprint passes against the source model and a surrogate model, but not a reference model: the examiner acknowledges applicant’s point of view but respectfully disagrees, because Rouhani on [0053] teaches that the first and second digital watermarks serve as fingerprints, and on [0052] teaches that the misuse detection system 300 may include a detection engine 330 configured to generate and embed, in the machine learning model 100 (i.e. the source model), a digital watermark uniquely identifying the machine learning model 100. The detection engine 330 may further be configured to extract, from the third party machine learning model 320 (i.e. the surrogate model), a digital watermark and/or determine whether the digital watermark extracted from the third party machine learning model 320 matches the digital watermark embedded in the machine learning model 100 (i.e. broadly interpreted as passing the digital watermark as a fingerprint against the first and third machine learning models, because the watermark is embedded in these models). Further, Rouhani on [0096] teaches that the detection engine 330 may determine that the first client 352a is the source of the third party machine learning model 352 and not the second client 352b, based at least on the third digital watermark extracted from the third party machine learning model 320 matching the first digital watermark embedded in the first copy of the machine learning model 100 distributed to the first client 352a but not the second digital watermark embedded in the second copy of the machine learning model 100 distributed to the second client 352b (i.e. the watermark not passing against the second copy of the machine learning model is equivalent to the fingerprint not passing against the reference model as claimed).
Arguments regarding claim 14:
In response to applicant’s argument on page 13 of the remarks that KRUTHIVETI (i.e. the cited reference) fails to teach testing the suspect model based on a request by applying the fingerprint to the suspect model: the examiner acknowledges applicant’s point of view but respectfully disagrees, because KRUTHIVETI explicitly teaches that the adversarial detection module 210 inputs the perturbed data (i.e. the request for matching the fingerprint) and the output of the ML model 204 at step 504 into the detection model 212 (i.e. the suspect model), which in this case includes a neural fingerprinting ML model. The detection model 212 is configured to predict an adversarial score that is indicative of whether the original input data is adversarial and may change in a specific manner based on the true classification of the input data. As described, the adversarial score output by the model 212 that includes a neural fingerprinting ML model is indicative of whether the input perturbations and outputs of the neural fingerprinting ML model match the fingerprints (i.e. by applying the fingerprint).
	The applicant further argues that the cited portion of KRUTHIVETI fails to teach that the suspect model is derived from the source model. The examiner respectfully disagrees, because the examiner relied on Rouhani to teach the above argued limitation of deriving a suspect model from the source model. Rouhani on [0092] teaches that the detection engine 330 may determine that the second machine learning model is a duplicate of the first machine learning model based at least on the digital watermark being determined to be present in the second machine learning model (i.e. equivalent to the suspect model being derived from the source model based on the marking key and verification key).
Based on the above rationale, the previous rejection of claims 1-10 and 13-20 is maintained.
Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4 and 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Sharma et al (hereinafter Sharma) (US 20210064760) in view of KRUTHIVETI et al (hereinafter KRUTHIVETI) (US 20210157912) and further in view of XIE et al (hereinafter XIE) (US 20210035330).

Regarding claim 1 Sharma teaches a method comprising (Sharma on [0005] teaches a method for protecting against privacy attack on machine learning models):
 generating, based on a training dataset, a reference model and a surrogate model (Sharma Fig 1 and associated text on [0058] teaches a model generation module 106 may receive training data and generate a causal model 108. See also Fig 2 and text on [0076-0077] teaches generating causal model (i.e. reference model) and correlation model (i.e. surrogate model) based on receiving input from training data);
 selecting datapoints from the training dataset that are predicted correctly by a source model as a group of adversarial candidates (Sharma on [0037] teaches a machine learning algorithm may take training data and build a statistical or mathematical model for determining or predicting an output based on input data. The machine learning model (i.e. source model) may be a function that predicts one or more outputs or outcomes based on one or more inputs (i.e. data points). The output may be a prediction about where a value falls on a continuous or defined spectrum. It may be a prediction about which of multiple classifications the input falls into. See on [0118] teaches the first output 330a may include a first prediction based on the machine learning model 308 and the first input 328a. The second output 330b may include a second prediction based on the machine learning model 308 and the second input 328b. The first output 330a may include a first confidence score associated with the first prediction. The second output 330b may include a second confidence score associated with the second prediction);
adding noise to each candidate of the sub-group of adversarial candidates to yield a noisy group of adversarial examples (Sharma on [0043] teaches noise can be added to a model at the input level (such as by adding randomized inputs to the training data). Noise can be added to a model at the parameter level after the model has been trained. Noise can also be added at the output level (such as adding noise to the output confidence). The amount of noise added and the level or step at which noise is added may impact how much the accuracy of the model is reduced and how much less likely it is that the model will be susceptible to a privacy attack. For example, adding noise at the input level may reduce accuracy more than adding noise to model parameters. See on [0081-0085] teaches adding noise at input level of causal and correlation model as shown in Fig 2);
testing the set of source model output against the reference model to yield a set of reference model outputs (Sharma on [0048] teaches a causal model may be successful on roughly 50% of the test inputs (close to a random guess), whereas the same attack on a correlational model may be successful for 80% of the test inputs. See on [0069] teaches one example of a membership inference attack may involve sending a first test input and a second test input to the causal model with noise 112. The first test input may be different from the second test input. The first test input may be a sample contained in the training data 102. The second test input may be a sample not contained in the training data 102. The one or more outputs 116 may include a first output responsive to the first test input and a second output responsive to the second test input. The first output may include a first confidence value, and the second output may include a second confidence value. An adversary may attempt to infer whether the first test input is a member of the training data 102 based on a difference in the first confidence value and the second confidence value);
 testing the set of source model outputs against the surrogate model to yield a set of surrogate model outputs (Sharma on [0048] teaches a causal model may be successful on roughly 50% of the test inputs (close to a random guess), whereas the same attack on a correlational model may be successful for 80% of the test inputs).
	Although Sharma teaches a confidence value associated with the first and second outputs generated by the models, Sharma fails to explicitly teach selecting, from the group of adversarial candidates, a sub-group of candidates that each have a low confidence score according to a threshold to yield a sub-group of adversarial candidates; testing the noisy group of adversarial examples against the source model to obtain a set of source model outputs that the source model predicts correctly; and identifying a set of fingerprints based on which ones from the set of source model outputs, the set of reference model outputs and the set of surrogate model outputs pass as adversarial examples against the source model and the surrogate model, but not the reference model. However, KRUTHIVETI, from analogous art, teaches selecting, from the group of adversarial candidates, a sub-group of candidates that each have a low confidence score according to a threshold to yield a sub-group of adversarial candidates (KRUTHIVETI on [0050-0051] teaches the ML system 200 determines whether the adversarial score output by the adversarial detection module 210 satisfies a predetermined threshold for raising an adversarial flag. If the ML system 200 determines that the adversarial score satisfies the threshold, then at step 510 the ML system 200 flags the original input data as adversarial. On the other hand, if the ML system 200 determines that the adversarial score output by the adversarial detection module 210 does not satisfy the predefined threshold, then at step 512, the ML system 200 does not flag the input data as adversarial);
and identifying a set of fingerprints based on which ones from the set of source model outputs, the set of reference model outputs and the set of surrogate model outputs pass as adversarial examples against the source model and the surrogate model, but not the reference model (KRUTHIVETI on [0029] teaches the detection model 212 may include a neural fingerprinting ML model that is fed versions of the input data x perturbed with predefined random perturbations, as well as the output i of the ML model 204, and in such a case the detection model 212 may output an adversarial score indicating whether output perturbations generated for the perturbed input data by the neural fingerprinting ML model match expected output perturbations for the output i of the ML model 204. In other embodiments, the detection model 212 may include a surrogate ML model that takes the same input as the ML model 204 and is used to extract features that are compared with an expected feature distribution for the output i of the ML model 204 to determine an adversarial score indicating whether the extracted features match the expected feature distribution. See on [0032] teaches each input perturbation-expected output perturbation pair is also referred to herein as a “fingerprint.” The detection model 212 may employ multiple fingerprints to provide robust adversarial detection in some embodiments. Subsequent to training, the detection model 212 is configured to determine an adversarial score by measuring the error of how well output perturbations generated by the neural fingerprinting ML model for new input data that has been perturbed matches the expected perturbations for a class predicted by the ML model 204 for the new input data).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of KRUTHIVETI into the teaching of Sharma by selecting adversarial candidates based on a low score and identifying a set of fingerprints based on the generated models. One would be motivated to do so in order to defend machine learning systems from adversarial attacks (KRUTHIVETI on [0001]).
	The combination of Sharma and KRUTHIVETI fails to explicitly teach testing the noisy group of adversarial examples against the source model to obtain a set of source model outputs that the source model predicts correctly, however XIE from analogous art teaches testing the noisy group of adversarial examples against the source model to obtain a set of source model outputs that the source model predicts correctly (XIE on [0023] teaches the system first performs DNN perception modeling 380 via a series of probes 375 to a target DNN 345. The probes 375 are performed offline prior to performing any real-time image compression on images received at the edge gateway device. The system injects a small amount of noises to a test image to obtain an adversarial example, transmit the adversarial example in a probe message to a target DNN 345, and check the response from the target DNN 345 to determine whether DNN 345 can detect objects in the adversarial example with desired DNN inference accuracy).

Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of XIE into the combined teaching of Sharma and KRUTHIVETI by testing the noisy group of adversarial examples against the source model. One would be motivated to do so in order to facilitate deep neural network (DNN) inference without compromising the DNN inference accuracy (XIE on [0001-0002]).
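By way of illustration only, the step of testing the noisy group of adversarial examples against the source model and retaining the outputs that the source model predicts correctly can be sketched in Python. The sketch rests on assumptions and is not the probing procedure of XIE: the function `probe_source_model`, the toy model, and the data layout are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical layout: each noisy example is (noisy features, true label).
NoisyExample = Tuple[Sequence[float], int]


def probe_source_model(
    noisy_examples: List[NoisyExample],
    source_model: Callable[[Sequence[float]], int],
) -> List[Tuple[Sequence[float], int]]:
    """Query the source model with each noisy example.

    Returns (features, prediction) pairs only for examples that the
    source model still predicts correctly after noise was added.
    """
    outputs = []
    for x, true_label in noisy_examples:
        prediction = source_model(x)
        if prediction == true_label:
            outputs.append((x, prediction))
    return outputs


# Toy stand-in for a source model (assumption): class 1 at or above 0.5.
def toy_source_model(x: Sequence[float]) -> int:
    return 1 if x[0] >= 0.5 else 0


noisy_group = [([0.52], 1), ([0.49], 1), ([0.1], 0)]
source_model_outputs = probe_source_model(noisy_group, toy_source_model)
```

In this toy run the second example is dropped, because the added noise pushed it across the toy model's decision boundary.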
Regarding claim 2 the combination of Sharma, KRUTHIVETI and XIE teaches all the limitations of claim 1 above, Sharma further teaches  wherein the training dataset is from a same distribution of a source model dataset (Sharma on [0049] teaches the test data is from the same distribution as the training dataset. See on [0054] teaches the training data 102 may be from a single distribution or from multiple distributions).
Regarding claim 3 the combination of Sharma, KRUTHIVETI and XIE teaches all the limitations of claim 1 above, Sharma further teaches wherein the training dataset comprises at least some data from a source model dataset (Sharma on [0035] teaches The training data used to build a machine learning model may be a set of examples of the task or decision the model is to perform or make. The training data may include pairs of input data and an outcome (a target). The outcome may be a binary response, one of multiple categories, or a number value).
Regarding claim 4 the combination of Sharma, KRUTHIVETI and XIE teaches all the limitations of claim 1 above, KRUTHIVETI further teaches  wherein fingerprint candidates comprise ones of the noisy group of adversarial examples that lead to a fully successful adversarial attack accuracy against the source model (KRUTHIVETI on [0060] teaches detecting adversarial attacks. In the disclosed techniques, a ML system processes the input into and output of a ML model using an adversarial detection module. The adversarial detection module includes a detection model that generates a score indicative of whether the input is adversarial using, e.g., a neural fingerprinting technique or a comparison of features extracted by a surrogate ML model to an expected feature distribution for the output of the ML model. In turn, the adversarial score is compared to a predefined threshold for raising an adversarial flag).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of KRUTHIVETI into the teaching of Sharma by selecting adversarial candidates based on a low score and identifying a set of fingerprints based on the generated models. One would be motivated to do so in order to defend machine learning systems from adversarial attacks (KRUTHIVETI on [0001]).

Regarding claim 6 the combination of Sharma, KRUTHIVETI and XIE teaches all the limitations of claim 1 above, KRUTHIVETI further teaches  wherein generating the set of fingerprints further comprises constructing respective adversarial examples with the noise that causes a receiving model to misclassify an input (KRUTHIVETI on [0045] teaches the modification to the visual appearance of the 80 mph speed limit sign causes the ML model 204 to misclassify the 80 mph speed limit sign as a 30 mph speed limit sign. To discern such an adversarial attack, the ML system 200 processes the image 400 and the 30 mph speed limit signal output by the ML model 204 using the adversarial detection module 210).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of KRUTHIVETI into the teaching of Sharma by selecting adversarial candidates based on a low score and identifying a set of fingerprints based on the generated models. One would be motivated to do so in order to defend machine learning systems from adversarial attacks (KRUTHIVETI on [0001]).

Regarding claim 7 the combination of Sharma, KRUTHIVETI and XIE teaches all the limitations of claim 1 above, XIE further teaches wherein the noise is imperceptible noise (XIE on [0023] teaches to probe the target DNN 345, the system injects a small amount of noises to a test image to obtain an adversarial example).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of XIE into the combined teaching of Sharma and KRUTHIVETI by testing the noisy group of adversarial examples against the source model. One would be motivated to do so in order to facilitate deep neural network (DNN) inference without compromising the DNN inference accuracy (XIE on [0001-0002]).

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Sharma et al (hereinafter Sharma) (US 20210064760) in view of KRUTHIVETI et al (hereinafter KRUTHIVETI) (US 20210157912) in view of XIE et al (hereinafter XIE) (US 20210035330) and further in view of Joshi et al (hereinafter Joshi) (US 20180227296).

Regarding claim 5 the combination of Sharma, KRUTHIVETI and XIE teaches all the limitations of claim 1 above, the combination fails to explicitly teach further comprising: sharing a hashed version of the set of fingerprints with a trusted third party, however Joshi from analogous art teaches sharing a hashed version of the set of fingerprints with a trusted third party (Joshi on [0041] teaches transmitting fingerprint hash to third party).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of Joshi into the combined teaching of Sharma, KRUTHIVETI and XIE by sending a hash of the fingerprints to a third party. One would be motivated to do so in order to enhance security of the model (Joshi on [0004-0005]).
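For illustration, producing a hashed version of a set of fingerprints before sharing it can be sketched with Python's standard `hashlib` module. This is a minimal sketch under assumptions, not the method of Joshi or of the instant application: the function `hash_fingerprints`, the JSON serialization, and the choice of SHA-256 are hypothetical.

```python
import hashlib
import json
from typing import List, Sequence


def hash_fingerprints(fingerprints: List[Sequence[float]]) -> List[str]:
    """Return one SHA-256 hex digest per fingerprint.

    Each fingerprint is serialized to JSON first; the serialization
    format and hash function are assumptions made for this sketch.
    """
    digests = []
    for fingerprint in fingerprints:
        payload = json.dumps(list(fingerprint)).encode("utf-8")
        digests.append(hashlib.sha256(payload).hexdigest())
    return digests


# Hypothetical fingerprints represented as small feature vectors.
fingerprint_set = [[0.1, 0.2], [0.3, 0.4]]
hashed_fingerprints = hash_fingerprints(fingerprint_set)
```

Only the digests would be shared in such a scheme, so the third party can later verify a disclosed fingerprint without holding the fingerprints themselves.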

Claims 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Sharma et al (hereinafter Sharma) (US 20210064760) in view of KRUTHIVETI et al (hereinafter KRUTHIVETI) (US 20210157912) in view of XIE et al (hereinafter XIE) (US 20210035330) and further in view of Rouhani et al (hereinafter Rouhani) (US 20210019605).

Regarding claim 8 the combination of Sharma, KRUTHIVETI and XIE teaches all the limitations of claim 1 above, the combination fails to explicitly teach testing a suspect model by using the set of fingerprints against the suspect model to determine whether an overall accuracy operating on the set of fingerprints is equal to or greater than a testing threshold, however Rouhani from analogous art teaches  testing a suspect model by using the set of fingerprints against the suspect model to determine whether an overall accuracy operating on the set of fingerprints is equal to or greater than a testing threshold (Rouhani on [0092] teaches the detection engine 330 may determine that the second machine learning model is a duplicate of the first machine learning model based at least on the digital watermark being determined to be present in the second machine learning model (i.e. equivalent to suspect model derived from source model based on marking key and verification key). If the probability of the machine learning model 320 correctly classifying the K quantity of random input samples exceeds the threshold value p. When that is the case, the third party machine learning model 320 may be identified as being a duplicate of the machine learning model 100 and/or as having been trained using the same proprietary training data as the machine learning model 100. See on [0059] teaches the watermarking keys may therefore include the randomly selected Gaussian distributions s, trigger keys X.sup.key, and projection matrix A. See on [0090] teaches the watermarking keys may form the pairs (X.sup.key, Y.sup.key) in which X.sup.key may be the random input).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of Rouhani into the combined teaching of Sharma, KRUTHIVETI and XIE by testing the suspect model using the fingerprints to determine accuracy. One would be motivated to do so in order to validate the machine learning model and test its accuracy (Rouhani on [0008-0012]).

Regarding claim 9, the combination of Sharma, KRUTHIVETI, XIE and Rouhani teaches all the limitations of claim 8 above. Rouhani further teaches determining, when the overall accuracy operating on the set of fingerprints is equal to or greater than the testing threshold, that the suspect model was derived from the source model (Rouhani on [0092] teaches the detection engine 330 may determine that the second machine learning model is a duplicate of the first machine learning model based at least on the digital watermark being determined to be present in the second machine learning model (i.e. equivalent to suspect model derived from source model based on marking key and verification key). If the probability of the machine learning model 320 correctly classifying the K quantity of random input samples exceeds the threshold value p, the third party machine learning model 320 may be identified as being a duplicate of the machine learning model 100 and/or as having been trained using the same proprietary training data as the machine learning model 100. See also [0059], which teaches the watermarking keys may therefore include the randomly selected Gaussian distributions s, trigger keys X.sup.key, and projection matrix A. See also [0090], which teaches the watermarking keys may form the pairs (X.sup.key, Y.sup.key) in which X.sup.key may be the random input).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of Rouhani into the combined teaching of Sharma, KRUTHIVETI and XIE by testing the suspect model using the fingerprints to determine accuracy. One would be motivated to do so in order to validate the machine learning model and test its accuracy (Rouhani on [0008-0012]).
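For illustration only, the threshold test recited in claims 8-9 can be sketched as follows. This is a minimal sketch and not the applicant's or Rouhani's implementation: the names `fingerprint_accuracy` and `is_derived`, and the representation of a model as a plain callable and of a fingerprint as an (input, expected output) pair, are assumptions introduced here.

```python
from typing import Callable, Hashable, Sequence, Tuple

# Assumption: a "model" is any callable mapping an input to a label, and a
# fingerprint is an (input, expected label) pair. Neither type comes from the record.
Model = Callable[[object], Hashable]
Fingerprint = Tuple[object, Hashable]


def fingerprint_accuracy(model: Model, fingerprints: Sequence[Fingerprint]) -> float:
    """Fraction of fingerprints for which the model returns the expected label."""
    if not fingerprints:
        raise ValueError("at least one fingerprint is required")
    hits = sum(1 for x, expected in fingerprints if model(x) == expected)
    return hits / len(fingerprints)


def is_derived(model: Model, fingerprints: Sequence[Fingerprint],
               testing_threshold: float) -> bool:
    """True when overall accuracy on the fingerprints meets or exceeds the threshold."""
    return fingerprint_accuracy(model, fingerprints) >= testing_threshold
```

Under these assumptions, a suspect model is treated as derived from the source model only when its overall accuracy on the fingerprint set is equal to or greater than the testing threshold.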

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Rouhani et al (hereinafter Rouhani) (US 20210019605) in view of Zhang et al (hereinafter Zhang) (US 20210150042).

 Regarding claim 10 Rouhani teaches a method comprising (Rouhani on [0014] teaches a method for embedding a digital watermark in a machine learning model);
 receiving, from a model owner node, a source model and verification key at a service node (Rouhani Fig 3 and text on [0051-0053] teaches distributing first machine learning model (i.e. source model) and first watermarking key (i.e. verification key) from owner to client device. See on [0038] teaches owner of machine learning model to determine ownership of the model based on watermarking);
receiving a suspect model at the service node (Rouhani on [0016] teaches extracting, from the second machine learning model (i.e. suspect model), the second digital watermark by at least processing, with the second machine learning model);
in response to the request, receiving a marking key at the service node from the model owner node; and based on the marking key and the verification key, determining whether the suspect model was derived from the source model by testing the suspect model to determine whether a fingerprint produces a same output from both the source model and the suspect model (Rouhani on [0092] teaches the detection engine 330 may determine that the second machine learning model is a duplicate of the first machine learning model based at least on the digital watermark being determined to be present in the second machine learning model (i.e. equivalent to suspect model derived from source model based on marking key and verification key). If the probability of the machine learning model 320 correctly classifying the K quantity of random input samples exceeds the threshold value p, the third party machine learning model 320 may be identified as being a duplicate of the machine learning model 100 and/or as having been trained using the same proprietary training data as the machine learning model 100. See also [0059], which teaches the watermarking keys may therefore include the randomly selected Gaussian distributions s, trigger keys X.sup.key, and projection matrix A. See also [0090], which teaches the watermarking keys may form the pairs (X.sup.key, Y.sup.key) in which X.sup.key may be the random input);
wherein the fingerprint passes against the source model and a surrogate model, but not a reference model (Rouhani on [0053] teaches the first and second digital watermarks serve as a fingerprint, and [0052] teaches the misuse detection system 300 may include a detection engine 330 configured to generate and embed, in the machine learning model 100 (i.e. source model), a digital watermark uniquely identifying the machine learning model 100. The detection engine 330 may be further configured to extract, from the third party machine learning model 320 (i.e. surrogate model), a digital watermark and/or determine whether the digital watermark extracted from the third party machine learning model 320 matches the digital watermark embedded in the machine learning model 100 (i.e. broadly interpreted as passing the digital watermark as a fingerprint against the first and third party machine learning models because the watermark is embedded in these models). Further, [0096] teaches the detection engine 330 may determine that the first client 352a is the source of the third party machine learning model 320 and not the second client 352b based at least on the third digital watermark extracted from the third party machine learning model 320 matching the first digital watermark embedded in the first copy of the machine learning model 100 distributed to the first client 352a but not the second digital watermark embedded in the second copy of the machine learning model 100 distributed to the second client 352b (i.e. not passing the second copy of the machine learning model is equivalent to the fingerprint not passing the reference model as claimed)).

	Rouhani fails to explicitly teach transmitting a request to the model owner node for a proof of ownership relative to the suspect model; however, Zhang from analogous art teaches transmitting a request to the model owner node for a proof of ownership relative to the suspect model (Zhang on [0039] teaches a request to verify ownership of a deep neural network model. The owner makes that DNN available as a service. More formally, a threat model for this scenario models two parties, a model owner O, who owns a deep neural network model m for a certain task, and a suspect S, who sets up a service t′ from model m′, while two services have similar performance t≅t′. In this context, assume that it is a goal to help owner O protect the intellectual property t of model m. Intuitively, if model m is equivalent to m′, S can be confirmed as a plagiarized service of t).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of Zhang into the teaching of Rouhani by requesting a proof of ownership from the model owner. One would be motivated to do so in order to prevent attackers from obtaining correct predictions from stolen models and thereby better prevent intellectual property theft (Zhang on [0007]).

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Rouhani et al (hereinafter Rouhani) (US 20210019605) in view of Zhang et al (hereinafter Zhang) (US 20210150042) and further in view of Sharma et al (hereinafter Sharma) (US 20210064760).

Regarding claim 13, the combination of Rouhani and Zhang teaches all the limitations of claim 10 above. The combination fails to explicitly teach wherein at least one of the marking key and the verification key comprises added noise which causes a predictable output from the source model and surrogate models derived therefrom; however, Sharma from analogous art teaches wherein at least one of the marking key and the verification key comprises added noise which causes a predictable output from the source model and surrogate models derived therefrom (Sharma on [0043] teaches noise can be added to a model at the input level (such as by adding randomized inputs to the training data). Noise can be added to a model at the parameter level after the model has been trained. Noise can also be added at the output level (such as adding noise to the output confidence). The amount of noise added and the level or step at which noise is added may impact how much the accuracy of the model is reduced and how much less likely it is that the model will be susceptible to a privacy attack. For example, adding noise at the input level may reduce accuracy more than adding noise to model parameters. See also [0081-0085], which teach adding noise at the input level of the causal and correlation models as shown in Fig 2).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of Sharma into the combined teaching of Rouhani and Zhang by adding noise to predict output from the model. One would be motivated to do so in order to protect against privacy attacks on machine learning models (Sharma on [0005]).

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Rouhani et al (hereinafter Rouhani) (US 20210019605) in view of KRUTHIVETI et al (hereinafter KRUTHIVETI) (US 20210157912).

Regarding claim 14 Rouhani teaches a method comprising (Rouhani on [0014] teaches a method for embedding a digital watermark in a machine learning model);
receiving, from a model owner node, a source model and a fingerprint associated with the source model  (Rouhani Fig 3 and text on [0051-0053] teaches distributing first machine learning model (i.e. source model) and first watermarking key (i.e. fingerprint in view of [0053])  from owner to client device. See on [0038] teaches owner of machine learning model to determine ownership of the model based on watermarking); 
receiving a suspect model at a service node (Rouhani on [0016] teaches extracting, from the second machine learning model (i.e. suspect model), the second digital watermark by at least processing (i.e. testing suspect model based on watermarking keys), with the second machine learning model);
and when the output has an accuracy that is equal to or greater than a threshold, determining that the suspect model is derived from the source model  (Rouhani on [0092] teaches the detection engine 330 may determine that the second machine learning model is a duplicate of the first machine learning model based at least on the digital watermark being determined to be present in the second machine learning model (i.e. equivalent to suspect model derived from source model based on marking key and verification key). If the probability of the machine learning model 320 correctly classifying the K quantity of random input samples exceeds the threshold value p. When that is the case, the third party machine learning model 320 may be identified as being a duplicate of the machine learning model 100 and/or as having been trained using the same proprietary training data as the machine learning model 100. See on [0059] teaches the watermarking keys may therefore include the randomly selected Gaussian distributions s, trigger keys X.sup.key, and projection matrix A. See on [0090] teaches the watermarking keys may form the pairs (X.sup.key, Y.sup.key) in which X.sup.key may be the random input).
Although Rouhani teaches applying a watermark to a suspect model, Rouhani fails to explicitly teach, based on a request to test the suspect model, applying the fingerprint to the suspect model to generate an output; however, KRUTHIVETI from analogous art teaches, based on a request to test the suspect model, applying the fingerprint to the suspect model to generate an output (KRUTHIVETI on [0054] teaches the adversarial score output by the model 212 that includes a neural fingerprinting ML model is indicative of whether the input perturbations and outputs of the neural fingerprinting ML model match the fingerprints (and specifically, whether the outputs match the expected output perturbations) for a class predicted by the ML model 204).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of KRUTHIVETI into the teaching of Rouhani by applying the fingerprint to a suspect model. One would be motivated to do so in order to defend machine learning systems from adversarial attacks (KRUTHIVETI on [0001]).

Claims 15-16 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Rouhani et al (hereinafter Rouhani) (US 20210019605) in view of KRUTHIVETI et al (hereinafter KRUTHIVETI) (US 20210157912) in view of Sharma et al (hereinafter Sharma) (US 20210064760) and further in view of XIE et al (hereinafter XIE) (US 20210035330).

Regarding claim 15 the combination of Rouhani and KRUTHIVETI teaches all the limitations of claim 14 above, KRUTHIVETI further teaches selecting, from the group of adversarial candidates, a sub-group of candidates that each have a low confidence score according to a threshold to yield a sub-group of adversarial candidates (KRUTHIVETI on [0050-0051] teaches the ML system 200 determines whether the adversarial score output by the adversarial detection module 210 satisfies a predetermined threshold for raising an adversarial flag. If the ML system 200 determines that the adversarial score satisfies the threshold, then at step 510 the ML system 200 flags the original input data as adversarial. On the other hand, if the ML system 200 determines that the adversarial score output by the adversarial detection module 210 does not satisfy the predefined threshold, then at step 512, the ML system 200 does not flag the input data as adversarial);
and identifying a set of fingerprints based on which ones from the set of source model outputs, the set of reference model outputs and the set of surrogate model outputs pass as adversarial examples against the source model and the surrogate model, but not the reference model (KRUTHIVETI on [0029] teaches the detection model 212 may include a neural fingerprinting ML model that is fed versions of the input data x perturbed with predefined random perturbations, as well as the output i of the ML model 204, and in such a case the detection model 212 may output an adversarial score indicating whether output perturbations generated for the perturbed input data by the neural fingerprinting ML model match expected output perturbations for the output i of the ML model 204. In other embodiments, the detection model 212 may include a surrogate ML model that takes the same input as the ML model 204 and is used to extract features that are compared with an expected feature distribution for the output i of the ML model 204 to determine an adversarial score indicating whether the extracted features match the expected feature distribution. See on [0032] teaches each input perturbation-expected output perturbation pair is also referred to herein as a “fingerprint.” The detection model 212 may employ multiple fingerprints to provide robust adversarial detection in some embodiments. Subsequent to training, the detection model 212 is configured to determine an adversarial score by measuring the error of how well output perturbations generated by the neural fingerprinting ML model for new input data that has been perturbed matches the expected perturbations for a class predicted by the ML model 204 for the new input data).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of KRUTHIVETI into the teaching of Rouhani by applying the fingerprint to a suspect model. One would be motivated to do so in order to defend machine learning systems from adversarial attacks (KRUTHIVETI on [0001]).
The combination of Rouhani and KRUTHIVETI fails to explicitly teach wherein the fingerprint is generated by a process comprising:  generating, based on a training dataset, a reference model and a surrogate model, selecting datapoints from the training dataset that are predicted correctly by a source model as a group of adversarial candidates, adding noise to each candidate of the sub-group of adversarial candidates to yield a noisy group of adversarial examples, testing the set of source model successful adversarial examples against the reference model to yield a set of reference model outputs, testing the set of source model successful adversarial examples against the surrogate model to yield a set of surrogate model outputs, however Sharma from analogous art teaches wherein the fingerprint is generated by a process comprising:  generating, based on a training dataset, a reference model and a surrogate model (Sharma Fig 1 and associated text on [0058] teaches a model generation module 106 may receive training data and generate a causal model 108. See also Fig 2 and text on [0076-0077] teaches generating causal model (i.e. reference model) and correlation model (i.e. surrogate model) based on receiving input from training data);
 selecting datapoints from the training dataset that are predicted correctly by a source model as a group of adversarial candidates (Sharma on [0037] teaches a machine learning algorithm may take training data and build a statistical or mathematical model for determining or predicting an output based on input data. The machine learning model (i.e. source model) may be a function that predicts one or more outputs or outcomes based on one or more inputs (i.e. data points). The output may be a prediction about where a value falls on a continuous or defined spectrum. It may be a prediction about which of multiple classifications the input falls into. See on [0118] teaches the first output 330a may include a first prediction based on the machine learning model 308 and the first input 328a. The second output 330b may include a second prediction based on the machine learning model 308 and the second input 328b. The first output 330a may include a first confidence score associated with the first prediction. The second output 330b may include a second confidence score associated with the second prediction);
adding noise to each candidate of the sub-group of adversarial candidates to yield a noisy group of adversarial examples (Sharma on [0043] teaches noise can be added to a model at the input level (such as by adding randomized inputs to the training data). Noise can be added to a model at the parameter level after the model has been trained. Noise can also be added at the output level (such as adding noise to the output confidence). The amount of noise added and the level or step at which noise is added may impact how much the accuracy of the model is reduced and how much less likely it is that the model will be susceptible to a privacy attack. For example, adding noise at the input level may reduce accuracy more than adding noise to model parameters. See on [0081-0085] teaches adding noise at input level of causal and correlation model as shown in Fig 2);
testing the set of source model successful adversarial examples against the reference model to yield a set of reference model outputs (Sharma on [0048] teaches a causal model may be successful on roughly 50% of the test inputs (close to a random guess), whereas the same attack on a correlational model may be successful for 80% of the test inputs. See on [0069] teaches one example of a membership inference attack may involve sending a first test input and a second test input to the causal model with noise 112. The first test input may be different from the second test input. The first test input may be a sample contained in the training data 102. The second test input may be a sample not contained in the training data 102. The one or more outputs 116 may include a first output responsive to the first test input and a second output responsive to the second test input. The first output may include a first confidence value, and the second output may include a second confidence value. An adversary may attempt to infer whether the first test input is a member of the training data 102 based on a difference in the first confidence value and the second confidence value);
 testing the set of source model successful adversarial examples against the surrogate model to yield a set of surrogate model outputs (Sharma on [0048] teaches a causal model may be successful on roughly 50% of the test inputs (close to a random guess), whereas the same attack on a correlational model may be successful for 80% of the test inputs).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of Sharma into the combined teaching of Rouhani and KRUTHIVETI by adding noise to predict output from the model. One would be motivated to do so in order to protect against privacy attacks on machine learning models (Sharma on [0005]).

	The combination of Rouhani, KRUTHIVETI and Sharma fails to explicitly teach testing the noisy group of adversarial examples against the source model to obtain a set of source model outputs that the source model predicts correctly, however XIE from analogous art teaches testing the noisy group of adversarial examples against the source model to obtain a set of source model outputs that the source model predicts correctly (XIE on [0023] teaches the system first performs DNN perception modeling 380 via a series of probes 375 to a target DNN 345. The probes 375 are performed offline prior to performing any real-time image compression on images received at the edge gateway device. The system injects a small amount of noises to a test image to obtain an adversarial example, transmit the adversarial example in a probe message to a target DNN 345, and check the response from the target DNN 345 to determine whether DNN 345 can detect objects in the adversarial example with desired DNN inference accuracy).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of XIE into the combined teaching of Rouhani, KRUTHIVETI and Sharma by testing the noisy group of adversarial examples against the source model. One would be motivated to do so in order to facilitate deep neural network (DNN) inference without compromising the DNN inference accuracy (XIE on [0001-0002]).
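For illustration only, one possible reading of the fingerprint generation steps recited in claim 15 can be sketched as follows. This is a sketch under stated assumptions and does not reproduce the method of the application or of any cited reference: the function names, the `perturb` callable, and the convention that each model returns a (label, confidence) pair are hypothetical. The sketch also assumes that an example "passes" against a model when that model returns the same label the source model returns on the noisy example, and that the noisy example counts as adversarial only when that label differs from the original label.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: every model returns (label, confidence) for a numeric input.
Predict = Callable[[float], Tuple[int, float]]
Datapoint = Tuple[float, int]  # (input, true label)


def select_candidates(dataset: Sequence[Datapoint], source: Predict,
                      confidence_threshold: float) -> List[Datapoint]:
    """Keep datapoints the source model predicts correctly but with low confidence."""
    selected = []
    for x, y in dataset:
        label, confidence = source(x)
        if label == y and confidence < confidence_threshold:
            selected.append((x, y))
    return selected


def identify_fingerprints(candidates: Sequence[Datapoint],
                          perturb: Callable[[float], float],
                          source: Predict, surrogate: Predict,
                          reference: Predict) -> List[Tuple[float, int]]:
    """Keep noisy examples that pass against source and surrogate but not reference."""
    fingerprints = []
    for x, y in candidates:
        noisy = perturb(x)              # add noise to the candidate
        target, _ = source(noisy)       # source model output on the noisy example
        if target == y:
            continue                    # noise did not change the source prediction
        passes_surrogate = surrogate(noisy)[0] == target
        passes_reference = reference(noisy)[0] == target
        if passes_surrogate and not passes_reference:
            fingerprints.append((noisy, target))
    return fingerprints
```

Under these assumptions, each retained pair records a noisy input together with the source model output that a surrogate model reproduces and an independently trained reference model does not.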

Regarding claim 16 the combination of Rouhani, KRUTHIVETI, Sharma and XIE teaches all the limitations of claim 15 above, KRUTHIVETI further teaches wherein the noisy group of adversarial examples comprises ones of the group of adversarial candidates that lead to a fully successful adversarial attack accuracy against the source model (KRUTHIVETI on [0060] teaches detecting adversarial attacks. In the disclosed techniques, a ML system processes the input into and output of a ML model using an adversarial detection module. The adversarial detection module includes a detection model that generates a score indicative of whether the input is adversarial using, e.g., a neural fingerprinting technique or a comparison of features extracted by a surrogate ML model to an expected feature distribution for the output of the ML model. In turn, the adversarial score is compared to a predefined threshold for raising an adversarial flag).
Regarding claim 18 the combination of Rouhani, KRUTHIVETI, Sharma and XIE teaches all the limitations of claim 15 above, KRUTHIVETI further teaches wherein generating the fingerprint further comprises constructing respective adversarial examples with the noise that causes a receiving model to misclassify an input (KRUTHIVETI on [0045] teaches the modification to the visual appearance of the 80 mph speed limit sign causes the ML model 204 to misclassify the 80 mph speed limit sign as a 30 mph speed limit sign. To discern such an adversarial attack, the ML system 200 processes the image 400 and the 30 mph speed limit signal output by the ML model 204 using the adversarial detection module 210).

Regarding claim 19 the combination of Rouhani, KRUTHIVETI, Sharma and XIE teaches all the limitations of claim 18 above, XIE further teaches wherein the noise is imperceptible noise (XIE on [0023] teaches to probe the target DNN 345, the system injects a small amount of noises to a test image to obtain an adversarial example).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of XIE into the combined teaching of Rouhani, KRUTHIVETI and Sharma by testing the noisy group of adversarial examples against the source model. One would be motivated to do so in order to facilitate deep neural network (DNN) inference without compromising the DNN inference accuracy (XIE on [0001-0002]).

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Rouhani et al (hereinafter Rouhani) (US 20210019605) in view of KRUTHIVETI et al (hereinafter KRUTHIVETI) (US 20210157912) in view of Sharma et al (hereinafter Sharma) (US 20210064760) in view of XIE et al (hereinafter XIE) (US 20210035330) and further in view of Joshi et al (hereinafter Joshi) (US 20180227296).

Regarding claim 17 the combination of Rouhani, KRUTHIVETI, Sharma and XIE teaches all the limitations of claim 15 above, the combination fails to explicitly teach further comprising: sharing a hashed version of the fingerprint with a trusted third party, however Joshi from analogous art teaches sharing a hashed version of the fingerprint with a trusted third party (Joshi on [0041] teaches transmitting fingerprint hash to third party).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of Joshi into the combined teaching of Rouhani, Sharma, KRUTHIVETI and XIE by sending a hash of the fingerprint to a third party. One would be motivated to do so in order to enhance security of the model (Joshi on [0004-0005]).

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Rouhani et al (hereinafter Rouhani) (US 20210019605) in view of KRUTHIVETI et al (hereinafter KRUTHIVETI) (US 20210157912) in view of Sharma et al (hereinafter Sharma) (US 20210064760) in view of XIE et al (hereinafter XIE) (US 20210035330) and further in view of Shrestha et al (hereinafter Shrestha) (US 20200202184).

Regarding claim 20 the combination of Rouhani, KRUTHIVETI, Sharma and XIE teaches all the limitations of claim 15 above, the combination fails to explicitly teach wherein the threshold is approximately 0.60, however Shrestha from analogous art teaches wherein the threshold is approximately 0.60 (Shrestha on [0095] teaches threshold of 0.60).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date to implement the teaching of Shrestha into the combined teaching of Rouhani, KRUTHIVETI, Sharma and XIE by having a threshold of approximately 0.60. One would be motivated to do so in order to improve overall security and accuracy of prediction of the machine learning model (Shrestha on [0003-0005]).

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOEEN KHAN whose telephone number is (571)272-3522. The examiner can normally be reached 7AM-5PM EST M-TH Alternate Fridays.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Shewaye Gelagay, can be reached at (571)272-4219. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MOHAMMAD W REZA/Primary Examiner, Art Unit 2436                                                                                                                                                                                                        




/MOEEN KHAN/Examiner, Art Unit 2436