Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
2.	Examiner notes the entry of the following papers:
a.	Amended claims filed 7/29/2022.
b.	Applicant arguments/remarks made in amendment filed 7/29/2022.
3.	Claims 1, 14, and 20 are amended.
4.	Claims 1-20 are presented for examination.
Response to Arguments
5.	Applicant’s arguments that the prior art of record does not teach the amended claims are moot in view of new grounds of rejection necessitated by amendment.
	a.	Applicant recites “Applicant’s proposed amendment to claim 1 was discussed and it was agreed that the proposed amendment would overcome the current rejections.”  However, the specification does not support the amendment as written. See paragraph 8 below.  In addition, new grounds of rejection have been identified. See the detailed rejection.
Claim objections
6.	Claim 1 recites “in parallel with the generating of the first feature vector, in parallel with the generating of the first feature vector,” in lines 11-12. It is unclear what significance the repeated phrase has with respect to the claimed invention.  For the purpose of examination, Examiner is interpreting the limitation as “in parallel with the generating of the first feature vector,”.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 1, 14, and 20 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement.  The claim(s) contains subject matter which was not described in the specification in such a way as to enable one skilled in the art to which it pertains, or with which it is most nearly connected, to make and/or use the invention.
	The limitation of “wherein the second numerical precision level is encrypted and hidden from the user” in claim 1 (and claims 14 and 20) was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor had possession of the claimed invention.
	Paragraph [0009] of the specification recites “In another embodiment, the second neural network model is an encrypted neural network model.”  Similarly, paragraph [0019] recites “FIGURE 5, this figure depicts a block diagram of an example architecture for encrypted reduced precision neural network models in accordance with an embodiment;”. In both passages, the “encryption” refers to the neural network model, not to the “precision level” as recited in claims 1, 14, and 20.
	Therefore, claims 1, 14, and 20 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. It is noted that the reasons for rejection of claim 1 also apply to claims 14 and 20, since claims 14 and 20 incorporate the recitations of claim 1. For the purpose of examination, the examiner is interpreting the limitation as “wherein the second neural network model is an encrypted neural network model”.
Claim Rejections - 35 USC § 103
9.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
	A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not 	identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior 	art are such that the claimed invention as a whole would have been obvious before the effective filing date of 	the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. 	Patentability shall not be negated by the manner in which the invention was made.

10.	Claims 1 - 20 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al (Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks, herein Xu), Mishra et al (Apprentice: Using Knowledge Distillation Techniques to Improve Low-Precision Network Accuracy, herein Mishra), and Tople et al (PRIVADO: Practical and Secure DNN Inference, herein Tople).
11.	Regarding claim 1,
	Xu teaches a method comprising: receiving, by a processor, training data; (Xu, page 1, column 1, paragraph, line 12 “By comparing a DNN model’s prediction on the original input with that on squeezed inputs, feature squeezing detects adversarial examples with high accuracy and few false positives.” And, page 2, column 1, paragraph 5 “Deep Neural Networks (DNNs) can efficiently learn highly accurate models from large corpora of training samples in many domains [18], [26].” In other words, the DNN model is a method that runs on a processor, taking an input is receiving, and the training samples are training data.)
	training a first neural network using the training data resulting in a first neural network model, the first neural network model having a first numerical precision level, wherein the first numerical precision level is available to a user; training a second neural network using the training data resulting in a second neural network model, the second neural network model having a second numerical precision level different from the first numerical precision level, [wherein the second numerical precision level is encrypted and hidden from the user] (Xu, Fig. 1, Fig. 2, and page 4, column 1, paragraph 5, line 1 “Adversarial training introduces discovered adversarial examples and the corresponding ground truth labels to the training [10, 40, 21].” And, page 4, column 2, paragraph 3, line 11 “Instead, Meng and Chen proposed to train an autoencoder as an image filter to harden DNN models [24]. The encoder state of the autoencoder is essentially non-linear dimensionality reduction.” And, page 5, column 2, paragraph 3, line 1 “Common image representations used color bit depths that lead to irrelevant features, so we hypothesize that reducing bit depth can reduce adversarial opportunity without harming classifier accuracy. Two common representations, which we focus on here because of their use in our test datasets, are 8-bit grayscale and 24-bit color.” And, page 5, column 2, paragraph 6, line 1 “Figure 2 shows one example of class 0 in the MNIST dataset in the first row, with the original 8-bit grayscale images in the leftmost and the 1-bit monochrome images right most. The right most images, generated by applying a binary filter with 0.5 as the cutoff, appear nearly identical to the original images on the far left.  The processed images are still recognizable to humans, even though the feature space is only 1/128th the size of the original 8-bit grayscale space.”

[Image: media_image1.png]

[Image: media_image2.png]

In other words, training is training, reducing bit depth is reducing numerical precision, and, from Fig. 1, Squeezer1 and Squeezer2 are a first and a second trained neural network, and the neural network with reduced bit depth is the second neural network model having a numerical precision different from the first.);
	generating a first feature vector from input data using the first neural network model; generating, in parallel with the generating of the first feature vector, in parallel with the generating of the first feature vector, a second feature vector from the input data using the second neural network model; and computing a difference metric between the first feature vector and the second feature vector, the difference metric indicative of whether the input data includes adversarial data (Xu, Fig. 1, “If the difference between the model’s prediction on a squeezed input and its prediction on the original input exceeds a threshold level, the input is identified to be adversarial” In other words, input is input data, difference is computing a difference, prediction1 and prediction2 are the first feature vector and the second feature vector, Squeezer1 and Squeezer2 generate feature vectors in parallel, and the difference exceeding a threshold is the difference metric being indicative of whether the input data includes adversarial data.).
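For illustration only, the detection scheme described in the Xu passages cited above (reduce the bit depth of the input, run the model on both versions, and measure how far the two predictions diverge) can be sketched as follows. This is a minimal sketch and not code from Xu or from the application: the function names, the choice of an L1 distance, and the `toy_model` stand-in for a trained classifier are all assumptions made for the example.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def reduce_bit_depth(pixels: Sequence[float], bits: int) -> Vector:
    """Quantize pixel values in [0, 1] to 2**bits evenly spaced levels.

    With bits=1 this acts as a binary filter with 0.5 as the cutoff.
    """
    levels = 2 ** bits - 1
    # floor(x + 0.5) rounds half up, so a value of exactly 0.5 maps upward.
    return [math.floor(p * levels + 0.5) / levels for p in pixels]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute element-wise differences (assumed difference metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def squeeze_score(model: Callable[[Sequence[float]], Vector],
                  pixels: Sequence[float], bits: int = 1) -> float:
    """Distance between the prediction on the original and squeezed input."""
    original = model(pixels)
    squeezed = model(reduce_bit_depth(pixels, bits))
    return l1_distance(original, squeezed)


def toy_model(pixels: Sequence[float]) -> Vector:
    """Hypothetical stand-in for a trained classifier: two scores from mean brightness."""
    mean = sum(pixels) / len(pixels)
    return [1.0 - mean, mean]
```

In this toy setting, `squeeze_score(toy_model, [0.4, 0.6])` is zero because squeezing leaves the mean brightness unchanged, while `squeeze_score(toy_model, [0.45, 0.45])` is large because squeezing moves every pixel to 0. A large score is what the cited passage treats as a sign of an adversarial input.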
	Mishra also teaches training a first neural network using the training data resulting in a first neural network model, the first neural network model having a first numerical precision level, wherein the first numerical precision level is available to a user; training a second neural network using the training data resulting in a second neural network model, the second neural network model having a second numerical precision level different from the first numerical precision level, [wherein the second numerical precision level is encrypted and hidden from the user] (Mishra, Figure 2, page 2, paragraph 2, line 8 “In our work, the student network has similar topology as that of the teacher network which has neurons operating at full-precision.” And, page 5, paragraph 2, line 5 “The first scheme (scheme-A) jointly trains both the networks – full-precision teacher and low-precision student network.”

[Image: media_image3.png]

In other words, trains both networks is training a first neural network and training a second neural network, full-precision teacher network is first network having a first precision level, and low-precision student network is second network having a second numerical precision level different from the first.);
	Both Mishra and Xu are directed to training multiple neural networks with different numerical precision, among other things.  Xu teaches training and using multiple neural networks in parallel to evaluate the difference between the outputs of two or more neural networks, one given the original input sample and the other given the input after compression (i.e., precision reduction), in order to detect adversarial input, but does not explicitly describe the training as training with two different precisions.  Mishra teaches training two neural networks that have different numerical levels of precision but does not teach evaluating the difference between the outputs at the two levels of precision to detect adversarial input.  In view of the teaching of Xu, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Mishra into Xu. This would result in explicitly describing training two neural networks with different numerical precision levels and using them to evaluate the difference of the output in order to detect adversarial input.
	One of ordinary skill in the art would be motivated to do this because training for reduced precision while maintaining accuracy allows for a comparison of results between higher precision and lower precision neural networks, thus enabling anomaly detection. (Mishra, page 2, paragraph 1, line 2 “During this training, a process called, knowledge distillation is used to “transfer knowledge” from the complex network to the smaller network.  Work by Hinton et al. (2015) show that the knowledge distillation scheme can yield networks at comparable or slightly better accuracy than the original complex model.”)
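For illustration only, the idea of producing feature vectors from the same input at two numerical precision levels can be sketched as follows. This is a minimal sketch under stated assumptions and is not Mishra's method: Mishra jointly trains a low-precision student network, whereas the sketch simply rounds the weights of an existing single-layer model to a lower precision. The names `quantize` and `feature_vector` and the symmetric rounding scheme are hypothetical.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def quantize(weights: Matrix, bits: int) -> Matrix:
    """Round weights in [-1, 1] onto a signed grid with 2**(bits-1) - 1 steps per side.

    Assumption: plain post-training rounding, used here only as a stand-in
    for a separately trained reduced-precision model.
    """
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed grid")
    scale = 2 ** (bits - 1) - 1
    # floor(x + 0.5) rounds to the nearest grid point, with ties going upward.
    return [[math.floor(w * scale + 0.5) / scale for w in row] for row in weights]


def feature_vector(weights: Matrix, x: Sequence[float]) -> List[float]:
    """Single linear layer: one dot product per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


# Full-precision and reduced-precision versions of the same hypothetical layer.
full_precision = [[0.6, -0.6, 0.2]]
reduced_precision = quantize(full_precision, bits=2)

sample = [1.0, 2.0, 3.0]
first_vector = feature_vector(full_precision, sample)
second_vector = feature_vector(reduced_precision, sample)
```

The two resulting vectors could then be passed to any distance function to obtain a difference metric of the kind discussed in the mapping of claim 1.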
	Thus far, the combination of Xu and Mishra does not explicitly teach wherein the second numerical precision level is encrypted and hidden from the user.
	Tople teaches wherein the second numerical precision level is encrypted and hidden from the user (interpreted as “wherein the second neural network model is an encrypted neural network model”; see paragraph 8.  Tople, Figure 3, and Abstract line 4 “In this paper, we therefore ask the question: “Can third-party cloud services use SGX to provide practical, yet secure DNN Inferences-as-a-service?”” And, page 6, paragraph 4, line 2, “PRIVADO-Generator takes as input the ONNX representation of the model, the input-oblivious DNN framework that Step 1 generated, required SGX libraries, and an encryption key.  It outputs an enclave-executable model binary and the encrypted model parameters.”

[Image: media_image4.png]

In other words, enclave-executable model binary and encrypted model parameters is encrypted neural network model.);
	Both Tople and the combination of Xu and Mishra are directed to deep neural networks (DNN).  In view of the teaching of Xu and Mishra, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Tople into the combination of Xu and Mishra.  This would result in being able to create an encrypted neural network model for inference from an unencrypted one, and thus in being able to use the adversarial detection method with an encrypted neural network model.
	One of ordinary skill in the art would be motivated to do this in order to perform adversarial input detection over the cloud as a service, thus saving clients the cost and effort required to generate a custom method. (Tople, page 1, paragraph 1, line 1 “Recently, cloud providers have extended support for trusted hardware primitives such as Intel SGX.  Simultaneously, the field of deep learning is seeing enormous innovation and increase in adoption…. We first demonstrate that side-channel based attacks on DNN models are indeed possible.  We show that by observing access patterns, we can recover inputs to the DNN model.  This motivates the need for PRIVADO, a system we have designed for secure inference-as-a-service.”)
12.	Regarding claim 2,
	The combination of Xu, Mishra, and Tople teaches the method of claim 1, further comprising:
	comparing the difference metric to a predetermined threshold value. (Xu, Fig. 1; see mapping of claim 1. In other words, max(d1, d2) > T in Fig. 1 shows comparing the difference to a predetermined threshold T.)
13.	Regarding claim 3,
	The combination of Xu, Mishra, and Tople teaches the method of claim 1, further comprising: 
	determining that the difference metric exceeds the predetermined threshold value; and determining that the input data includes the adversarial data responsive to the determining that the difference metric exceeds the predetermined threshold value. (Xu, Fig. 1, Shows that when the difference metric exceeds a predetermined threshold value T, a determination that the input includes adversarial data is made.)
14.	Regarding claim 4,
	The combination of Xu, Mishra, and Tople teaches the method of claim 3, further comprising: 
	discarding the input data. (Xu, Fig. 1, and page 1, column 2, paragraph 3, line 11 “By comparing the difference between predictions with a selected threshold value, our system outputs the correct prediction for legitimate examples and rejects adversarial inputs.” In other words, rejects adversarial inputs is discarding the input data.)
15.	Regarding claim 5,
	The combination of Xu, Mishra, and Tople teaches the method of claim 2, further comprising: 
	determining that the difference metric does not exceed the predetermined threshold value; and determining a classification of the input data responsive to the determining that the difference metric does not exceed the predetermined threshold value. (Xu, Fig. 1, and page 1, column 2, paragraph 3, line 11 “By comparing the difference between predictions with a selected threshold value, our system outputs the correct prediction for legitimate examples and rejects adversarial inputs.” In other words, comparing the difference with a selected threshold value is determining the difference metric does not exceed the predetermined threshold value, and outputs the correct prediction for legitimate examples is determining a classification of the input data responsive to the determining that the difference metric does not exceed the predetermined threshold.)
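For illustration only, the threshold logic mapped in claims 2 through 5 (compare the difference to a threshold T, reject the input when the threshold is exceeded, and otherwise output a classification) can be sketched as follows. This is a minimal sketch, not code from Xu: the function name `detect_and_classify`, the use of `None` to signal a discarded input, and the sample values are assumptions made for the example.

```python
from typing import Optional, Sequence


def detect_and_classify(prediction: Sequence[float],
                        distances: Sequence[float],
                        threshold: float) -> Optional[int]:
    """Apply a max(d1, d2, ...) > T test to the per-squeezer distances.

    Returns None when the input is treated as adversarial and discarded,
    otherwise the index of the highest-scoring class in `prediction`.
    """
    if max(distances) > threshold:
        return None  # difference metric exceeds the threshold: reject input
    # Difference metric does not exceed the threshold: classify the input.
    return max(range(len(prediction)), key=lambda i: prediction[i])


# Hypothetical values: three class scores and two squeezer distances.
legitimate = detect_and_classify([0.1, 0.7, 0.2], [0.05, 0.30], threshold=1.0)
adversarial = detect_and_classify([0.1, 0.7, 0.2], [0.05, 1.30], threshold=1.0)
```

Here `legitimate` is class index 1 and `adversarial` is `None`, mirroring the two branches recited in claims 3 through 5.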
16.	Regarding claim 6,
	The combination of Xu, Mishra, and Tople teaches the method of claim 1, wherein 
	the first numerical precision level is greater than the second numerical precision level.  (Xu, Fig. 1 “The model is evaluated on both the original input and the input after being pre-processed by feature squeezers.” In other words, the original input, before pre-processing by the feature squeezers, has the first numerical precision level, which is greater than the second numerical precision level.)
17.	Regarding claim 7,
	The combination of Xu, Mishra, and Tople teaches the method of claim 1, wherein 
	the first numerical precision level is a full numerical precision level.  (Xu, Fig. 1 “The model is evaluated on both the original input and the input after being pre-processed by feature squeezers.” In other words, the original input, before pre-processing by the feature squeezers, has the first numerical precision level, which is a full numerical precision level.)
18.	Regarding claim 8,
	The combination of Xu, Mishra, and Tople teaches the method of claim 1, wherein 
	the first neural network model is a published neural network model with a known numerical precision level.  (Xu, TABLE 1,  

[Image: media_image5.png]

In other words, DenseNet and MobileNet are published neural network models with known numerical precision levels.)
19.	Regarding claim 9,
	The combination of Xu, Mishra, and Tople teaches the method of claim 1, wherein 
	the second neural network model is a reduced precision neural network model.  (Mishra, Figure 2, and, page 2, paragraph 2, line 8 “In our work, the student network has similar topology as that of the teacher network which has neurons operating at full-precision.” And, page 5, paragraph 2, line 5 “The first scheme (scheme-A) jointly trains both the networks – full-precision teacher and low-precision student network.” In other words, the low-precision student network is the second neural network model is a reduced precision neural network model.)
20.	Regarding claim 10,
	The combination of Xu, Mishra, and Tople teaches the method of claim 1, wherein
	the second neural network model is an encrypted neural network model. (Tople, See mapping of claim 1. Figure 3, and page 6, paragraph 4, line 2, “PRIVADO-Generator takes as input the ONNX representation of the model, the input-oblivious DNN framework that Step 1 generated, required SGX libraries, and an encryption key.  It outputs an enclave-executable model binary and the encrypted model parameters.”  In other words, Tople creates an encrypted neural network model.) 
21.	Regarding claim 11,
	The combination of Xu, Mishra, and Tople teaches the method of claim 1, wherein 
	one or more layers of the second neural network model include different numerical precision levels. (Mishra, page 2, paragraph 2, line 8 “In our work, the student network has similar topology as that of the teacher network, except that the student network has low-precision neurons compared to the teacher network which has neurons operating at full-precision.”  In other words, lower precision neurons is one or more layers of the second neural network model include different numerical precision levels.)
22.	Regarding claim 12,
	The combination of Xu, Mishra, and Tople teaches the method of claim 1, wherein 
	one or more of the first neural network or the second neural network includes a deep neural network (DNN).  (Xu, Fig. 1, and Abstract, line 7 “We propose a new strategy, feature squeezing, that can be used to harden DNN models by detecting adversarial examples.” In other words, DNN models is one or more of the first neural network or the second neural network includes a deep neural network (DNN).)
23.	Regarding claim 13,
	The combination of Xu, Mishra, and Tople teaches the method of claim 1, wherein 
	the input data includes image data.  (Xu, page 2, column 1, paragraph 3, line 3 “Our experiments show that joint-detection can successfully detect adversarial examples from eleven static attacks at the detection rates of 98% on MNIST and 85% on CIFAR-10 and ImageNet, with low (around 5%) false positive rates.”  In other words, ImageNet is image data.)
24.	Claims 14-17 are computer usable program product claims corresponding to method claims 1-4, respectively. Otherwise they are the same.  It is implicit that a computer implemented method requires computer usable program products in order to execute.  Therefore, claims 14-17 are rejected for the same reasons as claims 1-4, respectively.
25.	Regarding claim 18,
	The combination of Xu, Mishra, and Tople teaches the computer usable program product of claim 14, wherein
	the program instructions are stored in a computer readable storage medium in a data processing system, and wherein the program instructions are transferred over a network from a remote data processing system. (Tople, see mapping of claim 1. And, Figure 1, and page 2, column 2, paragraph 4, line 1 “Figure 1 shows the entities involved in such an inference service: the cloud provider, the model owner and multiple model users.  The cloud provider supports trusted hardware primitives such as Intel SGX. SGX-enabled CPUs create isolated execution environments called enclaves in which all data are encrypted.  SGX-enabled CPUs also remotely attest the code executing within the enclaves to ensure its integrity [20].”

[Image: media_image6.png]

In other words, CPUs also remotely attest is computer usable code transferred over a network.)

26.	Regarding claim 19,
	The combination of Xu, Mishra, and Tople teaches the computer usable program product of claim 14, wherein
	the program instructions are downloaded over a network to a remote data processing system for use in a computer readable storage medium associated with the remote data processing system. (Tople, Figure 1, and page 2, column 1, paragraph 3, line 1 “To address ease-of-use challenge, PRIVADO used the PRIVADO-Generator which takes as input models represented in the popular ONNX format [17], and automatically generates a minimal set of enclave-specific code and encrypted parameters for the model.”  In other words, Figure 1 shows a server data processing system together with the code generation process and remote processing.)
27.	Claim 20 is a computer system claim corresponding to method claim 1.  Otherwise, they are the same.  It is implicit that a computer implemented method requires a computer system in order to execute.  Therefore, claim 20 is rejected for the same reasons as claim 1.
Conclusion
	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to BART RYLANDER whose telephone number is (571)272-8359. The examiner can normally be reached Monday - Thursday 8:00 to 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at 571-270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/B.I.R./Examiner, Art Unit 2124        

                                                                                                                                                                                                
/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124