DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Examiner notes the entry of the following papers:
Amended claims filed 12/30/2021.
Applicant arguments/remarks made in amendment filed 12/30/2021.

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 12/30/2021 has been entered.
 
Claims 1, 5-7, 10, 14-16, and 19 are amended.
Claims 4 and 13 are cancelled.
Claims 1-3, 5-12, and 14-20 are pending.

Response to Arguments
Applicant’s arguments filed 12/30/2021, asserting that the prior art of record does not disclose the amended limitations, are moot in view of a new ground of rejection.  Please see the detailed rejection below.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5-7, 10-12, 14-16, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Graepel et al (ML Confidential: Machine Learning on Encrypted Data, herein Graepel), Chen et al (Privacy-Preserving Backpropagation Neural Network Learning, herein Chen) and Leboeuf et al (A New Learning Algorithm for Neural Networks with Integer Weights and Quantized Non-linear Activation Functions, herein Leboeuf).
Regarding claim 1,
	Graepel teaches a machine learning system, comprising:
	a memory storing at least one instruction, a processor communicatively coupled to the memory, wherein the processor is configured to access and execute at least one instruction from the memory for: inputting raw data to a first partition of a neural network, (Graepel, page 1, paragraph 1, line 1 “We demonstrate that, by using a recently proposed leveled homomorphic encryption scheme, it is possible to delegate the execution of a machine learning algorithm to a computing service while retaining confidentiality of the training and test data.” In other words, machine learning algorithm is machine learning system and test data is raw data.  Examiner notes that a machine learning algorithm implicitly requires a system with a processor, capable of executing instructions, and memory, in order to be implemented and executable.)
	wherein the first partition at least comprises [an activation function of the neural network and the activation function is configured to convert the raw data into irreversible metadata,] and (Graepel, page 1, paragraph 2, line 5 “In this work we propose a cloud service which provides confidential handling of machine learning tasks for various applications.  Machine learning (ML) consists of two stages, the training stage and the classification stage, either or both of which can be outsourced to the cloud.” And, page 2, paragraph 3, line 1 “One way to preserve confidentiality of data when outsourcing computation is to encrypt the data before uploading it to the cloud.  This may limit the utility of the data, but recent advances in cryptography allow searching on encrypted data and performing operations on encrypted data, all without decrypting it.  An encryption scheme which allow arbitrary operations on ciphertexts is called a Fully Homomorphic Encryption (FHE) scheme.” In other words, two stages, the training stage and the classification stage, either or both is a first partition, and encrypting with a homomorphic encryption scheme is applying an activation function configured to convert the raw data into irreversible metadata.)
	the metadata is transmitted to a second partition of the neural network to generate a learning result corresponding to the raw data.  (Graepel, page 1, paragraph 2, line 5 “In this work we propose a cloud service which provides confidential handling of machine learning tasks for various applications.  Machine learning (ML) consists of two stages, the training stage and the classification stage, either or both of which can be outsourced to the cloud.” In other words, two stages, the training stage and the classification stage, either… can be outsourced to the cloud is transmitted to a second partition to generate a result corresponding to the input data.)
	Thus far, Graepel does not explicitly teach an activation function of the neural network and the activation function is configured to convert the raw data into irreversible metadata,
	Chen teaches an activation function of the neural network and the activation function is configured to convert the raw data into irreversible metadata, (Chen, page 1, column 2, paragraph 3, contribution (3), line 1 “Our algorithms include a solution to a challenging technical problem, namely, privacy-preserving computation of activation function.  This problem is highly challenging because most of the frequently used activation functions are infinite and continuous while cryptographic tools are defined in finite fields.  To overcome this difficulty, we propose the first cryptographic method to securely compute sigmoid function, in which an existing piecewise linear approximation of the sigmoid function [8] is used.” In other words, piecewise linear approximation of the sigmoid function is activation function configured to convert the raw data into irreversible metadata.)
	Both Chen and Graepel are directed to preserving privacy of confidential data when using machine learning techniques.  In view of the teaching of Graepel, it would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Chen into Graepel.  This would result in being able to use an activation function to convert data into metadata.
	One of ordinary skill in the art would be motivated to do this to be able to use an activation function to further ensure the privacy of sensitive data during machine learning because privacy preservation is important. (Chen, page 1, column 1, paragraph 2, line 3 “In such 
	Thus far, the combination of Graepel and Chen does not explicitly teach wherein the activation function is a stepwise nonlinear function, and a domain of the activation function is divided into a plurality of intervals according to a number of division, and each of the plurality of intervals is presented with a step line corresponding to a fixed value in a range of the activation function, and the stepwise non-linear function is a combination of step lines along the divided intervals, wherein each output value of the stepwise nonlinear function corresponds to a plurality of different input values.
	Leboeuf teaches wherein the activation function is a stepwise nonlinear function, and a domain of the activation function is divided into a plurality of intervals according to a number of division, and each of the plurality of intervals is presented with a step line corresponding to a fixed value in a range of the activation function, and the stepwise non-linear function is a combination of step lines along the divided intervals, wherein each output value of the stepwise nonlinear function corresponds to a plurality of different input values. (Leboeuf, page 1, column 2, paragraph 1, line 2 “Piece-wise linear approximation (PWL), lookup tables (LUTs), and hybrid methods have been widely used for this purpose [1, 5, 7].” And, page 1, column 1, paragraph 1, line 1 “The hyperbolic tangent function is commonly used as the 

[Image: media_image1.png]

[Image: media_image2.png]

In other words, Leboeuf's activation function is the claimed activation function; piece-wise corresponds to stepwise; the hyperbolic tangent function is a nonlinear function; the range addressable lookup table architecture of Figure 2(b) shows the domain of the activation function divided into a plurality of intervals according to a number of division, where each of the plurality of intervals is presented with a step line corresponding to a fixed value in a range of the activation function; the quantized non-linear activation function is the stepwise nonlinear function; the plurality of steps in Figure 5 shows that the stepwise non-linear function is a combination of step lines along the divided intervals; and the fact that every output corresponds to a range of addresses means that each output value of the stepwise nonlinear function corresponds to a plurality of different input values.)
	Both Leboeuf and the combination of Graepel and Chen are directed to neural network activation functions, among other things.  In view of the teaching of the combination of Graepel and Chen, it would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Leboeuf into the combination of Graepel and Chen.  This would result in being able to use quantized non-linear activation functions in 
	One of ordinary skill in the art would be motivated to do this in order to more easily implement neural networks with nonlinear activation functions in hardware for faster inference. (Leboeuf, page 1, column 1, paragraph 1, line 3 “Since ANN systems are computationally intensive, they require large execution times in software implementations.  Hardware implementations can eliminate this issue.  One of the challenges presented when designing a hardware-based ANN system is the implementation of the activation function.”)
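For illustration only, the claimed stepwise nonlinear function can be sketched as a lookup over equal-width intervals, loosely in the spirit of a range addressable lookup table. This is a minimal sketch under stated assumptions, not code from Leboeuf or from the application: the function name `make_step_tanh`, the domain bounds `lo` and `hi`, and the choice of the interval midpoint as the fixed output value are all hypothetical.

```python
import bisect
import math


def make_step_tanh(num_divisions, lo=-4.0, hi=4.0):
    """Build a stepwise approximation of tanh over [lo, hi].

    The domain is cut into `num_divisions` equal intervals (assumed layout);
    every input in an interval maps to tanh evaluated at that interval's
    midpoint, so each interval is a flat step line.
    """
    width = (hi - lo) / num_divisions
    # Upper edges of every interval except the last one.
    edges = [lo + width * (i + 1) for i in range(num_divisions - 1)]
    # One fixed output value per interval.
    levels = [math.tanh(lo + width * (i + 0.5)) for i in range(num_divisions)]

    def step_tanh(x):
        # bisect_right returns the interval index; inputs outside [lo, hi]
        # fall into the first or the last interval.
        return levels[bisect.bisect_right(edges, x)]

    return step_tanh, levels


step_tanh, levels = make_step_tanh(num_divisions=8)

# Two different inputs in the same interval share one output value,
# so the input cannot be recovered from the output alone.
same_step = step_tanh(0.2) == step_tanh(0.9)
```

In this sketch the number of division controls how coarse the steps are: fewer divisions mean more distinct inputs collapse onto each output value.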
Regarding claim 2,
	The combination of Graepel, Chen, and Leboeuf teaches the machine learning system of claim 1
	further comprising a server communicatively coupled to the processor, wherein the server is configured to receive the metadata and input the metadata to the second partition that follows the first partition in the neural network.  (Graepel, page 1, paragraph 2, line 5 “In this work we propose a cloud service which provides confidential handling of machine learning tasks for various applications.” In other words, a cloud service is a server communicatively coupled to the local processor, provides confidential handling of machine learning tasks is configured to receive the metadata and input the metadata to the second partition.)  
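As a hedged illustration of the claimed arrangement, the sketch below splits a toy network into a first partition that runs locally and a second partition that runs on a server and sees only the metadata. It is not drawn from Graepel, Chen, or the application; the function names, the weights `W1` and `W2`, and the step width `step` are hypothetical.

```python
import math


def first_partition(raw, weights, step=0.5):
    """Local side: one linear layer followed by a coarse step activation."""
    metadata = []
    for row in weights:
        z = sum(w * x for w, x in zip(row, raw))
        # Snap each pre-activation down to a multiple of `step`;
        # many different inputs collapse onto the same level.
        metadata.append(math.floor(z / step) * step)
    return metadata


def second_partition(metadata, weights):
    """Server side: consumes only the metadata, never the raw input."""
    score = sum(w * m for w, m in zip(weights, metadata))
    return 1 if score > 0 else 0


W1 = [[1.0, -1.0], [0.5, 0.5]]  # hypothetical local weights
W2 = [1.0, 1.0]                 # hypothetical server weights

meta_a = first_partition([2.0, 0.5], W1)
meta_b = first_partition([2.1, 0.4], W1)
result = second_partition(meta_a, W2)
```

Here two distinct raw inputs produce identical metadata, which is the sense in which the server cannot invert the metadata back to a unique raw input.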
Regarding claim 3,
	The combination of Graepel, Chen, and Leboeuf teaches the machine learning system of claim 1,
wherein the activation function is ordered as a first activation function in the neural network. (Chen, page 1, column 2, paragraph 3, contribution (3), line 1 “Our algorithms include a solution to a challenging technical problem, namely, privacy-preserving computation of activation function.  This problem is highly challenging because most of the frequently used activation functions are infinite and continuous while cryptographic tools are defined in finite fields.  To overcome this difficulty, we propose the first cryptographic method to securely compute sigmoid function, in which an existing piecewise linear approximation of the sigmoid function [8] is used.” In other words, piecewise linear approximation of the sigmoid function is a first activation function in the neural network.)
Regarding claim 5,
	The combination of Graepel, Chen, and Leboeuf teaches the machine learning system of claim 1,
	wherein the activation function corresponds to a clipping value, and the clipping value and the number of division have a ratio, the activation function is configured to compare an input with the clipping value to generate a comparison result, and the activation function is configured to generate the metadata according to the ratio, the comparison result and the input. (Chen, page 5, column 2, paragraph 4, line 1 “In this section, we present a secure distributed algorithm for computing piecewise linear approximated sigmoid function y(x) (as shown in Algorithm 2).” And page 6, column 1, paragraph 3, line 1 “For ease of presentation, we write the algorithm under the assumption that x1 and x2 are integers.  Note that we can easily rewrite the algorithm to allow real numbers with precision of a few digits after the dot.” In other words, the piecewise linear approximation corresponds to a clipping value, and rewriting the algorithm to allow real numbers with precision of a few digits after the dot means the activation function is configured to generate the metadata according to the ratio.  Examiner notes that Chen shows its algorithm can easily be rewritten for arbitrary precision based on mathematical calculation, which is another way of saying it can calculate values based on a specific ratio.)

[Image: media_image3.png]
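One possible reading of the clipping limitation of claim 5 is sketched below for illustration. It is an assumption-laden example, not the formula of the application or of Chen: the name `clipped_step_activation`, the default values, and the choice to floor toward the lower step are hypothetical.

```python
import math


def clipped_step_activation(x, clip=4.0, num_divisions=8):
    """Hypothetical clipping-based step activation.

    ratio      = clip / num_divisions (width of one step)
    comparison = whether x reaches the clipping value
    output     = derived from the ratio, the comparison result and x
    """
    ratio = clip / num_divisions
    if x >= clip:   # comparison result: input is clipped
        return clip
    if x <= 0.0:    # assumed lower bound of the range
        return 0.0
    # Snap the input down to the nearest multiple of the ratio.
    return math.floor(x / ratio) * ratio


outputs = [clipped_step_activation(v) for v in (-1.0, 1.3, 1.49, 3.99, 5.0)]
```

With these defaults the ratio is 0.5, so the inputs 1.3 and 1.49 both map to 1.0, and any input at or above the clipping value maps to 4.0.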

Regarding claim 6,
	The combination of Graepel, Chen, and Leboeuf teaches the machine learning system of claim 1,	
	wherein the number of division is in a range between a first value and a second value.  (Chen, page 3, column 1, paragraph 2, line 5 “Approximating the activation function in a piecewise way offers us an opportunity to apply cryptographic tools to make the computation of sigmoid function privacy preserving.  Equation (3) is a piecewise linear approximation [8] of the sigmoid function 1/(1 + e^(-x)). Our privacy-preserving algorithm for BPN learning is based on this approximation. 

[Image: media_image4.png]


In other words, equation (3) shows the number of divisions in a range between a first value and a second value.)
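To make the idea of a piecewise linear approximation of the sigmoid concrete, a generic sketch follows. It does not reproduce Chen's Equation (3): the breakpoints below are assumed for illustration, and the segments are obtained by straight-line interpolation between exact sigmoid values rather than from the coefficients of the reference.

```python
import bisect
import math

# Assumed, illustrative breakpoints; not taken from the cited reference.
BREAKPOINTS = [-8.0, -4.0, -2.0, -1.0, 1.0, 2.0, 4.0, 8.0]
VALUES = [1.0 / (1.0 + math.exp(-b)) for b in BREAKPOINTS]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pwl_sigmoid(x):
    """Piecewise linear sigmoid: saturates outside the breakpoints and
    interpolates linearly between neighbouring breakpoints inside."""
    if x <= BREAKPOINTS[0]:
        return 0.0
    if x >= BREAKPOINTS[-1]:
        return 1.0
    i = bisect.bisect_right(BREAKPOINTS, x)
    x0, x1 = BREAKPOINTS[i - 1], BREAKPOINTS[i]
    y0, y1 = VALUES[i - 1], VALUES[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Worst-case gap to the true sigmoid on a grid over [-10, 10].
grid = [k / 100.0 for k in range(-1000, 1001)]
max_error = max(abs(pwl_sigmoid(x) - sigmoid(x)) for x in grid)
```

The number of linear pieces is fixed by the number of breakpoints (seven segments here, plus the two saturated tails), which is the kind of bounded division count the mapping above relies on.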
Regarding claim 7,
	The combination of Graepel, Chen, and Leboeuf teaches the machine learning system of claim 1,
	wherein the number of division is determined according to a content complexity of the raw data. (Chen, page 7, column 2, paragraph 1, line 1 “In this section, we analyze the computation and communication complexity of our privacy-preserving backpropagation algorithms.” And, page 6, column 1, paragraph 3, line 1 “For ease of presentation, we write the algorithm under the assumption that x1 and x2 are integers.  Note that we can easily rewrite the algorithm to allow real numbers with precision of a few digits after the dot.” In other words, rewrite the algorithm to allow real numbers with precision of a few digits after the dot is determining accuracy and range, which is the number of division, and analyze the computation and communication complexity is according to a content complexity.) 
Claims 10-12, and 14-16 are machine learning method claims corresponding to machine learning system claims 1-3, and 5-7, respectively.  Otherwise, they are the same. Machine 
Claims 19-20 are non-transitory computer readable medium claims corresponding to machine learning system claims 1-2, respectively.  Otherwise, they are the same.  It is implicit that a machine learning system must have a non-transitory computer readable medium for storage.  Therefore, claims 19-20 are rejected for the same reasons as claims 1-2, respectively.
Claims 8-9, and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Graepel, Chen, Leboeuf and Shokri et al (Privacy-Preserving Deep Learning, herein Shokri).
Regarding claim 8,
	The combination of Graepel, Chen, and Leboeuf teaches the machine learning system of claim 1,
	Thus far, the combination of Graepel, Chen, and Leboeuf does not explicitly teach wherein the first partition comprises a convolution layer.  
	Shokri teaches wherein the first partition comprises a convolution layer.  (Shokri, page 1316, column 1, paragraph 1, line 1 “To show the effectiveness of our approach compared to conventional stochastic gradient descent, we evaluate the accuracy of SSGD and SGD when training a convolutional neural network (CNN) on the MNIST and SVHN datasets.” In other words, training a convolutional neural network (CNN) is comprises a convolution layer.  See Figure 3, for Pseudocode of DSSGD (distributed selective stochastic gradient descent) for participant i, and Figure 4, for Pseudocode of DSSGD on the server.) 

    PNG
    media_image5.png
    478
    534
    media_image5.png
    Greyscale


    PNG
    media_image6.png
    358
    535
    media_image6.png
    Greyscale

	Both Shokri and the combination of Graepel, Chen, and Leboeuf are directed to preserving privacy of confidential data when using machine learning systems.  In view of the teaching of Graepel, Chen, and Leboeuf, it would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Shokri into the combination of Graepel, Chen, and Leboeuf.  This would enable protecting privacy of data when using a convolutional neural network (CNN).

Regarding claim 9,
	The combination of Graepel, Chen, Leboeuf, and Shokri teaches the machine learning system of claim 1,
	wherein the second partition comprises at least one of a convolution layer, a pooling layer and a fully connected layer. (Shokri, page 7, column 1, paragraph 1, line 1 “To show the effectiveness of our approach compared to conventional stochastic gradient descent, we evaluate the accuracy of SSGD and SGD when training a convolutional neural network (CNN) on the MNIST and SVHN datasets.” In other words, training a convolutional neural network (CNN) is comprises at least one of a convolution layer, a pooling layer and a fully connected layer.  Examiner notes that CNNs typically include convolution, pooling, and fully connected layers.  Also note, DSSGD is a distributed algorithm which means a portion of the calculations can be done remotely. In addition, Graepel shows that a portion, i.e. partition, of the machine learning method can be shifted to the cloud.  See paragraph 9 above.)
Claims 17-18 are machine learning method claims corresponding to machine learning system claims 8-9, respectively.  Otherwise, they are the same. Therefore, claims 17-18 are rejected for the same reasons as claims 8-9, respectively.
Conclusion
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to BART RYLANDER whose telephone number is (571)272-8359. The examiner can normally be reached Monday - Thursday 8:00 to 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached on 571-270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/B.I.R./Examiner, Art Unit 2124                 

/BRIAN M SMITH/Primary Examiner, Art Unit 2122