Detailed Action
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending.

Drawings
New corrected drawings in compliance with 37 CFR 1.121(d) are required in this application because, according to paragraph [0090] of the specification, element 690 of Figure 2B should read ‘Perform method of FIG. 12’ instead of ‘Perform method of FIG. 2’. Applicant is advised to employ the services of a competent patent draftsperson outside the Office, as the U.S. Patent and Trademark Office no longer prepares new drawings. The corrected drawings are required in reply to the Office action to avoid abandonment of the application. The requirement for corrected drawings will not be held in abeyance.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding claim 1,
2A Prong 1: The limitation of measuring respective similarities for each of a set of intermediate representations to input information used as an input to the deep learning inference system is a mental process, because measuring similarity between a plurality of datasets can be performed in the human mind or by using pen and paper. The limitation of selecting a subset of the set of intermediate representations that are most similar to the input information is also a mental process, because the limitation encompasses a user comparing the information and selecting from it. The limitation of determining a partitioning point in the plurality of layers used to partition the plurality of layers into two partitions defined so that information leakage for the two partitions will meet a privacy parameter is a mental process, because determining a point of partitioning using specific criteria can be done in the human mind or using pen and paper. The limitation of outputting the partitioning point for use in partitioning the plurality of layers into the two partitions also recites a mental process, as it is a decision or determination that can be made in the human mind.
2A Prong 2: The judicial exception is not integrated into a practical application. There is no additional element recited in the claim. 
2B: The claim does not recite additional elements that amount to significantly more than the judicial exception. The limitation of using a deep learning inference system merely indicates the particular technological field or environment in which the abstract idea is performed (MPEP 2106.05(h)). The claim is not patent eligible.
Regarding claim 9, the recited apparatus comprising a memory having computer program code and one or more processors that retrieve and execute the program code amounts to generic computer components. Claim 9 is an apparatus claim having similar limitations to method claim 1. Therefore, it is directed to an abstract idea under the same rationale as claim 1 above.
Regarding claim 17, the limitation of a computer program product comprising computer readable storage medium having program instructions, is a generic computer function running in a 

Regarding claim 2, 
2A Prong 1: The limitation wherein the deep learning inference system is a first deep learning inference system, and wherein selecting a subset of the set of intermediate representations that are most similar to the input information, is a mental process, because the limitation encompasses a user comparing two subsets with input information and choosing the one that is most similar to the input. The limitation of creating inferencing output for each of the set of intermediate representations is a mental process, as it is merely a process of inferring an output from specific data, which can be done in the human mind.
2A Prong 2: The judicial exception is not integrated into a practical application.
2B: The claim does not recite additional elements that amount to significantly more than the judicial exception. The limitation of using a second deep learning inference system merely indicates the particular technological field or environment in which the abstract idea is performed (MPEP 2106.05(h)).
Claim 10 is an apparatus claim having similar limitations to method claim 2 above. Therefore, it is rejected under the same rationale as claim 2 above.
Claim 18 is a computer program product claim having similar limitations to method claim 2 above. Therefore, it is rejected under the same rationale as claim 2 above.

Regarding claim 3,
2A Prong 1: The limitation of selecting a subset of the set of intermediate representations that are most similar to the input information is a mental process, because the limitation encompasses a user comparing two subsets with input information and choosing the one that is most similar to the input. The limitation of projecting feature maps for each of the set of intermediate representations into a same input format as the input information is also a mental process.
2A Prong 2: The judicial exception is not integrated into a practical application.
2B: The claim does not recite additional elements that amount to significantly more than the judicial exception. The limitation of a second deep learning inference system merely indicates the particular technological field or environment in which the abstract idea is performed (MPEP 2106.05(h)). The claim is not patent eligible.
Claim 11 is an apparatus claim having similar limitations to method claim 3 above. Therefore, it is rejected under the same rationale as claim 3 above.
Claim 19 is a computer program product claim having similar limitations to method claim 3 above. Therefore, it is rejected under the same rationale as claim 3 above.

Regarding claim 4, the limitation wherein the input format is an image format merely specifies the particular format of data used to perform the process.
Claim 12 is an apparatus claim having similar limitations to method claim 4 above. Therefore, it is rejected under the same rationale as claim 4 above.

Regarding claim 5, 
2A Prong 1: The limitation of selecting a subset of the set of intermediate representations that are most similar to the input information is a mental process, because the limitation encompasses a user comparing two subsets with input information and choosing the one that is most similar to the input. The limitation of measuring a similarity between the intermediate representations and the input information by comparing the output data is a mental process, because measuring similarity between the images and the input information can be done in the human mind. The limitation of selecting a single intermediate representation is also a mental process, because selecting an image from a set of images using specific criteria can be done in the human mind.
2A Prong 2: The judicial exception is not integrated into a practical application.
2B: The claim does not recite additional elements that amount to significantly more than the judicial exception. 
Claim 13 is an apparatus claim having similar limitations to method claim 5 above. Therefore, it is rejected under the same rationale as claim 5 above.
Claim 20 is a computer program product claim having similar limitations to method claim 5 above. Therefore, it is rejected under the same rationale as claim 5 above.

Regarding claim 6, 
2A Prong 1: The limitation wherein measuring a similarity between the intermediate representations and the input information further comprises measuring a similarity metric between the output of the inference system for each of the intermediate representations and the input information is a mental process, because it recites measuring similarity between two different sets of data, which can be done with pen and paper. The limitation wherein selecting a single intermediate representation for each of the layers comprises selecting an intermediate representation at each layer that has a minimum measured similarity metric is also a mental process, because selecting specific data using specific criteria can be performed in the human mind or using pen and paper.
2A Prong 2: The judicial exception is not integrated into a practical application.
2B: The claim does not recite additional elements that amount to significantly more than the judicial exception. The limitation of a deep learning inference system merely indicates the particular technological field or environment in which the abstract idea is performed (MPEP 2106.05(h)). The claim is not patent eligible.


Regarding claim 7, 
2A Prong 1: The limitation wherein the similarity metric is a divergence, wherein the divergence is a first divergence, and wherein inferencing for the first and second systems is performed for N classes, is a mathematical process, because calculating a divergence is a mathematical process. The limitation of determining a partitioning point in the plurality of layers is a mental process, because selecting a point using specific criteria can be done using pen and paper. The limitation of computing a second divergence between the output of the first deep learning inference system and a uniform distribution of a probability vector is a mathematical process. Calculating a ratio for each of the plurality of layers and comparing the calculated ratios are also mathematical processes.
2A Prong 2: The judicial exception is not integrated into a practical application.
2B: The claim does not recite additional elements that amount to significantly more than the judicial exception.
Claim 15 is an apparatus claim having similar limitations to method claim 7 above. Therefore, it is rejected under the same rationale as claim 7 above.

Regarding claim 8, 
2A Prong 1: The limitation of determining a partitioning point as a selected layer in the plurality of layers where a corresponding calculated ratio is greater than the value of the privacy parameter is a mental process, as determining a specific point by using specific criteria can be done using pen and paper or in the human mind.
2A Prong 2: The judicial exception is not integrated into a practical application.
2B: The claim does not recite additional elements that amount to significantly more than the judicial exception.
Claim 16 is an apparatus claim having similar limitations to method claim 8 above. Therefore, it is rejected under the same rationale as claim 8 above.
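
For reference, the ratio comparison discussed for claims 7 and 8 can be sketched as follows. This is only an illustration under stated assumptions: the ratio is taken to be the first divergence for a layer divided by the second divergence (output versus the uniform distribution), the first layer whose ratio exceeds the privacy parameter is selected, and every function name and numeric value below is hypothetical rather than taken from the application.

```python
import math

def kl_to_uniform(probs):
    """D_KL(probs || u), where u is the uniform distribution over len(probs) classes."""
    n = len(probs)
    # Terms with p == 0 contribute nothing to the sum.
    return sum(p * math.log(p * n) for p in probs if p > 0)

def choose_partition_layer(layer_divergences, output_probs, privacy_parameter):
    """Return the 1-based index of the first layer whose ratio exceeds the
    privacy parameter, or None if no layer qualifies (assumed selection rule)."""
    baseline = kl_to_uniform(output_probs)
    if baseline == 0:
        raise ValueError("output is uniform; the ratio is undefined")
    for layer, divergence in enumerate(layer_divergences, start=1):
        if divergence / baseline > privacy_parameter:
            return layer
    return None

# Hypothetical values: classification output for the input, and the
# per-layer first divergences (one value per layer).
output_probs = [0.7, 0.2, 0.1]
per_layer = [0.02, 0.05, 0.35, 0.45]
layer = choose_partition_layer(per_layer, output_probs, privacy_parameter=1.0)
```

With these illustrative numbers the third layer is the first whose ratio exceeds 1.0; a different privacy parameter or ratio definition would move the partitioning point.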

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
In regards to claim 1, the claim recites the limitation ‘determining, using the selected subset of intermediate representations, a partitioning point in the plurality of layers used to partition the plurality of layers into two partitions defined so that information leakage for the two partitions will meet a privacy parameter when a first of the two partitions is prevented from leaking information’ in lines 9-13. The claim is indefinite because it is unclear what constitutes the privacy parameter, what constitutes meeting a privacy parameter for leakage, and how determining a partitioning point can meet the privacy parameter for leakage.
the neural network operation is partitioned to prevent privacy leakage.
Claims 2-8 depend on claim 1 and inherit the same deficiency. Therefore, they are rejected for the same reasons as claim 1.
Claims 9 and 17 have similar limitations to method claim 1 above. Therefore, they are rejected under the same rationale as claim 1 above.
Claims 10-16 and 18-20 depend on claim 9 and claim 17 and inherit the same deficiency. Therefore, they are rejected for the same reasons as claim 9 and claim 17.


Claim Rejections - 35 USC § 102
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-20 are rejected under 35 U.S.C. 102 over Gu (Gu et al, 07/03/2018, “Securing Input Data of Deep Learning Inference Systems via Partitioned Enclave Execution”).

Regarding claim 1, Gu teaches a method for addressing information leakage in a deep learning service ([Gu, page 1, left column, Abstract, line 9-18] “In this paper, we systematically investigate the potential information exposure in deep-learning based AI inference systems. Based on our observation, we develop DeepEnclave, a privacy-enhancing system to mitigate sensitive information disclosure in deep learning inference pipelines. The key innovation is to partition deep learning models and leverage secure enclave techniques on cloud infrastructures to cryptographically protect the confidentiality and integrity of user inputs”), comprising: 
[Gu, page 7, line 16-18 from the top of the right column] “By measuring the similarity of classification results, we can deduce whether a specific IR image is visually similar to its original input”, [Gu, page 7, line 32-40, Neural Network Assessment Framework] “We use the Kullback-Leibler (KL) divergence to measure the similarity of classification results. At each Layer i, we select the IR image with the minimum KL divergence DKL with the input x to quantitatively measure the dist[x, IRi]: ∀ j ∈ [1, filter_num(Li)] … where F*(·, θ) is the representation function shared by both IRGenNet and IRValNet”); 
selecting a subset of the set of intermediate representations that are most similar to the input information ([Gu, page 7, line 9-13 from the top of the right column] “Instead, we replace human subjects with another ConvNet (by exploiting ConvNet’s approaching-human visual recognition capability) to automatically assess all IR images and identify the ones revealing most input information at each layer” discloses the IR images are selected from each of the layer (i.e. selecting plurality of images), [Gu, page 8, left column, 6.1.1 Model Analysis, line 13-17] “Figure 4. For each hidden layer, we choose the IR image that has the minimum KL divergence. For example, Layer 1 is a convolutional layer and the most similar IR image to the original input is generated by the 6th filter of this layer”); 
determining, using the selected subset of intermediate representations, a partitioning point in the plurality of layers used to partition the plurality of layers into two partitions defined so that information leakage for the two partitions will meet a privacy parameter when a first of the two partitions is prevented from leaking information, and outputting the partitioning point for use in partitioning the plurality of layers of the deep learning inference system into the two partitions ([Gu, page 7, right column, Neural Network Assessment Framework, line 34-44] “At each Layer i, we select the IR image with the minimum KL divergence DKL with the input x to quantitatively measure the … To determine the optimal partitioning point for each neural network, we compute (F∗(x,θ)||µ) where µ ∼ U(0,N), the uniform distribution of the probability vector and N is the number of classes”, the paragraph discloses the process of determining the partitioning point. As partitioning point is determined, it is obvious that the result will be outputted); 
Regarding claim 9, Gu teaches an apparatus for addressing information leakage in a deep learning service, comprising: memory having computer program code; and one or more processors, wherein the one or more processors, in response to retrieval and execution of the computer program code, cause the apparatus to perform operations ([Gu, page 4, left column, 1st paragraph of the 4 Threat Model-the end of the 2nd paragraph] “In our threat model, we assume that adversaries are able to obtain data from machines of deep learning cloud systems. There are multiple ways for them to achieve that. For example, attackers may exploit some zero-day vulnerabilities to penetrate and compromise the system software of the server. Insiders, such as cloud administrators, can also retrieve and leak data from the servers on purpose. The data can be files on disks or snapshots of physical memory. We assume that adversaries understand the format of the files stored on disks and they are able to locate and extract structured data (of their interest) from memory snapshots. We also expect that adversaries master the state-of-the-art techniques [18, 29, 43] for reconstructing inputs from IRs. However, we assume that the adversaries cannot break into the perimeters of CPU packages to track the code execution and data flow at the processor level. We do not intend to address the side-channel attacks against Intel SGX in this paper. But in Section 9 we introduce some recent representative SGX side-channel attacks, give an in-depth analysis why the core computation of deep neural networks is still resilient to side channel attacks, and the potential vulnerabilities”, the paragraph suggests that the invention runs in generic computer with processors and memories. The memory with computer program code and processor that execute the program code are generic computer component). Claim 9 is an apparatus claim having similar limitations to method claim 1 above. 
Therefore, it is rejected under the same rationale as claim 1 above.

Regarding claim 17, Gu teaches a computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by an apparatus to cause the apparatus to perform operations ([Gu, page 4, left column, 1st paragraph of the 4 Threat Model-the end of the 2nd paragraph] “In our threat model, we assume that adversaries are able to obtain data from machines of deep learning cloud systems. There are multiple ways for them to achieve that. For example, attackers may exploit some zero-day vulnerabilities to penetrate and compromise the system software of the server. Insiders, such as cloud administrators, can also retrieve and leak data from the servers on purpose. The data can be files on disks or snapshots of physical memory. We assume that adversaries understand the format of the files stored on disks and they are able to locate and extract structured data (of their interest) from memory snapshots. We also expect that adversaries master the state-of-the-art techniques [18, 29, 43] for reconstructing inputs from IRs. However, we assume that the adversaries cannot break into the perimeters of CPU packages to track the code execution and data flow at the processor level. We do not intend to address the side-channel attacks against Intel SGX in this paper. But in Section 9 we introduce some recent representative SGX side-channel attacks, give an in-depth analysis why the core computation of deep neural networks is still resilient to side channel attacks, and the potential vulnerabilities”, a computer program product comprising a computer readable medium is a generic computer program that runs on generic computer components). Claim 17 is a computer program product claim having similar limitations to method claim 1 above. Therefore, it is rejected under the same rationale as claim 1 above.


creating, using a second deep learning inference system, inferencing output for each of the set of intermediate representations ([Gu, page 7, Figure 3] discloses the architecture of the inference system with the second learning inference system (the lower square of the figure), [Gu, page 7, line 22-27 from the top of the right column, Neural Network Assessment Framework] “In Figure 3, we present the Dual-ConvNet architecture of our neural network assessment framework. We submit an input x to the IR Generation ConvNet (IRGenNet) and generate IRi, i ∈ [1, n]. Each IRi contains multiple feature maps after passing Layer i (Li). Then we project feature maps to IR images and submit them to the IR Validation ConvNet (IRValNet), which shares the same network architecture/weights as the IRGenNet. The outputs of both ConvNets are N-dimensional (N is the number of classes) probability vectors with class scores. We use the Kullback-Leibler (KL) divergence to measure the similarity of classification results”).
Claim 10 is an apparatus claim having similar limitations to method claim 2 above. Therefore, it is rejected under the same rationale as claim 2 above.
Claim 18 is a computer program product claim having similar limitations to method claim 2 above. Therefore, it is rejected under the same rationale as claim 2 above.

Regarding claim 3, Gu teaches the method of claim 2, wherein selecting a subset of the set of intermediate representations that are most similar to the input information comprises:
projecting, prior to creating the inferencing output, feature maps for each of the set of intermediate representations into a same input format as used by the input information ([Gu, page 7, right column, line 27-34 from the top of the right column, Neural Network Assessment Framework] “Then we project feature maps to IR images and submit them to the IR Validation ConvNet (IRValNet), which shares the same network architecture/weights as the IRGenNet. The outputs of both ConvNets are N-dimensional (N is the number of classes) probability vectors with class scores. We use the Kullback-Leibler (KL) divergence to measure the similarity of classification results”); 
and inputting the intermediate representations in the input format to the second deep learning inference system for the second deep learning inference system to use when creating the inferencing output ([Gu, page 7, Figure 3; page 7, line 27-32 from the top of the right column, Neural Network Assessment Framework] “Then we project feature maps to IR images and submit them to the IR Validation ConvNet (IRValNet), which shares the same network architecture/weights as the IRGenNet. The outputs of both ConvNets are N-dimensional (N is the number of classes) probability vectors with class scores”, discloses the process using the intermediate representations to create the inferencing output from the second deep learning system. The inferencing outputs are N-dimensional probability vectors with class scores).
Claim 11 is an apparatus claim having similar limitations to method claim 3 above. Therefore, it is rejected under the same rationale as claim 3 above.
Claim 19 is a computer program product claim having similar limitations to method claim 3 above. Therefore, it is rejected under the same rationale as claim 3 above.

Regarding claim 4, Gu teaches the method of claim 3, wherein the input format is an image format ([Gu, page 7,  line 27-29 from the top of the right column, Neural Network Assessment Framework] “Then we project feature maps to IR images and submit them to the IR Validation ConvNet (IRValNet)”).
Claim 12 is an apparatus claim having similar limitations to method claim 4 above. Therefore, it is rejected under the same rationale as claim 4 above.

Regarding claim 5, Gu teaches the method of claim 2, wherein selecting a subset of the set of intermediate representations that are most similar to the input information comprises: measuring a similarity between the intermediate representations and the input information by comparing the output of the second deep learning inference system for each of the intermediate representations with a corresponding intermediate representation in the input format; and selecting a single intermediate representation for each of the plurality of layers and their corresponding one or more intermediate representations ([Gu, page 7, line 9-13 from the top of the right column] “Instead, we replace human subjects with another ConvNet (by exploiting ConvNet’s approaching-human visual recognition capability) to automatically assess all IR images and identify the ones revealing most input information at each layer” discloses the IR images are selected from each of the layer (i.e. selecting plurality of images), [Gu, page 8, left column, 6.1.1 Model Analysis, line 13-17] “Figure 4. For each hidden layer, we choose the IR image that has the minimum KL divergence. For example, Layer 1 is a convolutional layer and the most similar IR image to the original input is generated by the 6th filter of this layer”, [Gu, page 7, line 30-40 from the top of the right column, Neural Network Assessment Framework; Equation (2)] “The outputs of both ConvNets are N-dimensional (N is the number of classes) probability vectors with class scores. We use the Kullback-Leibler (KL) divergence to measure the similarity of classification results. At each Layer i, we select the IR image with the minimum KL divergence DKL with the input x to quantitatively measure the dist[x, IRi]: ∀ j ∈ [1, filter_num(Li)],
$dist[x, IR_i] = \min_j \left( D_{KL}\left( F^*(x, \theta) \,\|\, F^*(IR_{ij}, \theta) \right) \right) = \min_j \left( \sum_k F^*(x, \theta)_k \log \frac{F^*(x, \theta)_k}{F^*(IR_{ij}, \theta)_k} \right)$
where F*(·, θ) is the representation function shared by both IRGenNet and IRValNet”, IRValNet corresponds to the second deep learning inference system).
Claim 13 is an apparatus claim having limitations similar to those of method claim 5 above. Therefore, it is rejected under the same rationale as claim 5 above.


Regarding claim 6, Gu teaches the method of claim 5, wherein: measuring a similarity between the intermediate representations and the input information further comprises measuring a similarity metric between the output of the second deep learning inference system for each of the intermediate representations and the input information ([Gu, page 7, line 34-40 from the top of the right column, Neural Network Assessment Framework] “At each Layer i, we select the IR image with the minimum KL divergence D_KL with the input x to quantitatively measure the dist[x, IR_i]: ∀ j ∈ [1, filter_num(L_i)],
$\mathrm{dist}[x, IR_i] = \min_j \left( D_{KL}\left( F^*(x, \theta) \,\|\, F^*(IR_i^j, \theta) \right) \right) = \min_j \left( \sum_k F^*(x, \theta)_k \log \frac{F^*(x, \theta)_k}{F^*(IR_i^j, \theta)_k} \right)$
where F*( . , Θ) is the representation function shared by both IRGenNet and IRValNet”, the first deep learning inference system is the IRGenNet, and the second deep learning inference system is the IRValNet. F*( . , Θ) represents the output of IRGenNet or IRValNet); and selecting a single intermediate representation for each of the plurality of layers and their corresponding one or more intermediate representations further comprises selecting an intermediate representation at each layer that has a minimum measured similarity metric ([Gu, page 7, line 9-13 from the top of the right column] “Instead, we replace human subjects with another ConvNet (by exploiting ConvNet’s approaching-human visual recognition capability) to automatically assess all IR images and identify the ones revealing most input information at each layer” discloses that the IR images are selected from each of the layers (i.e., selecting a plurality of images), [Gu, page 8, left column, 6.1.1 Model Analysis, line 13-17] “Figure 4. For each hidden layer, we choose the IR image that has the minimum KL divergence. For example, Layer 1 is a convolutional layer and the most similar IR image to the original input is generated by the 6th filter of this layer”, [Gu, page 7, line 30-40, Neural Network Assessment Framework; Equation (2)] “The outputs of both ConvNets are N-dimensional (N is the number of classes) probability vectors with class scores. We use the Kullback-Leibler (KL) divergence to measure the similarity of classification results. At each Layer i, we select the IR image with the minimum KL divergence D_KL with the input x to quantitatively measure the dist[x, IR_i]: ∀ j ∈ [1, filter_num(L_i)],
$\mathrm{dist}[x, IR_i] = \min_j \left( D_{KL}\left( F^*(x, \theta) \,\|\, F^*(IR_i^j, \theta) \right) \right) = \min_j \left( \sum_k F^*(x, \theta)_k \log \frac{F^*(x, \theta)_k}{F^*(IR_i^j, \theta)_k} \right)$
          where F*( . , Θ) is the representation function shared by both IRGenNet and IRValNet”, IRValNet corresponds to the second deep learning inference system).
Claim 14 is an apparatus claim having limitations similar to those of method claim 6 above. Therefore, it is rejected under the same rationale as claim 6 above.
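
For illustration only, the per-layer distance of Gu's Equation (2) as quoted above can be sketched in a few lines of Python. This is a minimal sketch, not Gu's implementation: the function names, the smoothing constant `eps`, and the class-score vectors are hypothetical values chosen for the example.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete probability vectors of equal length.

    eps is a hypothetical smoothing constant that avoids log(0).
    """
    return sum(pk * math.log((pk + eps) / (qk + eps))
               for pk, qk in zip(p, q) if pk > 0)

def layer_distance(input_probs, ir_probs_per_filter):
    """Return (minimum KL divergence, filter index j) for one layer."""
    divergences = [kl_divergence(input_probs, q) for q in ir_probs_per_filter]
    best_j = min(range(len(divergences)), key=divergences.__getitem__)
    return divergences[best_j], best_j

# Made-up class scores for N = 3 classes: F*(x, theta) and one vector per filter.
f_x = [0.7, 0.2, 0.1]
f_ir = [[0.1, 0.1, 0.8], [0.6, 0.3, 0.1], [0.3, 0.4, 0.3]]
dist, j = layer_distance(f_x, f_ir)
```

In this toy data the second filter (index 1) yields class scores closest to those of the input, so it is the one selected for the layer.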

Regarding claim 7, Gu teaches the method of claim 6, wherein the similarity metric is a divergence, wherein the divergence is a first divergence, inferencing for the first and second deep learning inference systems is performed for N classes ([Gu, page 7, Neural Network Assessment Framework] “Then we project feature maps to IR images and submit them to the IR Validation ConvNet (IRValNet), which shares the same network architecture/weights as the IRGenNet. The outputs of both ConvNets are N-dimensional (N is the number of classes) probability vectors with class scores”, the first inference system is the IRGenNet, the second inference system is the IRValNet), and wherein determining a partitioning point in the plurality of layers further ([Gu, page 7, Neural Network Assessment Framework, right after the equation (2)] “To determine the optimal partitioning point for each neural network, we compute D_KL(F*(x, Θ)||µ) where µ~U(0;N), the uniform distribution of the probability vector and N is the number of classes. This represents that A1 has no prior knowledge of x before”) comprises: computing a second divergence between inferencing output of the first deep learning inference system for the input information and a uniform distribution of a probability vector for the N classes ([Gu, page 7, Neural Network Assessment Framework, after the equation (2)] “To determine the optimal partitioning point for each neural network, we compute D_KL(F*(x, Θ)||µ) where µ~U(0;N), the uniform distribution of the probability vector and N is the number of classes. This represents that A1 has no prior knowledge of x before”, D_KL(F*(x, Θ)||µ) corresponds to the second divergence); calculating a ratio for each of the plurality of layers as the corresponding selected intermediate representation divided by the second divergence ([Gu, 3 Problem Definition, page 3, right column, the last paragraph – page 4, left column, the first paragraph; equation (1)] “dist[x, x̃ | x̃ ← A(B, IR)] / dist[x, x̃ | x̃ ← A(B)] ≤ ξ … where ξ is the privacy parameter to bound the distances between x and x̃ before and after observing IR and ξ ∈ [0, 1]. Specifically, dist[x, x̃ | x̃ ← A(B)] considers that x̃ is reconstructed only based on adversaries’ background knowledge B.”, the equation discloses the ratio calculation); and comparing the calculated ratios with a value of the privacy parameter ([Gu, 3 Problem Definition, page 3, right column, the last paragraph – page 4, left column, the first paragraph; equation (1)] “dist[x, x̃ | x̃ ← A(B, IR)] / dist[x, x̃ | x̃ ← A(B)] ≤ ξ … where ξ is the privacy parameter to bound the distances between x and x̃ before and after observing IR and ξ ∈ [0, 1]”, the equation discloses the calculated ratio compared with the privacy parameter ξ).
Claim 15 is an apparatus claim having limitations similar to those of method claim 7 above. Therefore, it is rejected under the same rationale as claim 7 above.
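
For illustration only, the claim 7 steps (second divergence against a uniform distribution, per-layer ratio, comparison with the privacy parameter) might be sketched as follows. This is a sketch under stated assumptions: the per-layer distances, the class scores, and the privacy parameter value are made-up numbers, and the helper names are hypothetical rather than taken from Gu or from the application.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) for discrete distributions; terms with p_k == 0 contribute 0."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)

def leakage_ratios(layer_dists, input_probs):
    """Divide each layer's selected (minimum) divergence by D_KL(F*(x) || uniform)."""
    n = len(input_probs)
    uniform = [1.0 / n] * n
    baseline = kl_divergence(input_probs, uniform)  # the "second divergence"
    if baseline == 0:
        raise ValueError("input prediction is uniform; the ratio is undefined")
    return [d / baseline for d in layer_dists]

# Hypothetical values: class scores for N = 3 and one selected distance per layer.
f_x = [0.7, 0.2, 0.1]
layer_dists = [0.03, 0.15, 0.45]
ratios = leakage_ratios(layer_dists, f_x)

privacy_param = 1.0  # assumed value of the privacy parameter
exceeds = [r > privacy_param for r in ratios]
```

With these numbers only the third layer's ratio exceeds the assumed privacy parameter.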

Regarding claim 8, Gu teaches the method of claim 7, wherein determining a partitioning point in the plurality of layers further comprises: determining the partitioning point as a selected layer in the plurality of layers where a corresponding calculated ratio is greater than the value of the privacy parameter ([Gu, page 3, right column, the last paragraph – page 4, left column, the first paragraph] “The Front-Net representation function F(.) is considered to violate the ξ-privacy for x, if there exists an attack A, background knowledge B and intermediate representation IR, dist[x, x̃ | x̃ ← A(B, IR)] / dist[x, x̃ | x̃ ← A(B)] ≤ ξ … where ξ is the privacy parameter to bound the distances between x and x̃ before and after observing IR and ξ ∈ [0, 1]. The dist measures the distance between an original input x and a reconstructed input x̃. Specifically, dist[x, x̃ | x̃ ← A(B)] considers that x̃ is reconstructed only based on adversaries’ background knowledge B. Whereas in dist[x, x̃ | x̃ ← A(B, IR)], x̃ is reconstructed based on both the adversaries’ background knowledge B and the observed IR. Eq. 1 says that the privacy of the true inference input x is breached if adversaries can significantly reduce the distance between x̃ and x after obtaining the intermediate representation IR of x”).
Claim 16 is an apparatus claim having limitations similar to those of method claim 8 above. Therefore, it is rejected under the same rationale as claim 8 above.
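
For illustration only, the selection step recited in claim 8 can be sketched as a scan over per-layer ratios. The claim language only requires a layer whose ratio is greater than the privacy parameter; choosing the first such layer, the 1-based layer numbering, and the sample ratios are assumptions made for this example.

```python
def partitioning_point(ratios, privacy_param):
    """Return the 1-based index of the first layer whose ratio exceeds
    the privacy parameter, or None if no layer qualifies."""
    for layer_index, ratio in enumerate(ratios, start=1):
        if ratio > privacy_param:
            return layer_index
    return None

# Hypothetical per-layer ratios and privacy parameter.
point = partitioning_point([0.10, 0.51, 1.52, 2.30], privacy_param=1.0)
```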

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

	Claims 1-6, 9-14, and 17-20 are rejected under 35 U.S.C. 103 over Tuor (Tuor et al, 05/04/2018, “Understanding information leakage of distributed inference with deep neural networks: Overview of information theoretic approach and initial results”) in view of Osia (Osia et al, 10/11/2017, “Privacy-Preserving Deep Inference for Rich User Data on The Cloud”).
	
	Regarding claim 1, Tuor teaches a method for addressing information leakage in a deep learning service ([Tuor, page 1, ABSTRACT, line 9-13] “In this paper, we conduct a simple experiment to understand to which extent is it possible to reconstruct the raw data given the output of an intermediate layer, in other words, to which extent do we leak private information when sending the output of an intermediate layer to the cloud. We also present an overview of mutual-information based studies of DNN, to help understand information leakage and some potential ways to make distributed inference more secure”), comprising: 
measuring, using a deep learning inference system, respective similarities for each of a set of intermediate representations to input information used as an input to the deep learning inference system, wherein the deep learning inference system comprises a plurality of layers, each layer producing one or more associated intermediate representations ([Tuor, page 2, 2. INITIAL EXPERIMENT: RECOVERING RAW DATA, 2-3rd paragraph] “Formally, we denote the representation of the original image x obtained at the output of layer i by b_i(x), where we consider the 0-th layer as the input layer, thus b_0 = x. The problem is formulated as follows. Given the following: 1. A representation function b_i: R^{d_0} → R^{d_i} for each layer i, where d_0 is the dimension of input layer 0 and d_i is the dimension of the output at layer i (this representation function is specified directly by the trained DNN model) 2. An input image x_0 3. A representation b_i(x_0) at a specific layer i. The goal is to find the input image x so that the error between b_i(x) and b_i(x_0) is the smallest: $x^* = \arg\min_{x \in \mathbb{R}^{d_0}} \| b_i(x) - b_i(x_0) \|^2$. The image x* found from (1) should visually look similar as x_0 … Convolutional Layer 1: 32 of 5 x 5 filters with ReLU activation function, Pooling Layer 1: Max pooling with a 2 x 2 filter and a stride of 2, Convolutional Layer 2: 64 of 5 x 5 filters with ReLU activation function, Pooling Layer 2: Max pooling with a 2 x 2 filter and a stride of 2, Dense Layer 1: 1024 neurons with ReLU activation function, Dense Layer 2 (Output Layer): 10 with Softmax activation function, one neuron for each digit class”); 
selecting a subset of the set of intermediate representations that are most similar to the input information ([Tuor, page 2, line 18-28, 2. INITIAL EXPERIMENT: RECOVERING RAW DATA, 3rd paragraph – the last paragraph; Figure 3] “Formally, we denote the representation of the original image x obtained at the output of layer i by b_i(x), where we consider the 0-th layer as the input layer, thus b_0 = x. The problem is formulated as follows. Given the following: 1. A representation function b_i: R^{d_0} → R^{d_i} for each layer i, where d_0 is the dimension of input layer 0 and d_i is the dimension of the output at layer i (this representation function is specified directly by the trained DNN model) 2. An input image x_0 3. A representation b_i(x_0) at a specific layer i. The goal is to find the input image x so that the error between b_i(x) and b_i(x_0) is the smallest: $x^* = \arg\min_{x \in \mathbb{R}^{d_0}} \| b_i(x) - b_i(x_0) \|^2$. The image x* found from (1) should visually look similar as x_0 … Our experiment confirms that we leak information about raw data when performing distributed inference. Most of the layers seem to retain lots of information about the raw image. We manage to reconstruct visually similar image as the original input until the penultimate layer. From Figure 3, we can see that the reconstruction error is larger when reconstructing from a deeper representation. We also note that the reconstructed image retains the characteristics of this specific handwriting of the digit “2””, which discloses selecting an image visually similar to the original input image. Figure 3 shows a plurality of intermediate representations similar to the original input); 
determining, using the selected subset of intermediate representations, a partitioning point in the plurality of layers used to partition the plurality of layers into two partitions defined so that information leakage for the two partitions will meet a privacy parameter when a first of the two partitions is prevented from leaking information ([Tuor, page 2, Figure 1] shows the diagram of partitioning the neural network into two sets of layers (edge device and cloud), [Tuor, page 1, line 3-6, ABSTRACT] “Consequently, the inference of deep neural network (DNN) model is often partitioned between the edge and the cloud. In this case, the edge device performs inference up to an intermediate layer of the DNN, and offloads the output features to the cloud for the inference of the remaining of the network”, [Tuor, page 4, Second paragraph after the list of layers] “Our experiment confirms that we leak information about raw data when performing distributed inference. Most of the layers seem to retain lots of information about the raw image. We manage to reconstruct visually similar image as the original input until the penultimate layer. From Figure 3, we can see that the reconstruction error is larger when reconstructing from a deeper representation. We also note that the reconstructed image retains the characteristics of this specific handwriting of the digit “2””, which discloses an experiment showing that less privacy leakage occurs at deeper representations, which can be used to determine the partitioning layer. [Tuor, page 6, the first paragraph; Figure 4] “The encoder/decoder representation allows us to quantify B by the amount of information it captures on the input variable and on the desired output, as well as on the predicted output of the DNN. We aim to find an encoding B that is maximally expressive about Y (i.e., that selects the important bits in order to accurately predict Y ) as well as maximally compressive about X (i.e., that throws away bits that are not relevant for predicting Y ). Hence, the optimal encoder is selected by finding the encoding that optimizes the IB defined in (2)”, which discloses the process of determining which layers to set as an encoder (i.e., the first half of a bi-partitioned neural network)).
Tuor does not specifically teach outputting the partitioning point for use in partitioning the plurality of layers of the deep learning inference system into the two partitions.
Osia teaches outputting the partitioning point for use in partitioning the plurality of layers of the deep learning inference system into the two partitions ([Osia, page 2, Entire 2nd paragraph] “Our approach relies on optimizing the layer separation of pre-trained deep models. Primary layers are held on the user device and the secondary ones on the cloud. In this way, the inference task starts by applying the primary layers as the feature extractor on the user device, and continues by sending the resultant features to the cloud and, end by applying the secondary analyzing layers in cloud. We demonstrate that our proposed solution does not have the overhead of executing the whole deep model on the user device, while it will be favored by a cloud provider as the user does not have access to their complete model and part of the inference should be done on the cloud. We introduce a method to manipulate the extracted features (from the primary layers) in a way that irrelevant extra information can not leak, hence addressing the privacy challenges of cloud solution. To do this, we alter the training phase by applying Siamese network [12] in a specific manner, and by employing a dimensionality reduction and noise addition mechanism for increased privacy”).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, having the teachings of both Tuor and Osia, to use Osia's process of outputting the partitioning point for use in partitioning the plurality of layers to implement the deep inference system of Tuor. The suggestion and/or motivation for doing so is to improve energy efficiency and prevent privacy leakage, as the reconstruction error is larger when reconstructing from a deeper representation ([Tuor, page 4, line 11-13 from the top of the page] “We manage to reconstruct visually similar image as the original input until the penultimate layer. From Figure 3, we can see that the reconstruction error is larger when reconstructing from a deeper representation”). 
Claim 9 is an apparatus claim having limitations similar to those of method claim 1 above. Therefore, it is rejected under the same rationale as claim 1 above.
Claim 17 is a computer program product claim having limitations similar to those of method claim 1 above. Therefore, it is rejected under the same rationale as claim 1 above.
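
For illustration only, the reconstruction problem quoted from Tuor (finding an input x whose layer output b_i(x) is closest to the observed b_i(x_0)) can be sketched with plain gradient descent. This is a toy sketch and not Tuor's experiment: it assumes a single linear layer b(x) = Wx in place of the cited CNN, a squared-error objective, and made-up weights, input, step size, and iteration count.

```python
def layer(weights, x):
    """Toy linear layer b(x) = W x (stands in for a real DNN layer)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def reconstruct(weights, target, steps=500, lr=0.1):
    """Minimize ||W x - target||^2 over x by gradient descent from x = 0."""
    x = [0.0] * len(weights[0])
    for _ in range(steps):
        residual = [b - t for b, t in zip(layer(weights, x), target)]
        # Gradient of the squared error is 2 * W^T * residual.
        grad = [2.0 * sum(weights[r][c] * residual[r] for r in range(len(weights)))
                for c in range(len(x))]
        x = [xi - lr * g for xi, g in zip(x, grad)]
    return x

W = [[1.0, 0.5], [0.0, 1.0]]   # hypothetical layer weights
x0 = [0.8, 0.3]                # hypothetical raw input
x_star = reconstruct(W, layer(W, x0))  # recovered from the representation alone
```

Because this toy layer is invertible, the recovered input converges to x0, which mirrors the cited observation that an intermediate representation can leak the raw input.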

Regarding claim 2, Tuor in view of Osia teaches the method of claim 1, wherein the deep learning inference system is a first deep learning inference system, and wherein selecting a subset of the set of intermediate representations that are most similar to the input information comprises ([Tuor, page 2, line 18-28, 2. INITIAL EXPERIMENT: RECOVERING RAW DATA; Figure 3] “Formally, we denote the representation of the original image x obtained at the output of layer i by b_i(x), where we consider the 0-th layer as the input layer, thus b_0 = x. The problem is formulated as follows. Given the following: 1. A representation function b_i: R^{d_0} → R^{d_i} for each layer i, where d_0 is the dimension of input layer 0 and d_i is the dimension of the output at layer i (this representation function is specified directly by the trained DNN model) 2. An input image x_0 3. A representation b_i(x_0) at a specific layer i. The goal is to find the input image x so that the error between b_i(x) and b_i(x_0) is the smallest: $x^* = \arg\min_{x \in \mathbb{R}^{d_0}} \| b_i(x) - b_i(x_0) \|^2$. The image x* found from (1) should visually look similar as x_0”, which discloses selecting an image visually similar to the original input image. Figure 3 shows a plurality of intermediate representations similar to the original input): creating inferencing output for each of the set of intermediate representations ([Tuor, page 4, the first paragraph after the list of layers; Figure 2 & 3] “Figure 2 shows the original input we attempt to reconstruct and Figure 3 shows reconstructions obtained from the representations obtained at different layers (i.e., b_i(x_0) at different i). Our experiment confirms that we leak information about raw data when performing distributed inference. Most of the layers seem to retain lots of information about the raw image. We manage to reconstruct visually similar image as the original input until the penultimate layer”, Figure 3 shows the inferencing output (reconstructed images) from the intermediate results of each of the layers). 
Tuor does not specifically teach using a second deep learning inference system to inference output for each of the intermediate representations.
Osia teaches using a second deep learning inference system to inference output for each of the intermediate representations ([Osia, page 8, left column, entire second paragraph, 3) Visualization] “Deep visualization can brought us a good intuition about identity preservation of each layer. We fed the the intermediate layers of gender classification model as the input of Alexnet decoder [14] to reconstruct the original inputs. The reconstructed images leads to visually figure out the amount of identity information in the intermediate feature of gender classification model. These images are illustrated in Figure 11 for different methods. It can be observed that the genders of all images in the simple and Siamese embeddings remain the same as the original ones. This is also the case for the advanced embedding, although it is harder to distinguish it from the reconstructed images. The original images are almost restored in the simple embedding. Therefore, just separating layers of a deep network can not assure acceptable privacy preservation performance. Siamese embedding performs better than the simple embedding by distorting the identity due to intrinsic characteristics of the face. Finally, the Advanced Embedding provides the best results, because the decoder was not trainable and nothing can be deduced from images, including the person’s identity”).
Claim 10 is an apparatus claim having similar limitations to method claim 2 above. Therefore, it is rejected under the same rationale as claim 2 above.
Claim 18 is a computer program product claim having similar limitations to method claim 2 above. Therefore, it is rejected under the same rationale as claim 2 above.

Regarding claim 3, Tuor in view of Osia teaches the method of claim 2, wherein selecting a subset of the set of intermediate representations that are most similar to the input information comprises ([Tuor, page 2, line 18-28 from the top of the page, 2.INITIAL EXPERIMENT: RECOVERING RAW DATA; Figure 3] “Formally, we denote the representation of the original image x obtained at the output of layer i by bi(x), where we consider the 0-th layer as the input layer, thus b0 = x. The problem is formulated as follows. Given the following: 1. A representation function bi: R^d0 → R^di for each layer i, where d0 is the dimension of input layer 0 and di is the dimension of the output at layer i (this representation function is specified directly by the trained DNN model) 2. An input image x0 3. A representation bi(x0) at a specific layer i. The goal is to find the input image x so that the error between bi(x) and bi(x0) is the smallest: x∗ = arg min_{x ∈ R^d0} ||bi(x) − bi(x0)||^2. The image x∗ found from (1) should visually look similar as x0”, discloses selecting an image visually similar to the original input image. Figure 3 shows a plurality of intermediate representations similar to the original input).
Tuor does not specifically teach projecting, prior to creating the inferencing output, feature maps for each of the set of intermediate representations into a same input format as used by the input information, and inputting the intermediate representations in the input format to the second deep learning inference system for the second deep learning inference system to use when creating the inferencing output.
Osia teaches projecting, prior to creating the inferencing output, feature maps for each of the set of intermediate representations into a same input format as used by the input information ([Osia, page 8, left column, entire second paragraph, 3) Visualization] “Deep visualization can brought us a good intuition about identity preservation of each layer. We fed the the intermediate layers of gender classification model as the input of Alexnet decoder [14] to reconstruct the original inputs. The reconstructed images leads to visually figure out the amount of identity information in the intermediate feature of gender classification model. These images are illustrated in Figure 11 for different methods. It can be observed that the genders of all images in the simple and Siamese embeddings remain the same as the original ones. This is also the case for the advanced embedding, although it is harder to distinguish it from the reconstructed images. The original images are almost restored in the simple embedding. Therefore, just separating layers of a deep network can not assure acceptable privacy preservation performance. Siamese embedding performs better than the simple embedding by distorting the identity due to intrinsic characteristics of the face. Finally, the Advanced Embedding provides the best results, because the decoder was not trainable and nothing can be deduced from images, including the person’s identity”); and inputting the intermediate representations in the input format to the second deep learning inference system for the second deep learning inference system to use when creating the inferencing output ([Osia, page 8, left column, entire second paragraph, 3) Visualization] “Deep visualization can brought us a good intuition about identity preservation of each layer. We fed the the intermediate layers of gender classification model as the input of Alexnet decoder [14] to reconstruct the original inputs. The reconstructed images leads to visually figure out the amount of identity information in the intermediate feature of gender classification model. These images are illustrated in Figure 11 for different methods. It can be observed that the genders of all images in the simple and Siamese embeddings remain the same as the original ones. This is also the case for the advanced embedding, although it is harder to distinguish it from the reconstructed images. The original images are almost restored in the simple embedding. Therefore, just separating layers of a deep network can not assure acceptable privacy preservation performance. Siamese embedding performs better than the simple embedding by distorting the identity due to intrinsic characteristics of the face. Finally, the Advanced Embedding provides the best results, because the decoder was not trainable and nothing can be deduced from images, including the person’s identity”, discloses the process of inputting the intermediate representations into another neural network (i.e. Alexnet decoder) to reconstruct the input image).
Claim 11 is an apparatus claim having similar limitations to method claim 3 above. Therefore, it is rejected under the same rationale as claim 3 above.
Claim 19 is a computer program product claim having similar limitations to method claim 3 above. Therefore, it is rejected under the same rationale as claim 3 above.

Regarding claim 4, Tuor in view of Osia teaches the method of claim 3, wherein the input format is an image format ([Osia, page 5, left column, B. Deep Visualization] “In [14], a decoder is designed on the data representation of each layer, in order to reconstruct the original input image based on the learned representation. So, we can analyze the preserved sensitive information in each layer, via comparing the reconstructed images with the original input image”, Osia uses the same input format (image) for the intermediate representation as is used by the input information (image)).
Claim 12 is an apparatus claim having similar limitations to method claim 4 above. Therefore, it is rejected under the same rationale as claim 4 above.

[Tuor, page 2, line 18-28, 2.INITIAL EXPERIMENT: RECOVERING RAW DATA] “Formally, we denote the representation of the original image x obtained at the output of layer i by bi(x), where we consider the 0-th layer as the input layer, thus b0 = x. The problem is formulated as follows. Given the following: 1. A representation function bi: R^d0 → R^di for each layer i, where d0 is the dimension of input layer 0 and di is the dimension of the output at layer i (this representation function is specified directly by the trained DNN model) 2. An input image x0 3. A representation bi(x0) at a specific layer i. The goal is to find the input image x so that the error between bi(x) and bi(x0) is the smallest: x∗ = arg min_{x ∈ R^d0} ||bi(x) − bi(x0)||^2. The image x∗ found from (1) should visually look similar as x0”) comprises:
measuring a similarity between the intermediate representations and the input information by comparing the inferencing output of each of the intermediate representations with a corresponding intermediate representation in the input format ([Tuor, page 2, line 18-28, 2.INITIAL EXPERIMENT: RECOVERING RAW DATA, 3rd paragraph – the last paragraph; Figure 3] “Formally, we denote the representation of the original image x obtained at the output of layer i by bi(x), where we consider the 0-th layer as the input layer, thus b0 = x. The problem is formulated as follows. Given the following: 1. A representation function bi: R^d0 → R^di for each layer i, where d0 is the dimension of input layer 0 and di is the dimension of the output at layer i (this representation function is specified directly by the trained DNN model) 2. An input image x0 3. A representation bi(x0) at a specific layer i. The goal is to find the input image x so that the error between bi(x) and bi(x0) is the smallest: x∗ = arg min_{x ∈ R^d0} ||bi(x) − bi(x0)||^2. The image x∗ found from (1) should visually look similar as x0 … Our experiment confirms that we leak information about raw data when performing distributed inference. Most of the layers seem to retain lots of information about the raw image. We manage to reconstruct visually similar image as the original input until the penultimate layer. From Figure 3, we can see that the reconstruction error is larger when reconstructing from a deeper representation. We also note that the reconstructed image retains the characteristics of this specific handwriting of the digit “2””, discloses the process of comparing the output produced from each intermediate representation with the input information),
and selecting a single intermediate representation for each of the plurality of layers and their corresponding one or more intermediate representations ([Tuor, page 2, line 18-28, 2.INITIAL EXPERIMENT: RECOVERING RAW DATA] “Formally, we denote the representation of the original image x obtained at the output of layer i by bi(x), where we consider the 0-th layer as the input layer, thus b0 = x. The problem is formulated as follows. Given the following: 1. A representation function bi: R^d0 → R^di for each layer i, where d0 is the dimension of input layer 0 and di is the dimension of the output at layer i (this representation function is specified directly by the trained DNN model) 2. An input image x0 3. A representation bi(x0) at a specific layer i. The goal is to find the input image x so that the error between bi(x) and bi(x0) is the smallest: x∗ = arg min_{x ∈ R^d0} ||bi(x) − bi(x0)||^2. The image x∗ found from (1) should visually look similar as x0”).
Tuor does not specifically teach using a second deep learning inference system to inference output for each of the intermediate representations.
Osia teaches using a second deep learning inference system to inference output for each of the intermediate representations ([Osia, page 5, left column, B. Deep Visualization] “Visualization is a method for understanding the deep networks. In this paper, we used an auto-encoder objective visualization technique [14] in order to measure the amount of sensitive information in the intermediate feature of the network, which is trained for primary variable inference. In [14], a decoder is designed on the data representation of each layer, in order to reconstruct the original input image based on the learned representation. So, we can analyze the preserved sensitive information in each layer, via comparing the reconstructed images with the original input image”, discloses the process of measuring the similarity between the reconstructed intermediate representations and the original input image).
Claim 13 is an apparatus claim having similar limitations to method claim 5 above. Therefore, it is rejected under the same rationale as claim 5 above.
Claim 20 is a computer program product claim having similar limitations to method claim 5 above. Therefore, it is rejected under the same rationale as claim 5 above.

Regarding claim 6, Tuor in view of Osia teaches the method of claim 5, wherein: measuring a similarity between the intermediate representations and the input information further comprises: measuring a similarity metric for each of the intermediate representations and the input information ([Tuor, page 2, line 18-28, 2.INITIAL EXPERIMENT: RECOVERING RAW DATA] “Formally, we denote the representation of the original image x obtained at the output of layer i by bi(x), where we consider the 0-th layer as the input layer, thus b0 = x. The problem is formulated as follows. Given the following: 1. A representation function bi: R^d0 → R^di for each layer i, where d0 is the dimension of input layer 0 and di is the dimension of the output at layer i (this representation function is specified directly by the trained DNN model) 2. An input image x0 3. A representation bi(x0) at a specific layer i. The goal is to find the input image x so that the error between bi(x) and bi(x0) is the smallest: x∗ = arg min_{x ∈ R^d0} ||bi(x) − bi(x0)||^2. The image x∗ found from (1) should visually look similar as x0”, shows the process of calculating the similarity metric between the reconstructed intermediate representation and the input information); and selecting a single intermediate representation for each of the plurality of layers and their corresponding one or more intermediate representations further comprises selecting an intermediate representation at each layer that has a minimum measured similarity metric ([Tuor, page 2, line 18-28 from the top of the page, 2.INITIAL EXPERIMENT: RECOVERING RAW DATA, 3rd paragraph – the last paragraph; Figure 3] “Formally, we denote the representation of the original image x obtained at the output of layer i by bi(x), where we consider the 0-th layer as the input layer, thus b0 = x. The problem is formulated as follows. Given the following: 1. A representation function bi: R^d0 → R^di for each layer i, where d0 is the dimension of input layer 0 and di is the dimension of the output at layer i (this representation function is specified directly by the trained DNN model) 2. An input image x0 3. A representation bi(x0) at a specific layer i. The goal is to find the input image x so that the error between bi(x) and bi(x0) is the smallest: x∗ = arg min_{x ∈ R^d0} ||bi(x) − bi(x0)||^2. The image x∗ found from (1) should visually look similar as x0 … Our experiment confirms that we leak information about raw data when performing distributed inference. Most of the layers seem to retain lots of information about the raw image. We manage to reconstruct visually similar image as the original input until the penultimate layer. From Figure 3, we can see that the reconstruction error is larger when reconstructing from a deeper representation. We also note that the reconstructed image retains the characteristics of this specific handwriting of the digit “2””, discloses selecting an image visually similar to the original input image. Figure 3 shows a plurality of intermediate representations similar to the original input).
Tuor does not specifically teach using a second deep learning inference system.
Osia teaches using a second deep learning inference system ([Osia, page 8, left column, entire second paragraph, 3) Visualization] “Deep visualization can brought us a good intuition about identity preservation of each layer. We fed the the intermediate layers of gender classification model as the input of Alexnet decoder [14] to reconstruct the original inputs. The reconstructed images leads to visually figure out the amount of identity information in the intermediate feature of gender classification model. These images are illustrated in Figure 11 for different methods. It can be observed that the genders of all images in the simple and Siamese embeddings remain the same as the original ones. This is also the case for the advanced embedding, although it is harder to distinguish it from the reconstructed images. The original images are almost restored in the simple embedding. Therefore, just separating layers of a deep network can not assure acceptable privacy preservation performance. Siamese embedding performs better than the simple embedding by distorting the identity due to intrinsic characteristics of the face. Finally, the Advanced Embedding provides the best results, because the decoder was not trainable and nothing can be deduced from images, including the person’s identity”).
Claim 14 is an apparatus claim having similar limitations to method claim 6 above. Therefore, it is rejected under the same rationale as claim 6 above.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. The following reference relates to the process of partitioning the neural network:
Jong-Hwan Ko, “Edge-Host Partitioning of Deep Neural Networks with Feature Space Encoding for Resource-Constrained Internet-of-Things Platforms”, 02/11/2018

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JUN KWON whose telephone number is (571)272-2072. The examiner can normally be reached on 7:30 AM - 5:30 PM. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar can be reached on (571)270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-

/JUN KWON/
Examiner, Art Unit 2127
/ABDULLAH AL KAWSAR/
Supervisory Patent Examiner, Art Unit 2127