United States Patent and Trademark Office

Commissioner for Patents
United States Patent and Trademark Office
P.O. Box 1450
Alexandria, VA 22313-1450
www.uspto.gov











BEFORE THE PATENT TRIAL AND APPEAL BOARD


Application Number: 15/859,943
Filing Date: 2 Jan 2018
Appellant(s): Norouzi et al.



__________________
Kim Thien Bui Reg. No. 76,843
For Appellant


EXAMINER’S ANSWER





This is in response to the appeal brief filed 12/08/2021.

(1) Grounds of Rejection to be Reviewed on Appeal
Every ground of rejection set forth in the Office action dated 07/09/2021 from which the appeal is taken is being maintained by the examiner except for the grounds of rejection (if any) listed under the subheading “WITHDRAWN REJECTIONS.”  New grounds of rejection (if any) are provided under the subheading “NEW GROUNDS OF REJECTION.”
The following ground(s) of rejection are applicable to the appealed claims. Examiner maintains:
35 U.S.C. 103 rejections for claims 1-2, 8-9, and 20 as being unpatentable over “Convolutional Neural Networks with Intra-layer Recurrent Connections for Scene Labeling” to Liang et al. (hereinafter, “Liang”) in view of “Recurrent Convolutional Neural Networks for Scene Labeling” to Pinheiro et al. (hereinafter, “Pinheiro”);
35 U.S.C. 103 rejection for claim 3 as being unpatentable over Liang in view of Pinheiro, and further in view of U.S. Pub. No. US 2015/0170021 A1 to Lupon et al. (hereinafter, “Lupon”);
35 U.S.C. 103 rejections for claims 4-5, 11-12, and 16-17 as being unpatentable over Liang in view of Pinheiro, and further in view of “Input Convex Neural Networks” to Amos et al, (hereinafter, “Amos”);
35 U.S.C. 103 rejections for claims 6, 10, 13, 15 and 18 as being unpatentable over Liang in view of Pinheiro, and further in view of “Malware Detection Based On Training Using Automatic Feature Pruning With Anomaly Detection Of Execution Graphs” to Apostolescu et al, (hereinafter, “Apostolescu”);
35 U.S.C. 103 rejections for claims 7, 14, and 19 as being unpatentable over Liang in view of Pinheiro, and further in view of “Graphical Models For High-Level Computer Vision” to Heitz et al, (hereinafter, “Heitz”).
(2) Response to Argument
II. The Action errs in relying on Liang as allegedly disclosing “processing the input data item and the current structured output by using a value neural network” “wherein the value neural network has been trained to [] process the input data item and the current structured output to generate a value score” as recited in claim 1.
Examiner response:
Appellant first argues that prior art Liang does not teach or suggest the features of claim 1, in particular: “the value neural network has been trained to receive as input the input data item and the current structured output and to process the input data item and the current structured output to generate a value score that is an estimate of how accurately the current structured output predicts, for each data element in the input data item, a category that the data element belongs to”. In response to appellant’s above argument, the examiner respectfully disagrees. Claims were evaluated as a whole, based on the broadest reasonable interpretation of the claim limitations. After careful analysis and evaluation of the limitations in independent claim 1, examiner understands that the claim limitations do not reflect a patentable distinction from the prior art of record. Examiner would like to provide a brief explanation of the prior art of record.

“A structured output prediction task may be, for example, an image segmentation masking task. In particular, given an input image including multiple pixels, an image segmentation masking task involves generating a segmentation mask for the input image. The segmentation mask is a structured output that assigns, to each of the image pixels in the input image, a respective value for each of one or more categories”; 
wherein it can be seen that the current application is directed towards image classification assigning a value to each pixel. 
In comparison, examiner points to Liang’s Abstract at Page 1: 
“Scene labeling is a challenging computer vision task. It requires the use of both local discriminative features and global context information. We adopt a deep recurrent convolutional neural network (RCNN) for this task, which is originally proposed for object recognition”; and also to the Introduction at page 1: “Scene labeling (or scene parsing) is an important step towards high-level image interpretation. It aims at fully parsing the input image by labeling the semantic category of each pixel”, 
wherein it can be seen that prior art Liang is also directed towards image classification by labeling each pixel.
Examiner further points to Liang’s second paragraph at Page 3: “To the best of our knowledge, the first end-to-end neural network model for scene labeling refers to the deep CNN proposed in [7]. The model is trained by a supervised greedy learning strategy. In [19], another end-to-end model is proposed. Top-down recurrent connections are incorporated into a CNN to capture context information. In the first recurrent iteration, the CNN receives a raw patch and outputs a predicted label map (downsampled due to pooling). In other iterations, the CNN receives both a downsampled patch and the label map predicted in the previous iteration and then outputs a new predicted label map”. According to this citation, Liang teaches an iterative process of a recurrent neural network that outputs a predicted label map, which corresponds to the claimed “structured output”. Further, it explains that the neural network receives, in a next iteration, a downsampled patch (which is still the input image) along with the previously predicted label map (output). In addition, Liang teaches at Page 4, Section 3.2: 
“[Image media_image2.png: excerpt from Liang, Section 3.2]”.
A person having ordinary skill in the art understands that the role of a loss function is to determine a measure of error of a model prediction/classification; therefore, this is not patentably distinct from the claimed “value score”. The claimed “value score” is stated to be “an estimate of how accurately the current structured output predicts”, which is equivalent to calculating a loss function, as Liang teaches. Furthermore, Liang teaches that the training of the model is done by backpropagation, which means that the model’s weights are updated iteratively through time in order to minimize the error/loss (which corresponds to the claimed “value score”).
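For illustration only, the kind of loss computation referred to above can be sketched as a per-pixel cross entropy between predicted class probabilities and true hard labels. This sketch is not taken from Liang or from the record; the function name `pixel_cross_entropy` and the toy data are assumptions made for the example.

```python
import math


def pixel_cross_entropy(probs, labels):
    """Mean cross entropy between per-pixel predicted probabilities and hard labels.

    probs  : list of per-pixel lists, each holding one probability per category
    labels : list of true category indices, one per pixel (the "hard labels")
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must describe the same pixels")
    total = 0.0
    for pixel_probs, true_class in zip(probs, labels):
        # Only the probability assigned to the true category contributes.
        total += -math.log(pixel_probs[true_class])
    return total / len(labels)


# Hypothetical three-pixel, two-category example.
probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
labels = [0, 1, 0]
loss = pixel_cross_entropy(probs, labels)
```

A lower result indicates that the predicted label map agrees more closely with the ground truth; a prediction that assigns probability 1 to every true category yields a loss of zero.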

Applicant further argues that Liang’s cross entropy between the predicted probability and the true hard label is dissimilar from the claim’s “value score” between the “input data item” and the “current structured output”; however, examiner respectfully disagrees. Examiner understands that in order to generate a “value score”, there is a need for truth data for the “input data item” so that it can be compared to the “current structured output” to establish how accurate the prediction is (a comparison between what the real value should be vs. the predicted value). Furthermore, the current application explains this process at paragraph [24]: “During training, the value neural network 106 is configured to receive initial training data 102. The initial training data 102 includes multiple examples and, for each training example, a corresponding ground truth output. The ground truth output for a given training example is an output that should be the predicted structured output for the training example”. Therefore, according to this disclosure, there is a need for ground truth (which corresponds to Liang’s true hard label) for the “input data item” in order to be compared to the “current structured output” (which corresponds to Liang’s predicted probability) for determining the “value score” (which corresponds to Liang’s cross entropy for the loss function).
For these above reasons, examiner understands that there is no patentable distinction between the current application and the prior art of record.

III. The Action errs in relying on Pinheiro as allegedly teaching “updating the current structured output by adjusting the current values in the current structured output to increase the value score generated by the value neural network” as recited in claim 1.
Examiner response:
Appellant first argues “[a]s shown above, the neural network in Pinheiro takes the input (i.e., “an RGB image”) and the current output (i.e., the “classification predictions”) and generates an improved output. However, the classification predictions in Pinheiro are not being updated to improve the score generated for the predictions by another neural network, as would be required to read on claim 1. The neural network of Pinheiro improves the classification predictions through learning”; however, examiner respectfully disagrees. 
First, regarding the argument about the “another neural network”, it is noted that this feature upon which applicant relies is not recited in the rejected claim. Examiner reviewed the claim limitations of independent claim 1, and it does not recite two neural networks such that the claim could be read as having “another” neural network. According to the claim limitations, the only neural network recited is the “value neural network”. Claim 1 recites “iteratively performing the following operations”; therefore, the broadest reasonable interpretation of the claim limitations is that there is only one (emphasis added) neural network that iteratively performs the operations of generating the value score and updating the current structured output by adjusting the current values.
Furthermore, in regard to the art Pinheiro, examiner respectfully would like to explain why the current application and the prior art Pinheiro are not patentably distinct. Examiner points to paragraphs [49] and [50] of the current application’s specification: 
“[49] In particular, assuming that the current structured output is y(t), the system determines a gradient of the value score v(x, y(t)) with respect to the current values in y(t) as follows: 
[Equation image media_image3.png: gradient of the value score v(x, y(t)) with respect to the current values in y(t)]
 where θ denotes values of parameters of the trained value neural network (e.g., the trained value neural network 106 of FIG. 2) that are held fixed. The system determines the output by backpropagating gradients with respect to the current values in the current output y(t) through the trained value neural network. 
[50]   The system then adjusts each of the current values in the current structured output using the respective gradient value for the current value by performing the following steps”;
 wherein it can be seen that the current application recites the use of backpropagation and adjusts values using a gradient value. 
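For illustration only, the gradient-based adjustment described in the quoted paragraphs can be sketched as repeated gradient steps on the output values while the network parameters are held fixed. This is a sketch under stated assumptions and not the application's implementation: `value_score` is a toy stand-in for a trained value neural network, its gradient is written analytically rather than obtained by backpropagation, and the plain update rule, step size, and iteration count are assumptions, because the specific steps of paragraph [50] are not reproduced in the quotation.

```python
def value_score(x, y, theta):
    """Toy stand-in for a trained value network (assumption).

    Scores higher when each output value y[i] is close to theta * x[i].
    """
    return -sum((yi - theta * xi) ** 2 for xi, yi in zip(x, y))


def value_gradient(x, y, theta):
    """Gradient of value_score with respect to y; theta is held fixed."""
    return [-2.0 * (yi - theta * xi) for xi, yi in zip(x, y)]


def refine_output(x, y_init, theta, step=0.1, iterations=50):
    """Iteratively adjust the output values to increase the value score."""
    y = list(y_init)
    for _ in range(iterations):
        grad = value_gradient(x, y, theta)
        # Gradient ascent on the output values only; theta never changes.
        y = [yi + step * gi for yi, gi in zip(y, grad)]
    return y


# Hypothetical input, fixed parameter, and initial output.
x = [0.0, 1.0, 2.0]
theta = 0.5
y_init = [0.0, 0.0, 0.0]
y_final = refine_output(x, y_init, theta)
```

In this sketch only the output values change from one iteration to the next; the parameter `theta` is never updated, which is the distinction between adjusting an output and training the weights of a network.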
In comparison, examiner points to Pinheiro at Abstract: 
“The goal of the scene labeling task is to assign a class label to each pixel in an image”; in addition to the Introduction at page 1, right column: “The network automatically learns to smooth its own predicted labels. As a result, the overall network performance is increased as the number of instances increases” and further to Section 3.2, right column: “The learning procedure is the same as for a standard CNN (stochastic gradient descent), where gradients are computed with the backpropagation through time (BPTT) algorithm”, 
wherein it can be also seen that prior art Pinheiro recites the use of backpropagation as a learning procedure to smooth/adjust values using gradients. 
The limitation of claim 1 “updating the current structured output by adjusting the current values in the current structured output to increase the value score generated by the value neural network” is reasonably taught by Pinheiro since this art teaches the concept of smoothing (which corresponds to the claimed “adjusting”) in order to increase the performance of the neural network (which corresponds to the claimed “increase the value score generated by the value neural network”).
This explanation also applies to independent claims 10 and 15 as they recite similar features to those described in claim 1. For these above reasons, examiner understands that there is no patentable distinction between the current application and the prior art of record.

IV. The Office Action fails to establish a prima facie case of obviousness for the proposed combination of Liang and Pinheiro.
Examiner response:
Appellant argues that the Office Action is insufficient to support an obviousness rejection, however, examiner respectfully disagrees.
Examiner would like to explain how the current application and the combination (emphasis added) of Liang and Pinheiro are similar.

“A structured output prediction task may be, for example, an image segmentation masking task. In particular, given an input image including multiple pixels, an image segmentation masking task involves generating a segmentation mask for the input image. The segmentation mask is a structured output that assigns, to each of the image pixels in the input image, a respective value for each of one or more categories”; 
wherein it can be seen that the current application is directed towards image classification assigning a value to each pixel.
In comparison, Examiner points to Liang’s Abstract at Page 1: 
“Scene labeling is a challenging computer vision task. It requires the use of both local discriminative features and global context information. We adopt a deep recurrent convolutional neural network (RCNN) for this task, which is originally proposed for object recognition”; and also to the Introduction at page 1: “Scene labeling (or scene parsing) is an important step towards high-level image interpretation. It aims at fully parsing the input image by labeling the semantic category of each pixel”,
and Pinheiro at Abstract: 
“The goal of the scene labeling task is to assign a class label to each pixel in an image”, wherein it can be seen that both prior arts Liang and Pinheiro are also directed towards image classification by labeling each pixel.
Examiner understands that it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Liang with the teachings of Pinheiro by iteratively generating a value score that is an as suggested by Pinheiro at Introduction and Conclusion.
This explanation also applies to independent claims 10 and 15 as they recite similar features to those described in claim 1. For these above reasons, examiner understands that there is no patentable distinction between the current application and the prior art of record.
For the above reasons, it is believed that the rejections should be sustained.
Respectfully submitted,
/LUIS A SITIRICHE/Primary Examiner, Art Unit 2126
Conferees:
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126
/RYAN M STIGLIC/Primary Examiner

Requirement to pay appeal forwarding fee.  In order to avoid dismissal of the instant appeal in any application or ex parte reexamination proceeding, 37 CFR 41.45 requires payment of an appeal forwarding fee within the time permitted by 37 CFR 41.45(a), unless appellant had timely paid the fee for filing a brief required by 37 CFR 41.20(b) in effect on March 18, 2013.