Commissioner for Patents
United States Patent and Trademark Office
P.O. Box 1450
Alexandria, VA 22313-1450
www.uspto.gov











BEFORE THE BOARD OF PATENT APPEALS 
AND INTERFERENCES


Application Number: 14/468,532
Filing Date: August 26, 2014
Appellant(s): The MITRE Corporation



__________________
Meredith L. Stradley
For Appellant


EXAMINER’S ANSWER





This is in response to the appeal brief filed 12/14/2020.


(1) GROUNDS OF REJECTIONS TO BE REVIEWED ON APPEAL
Every ground of rejection set forth in the Office action dated 13 May 2020 from which the appeal is taken is being maintained by the examiner except for the grounds of rejection (if any) listed under the subheading “WITHDRAWN REJECTIONS.”  New grounds of rejection (if any) are provided under the subheading “NEW GROUNDS OF REJECTION.”

(2) RESPONSE TO ARGUMENT
Applicants: (p.9-11) submitted that ‘The Examiner relies on Dube for all the limitations of claim 1, except for the requirement of “building the first set of machine learned model parameters and the second set of machine learned model parameters into an executable image compression application that is configured to compress an image file that is different from the training image files.” As explained further below, Dube is not directed to machine learning … Moreover, the Examiner’s attempt to cure at least some of the deficiency of Dube by citing of Bryt, which is directed to image compression via machine learning, is wrong because the two are completely different image compression / decompression techniques, and as such, a person having ordinary skill in the art would not have had any reason to modify Dube in view of Bryt’, and outlined four elements of argument.
In response: 
Examiner respectfully disagrees because of the following: (1) Dube shows learning from regions of image context to generate prediction parameters to reduce data amount for image representation (see, e.g., Dube, C7L62-63, ‘The predictor module 100 has a plurality of outputs that provide indications of the predicted values, determined by the predictors, for the object pixel, e.g., pred1(X), pred2(X), … pred4(X)’, and FIG.16, Item 230 ‘Compute pred(X) adaptively as weighted sum of one or more predictions from the predictors…’, which shows predicted parameters for the regions in FIG.7B); (2) Both Dube and Bryt are related to image compression (as can be seen from the titles, Dube: “Method and apparatus for processing an image compression/decompression system that uses hierarchical coding”, Bryt: “Compression of facial images using the K-SVD algorithm”) and it is proper to combine them.  As most of the arguments are centered around independent claim 1, the detailed mapping of the limitations of claim 1 to the cited references is provided below for reference.

Mapping of Claim 1:
With regards to claim 1, Dube teaches 
“A computer implemented method for machine learning model parameters for image compression, comprising: 
partitioning a plurality of training image files stored on a first computer memory (Dube, FIG.1, Item 32, ‘Image Storage’) into a first set of regions, wherein each region of the first set of regions is an array of pixel values (Dube, FIG.7A,

[Image: media_image1.png]

C10L25, ‘FIG.10 is a representation of contexts and classifications of the context …. context generation module may use the following convention to classify the 2x2 block … Smooth commonly refers to a region having a high degree of correlation.  Texture typically refers to a region having variations …’, C6L16-17, ‘4x4 block’ shows that an image is partitioned into regions.); 
training a probabilistic learning machine on the first set of regions to generate a first set of machine learned model parameters, the first set of machine learned model parameters representing a first level of data patterns in the plurality of training image files (Dube shows generating predicted parameters for region based on image context characteristics learned from neighboring regions, e.g., FIG.5, FIG.6, FIG.7B, C7L62-63, ‘The predictor module 100 has a plurality of outputs that provide indications of the predicted values, determined by the predictors, for the object pixel, e.g., pred1(X), pred2(X), … pred4(X) ', and

[Image: media_image2.png]

FIG.16, Item 230 ‘Compute pred(X) adaptively as weighted sum of one or more predictions from the predictors…’ shows predicted parameters for the regions in FIG.7B.

[Image: media_image3.png]

);
constructing a representation of each region in the first set of regions by using the first set of regions and the first set of machine learned model parameters, wherein a representation of a respective region of the first set of regions has an equal number of pixel values as the respective region (Dube, FIG.18, C12L21-35, ‘FIG.15 is a representation 214 of predicted differences for a current resolution ….. The predictor module 214 uses a plurality of predictors to generate predicted values for the object prediction difference, e.g., pred(X) (FIG.15), and to predict one or more causal context prediction difference neighbors e.g., N, W, NW, NN, WW (FIG.15), of the object prediction difference’, and FIG.9, C6L25-29, ‘FIG.7B shows one embodiment of a causal neighborhood context 96 of a current resolution….. neighborhood context … at the same resolution of X’, show that at each level, the image representation is constructed from parameters learned from image neighborhood context regions of the same resolution (i.e., equal number of pixels), wherein region X is constructed with parameters learned from regions N, W, NW, NN, WW of equal number of pixels.


[Image: media_image4.png]

);
constructing representations of the plurality of training image files by combining the representations of the regions of the first set of regions (Dube, C5L19-37, ‘FIG.4B shows an arrangement of prediction differences 70 in one embodiment … The quantized prediction differences and the quantized lowest resolution image are supplied to a reconstructer 76, which further receives the predicted version, e.g., pred(X) and provides reconstructed version, e.g., X’, and FIG.1 show reconstructing images by combining predicted regions together.);  
partitioning the representations of the plurality of image files into a second set of regions, wherein a region of the second set of regions has a greater number of pixel values than at least one region of the first set of regions (Dube, FIG.2A, FIG.3A show regions of different sizes.

[Image: media_image5.png]

);
training the probabilistic learning machine on the second set of regions to generate a second set of machine learned model parameters, the second set of machine learned model parameters representing a second level of data patterns in the plurality of image files, wherein the number of machine learned model parameters in the second set of machine learned model parameters is less than the number of machine learned model parameters in the first set of machine learned model parameters (Dube, FIG.5 shows calculation of two sets of prediction parameters for two image levels, and FIG.8 shows different levels of image representation and a different number of parameters for each level.  An image representation level of higher resolution has more parameters than a lower resolution one, as more pixel values are to be predicted.

[Image: media_image6.png]

); and 
building the first set of machine learned model parameters and the second set of machine learned model parameters into an executable image compression application ….  transforming the image file into a compressed image file based on the first set of machine learned model parameters generated from the training image files and the second set of machine learned model parameters generated from the training image files (Dube, ‘The skip and average methods result in a nonexpansive pyramid data structure (i.e., the total amount of transform coefficients are as many as pixels in the original image)’ shows examples of transforming images, and FIG.1 / FIG.12 show compression / decompression with multiple levels of image representation.  E.g., in FIG.1, Items 34, 50, 68, 74, to Item 78 show an encoding path that represents the original image using multiple levels of representation at a low bit rate, and Item 80 to the left shows a decoding path that generates reconstructed versions, e.g., X’.  Notice that the reconstructed image X’ is the same resolution as the original image.
[Image: media_image7.png]

).”
Dube does not explicitly detail “that is configured to compress an image file that is different from the training image files”
However, Bryt teaches “that is configured to compress an image file that is different from the training image files (Bryt, p.272, 3.1.1. K-SVD training, ‘The K-SVD training process is an off-line procedure, preceding any image compression.  The training produces a set of K-SVD dictionaries that are then considered fixed for the image compression stage’, and p.270, ‘We train K-SVD dictionaries for predefined image patches, and compress each image according to these dictionaries’, shows that the system is pre-trained with a set of images; in FIG.3, ‘Image’ (to be compressed) and ‘Training Set’ enter from two distinct paths on the left, and

[Image: media_image8.png]

p.272, 2.3 Training A dictionary shows a process to learn the parameters for machine learned models.).”
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Dube and Bryt before him or her, to modify the predictive multiple level image compression/decompression system & method of Dube to include compressing an image file that is different from training files as taught in Bryt.   
The motivation for doing so would have been to support very low bit rate compression (Bryt, Introduction).
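
For illustration only, and not as part of the record: the off-line training and fixed-dictionary compression arrangement quoted from Bryt above can be sketched as follows. This is a minimal sketch under stated assumptions. It substitutes a simple k-means patch codebook (vector quantization) for the K-SVD algorithm, and every function name, patch size, and parameter below is hypothetical rather than taken from Bryt or Dube. It shows only the separation between a training set and a different image that is compressed later with the fixed, pre-trained parameters.

```python
import numpy as np

def extract_patches(img, size):
    """Split a 2-D image into non-overlapping size x size patches (flattened rows)."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    return np.array([img[r:r + size, c:c + size].ravel()
                     for r in range(0, h - size + 1, size)
                     for c in range(0, w - size + 1, size)])

def assign(patches, codebook):
    """Index of the nearest codebook entry (squared Euclidean distance) per patch."""
    d = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def train_codebook(training_images, size=2, k=4, iters=10, seed=0):
    """Off-line step: learn k patch prototypes from a set of training images.

    Assumption: plain k-means stands in for K-SVD dictionary training.
    """
    rng = np.random.default_rng(seed)
    data = np.vstack([extract_patches(im, size) for im in training_images])
    codebook = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(data, codebook)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # keep the old prototype if a cluster is empty
                codebook[j] = members.mean(axis=0)
    return codebook

def compress(img, codebook, size=2):
    """On-line step: encode an image as codebook indices; the codebook stays fixed."""
    return assign(extract_patches(img, size), codebook)
```

In this sketch, train_codebook would be run once on the training images, and compress would then be applied to any other image of compatible dimensions without retraining.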

The four elements of argument are addressed below in detail.


[Image: media_image9.png]

(p.11-13)
Applicants: (p.12) submitted that ‘the predicted values of Dube are not analogous to the claimed “machine learned parameters” at least because the predicted values of Dube are derived from only a single image file … By contrast, the claimed “machine learned parameters” are not derived from a single image file, but are instead derived from “the first set of regions [of a plurality of training image files]” as recited in the claim …. the methods taught by Dube do not include a learning machine because they do not have algorithm that uses training data to automatically improves through experience’.
In response: 
Examiner respectfully disagrees because of the following: (1) Dube teaches learning parameters from image context regions to make prediction for image compression. e.g., FIG.16, Item 230 “Compute pred(X) adaptively”, 

[Image: media_image10.png]

C1L64-65, “uses a plurality predictors to generate a plurality of predictions for the object …”, shows adaptively learning parameters to make prediction, and FIG.5, FIG.7B show an example of making prediction based on parameters learned from neighboring image regions.

[Image: media_image11.png]

In FIG.5, Item 102 and Item 104 show two sets of learned parameters for two sets of regions of different levels;

[Image: media_image12.png]

(2) The teaching of Dube is analogous to the limitation of learning parameters for representing image regions of different levels.  The method shown by Dube can be applied to a single image or to multiple images, with a single image as a special case of multiple images.


[Image: media_image13.png]

(p.13-23)
Applicants: (p.14) submitted that Dube fails to disclose or suggest “a representation of a respective region of the first set of regions has an equal number of pixel values as the respective region”.
In response: 
Examiner respectfully disagrees because of the following: (1) Dube shows hierarchical image compression, which hierarchically partitions images into regions for prediction and reconstructs them to the same size as the original images, as shown, e.g., in FIG.1, which shows the image encoding / decoding path taught by Dube. In FIG.1, Items 34, 50, 68, 74, to Item 78 show an encoding path that represents the original image using multiple levels of representation at a low bit rate, and Item 80 to the left shows a decoding path that generates reconstructed versions, e.g., X’.  Notice that the reconstructed image X’ is the same resolution as the original image.


[Image: media_image14.png]

(2) At the same level, the image representation is constructed (as shown, e.g., in Dube, C2L10-18, ‘receives data representative of a plurality of reconstructed versions of an image … determines a context for the two or more causal contexts, and determine a substantially mean error for the context ….. provides a prediction for the object prediction difference in accordance with the data representative of the prediction for the object’) from parameters learned from image regions of the same size, i.e., an equal number of pixels, as shown in, e.g., the same passage at C2L10-18; FIG.5 (shows prediction region parameters based on neighboring regions at different levels); FIG.7B (shows neighboring regions, wherein each region is of the same resolution, as shown in C6L26-29, ‘The neighborhood context … at the same resolution of X’); FIG.14 (shows prediction of content of image regions); and FIG.15, C12L21-35, ‘FIG.15 is a representation 214 of predicted differences for a current resolution ….. The predictor module 214 uses a plurality of predictors to generate predicted values for the object prediction difference, e.g., pred(X) (FIG.15), and to predict one or more causal context prediction difference neighbors e.g., N, W, NW, NN, WW (FIG.15), of the object prediction difference’, wherein region X is constructed with parameters learned from regions N, W, NW, NN, WW of equal number of pixels.

Applicants: (p.15-20) submitted that Dube fails to disclose constructing a representation of each region with parameters learned from regions of equal number of pixels.
In response: 
Examiner respectfully disagrees because of the following: (1) as shown earlier and detailed in the claim 1 mapping, Dube shows representing an image region at each level with prediction parameters learned from neighboring regions of equal number of pixels; (2) Applicant pointed to many of Dube’s hierarchical representation teachings, but the claim limitation “partitioning the representations of the plurality of image files into a second set of regions, wherein a region of the second set of regions has a greater number of pixel values than at least one region of the first set of regions” contradicts Applicant’s argument about regions of equal number of pixels.  The cited reference Dube does teach using parameters learned from image context regions of the same resolution, as shown earlier in the response to argument and the claim 1 mapping. The additional teachings in Dube of multiple level image representation for image regions of different sizes are used to cover the multiple resolution requirement claimed in the instant application.

Applicants: (p.21-22) submitted that

[Images: media_image15.png, media_image16.png, media_image17.png]

In response: 
Examiner respectfully disagrees because of the following: (1) As stated earlier, at each level, Dube teaches constructing image regions by parameters learned from neighboring regions of equal number of pixels (Dube does teach learning features, e.g., FIG.16, Item 230 “Compute pred(X) adaptively”, C1L64-65, “uses a plurality predictors to generate a plurality of predictions for the object …”, shows adaptively learning parameters to make prediction, and FIG.5, FIG.7B show an example of making prediction based on parameters learned from neighboring image regions); (2) The instant application also claims multi-level representation of regions with different numbers of pixels, as shown in the limitation in claim 1 “partitioning the representations of the plurality of image files into a second set of regions, wherein a region of the second set of regions has a greater number of pixel values than at least one region of the first set of regions”.  This corresponds to the different resolution levels, i+1, i+2, etc., as shown in FIG.1 of Dube.  Applicant’s argument is misdirected in pointing to Dube’s teaching of multi-level image regions of different sizes, because that feature is also claimed in the instant application; (3) see the mapping of claim 1 included at the beginning for more detailed analysis.


[Image: media_image18.png]

(p.23-30)
Applicants: (p.25-26) submitted that ‘the predicted values of Dube are not analogous to the claimed “machine learned model parameters,” …. There is no machine learning at all involved with Dube’s predicted value, since the predicted value are only derived from data of a single image and are not trained on data from a plurality of images’.
In response: 
Examiner respectfully disagrees because of the following: As stated above, Dube shows learning from regions of image context to make prediction for image content of each region to reduce data amount for image representation, which is a machine learning process (e.g., Dube, FIG.16, Item 230 “Compute pred(X) adaptively”, C1L64-65, “uses a plurality predictors to generate a plurality of predictions for the object …”, shows adaptively learning parameters to make prediction, and FIG.5, FIG.7B show an example of making prediction based on parameters learned from neighboring image regions).
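
For illustration only, and not as part of the record: the phrase "Compute pred(X) adaptively as weighted sum of one or more predictions from the predictors" can be pictured with the following minimal sketch. The four candidate predictors (N, W, their average, and N + W - NW) and the inverse-error weighting rule are assumptions chosen for the sketch, and the function names are hypothetical. Dube's actual predictors and weighting may differ.

```python
import numpy as np

def causal_predictions(img, r, c):
    """Candidate predictions for pixel (r, c) from already-visited neighbors."""
    n = float(img[r - 1, c])       # north neighbor
    w = float(img[r, c - 1])       # west neighbor
    nw = float(img[r - 1, c - 1])  # north-west neighbor
    return np.array([n, w, (n + w) / 2.0, n + w - nw])

def adaptive_weights(errors, eps=1e-6):
    """Weights inversely proportional to each predictor's accumulated error."""
    inv = 1.0 / (errors + eps)
    return inv / inv.sum()

def predict_image(img):
    """Raster-scan prediction; the first row and column are copied, not predicted."""
    img = np.asarray(img, dtype=float)
    pred = img.copy()
    errors = np.zeros(4)  # running absolute error per candidate predictor
    for r in range(1, img.shape[0]):
        for c in range(1, img.shape[1]):
            cands = causal_predictions(img, r, c)
            pred[r, c] = adaptive_weights(errors) @ cands
            # Update errors only after predicting, so the scheme stays causal.
            errors += np.abs(cands - img[r, c])
    return pred
```

In this sketch the weights are recomputed from errors observed earlier in the same scan, so a predictor that has performed well on previously visited pixels contributes more to later predictions.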

Applicants: (p.27) submitted that ‘Dube teaches predicting pixel values using only data from a single image file.  This is completely different from building machine learned model parameters into an image compression application as taught by Bryt’.
In response: 
Examiner respectfully disagrees because of the following: Dube teaches learning from image context to predict region content (as shown, e.g., in Dube, C2L9-17, ‘receives data representative of a plurality of reconstructed versions of an image including data representative of two or more causal contexts, …. provides a prediction for the object’) and Bryt teaches learning image characteristics from multiple images (as shown in, e.g., Bryt, p.272, 3.1.1. K-SVD training, ‘The K-SVD training process is an off-line procedure, preceding any image compression.  The training produces a set of K-SVD dictionaries that are then considered fixed for the image compression stage’, and p.270, ‘We train K-SVD dictionaries for predefined image patches, and compress each image according to these dictionaries’, which shows that the system is pre-trained with a set of images).  Extending Dube’s method to multiple image files would further reduce the data rate, as demonstrated by industrial video compression processes, which learn context from multiple images to generate low bit rate representations of image sequences.

Applicants: submitted that “a skilled artisan would not have any reason to combine Dube’s predicted values with Bryt’s teaching of building machine learned model parameters into an image compression application because the skilled artisan would understand that the teaching of Dube and the teaching of Bryt do not complement each other” (p.27), “Dube teaches an image compression algorithm that is applied to each individual image to be compressed … Bryt teaches building machine learned model parameters into an image compression application …. One having skill in the art would readily recognize that Dube’s algorithm is applied individually and specifically to each image to be compressed, and thus, one having skill in the art would not expect that the predicted values specific to the single image could be applied to another, different image” (p.28-29),
In response: 
Examiner respectfully disagrees because of the following: (1) Both Dube and Bryt are related to image compression and it is proper to combine them; (2) Dube shows learning from regions of image context to make prediction to reduce data amount for image representation (as shown, e.g., in Dube, C2L9-17, ‘receives data representative of a plurality of reconstructed versions of an image including data representative of two or more causal contexts, …. provides a prediction for the object’) and Bryt shows learning from multiple image files (as shown in, e.g., Bryt, p.272, 3.1.1. K-SVD training, ‘The K-SVD training process is an off-line procedure, preceding any image compression.  The training produces a set of K-SVD dictionaries that are then considered fixed for the image compression stage’, and p.270, ‘We train K-SVD dictionaries for predefined image patches, and compress each image according to these dictionaries’, which shows that the system is pre-trained with a set of images).  An artisan in the field can apply Dube’s method to multiple images as shown in Bryt to further reduce bit rate.  There is no reason that this method is restricted to only a single image; e.g., video compression uses a similar technique to make predictions across multiple images, because neighboring images in a video are similar. This is shown, e.g., in Figure 6.1 of Al-Mualla et al., “Multiple-Reference Motion Estimation Techniques”, Video Coding for Mobile Communications, 2002.

[Image: media_image19.png]


[Image: media_image20.png]

(p.30-33)
Applicants: (p.30-31) submitted that it would not be obvious to combine Dube and Bryt  for “there is no reason at all to combine the teaching of these references since they are simply completely different image compression methods … The Examiner’s reasoning (i.e., low bit rate compression) would provide a reason to use Bryt instead of Dube”.
In response: 
Examiner respectfully disagrees because of the following: Both Dube and Bryt are in the same area of image compression, so there is no reason to prohibit combining the teachings of these two references to provide a process of multiple level context learning on multiple images. Specifically, Dube teaches learning parameters from the context of image regions (as shown, e.g., in Dube, C2L9-17, ‘receives data representative of a plurality of reconstructed versions of an image including data representative of two or more causal contexts, …. provides a prediction for the object’) and Bryt shows learning from multiple image files (as shown in, e.g., Bryt, p.272, 3.1.1. K-SVD training, ‘The K-SVD training process is an off-line procedure, preceding any image compression.  The training produces a set of K-SVD dictionaries that are then considered fixed for the image compression stage’, and p.270, ‘We train K-SVD dictionaries for predefined image patches, and compress each image according to these dictionaries’, which shows that the system is pre-trained with a set of images). An artisan in the field can apply Dube’s method to multiple images as shown in Bryt to further reduce bit rate.

Applicants: (p.32-33) submitted that “the Examiner does not explain why one having ordinary skill in the art would be motivated to select feature of Bryt to apply to a completely different compression/decompression algorithm (i.e., that of Dube) in an unknown way to achieve the claimed methods for machine learning model parameters for image compression” , “there is no teaching in Bryt that its learning machine could be applied to a predictive multiple level process for image compression/ decompression such as Dube’s, nor how one would go about doing so”, “there is no teaching in Dube that its predictive multiple level image compression/decompression process can utilize or be applied to a learning machine, nor how that could take place”
In response: 
Examiner respectfully disagrees because of the following: (1) As stated earlier, Dube does teach a learning machine that learns parameters from context image regions to reduce the bit rate for representation; (2) Dube’s method of predicting region content using parameters learned from context image regions (as shown, e.g., in Dube, C2L9-17, ‘receives data representative of a plurality of reconstructed versions of an image including data representative of two or more causal contexts, …. provides a prediction for the object’) could be extended to multiple images in view of the multiple file learning taught by Bryt (as shown in, e.g., Bryt, p.272, 3.1.1. K-SVD training, ‘The K-SVD training process is an off-line procedure, preceding any image compression.  The training produces a set of K-SVD dictionaries that are then considered fixed for the image compression stage’, and p.270, ‘We train K-SVD dictionaries for predefined image patches, and compress each image according to these dictionaries’, which shows that the system is pre-trained with a set of images); (3) both Dube and Bryt are related to image compression, and it is proper to combine them.



Respectfully submitted,

/TSU-CHANG LEE/
Examiner, Art Unit 2126

Conferees:
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126            



/Jason Cardone/
Primary Examiner

                                                                                                                                                                                            
Requirement to pay appeal forwarding fee.  In order to avoid dismissal of the instant appeal in any application or ex parte reexamination proceeding, 37 CFR 41.45 requires payment of an appeal forwarding fee within the time permitted by 37 CFR 41.45(a), unless appellant had timely paid the fee for filing a brief required by 37 CFR 41.20(b) in effect on March 18, 2013.