DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 3, 6, 7, 8, 9, 10, 13, 14, 15, 16 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al. (“Learning Rich Features for Image Manipulation Detection” 2018) in view of Dolhansky et al. (US 10810725 B1) and further in view of Cozzolino et al. (IDS: Noiseprint: a CNN-based camera model fingerprint, 2018).

Regarding claims 1, 8 and 14, Zhou et al. disclose a computer vision system for localizing image forgery (manipulation detection, abstract) comprising: a memory; and a processor in communication with the memory, the processor, and method for localizing image forgery by a computer vision system, comprising the steps of, and non-transitory computer 

 
Zhou et al. disclose a convolution (uses CNN) but do not disclose a constrained convolution in particular. Zhou et al. disclose a statistical signature of at least one source model, but do not disclose it is a camera model.

Dolhansky et al. disclose a memory; and a processor in communication with the memory, and non-transitory computer readable medium, the processor generating a constrained convolution, training a neural network with the constrained convolution and a plurality of images of a dataset to learn a low-level representation indicative of a statistical signature of at least one source camera model for each image among the plurality of images, localizing an attribute of an image of the dataset by the trained neural network, and detecting an image fake or forgery (“The feature vector is provided as input to a neural network that determines whether the types of modification have been made to the image.  The neural network may include a constrained convolution layer and several unconstrained convolution layers.  An image fake model may also be applied to determine whether the image was generated using a computer model or algorithm” abstract, “The feature vector is provided as input to a neural network that generates 

Zhou et al. and Dolhansky et al. are in the same art of detecting fake images (Zhou et al., abstract; Dolhansky et al., abstract). The combination of Dolhansky et al. with Zhou et al. enables using a constrained convolution. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the constrained convolution of Dolhansky et al. with the tampered image detection of Zhou et al. as this was known at the time of filing, the combination would have predictable results, and as Dolhansky et al. indicate “The constrained convolution layer 540 suppresses the image content to make it easier to detect traces of image manipulation” (col. 11, lines 1-20) indicating the benefit to improve the detection of Zhou et al.
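For illustration only, the following is a minimal sketch of one common formulation of a constrained convolution kernel. The specific constraint used here (center weight fixed at -1, remaining weights normalized to sum to 1) is an assumption and is not quoted from Dolhansky et al.; the helper names `project_constrained` and `filter2d_valid` are hypothetical. Under that assumption the kernel sums to zero, so smooth image content is suppressed and only a prediction-error residual remains.

```python
import numpy as np

def project_constrained(kernel):
    """Project a square, odd-sized kernel onto the assumed constraint:
    center weight = -1, all other weights sum to 1."""
    k = np.array(kernel, dtype=float)
    c = k.shape[0] // 2
    k[c, c] = 0.0
    k /= k.sum()        # off-center weights now sum to 1
    k[c, c] = -1.0      # whole kernel now sums to 0
    return k

def filter2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation, sufficient for a demonstration."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical example: a random positive 5x5 kernel and a flat 8x8 image.
rng = np.random.default_rng(0)
kernel = project_constrained(rng.uniform(0.1, 1.0, size=(5, 5)))
flat = np.full((8, 8), 7.0)
residual = filter2d_valid(flat, kernel)  # content-free image gives ~0 residual
```

In a trainable layer, a projection of this kind would typically be reapplied to the first-layer kernels after each weight update; that training detail is likewise an assumption and is not drawn from the cited references.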

To the extent Zhou et al. and Dolhansky et al. do not explicitly disclose a statistical signature of at least one source camera model, another reference is provided to make this explicit.

Cozzolino et al. teach training a neural network with the convolution and a plurality of images of a dataset to learn a low-level representation indicative of a statistical signature of at least one source camera model for each image among the plurality of images (“Forensic analyses of digital images rely heavily on the traces of in-camera and out-camera processes left on the acquired images. Such traces represent a sort of camera fingerprint… In this paper we propose a method to extract a camera model fingerprint, called noiseprint, where the scene content is 

Zhou et al. and Dolhansky et al. and Cozzolino et al. are in the same art of detecting fake images (Zhou et al., abstract; Dolhansky et al., abstract; Cozzolino et al., abstract). The 

Regarding claims 2, 9 and 15, Zhou et al., Dolhansky et al., and Cozzolino et al. disclose the system, method, and CRM of claims 1, 8 and 14. Zhou et al., Dolhansky et al., and Cozzolino et al. further indicate wherein the processor: extracts at least one noise residual pattern from each image among the plurality of images via the constrained convolution (Zhou et al., noise is modeled by the residual between a pixel’s value and the estimate of that pixel’s value produced by interpolating only the values of neighboring pixels, p1056, Dolhansky et al., constrained convolution, col. 2, lines 45-60, col. 8, lines 15-35; Cozzolino et al., extract a noise residual, p3, Using CNNs to extract noise residuals, Fig. 3), determines a spatial distribution of the extracted at least one noise residual pattern (Zhou et al., utilize the local noise distributions of the image 

Regarding claims 3, 10 and 16, Zhou et al. and Cozzolino et al. disclose the system, method, and CRM of claims 2, 9 and 15. Zhou et al., Dolhansky et al., and Cozzolino et al. further indicate the processor trains the neural network with a complete loss function based on a cross-entropy loss function over the dataset, the probabilistic regularization, and a rich filter constraint penalty (Zhou et al., Lcls denotes cross entropy loss for RPN network, p1055, We use cross entropy loss for manipulation classification and smooth L1 loss for bounding box regression, Ltamper denotes the final cross entropy classification loss, which is based on the bilinear pooling feature from both the RGB and noise stream, p1057; Cozzolino et al. [image: media_image1.png], p4, probability mass, 4) Regularization: To encourage diversity of noiseprints, we add a regularization term to the previous DBL loss, p5).
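As a hedged illustration of the claimed loss structure, the sketch below combines a cross-entropy term with a regularization term and a constraint penalty as a weighted sum. The additive form, the weights `reg_weight` and `penalty_weight`, and the function names are assumptions made for illustration; neither the claims as quoted here nor the cited references are stated to use this exact form, and the regularization and penalty values are passed in as plain numbers because their definitions are not reproduced in this action.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true class over the dataset."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked)))

def complete_loss(probs, labels, regularization, constraint_penalty,
                  reg_weight=1.0, penalty_weight=1.0):
    """Assumed additive combination of the three terms."""
    return (cross_entropy(probs, labels)
            + reg_weight * regularization
            + penalty_weight * constraint_penalty)

# Hypothetical two-sample, two-class example.
probs = [[0.5, 0.5], [0.25, 0.75]]
labels = [0, 1]
ce = cross_entropy(probs, labels)
total = complete_loss(probs, labels, regularization=0.2,
                      constraint_penalty=0.1,
                      reg_weight=0.5, penalty_weight=2.0)
```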

Regarding claim 6, Zhou et al. and Cozzolino et al. disclose the system of claim 1. Cozzolino et al. further indicate the dataset is a Dresden Image dataset (44 cameras from the Dresden dataset, p7).
 
Regarding claims 7, 13 and 19, Zhou et al. and Cozzolino et al. disclose the system, method, and CRM of claims 1, 8 and 14. Zhou et al. and Cozzolino et al. further indicate the localized adversarial perturbation of the image is a splicing manipulation (Zhou et al., Columbia dataset focuses on splicing based on uncompressed images, p1058, Qualitative results for multi-class image manipulation detection on NIST16 dataset. RGB and noise map provide different information for splicing, copy-move and removal, p1059, classes for manipulation classification to be splicing, removal and copy-move so as to learn distinct visual tampering .

Claims 4, 11 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al. (“Learning Rich Features for Image Manipulation Detection” 2018) and Dolhansky et al. (US 10810725 B1) and Cozzolino et al. (IDS: Noiseprint: a CNN-based camera model fingerprint, 2018) as applied to claims 1, 8 and 14 above, further in view of Shu et al. (“Unsupervised 3D shape segmentation and co-segmentation via deep learning” 2016).

Regarding claims 4, 11 and 17, Zhou et al., Dolhansky et al., and Cozzolino et al. disclose the system, method, and CRM of claims 1, 8 and 14. Zhou et al. and Cozzolino et al. further indicate the processor localizes the attribute of the image of the dataset by the trained neural network by: subdividing the image into a plurality of patches, determining a feature vector for each patch, and segmenting the plurality of patches by applying an expectation maximization algorithm to each patch to fit a two component Gaussian mixture model to each feature vector (Zhou et al., Goljan et al. [19] propose a Gaussian Mixture Model (GMM) to classify CFA present regions (authentic regions) and CFA absent regions (tampered regions). Bappy et al. [2] propose an LSTM based network applied to small image patches to find the tampering artifacts on the boundaries between tampered patches and image patches, p1055; Cozzolino et al., network, which is trained with pairs of image patches, abstract, p3, In Splicebuster [28] the expectation-maximization algorithm is used, p3, To each pixel of a regular sampling grid, a feature vector is associated, accounting for the spatial co-occurrences of residuals. These 

Zhou et al., Dolhansky et al., and Cozzolino et al. do not disclose determining a hundred-dimensional feature vector for each patch.

Shu et al. teach determining a hundred-dimensional feature vector for each patch (define a common GMM to guide the consistent segmentation of a family of models, part 3.3, collection of 100-dimensional high-level feature vectors in the output layer, part 4.1, cluster the patches of the shape by considering their corresponding high-level feature vectors. GMM is employed for clustering in our method, resulting in a probability matrix depicting the probabilities for a patch belonging to a cluster, part 5).

Zhou et al. and Shu et al. are in the same art of neural networks (Zhou et al., abstract; Shu et al., part 1). The combination of Shu et al. with Zhou et al., Dolhansky et al., and Cozzolino et al. enables using a 100-dimensional feature vector. It would have been obvious at the time of filing to one of ordinary skill in the art to combine the vector of Shu et al. with the tampered image detection of Zhou et al., Dolhansky et al., and Cozzolino et al. because this was known at the time of filing, the combination would have predictable results, and it has been held that discovering an optimum value of a result-effective variable involves only routine skill in the art (In re Boesch, 617 F.2d 272, 205 USPQ 215 (CCPA 1980)). Shu et al. also indicate that it is a commonly used design (part 4), indicating that it would have been obvious to try.
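For illustration only, the patch segmentation step recited in claims 4, 11 and 17 (fitting a two-component Gaussian mixture model to per-patch feature vectors by expectation maximization) can be sketched as follows. The diagonal covariance, the deterministic initialization, the synthetic 100-dimensional data, and the function name `fit_two_component_gmm` are all assumptions; this is not presented as the implementation of the application or of any cited reference.

```python
import numpy as np

def fit_two_component_gmm(features, n_iter=50, eps=1e-6):
    """EM for a 2-component, diagonal-covariance GMM.
    Returns one hard component label (0 or 1) per feature vector."""
    x = np.asarray(features, dtype=float)
    # Initialize the means at the two extreme points along the
    # highest-variance feature (a simple deterministic choice).
    axis = np.argmax(x.var(axis=0))
    means = np.stack([x[np.argmin(x[:, axis])], x[np.argmax(x[:, axis])]])
    variances = np.tile(x.var(axis=0) + eps, (2, 1))
    weights = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities from log(weight * Gaussian density).
        diff = x[:, None, :] - means[None, :, :]
        log_prob = -0.5 * np.sum(
            diff ** 2 / variances + np.log(2 * np.pi * variances), axis=2)
        log_prob += np.log(weights)
        log_prob -= log_prob.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_prob)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and diagonal variances.
        nk = resp.sum(axis=0) + eps
        weights = nk / nk.sum()
        means = (resp.T @ x) / nk[:, None]
        diff = x[:, None, :] - means[None, :, :]
        variances = np.einsum('nk,nkd->kd', resp, diff ** 2) / nk[:, None] + eps
    return resp.argmax(axis=1)

# Synthetic stand-in for per-patch features: 40 "authentic" patches and
# 10 "forged" patches, each described by a 100-dimensional vector.
rng = np.random.default_rng(0)
authentic = rng.normal(0.0, 1.0, size=(40, 100))
forged = rng.normal(5.0, 1.0, size=(10, 100))
labels = fit_two_component_gmm(np.vstack([authentic, forged]))
```

Reshaping `labels` back onto the patch grid would give a binary localization map; which component corresponds to the forged region is not determined by the mixture fit itself and would need a separate rule (for example, treating the smaller cluster as forged), which is again an assumption.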

Claims 5, 12 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al. (“Learning Rich Features for Image Manipulation Detection” 2018) and Dolhansky et al. (US 10810725 B1) and Cozzolino et al. (IDS: Noiseprint: a CNN-based camera model fingerprint, 2018) as applied to claims 1, 8 and 14 above, further in view of Alberti et al. (“Are You Tampering With My Data?”, 2018).

Regarding claims 5, 12, and 18, Zhou et al., Dolhansky et al., and Cozzolino et al. disclose the system, method, and CRM of claims 1, 8 and 14. Zhou et al. further indicate the neural network is a multiple-layer deep Convolutional Neural Network (CNN) (Bilinear pooling [23], first proposed for fine-grained classification, combines streams in a two-stream CNN network while preserving spatial information to improve the detection confidence, p1056; layers shown in Fig. 2), but do not specify it is 18 layers deep.

Alberti et al. teach a neural network that is an 18 layer deep Convolutional Neural Network (CNN) (“propose a novel approach towards adversarial attacks on neural networks (NN), focusing on tampering the data used for training instead of generating attacks on trained models… networks to misclassify any images to which the modification is applied”, abstract, network models ResNet-18, p6, The residual network we used differs from the original ResNet-18 model as it has an expected input size of 32×32 instead of the standard 224×224, p7).

Zhou et al., Dolhansky et al., Cozzolino et al., and Alberti et al. are in the same art of detecting fake images (Zhou et al., abstract; Dolhansky et al., abstract; Cozzolino et al., abstract; Alberti et al., abstract). The combination of Alberti et al. with Zhou et al., Dolhansky et al., and Cozzolino et al. enables using an 18-layer CNN. It would have been obvious at the time .

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: Liu et al. “Encoding High Dimensional Local Features by Sparse Coding Based Fisher Vectors” 2014 (local features 1 and for each dimension we calculate d in a GMM model with 100 mixtures, p4; Deriving from the gradient vector of a generative model of local features, Fisher vector coding (FVC) has been identified as an effective coding method for image classification. Most, if not all, FVC implementations employ the Gaussian mixture model (GMM) to characterize the generation process of local features, p1; 1000 Gaussian mixtures with 100 dimensional local features denoted as GMMFV-1000-100D, p6); US 10925568 B2 (DL network training, “The function U(p) is a regularization term, and this term is directed at imposing one or more constraints (e.g., a total variation (TV) minimization constraint), which often have the effect of smoothing or denoising the reconstructed image. The value β is a regularization parameter is a value that weights the relative contributions of the data fidelity term and the regularization term”, error/loss function can be calculated using one or more of a hinge loss and a cross-entropy loss).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHELLE M ENTEZARI HAUSMANN whose telephone number is (571)270-5084.  The examiner can normally be reached on 10-7 M-F.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, VINCENT M RUDOLPH can be reached on (571)272-8243.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/MICHELLE M ENTEZARI/Primary Examiner, Art Unit 2661