Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains.  Patentability shall not be negatived by the manner in which the invention was made.

Claims 1-33 are rejected under 35 U.S.C. 103(a) as being unpatentable over Lai et al. (AAPA, Simultaneous Feature Learning and Hash Coding with Deep Neural Networks, pp. 3270-3278, hereinafter “Lai”), in view of Shen et al. (AAPA, Learning Binary Codes for Maximum Inner Product Search, hereinafter “Shen”), and further in view of Lal et al. (Automatic Image Colorization Using Adversarial Training, pp. 84-88, hereinafter “Lal”).

Regarding claim 1, Lai discloses a method for preparing a deep neural network to generate binary codes corresponding to images, the method comprising:
 obtaining a plurality of training images and a corresponding plurality of similarity values, each of the similarity values indicating a degree of similarity of a pair of the training images (p. 3272, sect. 3.1: training triplets of images are used to compare the similarity for mapping pairs); 
providing the plurality of images directly as input to the deep neural network to yield binary codes corresponding to the images (p. 3272, sect. 3.1: using binary codes in a deep neural network); 

While Lai discloses using the objective function, Lai does not explicitly disclose training the deep neural network using an iterative discrete optimization without continuous relaxations; however, Shen discloses training the deep neural network using an iterative discrete optimization without continuous relaxations (p. 4149, 1st para. of 2nd column: the binary codes and coding functions are simultaneously learned without continuous relaxations, which is the key to achieving high-quality binary codes).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Shen into Lai to provide a smooth optimization and achieve high-quality binary codes for the deep neural network.
Lal further discloses providing an image directly as input [layer] to the deep neural network (p. 84, last para.: the last layer of generator as input).  It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Lal into Lai to learn more meaningful and discriminative features describing the input image.

Regarding claim 2, Lai in view of Shen and Lal disclose the method according to claim 1 wherein the deep neural network is a convolutional neural network (p. 3275, 1st column, last para.: we use the same stack of convolution-pooling layers as in Table 1, except for modifying the size of the last convolution).

Regarding claim 3, Lai in view of Shen and Lal disclose the method according to claim 2 wherein the convolutional neural network comprises at least one convolutional layer, pooling layer or fully connected layer (p. 3271, Fig. 1, shared sub-network with stacked convolution layers).

Regarding claim 4, Lai in view of Shen and Lal disclose the method according to claim 2 wherein the convolutional neural network comprises two or more convolutional layers, pooling layers and fully connected layers (Fig. 1, the image triplets are first encoded into a triplet of image feature vectors by a shared stack of multiple convolution layers).

Regarding claim 5, Lai in view of Shen and Lal disclose the method according to claim 1 wherein the deep neural network comprises an input layer, the input layer comprising a plurality of nodes connected to receive the plurality of images wherein each of the plurality of nodes corresponds to a different pixel value in one of the plurality of images (p. 3275, 1st column, we directly use the image pixels as input. For the other baseline methods, we follow [27, 12] to represent each image in SVHN and CIFAR-10 by a 512-dimensional GIST vector; we represent each image in NUS-WIDE by a 500-dimensional bag-of-words vector).

Regarding claim 6, Lai in view of Shen and Lal disclose the method according to claim 5 wherein the plurality of nodes is one of a plurality of sets of nodes of the input layer and the nodes of each of the sets of nodes are connected to receive pixel values of the plurality of images (p. 3275, 1st column, directly use the image pixels as input; see Table 2 for e.g. values).

Regarding claim 7, Lai in view of Shen and Lal disclose the method according to claim 6 wherein the pixel values each comprise a plurality of color values, each of the plurality of sets of nodes is associated with a different color corresponding to one of the color values and the pixel values that the nodes of each of the sets of nodes is connected to receive are the color values corresponding to the color associated with the set of nodes (Lal, p. 86, sect. 3.1.2, for corresponding color values).

Regarding claim 8, Lai in view of Shen and Lal disclose the method according to claim 7 wherein the color values represent values in a color space selected from the group consisting of LUV, HST, CIELAB, CMYK, CIEXYZ, TSL and HSL color spaces (Lal, p. 84, Predicting the RGB color channels from lightness channel L of the CIE Lab color space is overdetermined. Instead, the approach presented predicts a and b channels for given input L channel in the CIE Lab color space. CIE Lab color space itself incorporates lightness channel, thereby requiring the system to predict two channels instead of three, reducing the complexity of the system).

Regarding claim 9, Lai in view of Shen and Lal disclose the method according to claim 7 wherein the color values represent values in an RGB color space (Lal, p. 84, Predicting the RGB color channels from lightness channel L).

Regarding claim 10, Lai in view of Shen and Lal disclose the method according to claim 9 wherein the sets of nodes comprise: a first set of nodes connected to receive red color values (Lal, p. 86, 2nd column: RGB value for color space); 
a second set of nodes connected to receive green color values (Lal, p. 86, 2nd column: RGB value for color space); and 
a third set of nodes connected to receive blue color values (Lal, p. 86, 2nd column: RGB value for color space).
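
For illustration only, the separation of pixel values into per-color sets of input values recited in claims 7-10 can be sketched as follows. This is a minimal Python sketch under assumed inputs; the function name `split_rgb` and the toy pixel values are hypothetical and are not taken from Lal, Lai, or the application.

```python
def split_rgb(image):
    """Split an image into three flat lists of red, green and blue values.

    `image` is assumed to be a list of rows, each row a list of (r, g, b) tuples.
    """
    red = [px[0] for row in image for px in row]
    green = [px[1] for row in image for px in row]
    blue = [px[2] for row in image for px in row]
    return red, green, blue


# Toy 2x2 image with made-up pixel values.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (10, 20, 30)],
]
red, green, blue = split_rgb(image)
```

Each returned list would then feed one set of input nodes, one node per pixel position.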

Regarding claim 11, Lai in view of Shen and Lal disclose the method according to claim 1 wherein training the deep neural network is unsupervised (Shen, p. 4148, last para.: including unsupervised PCA hashing etc.).

Regarding claim 12, Lai in view of Shen and Lal disclose the method according to claim 1 wherein the discrete optimization comprises alternating between: training the deep neural network as a deep neural network regressor on target binary codes (Shen, p. 4151, for training in a neural network); and 
updating the target binary codes based on memory and an output of the deep neural network regressor (Shen, p. 4151, getting update and prediction/regressor; predict the target binary codes B with minimum quantization loss; updated with the pre-learned r − 1 bits till the procedure converges with a set of better codes).

Regarding claim 13, Lai in view of Shen and Lal disclose the method according to claim 12 wherein the deep neural network regressor is configured as a non-linear regressor (Shen, p. 4151, getting prediction/regressor; predict the target binary codes B with minimum quantization loss).

Regarding claim 14, Lai in view of Shen and Lal disclose the method according to claim 1 wherein the plurality of similarity values are computed using pre-computed image features (Shen, p. 4150, 2nd column, 3rd para.: computing the similarity matrix S by inner product).
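
For illustration only, a similarity matrix computed by inner product over pre-computed feature vectors can be sketched as follows. This is a minimal Python sketch; the function name `similarity_matrix` and the toy feature vectors are hypothetical assumptions and do not reproduce the computation in Shen or in the application.

```python
def similarity_matrix(features_a, features_x):
    """Return S where S[i][j] is the inner product of features_a[i] and features_x[j]."""
    return [
        [sum(a * x for a, x in zip(fa, fx)) for fx in features_x]
        for fa in features_a
    ]


# Toy, made-up feature vectors (assumed to be computed beforehand, e.g. GIST-like).
A = [[1.0, 0.0], [0.0, 1.0]]
X = [[1.0, 0.0], [0.5, 0.5]]
S = similarity_matrix(A, X)
```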

Regarding claim 15, Lai in view of Shen and Lal disclose the method according to claim 14 wherein the pre-computed image features comprise at least one of: Gist, generic ImageNet-pretrained features not specific to a retrieval task or tuned to the retrieval dataset and raw pixel intensities (Lai, p. 3270, 2nd para.: GIST is utilized).



Regarding claim 17, Lai in view of Shen and Lal disclose the method according to claim 1 comprising optimizing parameters of the deep neural network to generate optimized binary codes in an iterative procedure which comprises alternating between a first procedure and a second procedure wherein the first procedure trains the deep neural network regressor on target binary codes B and the second procedure updates the target binary codes B (Shen, p. 4151, getting update and prediction/regressor; predict the target binary codes B with minimum quantization loss; updated with the pre-learned r − 1 bits till the procedure converges with a set of better codes).

Regarding claim 18, Lai in view of Shen and Lal disclose the method according to claim 17 wherein the deep neural network is applied as a non-linear regressor that maps directly from images to the binary codes (Shen, p. 4151, getting update and prediction/regressor).

Regarding claim 19, Lai in view of Shen and Lal disclose the method according to claim 18 wherein: A and X denote sets of the training images, h(.) and z(.) are non-linear functions implemented using the deep neural network, H : Q -> {-1, 1} and Z : 1 -> {-1, 1} denote mappings from an image space 1 to k-bit binary codes, and S is a similarity matrix having entries S_ij, which have values that indicate the visual similarity between the ith image in A and the jth image in X and training the deep neural network comprises performing an optimization using an optimization objective function that attempts to maximize a correlation between S and inner products of the k-bit binary codes (Shen, p. 4150, at least in the last half of 2nd column).

Regarding claim 20, Lai in view of Shen and Lal disclose the method according to claim 18 wherein the optimization objective function is as follows:

[Equation image media_image1.png not reproduced]
(Shen, p. 4150, function (2); p. 4151, function (14)).

Regarding claim 21, Lai in view of Shen and Lal disclose the method according to claim 18 wherein the optimization objective function is as follows:

[Equation image media_image2.png not reproduced]
(Shen, p. 4150, function (7)).

Regarding claim 22, Lai in view of Shen and Lal disclose the method according to claim 18 wherein the optimization objective function comprises a discrete sign function sgn(.) and the method comprises, in successive iterations, without relaxing the discrete sign function, alternating between holding z fixed and solving for h, and holding h fixed and solving for z (Shen, p. 4152, “sgn” function (17)).

Regarding claim 23, Lai in view of Shen and Lal disclose the method according to claim 22 wherein, for holding z fixed and solving for h the optimization objective function is given by:

[Equation image media_image3.png not reproduced]
(Shen, p. 4151: [image media_image4.png not reproduced]).

Regarding claim 24, Lai in view of Shen and Lal disclose the method according to claim 23 comprising separating the non-linear function h(.) from the sign function sgn(.) using an auxiliary binary variable B representing the binary codes (Shen, p. 4152, “sgn” function (17)).

Regarding claim 25, Lai in view of Shen and Lal disclose the method according to claim 24 comprising iteratively alternating between holding h fixed and solving for B, and holding B fixed and solving for h (Shen, p. 4152).

Regarding claim 26, Lai in view of Shen and Lal disclose the method according to claim 25 wherein holding B fixed and solving for h comprises training the deep neural network using backpropagation and a loss function that provides a measure of differences between B and h(A) (Shen, p. 4152).
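
For illustration only, the alternation recited in claims 25 and 26 (holding B fixed and solving for h, then holding h fixed and solving for B) can be sketched as follows. This is a simplified Python sketch under stated assumptions: a small linear model trained by plain gradient descent stands in for the deep neural network and backpropagation, codes are single-bit, and only the squared difference between B and h(A) is minimized, with no similarity term. The names `fit_h`, `update_codes`, and the toy data are hypothetical and do not reproduce the method of Shen or of the application.

```python
def sgn(v):
    # Discrete sign with sgn(0) = 1, so codes stay in {-1, +1}.
    return 1 if v >= 0 else -1


def predict(w, x):
    # Linear stand-in for the non-linear function h(.).
    return sum(wi * xi for wi, xi in zip(w, x))


def fit_h(w, inputs, codes, lr=0.1, steps=200):
    """Hold B fixed and reduce the mean of (h(x_i) - B_i)^2 by gradient descent."""
    w = list(w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, b in zip(inputs, codes):
            err = predict(w, x) - b
            for d, xd in enumerate(x):
                grad[d] += 2.0 * err * xd
        w = [wi - lr * g / len(inputs) for wi, g in zip(w, grad)]
    return w


def update_codes(w, inputs):
    """Hold h fixed; B = sgn(h(x)) minimizes (B - h(x))^2 over B in {-1, +1}."""
    return [sgn(predict(w, x)) for x in inputs]


# Toy, made-up inputs and initial one-bit target codes.
inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
codes = [1, -1, 1]
w = [0.0, 0.0]
for _ in range(3):
    w = fit_h(w, inputs, codes)      # B fixed, solve for h
    codes = update_codes(w, inputs)  # h fixed, solve for B
```

The sign function is applied as-is in the code update rather than being replaced by a continuous surrogate, which is the sense in which the sketch avoids a continuous relaxation.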

Regarding claim 27, Lai in view of Shen and Lal disclose the method according to claim 1 comprising mapping query images or stored images using the deep neural network to yield corresponding binary codes for the query images or the stored images and using the corresponding binary codes to assess similarity of the query images or the stored images to other images (Shen, sect. 3.3 retrieval with large dataset).
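
For illustration only, assessing the similarity of a query image to stored images through their binary codes can be sketched as follows. This is a minimal Python sketch that assumes Hamming distance as the comparison measure, which the cited passages do not specify; the function names and the toy codes are hypothetical.

```python
def hamming(code_a, code_b):
    """Count the bit positions at which two equal-length codes differ."""
    return sum(1 for a, b in zip(code_a, code_b) if a != b)


def rank_by_hamming(query_code, stored_codes):
    """Return indices of stored codes ordered by increasing distance to the query."""
    return sorted(
        range(len(stored_codes)),
        key=lambda i: hamming(query_code, stored_codes[i]),
    )


# Toy, made-up 4-bit codes in {-1, +1}.
stored = [[1, -1, 1, 1], [-1, -1, -1, 1], [1, -1, 1, -1]]
query = [1, -1, 1, 1]
order = rank_by_hamming(query, stored)
```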

Regarding claim 28, Lai discloses a method for retrieving from a database images similar to an input image, the method comprising:

searching a plurality of binary codes corresponding to a plurality of stored images using the output binary code (p. 3275, column 1, last para.: searching for results obtained from comparison); and 
retrieving images from the plurality of stored images with binary codes similar to the output binary code (abstract, large-scale image retrieval tasks are performed); 
wherein training the deep neural network comprises: obtaining a plurality of training images and a corresponding plurality of similarity values, each of the similarity values indicating a degree of similarity of a pair of the training images (p. 3272, sect. 3.1: training triplets of images are used to compare the similarity for mapping pairs); 
providing the plurality of images directly as input to the deep neural network to yield binary codes corresponding to the images (p. 3272, sect. 3.1: using binary codes in a deep neural network); 
generating an objective function based on the binary codes and the similarity values (p. 3272, sect. 3.1: the goal is to find a mapping F(.) such that the binary code F(I) is closer to F(I +) than to F(I −)); and 
While Lai discloses using the objective function, Lai does not explicitly disclose training the deep neural network using an iterative discrete optimization without continuous relaxations; however, Shen discloses training the deep neural network using an iterative discrete optimization without continuous relaxations (p. 4149, 1st para. of 2nd column: the binary codes and coding functions are simultaneously learned without continuous relaxations, which is the key to achieving high-quality binary codes).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Shen into Lai to provide a smooth optimization and achieve high-quality binary codes for the deep neural network.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Lal into Lai to learn more meaningful and discriminative features describing the input image.

Regarding claim 29, Lai in view of Shen and Lal disclose the method according to claim 28 comprising updating the plurality of binary codes corresponding to the plurality of stored images based on changes in one or both of the types of and number of images in the plurality of stored images (Shen, sect. 3.3 retrieval with large dataset).

Regarding claim 30, Lai in view of Shen and Lal disclose the method according to claim 28 wherein providing the input image directly as input to the deep neural network comprises preprocessing the input image into a format receivable by the deep neural network (p. 3272, sect. 3.1: training triplets of images are used to compare the similarity for mapping pairs).

Regarding claim 31, Lai in view of Shen and Lal disclose the method according to claim 30 wherein preprocessing the input image comprises at least one of: changing a size of the image (Lai, Table 1); changing to a selected bit depth; transforming to a selected color format; and performing image adjustments.

Regarding claim 32, Lai in view of Shen and Lal disclose the method according to claim 31 wherein changing the size of the input image comprises at least one of upsampling, downsampling, decimating, interpolating, padding and cropping of the input image (Lal, p. 87, sect. 4: cropping).
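
For illustration only, changing the size of an input image by cropping can be sketched as follows. This is a minimal Python sketch of a center crop; the function name `center_crop` and the toy image are hypothetical and are not taken from Lal or from the application.

```python
def center_crop(image, out_h, out_w):
    """Return the central out_h x out_w region of `image` (a list of rows).

    Assumes out_h and out_w do not exceed the image height and width.
    """
    h, w = len(image), len(image[0])
    top = (h - out_h) // 2
    left = (w - out_w) // 2
    return [row[left:left + out_w] for row in image[top:top + out_h]]


# Toy 4x4 single-channel image with made-up values 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
cropped = center_crop(image, 2, 2)
```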


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TUANKHANH D PHAN whose telephone number is (571)270-3047.  The examiner can normally be reached Mon-Fri, 10:00 am-6:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hosain Alam, can be reached at 571-272-3978.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 or 571-272-1000.
/TUANKHANH D PHAN/Examiner, Art Unit 2154