Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
References Cited in Prior Art and Double Patenting Rejections 
The following references are cited in the prior art rejections set forth below and are referred to as noted:
Tang et al., US 20120106854 A1, published on May 3, 2012, hereinafter Tang, 
Nister et al., "Scalable recognition with a vocabulary tree." In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), vol. 2, pp. 2161-2168. IEEE, 2006, hereinafter Nister,
Rodriguez Serrano, US 20140056520 A1, published on February 27, 2014, pages 1-7, hereinafter Rodriguez Serrano, 
Bober et al., US 20160267351 A1, published on September 15, 2016, filed on July 7, 2014, hereinafter Bober, 
Song, US 10565759 B2, issued on February 18, 2020, hereinafter ‘759, and  
Song, US 9721186 B2, issued on August 1, 2017, hereinafter ‘186.  
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 31-37, 39-45, 49-52 and 55-56 are rejected under 35 U.S.C. 103 as being unpatentable over Tang in view of Nister.
Regarding claim 31, Tang discloses an image recognition device (Tang: Abstract, Figs. 1A-B) comprising: 
at least one tangible, non-transitory, computer-readable memory configured to store a vocabulary of visual words each based on at least one descriptor relating to image information; (Tang: Fig. 1B and [0032-0033]) and 
at least one processor coupled with the at least one tangible, non-transitory computer-readable memory and, upon execution of image recognition software instructions, (Tang: Fig. 1B) is configured to: 
identify a plurality of local features of an image based on the vocabulary, the local features being represented by a plurality of local descriptors; (Tang: [0044-0046]. “For example, visual feature data can be obtained based on advanced invariant local features, such as using a scale-invariant feature transform (SIFT) in computer vision to detect and describe local features in images. … As another example, visual feature data can be obtained using a bag-of-features model in image retrieval.” ([0044]). “For each image of the collection of images in the database, and an image to be recognized, first dense local features are extracted and each feature is assigned a visual ID of the corresponding visual word.” ([0045]))
determine an associated visual word in the vocabulary for each one of the plurality of local descriptors; (Tang: [0045]. “In order to obtain the visual word vocabulary, an efficient feature clustering method can be used. For example, clustering methods like k-means or Expectation Maximization (EM) can be used. As another example, a clustering method that is scalable to a large number of images, such as fast k-means clustering, can be used to cluster a large number of features. In an example fast k-means clustering, each iteration of k-means is accelerated by building a random forest, a variation of kd-tree, on the cluster centers.”)
generate, prior to facilitating an image recognition search, a plurality of vectors for the image based on the associated visual words, wherein some of the plurality of vectors are generated using local descriptors corresponding to different versions of the image; (Tang: Fig. 3 and [0045-0046]. “The bag-of-features model is used to create a unique and compact digital signature or fingerprint for each image. … Then a visual word frequency vector can be built with each element as the number of features that are closest to that visual word.” ([0045]). “In order to incorporate spatial information within an image 305, the image 305 is further divided into subregions 310. For each subregion, a visual word frequency vector is computed by comparing the subregion to a codebook of image subregions. The codebook is populated by image subregions of training images having known event association. In the illustrated multiscale computation, a reduced scale version of the image 306 is also further divided into subregions and compared to the codebook to compute a visual word frequency vector for each subregion. Another further reduced scale version of the image 307 is also further divided into subregions and compared to the codebook to compute a visual word frequency vector for each subregion. The visual word frequency vectors for the subregions from the various multiscale computations are concatenated to form a frequency vector representation 320 for the image. The concatenated frequency vector representation is visual feature data for the image.” ([0046]))
store the plurality of vectors; (Tang: Fig. 1B and [0032-0033, 0046]) and 
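facilitate the image recognition search (Tang: [0043, 0047]. The image recognition search is facilitated by determining if the image is associated with a certain event. ([0058, 0064]))
Tang does not disclose explicitly, but Nister teaches, in the same field of endeavor of image recognition and retrieval, facilitating the image recognition search to search a dataset, wherein the image recognition search compares each of the stored plurality of vectors with a plurality of image signatures. (Nister: Abstract, Sections 4-5. In particular, equation 3 “compares each of the stored plurality of vectors with a plurality of image signatures”, while equations 5-6 compute normalized differences (or distances).)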
It is noted that Tang refers to Nister as an example for image retrieval applications. (Tang: [0043]. “As another example, visual feature data can be obtained using a bag-of-features model in image retrieval. See, e.g., D. Nister et al., 2006, Scalable recognition with a vocabulary tree, IEEE CVPR, pages 2161-2168”.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tang’s disclosure with Nister’s teachings by combining the image recognition device (from Tang) with the technique of image database search by comparing image signatures (from Nister) to yield no more than predictable use of prior art elements according to their established functions since all the claimed elements, which are taught by prior art references, would continue to operate in the same manner, particularly, the image recognition device would still work in the way according to Tang and the technique of image database search by comparing image signatures would continue to function as taught by Nister. In fact, the inclusion of Nister's technique of image database search by comparing image signatures would provide a practical application and an alternative implementation for the image recognition device from Tang (Tang: [0043]) and thus would enable a better and more flexible image recognition device.
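For illustration only, the vector generation relied upon above (Tang: [0045-0046]) can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from Tang or Nister: the function names, the two-dimensional toy descriptors, and the three-word vocabulary are all hypothetical. Each local descriptor is assigned to its closest visual word, a word frequency vector is built per version of the image, and the per-version vectors are concatenated.

```python
# Illustrative sketch only; all names and toy values are hypothetical.

def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word (cluster center) closest to the descriptor."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(vocabulary)), key=lambda i: sq_dist(descriptor, vocabulary[i]))

def word_frequency_vector(descriptors, vocabulary):
    """Count how many descriptors are closest to each visual word."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    return counts

def concatenated_representation(descriptors_per_version, vocabulary):
    """Concatenate the frequency vectors computed for each version of the image."""
    representation = []
    for descriptors in descriptors_per_version:
        representation.extend(word_frequency_vector(descriptors, vocabulary))
    return representation

# Toy vocabulary of three visual words (assumed values).
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]

# Toy descriptors for two versions of one image (assumed values).
versions = [
    [(0.1, 0.0), (0.9, 1.2), (4.8, 5.1)],  # original-scale version
    [(0.2, 0.1), (5.2, 4.9)],              # reduced-scale version
]

representation = concatenated_representation(versions, vocabulary)
print(representation)
```

In this toy case the concatenated vector has one block of three counts per version, so adding further subregions or scales lengthens the representation without changing the vocabulary.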

Regarding claim 32, which depends on claim 31, Tang {modified by Nister} does not disclose explicitly wherein facilitating the image recognition search includes displaying an image recognition search result to a user, which is, however, well known and commonly practiced in the image processing art as evidenced by the further teachings from Nister. (Nister: Fig. 9.) 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tang’s disclosure with Nister’s further teachings by incorporating the result displaying function into the device of Tang {modified by Nister} in order to enable the user to see the retrieved image resulting from the user’s image retrieval action. One of ordinary skill in the art would be motivated to make the combination since the inclusion of the result displaying function into the image recognition device would provide a practical implementation of the device from Tang {modified by Nister} to enable its application in image retrieval.
Regarding claim 33, Tang {modified by Nister} discloses the device of claim 31, wherein a first vector of the plurality of vectors corresponds to a first version of the image, and wherein at least one of the different versions of the image is centered at a same pixel location as the first version of the image. (Tang: [0046]. At least a reduced scale version of the image (306 or 307 in Fig. 3) is centered at a same pixel location as the original image (305 in Fig. 3).)
Regarding claim 34, Tang {modified by Nister} discloses the device of claim 31, wherein a first vector of the plurality of vectors corresponds to a first version of the … (Tang: [0046]. A subregion of the image 305 is interpreted as a different version of the image.)
Regarding claim 35, Tang {modified by Nister} discloses the device of claim 31, wherein the at least one processor is further configured to: select at least one additional pixel location of the image; and generate one or more of the plurality of vectors for the image using local descriptors corresponding to versions of the image centered at the at least one additional pixel location. (Tang: [0046]. At least one subregion of the image 305 is centered at an additional location of the image. Dividing the image into subregions is interpreted as selecting additional pixel locations of the image.)
Regarding claim 36, Tang {modified by Nister} discloses the device of claim 31, wherein the at least one processor is further configured to determine at least one of the different versions of the image in order to focus a vector around one or more features or objects in the image. (Tang: [0046]. “In order to incorporate spatial information within an image 305, the image 305 is further divided into subregions 310.”)
Regarding claim 37, Tang {modified by Nister} discloses the device of claim 31, wherein the at least one processor is further configured to use the different versions of the image to identify different regions of the image. (Tang: [0046]. A subregion of the image 305 is interpreted as a different version of the image.)
Regarding claim 39, Tang {modified by Nister} discloses the device of claim 31, wherein one of the plurality of vectors is generated using local descriptors corresponding to an uncropped version of the image. (Tang: [0046]. At least the scaled down version (307 in Fig. 3) is an uncropped version of the image.)
Regarding claim 40, Tang {modified by Nister} discloses the device of claim 31, wherein the image is a digital representation of at least one of a medical patient, a face, and biological material. (Tang: a face in the image shown in Fig. 3.)
Regarding claim 41, Tang {modified by Nister} discloses the device of claim 31, wherein the image is one of a query image or a document image. (Nister: first paragraph of Section 4.)
Regarding claim 42, Tang {modified by Nister} discloses the device of claim 31, wherein the dataset comprises vectors corresponding to each of a plurality of images. (Tang: [0046-0047].) (Nister: Section 4.)
Regarding claim 43, Tang {modified by Nister} discloses the device of claim 31, wherein the dataset includes digital representations of an object at different scales. (Tang: [0027].) (Nister: Fig. 5.)
Regarding claim 44, Tang {modified by Nister} discloses the device of claim 31, wherein the dataset comprises at least one of a medical training image, a video frame, and a test library of medical record images. (Tang: images captured by video camera as in [0012] and video content as in [0028].)
Regarding claim 45, Tang {modified by Nister} discloses the device of claim 31, wherein the at least one processor is further configured to: divide the image into k cluster centers; and assign each one of the plurality of local descriptors to a closest one of the k cluster centers. (Tang: [0045-0046].)
Regarding claim 49, Tang {modified by Nister} discloses the device of claim 31, wherein the at least one processor is further configured to generate the plurality of vectors based on one or more features or objects located at an off-center location in the image. (Tang: [0046]. At least some features in some subregions are located at an off-center location.)
Regarding claim 50, Tang {modified by Nister} discloses the device of claim 31, wherein the at least one processor is further configured to generate the plurality of vectors based on one or more different scaled features or objects in the image. (Tang: [0046].)
Regarding claim 51, Tang {modified by Nister} discloses the device of claim 31, wherein a distance measurement between vectors representing images containing an object in common is less than a distance measurement between vectors representing images that do not contain an object in common. (Nister: Sections 4-5. The normalized differences in equations 3-5 are interpreted as the claimed “distance”. See also discussions in the Results section 6.)
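For illustration only, the normalized difference of Nister's equation 3 can be sketched as follows. This is a minimal sketch assuming L1 normalization and hypothetical toy vectors; the function names are not taken from Nister. It shows a query vector scoring a smaller distance against a vector with a similar visual-word distribution than against a vector with no visual words in common.

```python
# Illustrative sketch only; names and toy vectors are hypothetical.

def l1_normalize(vector):
    """Scale a vector so that the sum of absolute values is 1 (zero vectors stay zero)."""
    total = sum(abs(x) for x in vector)
    if total == 0:
        return [0.0] * len(vector)
    return [x / total for x in vector]

def normalized_difference(q, d):
    """L1 norm of the difference between the normalized query and database vectors."""
    qn, dn = l1_normalize(q), l1_normalize(d)
    return sum(abs(a - b) for a, b in zip(qn, dn))

query = [4, 0, 2, 0]          # visual-word counts for the query image
shares_object = [3, 1, 2, 0]  # similar word distribution (assumed common object)
unrelated = [0, 3, 0, 3]      # no visual words in common with the query

score_same = normalized_difference(query, shares_object)
score_diff = normalized_difference(query, unrelated)
print(score_same, score_diff)
```

With L1 normalization the score ranges from 0 (identical distributions) to 2 (disjoint distributions), so the unrelated vector in this toy case reaches the maximum distance.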
Regarding claim 52, Tang {modified by Nister} discloses the device of claim 31, wherein the plurality of local descriptors are one of scale-invariant feature transform (SIFT) descriptors, Fast Retina Keypoint (FREAK) descriptors, Histograms of Oriented Gradient (HOG) descriptors, Speeded Up Robust Features (SURF) descriptors, DAISY descriptors, Binary Robust Invariant Scalable Keypoints (BRISK) descriptors, FAST descriptors, Binary Robust Independent Elementary Features (BRIEF) descriptors, Harris Corners descriptors, Edges descriptors, Gradient Location and Orientation Histogram (GLOH) descriptors, Electrooculography (EOG) descriptors or Transform Invariant Low-rank Textures (TILT) descriptors. (Tang: [0044])
Claims 55-56 are the method and computer readable medium (Tang: Fig. 1B and [0032-0033]) claims, respectively, corresponding to the device claim 31. Therefore, since claims 55-56 are similar in scope to claim 31, claims 55-56 are rejected on the same grounds as claim 31.
Claim 38 is rejected under 35 U.S.C. 103 as being unpatentable over Tang {modified by Nister} as applied to claim 31 and further in view of Rodriguez Serrano.
Regarding claim 38, which depends on claim 31, Tang {modified by Nister} does not disclose explicitly wherein at least one aspect of the different versions of the image is user-selectable, which is however well known and commonly practiced in the field of image recognition as evidenced by the prior art reference of Rodriguez Serrano. (Rodriguez Serrano: [0067] (“Generally, the segmentation maps are manually or semi-automatically generated.”))
In light of the above discussions, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tang’s {modified by Nister} disclosure with Rodriguez Serrano’s teachings by combining the image recognition device (from Tang {modified by Nister}) with the technique to generate user-selectable segmentation maps (from Rodriguez Serrano) to yield no more than predictable use of prior art elements according to their established functions since all the claimed elements, which are taught by prior art references, would continue to operate in the same manner, particularly, the image recognition device would still work in the way according to Tang {modified by Nister} and the technique to generate user-selectable segmentation maps would also function in the same way as taught by Rodriguez Serrano. One of ordinary skill in the art would be motivated to make such a 
Therefore, it would have been obvious to combine Tang {modified by Nister} with Rodriguez Serrano to obtain the invention as specified in claim 38.
Claims 46-48 and 53-54 are rejected under 35 U.S.C. 103 as being unpatentable over Tang {modified by Nister} as applied to claims 45 and 31 and further in view of Bober.
Regarding claim 46, Tang {modified by Nister} discloses the device of claim 45, but does not disclose explicitly wherein a global signature is a k * 128 dimensional vector of locally aggregated descriptors (VLAD) global signature, which is however well known and commonly practiced in the field of image recognition as evidenced by the prior art reference of Bober. (Bober: [0008] (“An experimental evaluation of the performance of the state-of-the art, including BOW, VLAD and Fisher Kernels are … summarised in the table below, referred to as BoW, VLAD and GMM-FK respectively.” “The results show the best performance reported for image descriptors expressed as 2048 dimensional vectors”, and 2048 = 16 * 128, so k=16))
In light of the above discussions, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tang’s {modified by Nister} disclosure with Bober’s teachings by combining the image recognition device (from Tang {modified by Nister}) with the k*128 dimensional VLAD global signature (from Bober) to yield no more than predictable use of prior art elements according to their established functions since all the claimed elements, which are taught 
Therefore, it would have been obvious to combine Tang {modified by Nister} with Bober to obtain the invention as specified in claim 46.
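For illustration only, the dimensional arithmetic noted above (2048 = 16 * 128, so k = 16) can be sketched with a basic VLAD-style aggregation. This is a minimal sketch under stated assumptions, not Bober's implementation: the function name, the synthetic cluster centers, and the synthetic 128-dimensional descriptors are hypothetical, and the normalization steps commonly applied to such signatures are omitted. Each descriptor is assigned to its nearest cluster center, the residuals are summed per center, and the k residual sums are concatenated.

```python
# Illustrative sketch only; names and synthetic values are hypothetical.

def vlad_signature(descriptors, centers):
    """Sum descriptor-minus-center residuals per nearest center, then concatenate."""
    k, dim = len(centers), len(centers[0])
    residual_sums = [[0.0] * dim for _ in range(k)]
    for desc in descriptors:
        nearest = min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(desc, centers[i])),
        )
        for j in range(dim):
            residual_sums[nearest][j] += desc[j] - centers[nearest][j]
    # Flatten k rows of length dim into a single k * dim vector.
    return [value for row in residual_sums for value in row]

K, DIM = 16, 128  # k cluster centers, 128-dimensional local descriptors

# Synthetic centers: center i has every component equal to i (assumed values).
centers = [[float(i)] * DIM for i in range(K)]

# Synthetic descriptors (assumed values).
descriptors = [[0.25] * DIM, [3.0] * DIM, [15.5] * DIM]

signature = vlad_signature(descriptors, centers)
print(len(signature))
```

The signature length depends only on k and the descriptor dimension, not on the number of descriptors extracted from the image.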
Regarding claim 47, Tang {modified by Nister and Bober} discloses the device of claim 46, wherein the global signature is a 2048 dimensional VLAD global signature. (Bober: [0008])
Regarding claim 48, Tang {modified by Nister and Bober} discloses the device of claim 31, wherein the plurality of global signatures includes at least one of a VLAD signature, a GIST signature or a Deep Learning signature. (Bober: [0008])
The reasoning and motivation for combining are the same as for claim 46.
Regarding claim 53, Tang {modified by Nister} discloses the device of claim 31, but does not disclose explicitly wherein a dictionary is used to determine the associated visual word in the vocabulary with each of the plurality of local descriptors, which is however well known and commonly practiced in the field of image recognition as evidenced by the prior art reference of Bober. (Bober: [0007] (“Similarly as in BOW, a codebook is first computed off-line using k-means algorithm and each local descriptor is assigned to its nearest visual word.”) Here the claimed “dictionary” is interpreted as the disclosed “codebook”.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tang’s {modified by Nister} disclosure with Bober’s teachings by combining the image recognition device (from Tang {modified by Nister}) with the codebook (from Bober) to yield no more than predictable use of prior art elements according to their established functions since all the claimed elements, which are taught by prior art references, would continue to operate in the same manner, particularly, the image recognition device would still work in the way according to Tang {modified by Nister} and the codebook would also function in the same way as taught by Bober. One of ordinary skill in the art would be motivated to make such a modification to provide a convenient way to identify a corresponding visual word for any given local descriptor.
Therefore, it would have been obvious to combine Tang {modified by Nister} with Bober to obtain the invention as specified in claim 53.
Regarding claim 54, Tang {modified by Nister and Bober} discloses the device of claim 53, wherein the dictionary comprises a VLAD dictionary in which a plurality of visual words is based on descriptors determined from a training sample of model images. (Rodriguez Serrano: [0132]) (Bober: [0007])
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees.
A timely filed terminal disclaimer in compliance with 37 C.F.R. § 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. 
Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer. A terminal disclaimer signed by the assignee must fully comply with 37 C.F.R. § 3.73(b).
Claims 31-56 are rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 1-26 of US 10565759 B2.  Although the conflicting claims are not identical, they are not patentably distinct from each other because the claims of the instant application are anticipated by the claims of the ‘759 patent.
With respect to claim 31 of the instant application, claim 1 of the '759 patent stipulates an image recognition device (col. 14, line 50) comprising: at least one tangible, non-transitory, computer-readable memory configured to store a vocabulary of visual words each based on at least one descriptor relating to image information; (col. 14, lines 51-55) and at least one processor coupled with the at least one tangible, non-transitory computer-readable memory and, upon execution of image recognition software instructions, is configured to: (col. 14, lines 56-59) identify a plurality of local features of an image based on the vocabulary, the local features being represented by a plurality of local descriptors; (col. 14, lines 60-62) determine an associated visual word in the vocabulary for each one of the plurality of local descriptors; (col. 14, lines 63-64) generate, prior to facilitating an image recognition search, a plurality of vectors for the image based on the associated visual words, wherein some of the plurality of vectors are generated using local descriptors corresponding to different versions of the image; (col. 14, line 65 to col. 15, line 3) store the plurality of vectors; (col. 15, line 4) and facilitate the image recognition search to search a dataset, wherein the image recognition search compares each of the stored plurality of vectors with a plurality of image signatures, (col. 15, lines 5-8) so that the invention defined by claim 31 of the instant application is fully anticipated by claim 1 of the '759 patent.
Furthermore, the additional requirements variously set forth in claims 32-54 of the instant application are variously stipulated by corresponding limitations set forth in claims 2-24 of the '759 patent, so that claims 32-54 of the instant application are also fully anticipated by claims 2-24 of the '759 patent.

Claims 31 and 55-56 are rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 28, 1 and 29 of US 9721186 B2 in view of Nister.
With respect to claim 31 of the instant application, claim 28 of the ‘186 patent stipulates an image recognition device (col. 16, lines 34-35) comprising: at least one tangible, non-transitory, computer-readable memory configured to store a vocabulary of visual words each based on at least one descriptor relating to image information; (col. 16, lines 37-38 and 43-45. A memory configured to store a vocabulary as claimed in the instant application is implied by the recited feature to “obtain a vocabulary”, since the obtained vocabulary is used later for calculations, which implies that the obtained vocabulary is stored, and claim 28 of the ‘186 patent also recites “a main memory device”) and at least one processor coupled with the at least one tangible, non-transitory computer-readable memory and, upon execution of image recognition software instructions, is configured to: (col. 16, lines 36 and 39-42) identify a plurality of local features of an image based on the vocabulary, the local features being represented by a plurality of local descriptors; (col. 16, lines 46-48) determine an associated visual word in the vocabulary for each one of the plurality of local descriptors; (col. 16, lines 49-50) generate, prior to facilitating an image recognition search, a plurality of vectors for the image based on the associated visual words, wherein some of the plurality of vectors are generated using local descriptors corresponding to different versions of the image; (col. 16, lines 55-61) store the plurality of vectors; (col. 15, line 4) and facilitate the image recognition search to search a dataset. (col. 16, lines 62-64)
The claim 28 of the ‘186 patent does not disclose explicitly but Nister teaches, in the same field of endeavor of image recognition and retrieval, facilitate the image recognition search to search a dataset, wherein the image recognition search compares each of the stored plurality of vectors with a plurality of image signatures. (Nister: Abstract, Sections 4-5. In particular, equation 3 “compares each of the stored plurality of vectors with a plurality of image signatures”, while equations 5-6 compute normalized differences (or distances).) 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the claim 28 of the ‘186 patent’s disclosure with Nister’s teachings by combining the image recognition device (from the claim 28 of the ‘186 patent) with the technique of image database search by comparing image signatures (from Nister) to yield no more than predictable use of prior art elements according to their established functions since all the claimed elements, which are taught by prior art references, would continue to operate in the same manner, particularly, the image recognition device would still work in the way according to the claim 28 of the ‘186 patent and the technique of image database search by comparing image signatures would continue to function as taught by Nister. In fact, the inclusion of 
Therefore, it would have been obvious to combine the claim 28 of the ‘186 patent with Nister to obtain the invention as specified in claim 31.
In a similar way, the method claim 55 and computer readable medium claim 56 of the instant application are also rejected, respectively, by the corresponding method claim 1 and computer readable medium claim 29 of the ‘186 patent in view of Nister.
The dependent claims 32-54 are rejected as being obvious over the claim 29 of the ‘186 patent in view of the art of record relied upon in the rejections above, as applied to the claims above.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Nister et al. (US 20070214172 A1): “The novel hierarchical TF-IDF scoring uses hierarchically defined ‘visual words’ to build a novel vocabulary tree, i.e., hierarchically organized quantizer, Q, at 10, 30, applied in connection with novel image-insertion and image-query stages (respectively at 110 and 120 in FIG. 12). This allows efficient lookup (match 128, FIG. 12) of visual words, permitting use of a larger vocabulary (or database of hierarchically organized feature vectors), shown to result in a significant improvement of retrieval quality over conventional image retrieval techniques.” (Fig. 12 and [0067])

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park, can be reached on (571) 272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/FENG NIU/Primary Examiner, Art Unit 2669