DETAILED ACTION
Claims 1-6, 13-19 and 21-25 are currently pending.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
It is acknowledged that the application is a national stage entry of PCT/CN2018/097008 filed 25 July 2018 which claims foreign priority to 2017106322446.X filed 28 July 2017.  Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 21 January 2020; 7 July 2020; 30 October 2020; 21 May 2021; and 21 September 2021 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.

Specification
The abstract of the disclosure is objected to because the abstract is in claim form instead of narrative form.  Correction is required.  See MPEP § 608.01(b).
Applicant is reminded of the proper content of an abstract of the disclosure.
A patent abstract is a concise statement of the technical disclosure of the patent and should include that which is new in the art to which the invention pertains. The abstract should not refer to purported merits or speculative applications of the invention and should not compare the invention with the prior art.
If the patent is of a basic nature, the entire technical disclosure may be new in the art, and the abstract should be directed to the entire disclosure. If the patent is in the nature of an improvement in an old apparatus, process, product, or composition, the abstract should include the technical disclosure of the improvement. The abstract should also mention by way of example any preferred modifications or alternatives. 
Where applicable, the abstract should include the following: (1) if a machine or apparatus, its organization and operation; (2) if an article, its method of making; (3) if a chemical compound, its identity and use; (4) if a mixture, its ingredients; (5) if a process, the steps.
Extensive mechanical and design details of an apparatus should not be included in the abstract. The abstract should be in narrative form and generally limited to a single paragraph within the range of 50 to 150 words in length.
See MPEP § 608.01(b) for guidelines for the preparation of patent abstracts.

Claim Objections
Claims 2 and 13-18 are objected to because of the following informalities:  
Claim 2 recites “wherein acquiring target features …” whereas claim 1 states “obtaining target features.”  It is requested that the same term of “obtaining” be utilized in order to maintain consistency in the language used.
Claim 13, line 4 ends in a period instead of a semi-colon.
Claims 14-18 recite “The apparatus of claim 13.”  Claim 13 recites an electronic device.  Claims 14-18 should also recite “The electronic device” in order to provide antecedent basis and to maintain consistent terminology.  
Appropriate correction is required.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 1, 2, 4, 5, 13, 14, 16, 17, 19, 21, 23 and 24 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Patent No 10,810,252 to Kerr et al (hereafter Kerr).

Referring to claim 1, Kerr discloses an image retrieval method, comprising: 
acquiring a query image (see column 7, lines 41-43; column 9, lines 22-23 and column 12, lines 47-48 – A user initially submits an image to the visual similarity engine via a user device.); 
determining a target feature of the query image based on a pre-trained deep neural network (see column 4, lines 6-21; column 8, lines 28-49; column 9, lines 48-54; and column 12, lines 48-51 – Each of the one or more images comprises one or more attributes that may be identified by a neural network.); 
wherein the deep neural network is obtained by training according to sample images and predetermined features that are able to form the target feature and correspond to the sample images (see column 4, lines 6-21 – Deep neural networks are utilized to extract attributes of images, for example as a feature vector. Training images may be utilized to implement a generic system initially that identifies visual similarity generally, but without any understanding of specific attributes. The generic system may then be trained with a new set of training data for a specific attribute.); 
obtaining target features of a plurality of images to be retrieved (see column 8, line 50 – column 9, line 14; column 10, lines 46-56; and column 12, lines 51-54 – Classifier component may perform the search by implementing the same deep neural networks to extract attributes from a set of images, such as an image database.); 
calculating a similarity between the target feature of the query image and the target feature of each image to be retrieved (see column 5, line 11 – column 6, line 16; column 7, lines 43-48; column 12, lines 51-54 – The neural network may compare feature vectors corresponding to the visual-based query to feature vectors in the set of images to identify image results based on visual similarity.); and 
determining a retrieval image corresponding to the query image from the plurality of images to be retrieved according to the calculated similarities (see column 7, lines 48-50; column 13, lines 4-7 – The visual similarity engine can then return, to the user device, at least a portion of the set of result images as an image results set.).
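For illustration only, the sequence of steps recited in claim 1 (feature of the query image, features of the images to be retrieved, pairwise similarity, selection of a result) can be sketched as below. This is a minimal sketch and not the method of the applicant or of Kerr: the feature vectors are assumed to have already been produced by some pre-trained network, cosine similarity is an assumed choice of similarity measure, and the names `cosine_similarity`, `retrieve`, and `gallery` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve(query_feature, gallery_features):
    """Score every stored feature against the query and rank the results.

    gallery_features maps a (hypothetical) image identifier to its
    precomputed feature vector. Returns (image_id, similarity) pairs,
    most similar first.
    """
    scored = [
        (image_id, cosine_similarity(query_feature, feature))
        for image_id, feature in gallery_features.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy feature vectors standing in for network outputs (assumed values).
gallery = {"img_a": [1.0, 0.0], "img_b": [0.0, 1.0], "img_c": [1.0, 1.0]}
ranked = retrieve([1.0, 0.1], gallery)
```

In this toy data the first entry of `ranked` is the stored image whose feature vector points in nearly the same direction as the query vector.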
Referring to claim 2, Kerr discloses the method according to claim 1, wherein acquiring target features of a plurality of images to be retrieved comprises: 
obtaining the target features of the plurality of images to be retrieved stored in a preset database (see column 8, line 65 – column 9, line 1); or 
determining the target features of the plurality of images to be retrieved based on the pre-trained deep neural network (see column 10, lines 46-56 – Classifier component may perform the search by implementing the same deep neural networks to extract attributes from a set of images, such as an image database.).
Referring to claim 4, Kerr discloses the method according to claim 1, wherein the predetermined features are global features and the target feature is a global feature (see column 2, lines 16-37); and 
determining a target feature of the query image based on a pre-trained deep neural network comprises: inputting the query image into a pre-trained third deep neural network to obtain a global feature of the query image (see column 4, lines 6-21; column 8, lines 28-49; column 9, lines 48-54; and column 12, lines 48-51 – Each of the one or more images comprises one or more attributes that may be identified by a neural network.), wherein the third deep neural network is obtained by training according to the sample images and global features corresponding to the sample images (see column 4, lines 6-21 – Deep neural networks are utilized to extract attributes of images, for example as a feature vector. Training images may be utilized to implement a generic system initially that identifies visual similarity generally, but without any understanding of specific attributes. The generic system may then be trained with a new set of training data for a specific attribute.).
Referring to claim 5, Kerr discloses the method according to claim 1, wherein determining a retrieval image corresponding to the query image from the plurality of images to be retrieved according to the calculated similarities comprises: 
sorting the calculated similarities, and determining the retrieval image corresponding to the query image from the plurality of images to be retrieved according to results of the sorting (see column 11, lines 5-9 – Classifier component 112 may classify images based on each individual selection received by selection component. Results may then identify the images in the results sets based on an average score across all selections. Results component may rank images in the results set based on the selected weights.); or 
determining a target image to be retrieved among the plurality of images to be retrieved as the retrieval image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved with a similarity greater than a predetermined similarity threshold.
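The two alternatives recited in claim 5 (selection by sorting, or selection by a predetermined similarity threshold) can be contrasted with a short sketch. It is offered as an illustration under assumptions, not as the claimed or cited implementation: the similarity scores are assumed to be already computed, and the function names, the `top_k` parameter, and the sample scores are hypothetical.

```python
def select_by_ranking(similarities, top_k=1):
    """Sort the similarities in descending order and keep the top_k image ids."""
    ranked = sorted(similarities.items(), key=lambda pair: pair[1], reverse=True)
    return [image_id for image_id, _ in ranked[:top_k]]


def select_by_threshold(similarities, threshold):
    """Keep every image whose similarity is strictly greater than the threshold."""
    return [image_id for image_id, score in similarities.items() if score > threshold]


# Assumed similarity scores between one query image and three stored images.
scores = {"img_a": 0.91, "img_b": 0.40, "img_c": 0.77}
by_rank = select_by_ranking(scores, top_k=2)
by_threshold = select_by_threshold(scores, threshold=0.75)
```

The ranking variant always returns a fixed number of results, whereas the threshold variant may return none or many depending on the scores.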
Referring to claim 13, Kerr discloses an electronic device, comprising: 
a processor [one or more processors 814], communication interfaces [input/output components 820], a memory [memory 812], and a communication bus [bus 810], wherein the processor, the communication interfaces, and the memory communicate with each other via the communication bus (see Fig 8 and column 15, line 24 – column 16, line 45); 
the memory is configured to store a computer program (see column 16, lines 10-21);  
the processor is configured for executing a program stored in the memory to perform the following operations (see column 16, lines 10-21): 
acquiring a query image (see column 7, lines 41-43; column 9, lines 22-23 and column 12, lines 47-48 – A user initially submits an image to the visual similarity engine via a user device.); 
determining a target feature of the query image based on a pre-trained deep neural network (see column 4, lines 6-21; column 8, lines 28-49; column 9, lines 48-54; and column 12, lines 48-51 – Each of the one or more images comprises one or more attributes that may be identified by a neural network.); 
wherein the deep neural network is obtained by training according to sample images and predetermined features that are able to form the target feature and correspond to the sample images (see column 4, lines 6-21 – Deep neural networks are utilized to extract attributes of images, for example as a feature vector. Training images may be utilized to implement a generic system initially that identifies visual similarity generally, but without any understanding of specific attributes. The generic system may then be trained with a new set of training data for a specific attribute.); 
obtaining target features of a plurality of images to be retrieved (see column 8, line 50 – column 9, line 14; column 10, lines 46-56; and column 12, lines 51-54 – Classifier component may perform the search by implementing the same deep neural networks to extract attributes from a set of images, such as an image database.); 
calculating a similarity between the target feature of the query image and the target feature of each image to be retrieved (see column 5, line 11 – column 6, line 16; column 7, lines 43-48; column 12, lines 51-54 – The neural network may compare feature vectors corresponding to the visual-based query to feature vectors in the set of images to identify image results based on visual similarity.); and 
determining a retrieval image corresponding to the query image from the plurality of images to be retrieved according to the calculated similarities (see column 7, lines 48-50; column 13, lines 4-7 – The visual similarity engine can then return, to the user device, at least a portion of the set of result images as an image results set.).
Referring to claim 14, Kerr discloses the apparatus according to claim 13, wherein, acquiring target features of a plurality of images to be retrieved comprises: obtaining the target features of the plurality of images to be retrieved stored in a preset database (see column 8, line 65 – column 9, line 1); or determining the target features of the plurality of images to be retrieved based on the pre-trained deep neural network (see column 10, lines 46-56 – Classifier component may perform the search by implementing the same deep neural networks to extract attributes from a set of images, such as an image database.).
Referring to claim 16, Kerr discloses the apparatus according to claim 13, wherein the predetermined features are global features and the target feature is a global feature (see column 2, lines 16-37); and 
determining a target feature of the query image based on a pre-trained deep neural network comprises: inputting the query image into a pre-trained third deep neural network to obtain a global feature of the query image (see column 4, lines 6-21; column 8, lines 28-49; column 9, lines 48-54; and column 12, lines 48-51 – Each of the one or more images comprises one or more attributes that may be identified by a neural network.), wherein the third deep neural network is obtained by training according to the sample images and global features corresponding to the sample images (see column 4, lines 6-21 – Deep neural networks are utilized to extract attributes of images, for example as a feature vector. Training images may be utilized to implement a generic system initially that identifies visual similarity generally, but without any understanding of specific attributes. The generic system may then be trained with a new set of training data for a specific attribute.).
Referring to claim 17, Kerr discloses the apparatus according to claim 13, wherein determining a retrieval image corresponding to the query image from the plurality of images to be retrieved according to the calculated similarities comprises: sorting the calculated similarities, and determining the retrieval image corresponding to the query image from the plurality of images to be retrieved according to results of the sorting (see column 11, lines 5-9 – Classifier component 112 may classify images based on each individual selection received by selection component. Results may then identify the images in the results sets based on an average score across all selections. Results component may rank images in the results set based on the selected weights.); or determining a target image to be retrieved among the plurality of images to be retrieved as the retrieval image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved with a similarity greater than a predetermined similarity threshold.
Referring to claim 19, Kerr discloses a non-transitory computer readable storage medium with executable codes stored thereon, wherein the executable codes are executed to implement an image retrieval method, wherein the image retrieval method, comprises: 
acquiring a query image (see column 7, lines 41-43; column 9, lines 22-23 and column 12, lines 47-48 – A user initially submits an image to the visual similarity engine via a user device.); 
determining a target feature of the query image based on a pre-trained deep neural network (see column 4, lines 6-21; column 8, lines 28-49; column 9, lines 48-54; and column 12, lines 48-51 – Each of the one or more images comprises one or more attributes that may be identified by a neural network.); 
wherein the deep neural network is obtained by training according to sample images and predetermined features that are able to form the target feature and correspond to the sample images (see column 4, lines 6-21 – Deep neural networks are utilized to extract attributes of images, for example as a feature vector. Training images may be utilized to implement a generic system initially that identifies visual similarity generally, but without any understanding of specific attributes. The generic system may then be trained with a new set of training data for a specific attribute.); 
obtaining target features of a plurality of images to be retrieved (see column 8, line 50 – column 9, line 14; column 10, lines 46-56; and column 12, lines 51-54 – Classifier component may perform the search by implementing the same deep neural networks to extract attributes from a set of images, such as an image database.); 
calculating a similarity between the target feature of the query image and the target feature of each image to be retrieved (see column 5, line 11 – column 6, line 16; column 7, lines 43-48; column 12, lines 51-54 – The neural network may compare feature vectors corresponding to the visual-based query to feature vectors in the set of images to identify image results based on visual similarity.); and 
determining a retrieval image corresponding to the query image from the plurality of images to be retrieved according to the calculated similarities (see column 7, lines 48-50; column 13, lines 4-7 – The visual similarity engine can then return, to the user device, at least a portion of the set of result images as an image results set.).
Referring to claim 21, Kerr discloses the non-transitory computer readable storage medium according to claim 19, wherein acquiring target features of a plurality of images to be retrieved comprises: obtaining the target features of the plurality of images to be retrieved stored in a preset database (see column 8, line 65 – column 9, line 1); or determining the target features of the plurality of images to be retrieved based on the pre-trained deep neural network (see column 10, lines 46-56 – Classifier component may perform the search by implementing the same deep neural networks to extract attributes from a set of images, such as an image database.).
Referring to claim 23, Kerr discloses the non-transitory computer readable storage medium according to claim 19, wherein the predetermined features are global features and the target feature is a global feature (see column 2, lines 16-37); and 
determining a target feature of the query image based on a pre-trained deep neural network comprises: inputting the query image into a pre-trained third deep neural network to obtain a global feature of the query image (see column 4, lines 6-21; column 8, lines 28-49; column 9, lines 48-54; and column 12, lines 48-51 – Each of the one or more images comprises one or more attributes that may be identified by a neural network.), wherein the third deep neural network is obtained by training according to the sample images and global features corresponding to the sample images (see column 4, lines 6-21 – Deep neural networks are utilized to extract attributes of images, for example as a feature vector. Training images may be utilized to implement a generic system initially that identifies visual similarity generally, but without any understanding of specific attributes. The generic system may then be trained with a new set of training data for a specific attribute.).
Referring to claim 24, Kerr discloses the non-transitory computer readable storage medium according to claim 19, wherein determining a retrieval image corresponding to the query image from the plurality of images to be retrieved according to the calculated similarities comprises: sorting the calculated similarities, and determining the retrieval image corresponding to the query image from the plurality of images to be retrieved according to results of the sorting (see column 11, lines 5-9 – Classifier component 112 may classify images based on each individual selection received by selection component. Results may then identify the images in the results sets based on an average score across all selections. Results component may rank images in the results set based on the selected weights.); or determining a target image to be retrieved among the plurality of images to be retrieved as the retrieval image corresponding to the query image, wherein the target image to be retrieved is an image to be retrieved with a similarity greater than a predetermined similarity threshold.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3, 6, 15, 18, 22 and 25 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent No 10,810,252 to Kerr et al as applied to claims 1, 13 and 19 above, and further in view of US PGPub 2017/0206431 to Sun et al (hereafter Sun).

Referring to claims 3, 15 and 22, Kerr teaches wherein, the predetermined features are features of regions of interest, and the target feature is a feature aggregated with the features of the regions of interest [feature vector] (see column 5, line 35 – column 6, line 17 and column 8, line 65 – column 9, line 14 - the concept of using semantic similarity and training a neural network using certain shapes and relationships between the shapes); and determining a target feature of the query image based on a pre-trained deep neural network (see column 4, lines 6-21; column 8, lines 28-49; column 9, lines 48-54; and column 12, lines 48-51 – Each of the one or more images comprises one or more attributes that may be identified by a neural network.); and aggregating the target region-of-interest features into the target feature of the query image (see column 5, line 35 – column 6, line 17; column 8, line 65 – column 9, line 14; column 4, lines 6-21; column 8, lines 28-49; column 9, lines 48-54; and column 12, lines 48-51).  
While Kerr teaches the concept of using semantic similarity and training a neural network using certain shapes and relationships between the shapes (see column 5, line 35 – column 6, line 17), Kerr utilizes a single neural network and therefore fails to explicitly teach the further limitations of inputting the query image into a pre-trained first deep neural network to obtain a target region of interest of the query image, wherein the first deep neural network is obtained by training according to the sample images and regions of interest corresponding to the sample images; inputting the target region of interest into a pre-trained second deep neural network to obtain a target region-of-interest feature of the target region of interest, wherein the second deep neural network is obtained by training according to the regions of interest and region-of-interest features of the regions of interest; and aggregating the target region-of-interest features into the target feature of the query image.
Sun teaches the analysis of images to extract features using a neural network, including the further limitations of 
inputting the query image into a pre-trained first deep neural network [Deep CNN] to obtain a target region of interest of the query image, wherein the first deep neural network is obtained by training according to the sample images and regions of interest corresponding to the sample images (see [0045]; [0049]; [0059]; [0075]; and [0089]-[0091]); 
inputting the target region of interest into a pre-trained second deep neural network [RPN] to obtain a target region-of-interest feature of the target region of interest, wherein the second deep neural network is obtained by training according to the regions of interest and region-of-interest features of the regions of interest (see [0045]; [0060]; [0075]; and [0089]-[0091]); and 
aggregating the target region-of-interest features into the target feature of the query image (see [0061]; [0062]; [0075]; and [0089]-[0091]).
Kerr teaches the concept of a single neural network with a plurality of layers.  Sun teaches the concept of utilizing a first neural network which feeds data into a second neural network.  Both Kerr and Sun are analogous art in that they both extract features/attributes of images utilizing neural networks.  It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to replace the single neural network of Kerr with two separate neural networks as disclosed by Sun.  One would have been motivated to do so in order to increase the accuracy and speed while decreasing the hardships of object detection through the use of multiple neural networks (see Sun: [0001] and Kerr: column 1, line 48 – column 2, line 6).
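The two-network arrangement discussed above (a first network yielding regions of interest, a second network yielding a feature per region, followed by aggregation into one target feature) can be sketched as follows. This is a hedged illustration only: neither reference is asserted to use this code, the two networks are represented by caller-supplied stand-in functions, element-wise averaging is an assumed aggregation rule, and every name (`aggregate_roi_features`, `target_feature`, `toy_detector`, `toy_extractor`) is hypothetical.

```python
def aggregate_roi_features(roi_features):
    """Aggregate per-region feature vectors by element-wise mean (assumed rule)."""
    if not roi_features:
        raise ValueError("no region-of-interest features to aggregate")
    count = len(roi_features)
    return [sum(column) / count for column in zip(*roi_features)]


def target_feature(image, detect_rois, extract_roi_feature):
    """Two-stage pipeline: regions first, then one feature per region, then aggregation.

    detect_rois stands in for the first network and returns bounding boxes
    (x0, y0, x1, y1), which also serve as position information.
    extract_roi_feature stands in for the second network.
    """
    rois = detect_rois(image)
    features = [extract_roi_feature(image, roi) for roi in rois]
    return rois, aggregate_roi_features(features)


# Toy stand-ins for the two networks; a real system would call trained models.
def toy_detector(image):
    return [(0, 0, 2, 2), (2, 2, 4, 4)]


def toy_extractor(image, roi):
    x0, y0, x1, y1 = roi
    return [float(x1 - x0), float(y1 - y0), float(x0)]


positions, feature = target_feature(None, toy_detector, toy_extractor)
```

Returning the bounding boxes alongside the aggregated feature mirrors the position-information output recited in claims 6, 18 and 25.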
Referring to claim 6, the combination of Kerr and Sun (hereafter Kerr/Sun) teaches the method according to claim 3, wherein, after obtaining the target region of interest of the query image, the method further comprises: outputting position information of the target region of interest (Sun: see [0045]; [0060]; [0075]; and [0089]-[0091]).
Referring to claim 18, Kerr/Sun teaches the apparatus according to claim 15, wherein the processor is further configured to output position information of the target region of interest after obtaining the target region of interest of the query image (Sun: see [0045]; [0060]; [0075]; and [0089]-[0091]).
Referring to claim 25, Kerr/Sun teaches the non-transitory computer readable storage medium according to claim 22, wherein, after obtaining the target region of interest of the query image, the method further comprises: outputting position information of the target region of interest (Sun: see [0045]; [0060]; [0075]; and [0089]-[0091]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US PGPub 2020/0151577 to Ogawa et al - Paragraph [0243] teaches the use of a first neural network which feeds the results to a second neural network.

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KIMBERLY LOVEL WILSON whose telephone number is (571)272-2750. The examiner can normally be reached 8-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Robert Beausoliel can be reached on 571-272-3645. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KIMBERLY L WILSON/Primary Examiner, Art Unit 2167