DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 7/13/2020 and 1/28/2021 are being considered by the examiner.
Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference character(s) not mentioned in the description: second sample images 426 and instruction feature vector 432 in Fig. 4.  Corrected drawing sheets in compliance with 37 CFR 1.121(d), or amendment to the specification to add the reference character(s) in the description in compliance with 37 CFR 1.121(b) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office 
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 8 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over He et al. (CN 103020173 A, see attached machine translation) in view of Mao (CN 105354579 A, see attached machine translation).
	Regarding claim 1, He et al. teaches, a method for recognizing a target object in a target image, the method comprising (Abstract: a video image information searching method and a video image information searching system for a mobile terminal…The method includes the steps as follows: a terminal receives an interested image area in video files input by a user, obtains image objects in the interested image area; Note: the image objects in the interested image area are the target objects); 
obtaining, by a device comprising a memory storing instructions and a processor in communication with the memory, an image recognition instruction, the image recognition instruction carrying object identification information used for indicating a target object in a target image (Para. 0030: any process or method description in the flowchart or described in other ways herein can be understood as a module, segment or part of code that includes one or more executable instructions for implementing specific logical functions or steps of the process; 
 obtaining, by the device, an instruction feature vector matching the image recognition instruction (Para. 0100: specifically, the mobile terminal 100 is used to receive auxiliary information input by the user, where the auxiliary information includes indicating whether the image object is a first object or a second object, where the first object includes a person and an animal, and the second object includes a rigid object, and the auxiliary information When indicating that the image object is the first object, use biometric recognition technology to extract the gabor feature value of the image object; Note: a gabor feature value of the image object (i.e. feature vector) is extracted or obtained based on the auxiliary information (for 
obtaining, by the device, an image feature vector set matching the target image, the image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, and i being a positive integer; 
and recognizing, by the device, the target object from the target image according to the instruction feature vector and the image feature vector set.
He et al. does not expressly disclose the following limitations underlined above: obtaining, by the device, an image feature vector set matching the target image, the image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, and i being a positive integer; and recognizing, by the device, the target object from the target image according to the instruction feature vector and the image feature vector set.
However, Mao teaches, obtaining, by the device, an image feature vector set matching the target image, the image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, and i being a positive integer (Para. 0002: the present invention relates to the field of image processing technology, in particular to a method and device for feature detection; Para. 0012: performing scaling processing on the video image to obtain S video images of different scales; Para. 0013: use HOG feature descriptors to perform feature extraction on each of the S video images of different scales, so as to extract M1 feature values from the video images of each scale; Para. 0014: the M1 feature values respectively extracted from the S video images of different scales are combined into a feature set containing M1*S feature values to obtain M feature values, M=M1*S; Note: the S video images of different scales are the target images in an ith scale. The M1 feature values extracted from the S video images of different scales are the ith image feature vectors); 
and recognizing, by the device, the target object from the target image according to the instruction feature vector and the image feature vector set (Para. 0014: the M1 feature values respectively extracted from the S video images of different scales are combined into a feature set containing M1*S feature values to obtain M feature values, M=M1*S; Para. 0067: the feature vector corresponding to the GMM model is a high-dimensional Fisher Vector feature vector. This feature vector is used as the feature vector of a video image (video image including a human face), and the feature vector can be used for feature detection, such as for human Follow-up operations such as face recognition; Note: the target object is the human face in the video image. The image feature set M containing M1*S feature values is the image feature vector set and the M1 feature values are the instruction feature vectors).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate obtaining an image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, as taught by Mao, into the object recognition of He et al. in order to improve the recognition rate (Mao, Para. 0060).
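For illustration only, the multi-scale feature set cited above from Mao (Paras. 0012-0014) can be sketched as follows. This is a minimal sketch under stated assumptions and is not Mao's implementation: the helper names `rescale`, `orientation_histogram`, and `multiscale_feature_set` are hypothetical, the scale factors are arbitrary, and the single global orientation histogram is a simplified stand-in for a HOG descriptor. The sketch only shows how S scaled images with M1 values each yield a combined set of M = M1*S values.

```python
import math


def rescale(image, factor):
    """Nearest-neighbour rescale of a 2D list of floats (hypothetical helper)."""
    h, w = len(image), len(image[0])
    new_h, new_w = max(1, round(h * factor)), max(1, round(w * factor))
    return [[image[min(h - 1, int(r / factor))][min(w - 1, int(c / factor))]
             for c in range(new_w)]
            for r in range(new_h)]


def orientation_histogram(image, bins):
    """Simplified stand-in for HOG: one global histogram of gradient
    orientations, weighted by gradient magnitude (assumption, not real HOG)."""
    hist = [0.0] * bins
    h, w = len(image), len(image[0])
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = image[r][c + 1] - image[r][c - 1]
            gy = image[r + 1][c] - image[r - 1][c]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            angle = math.atan2(gy, gx) % math.pi  # unsigned orientation
            hist[min(bins - 1, int(angle / math.pi * bins))] += mag
    return hist


def multiscale_feature_set(image, scales, m1):
    """Return the S per-scale vectors (M1 values each) and the combined
    set of M = M1 * S values."""
    per_scale = [orientation_histogram(rescale(image, s), m1) for s in scales]
    combined = [value for vector in per_scale for value in vector]
    return per_scale, combined


# Usage: a 16x16 horizontal ramp, S = 3 scales, M1 = 9 values per scale.
image = [[float(c) for c in range(16)] for _ in range(16)]
per_scale, combined = multiscale_feature_set(image, [1.0, 0.5, 0.25], 9)
print(len(per_scale), len(combined))  # number of scales S and total M
```

In this sketch the entry of `per_scale` at index i - 1 plays the role of the ith image feature vector, and `combined` plays the role of the M-value feature set.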
Regarding claim 8, He et al. teaches, an apparatus for recognizing a target object in a target image, the apparatus comprising (Abstract: a video image information searching method and a video image information searching system for a mobile terminal…The method includes the steps as follows: a terminal receives an interested image area in video files input by a user, 
a memory storing instructions (Abstract: a video image information searching method and a video image information searching system for a mobile terminal; Para. 0030: any process or method description in the flowchart or described in other ways herein can be understood as a module, segment or part of code that includes one or more executable instructions for implementing specific logical functions or steps of the process; Note: the mobile terminal includes a memory that stores the instructions or code for processing of the image); 
and a processor in communication with the memory, wherein, when the processor executes the instructions, the processor is configured to cause the apparatus to (Abstract: a video image information searching method and a video image information searching system for a mobile terminal; Para. 0030: any process or method description in the flowchart or described in other ways herein can be understood as a module, segment or part of code that includes one or more executable instructions for implementing specific logical functions or steps of the process; Note: the mobile terminal is an apparatus that includes a memory and processor that carries out instructions or code stored in the memory (i.e. is connected to the memory) for processing of the image): 
obtain an image recognition instruction, the image recognition instruction carrying object identification information used for indicating a target object in a target image (Para. 0100: specifically, the mobile terminal 100 is used to receive auxiliary information input by the user, where the auxiliary information includes indicating whether the image object is a first 
obtain an instruction feature vector matching the image recognition instruction (Para. 0100: specifically, the mobile terminal 100 is used to receive auxiliary information input by the user, where the auxiliary information includes indicating whether the image object is a first object or a second object, where the first object includes a person and an animal, and the second object includes a rigid object, and the auxiliary information When indicating that the image object is the first object, use biometric recognition technology to extract the gabor feature value of the image object; Note: a gabor feature value of the image object (i.e. feature vector) is extracted or obtained based on the auxiliary information (for biometric recognition i.e. image recognition instruction)), 
obtain an image feature vector set matching the target image, the image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, and i being a positive integer, 
and recognize the target object from the target image according to the instruction feature vector and the image feature vector set.
He et al. does not expressly disclose the following limitations underlined above: obtain an image feature vector set matching the target image, the image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, and i being a positive integer, and recognize the target object from the target image according to the instruction feature vector and the image feature vector set.
However, Mao teaches, obtain an image feature vector set matching the target image, the image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, and i being a positive integer (Para. 0002: the present invention relates to the field of image processing technology, in particular to a method and device for feature detection; Para. 0012: performing scaling processing on the video image to obtain S video images of different scales; Para. 0013: use HOG feature descriptors to perform feature extraction on each of the S video images of different scales, so as to extract M1 feature values from the video images of each scale; Para. 0014: the M1 feature values respectively extracted from the S video images of different scales are combined into a feature set containing M1*S feature values to obtain M feature values, M=M1*S; Note: the S video images of different scales are the target images in an ith scale. The M1 feature values extracted from the S video images of different scales are the ith image feature vectors),
and recognize the target object from the target image according to the instruction feature vector and the image feature vector set (Para. 0014: the M1 feature values respectively extracted from the S video images of different scales are combined into a feature set containing 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate obtaining an image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, as taught by Mao, into the object recognition of He et al. in order to improve the recognition rate (Mao, Para. 0060).
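For illustration only, the Gabor feature value cited above from He et al. (Para. 0100) can be related to the standard real-valued Gabor function, a Gaussian envelope multiplied by a cosine carrier. The following is a minimal sketch and does not characterize He et al.'s implementation: the function names `gabor_kernel` and `gabor_feature_values`, the default parameter values, and the choice of one response per orientation are all assumptions made for this example.

```python
import math


def gabor_kernel(size, sigma, theta, wavelength, gamma=1.0, phase=0.0):
    """Real part of a Gabor function sampled on an odd-sized square grid."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2)
                                / (2.0 * sigma * sigma))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + phase)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_feature_values(patch, orientations, sigma=2.0, wavelength=4.0):
    """One filter response per orientation for a square, odd-sized patch
    (hypothetical feature layout)."""
    size = len(patch)
    values = []
    for k in range(orientations):
        kernel = gabor_kernel(size, sigma, k * math.pi / orientations,
                              wavelength)
        values.append(sum(kernel[r][c] * patch[r][c]
                          for r in range(size) for c in range(size)))
    return values


# Usage: a 5x5 patch with a vertical edge, filtered at 4 orientations.
patch = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
print(gabor_feature_values(patch, 4))
```

The list returned by `gabor_feature_values` is one possible form of a feature vector built from Gabor responses; the cited paragraph does not specify the filter parameters or the number of orientations.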
Regarding claim 15, He et al. teaches, a non-transitory computer readable storage medium storing computer readable instructions, the computer readable instructions, when executed by a processor, causing the processor to perform (Abstract: a video image information searching method and a video image information searching system for a mobile terminal; Para. 0030: any process or method description in the flowchart or described in other ways herein can be understood as a module, segment or part of code that includes one or more executable instructions for implementing specific logical functions or steps of the process; Note: the mobile terminal includes a memory (i.e. non-transitory computer readable storage medium) and a processor that carries out instructions or code stored in the memory (i.e. is connected to the memory) for processing of the image): 

 obtaining an instruction feature vector matching the image recognition instruction (Para. 0100: specifically, the mobile terminal 100 is used to receive auxiliary information input by the user, where the auxiliary information includes indicating whether the image object is a first object or a second object, where the first object includes a person and an animal, and the second object includes a rigid object, and the auxiliary information When indicating that the image object is the first object, use biometric recognition technology to extract the gabor feature value of the image object; Note: a gabor feature value of the image object (i.e. feature vector) is extracted or obtained based on the auxiliary information (for biometric recognition 
obtaining an image feature vector set matching the target image, the image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, and i being a positive integer; 
and recognizing the target object from the target image according to the instruction feature vector and the image feature vector set.
He et al. does not expressly disclose the following limitations underlined above: obtaining an image feature vector set matching the target image, the image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, and i being a positive integer; and recognizing the target object from the target image according to the instruction feature vector and the image feature vector set.
However, Mao teaches, obtaining an image feature vector set matching the target image, the image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, and i being a positive integer (Para. 0002: the present invention relates to the field of image processing technology, in particular to a method and device for feature detection; Para. 0012: performing scaling processing on the video image to obtain S video images of different scales; Para. 0013: use HOG feature descriptors to perform feature extraction on each of the S video images of different scales, so as to extract M1 feature values from the video images of each scale; Para. 0014: the M1 feature values respectively extracted from the S video images of different scales are combined into a feature set containing M1*S feature values to obtain M feature values, M=M1*S; Note: the S video images of different scales are the target images in an ith scale. The M1 feature values extracted from the S video images of different scales are the ith image feature vectors);
and recognizing the target object from the target image according to the instruction feature vector and the image feature vector set (Para. 0014: the M1 feature values respectively extracted from the S video images of different scales are combined into a feature set containing M1*S feature values to obtain M feature values, M=M1*S; Para. 0067: the feature vector corresponding to the GMM model is a high-dimensional Fisher Vector feature vector. This feature vector is used as the feature vector of a video image (video image including a human face), and the feature vector can be used for feature detection, such as for human Follow-up operations such as face recognition; Note: the target object is the human face in the video image. The image feature set M containing M1*S feature values is the image feature vector set and the M1 feature values are the instruction feature vectors).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate obtaining an image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, as taught by Mao, into the object recognition of He et al. in order to improve the recognition rate (Mao, Para. 0060).
Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over He et al. (CN 103020173 A, see attached machine translation) in view of Mao (CN 105354579 A, see attached machine translation) and further in view of Liu et al. (US 2018/0039853 A1).
Regarding claim 7, He et al. teaches, the method according to claim 1, wherein after the recognizing, by the device, the target object from the target image, the method further comprises (Abstract: a video image information searching method and a video image 
performing, by the device, an image processing operation on the target object, the image processing operation comprising at least one of the following operations: a cropping operation on the target object, or an editing operation on the target object.
He et al. does not expressly disclose the following limitations in claim 1 from which claim 7 depends: obtaining, by the device, an image feature vector set matching the target image, the image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, and i being a positive integer; and recognizing, by the device, the target object from the target image according to the instruction feature vector and the image feature vector set.
However, Mao teaches, obtaining, by the device, an image feature vector set matching the target image, the image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, and i being a positive integer (Para. 0002: the present invention relates to the field of image processing technology, in particular to a method and device for feature detection; Para. 0012: performing scaling processing on the video image to obtain S video images of different scales; Para. 0013: use HOG feature descriptors to perform feature extraction on each of the S video images of different scales, so as to extract M1 feature values from the video images of each scale; Para. 0014: the M1 feature values respectively extracted from the S video images of different scales are combined into a feature set containing M1*S feature values to obtain M feature values, M=M1*S; Note: the S video images of different scales are the target images in an ith scale. The M1 feature values extracted from the S video images of different scales are the ith image feature vectors); 
and recognizing, by the device, the target object from the target image according to the instruction feature vector and the image feature vector set (Para. 0014: the M1 feature values respectively extracted from the S video images of different scales are combined into a feature set containing M1*S feature values to obtain M feature values, M=M1*S; Para. 0067: the feature vector corresponding to the GMM model is a high-dimensional Fisher Vector feature vector. This feature vector is used as the feature vector of a video image (video image including a human face), and the feature vector can be used for feature detection, such as for human Follow-up operations such as face recognition; Note: the target object is the human face in the video image. The image feature set M containing M1*S feature values is the image feature vector set and the M1 feature values are the instruction feature vectors).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate obtaining an image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, as taught by Mao, into the object recognition of He et al. in order to improve the recognition rate (Mao, Para. 0060).
The combination of He et al. and Mao does not expressly disclose the following limitation underlined above: performing, by the device, an image processing operation on the target object, the image processing operation comprising at least one of the following operations: a cropping operation on the target object, or an editing operation on the target object.
However, Liu et al. teaches, performing, by the device, an image processing operation on the target object, the image processing operation comprising at least one of the following operations: a cropping operation on the target object, or an editing operation on the target object (Para. 0028: FIG. 4A shows a procedure of resizing a target region image and a context region image in an image. When the proposal box 15 is applied to the image 10, the neural networks 200 crops the target region image corresponding to the proposal box 15 and resizes the target region image to a resized target image 16, and the resized target image 16 is transmitted to the first DCNN 210).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate cropping the target image, as taught by Liu et al., into the combined object recognition of He et al. and Mao in order to improve detection of small objects (Liu et al., Para. 0028).
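For illustration only, the crop-then-resize procedure cited above from Liu et al. (Para. 0028) can be sketched as follows. This is a minimal sketch and not Liu et al.'s implementation: the function names `crop` and `resize_nearest`, the (top, left, bottom, right) box layout, and the use of nearest-neighbour resampling are assumptions made for this example.

```python
def crop(image, box):
    """Cut the region given by box = (top, left, bottom, right) out of a
    2D list; bottom and right are exclusive (hypothetical box layout)."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2D list to out_h x out_w."""
    h, w = len(image), len(image[0])
    return [[image[r * h // out_h][c * w // out_w] for c in range(out_w)]
            for r in range(out_h)]


# Usage: crop a 2x2 proposal box from a 6x6 image, then resize it to 4x4.
image = [[r * 10 + c for c in range(6)] for r in range(6)]
target_region = crop(image, (1, 2, 3, 4))
resized_target = resize_nearest(target_region, 4, 4)
print(target_region)   # [[12, 13], [22, 23]]
print(resized_target)
```

Here `target_region` stands in for the cropped target region image and `resized_target` for the resized target image that would be passed on to a downstream network.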
Regarding claim 14, He et al. teaches, the apparatus according to claim 8, wherein, after the processor is configured to cause the apparatus to recognize the target object from the target image, the processor is configured to cause the apparatus to (Abstract: a video image information searching method and a video image information searching system for a mobile terminal…The method includes the steps as follows: a terminal receives an interested image area in video files input by a user, obtains image objects in the interested image area; Note: the image objects in the interested image area are the target objects): 
perform an image processing operation on the target object, the image processing operation comprising at least one of the following operations: a cropping operation on the target object, or an editing operation on the target object.
He et al. does not expressly disclose the following limitations in claim 8 from which claim 14 depends: obtain an image feature vector set matching the target image, the image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, and i being a positive integer, and recognize the target object from the target image according to the instruction feature vector and the image feature vector set.
However, Mao teaches, obtain an image feature vector set matching the target image, the image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, and i being a positive integer (Para. 0002: the present invention relates to the field of image processing technology, in particular to a method and device for feature detection; Para. 0012: performing scaling processing on the video image to obtain S video images of different scales; Para. 0013: use HOG feature descriptors to perform feature extraction on each of the S video images of different scales, so as to extract M1 feature values from the video images of each scale; Para. 0014: the M1 feature values respectively extracted from the S video images of different scales are combined into a feature set containing M1*S feature values to obtain M feature values, M=M1*S; Note: the S video images of different scales are the target images in an ith scale. The M1 feature values extracted from the S video images of different scales are the ith image feature vectors),
and recognize the target object from the target image according to the instruction feature vector and the image feature vector set (Para. 0014: the M1 feature values respectively extracted from the S video images of different scales are combined into a feature set containing 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate obtaining an image feature vector set comprising an ith image feature vector for indicating an image feature of the target image in an ith scale, as taught by Mao, into the object recognition of He et al. in order to improve the recognition rate (Mao, Para. 0060).
The combination of He et al. and Mao does not expressly disclose the following limitation underlined above: perform an image processing operation on the target object, the image processing operation comprising at least one of the following operations: a cropping operation on the target object, or an editing operation on the target object.
However, Liu et al. teaches, perform an image processing operation on the target object, the image processing operation comprising at least one of the following operations: a cropping operation on the target object, or an editing operation on the target object (Para. 0028: FIG. 4A shows a procedure of resizing a target region image and a context region image in an image. When the proposal box 15 is applied to the image 10, the neural networks 200 crops the target region image corresponding to the proposal box 15 and resizes the target region 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate cropping the target image, as taught by Liu et al., into the combined object recognition of He et al. and Mao in order to improve detection of small objects (Liu et al., Para. 0028).
Allowable Subject Matter
Claims 2-6, 9-13 and 16-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
The prior art of record teaches the claimed limitations as rejected in the independent claims above. However, the prior art of record fails to show:
Claim 2 (including all of the limitations of the base claim and any intervening claims) is directed to indicating an image feature vector of the target image obtained through a first neural network model and obtaining a change image feature vector in the image feature vector set obtained through a second neural network model.
Claim 9 (including all of the limitations of the base claim and any intervening claims) is directed to indicating an image feature vector of the target image obtained through a first neural network model and obtaining a change image feature vector in the image feature vector set obtained through a second neural network model.
Claim 16 (including all of the limitations of the base claim and any intervening claims)  is 
No explicit teaching of the above limitations is found in the closest prior art cited in the rejections with respect to claims 2, 9 and 16. Claims 3-6, 10-13 and 17-20 depend from claims 2, 9 and 16, respectively, and would therefore be allowable for the same reasons as set forth above.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Wang et al. (US 2016/0140424 A1) teaches object-centric fine-grained image classification.
Xu et al. (CN 103824067 A, see attached machine translation) teaches a method for positioning and recognizing the main target of an image.
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Daniella M. DiGuglielmo whose telephone number is (571)272-2682.  The examiner can normally be reached on Monday - Friday 7:30 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Daniella M. DiGuglielmo/Examiner, Art Unit 2664
/NAY A MAUNG/Supervisory Patent Examiner, Art Unit 2664





8/16/2021