DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 4, 5, 7, 9, 10, 11, 13, 14, 16, 18, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Elkhamy et al. (US 20170344808 A1) in view of Suh et al. (US Pub No. 20190138792 A1).

Regarding Claim 1, 
Elkhamy discloses An image processing method, comprising: 
features are extracted from image to obtain feature map)

performing key point detection on the first feature map according to the first region to determine target key point information of the to-be-processed image.  (Elkhamy, [0031], discloses an alignment parameter regression layer may be applied only on the feature map regions corresponding to regions after non-maximum suppression of detection boxes at different scales.  In an embodiment, the alignment layer may be applied to only patches of the feature maps that corresponds to the regions of interest.  The training procedure may depend on the annotations of the test set.  If key point or detection box annotations exist for both aligned and non-aligned images, the alignment layer may be added after the key point regression layer has been trained, and then the network is fine-tuned end-to-end to learn the affine transformations.  In another embodiment, both the key point regression layer and the alignment layer are trained simultaneously in key points are determined from feature map to determine object (target) key points in an image)

		Elkhamy does not explicitly disclose performing target region prediction on the first feature map to determine a first region where a target is located in the first feature map; 
		Suh discloses performing target region prediction on the first feature map to determine a first region where a target is located in the first feature map; (Suh, Abstract, discloses at least one example embodiment discloses a method of extracting a feature from an input image.  The method may include detecting landmarks from the input image, detecting physical characteristics between the landmarks based on the landmarks, determining a target area of the input image from which at least one feature is to be extracted and an order of extracting the feature from the target area based on the physical characteristics and extracting the feature based on the determining; target area is predicted from identified landmarks (feature maps))

Accordingly, it would have been obvious to one of ordinary skill in the art to modify Elkhamy, which is directed to obtaining features and feature maps to classify objects in images, with Suh, which is directed to further localizing a target in an image using key points in feature maps. One would be motivated to modify Elkhamy's object classification using feature maps and key points with the teachings of Suh, which localize objects in images using key points in image feature maps, to provide a simple and efficient tool for 

Regarding Claim 2, 
The combination of Elkhamy and Suh further discloses performing bounding box regression on the target key point information of each target in the to-be-processed image respectively to determine a second region of the each target; (Elkhamy, [0053], discloses the outputs of the classification score generator 323, and the bounding box regression generator 340 may be coupled to the RoI pooler 350, which may pool or group regions of interest.  In one embodiment, the RoI Pooler 350 may extract the features contained in each bounding box.  The output of the alignment parameter regression network 330 and the output of the RoI pooler 350 are coupled to the affine transformation network 360.  In one embodiment, the affine transformation network 360 may perform alignment on feature maps instead on the original input image.  For example, the affine transformation 360 may perform the appropriate transformation (scale, rotation etc.) for the features in each bounding box; bounding boxes are determined for areas according to features extracted) and 

determining a target recognition result of the to-be-processed image according to the second region of the each target.  (Elkhamy, [0031], discloses an alignment parameter regression layer may be applied only on the feature map regions corresponding to regions after non-maximum suppression of detection boxes at different scales.  In an embodiment, the alignment layer may be applied to only patches of the feature maps target (object) regions are determined). Additionally, the rationale and motivation to combine the references Elkhamy and Suh as applied in claim 1 apply to this claim. 


Regarding Claim 4, 
wherein performing key point detection on the first feature map according to the first region to determine the target key point information of the to-be-processed image comprises: performing key point feature extraction on the first feature map to obtain a third feature map; (Elkhamy, [0031], discloses an alignment parameter regression layer may be applied only on the feature map regions corresponding to regions after non-maximum suppression of detection boxes at different scales.  In an embodiment, the alignment layer may be applied to only patches of the feature maps that corresponds to the regions of interest.  The training procedure may depend on the annotations of the test set.  If keypoint or detection box annotations exist for both aligned and non-aligned target (object) regions are determined). 
determining a plurality of key points of the target from a feature region corresponding to the first region in the third feature map; and determining the target key point information of the to-be-processed image according to positions of the plurality of key points in the third feature map.  (Elkhamy, [0053], discloses the outputs of the classification score generator 323, and the bounding box regression generator 340 may be coupled to the RoI pooler 350, which may pool or group regions of interest.  In one embodiment, the RoI Pooler 350 may extract the features contained in each bounding box.  The output of the alignment parameter regression network 330 and the output of the RoI pooler 350 are coupled to the affine transformation network 360.  In one embodiment, the affine transformation network 360 may perform alignment on feature maps instead on the original input image.  For example, the affine transformation 360 may perform the appropriate transformation (scale, rotation etc.) for the features in each bounding box; bounding boxes are determined for areas according to features extracted; pluralities of regions and feature maps are determined according to key points obtained from feature maps). Additionally, the rationale and motivation to combine the references Elkhamy and Suh as applied in claim 1 apply to this claim.

Regarding Claim 5, 
wherein determining the plurality of key points of the target from the feature region corresponding to the first region in the third feature map comprises: performing key point detection on each channel of the feature region respectively to obtain key points corresponding to the each channel.  (Elkhamy, [0053], discloses the outputs of the classification score generator 323, and the bounding box regression generator 340 may be coupled to the RoI pooler 350, which may pool or group regions of interest.  In one embodiment, the RoI Pooler 350 may extract the features contained in each bounding box.  The output of the alignment parameter regression network 330 and the output of the RoI pooler 350 are coupled to the affine transformation network 360.  In one embodiment, the affine transformation network 360 may perform alignment on feature maps instead on the original input image.  For example, the affine transformation 360 may perform the appropriate transformation (scale, rotation etc.) for the features in each bounding box; bounding boxes are determined for areas according to features extracted; pluralities of regions and feature maps are determined according to key points obtained from feature maps). Additionally, the rationale and motivation to combine the references Elkhamy and Suh as applied in claim 1 apply to this claim.


Regarding Claim 7, 
		The combination of Elkhamy and Suh further discloses wherein the method is implemented via a neural network, and the method further comprises: training the 
two-dimensional (2D) feature map that reflects the probability that the region belongs to the face of interest.  This may be extended to generate multiple object classifications for multiple objects or regions of interest by having each corresponding subregion of feature maps at the output represent the probabilities that corresponding subregion may belong to each respective class in a classification domain; neural network using images to train and classify or label the objects is disclosed). Additionally, the rationale and motivation to combine the references Elkhamy and Suh as applied in claim 1 apply to this claim. 

Regarding Claim 9, 
		The combination of Elkhamy and Suh further discloses wherein the target in the to-be-processed image comprises any one of a face, a body, or hands. (Elkhamy, [0022], discloses output of the input interface 110 may be coupled to an input of the neural network using images to train and classify or label the regions such as face is disclosed). Additionally, the rationale and motivation to combine the references Elkhamy and Suh as applied in claim 1 apply to this claim. 


Claims 10, 11, 13, 14, 16 and 18 recite an apparatus with elements corresponding to the method steps recited in Claims 1, 2, 4, 5, 7 and 9 respectively. Therefore, the recited elements of apparatus Claims 10, 11, 13, 14, 16 and 18 are mapped to the proposed combination in the same manner as the corresponding steps of Claims 1, 2, 4, 5, 7 and 9 respectively. Additionally, the rationale and motivation to combine the Elkhamy and Suh references presented in the rejection of Claim 1 apply to these claims.


Claims 19-20 recite a computer-readable storage medium with program instructions corresponding to the method steps recited in Claims 1-2 respectively. Therefore, the recited instructions of computer-readable medium Claims 19-20 are mapped to the proposed combination in the same manner as the corresponding steps of Claims 1-2 respectively. Additionally, the rationale and motivation to combine the Elkhamy and Suh references presented in the rejection of Claim 1 apply to these claims.

Furthermore, the combination of Elkhamy and Suh further discloses A non-transitory computer-readable storage medium, having computer program instructions stored thereon, wherein when the computer program instructions are executed by a processor, the processor is caused to perform the operations  (Elkhamy, [0018],  [0056], Fig. 1, Fig. 4, discloses an electronic device 400 that includes one or more integrated circuits (chips) having a system to recognize a face and/or an object in an image according to the subject matter disclosed herein.  Electronic device 400 may be used in, but not limited to, a computing device, a personal digital assistant (PDA), a laptop computer, a mobile computer, a web tablet, a wireless phone, a cell phone, a smart phone, a digital music player, or a wireline or wireless electronic device.  The electronic device 400 may include a controller 410, an input/output device 420 such as, but not limited to, a keypad, a keyboard, a display, or a touch-screen display, a memory 430, and a wireless interface 440 that are coupled to each other through a bus 450.  The controller 410 may include, for example, at least one microprocessor, at least one digital signal process, at least one microcontroller, or the like.  The memory 430 may be configured to store a command code to be used by the controller 410 or a user data; an embodiment of a system 100 for face recognition according to the subject matter disclosed herein.  In one embodiment, the system 100 includes a unified architecture, multi-task deep-learning machine.  Each of the respective blocks of the system 100 depicted in FIG. 1 may represent a component and/or a module of a system, or alternatively, an operation in a method.  As used herein, the term module refers to any 

be embodied as a software package, code and/or instruction set or instructions, and the term "hardware," as used in any implementation described herein, may include, for example, singly or in any combination, hardwired circuitry, programmable circuitry, state machine circuitry, and/or firmware that stores instructions executed by programmable circuitry.  The modules may, collectively or individually, be embodied as circuitry that forms part of a larger system, for example, but not limited to, an integrated circuit (IC), system on-chip (SoC) and so forth). 

Allowable Subject Matter
Claims 3, 6, 8, 12, 15 and 17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PINALBEN V PATEL whose telephone number is (571)270-5872.  The examiner can normally be reached on M-F: 10am - 8pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph, can be reached on (571)272-8243.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Pinalben Patel/Examiner, Art Unit 2661