DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is responsive to communications filed on 04/21/2020. Claims 1-14 are pending in the instant application. Claims 1, 6 and 14 are independent. An Office Action on the merits follows here below.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 06/16/2020 and 11/20/2020 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Jiang (US 20150054824 A1) in combination with Gu (US 20190180469 A1).

Regarding Claim 1: Jiang discloses a method (Refer at least to para [038]; “A flowchart for an object detection method in accordance with one mode of embodiment of the application is described with reference to FIG. 2 firstly. In the object detection method, a specific object is to be determined from a target image.”) comprising: obtaining a target candidate region in a to-be-detected image  (Refer to para [039-040]; “Step S201 is an object detection step in which the specific object is detected in the image by a specific object detector. The specific object detector may be a general detector for detecting the specific object, e.g., a person, in the image. According to one embodiment, firstly, a plurality of regions is collected from the image or video including the image.”) determining at least two part candidate regions from the target candidate region by using an image segmentation network (Refer to para [044-050]; “the image is divided into a specific object and a background or a specific scene for the sake of ease of description, the background or the specific scene excludes the specific object.”) wherein each part candidate region corresponds to one part of a to-be-detected target (Refer to para [053]; “…the acquired images or video frames may be divided into a plurality of image regions with different positions and dimensions upon the start of the specific object detection. The regions exclude the specific object to be detected in those image regions may be taken as the samples. For example, a plurality of image regions may be labelled as the samples on those images or video frames manually by a user. Alternatively, the regions as the background provided by the object 

While Jiang teaches a neural network, Jiang does not expressly teach a recurrent neural network such as LSTM.

Gu teaches “receiving video data representing a sequence of image frames including at least one head and extracting, by a neural network, spatial features comprising pitch, yaw, and roll angles of the at least one head from the video data.” 

More specifically, Gu teaches a recurrent neural network such as a bidirectional long short-term memory (LSTM) network (Refer to para [022]; “At step 130, the spatial features for two or more image frames in the sequence of image frames are processed by a recurrent neural network (RNN) to produce head pose estimate for the at least one head. In one embodiment, the RNN is a gated recurrent unit (GRU) neural network. In one embodiment, the RNN is a long short-term memory (LSTM) neural network. In one embodiment, the RNN is a fully connected RNN (FC-RNN). In one embodiment, the neural network is trained separately from the RNN.”).

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Jiang by adding a recurrent neural network as taught above by Gu.

The suggestion/motivation for combining the teachings of Jiang and Gu would have been in order to improve feature detection such that “dynamic estimation and tracking of features in video image data so that the analysis system receives color data (e.g., RGB component values), without depth, as an 

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Jiang and Gu in order to obtain the specified claimed elements of Claim 1. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.

Regarding Claim 14: Jiang discloses a computer device (Refer to para [035]; “FIG. 9 illustrates a block diagram of a hardware configuration of a computer system in which the embodiments of the application may be implemented.”) wherein the computer device comprises: a processor (Refer to para [134 and 136]) and a memory, wherein the memory is configured to store a program instruction, and when the processor invokes the program instruction (Refer to para [134]; “In addition, a program can be executed by a computer (processor). Further, a program can be processed by plurality of computers in a distributed manner. Moreover, a program can be transmitted to a remote computer to be executed.”) the program instruction enables the processor to perform a method according to the following steps: obtaining a target candidate region in a to-be-detected image (Refer to para [039-040]; “Step S201 is an object detection step in which the specific object is detected in the image by a specific object detector. The specific object detector may be a general detector for detecting the specific object, e.g., a person, in the image. According to one embodiment, firstly, a plurality of regions is collected from the image or video including the image.”)  determining at least two part candidate regions from the target candidate region by using an image segmentation network (Refer to para [044-050]; “the image is divided into a specific object and a background or a specific scene for the sake of ease of description, the background or the specific scene excludes the specific object.”) wherein each part candidate region corresponds to one part of a to-be-detected target 

While Jiang teaches a neural network, Jiang does not expressly teach a recurrent neural network such as LSTM.

Gu teaches “receiving video data representing a sequence of image frames including at least one head and extracting, by a neural network, spatial features comprising pitch, yaw, and roll angles of the at least one head from the video data.” 

More specifically, Gu teaches a recurrent neural network such as a bidirectional long short-term memory (LSTM) network (Refer to para [022]; “At step 130, the spatial features for two or more image frames in the sequence of image frames are processed by a recurrent neural network (RNN) to produce head pose estimate for the at least one head. In one embodiment, the RNN is a gated recurrent unit (GRU) neural network. In one embodiment, the RNN is a long short-term memory (LSTM) neural network. In one embodiment, the RNN is a fully connected RNN (FC-RNN). In one embodiment, the neural network is trained separately from the RNN.”).

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Jiang by adding a recurrent neural network as taught above by Gu.



The suggestion/motivation for combining the teachings of Jiang and Gu would have been in order to improve feature detection such that “dynamic estimation and tracking of features in video image data so that the analysis system receives color data (e.g., RGB component values), without depth, as an input and is trained using a large-scale synthetic dataset to estimate and track either poses or three-dimensional (3D) positions of landmarks.” (at para [018], Gu).

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Jiang and Gu in order to obtain the specified claimed elements of Claim 14. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.

Claims 6 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over He et al. (US 20190318502 A1) in combination with Gu (US 20190180469 A1).

Regarding Claim 6: He discloses a method (Refer to Figure 2 and para [010]; “FIG. 2 is an illustration of an exemplary flow diagram of a method for feature descriptor neural network training and/or matching.”) comprising: obtaining a target candidate region in a to-be-detected image (Refer to para [032]; “An image capture device 20 may be mounted on a vehicle, for example, and be used to capture an image, such as an input image. According to another aspect, the image capture device 20 may be the image capture device 20 of a mobile device, which may be mounted in a vehicle or be handheld. A server 30 may house a set of images, such as a reference set of images of a point of interest, for 

While He teaches a convolutional neural network (CNN), He does not expressly teach a recurrent neural network such as LSTM.

Gu teaches “receiving video data representing a sequence of image frames including at least one head and extracting, by a neural network, spatial features comprising pitch, yaw, and roll angles of the at least one head from the video data.” 

More specifically, Gu teaches a recurrent neural network such as a bidirectional long short-term memory (LSTM) network (Refer to para [022]; “At step 130, the spatial features for two or more image frames in the sequence of image frames are processed by a recurrent neural network (RNN) to produce head pose estimate for the at least one head. In one embodiment, the RNN is a gated recurrent unit (GRU) neural network. In one embodiment, the RNN is a long short-term memory 

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify He by adding a recurrent neural network as taught above by Gu.

The suggestion/motivation for combining the teachings of He and Gu would have been in order to improve feature detection such that “dynamic estimation and tracking of features in video image data so that the analysis system receives color data (e.g., RGB component values), without depth, as an input and is trained using a large-scale synthetic dataset to estimate and track either poses or three-dimensional (3D) positions of landmarks.” (at para [018], Gu).

Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of He and Gu in order to obtain the specified claimed elements of Claim 6. It is for at least the aforementioned reasons that the Examiner has reached a conclusion of obviousness with respect to the claim in question.

Regarding Claim 9: He discloses using the positive sample image feature of each part and the negative sample image feature of each part as input of the part identification model, and learning, by using the part identification model and by using a binary classification problem distinguishing between a target part and a background as a learning task, a capability of obtaining a local image feature of the part (Refer to para [040]; “The local feature descriptors may be of different types (e.g., a first type, a second type, binary, real-valued, etc.), such as a binary descriptor or a real-valued descriptor. In other words, the first set of local feature descriptors or the second set of local feature descriptors may be the binary descriptor or the real-valued descriptor. Using these local feature .
Allowable Subject Matter
Claims 2-5, 7, 8, and 10-13 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
US 20210201504 A1
US 10229346 B1
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MIA M THOMAS whose telephone number is (571)270-1583.  The examiner can normally be reached on M-Th 8:30am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Edward (Ed) Urban can be reached on 572-272-7899.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may 


MIA M. THOMAS
Primary Examiner
Art Unit 2665



/MIA M THOMAS/
Primary Examiner
Art Unit 2665