DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.



Information Disclosure Statement
The information disclosure statements (IDS) filed on 7/30/2021 and 6/10/2022 were considered and placed on the file of record by the examiner.

	

Drawings
The subject matter of this application admits of illustration by a drawing to facilitate understanding of the invention.  Applicant is required to furnish a drawing under 37 CFR 1.81(c).  No new matter may be introduced in the required drawing.  Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d).  The applicant is required to illustrate the structure recited in the claims and the relationship among the probabilities, deviations, candidate boxes, and anchor boxes.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1, 8, and 15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.  Claims 2-7 and 9-14 are rejected based on their dependency.
The following elements of claims 1, 8, and 15 are vague and indefinite.  The examiner suggests that the applicant amend the claims to define these elements more clearly: “determining, for a pixel in the to-be-processed image, a probability that each anchor box of at least one anchor box arranged for the pixel includes the to-be-tracked target, and determining a deviation of the candidate box corresponding to each anchor box relative to each anchor box; determining, based on positions of at least two anchor boxes corresponding to at least two probabilities among the probabilities and deviations corresponding to the at least two anchor boxes respectively, candidate positions of the to-be-tracked target corresponding to the at least two anchor boxes respectively.”
The following element of claims 1, 8, and 15 is vague and indefinite because it is not clear how the at least two candidate positions are combined: “combining at least two candidate positions among the candidate positions to obtain a position of the to-be-tracked target in the to-be-processed image.”


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 2, 5, 7, 8, 9, 12, 14, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Ren et al. (Non-patent literature titled “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”) in view of Cha et al. (US 2020/0175352).

Regarding claim 1, Ren teaches a method for tracking a target, comprising: generating, based on a region proposal network and a feature map of a to-be-processed image, a position of a candidate box of a to-be-tracked target in the to-be-processed image (see figure 2, section 3.1, where Ren discusses Region Proposal Network (RPN) with feature map of an image for candidate proposal boxes);
determining, for a pixel in the to-be-processed image, a probability that each anchor box of at least one anchor box arranged for the pixel includes the to-be-tracked target, and determining a deviation of the candidate box corresponding to each anchor box relative to each anchor box (see figure 2, section 3.1, where Ren discusses multiple anchors relative to the candidate proposal boxes); 
determining, based on positions of at least two anchor boxes corresponding to at least two probabilities among the probabilities and deviations corresponding to the at least two anchor boxes respectively, candidate positions of the to-be-tracked target corresponding to the at least two anchor boxes respectively (see figure 2, section 3.1, where Ren discusses generating predicted probabilities for the deviation corresponding to multiple anchors with different scale aspect ratios).  Ren does not expressly teach combining at least two candidate positions among the candidate positions to obtain a position of the to-be-tracked target in the to-be-processed image. 
However, Cha teaches combining at least two candidate positions among the candidate positions to obtain a position of the to-be-tracked target in the to-be-processed image (see para. 0359-0360, where Cha discusses generating multiple proposal candidates in a region proposal network and convolutional neural network, output image data used to detect objects).
Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Ren with Cha to arrive at the invention of claim 1.  The result would have been expected, routine, and predictable in order to perform object detection using a region proposal network.
The determination of obviousness is predicated upon the following:  One skilled in the art would have been motivated to modify Ren in this manner in order to improve object detection using a region proposal network that combines multiple candidate proposals to properly address different regions in the image and increase the speed of network training and object detection.  Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained using known engineering design, interface and/or programming techniques, without changing a fundamental operating principle of Ren, while the teaching of Cha continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of implementing a region proposal network that takes into consideration multiple candidate boxes to properly analyze multiple regions in the image when performing object detection.  The Ren and Cha systems perform object detection using a region proposal network; therefore, one of ordinary skill in the art would have had a reasonable expectation of success in the combination.  It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.
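
For illustration only, the anchor arrangement referred to in the discussion of Ren can be sketched as follows. The sketch assumes the default configuration described in the Faster R-CNN paper (three scales and three aspect ratios, giving nine anchors per location); the function name anchors_at and the (cx, cy, w, h) box layout are hypothetical and are not taken from the application or from the cited references.

```python
import math

def anchors_at(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return hypothetical (cx, cy, w, h) anchors centered on one location.

    Each anchor has area scale**2 and height/width ratio `ratio`.
    """
    boxes = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)   # wider box when ratio < 1
            h = s * math.sqrt(r)   # taller box when ratio > 1
            boxes.append((cx, cy, w, h))
    return boxes

anchors = anchors_at(300, 200)
print(len(anchors))  # one anchor per (scale, ratio) pair
```

Under these assumptions every anchor at a location shares the same center and differs only in width and height.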

Regarding claim 2, Cha teaches wherein the deviation comprises a size scaling amount and a specified point position offset amount (see para. 0359, 0362, where Cha discusses using multiple scales, center coordinates, and ground truth); and the determining, based on the positions of the at least two anchor boxes corresponding to the at least two probabilities among the probabilities and the deviations corresponding to the at least two anchor boxes respectively, the candidate positions of the to-be-tracked target corresponding to the at least two anchor boxes respectively comprises: performing, based on the positions of the at least two anchor boxes corresponding to the at least two probabilities, size scaling and specified point position offsetting on the at least two anchor boxes respectively according to size scaling amounts and specified point offset amounts corresponding to the at least two anchor boxes respectively, to obtain the candidate positions of the to-be-tracked target corresponding to the at least two anchor boxes respectively (see para. 0359, 0362, 0388, where Cha discusses anchor boxes with multiple scales, center coordinates, and ground truth; see para. 0359-0360, where Cha discusses generating multiple proposal candidates used in a region proposal network and convolutional neural network to detect objects).
The same motivation of claim 1 is applied to claim 2.  Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Ren with Cha to arrive at the invention of claim 2.  The result would have been expected, routine, and predictable in order to perform object detection using a region proposal network.
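
For illustration only, applying a size scaling amount and a point position offset amount to an anchor box can be sketched as follows. The sketch assumes the box parameterization used in the Faster R-CNN paper (center offsets expressed as fractions of the anchor size, and log-scale width and height factors); the claim language does not specify this form, and the name decode and the tuple layouts are hypothetical.

```python
import math

def decode(anchor, deviation):
    """Turn an anchor plus a deviation into a candidate box.

    anchor:    (xa, ya, wa, ha) center, width, height of the anchor
    deviation: (tx, ty, tw, th) center offsets and log size scalings
    """
    xa, ya, wa, ha = anchor
    tx, ty, tw, th = deviation
    x = xa + tx * wa          # shift center by a fraction of anchor width
    y = ya + ty * ha          # shift center by a fraction of anchor height
    w = wa * math.exp(tw)     # scale width
    h = ha * math.exp(th)     # scale height
    return (x, y, w, h)

candidate = decode((10, 20, 100, 50), (0.1, -0.2, math.log(2), 0.0))
print(candidate)
```

A zero deviation returns the anchor unchanged, which is a convenient check on any implementation of this parameterization.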

Regarding claim 5, Cha teaches wherein the determining, for the pixel in the to-be-processed image, the probability that each anchor box of the at least one anchor box arranged for the pixel includes the to-be-tracked target, and determining the deviation of the candidate box corresponding to each anchor box relative to each anchor box comprises: inputting the position of the candidate box into a classification processing layer in a deep neural network, to obtain the probability that each anchor box of the at least one anchor box arranged for each pixel in the to-be-processed image includes the to-be-tracked target and that is outputted from the classification processing layer; and inputting the position of the candidate box into a bounding box regression processing layer in the deep neural network, to obtain the deviation of the candidate box corresponding to each anchor box relative to each anchor box, the deviation being outputted from the bounding box regression processing layer (see para. 0361, 0364, where Cha discusses that the regression layer predicts the center coordinates, width, and height of a bounding box and is trained to map the predicted box to a ground-truth box; see para. 0364, where Cha discusses regression and softmax layers that calculate the location of bounding boxes and classify objects in the boxes).
The same motivation of claim 1 is applied to claim 5.  Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Ren with Cha to arrive at the invention of claim 5.  The result would have been expected, routine, and predictable in order to perform object detection using a region proposal network.

Regarding claim 7, Cha teaches wherein the generating, based on the region proposal network and the feature map of the to-be-processed image, the position of the candidate box of the to-be-tracked target in the to-be-processed image comprises: inputting a feature map of a template image of the to-be-tracked target and the feature map of the to-be-processed image into the region proposal network (see para. 0359, where Cha discusses RPN uses a CNN to extract a feature map), to obtain the position of the candidate box of the to-be-tracked target in the to-be-processed image outputted from the region proposal network (see para. 0359, where Cha discusses RPN obtaining multiple bounding boxes), wherein the template image of the to-be-tracked target corresponds to a local region within a bounding box of the to-be-tracked target in an original image of the to-be-tracked target (see para. 0359, 0364, where Cha discusses regression and softmax layers, to calculate the location of bounding boxes and classify objects in the boxes).
The same motivation of claim 1 is applied to claim 7.  Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Ren with Cha to arrive at the invention of claim 7.  The result would have been expected, routine, and predictable in order to perform object detection using a region proposal network.

Claim 8 is rejected as applied to claim 1 as pertaining to a corresponding device.
Claim 9 is rejected as applied to claim 2 as pertaining to a corresponding device.
Claim 12 is rejected as applied to claim 5 as pertaining to a corresponding device.
Claim 14 is rejected as applied to claim 7 as pertaining to a corresponding device.
Claim 15 is rejected as applied to claim 1 as pertaining to a corresponding device.

Claims 4 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Ren et al. (Non-patent literature titled “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”) in view of Cha et al. (US 2020/0175352) and further in view of Grancharov et al. (US 2021/0201505) (WO2019114954 Publication date June 20, 2019).

Regarding claim 4, Ren and Cha do not expressly teach wherein the at least two probabilities are obtained by: processing the probabilities using a preset window function, to obtain a processed probability of each of the probabilities; and selecting at least two processed probabilities from the processed probabilities in descending order, wherein probabilities corresponding to the at least two processed probabilities among the probabilities are the at least two probabilities.
However, Grancharov teaches wherein the at least two probabilities are obtained by: processing the probabilities using a preset window function, to obtain a processed probability of each of the probabilities; and selecting at least two processed probabilities from the processed probabilities in descending order, wherein probabilities corresponding to the at least two processed probabilities among the probabilities are the at least two probabilities (see para. 0117, where Grancharov discusses ranking probabilities in descending order and rejecting lower-ranked candidate objects).
Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Ren and Cha with Grancharov to arrive at the invention of claim 4.  The result would have been expected, routine, and predictable in order to perform object detection using a region proposal network.
The determination of obviousness is predicated upon the following:  One skilled in the art would have been motivated to modify Ren and Cha in this manner in order to improve object detection using a region proposal network that combines multiple candidate proposals with different prediction probabilities used to properly address different regions in the image and increase the speed of network training and object detection.  Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained using known engineering design, interface and/or programming techniques, without changing a fundamental operating principle of Ren and Cha, while the teaching of Grancharov continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of implementing a region proposal network that takes into consideration multiple candidate boxes to properly analyze multiple regions in the image when performing object detection.  The Ren, Cha, and Grancharov systems perform object detection using a region proposal network; therefore, one of ordinary skill in the art would have had a reasonable expectation of success in the combination.  It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.
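
For illustration only, the limitation of processing probabilities with a window function and then selecting in descending order can be sketched as follows. The sketch assumes a one-dimensional Hann window blended with the raw probabilities by a hypothetical influence weight; neither the application nor Grancharov is stated to use this particular window or blend, and the names hann and top_k_after_window are hypothetical.

```python
import math

def hann(n):
    """Symmetric Hann window of length n (0 at the ends, 1 in the middle)."""
    if n == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def top_k_after_window(probs, k=2, influence=0.3):
    """Rank by windowed score, but return the original probabilities."""
    window = hann(len(probs))
    processed = [(1 - influence) * p + influence * w
                 for p, w in zip(probs, window)]
    # Indices sorted by processed probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: processed[i], reverse=True)
    return [(i, probs[i]) for i in order[:k]]

selected = top_k_after_window([0.9, 0.5, 0.8, 0.2, 0.1])
print(selected)
```

In this sketch the window only affects the ranking; the values returned are the unprocessed probabilities at the selected positions, which mirrors the claim wording that the selected probabilities are those corresponding to the selected processed probabilities.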

Claim 11 is rejected as applied to claim 4 as pertaining to a corresponding device.

Claims 6 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Ren et al. (Non-patent literature titled “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”) in view of Cha et al. (US 2020/0175352) and further in view of Chen (US 11,004,209).

Regarding claim 6, Ren and Cha do not expressly teach wherein the to-be-processed image is obtained by: acquiring a position of a bounding box of the to-be-tracked target in a previous video frame among adjacent video frames; generating a target bounding box at the position of the bounding box in a next video frame based on a target side length obtained by enlarging a side length of the bounding box; and generating the to-be-processed image based on a region where the target bounding box is located.
However, Chen teaches wherein the to-be-processed image is obtained by: acquiring a position of a bounding box of the to-be-tracked target in a previous video frame among adjacent video frames; generating a target bounding box at the position of the bounding box in a next video frame based on a target side length obtained by enlarging a side length of the bounding box; and generating the to-be-processed image based on a region where the target bounding box is located (see col. 84 lines 3-14, where Chen discusses that the bounding boxes of the high confidence trackers can be enlarged by any suitable amount. In one illustrative example, one or more of the predicted bounding boxes can be increased by 25% of the width and 25% of the height, with the center position being kept unchanged. Enlarging the bounding boxes can increase the amount of overlap between bounding boxes, allowing more objects to be located).
Motivation to combine may be gleaned from the prior art considered.  It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify the invention of Ren and Cha with Chen to arrive at the invention of claim 6.  The result would have been expected, routine, and predictable in order to perform object detection using a region proposal network.
The determination of obviousness is predicated upon the following:  One skilled in the art would have been motivated to modify Ren and Cha in this manner in order to improve object detection using a region proposal network that combines multiple candidate proposals with various sizes to properly address different regions in the image and increase the speed of network training and object detection.  Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained using known engineering design, interface and/or programming techniques, without changing a fundamental operating principle of Ren and Cha, while the teaching of Chen continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of implementing a region proposal network that takes into consideration multiple candidate boxes with enlarged sizes to properly analyze multiple regions in the image when performing object detection.  The Ren, Cha, and Chen systems perform object detection using a region proposal network; therefore, one of ordinary skill in the art would have had a reasonable expectation of success in the combination.  It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.
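
For illustration only, enlarging a bounding box about its center and deriving a region from it can be sketched as follows. The factor of 1.25 follows the 25% width and height increase quoted above from Chen; the function names enlarge_box and crop_region, the (cx, cy, w, h) layout, and the clipping to the image bounds are assumptions made for this sketch.

```python
def enlarge_box(box, factor=1.25):
    """Scale width and height by `factor`, keeping the center unchanged."""
    cx, cy, w, h = box
    return (cx, cy, w * factor, h * factor)

def crop_region(box, img_w, img_h):
    """Convert a (cx, cy, w, h) box to corners clipped to the image."""
    cx, cy, w, h = box
    x0 = max(0.0, cx - w / 2)
    y0 = max(0.0, cy - h / 2)
    x1 = min(float(img_w), cx + w / 2)
    y1 = min(float(img_h), cy + h / 2)
    return (x0, y0, x1, y1)

previous_box = (50, 50, 40, 20)           # box from the previous frame
target_box = enlarge_box(previous_box)    # enlarged box for the next frame
print(crop_region(target_box, 100, 100))
```

Clipping is one possible way to handle an enlarged box that extends past the image edge; padding the image instead would be an equally plausible choice.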

Claim 13 is rejected as applied to claim 6 as pertaining to a corresponding device.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
	
Kim et al. (US 10,438,082) discusses instructing the Region Proposal Network to generate at least one ROI bounding box for training a Convolutional Neural Network.
Fukagai (US 2019/0050994) discusses a proposed region calculating unit that inputs the feature maps obtained from the CNN layer to a region proposal network (RPN) layer and outputs proposed regions as an example of candidate positions in which objects are present.
Lee et al. (US 2021/0326656) discusses a two-stage object detector including a Mask R-CNN and an RPN.


Contact Information

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KENNY A CESE whose telephone number is (571) 270-1896.  The examiner can normally be reached on Monday – Friday, 9am – 4pm.
If attempts to reach the primary examiner by telephone are unsuccessful, the examiner’s supervisor, Claire Wang, can be reached at (571) 270-1051.  The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Kenny A Cese/
Primary Examiner, Art Unit 2663