Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's amendments filed on 8/3/2021 overcome the following objections and rejections set forth in the previous Office Action:
The objection to claims 17-18,
The rejection of claim 17 under 35 USC §101, and
The rejection of claims 4-6, 11, 13-16 and 18 under 35 USC §112(b) or 35 USC §112 (pre-AIA), second paragraph.
Applicant's arguments filed 8/3/2021 have been fully considered but they are not persuasive. The Office has thoroughly reviewed Applicant's arguments and maintains that the cited references reasonably and properly meet the claimed limitations as originally filed. Furthermore, the amendments necessitate new grounds of rejection, as detailed below.
Regarding applicant’s arguments on the claim interpretation of apparatus claim 18 under 35 U.S.C. 112(f), it suffices to note that the interpretation under 35 U.S.C. 112(f) is for the generic placeholders of the claim 18 as discussed in the previous office action, i.e., a module for generating a plurality of region proposals, a CNN (or convolutional neural network (CNN) as amended) pre-trained for object detection, a tracker for tracking one or more targets … and generating tracking information and a module further configured to refine the plurality of region proposals in claim 18 (highlight indicating the generic placeholders). Applicant has not argued why these are 
Regarding claim 1, applicant alleges that “as the present independent claims include at least one limitation which is not expressly or inherently described by Fukagai, at least the independent claims are patentable over Fukagai.”
The Office respectfully disagrees. In fact, it is unclear from applicant’s arguments which specific claim limitation(s) is (are) not disclosed by Fukagai.
Applicant argues that “Fukagai discloses a two-stage object detector which uses different layers of a CNN to detect objects: a region proposal network layer (e.g. RPN layer) and a Fast R-CNN layer (e.g. see elements 14 and 15 of FIG. 1, and FIG. 6). In other words, Fukagai uses two layers of a single CNN to both propose regions and detect objects. Put another way, region proposals in the context of Fukagai are an internal part of a CNN, and which act as an interface between the two layers. The region proposals are output by the region proposal network layer of the CNN to the Fast R- CNN layer of the CNN.”
The Office respectfully disagrees. Whether or not the particular CNN disclosed by Fukagai (here, the R-CNN) is inside another system, which may or may not include CNN elements, is of no relevance as long as the R-CNN in Fukagai, which is mapped to the claim limitations, performs the functions recited in claim 1.
Applicant further argues that “the presently claimed subject matter uses a separate module to generate input to a CNN. Such a feature is different from Fukagai, in which performance of a CNN occurs based on layers internal to the CNN. Indeed, the presently claimed "generating a plurality of region proposals, each region proposal 
The Office respectfully disagrees. In fact, Fukagai also uses a separate module (14 of Fig. 1 of Fukagai) to generate input to a CNN (the R-CNN or 15 of Fig. 1 of Fukagai). It is also noted that in applicant’s Fig. 4 the region proposal module 402 and CNN module 406 are also part of one system 400, i.e., the CNN module 406 is internal to the system 400.
Applicant then argues that “while both Fukagai and the presently claimed subject matter use the term "Region Proposal", the regions proposals of Fukagai, and the regions proposals of the presently claimed subject matter, are not the same (e.g. they may have similar names, but they are not equivalent).” To support this argument, applicant asserts that “in the presently claimed subject matter, region proposals are input to a CNN pre-trained for object detection: attention is directed to present FIG. 4 which shows region proposals, from a region proposal module, being input to a CNN observations module. While in Fukagai, regions proposals are an internal CNN feature.”
The Office respectfully disagrees. Here, applicant’s logic has similar flaws as discussed above. First of all, applicant has not pointed out exactly what the difference is between the term "Region Proposal" used by Fukagai and the term "Region Proposal" used in claim 1. In fact, applicant has just argued that “the presently claimed "generating a plurality of region proposals, each region proposal comprising a part of a video frame, the plurality of region proposals being input to a convolutional neural network (CNN) pre-trained for object detection" could be used to generate input TO the RPN layer of input to a CNN pre-trained for object detection. Thirdly, just like a region proposal module 402 is inside the system or module 400 and provides input to a CNN observations module 406 inside the system or module 400 as shown in Fig. 4 of instant application, the unit 14 is inside the system 1 and provides input to a CNN unit 15 inside the system 1 of Fig. 1 of Fukagai.
Furthermore, applicant argues that “as described throughout the present application, optimization of performance of the CNN occurs by manipulating the region proposal input to the CNN and not the CNN itself”, “whereas Fukagai is directed towards optimization of the CNN”.
The Office respectfully disagrees. First of all, the rejection is directed to the claimed invention, not to the specification or figures of the application. If applicant believes that there is something unique (and unclaimed) about the invention, applicant should claim it. For example, “optimization of performance of CNN” “by manipulating the region proposal input to the CNN” is not claimed subject matter. Secondly, the office action maps the R-CNN to the claimed CNN. If applicant disagrees with this mapping, applicant should point out where the Office has erred in such a mapping.
Finally, applicant argues that “the goals of the presently claimed subject matter and Fukagai are also different. For example, the subject matter of Fukagai is directed towards optimizing performance for each single object in a given input generated by the RPN layer. In contrast, in the presently claimed subject matter, performance for ALL input to the CNN for the detector.”
The Office respectfully disagrees. Again, the rejection is directed to the claimed invention, not to the goal of the claimed subject matter. As long as all the claim limitations are disclosed by the prior art, the claim is anticipated by the prior art, in this case Fukagai. Applicant has not pointed out where the Office has erred in the rejection presented in the previous office action or which claim limitation is not disclosed by Fukagai.
References Cited in Prior Art Rejections 
The following references are cited in the prior art rejections set forth below and are referred to as noted:
Fukagai, US 20190050694 A1, published on February 14, 2019, filed on August 6, 2018, hereinafter Fukagai, and
Cinnamon et al., US 20190019318 A1, published on January 17, 2019, filed on November 1, 2017, hereinafter Cinnamon.  

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-5, 7-11, 13-15 and 17-20 are rejected under 35 U.S.C. 102 as being anticipated by Fukagai.
Regarding claim 1, Fukagai discloses a method comprising: 
generating a plurality of region proposals, each region proposal comprising a part of a video frame, the plurality of region proposals being input to a convolutional neural network (CNN) pre-trained for object detection; (Fukagai: element 14 in Figs. 1, 9 and 11. Fig. 3. S6 in Figs. 14 and 16. Image data 111 in Fig. 1 is an image sequence (see [0046-0047]). The outputs of element 14 are region proposals (see [0059]) that are input to a Fast R-CNN (see [0072]).)
detecting, using the CNN, one or more objects in a series of video frames; (Fukagai: elements 15-16 in Figs. 1, 9 and 11. Fig. 6. S10-S11 in Figs. 14 and 16. The Fast R-CNN is used for object detection or recognition [0073, 0084].)
tracking one or more targets based on outputs from the CNN across the series of video frames and generating tracking information on the one or more targets; (Fukagai: element 17 in Figs. 1, 9 and 11. S12 in Figs. 14 and 16. [0085-0086, 0092-0093, 0097-0100]. “In one example, the recognition-result analyzing unit 17 may calculate movement information indicating the moving position of the object in the present frame based on the recognition result of the past frames and provides the calculated movement information to the Faster R-CNN.” [0086].) and 
refining the plurality of region proposals to be input to the CNN, based on the tracking information. (Fukagai: Figs. 1, 9, 11, 14 and 16. [0087-0092]. Two examples of refinement of region proposals are shown in Figs. 9-10 and 14 (first example, [0087-0088, 0118-0119]) and Figs. 11-12 and 16 (second example, [0089-0090, 0120-0121]) based on movement information (or the claimed “tracking information”) (see S7 in Fig. 14 and S14 in Fig. 16).)
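For orientation only, the four steps mapped above (generating region proposals, detecting with the CNN, tracking, and refining the proposals based on tracking information) can be pictured as a feedback loop. The following is a minimal sketch under stated assumptions and is not code from Fukagai or from the instant application: the names `Box`, `track`, `refine` and `run` are hypothetical, the detector is passed in as a stub rather than a pre-trained CNN, and the tracker simply extrapolates each detection by its last displacement.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Box:
    """Hypothetical region: top-left corner plus width and height."""
    x: float
    y: float
    w: float
    h: float


def track(prev: List[Box], detections: List[Box]) -> List[Box]:
    """Assumed tracker: predict each target's next box by repeating the
    displacement from the same-index box in the previous frame."""
    predicted = []
    for i, det in enumerate(detections):
        if i < len(prev):
            dx, dy = det.x - prev[i].x, det.y - prev[i].y
        else:
            dx, dy = 0.0, 0.0  # no history yet: assume the target stays put
        predicted.append(Box(det.x + dx, det.y + dy, det.w, det.h))
    return predicted


def refine(proposals: List[Box], predicted: List[Box]) -> List[Box]:
    """Assumed refinement: append predicted regions that are not already proposed."""
    return proposals + [p for p in predicted if p not in proposals]


def run(frames: Sequence,
        propose: Callable[[object], List[Box]],
        detect: Callable[[object, List[Box]], List[Box]]) -> List[List[Box]]:
    """Return the refined proposals that were fed to the detector for each frame."""
    log: List[List[Box]] = []
    prev: List[Box] = []
    predicted: List[Box] = []
    for frame in frames:
        proposals = refine(propose(frame), predicted)  # tracking info feeds back here
        detections = detect(frame, proposals)          # stand-in for the CNN
        predicted = track(prev, detections)
        prev = detections
        log.append(proposals)
    return log
```

In this sketch the feedback path is the `predicted` list: it is produced after detection on one frame and consumed by `refine` before detection on the next.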
Regarding claim 2, Fukagai discloses the method of claim 1, wherein the outputs from the CNN comprise a bounding box and a classification score for each detected object, wherein each bounding box is defined by a location and vertical and horizontal dimensions.  (Fukagai: Figs. 6-7. [0074-0076, 0079, 0084])
Regarding claim 3, Fukagai discloses the method of claim 1, further comprising: categorizing each of the one or more targets into a status category, based on the outputs from the CNN, the status category of a target indicating the time since the target was likely detected by the CNN.  (Fukagai: Figs. 1 and 8-12. S12 in Figs. 14 and 16. [0085-0086, 0092-0093, 0097-0100, 0111]. In particular, Figs. 8-12 show the claimed “status category of a target indicating the time since the target was likely detected by the CNN” with time stamps t=1, 2, 3 and related discussions.)
Regarding claim 4, Fukagai discloses the method of claim 3, further comprising: for each of the one or more targets, identifying a region of the plurality of region proposals likely containing a target or a new region likely containing the target.  (Fukagai: Figs. 1 and 6-12. [0087-0090, 0118-0121].)
Regarding claim 5, Fukagai discloses the method of claim 4, further comprising: calculating a region priority score for each of the plurality of region proposals based on a priority score of the target that is likely within an identified region proposal, the priority score of the target being determined based on a corresponding status category.  (Fukagai: Figs. 1 and 6-13. [0087-0090, 0118-0126]. “In one example, as illustrated in FIG. 12, the output unit 172 may output correction information for increasing the score of a candidate region whose position and size are close to the position and size of the predicted region at t=3 to the proposed-region calculating unit 14. An example of the candidate region whose position and size are close to the position and size of the predicted region is a rectangular region (in the example in FIG. 12, a rectangular region A) of which the proportion p of a region of the rectangular region overlapping with the predicted region is the maximum.” [0121]. “FIG. 13 illustrates an example in which additional proposed regions (for example, rectangular regions in which a boat is likely to be actually present), which is one example of the movement information, are added to the proposed regions output from the RPN layer 140, and the results are evaluated.” [0125]. “Thus, boats are newly detected with high scores in the regions in which additional proposed regions are specified (see the right image in FIG. 13). Also when correction information, which is an example of the movement information, is provided to the Faster R-CNN, the same advantageous effect as that in FIG. 13 is expected.” [0126])
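The score correction quoted from [0121] (raising the score of the candidate region whose overlap proportion p with the predicted region is the maximum) can be illustrated with a short sketch. This is an illustration only, not Fukagai's implementation: the `(x, y, width, height)` layout, the function names, the `bonus` parameter, and the reading of p as the fraction of the candidate's own area that overlaps the predicted region are all assumptions.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, width, height); assumed layout


def overlap_proportion(candidate: Rect, predicted: Rect) -> float:
    """Fraction of the candidate's area that overlaps the predicted region
    (one assumed reading of the proportion p)."""
    cx, cy, cw, ch = candidate
    px, py, pw, ph = predicted
    ix = max(0.0, min(cx + cw, px + pw) - max(cx, px))  # overlap width
    iy = max(0.0, min(cy + ch, py + ph) - max(cy, py))  # overlap height
    area = cw * ch
    return (ix * iy) / area if area > 0 else 0.0


def boost_best_candidate(candidates: Sequence[Rect],
                         scores: Sequence[float],
                         predicted: Rect,
                         bonus: float = 0.1) -> List[float]:
    """Return a copy of scores in which the candidate with the largest overlap
    proportion has its score raised by a hypothetical bonus, capped at 1.0."""
    out = list(scores)
    if not candidates:
        return out
    props = [overlap_proportion(c, predicted) for c in candidates]
    best = max(range(len(candidates)), key=props.__getitem__)
    if props[best] > 0:  # leave scores untouched when nothing overlaps
        out[best] = min(1.0, out[best] + bonus)
    return out
```

Only one candidate is boosted per predicted region here, which mirrors the single "rectangular region A" of the quoted example; a scheme that boosts every overlapping candidate would be an equally plausible reading.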
Regarding claim 7, Fukagai discloses the method of claim 1, wherein the plurality of region proposals include a non-zero motion vector and are selected from a plurality of predefined regions covering the frame. (Fukagai: Fig. 8 shows a “vector S” with position, velocity and acceleration of each target in each frame ([0111-0114]). “The predicted presence region of the object may be determined by calculating the vector S of each frame and each target object.” ([0111]). “Thus, the presuming unit 171 may obtain the vector S (i.e., the claimed “motion vector”) to presume a predicted presence region by specifying the target object for each frame and applying a motion model prepared in advance to each target object. Examples of the motion model include a uniform motion model, an acceleration motion model, and other various motion models.” ([0114]). Figs. 9-12 show multiple such regions with non-zero motion vectors (i.e., objects in these regions moved). This movement information is fed back for region selection as shown in Figs. 1, 9 and 11 ([0087-0090, 0118-0126]).)
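The role of a per-target state vector with position, velocity and acceleration under a uniform or acceleration motion model can be illustrated as follows. This is a simplified sketch, not Fukagai's vector S or its tracking filter: the `State` fields, the `predict` and `has_nonzero_motion` names, and the unit time step are assumptions, and no Kalman filtering is performed.

```python
from dataclasses import dataclass


@dataclass
class State:
    """Hypothetical per-target state: position, velocity, acceleration."""
    x: float
    y: float
    vx: float
    vy: float
    ax: float = 0.0
    ay: float = 0.0


def predict(s: State, dt: float = 1.0) -> State:
    """Advance the state by dt. With zero acceleration this is a uniform
    (constant-velocity) model; otherwise a constant-acceleration model."""
    return State(
        x=s.x + s.vx * dt + 0.5 * s.ax * dt * dt,
        y=s.y + s.vy * dt + 0.5 * s.ay * dt * dt,
        vx=s.vx + s.ax * dt,
        vy=s.vy + s.ay * dt,
        ax=s.ax,
        ay=s.ay,
    )


def has_nonzero_motion(s: State, eps: float = 1e-9) -> bool:
    """True when the velocity component of the state is not (numerically) zero."""
    return abs(s.vx) > eps or abs(s.vy) > eps
```

A region whose target state satisfies `has_nonzero_motion` corresponds, in this simplified picture, to a region associated with a non-zero motion vector.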
Regarding claim 8, Fukagai discloses the method of claim 7, wherein the predefined regions are generated from an object size map, the object size map's value for a given location in the object size map representing an estimated object size in pixels. (Fukagai: [0068, 0111, 0113, 0121, 0123, 0129, 0162-0163, 0184]. “For example, the proposed-region calculating unit 14 may presume presence/absence of an object in each anchor region or a region formed of a combination of a plurality of anchors, the size of a region in which the object is present, and its score based on the feature maps.” ([0068]). “As illustrated in FIG. 8, in detecting and tracking a moving object whose size changes, the presuming unit 171 may presume the predicted presence region of the object from the past image sequence data using a tracking filter, such as a Kalman Filter.” ([0111]). “The presuming unit 171 may determine a rectangular region in which the target object can be detected next based on the position and the size of the rectangular region, the type of the object (person, vehicle, or horse) presumed first using a common Faster R-CNN, and the frame rate at the observation.” ([0113]). “In one example, as illustrated in FIG. 12, the output unit 172 may output correction information for increasing the score of a candidate region whose position and size are close to the position and size of the predicted region at t=3 to the proposed-region calculating unit 14.” ([0121]). “Since the size of a tracked target changes among frames, the values of q.sub.x and q.sub.y that determine the value of Q may be determined as the function of the size (w, h) of the tracked object.” ([0184]))
Regarding claim 9, Fukagai discloses the method of claim 7, wherein the total number of the plurality of region proposals satisfies an upper threshold number criterion. (Fukagai: [0064-0067, 0078]. “The number of proposed regions output from the RPN layer 140 may be limited to a predetermined number (for example, 150).” ([0064]))
Regarding claim 10, Fukagai discloses the method of claim 7, wherein generating the plurality of region proposals comprises: adding an additional region to the plurality of region proposals until the number of the plurality of region proposals satisfies an upper threshold number criterion. (Fukagai: Figs. 1, 9 and 11. [0064-0067, 0078].)
Regarding claim 11, Fukagai discloses the method of claim 10, wherein the additional region is determined using a default region or a last checking time map, the last checking time map describing the time since a local region was provided to the CNN. (Fukagai: Figs. 1 and 8-12. For example, the last checking time can be t=2 in Figs. 1 and 8-12.)
Regarding claim 13, Fukagai discloses the method of claim 2, further comprising: for each of the one or more targets, identifying a region of the plurality of region proposals in which a target is likely contained based on a corresponding bounding box.  (Fukagai: Figs. 1 and 6-12. [0064-0067, 0087-0090, 0118-0121].)
Regarding claim 14, Fukagai discloses the method of claim 13, further comprising: creating a new region for a target that is not likely within any of the plurality … (Fukagai: Figs. 1 and 6-12. [0064-0067, 0089-0090, 0120-0121].)
Regarding claim 15, Fukagai discloses the method of claim 14, further comprising: 
categorizing each of the one or more targets into a status category, based on the outputs from the CNN, the status category of a target indicating the time since the target was likely detected by the CNN; (Fukagai: Figs. 1 and 8-12. S12 in Figs. 14 and 16. [0085-0086, 0092-0093, 0097-0100, 0111]. In particular, Figs. 8-12 show the claimed “status category of a target indicating the time since the target was likely detected by the CNN” with time stamps t=1, 2, 3 and related discussions.) and 
calculating a region priority score for each region that likely contains a target based on a priority score of the target, the priority score of the target based on a corresponding status category. (Fukagai: Figs. 1 and 6-13. [0087-0090, 0118-0126]. “In one example, as illustrated in FIG. 12, the output unit 172 may output correction information for increasing the score of a candidate region whose position and size are close to the position and size of the predicted region at t=3 to the proposed-region calculating unit 14. An example of the candidate region whose position and size are close to the position and size of the predicted region is a rectangular region (in the example in FIG. 12, a rectangular region A) of which the proportion p of a region of the rectangular region overlapping with the predicted region is the maximum.” [0121]. “FIG. 13 illustrates an example in which additional proposed regions (for example, rectangular regions in which a boat is likely to be actually present), which is one example of the movement information, are added to the proposed regions output from the RPN layer 140, and the results are evaluated.” [0125]. “Thus, boats are newly detected with high scores in the regions in which additional proposed regions are specified (see the right image in FIG. 13). Also when correction information, which is an example of the movement information, is provided to the Faster R-CNN, the same advantageous effect as that in FIG. 13 is expected.” [0126])
Claims 17-18 are the computer readable medium and apparatus (Fukagai: Fig. 30) claims, respectively, corresponding to the method claim 1. Therefore, since claims 17-18 are similar in scope to claim 1, claims 17-18 are rejected on the same grounds as claim 1.
Regarding claim 19, Fukagai discloses the method of claim 1, wherein a first region proposal, of the plurality of region proposals, includes a plurality of the targets.  (Fukagai: for example, a region proposal in Figs. 3 and 6-7 has a horse and a person. As another example, at least one region proposal in Fig. 13 has two boats.)
Regarding claim 20, Fukagai discloses the method of claim 19, wherein each of one or more second region proposals of the plurality of region proposals includes a respective two or more of the targets.  (Fukagai: for example, Fig. 13, either before or after movement information is added, has two region proposals each having two boats, one with two similar boats and the other with two dissimilar boats, and one of those two region proposals with two boats is interpreted as the claimed “first region proposal” in claim 19 while the other one is interpreted as the claimed “each of one or more second region proposals” in claim 20.)
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 6 and 16 are rejected under 35 U.S.C. 103 as obvious over Fukagai as applied to claims 5 and 15 discussed above, and further in view of Cinnamon.
Regarding claim 6, Fukagai discloses the method of claim 5, wherein refining the plurality of region proposals comprises: … (Fukagai: [0064-0067]. “The number of proposed regions output from the RPN layer 140 may be limited to proposed regions whose scores have a predetermined numerical value (for example, a numerical value of "0.800" or greater). … The number of proposed regions output from the RPN layer 140 may be limited to a predetermined number (for example, 150).” ([0064]). “The candidate-region selecting unit 142 narrows down the number of proposed regions, calculated by the candidate-region and score calculating unit 141, to a predetermined number.” ([0067]).)
Fukagai does not explicitly disclose sorting the plurality of region proposals in descending order by the region priority scores, although this claimed feature is arguably implied by Fukagai's disclosure, particularly the paragraphs cited above. First of all, as disclosed by Fukagai, each of the proposed regions (i.e., the claimed “region proposals”) includes a score (i.e., the claimed region priority score) “indicating the probability of presence of an object in the regions” (see [0064]). Secondly, as cited above, the “number of proposed regions output from the RPN layer 140 may be limited to proposed regions whose scores have a predetermined numerical value (for example, a numerical value of "0.800" or greater)” (see [0064]). Thirdly, also as cited above, the “number of proposed regions output from the RPN layer 140 may be limited to a predetermined number (for example, 150)” (see [0064]). Nonetheless, sorting region proposals based on their respective region proposal scores is well known and commonly practiced in the image processing art, as evidenced by Cinnamon. (Cinnamon: [0174]) 
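The relationship between the score threshold, the cap on the number of proposed regions, and sorting by score can be illustrated with a short sketch. It is offered only as an illustration and is taken from neither Fukagai nor Cinnamon: the `(region id, score)` layout and the `select_proposals` name are hypothetical, the default values 0.800 and 150 are the examples quoted from [0064], and applying the threshold, the sort and the cap together in one function is an assumption.

```python
from typing import List, Sequence, Tuple

Proposal = Tuple[str, float]  # (region id, score); assumed layout


def select_proposals(proposals: Sequence[Proposal],
                     min_score: float = 0.800,
                     max_count: int = 150) -> List[Proposal]:
    """Keep proposals scoring at least min_score, sort them in descending
    order of score, and return at most max_count of them."""
    kept = [p for p in proposals if p[1] >= min_score]
    kept.sort(key=lambda p: p[1], reverse=True)  # highest score first
    return kept[:max_count]
```

Sorting before truncating ensures that the retained proposals are the highest-scoring ones, whereas truncating an unsorted list would keep an arbitrary subset.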
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Fukagai’s disclosure with Cinnamon’s teachings by combining the method for region proposal with motion information feedback (from Fukagai) with the technique of sorting region proposals based on their respective region proposal scores (from Cinnamon) to yield no more than predictable 

Claim 16 is similarly rejected for the same reasoning and motivation discussed above regarding claim 6.
Allowable Subject Matter
Claim 12 is objected to as being dependent upon rejected base claims, but would be allowable over prior art references cited if rewritten in independent form including all of the limitations of the respective base claims and any intervening claims.
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FENG NIU whose telephone number is (571)272-9592.  The examiner can normally be reached on Monday - Friday, 8am-5pm PT.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park can be reached on (571) 272-7409.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/FENG NIU/Primary Examiner, Art Unit 2669