DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 26-46 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims of U.S. Patent No. 11354903. Although the claims at issue are not identical, they are not patentably distinct from each other.
The claims of the instant application (No. 17742122) are reproduced first below, followed by the corresponding claims of U.S. Patent No. 11354903.
26. (New) A system for performing object detection, comprising: a memory to store at least a portion of one or more of a current, a previous, and a subsequent frame of video; and a processor coupled to the memory, the processor to: generate a current frame initial feature map comprising object detection scoring for the current frame; detect, based on patch similarity, first and second paired patches between the current and previous frames, respectively, and third and fourth paired patches between the current and subsequent frames, respectively; increase a first prediction result in the current frame initial feature map for the first paired patch to a score of the second paired patch or increase a second prediction result in the current frame initial feature map for the third paired patch to a score of the fourth paired patch to generate an enhanced feature map; and determine one or more of an object detection localization, class, and confidence scoring for the current frame using the enhanced feature map.

30. (New) The system of claim 29, wherein the processor to determine the one or more of the object detection localization, class, and confidence comprises the processor to: concatenate the enhanced feature map and the second enhanced feature map; and provide the concatenated feature maps to at least one neural network layer to generate the one or more of the object detection localization, class, and confidence.

31. (New) The system of claim 29, wherein the processor to determine the one or more of the object detection localization, class, and confidence comprises the processor to: determine a shared object patch between the enhanced feature map and the second enhanced feature map; retain object detection localization and confidence scoring from the enhanced feature map for the shared object patch and discard object detection localization and confidence scoring from the second enhanced feature map for the shared object patch in response to the object detection confidence scoring for the shared object patch in the enhanced feature map comparing favorably to the object detection confidence scoring for the shared object patch in the second enhanced feature map; and determine the one or more of the object detection localization, class, and confidence scoring based at least on the retained object detection localization and confidence scoring from the enhanced feature map.

32. (New) The system of claim 29, wherein the processor to determine the one or more of the object detection localization, class, and confidence comprises the processor to: retain object detection localization and confidence scoring for shared object patches between the enhanced feature map and the second enhanced feature map based on higher confidence scoring for each of the shared object patches to generate a bidirectional enhanced feature map for the current frame; and determine the one or more of the object detection localization, class, and confidence scoring based on the bidirectional enhanced feature map.

33. (New) A computer-implemented method for performing object detection, comprising: receiving a current, a previous, and a subsequent frame of video; generating a current frame initial feature map comprising object detection scoring for the current frame; detecting, based on patch similarity, first and second paired patches between the current and previous frames, respectively, and third and fourth paired patches between the current and subsequent frames, respectively; increasing a first prediction result in the current frame initial feature map for the first paired patch to a score of the second paired patch or increasing a second prediction result in the current frame initial feature map for the third paired patch to a score of the fourth paired patch to generate an enhanced feature map; and determining one or more of an object detection localization, class, and confidence scoring for the current frame using the enhanced feature map.

37. (New) The method of claim 36, determining the one or more of the object detection localization, class, and confidence comprises: concatenating the enhanced feature map and the second enhanced feature map; and providing the concatenated feature maps to at least one neural network layer to generate the one or more of the object detection localization, class, and confidence.

38. (New) The method of claim 36, wherein determining the one or more of the object detection localization, class, and confidence comprises: determining a shared object patch between the enhanced feature map and the second enhanced feature map; retaining object detection localization and confidence scoring from the enhanced feature map for the shared object patch and discard object detection localization and confidence scoring from the second enhanced feature map for the shared object patch in response to the object detection confidence scoring for the shared object patch in the enhanced feature map comparing favorably to the object detection confidence scoring for the shared object patch in the second enhanced feature map; and determining the one or more of the object detection localization, class, and confidence scoring based at least on the retained object detection localization and confidence scoring from the enhanced feature map.

39. (New) The method of claim 36, wherein determining the one or more of the object detection localization, class, and confidence comprises: retaining object detection localization and confidence scoring for shared object patches between the enhancement feature map and the second enhanced feature map based on higher confidence scoring for each of the shared object patches to generate a bidirectional enhanced feature map for the current frame; and determining the one or more of the object detection localization, class, and confidence scoring based on the bidirectional enhanced feature map.

40. (New) At least one non-transitory machine readable medium comprising a plurality of instructions that, in response to being executed on a computing device, cause the computing device to perform object detection by: receiving a current, a previous, and a subsequent frame of video; generating a current frame initial feature map comprising object detection scoring for the current frame; detecting, based on patch similarity, first and second paired patches between the current and previous frames, respectively, and third and fourth paired patches between the current and subsequent frames, respectively; increasing a first prediction result in the current frame initial feature map for the first paired patch to a score of the second paired patch or increasing a second prediction result in the current frame initial feature map for the third paired patch to a score of the fourth paired to generate an enhanced feature map; and determining one or more of an object detection localization, class, and confidence scoring for the current frame using the enhanced feature map.

44. (New) The non-transitory machine readable medium of claim 43, determining the one or more of the object detection localization, class, and confidence comprises: concatenating the enhanced feature map and the second enhanced feature map; and providing the concatenated feature maps to at least one neural network layer to generate the one or more of the object detection localization, class, and confidence.

46. (New) The non-transitory machine readable medium of claim 43, wherein determining the one or more of the object detection localization, class, and confidence comprises: retaining object detection localization and confidence scoring for shared object patches between the enhanced feature map and the second enhanced feature map based on higher confidence scoring for each of the shared object patches to generate a bidirectional enhanced feature map for the current frame; and determining the one or more of the object detection localization, class, and confidence scoring based on the bidirectional enhanced feature map.

Claims of U.S. Patent No. 11354903:
1. A system for performing object detection comprising: a memory to store a current, a previous, and a subsequent frame of video; and a processor coupled to the memory, the processor to: perform still image object detection on the current frame to determine a current frame initial feature map comprising object detection localization and confidence scoring for the current frame; detect, based on patch similarity, first and second paired patches between the current and previous frames, respectively, and third and fourth paired patches between the current and subsequent frames, respectively; modify a first prediction result in the current frame initial feature map for the first paired patch of the current frame to a maximum cached confidence score of the second paired patch of the previous frame to generate a first enhanced feature map comprising forward object detection localization and confidence scoring for the current frame; modify a third prediction result in the current frame initial feature map for the third paired patch of the current frame to a maximum cached confidence score of the fourth object patch of the subsequent frame to generate at least a second enhanced feature map comprising reverse object detection localization and confidence scoring for the current frame; and determine and output an object detection localization, class, and confidence scoring for the current frame using the first and second enhanced feature maps.

2. The system of claim 1, wherein the processor to determine the object detection localization, class, and confidence comprises the processor to: concatenate the first and second enhanced feature maps; and provide the concatenated first and second enhanced feature maps to at least one neural network layer to generate the object detection localization, class, and confidence.

3. The system of claim 1, wherein the processor to determine the object detection localization, class, and confidence comprises the processor to: determine a shared object patch between the first and second enhanced feature maps; retain object detection localization and confidence scoring from the first enhanced feature map for the shared object patch and discard object detection localization and confidence scoring from the second enhanced feature map for the shared object patch in response to the object detection confidence scoring for the shared object patch in the first enhanced feature map comparing favorably to the object detection confidence scoring for the shared object patch in the second enhanced feature map; and determine the object detection localization, class, and confidence scoring based at least on the retained object detection localization and confidence scoring from the first enhanced feature map.

4. The system of claim 1, wherein the processor to determine the object detection localization, class, and confidence comprises the processor to: retain object detection localization and confidence scoring for shared object patches between the first and second enhanced feature maps based on higher confidence scoring for each of the shared object patches to generate a bidirectional enhanced feature map for the current frame; and determine the object detection localization, class, and confidence scoring based on the bidirectional enhanced feature map.

11. A computer-implemented method for performing object detection comprising: receiving a current, a previous, and a subsequent frame of video; performing still image object detection on the current frame of video to determine a current frame initial feature map comprising object detection localization and confidence scoring for the current frame; detecting, based on patch similarity, first and second paired patches between the current and previous frames, respectively, and third and fourth paired patches between the current and subsequent frames, respectively; modifying a first prediction result in the current frame initial feature map for the first paired patch of the current frame to a maximum cached confidence score of the second paired patch of the previous frame to generate a first enhanced feature map comprising forward object detection localization and confidence scoring for the current frame; modifying a third prediction result in the current frame initial feature map for the third paired patch of the current frame to a maximum cached confidence score of the fourth object patch of the subsequent frame to generate at least a second enhanced feature map comprising reverse object detection localization and confidence scoring for the current frame; and determining and outputting an object detection localization, class, and confidence scoring for the current frame using the first and second enhanced feature maps.

12. The method of claim 11, wherein said determining the object detection localization, class, and confidence comprises: concatenating the first and second enhanced feature maps; and providing the concatenated first and second enhanced feature maps to at least one neural network layer to generate the object detection localization, class, and confidence.

13. The method of claim 11, wherein said determining the object detection localization, class, and confidence comprises: retaining only object detection localization and confidence scoring for shared object patches between the first and second enhanced feature maps based on higher confidence scoring for each of the shared object patches to generate a bidirectional enhanced feature map for the current frame; and determining the object detection localization, class, and confidence scoring based on the bidirectional enhanced feature map.

14. The method of claim 11, further comprising: sequentially generating, prior to said modifying the first prediction result and in a temporal order of the video, a plurality of enhanced feature maps comprising forward object detection localization and confidence scoring for each of a plurality of frames of the video previous to the previous frame and the previous frame, wherein the maximum cached confidence score is in the enhanced feature map for the previous frame.

16. At least one non-transitory machine readable medium comprising a plurality of instructions that, in response to being executed on a computing device, cause the computing device to perform object detection by: receiving a current, a previous, and a subsequent frame of video; performing still image object detection on the current frame of video to determine a current frame initial feature map comprising object detection localization and confidence scoring for the current frame; detecting, based on patch similarity, first and second paired patches between the current and previous frames, respectively, and third and fourth paired patches between the current and subsequent frames, respectively; modifying a first prediction result in the current frame initial feature map for the first paired patch of the current frame to a maximum cached confidence score of the second paired patch of the previous frame to generate a first enhanced feature map comprising forward object detection localization and confidence scoring for the current frame; modifying a third prediction result in the current frame initial feature map for the third paired patch of the current frame to a maximum cached confidence score of the fourth object patch of the subsequent frame to generate at least a second enhanced feature map comprising reverse object detection localization and confidence scoring for the current frame; and determining and outputting an object detection localization, class, and confidence scoring for the current frame using the first and second enhanced feature maps.

17. The non-transitory machine readable medium of claim 16, wherein said determining the object detection localization, class, and confidence comprises: concatenating the first and second enhanced feature maps; and providing the concatenated first and second enhanced feature maps to at least one neural network layer to generate the object detection localization, class, and confidence.

18. The non-transitory machine readable medium of claim 16, wherein said determining the object detection localization, class, and confidence comprises: retaining only object detection localization and confidence scoring for shared object patches between the first and second enhanced feature maps based on higher confidence scoring for each of the shared object patches to generate a bidirectional enhanced feature map for the current frame; and determining the object detection localization, class, and confidence scoring based on the bidirectional enhanced feature map.

Claims 26-46 of the instant application are unpatentable under the judicially created doctrine of “obviousness-type” double patenting with respect to claim 1 of U.S. Patent No. 11354903.
The application claims define an obvious variation of the invention claimed in U.S. Patent No. 11354903.
The assignee of all applications is the same.
Claims 26-46 of the instant application are obvious over the patent claims in that the claims of the patent are substantially similar to the claims of the instant application. Claims 26-46 of the instant application therefore are not patentably distinct from the earlier patent claims and as such are unpatentable for obviousness-type double patenting.
Similarly, claim 27 is rejected as being obvious over claim 1 of the Patent because both recite the maximum cached confidence score. Claim 28 is rejected as being obvious over claim 1 of the Patent as it describes forward object detection. Claim 29 is rejected as being obvious over claim 1 of the Patent because claim 1 refers to generating the enhanced feature map. Claims 27-29 are substantially similar to claim 1 of the Patent and are therefore obvious over each other. Claims 34-36 and 41-43 are rejected for the same reasons claims 27-29 are rejected. Claim 45 is rejected as being obvious over claim 3 of the Patent.
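
For orientation only, the score-raising step common to application claim 26 and patent claim 1, together with the higher-confidence retention of application claim 32 and patent claim 4, can be sketched as follows. This is a minimal illustration and not a claim construction. It assumes that each feature map is reduced to a mapping from a patch identifier to a confidence score and that the patch-similarity pairing has already been performed. The names enhance and merge_bidirectional, the patch identifiers, and the score values are all hypothetical and appear in neither the application nor the Patent.

```python
# Illustrative sketch only. The function names, patch identifiers, and scores
# are hypothetical; the dict-based "feature map" is a simplifying assumption.

def enhance(initial_scores, pairs, neighbor_scores):
    """Raise each paired current-frame score to its partner's score.

    pairs holds (current_patch, neighbor_patch) tuples and is assumed to come
    from a separate patch-similarity step that is not modeled here.
    """
    enhanced = dict(initial_scores)
    for cur_patch, nbr_patch in pairs:
        nbr_score = neighbor_scores[nbr_patch]
        # Only ever increase a score, never lower it.
        if nbr_score > enhanced[cur_patch]:
            enhanced[cur_patch] = nbr_score
    return enhanced


def merge_bidirectional(forward, reverse):
    """For each shared patch, retain the higher of the two confidence scores."""
    merged = dict(forward)
    for patch, score in reverse.items():
        if patch not in merged or score > merged[patch]:
            merged[patch] = score
    return merged


current = {"p1": 0.40, "p2": 0.70}      # current frame initial scores
previous = {"q1": 0.90}                 # previous frame scores
subsequent = {"r1": 0.50, "r2": 0.60}   # subsequent frame scores

forward = enhance(current, [("p1", "q1")], previous)
reverse = enhance(current, [("p1", "r1"), ("p2", "r2")], subsequent)
bidirectional = merge_bidirectional(forward, reverse)
```

In this sketch the forward pass raises p1 from 0.40 to 0.90, the reverse pass raises p1 only to 0.50 and leaves p2 at 0.70, and the merge keeps the higher score for each shared patch.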

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Ren et al. (US 2021/0133461) discloses a video visual relation detection method and systems.

Choi et al. (US 9965719) discloses subcategory-aware convolutional neural networks for object detection. Choi proposes a method composed of two steps: 1) using a region proposal network and 2) using an object detection network given the region proposals. The method uses the concept of subcategories (i.e., groups of examples that share similar appearance characteristics, e.g., cars with a frontal view point, people standing or walking, etc.) to learn a compact representation of the objects. In the first step, a CNN method is learned that can propose object regions directly from an image. The subcategory-aware convolutional filters are learned to predict the possible location of the target objects in the image. In this step, many false-positive boxes may be generated. However, the goal is not to miss any true positives. In the second step, given the candidate box proposals, each region is evaluated by using the subcategory-aware classification model. In contrast to the region proposal network, this model may have a much more complex classification model, which may take a longer time but produces high quality detection results. Therefore, the subcategory information is exploited to learn better object proposal and object detection models.

Huang et al. (US 2017/0147905) discloses, at paragraph [0072], a traditional NN-based face detector. A neural network-based face detector refers to a face detection system that used neural networks before the recent breakthrough results of CNNs for image classification. Early works dating back to the 1990s trained neural network-based detectors that are activated only on faces having a specific size, and applied the detectors on the image pyramid in a sliding-window fashion. While the systems and methods presented in Huang have a similar detection pipeline, embodiments use modern CNNs as detectors. Hence the systems and methods presented in Huang are in a sense “modern NN-based detectors.”

Zhao et al. (US 2009/0141940) discloses integrated systems and methods for video-based object modeling, recognition, and tracking, in which object images in video files are modeled, recognized, and tracked. In one embodiment, a video file, which includes a plurality of frames, is received. An image of an object is extracted from a particular frame in the video file, and a subsequent image is also extracted from a subsequent frame. A similarity value is then calculated between the extracted images from the particular frame and the subsequent frame. If the calculated similarity value exceeds a predetermined similarity threshold, the extracted object images are assigned to an object group. The object group is used to generate an object model associated with images in the group, wherein the model is comprised of image features extracted from optimal object images in the object group. Optimal images from the group are also used for comparison to other object models for purposes of identifying images.
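
As a rough illustration of the threshold-based grouping summarized for Zhao, the following sketch assigns consecutive extracted object images to the same group when their similarity exceeds a threshold. It is a sketch under stated assumptions: cosine similarity over small feature vectors stands in for whatever similarity value Zhao actually computes, and the names cosine_similarity and group_consecutive, the feature values, and the 0.8 threshold are hypothetical.

```python
import math

# Illustrative sketch only. Cosine similarity is an assumed stand-in for the
# similarity value in the reference; all names and values are hypothetical.

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_consecutive(features, threshold):
    """Group frame indices whose consecutive features exceed the threshold."""
    groups = []
    for i, feat in enumerate(features):
        if groups and cosine_similarity(features[i - 1], feat) > threshold:
            groups[-1].append(i)   # similar to the previous image: same group
        else:
            groups.append([i])     # otherwise start a new object group
    return groups


# One feature vector per extracted object image, in frame order.
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
groups = group_consecutive(frames, threshold=0.8)
```

Here the first two images are nearly parallel in feature space and fall into one group, while the third starts a new group.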

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DHAVAL V PATEL, whose telephone number is (571) 270-1818. The examiner can normally be reached Monday to Friday, 8:00 am to 4:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sam Ahn, can be reached at 571-272-3044. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DHAVAL V PATEL/
Primary Examiner, Art Unit 2631
10/31/2022