DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Applicant's submission filed on 13 October 2021 has been entered.  Claims 1, 4, 9 and 10 have been amended.  Claims 3, 7 and 8 have been previously canceled.  Claims 1, 2, 4-6 and 9-11 are currently pending.

Response to Arguments
The 35 U.S.C. §112(b) rejection of claims 1, 2, 4-6 and 9-11 has been withdrawn in view of Applicant’s response.

Allowable Subject Matter
Claims 1, 2, 4-6 and 9-11 are allowed.

Reasons for Allowance
The following is an examiner’s statement of reasons for allowance:
Claims 1, 2, 4-6 and 9-11 are allowable over the prior art because the prior art references, taken individually or in combination, fail to particularly disclose, fairly suggest, or render obvious Applicant's independent claim language, as argued by Applicant on pages 8-9 of Applicant’s Remarks dated 13 October 2021, which the Examiner considers persuasive.  As enumerated below, the prior art discloses convolutional neural networks that are used for object detection.  The convolutional neural networks 
The closest prior art, A. Piergiovanni, C. Fan, and M. S. Ryoo, Learning latent sub-events in activity videos using temporal attention filters, in Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), 1 January 2017, discloses a method for temporal attention filters and describes how the filters are used for human activity recognition from videos.  Activities are composed of multiple temporal parts (e.g., sub-events) with different durations and speeds, and the model explicitly learns the temporal structure using multiple attention filters.  The model learns a set of optimal static temporal attention filters that can be shared across different videos and uses these filters to dynamically adjust attention filters for each test video using recurrent long short-term memory networks (LSTMs).  This allows the temporal attention filters to learn latent sub-events specific to each activity.
Baradel F, Wolf C, Mille J, Pose-conditioned spatio-temporal attention for human action recognition, arXiv preprint arXiv:1703.10106, 29 March 2017, discloses a system and method that recognizes human actions from multi-modal video data.  The method uses a pose stream and an RGB stream for recognition.  The pose stream is processed with a convolutional model taking as input a 3D 
Palanisamy et al., U.S. Publication No. 2020/0139973, discloses systems and methods for spatial and temporal attention-based deep reinforcement learning of hierarchical lane-change policies for controlling an autonomous vehicle.  The system processes image data received from an environment to learn the lane-change policies as a set of hierarchical actions, and evaluates the lane-change policies to calculate loss and gradients to predict an action-value function that is used for learning and to update parameters of the lane-change policies.  The system selects the regions of the image data that are relevant, and a temporal attention module learns temporal attention weights that are applied to past frames of image data to indicate their relative importance in deciding which lane-change policy to select.
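For illustration only, and not as a characterization of any cited reference's actual implementation, the general idea of applying temporal attention weights to past frames can be sketched as follows.  The function name, the per-frame relevance scores, and the use of softmax normalization are assumptions introduced for this sketch; the references as summarized above do not specify them.

```python
import math


def temporal_attention(frame_features, scores):
    """Combine past-frame feature vectors using softmax-normalized weights.

    frame_features: list of equal-length feature vectors, one per past frame.
    scores: hypothetical per-frame relevance scores (in a trained system these
            would be produced by the model; here they are supplied directly).
    Returns (weights, context), where context is the weighted sum of features.
    """
    if not scores or len(frame_features) != len(scores):
        raise ValueError("need one score per frame and at least one frame")

    # Numerically stable softmax over the time axis.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the per-frame feature vectors.
    dim = len(frame_features[0])
    context = [
        sum(w * feat[i] for w, feat in zip(weights, frame_features))
        for i in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    # Three past frames with two-dimensional features; the most recent frame
    # is given the highest (assumed) score.
    frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    weights, context = temporal_attention(frames, [0.1, 0.2, 2.0])
    print(weights, context)
```

In this sketch a frame with a higher score contributes more to the combined feature vector, which is the sense in which the weights indicate the relative importance of past frames.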

With respect to the independent claims, the claimed limitations “generate, based at least in part on a first group of the plurality of first feature maps calculated at a first time and a second group of the plurality of first feature maps calculated prior to the first time, a time observation map comprising a plurality of elements ... wherein the hardware processor generates the time observation map based on a result of an inner product of the feature quantities defined for each element of the plurality of elements along each of the time direction, a position direction in the plurality of first feature maps, and a relationship direction among the plurality of first feature maps, and wherein the inner product of the feature quantities for each element of the plurality of first feature maps is defined as the first weighting value for each element of the plurality of first feature maps belonging to the first group and the plurality of first feature maps belonging to the second group” in conjunction with other elements of the .

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TRACY MANGIALASCHI whose telephone number is (571)270-5189. The examiner can normally be reached M-F, 9:30AM TO 6:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le, can be reached at (571) 272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like 




/TRACY MANGIALASCHI/Examiner, Art Unit 2668
/VU LE/Supervisory Patent Examiner, Art Unit 2668