DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
2.	A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 02/11/2021 has been entered.
 
Response to Arguments
3.	Applicant's arguments filed 02/11/2021 have been fully considered but they are not persuasive. 
Applicant argues that Vijayanarasimhan et al., in view of Yang et al., do not teach a controller configured to send an action description module a sequence of image frames from a host system when it is determined that there is sufficient movement occurring within the sequence of image frames (Amendment, pages 7, 8).
The examiner disagrees, since Vijayanarasimhan et al., disclose “The machine learning module 202 may identify one or more video segments from the video that include frames between the start time and the end time for each action in the video.  Each video includes a single action or multiple actions.  For example, a ten second .

Claim Rejections - 35 USC § 103
4.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

5.	Claims 1, 4, 5, 9, 10, 13, 14, 18 are rejected under 35 U.S.C. 103 as being unpatentable over Vijayanarasimhan et al., (US PAP 2019/0114487) in view of Yang et al., (US PAP 2019/0102908).
	As per claims 1, 10, 19, Vijayanarasimhan et al., teach a system for enhancing the accessibility of Audio Visual content, the system comprising:
	a controller configured to send an action description module a sequence of image frames from a host system when it is determined that there is sufficient movement occurring within the sequence of image frames (“Each video includes a 
the action description module implemented in hardware, software, or a combination of hardware and software configured to recognize action happening within the sequence of image frames from the host system and generate a tag describing the action happening within the sequence of frames (paragraphs 19 – 21, 46 – 50);
a neural network configured to convert image frame level features to sequence window level features (“The machine learning module 202 may use a sliding window localizer to analyze the features within a series of frames”; paragraphs 50, 61);
	However, Vijayanarasimhan et al., do not specifically teach that the action description module includes a first neural network configured to generate frame level features from image frame data in the sequence of frames and a third neural network configured to generate a classification of the action happening within the sequence window level features.
	Yang et al., disclose that at operation 124, a backbone neural network is applied to each frame in the input sequence of frames to generate a respective spatial feature volume for each anchor tube… At operation 128 a head neural network is selected for processing the regional features associated with each anchor tube.  According to some 

	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use different neural networks as taught by Yang et al., in Vijayanarasimhan et al., because that would help improve classification accuracy (paragraph 27).

	As per claims 4, 13, Vijayanarasimhan et al., in view of Yang et al., further disclose the action description module is configured to generate the tag describing the action happening within the sequence of frames from the classification of the action happening within the sequence window level features (“the videos were tagged with the action ‘jumping into pool.’”; Vijayanarasimhan et al., paragraphs 46 – 50; Yang et al., paragraphs 23 – 36).

	As per claims 5, 14, Vijayanarasimhan et al., in view of Yang et al., further disclose the image frame data is video game frame data (Vijayanarasimhan et al., paragraphs 46 – 50; Yang et al., paragraphs 23 - 36).

6.	Claims 6, 7, 15, 16 are rejected under 35 U.S.C. 103 as being unpatentable over Vijayanarasimhan et al., (US PAP 2019/0114487) in view of Yang et al., (US PAP 2019/0102908); and further in view of Zavesky et al., (US PAP 2020/0134298).
As per claims 6, 15, Vijayanarasimhan et al., further disclose synchronizing the output of the action description module with one or more other neural network modules (Vijayanarasimhan et al., paragraphs 46 – 50).
However, Vijayanarasimhan et al., in view of Yang et al., do not specifically teach a controller coupled to the host system and the action description module, wherein the controller is configured to activate the action description module in response to an input from a user.
Zavesky et al., disclose that the visual communication session may be established for such devices after AS 104 retrieves one or more configuration settings for the user 191, user 192, and/or user 193, determines which configuration setting(s), if any, to apply based upon the context(s), and activates the respective action detection 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to activate the action description module as taught by Zavesky et al., in Vijayanarasimhan et al., because that would help apply more stringent filtering by activating more action detection models and configuration settings for removing or altering visual content (paragraph 49).

As per claims 7, 16, Vijayanarasimhan et al., in view of Zavesky et al., further disclose the one or more other neural network modules includes an Acoustic Effect Annotation module implemented in hardware, software, or a combination of hardware and software configured to classify primary acoustic effects occurring within an audio segment, wherein the audio segment is synchronized to occur during presentation of the sequence of image frames (“The audio may include audio effects that correspond to actions.  For example, the audio effect may include a splashing sound that corresponds to a video clip of people jumping into a swimming pool.”; Vijayanarasimhan et al., paragraphs 81 – 83).

7.	Claims 8, 17 are rejected under 35 U.S.C. 103 as being unpatentable over Vijayanarasimhan et al., (US PAP 2019/0114487) in view of Yang et al., (US PAP 2019/0102908); and further in view of Aman et al., (US PAP 2011/0173235).
As per claims 8, 17, Vijayanarasimhan et al., in view of Yang et al., do not specifically teach a text to speech synthesis module coupled to the action description 
Aman et al., disclose that the same differentiated marks that are integrated and synthesized by the session processor into new events and marks, are used as is or in combination with newly generated session processor events and marks to controllably direct the flow of image frames into and out of the frame buffer for mixing (paragraph 156).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to synthesize the tag as taught by Aman et al., in Vijayanarasimhan et al., because that would help better generate videos that are used by the video application 103 to identify video segments that include action (paragraph 33).

Conclusion
8.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEONARD SAINT CYR whose telephone number is (571) 272-4247.  The examiner can normally be reached Monday through Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/LEONARD SAINT CYR/
Primary Examiner, Art Unit 2658