DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

1.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

2.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

3.	Claims 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Krishnamurthy et al., US 2020/0134316 A1, as applied to claim 1 above, and further in view of Salekin, WO 2019166296 A1.

4.	As per claim 19, Krishnamurthy discloses: A computer-implemented method of determining audio tags of video frames for a video game, the method comprising:
determining, for each frame in a plurality of frames of an animation sequence of the video game, (Krishnamurthy, [0058], “The Action description module 110 takes a short sequence of image frames from a video stream as input and generates a text description of the activity occurring within the video stream.”, and [0027]) whether said frame is associated with a blank audio tag or a non-blank audio tag using a binary classifier; (Salekin, [0041], “The DCNN audio tagging model 34 utilizes a DCNN (dilated convolution neural network) as a binary classifier to detect and tag the presence of a target audio event in an audio clip. More particularly, the processor 22 is configured to execute program instructions corresponding to the DCNN audio tagging model 34 to determine a classification output indicating the presence or non-presence of a particular target audio event.”)

5.	Krishnamurthy does not expressly disclose: determining, for each of the frames determined to be associated with a non-blank audio tag, an identity of the non-blank audio tag using a multiclass classifier.

6.	Salekin discloses: determining, for each of the frames determined to be associated with a non-blank audio tag, an identity of the non-blank audio tag using a multiclass classifier. (Salekin, [0051], “However, it will be appreciated, that in some alternative embodiments, the DCNN audio tagging model 34 may comprise a multi-class model in which the output layer has neuron with sigmoid activation for each target audio event that is to be detected (e.g., four) to provide a multi-class classification output C.sub.tag . Thus, a single set of weights and/or parameters may be learned and used for detecting the presence or non-presence of all target audio events that are to be detected.”)

7.	Salekin is analogous art with respect to Krishnamurthy because they are from the same field of endeavor, namely image processing.  At the time the application was filed, it would have been obvious to a person of ordinary skill in the art to incorporate into the teaching of Krishnamurthy the process, as taught by Salekin, in which the tag for each frame comprises an audio tag out of a set of audio tags, and wherein the set of audio tags includes a blank audio tag.  The suggestion for doing so would have been to evaluate a classification output.  Therefore, it would have been obvious to combine Salekin with Krishnamurthy.
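
For illustration only, the two-stage arrangement recited in claim 19 (a binary classifier that separates blank from non-blank frames, followed by a multiclass classifier applied only to the non-blank frames) can be sketched as follows. This is a minimal sketch and not a characterization of the code of either reference or of the application: the names `tag_frames`, `toy_binary`, `toy_multiclass`, `BLANK` and `TOY_TAGS`, the threshold, and the representation of a frame as a short list of scores are all assumptions introduced here.

```python
from typing import Callable, List, Sequence

BLANK = "<blank>"  # hypothetical label standing in for the blank audio tag

Frame = Sequence[float]  # assumption: each frame is reduced to a feature vector


def tag_frames(
    frames: List[Frame],
    is_non_blank: Callable[[Frame], bool],
    identify_tag: Callable[[Frame], str],
) -> List[str]:
    """Two-stage tagging: a binary gate first, then a multiclass identity."""
    tags = []
    for frame in frames:
        if is_non_blank(frame):
            # Only frames that pass the binary stage reach the multiclass stage.
            tags.append(identify_tag(frame))
        else:
            tags.append(BLANK)
    return tags


# Toy stand-ins for trained classifiers (illustrative assumptions only).
def toy_binary(frame: Frame) -> bool:
    # Treat a frame as non-blank when any score exceeds an arbitrary threshold.
    return max(frame) > 0.5


TOY_TAGS = ["footstep", "impact", "swing"]  # hypothetical tag set


def toy_multiclass(frame: Frame) -> str:
    # Pick the tag whose score is highest (a plain argmax).
    best = max(range(len(frame)), key=lambda i: frame[i])
    return TOY_TAGS[best]


frames = [[0.1, 0.2, 0.1], [0.9, 0.1, 0.0], [0.0, 0.3, 0.8]]
print(tag_frames(frames, toy_binary, toy_multiclass))
# prints ['<blank>', 'footstep', 'swing']
```

In this sketch the multiclass stage never sees a frame that the binary stage marked blank, which is the ordering the claim language describes; in a single multi-class model of the kind quoted from Salekin, [0051], both decisions would instead come from one output layer.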

8.	As per claim 20, Krishnamurthy in view of Salekin discloses: The method of claim 19, wherein the multiclass classifier comprises a neural network, the method further comprising:
sequentially inputting a sequence of video frames of an animation from the video game into the neural network; (Krishnamurthy, [0058], “The Action description module 110 takes a short sequence of image frames from a video stream as input and generates a text description of the activity occurring within the video stream. To implement this, three convolutional Neural Networks are used. A first Action Description NN 301 takes a short sequence of video frames, referred to herein as a window, and generates segment-level or video-level feature vectors, e.g., one feature vector for each video frame in the window.”)
processing the sequence of video frames through a plurality of neural network layers; (Krishnamurthy, [0045], “Generally, neural networks used in the component systems of the On-Demand Accessibility System may include one or more of several different types of neural networks and may have many different layers.”, and [0058]) and
outputting, from the neural network and based on the processed sequence of video frames, data indicative of an audio tag for each of the video frames, (Krishnamurthy, [0039], “The Acoustic Effect Annotation module 150 receives audio information from the control module 101 and generates corresponding text information. The Acoustic Effect Annotation module 150, controller 101 or host system 102 may include an audio tagging module 109 that combines the text information, e.g., as subtitles or captions with video frame information so that the text information appears on corresponding video images presented by the video output device 104.”) wherein the neural network comprises one or more recurrent neural network layers. (Krishnamurthy, [0045], “Generally, neural networks used in the component systems of the On-Demand Accessibility System may include one or more of several different types of neural networks and may have many different layers. By way of example and not by way of limitation the classification neural network may consist of one or multiple convolutional neural networks (CNN), recurrent neural networks (RNN) and/or dynamic neural networks (DNN).”)


Response to Arguments

9.	Applicant's arguments filed 06/27/2022 have been fully considered but they are not persuasive.

	Applicant argues that Krishnamurthy in view of Salekin does not teach “determining, for each of the frames determined to be associated with a non-blank audio tag, an identity of the non-blank audio tag using a multiclass classifier” as recited in claim 19.
	Examiner replies: The most reasonable interpretation of the word “blank” is “empty” or “vacant”. The Examiner is equating a blank tag or a non-blank tag to an empty frame or a non-empty frame. The reference indicates this by teaching, in paragraph [0051]: “However, it will be appreciated, that in some alternative embodiments, the DCNN audio tagging model 34 may comprise a multi-class model in which the output layer has neuron with sigmoid activation for each target audio event that is to be detected (e.g., four) to provide a multi-class classification output C.sub.tag . Thus, a single set of weights and/or parameters may be learned and used for detecting the presence or non-presence of all target audio events that are to be detected.” In this case, the presence or non-presence of a target audio event is equated to a non-blank or blank tag, respectively.

Allowable Subject Matter

10.	Claims 1-13 and 15-18 are allowed.

Conclusion

11. 	THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABDERRAHIM MEROUAN whose telephone number is (571)270-5254.  The examiner can normally be reached on Monday to Friday 7:30 AM to 5:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kent Chang, can be reached on 571-272-7761. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ABDERRAHIM MEROUAN/Primary Examiner, Art Unit 2619