DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Allowable Subject Matter
Claims 2-11 and 14-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
With regard to claim 2, several of the features of this claim were known in the art, as evidenced by the combination of Dimitrova et al (U.S. PG Pub. No. 2002/0147782) in view of Ahmed et al (U.S. Patent No. 11,216,517), Lyu et al (U.S. PG Pub. No. 2021/0201934) and Ogawa et al (U.S. PG Pub. No. 2003/0165320), which renders obvious the limitations of parent claim 1. In particular, Dimitrova discloses initiating a machine learning (“visual analysis”) component on the video component to identify one or more sensitive portions in the one or more image frames at ¶¶ [0038]-[0039], [0041]. Dimitrova further discloses using a neural network as a machine learning (“visual analysis”) component at ¶ [0060]. However, Dimitrova does not disclose initiating a reinforcement learning algorithm on the one or more image frames, wherein initiating further comprises: observing a current state of a first frame, wherein the first frame is associated with the one or more image frames; initiating the masking algorithm on the first frame; implementing, using the masking algorithm, the masking action policy on the first frame to generate a masked first frame, wherein implementing the masking action policy changes the current state of the first frame to a next state, wherein the next state is a current state of the masked first frame; initiating the CNN algorithm on the masked first frame; and determining, using the CNN algorithm, a performance assessment output for the masked first frame based on at least implementing the masking action policy.
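For illustration only, the sequence recited in the limitation above (observe a state, apply a masking action, obtain a next state, assess the masked frame) can be sketched as follows. The sketch is not drawn from the application or from any cited reference. All names (`Region`, `apply_mask`, `residual_score`, `masking_step`) are hypothetical, and a simple pixel-threshold score stands in for the CNN performance assessment.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: a frame is a grayscale grid of intensities in [0, 1].
Frame = List[List[float]]


@dataclass(frozen=True)
class Region:
    """Hypothetical masking action: a rectangle to blank out."""
    top: int
    left: int
    height: int
    width: int


def apply_mask(frame: Frame, region: Region) -> Frame:
    """Return a masked copy of the frame; the input frame is left unchanged."""
    masked = [row[:] for row in frame]
    for r in range(region.top, min(region.top + region.height, len(masked))):
        for c in range(region.left, min(region.left + region.width, len(masked[r]))):
            masked[r][c] = 0.0
    return masked


def residual_score(frame: Frame, threshold: float = 0.5) -> float:
    """Stand-in for a CNN assessment: share of pixels NOT above the threshold."""
    total = sum(len(row) for row in frame)
    flagged = sum(1 for row in frame for value in row if value > threshold)
    return 1.0 - flagged / total


def masking_step(
    frame: Frame,
    policy: Callable[[Frame], Region],
    assess: Callable[[Frame], float],
) -> Tuple[Frame, float]:
    """One observe / act / assess step."""
    current_state = frame                            # observe the current state
    action = policy(current_state)                   # masking action policy
    next_state = apply_mask(current_state, action)   # masked frame is the next state
    return next_state, assess(next_state)            # performance assessment output


# Example: a 4x4 frame with a bright 2x2 block treated as the sensitive portion.
frame = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.9, 0.1],
    [0.1, 0.9, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
score_before = residual_score(frame)
masked_frame, score_after = masking_step(
    frame, lambda f: Region(1, 1, 2, 2), residual_score
)
```

In a reinforcement learning setting, the returned score would typically serve as the reward signal used to update the policy; that update step is omitted here.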
With regard to claims 3-6, these claims depend from claim 2 and therefore incorporate the features of that claim that were found allowable. These claims are found allowable for the same reasons as were provided with respect to their parent claim(s).
With regard to claims 7 and 14, several of the features of these claims were known in the art, as evidenced by the combination of Dimitrova et al (U.S. PG Pub. No. 2002/0147782) in view of Ahmed et al (U.S. Patent No. 11,216,517), Lyu et al (U.S. PG Pub. No. 2021/0201934) and Ogawa et al (U.S. PG Pub. No. 2003/0165320), which renders obvious the limitations of parent claims 1 and 13, respectively. In particular, Lyu discloses initiating a vectorization engine on the one or more audio portions and segmenting, using the audio word2vec algorithm, the audio component into one or more audio portions at ¶¶ [0104]-[0108]. However, Lyu does not disclose mapping, using the vectorization engine, the one or more audio portions into one or more fixed-length audio portion vectors capable of being represented in a vector space.
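For illustration only, mapping audio portions of differing lengths into fixed-length vectors can be sketched as follows. The sketch is not drawn from the application or from Lyu. The function name `to_fixed_length_vector` is hypothetical, and mean/max pooling over per-frame features is a swapped-in simplification rather than the claimed vectorization engine.

```python
from typing import List, Sequence


def to_fixed_length_vector(portion: Sequence[Sequence[float]]) -> List[float]:
    """Pool a variable-length run of feature frames into one vector.

    Assumption: each frame carries the same number of features. The output
    always has 2 * (features per frame) entries: per-feature means followed
    by per-feature maxima, regardless of how many frames the portion has.
    """
    if not portion:
        raise ValueError("audio portion must contain at least one frame")
    dim = len(portion[0])
    means = [sum(frame[i] for frame in portion) / len(portion) for i in range(dim)]
    maxes = [max(frame[i] for frame in portion) for i in range(dim)]
    return means + maxes


# Two portions of different lengths map to vectors of the same length.
short_portion = [[1.0, 2.0], [3.0, 4.0]]
long_portion = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0], [2.0, 1.0], [2.0, 1.0]]
short_vec = to_fixed_length_vector(short_portion)
long_vec = to_fixed_length_vector(long_portion)
```

Because both outputs have the same dimension, they can be compared in a common vector space, for example by Euclidean or cosine distance.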
With regard to claims 8-11 and 15-19, these claims depend from claims 7 and 14, respectively, and therefore incorporate the features of those claims that were found allowable. These claims are found allowable for the same reasons as were provided with respect to their parent claim(s).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 12-13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Dimitrova et al (U.S. PG Pub. No. 2002/0147782) in view of Ahmed et al (U.S. Patent No. 11,216,517), Lyu et al (U.S. PG Pub. No. 2021/0201934) and Ogawa et al (U.S. PG Pub. No. 2003/0165320).
With regard to claim 1, as a matter of claim construction, the preamble has been found non-limiting. The determination of whether preamble recitations are structural limitations or mere statements of purpose or use "can be resolved only on review of the entirety of the [record] to gain an understanding of what the inventors actually invented and intended to encompass by the claim" as drafted without importing "'extraneous' limitations from the specification." Corning Glass Works, 868 F.2d at 1257, 9 USPQ2d at 1966. In the instant matter, the statements in the preamble of claim 1 do not result in a structural difference (or, in the case of process claims, manipulative difference) between the claimed invention and the prior art. Instead, it is found that the preamble merely states the purpose or intended use of the invention, rather than any distinct definition of any of the claimed invention’s limitations. Accordingly, the preamble is not considered a limitation and is of no significance to claim construction. See Pitney Bowes, Inc. v. Hewlett-Packard Co., 182 F.3d 1298, 1305, 51 USPQ2d 1161, 1165 (Fed. Cir. 1999). See also Rowe v. Dror, 112 F.3d 473, 478, 42 USPQ2d 1550, 1553 (Fed. Cir. 1997).
With respect to the prior art, the limitations of this claim are obvious, as evidenced by the following references:
The Dimitrova reference
Dimitrova discloses a non-transitory storage device and a processing device coupled to the at least one non-transitory storage device at ¶ [0027] and FIG. 1.
Dimitrova discloses electronically retrieving an audiovisual file from a data repository at ¶¶ [0026]-[0027], [0035].
Dimitrova discloses initiating a file editing engine on the audiovisual file to separate the audiovisual file into a video component and an audio component, wherein the video component comprises one or more image frames at ¶ [0038]; to wit: “[T]he demultiplexer 140 may also include a demodulator to split an NTSC or PAL or similar broadcast signal into respective visual and audible information. Further in such case, the demultiplexer 140 would also utilize a frame grabber so that full digitized frames of information can be sent to the visual analysis component 160 of the feature extraction module.”
Dimitrova discloses initiating a machine learning (“visual analysis”) component on the video component to identify one or more sensitive portions in the one or more image frames at ¶¶ [0038]-[0039], [0041]. Dimitrova further discloses using a neural network as a machine learning (“visual analysis”) component at ¶ [0060], but does not specify a convolutional neural network (CNN) algorithm. However, this limitation was known in the art as evidenced by the Ahmed reference discussed below.
Dimitrova discloses initiating a machine learning (“audio analysis”) component on the audio component to identify one or more sensitive portions in the audio component at ¶¶ [0038]-[0039], [0044], [0047] (“The classification module 190 categorizes segments based on whether they are sensitive or non-sensitive based on the learned categories and the output of the learning module 180”), [0052] (“The segments are then fed into the classification module. For each segment, a number representing the likelihood of the segment belonging to one of the sensitive categories is obtained”), but does not specify that the machine learning (“audio analysis”) component comprises an audio word2vec algorithm. However, this limitation was known in the art as evidenced by the Lyu reference discussed below.
Dimitrova discloses initiating a masking algorithm on the one or more image frames and the audio component and generating, using the masking algorithm, a masking action policy at ¶¶ [0050] (“[T]he segmentation and categorization module 280, which uses the feature extraction from the three feature extraction components, the transcript engine 250, the visual engine 260, and the audio engine 270, in combination with the previously stored learned criteria, in order to determine whether to tell the filtering module 290 whether to filter the video program during a given segment.”), [0055]-[0056] (“The filtering module 290, …, makes a determination, based on the segment duration and preset configuration settings, of the optimal method to be employed in filtering the offending content. For example, the module 290 can simply skip the segment, ... Alternatively, the filter module 290 advantageously can be configured to substitute another video signal for that segment (e.g., show a Barney the dinosaur interstitial or Web page). Moreover, …, the filter module 290 advantageously can mask or blur out that particular portion. For example, when the audio portion of the segment contains an offensive word or phrase but is otherwise unobjectionable, the user may wish to merely garble the offending word or phrase rather than draw attention to the fact that a portion of the film was excised.”).
Dimitrova discloses implementing, using the masking algorithm, the masking action policy on the one or more sensitive portions in the one or more image frames and the one or more sensitive portions in the audio component at ¶¶ [0055]-[0056] and FIG. 3 (step 290: “Remove/Substitute Sensitive Segments”).
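For illustration only, the kind of determination quoted above, in which a filtering method is chosen from a segment's likelihood score, its duration, and preset configuration settings, can be sketched as follows. This is not Dimitrova's implementation. The names `FilterAction`, `Segment`, `FilterConfig`, and `choose_action`, and all threshold values, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class FilterAction(Enum):
    PASS = "pass"                  # leave the segment untouched
    GARBLE_AUDIO = "garble_audio"  # garble an offending word or phrase
    BLUR_VIDEO = "blur_video"      # mask or blur the offending portion
    SUBSTITUTE = "substitute"      # show an alternate signal instead
    SKIP = "skip"                  # skip over the segment entirely


@dataclass
class Segment:
    duration_s: float   # segment length in seconds
    likelihood: float   # likelihood of belonging to a sensitive category
    audio_only: bool    # True when only the audio portion is objectionable


@dataclass
class FilterConfig:
    # Assumed preset configuration settings; the values are illustrative.
    likelihood_threshold: float = 0.5
    short_segment_s: float = 3.0
    substitute_long_segments: bool = False


def choose_action(segment: Segment, config: FilterConfig) -> FilterAction:
    """Pick a filtering method from segment duration and preset settings."""
    if segment.likelihood < config.likelihood_threshold:
        return FilterAction.PASS
    if segment.duration_s <= config.short_segment_s:
        # Short offending portions are masked in place rather than excised.
        return FilterAction.GARBLE_AUDIO if segment.audio_only else FilterAction.BLUR_VIDEO
    if config.substitute_long_segments:
        return FilterAction.SUBSTITUTE
    return FilterAction.SKIP


config = FilterConfig()
profanity = Segment(duration_s=1.0, likelihood=0.9, audio_only=True)
long_scene = Segment(duration_s=45.0, likelihood=0.8, audio_only=False)
benign = Segment(duration_s=10.0, likelihood=0.1, audio_only=False)
actions = [choose_action(s, config) for s in (profanity, long_scene, benign)]
```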
Dimitrova discloses generating a masked video component and a masked audio component based on at least implementing the masking action policy at ¶¶ [0016]-[0017] (“Such blocking or masking may include simply skipping over the material determined to meet the user specified criteria, substituting an alternate ‘safe’ signal for the duration of the offending program segment, or masking portions of the video or audio content, e.g. blurring a naked body or garbling profanity... For example, in the visual content one can have expressions (e.g. facial and body), behaviors (e.g. shooting a gun, sexual activity, driving a vehicle), body attributes (e.g. skin tone or shape), violence (e.g. explosions, fighting), etc. In the audio domain one can have sound level (e.g. heavy sound with a lot of low frequency noise for an explosion), verbal expressions (e.g. profanity, slurs, slang, cursing, innuendo), ‘heavy breathing’ sounds (e.g. as might occur during a sex scene), etc.”), [0056] (“As mentioned above, the parental control system equipped or connected to a large multimedia memory would permit the controlling user to filter the original film in a myriad of ways to produce versions of the film suitable for several disparate groups in the user's household”).
Dimitrova discloses generating a masked video component and a masked audio component to produce versions of the film suitable for several disparate groups in the user's household at ¶¶ [0016]-[0017], [0056], but does not specify binding, using the file editing engine, the video component and the audio component to generate an audiovisual file. However, this limitation was known in the art as evidenced by the Ogawa reference discussed below.
The Ahmed reference
Ahmed discloses initiating a convolutional neural network (CNN) algorithm on a video component to identify one or more sensitive portions in one or more image frames at 12:38-53 (“automated image recognition methods, such as those employing convolutional neural networks, may be utilized to screen out one or more images in the entity content 525 that have inappropriate content”), 16:12-33 (“a not safe for work (NSFW) filter may be applied to classify the images into the first or second set of images. If the NSFW filter indicates the image is not safe for work, it may be included in the second set of images. Otherwise, the image may be included in the first set of images. In some aspects, the NSFW filter may include a convolutional neural network trained on a training set of images.”). At the time of filing of the present application, it would have been obvious to a person of ordinary skill in the art to use a convolutional neural network (CNN) to classify sensitive material, as taught by Ahmed, as a substitute for using a neural network as a machine learning (“visual analysis”) component to classify sensitive material, as taught by Dimitrova. This combination is a simple substitution of one known element for another to obtain predictable results. The prior art contained a method, taught by Dimitrova, which differed from the claimed method by the substitution of “convolutional neural network” for “neural network”. Convolutional neural networks and their functions were known in the art. One of ordinary skill in the art could have substituted a convolutional neural network into the method taught by Dimitrova and the results would have been predictable, as evidenced by Ahmed, which proves the suitability of convolutional neural networks for this purpose.
The Lyu reference
Lyu discloses initiating an audio word2vec algorithm on the audio component to identify one or more sensitive portions in the audio component at ¶¶ [0106]-[0107]. At the time of filing of the present application, it would have been obvious to a person of ordinary skill in the art to use an audio word2vec algorithm on the audio component to identify one or more sensitive portions in the audio component, as taught by Lyu, as a substitute for using a neural network as a machine learning (“audio analysis”) component to classify sensitive material, as taught by Dimitrova. This combination is a simple substitution of one known element for another to obtain predictable results. The prior art contained a method, taught by Dimitrova, which differed from the claimed method by the substitution of an audio word2vec algorithm for a “neural network”. One of ordinary skill in the art could have substituted an audio word2vec algorithm into the method taught by Dimitrova and the results would have been predictable, as evidenced by Lyu, which proves the suitability of audio word2vec algorithms for this purpose.
The Ogawa reference
Ogawa discloses binding, using a file editing engine, a video component and an audio component to generate an MPEG audiovisual file at ¶ [0172]. At the time of the filing of the present application, it would have been obvious to a person of ordinary skill in the art to bind video and audio components into an MPEG audiovisual file, as taught by Ogawa, after generating a masked video component and a masked audio component to produce versions of the film suitable for several disparate groups in the user's household. The motivation for doing so comes from the prior art wherein the benefits of the MPEG format were well known and include its universal acceptance in consumer media platforms. Therefore, it would have been obvious to combine Ogawa with Dimitrova to obtain the invention specified in this claim.
With regard to claim 12, Dimitrova discloses electronically receiving, from a computing device of a user, a request to view the audiovisual file, and transmitting control signals configured to cause the computing device of the user to display the masked audiovisual file at ¶¶ [0025], [0052]-[0053], [0056].
With regard to claim 13, the steps of the instructions stored in the computer readable medium of this claim are obvious over Dimitrova et al (U.S. PG Pub. No. 2002/0147782) in view of Ahmed et al (U.S. Patent No. 11,216,517), Lyu et al (U.S. PG Pub. No. 2021/0201934) and Ogawa et al (U.S. PG Pub. No. 2003/0165320) for the same reasons as were presented with respect to claim 1, which is an apparatus performing these same steps.
With regard to claim 20, the steps performed by the method of this claim are obvious over Dimitrova et al (U.S. PG Pub. No. 2002/0147782) in view of Ahmed et al (U.S. Patent No. 11,216,517), Lyu et al (U.S. PG Pub. No. 2021/0201934) and Ogawa et al (U.S. PG Pub. No. 2003/0165320) for the same reasons as were presented with respect to claim 1, which is an apparatus performing these same steps.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID F DUNPHY whose telephone number is 571-270-1230. The examiner can normally be reached 9 am - 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le, can be reached at 571-272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DAVID F DUNPHY/
Primary Examiner, Art Unit 2668