DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s remarks filed 24 October 2022 have been fully considered and are persuasive.  The previously cited references do not disclose the newly-claimed subject matter of identifying one or more portions of an audio stream of the media file at which a sound level is below a threshold volume level.  New grounds of rejection are presented below.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4-7, 12, and 14-18 are rejected under 35 U.S.C. 103 as being unpatentable over Polumbus et al., US 2011/0069230 A1 (hereinafter “Polumbus”), in view of Heckerman et al., US 6,260,011 B1 (hereinafter “Heckerman”).

As per claim 1, Polumbus teaches:
receiving from a user device an identification of the media file and the keyword (Polumbus ¶¶ 0008, 0016), where a user queries a media file and a keyword;
querying the index to determine that the first time code is mapped to the keyword in the first audio segment and to identify a range of bytes corresponding to the first time code (Polumbus ¶¶ 0036-0037), where the parameters contain the keyword, where the keyword is used to determine the time stamp, and where a portion comprises a range of bytes; and
transmitting the range of bytes to the user device (Polumbus ¶ 0037), where delivery is the transmission.

Polumbus, however, does not teach, while Heckerman teaches:
obtaining a media file (Heckerman Abstract), where an electronic file containing audio data is identified;
identifying one or more portions of an audio stream of the media file at which a sound level is below a threshold volume level (Heckerman Abstract), where modeling silence is the claimed identifying; 
dividing the audio stream of the media file into a plurality of audio segments, wherein a first audio segment in the plurality of audio segments is mapped to a first time code of the media file (Heckerman Abstract), where the audio stream is divided into words;
determining that a keyword is an output of speech recognition performed on the first audio (Heckerman Abstract), where a word is recognized using speech recognition; and
storing a mapping of the keyword to the first time code in an index (Heckerman Abstract), where the pointers are the mapping.

It would therefore have been obvious to one of ordinary skill in the art at the time of invention to combine the teachings of Heckerman with those of Polumbus to generate captions for video files without captions using the method of Heckerman in order to ensure that all videos are searchable by captions.
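For illustration only, and not as a characterization of either reference, the recited storing of a mapping of a keyword to a time code in an index may be sketched as follows. The sketch assumes that each audio segment has already been assigned a time code and a transcript produced by speech recognition; the function name build_keyword_index and the input layout are hypothetical and appear in neither Polumbus nor Heckerman.

```python
from collections import defaultdict


def build_keyword_index(segments, keywords):
    """Map each keyword to the time codes of the segments whose transcript contains it.

    segments: list of (time_code_seconds, transcript) pairs. The transcript
    stands in for the output of speech recognition (hypothetical input).
    keywords: iterable of words to index.
    """
    wanted = {k.lower() for k in keywords}
    index = defaultdict(list)
    for time_code, transcript in segments:
        for word in transcript.lower().split():
            # Strip surrounding punctuation so "patents." matches "patents".
            token = word.strip(".,;:!?\"'")
            if token in wanted and time_code not in index[token]:
                index[token].append(time_code)
    return dict(index)


segments = [
    (0.0, "Welcome to the show."),
    (4.5, "Tonight we discuss patents."),
    (9.0, "Patents and more patents."),
]
index = build_keyword_index(segments, ["patents", "show"])
```

Under these assumptions, the keyword "patents" would be mapped to the time codes 4.5 and 9.0, and the keyword "show" to the time code 0.0.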

As per claim 2, the rejection of claim 1 is incorporated, and Polumbus further teaches:
extracting the audio stream from the media file (Polumbus ¶ 0026), where audio is extracted.

As per claim 3, the rejection of claim 1 is incorporated, but Polumbus does not teach, while Heckerman teaches:
receiving a corpus of text (Heckerman 3:23-35), where the training data comprises a text corpus; and
training the one or more computing devices to identify the keyword using the corpus of text (Heckerman 3:23-35), where the training data is used to train the speech recognition operation to recognize words.

It would therefore have been obvious to one of ordinary skill in the art at the time of invention to combine the teachings of Heckerman with those of Polumbus to generate captions for video files without captions using the method of Heckerman in order to ensure that all videos are searchable by captions.

As per claim 4, the rejection of claim 1 is incorporated, and Polumbus further teaches:
wherein processing the first audio segment to identify a keyword further comprises:
generating speech recognition results associated with the first audio segment (Polumbus ¶ 0031), where translating the audio using language recognition algorithms is the generating;
parsing the speech recognition results (Polumbus ¶ 0031), where the matching is the parsing; and
determining that the speech recognition results include the keyword based on the parsing (Polumbus ¶ 0031), where the result of the matching is the determining.

As per claim 5, the rejection of claim 1 is incorporated, and Polumbus further teaches:
obtaining a second media file (Polumbus ¶ 0016), where a plurality of files are indexed;
identifying one or more second keywords (Polumbus ¶ 0031), where the matching is the parsing; and
storing a mapping of the identified one or more second keywords to the second time code in the index (Polumbus ¶ 0031), where an association between time stamp and caption text is stored.

As per claim 6, the rejection of claim 5 is incorporated, and Polumbus further teaches:
receiving from a second user device an indication of the second media file and a third keyword in the one or more second keywords (Polumbus ¶ 0040), where a particular scene in particular content is searched for;
querying the index to identify a second range of bytes corresponding to the third keyword (Polumbus ¶¶ 0036-0037), where the parameters contain the keyword, where the keyword is used to determine the time stamp, and where a portion comprises a range of bytes; and
transmitting the second range of bytes to the second user device (Polumbus ¶ 0037), where delivery is the transmission.
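For illustration only, the recited querying of the index to identify a range of bytes corresponding to a keyword may be sketched as follows. The sketch assumes a constant-bitrate media file, so that a time code converts linearly to a byte offset; that assumption, the function name byte_range_for_keyword, and all of its parameters are hypothetical and are not drawn from Polumbus.

```python
def byte_range_for_keyword(index, keyword, bytes_per_second, segment_seconds, header_bytes=0):
    """Return (start_byte, end_byte), inclusive, for the first time code mapped to keyword.

    index: dict mapping a lowercase keyword to a list of time codes in seconds.
    bytes_per_second: assumed constant data rate of the media file.
    segment_seconds: assumed length of the portion to deliver.
    Returns None when the keyword is not in the index.
    """
    time_codes = index.get(keyword.lower())
    if not time_codes:
        return None
    start_time = time_codes[0]
    start = header_bytes + int(start_time * bytes_per_second)
    end = header_bytes + int((start_time + segment_seconds) * bytes_per_second) - 1
    return start, end


index = {"patents": [4.5, 9.0]}
byte_range = byte_range_for_keyword(index, "Patents", 16000, 2.0, header_bytes=44)
```

Under these assumptions, only the identified range of bytes, rather than the whole media file, would need to be transmitted to the requesting user device.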

As per claim 7, Polumbus teaches:
a data store configured to store a media file (Polumbus ¶ 0015), where the files are stored;
wherein the index is queried to determine that the first time code is mapped to the keyword identified in the first audio segment in response to a request for the media file (Polumbus ¶¶ 0036-0037), where the parameters contain the keyword, where the keyword is used to determine the time stamp, and where a portion comprises a range of bytes.

Polumbus, however, does not teach, while Heckerman teaches:
obtain a media file (Heckerman Abstract), where an electronic file containing audio data is identified;
identify one or more portions of an audio stream of the media file at which a sound level is below a threshold volume level (Heckerman Abstract), where modeling silence is the claimed identifying; 
divide the audio stream of the media file into a plurality of audio segments using the identified one or more portions of the audio stream, wherein a first audio segment in the plurality of audio segments is mapped to a first time code of the media file (Heckerman Abstract), where the audio stream is divided into words;
determine that a keyword is an output of speech recognition performed on the first audio segment (Heckerman Abstract), where a word is recognized using speech recognition; and 
store a mapping of the keyword to the first time code in an index (Heckerman Abstract), where the pointers are the mapping.

It would therefore have been obvious to one of ordinary skill in the art at the time of invention to combine the teachings of Heckerman with those of Polumbus to generate captions for video files without captions using the method of Heckerman in order to ensure that all videos are searchable by captions.

As per claim 12, the rejection of claim 7 is incorporated, and Polumbus further teaches:
extract the audio stream from the media file (Polumbus ¶ 0026), where audio is extracted.

As per claim 13, the rejection of claim 7 is incorporated, but Polumbus does not teach while Heckerman teaches:
receive a corpus of text (Heckerman 3:23-35), where the training corpus is a text corpus; and
train the one or more computing devices to identify the keyword using the corpus of text (Heckerman 3:23-35), where the training data is used to train the speech recognition operation to recognize words.

It would therefore have been obvious to one of ordinary skill in the art at the time of invention to combine the teachings of Heckerman with those of Polumbus to generate captions for video files without captions using the method of Heckerman in order to ensure that all videos are searchable by captions.

As per claim 14, the rejection of claim 7 is incorporated, and Polumbus further teaches:
generate speech recognition results associated with the first audio segment (Polumbus ¶ 0031), where translating the audio using language recognition algorithms is the generating;
parse the speech recognition results (Polumbus ¶ 0031), where the matching is the parsing; and
determine that the speech recognition results include the keyword based on the parsing (Polumbus ¶ 0031), where the result of the matching is the determining.

As per claim 15, the rejection of claim 7 is incorporated, and Polumbus further teaches:
identify one or more second keywords (Polumbus ¶ 0031), where the captions are parsed in order to identify words in the captions that match sounds; and
store a mapping of the identified one or more second keywords to a second time code in the index (Polumbus ¶ 0031), where an association between time stamp and caption text is stored.

As per claim 16, the rejection of claim 15 is incorporated, and Polumbus further teaches:
receive from a user device an indication of the media file and a third keyword in the one or more second keywords (Polumbus ¶ 0040), where a particular scene in particular content is searched for;
query the index to identify a range of bytes corresponding to the third keyword (Polumbus ¶¶ 0036-0037), where the parameters contain the keyword, where the keyword is used to determine the time stamp, and where a portion comprises a range of bytes; and
transmit the range of bytes to the user device (Polumbus ¶ 0037), where delivery is the transmission.

As per claim 17, Polumbus teaches:
wherein the index is queried to determine that the first time code is mapped to the keyword identified in the first audio segment in response to a request for the media file (Polumbus ¶¶ 0036-0037), where the parameters contain the keyword, where the keyword is used to determine the time stamp, and where a portion comprises a range of bytes; and
transmitting at least a byte corresponding to the first time code to a user device in response to a request for a portion of the media file associated with the keyword (Polumbus ¶¶ 0036-0037), where the parameters contain the keyword, where the keyword is used to determine the time stamp, and where a portion comprises a range of bytes, where delivery is the claimed transmission.

Polumbus, however, does not teach, while Heckerman teaches:
obtaining a media file (Heckerman Abstract), where an electronic file containing audio data is identified;
identifying one or more portions of an audio stream of the media file at which a sound level is below a threshold volume level (Heckerman Abstract), where modeling silence is the claimed identifying; 
dividing the audio stream of the media file into a plurality of audio segments, wherein a first audio segment in the plurality of audio segments is mapped to a first time code of the media file (Heckerman Abstract), where the audio stream is divided into words;
determining that a keyword is an output of speech recognition performed on the first audio (Heckerman Abstract), where a word is recognized using speech recognition; and
storing a mapping of the keyword to the first time code in an index (Heckerman Abstract), where the pointers are the mapping.

It would therefore have been obvious to one of ordinary skill in the art at the time of invention to combine the teachings of Heckerman with those of Polumbus to generate captions for video files without captions using the method of Heckerman in order to ensure that all videos are searchable by captions.

As per claim 18, the rejection of claim 17 is incorporated, and Polumbus further teaches:
extracting the audio stream from the media file (Polumbus ¶ 0026), where audio is extracted.

Claims 8-11 are rejected under 35 U.S.C. 103 as being unpatentable over Polumbus et al., US 2011/0069230 A1 (hereinafter “Polumbus”), in view of Heckerman et al., US 6,260,011 B1 (hereinafter “Heckerman”), and further in view of Lin et al., US 2011/0191679 A1 (hereinafter “Lin”).

As per claim 8, the rejection of claim 7 is incorporated, but Polumbus does not teach:
wherein the data store is further configured to store a second media file, wherein the second media file is a lower resolution version of the media file.

The analogous and compatible art of Lin, however, teaches storing a lower resolution version of a media file in order to preview the media file (Lin ¶ 0022).

It would therefore have been obvious to one of ordinary skill in the art to modify the teachings of Polumbus to store a version of the media file at a lower resolution in order to preview the media file.

As per claim 9, the rejection of claim 8 is incorporated, but Polumbus does not teach:
a media retrieval server, wherein second instructions, when executed, cause the media retrieval server to:
receive a request for the second media file from a user device;
transmit the second media file to the user device;
receive a second request for a portion of the media file, wherein the second request comprises the keyword; and
querying the index to determine that the first time code is mapped to the keyword and to identify a range of bytes corresponding to the first time code.

The analogous and compatible art of Lin, however, teaches transmitting a preview stream at a lower resolution in response to a request (Lin ¶ 0022).

It would therefore have been obvious to one of ordinary skill in the art to combine the teachings of Lin with those of Polumbus so that a user may scan media using the lower resolution version of Lin to determine that a particular piece of media is the desired media, and may then search that media for a keyword whose mapped time code identifies a range of bytes, as in Polumbus (Polumbus ¶¶ 0036-0037), in order to find a particular location in the media based on the audio.

As per claim 10, the rejection of claim 9 is incorporated, and Polumbus further teaches:
the media retrieval server to transmit the range of bytes to the user device to satisfy the second request (Polumbus ¶¶ 0036-0037), where located media is played at the particular portion identified by the queried keyword.

As per claim 11, the rejection of claim 9 is incorporated, and Polumbus further teaches:
restore the range of bytes from archive storage (Polumbus ¶ 0036), where a media file is retrieved, thereby restoring the bytes; and
transmit the restored range of bytes to the user device to satisfy the second request (Polumbus ¶ 0037), where the portion is delivered.

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Polumbus et al., US 2011/0069230 A1 (hereinafter “Polumbus”), in view of Heckerman et al., US 6,260,011 B1 (hereinafter “Heckerman”), and further in view of Porier, US 10,109,279 B1 (hereinafter “Porier”).

As per claim 20, the rejection of claim 17 is incorporated, but Polumbus does not teach:
identifying a portion of the audio stream having a sound below a threshold volume level; and
dividing the audio stream into the plurality of audio segments using the identified portion of the audio stream as a division.

The analogous and compatible art of Porier, however, teaches dividing an audio stream into segments at a division indicated by a sound below a threshold volume level (Porier 7:11-29).

It would therefore have been obvious to one of ordinary skill in the art to combine the teachings of Porier with those of Polumbus by modifying the divisions of Polumbus with the divisions of Porier so as to better discriminate word sounds from each other.
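For illustration only, the recited identifying of a portion of the audio stream having a sound below a threshold volume level, and the dividing of the stream at that portion, may be sketched as follows. The sketch assumes the audio stream has already been reduced to one volume level per frame; the function names, the frame representation, and the threshold value are hypothetical and are not drawn from Porier or Polumbus.

```python
def find_quiet_spans(levels, threshold):
    """Return (start, end) frame-index pairs, end exclusive, where level < threshold."""
    spans = []
    start = None
    for i, level in enumerate(levels):
        if level < threshold:
            if start is None:
                start = i  # a quiet span begins here
        elif start is not None:
            spans.append((start, i))  # the quiet span ended at the previous frame
            start = None
    if start is not None:
        spans.append((start, len(levels)))  # the stream ends while still quiet
    return spans


def split_at_quiet_spans(levels, threshold):
    """Return (start, end) frame-index pairs for the segments between quiet spans."""
    segments = []
    cursor = 0
    for quiet_start, quiet_end in find_quiet_spans(levels, threshold):
        if quiet_start > cursor:
            segments.append((cursor, quiet_start))
        cursor = quiet_end
    if cursor < len(levels):
        segments.append((cursor, len(levels)))
    return segments


levels = [0.0, 0.6, 0.7, 0.05, 0.02, 0.8, 0.9, 0.01]
segments = split_at_quiet_spans(levels, threshold=0.1)
```

Under these assumptions, frames 1-2 and frames 5-6 would form two audio segments, with each quiet portion serving as a division between them. Multiplying a segment's starting frame index by the frame duration would give a time code for that segment.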

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-18 and 20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 2, 4, 1, 5, 6, 7, 8, 9, 10, 11, 12, 14, 12, 16, 17, 18, 2, and 15, respectively, of U.S. Patent No. 10,546,011. Although the claims at issue are not identical, they are not patentably distinct from each other because the instant claimed subject matter is broader than, but encompasses, the subject matter of the ’011 patent.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM SPIELER whose telephone number is (571)270-3883.  The examiner can normally be reached on Monday-Friday, 11-3.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mariela Reyes, can be reached at 571-270-1006.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


WILLIAM SPIELER
Primary Examiner
Art Unit 2159



/WILLIAM SPIELER/Primary Examiner, Art Unit 2159