DETAILED ACTION (RCE)
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 02/09/2022 has been entered.
This Non-Final office action is responsive to the amendment filed on 02/09/2022. Claims 1, 2, 5-14 and 17-20 are pending in the case. Claims 1 and 13 are the independent claims.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Claim Rejection:
Applicants’ 35 U.S.C. § 103 arguments have been fully considered but they are not persuasive. Applicants argue that the claims are allowable because Houh and Jannink fail to disclose or make obvious elements of claim 1 as amended, which recites, among other elements, "adding to the playlist, responsive to the another selection of the user interface element, data that identifies a second excerpt, from the second audio recording, that includes a mention of a second keyword of interest that is different from the first keyword of interest, the adding the data that identifies the second excerpt including: adding to the playlist an identifier for the second audio recording, wherein the second audio recording is different from the first audio recording; and adding to the playlist data to locate the second excerpt in the second audio recording, the data to locate the second excerpt in the second audio recording including an identifier for the second keyword of interest, wherein the playlist is searchable by keyword of interest."
Response: Examiner respectfully disagrees. One cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references.  See In re Keller, 642 F.2d 413, 208 U.S.P.Q. 871 (C.C.P.A. 1981); In re Merck & Co., 800 F.2d 1091, 231 U.S.P.Q. 375 (Fed. Cir. 1986).
Houh teaches a method for generating and obtaining a set of rules that define instructions for obtaining media content that comprises the content for a media channel, the set including at least one rule with instructions to include media content resulting from a search; searching for candidate media content according to a search query defined by the at least one rule; and merging one or more of the candidate media content resulting from the search into the content for the media channel. The candidate media content can include segments of the media content resulting from the search. See into Fig:9B and [0103]; the content segments 1022, 1024, 1032 and 1042 may include stories on a particular topic. Instead of having to listen to or view each audio/video podcast 1020, 1030 and 1040 which may include many topics, the media player accesses and presents only those segments of the podcasts corresponding to specific topics of user interest. Further see into [0117]; The preset channels 1510 a-1510 c can provide media streams customized to one or more specific topics selected by the content provider, while channel 1510 d can provide media streams customized to one or more specific topics requested by a user. Further see into [0119]; a second rule with instructions to conduct a media search on a first topic (e.g. “steroids”) and to add one or more of media segments resulting from that search; a third rule with instructions to conduct a media search on a second topic (e.g. “World Baseball Classic”) and to add one or more of media segments resulting from that search. Thus, a user can merge multiple media segments based on user-selected topics, and the search can use a keyword input by the user. See [0117]; FIG. 12 includes a graphical user interface 1500 including a media player 1520 and graphical icons (e.g., “buttons”) that represent preset media channels 1510 a, 1510 b, 1510 c and a user-defined channel 1510 d.
In this example, each of the channels offers access to a media stream generated from segments of audio/video content that are merged together into a single media file, a common group of media files or a play list of such files. The media stream can be presented using the media player 1520. The preset channels 1510 a-1510 c can provide media streams customized to one or more specific topics selected by the content provider, while channel 1510 d can provide media streams customized to one or more specific topics requested by a user.
Jannink teaches methods for organizing and analyzing audio content derived from media files. For example, Jannink provides that the server of the system may be configured to receive one or more key words that are submitted by a user of the system through the website, whereupon the server queries the database to identify all media files which include the one or more key words.
Thus, one of ordinary skill in the art would have been motivated to make such a combination to provide new and powerful ways of labeling a plurality of media files with various relevant attributes, and to identify and even suggest to a user of the system which key word may be relevant to the user. Therefore, Examiner has provided articulated reasoning with some rational underpinning to support the legal conclusion of obviousness.
Therefore, Examiner respectfully asserts that the cited art sufficiently teaches the limitations recited in independent claims 1 and 13. Accordingly, the 35 U.S.C. 103 rejection is maintained for independent claims 1 and 13. The foregoing applies to all independent claims and their dependent claims.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 5-9, 12-13 and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Houh; Henry et al. (US Pub. No. 20070106693 A1, hereinafter Houh), in view of Jannink; Jan et al. (US Pub. No. 20180260480 A1, hereinafter Jannink).

Regarding claim 1, an independent claim, Houh teaches a non-transitory machine-readable storage medium that provides instructions that, when executed by a processor, are capable of causing the processor to perform operations comprising:
creating a playlist of excerpts from audio recordings, the creating including (Houh: Fig:8 and [0094]; FIG. 8 is a flow diagram illustrating a computerized method for merging content segments for playback):
accepting, from the user, a selection that identifies a first keyword of interest (Houh: Fig:8 and [0095]; At step 915, the client 710, under the direction of a user, selects a number of the content segments to merge for playback by selecting the corresponding snippets, see into [0073]; At step 560, the snippet generator 440 compares the text of the words spoken during the selected content segment, if any, to the keyword(s) of the search query, further see into [0057]; a search result or “snippet” can be generated that enables a user to arbitrarily select, see into Fig:9B and [0102]; user selected segments 1022 and 1024 from media file 1020. Examiner considers a selection of a snippet that includes metadata satisfying a search query to be a selection that identifies a first keyword of interest);
accepting, from the user, a selection of a user interface element in the media player (Houh: Fig:8 and [0095]; a button or menu item is provided to enable the user to submit the metadata information identifying each of the selected content segments to the media merge module 900. After selecting a snippet, the user selects a button or menu item, which Examiner considers to be the user interface element);
adding to the playlist, responsive to the selection of the user interface element, data that identifies a first excerpt, from the first audio recording, that includes a mention of the first keyword of interest (Houh: Fig:8 and [0097]; At step 925, the media merge module 900 obtains the enhanced metadata corresponding to each of the underlying media files/streams containing the selected content segments, further see into [0091]; the merged media content is implemented as a playlist that identifies the content segments to be merged for playback), 
the adding the data that identifies the first excerpt including: adding to the playlist an identifier for the first audio recording; and adding to the playlist data to locate the first excerpt in the first audio recording, the data to locate the first excerpt in the first audio recording including an identifier for the first keyword of interest (Houh: Fig:8 and [0095]; a button or menu item is provided to enable the user to submit the metadata information identifying each of the selected content segments to the media merge module 900. Such metadata information includes, for example, the segment identifiers and the locations of the underlying media content (e.g. URL links or filenames), see into Fig:9B and [0102]; user selected segments 1022 and 1024 from a first media file 1020 are added to playlist 1000. According to [0079], raw metadata information can include any combination of the parameters including a segment identifier, the location of the underlying content (e.g., URL or filename), segment type, the text of the word or group of words spoken during that segment (if any), timing information (e.g., start offset, end offset, and/or duration) and a confidence score (if any). Further see into [0122]; if a rule specified a search for the topic “steroids,” the results of the media search can include a set of enhanced metadata documents for one or more candidate audio/video podcasts that include a reference to the keyword “steroids.” Thus, the identifier itself includes a search keyword input by the user);
Inventor(s): Vandit GARG, et al. Examiner: Pritisha N PARBADIA Application No.: 16/833,399 Art Unit: 2145
accepting, from the user, another selection of the user interface element in the media player; and adding to the playlist, responsive to the another selection of the user interface element, data that identifies a second excerpt, from the second audio recording, that includes a mention of a second keyword of interest that is different from the first keyword of interest, the adding the data that identifies the second excerpt including: adding to the playlist an identifier for the second audio recording (Houh: Fig:9A-B and [0102]; playlist 1000 provides an entry for each of the selected segments, namely segments 1022, 1024, 1032, and 1042 from each of the underlying media files/streams 1020, 1030 and 1040. Each entry includes a filename 1010 a, a segment identifier 1010 b, start offset 1010 c and end offset or duration 1010 d, Further see into [0104]; FIG. 9A at step 955, the media merge module 900 obtains the metadata information for the next content segment, namely a segment identifier, a start offset, and an end offset or duration (as determined at step 915 or 930) and continues at step 935 to repeat the process for adding the next content segment to the playlist, see into Fig:9B and [0102]; user selected segment 1032 from second media file 1030 is added to playlist 1000);
wherein the second audio recording is different from the first audio recording (Houh: [0119]; a second rule with instructions to conduct a media search on a first topic (e.g. “steroids”) and to add one or more of media segments resulting from that search; a third rule with instructions to conduct a media search on a second topic (e.g. “World Baseball Classic”) and to add one or more of media segments resulting from that search); and 
adding to the playlist data to locate the second excerpt in the second audio recording, the data to locate the second excerpt in the second audio recording including an identifier for the second keyword of interest (Houh: [0079]; raw metadata information can include any combination of the parameters including a segment identifier, the location of the underlying content (e.g., URL or filename), segment type, the text of the word or group of words spoken during that segment (if any), timing information (e.g., start offset, end offset, and/or duration) and a confidence score (if any). Further see into [0119]; a third rule with instructions to conduct a media search on a second topic (e.g. “World Baseball Classic”) and to add one or more of media segments resulting from that search. Thus, the identifier itself includes a search keyword input by the user),
wherein the playlist is searchable by keyword of interest (Houh: [0119]; a user-defined media channel, the channel selector 1310 can provide a user interface (not shown) for selecting the topics for the media search, specifying allocations of the resulting media segments for the channel content, and define the maximum duration of the channel content).
	
However, Houh does not explicitly teach accepting, from a user, a selection of a first audio recording for playback by a media player for playing audio; and accepting, from the user, a selection of a second audio recording for playback by the media player.
However, Jannink teaches accepting, from a user, a selection of a first audio recording for playback by a media player for playing audio (Jannink: Fig:5 and [0055]; selecting a media file within the search results 50 (Examiner considers the user's selection of the Result 1 media file within the search results to be the selection of the first audio recording), further see into [0040]; “media file(s)” refers to audio files, video files, voice recordings, streamed media content, and combinations of the foregoing, see into [0042]; media files that are stored within the server 2 and database 4 may be derived from audio-only content (e.g., a telephone conversation or talk radio), see into [0055]; FIG. 5, the invention provides that each media file that is selected and streamed to a user's device 12,14 may be graphically portrayed within the graphical user interface of the centralized website 8, which may further include a media player that allows a user to control the playback of the media file); accepting, from the user, a selection of a second audio recording for playback by the media player (Jannink: Fig:5 and [0055]; selecting a media file within the search results 50 (Examiner considers the user's selection of the Result 2 media file within the search results to be the selection of the second audio recording)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Houh to include the feature of Jannink. One would have been motivated to make such a combination to provide new and powerful ways of labeling a plurality of media files with various relevant attributes, and to identify and even suggest to a user of the system which key word may be relevant to the user.
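For illustration only, the playlist data recited in claim 1 (an identifier for the recording, locating data that includes an identifier for the keyword of interest, and searchability by keyword) can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from the application, Houh, or Jannink: the names PlaylistEntry, Playlist, add_excerpt and search_by_keyword, the file names, and the use of start/end offsets in seconds are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PlaylistEntry:
    """One excerpt in the playlist (hypothetical layout)."""
    recording_id: str      # identifier for the audio recording
    keyword: str           # identifier for the keyword of interest
    start_offset_s: float  # assumed locator: seconds from start of recording
    end_offset_s: float


@dataclass
class Playlist:
    entries: list = field(default_factory=list)

    def add_excerpt(self, entry: PlaylistEntry) -> None:
        # Invoked once per selection of the user interface element.
        self.entries.append(entry)

    def search_by_keyword(self, keyword: str) -> list:
        # Case-insensitive match on the stored keyword identifier.
        wanted = keyword.lower()
        return [e for e in self.entries if e.keyword.lower() == wanted]


playlist = Playlist()
playlist.add_excerpt(PlaylistEntry("call-001.wav", "refund", 12.0, 27.5))
playlist.add_excerpt(PlaylistEntry("call-002.wav", "warranty", 40.0, 58.0))
matches = playlist.search_by_keyword("Warranty")
```

In this sketch the two entries come from different recordings and carry different keywords, mirroring the first and second excerpts of the claim; whether a keyword stored per entry corresponds to the claimed "identifier for the keyword of interest" is a matter of claim construction that the sketch does not resolve.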

Regarding claim 5, which depends from claim 1, Houh and Jannink disclose the non-transitory machine-readable storage medium of claim 1.
However, Houh does not explicitly teach:
wherein the data to locate the first excerpt in the first audio recording further includes an indication of a position of the mention of the first keyword of interest in the first audio recording, and wherein the indication of the position is an index, as measured from a beginning of the first audio recording, of the mention of the first keyword of interest in mentions of the first keyword of interest in the first audio recording.
However, Jannink teaches:
wherein the data to locate the first excerpt in the first audio recording further includes an indication of a position of the mention of the first keyword of interest in the first audio recording (Jannink: Fig:4 and [0052]; the invention provides that the location of each search term that was queried may be indicated along the line 56. For example, the location of each search term may be indicated with a triangle 58, further see into Fig:13 and [0081]; the graphically portrayed line 56 that is correlated with a particular media file may exhibit multiple colors, with each color identifying a segment of the media file in which a different person is talking), and wherein the indication of the position is an index, as measured from a beginning of the first audio recording, of the mention of the first keyword of interest in mentions of the first keyword of interest in the first audio recording (Jannink: Fig:4 and [0052]; a set of search results will preferably be graphically portrayed, such as in the form of a line 56 that begins at time equals zero (t=0) and ends at a point when the media file is terminated. For example, if the total length of a media file is five minutes, the left side of the line will be correlated with t=0 of the media file, whereas the right side of the line will be correlated with t=5 minutes of the media file. Still further, the invention provides that the location of each search term that was queried may be indicated along the line 56. For example, the location of each search term may be indicated with a triangle 58, further see into [0081]; FIG. 13, a user may select 100 a particular speaker from several different speakers who contribute audio content to a single media file, whereupon the line 56 will graphically display the segments 102 of the media file in which the selected speaker 100 is talking, e.g., the segments 102 of the media file, illustrated in the line 56, will exhibit a color that is different than the other parts of the line 56).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Houh to include the feature of Jannink. One would have been motivated to make such a combination to provide new and powerful ways of labeling a plurality of media files with various relevant attributes, and to identify and even suggest to a user of the system which key word may be relevant to the user.

Regarding claim 6, which depends from claim 1, Houh and Jannink disclose the non-transitory machine-readable storage medium of claim 1.
However, Houh does not appear to expressly disclose:
wherein the data to locate the first excerpt in the first audio recording further includes an indication of a position of the mention of the first keyword of interest in the first audio recording, and wherein the indication of the position is an offset relative to a beginning of the first audio recording.
However, Jannink teaches:
wherein the data to locate the first excerpt in the first audio recording further includes an indication of a position of the mention of the first keyword of interest in the first audio recording, and wherein the indication of the position is an offset relative to a beginning of the first audio recording (Jannink: Fig:4 and [0052]; a set of search results will preferably be graphically portrayed, such as in the form of a line 56 that begins at time equals zero (t=0) and ends at a point when the media file is terminated. For example, if the total length of a media file is five minutes, the left side of the line will be correlated with t=0 of the media file, whereas the right side of the line will be correlated with t=5 minutes of the media file. Still further, the invention provides that the location of each search term that was queried may be indicated along the line 56. For example, the location of each search term may be indicated with a triangle 58, further see into [0081]; FIG. 13, a user may select 100 a particular speaker from several different speakers who contribute audio content to a single media file, whereupon the line 56 will graphically display the segments 102 of the media file in which the selected speaker 100 is talking, e.g., the segments 102 of the media file, illustrated in the line 56, will exhibit a color that is different than the other parts of the line 56).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Houh to include the feature of Jannink. One would have been motivated to make such a combination to provide new and powerful ways of labeling a plurality of media files with various relevant attributes, and to identify and even suggest to a user of the system which key word may be relevant to the user.
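For illustration only, the two kinds of position indication recited in claims 5 and 6 can be contrasted as below: an index of the mention among the mentions of the keyword (claim 5) versus an offset relative to the beginning of the recording (claim 6). This is a sketch under stated assumptions; the function mention_offsets, the word-level transcript format, and the 0-based indexing are hypothetical and do not come from the application or the cited references.

```python
def mention_offsets(words, keyword):
    """Return the start offset (seconds) of every mention of keyword.

    words is an assumed transcript format: (text, start_offset_s) pairs
    in playback order.
    """
    wanted = keyword.lower()
    return [start for text, start in words if text.lower() == wanted]


words = [("the", 0.0), ("refund", 1.2), ("was", 1.8), ("late", 2.1),
         ("refund", 9.4), ("please", 10.0)]
offsets = mention_offsets(words, "refund")

# Claim 5 style locator: ordinal index among the mentions (0-based here).
mention_index = 1
# Claim 6 style locator: offset relative to the beginning of the recording.
mention_offset = offsets[mention_index]
```

Here the second mention of the keyword is located either by its index (1) or by its offset (9.4 seconds); the two locators identify the same mention.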

Regarding claim 7, which depends from claim 6, Houh and Jannink disclose the non-transitory machine-readable storage medium of claim 6.
However, Houh does not appear to expressly disclose:
wherein the position corresponds to a starting position for the first excerpt, and wherein the starting position for the first excerpt is a predetermined period of time before the mention of the first keyword of interest.
However, Jannink teaches:
wherein the position corresponds to a starting position for the first excerpt, and wherein the starting position for the first excerpt is a predetermined period of time before the mention of the first keyword of interest (Jannink: Fig:11-12 and [0059]; the audio track (audio content) that is streamed to the device will preferably begin at the location of the key word within the media file (or at a position located a pre-defined period of time prior to the first usage of the key word in the media file), further see into [0060]; the audio content will represent an excerpted portion of the media file that begins at (or at a predefined period of time prior to) a location of the queried key word in the audio track (audio content). In other words, referring to FIGS. 11 and 12, if a user selects a specific media file (e.g., a talk radio file) within a set of media files 82 that comprise a set of search results, the server 2 will cause a portion of the corresponding audio content to be streamed to the user's device 12,14. The audio content may begin at the exact location at which a key word is found within the audio content for the selected media file or, alternatively, at a predefined period of time prior to the location of the key word. In certain embodiments, for example, the predefined period of time, e.g., 5, 10, 15, 20, or more seconds, may be specified and adjusted by a user within the centralized website 8).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Houh to include the feature of Jannink. One would have been motivated to make such a combination to provide new and powerful ways of labeling a plurality of media files with various relevant attributes, and to identify and even suggest to a user of the system which key word may be relevant to the user.
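For illustration only, a starting position that lies a predetermined period of time before the mention, as recited in claim 7 and as described in Jannink [0059]-[0060], can be computed as sketched below. The function name excerpt_start, the default lead-in of 10 seconds, and the clamping at the beginning of the recording are assumptions made for this sketch; Jannink gives 5, 10, 15, 20, or more seconds as examples of a user-adjustable period.

```python
def excerpt_start(mention_offset_s: float, lead_in_s: float = 10.0) -> float:
    """Start the excerpt lead_in_s seconds before the mention, clamped at 0."""
    if lead_in_s < 0:
        raise ValueError("lead_in_s must be non-negative")
    return max(0.0, mention_offset_s - lead_in_s)


start_a = excerpt_start(42.5)       # mention well into the recording
start_b = excerpt_start(4.0, 15.0)  # mention near the beginning
```

The clamp covers a mention that occurs earlier in the recording than the lead-in period, in which case the excerpt simply starts at the beginning of the recording.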

Regarding claim 8, which depends from claim 6, Houh and Jannink disclose the non-transitory machine-readable storage medium of claim 6.
However, Houh does not appear to expressly disclose:
wherein the position corresponds to a starting position for the first excerpt, and wherein the starting position for the first excerpt is such that the first excerpt includes a start of a sentence that includes the mention.
However, Jannink teaches:
wherein the position corresponds to a starting position for the first excerpt, and wherein the starting position for the first excerpt is such that the first excerpt includes a start of a sentence that includes the mention (Jannink: Fig:11-12 and [0059]; the audio track (audio content) that is streamed to the device will preferably begin at the location of the key word within the media file, further see into [0060]; the audio content will represent an excerpted portion of the media file that begins at (or at a predefined period of time prior to) a location of the queried key word in the audio track (audio content). In other words, referring to FIGS. 11 and 12, if a user selects a specific media file (e.g., a talk radio file) within a set of media files 82 that comprise a set of search results, the server 2 will cause a portion of the corresponding audio content to be streamed to the user's device 12,14. The audio content may begin at the exact location at which a key word is found within the audio content for the selected media file).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Houh to include the feature of Jannink. One would have been motivated to make such a combination to provide new and powerful ways of labeling a plurality of media files with various relevant attributes, and to identify and even suggest to a user of the system which key word may be relevant to the user.

Regarding claim 9, which depends from claim 1, Houh and Jannink disclose the non-transitory machine-readable storage medium of claim 1.
However, Houh does not appear to expressly disclose:
wherein the creating further includes: accepting, from the user, a selection of a caret in the media player from carets that indicate mentions of the first keyword of interest in the first audio recording, wherein the selected caret indicates the mention included in the first excerpt.
However, Jannink teaches:
wherein the creating further includes: accepting, from the user, a selection of a caret in the media player from carets that indicate mentions of the first keyword of interest in the first audio recording, wherein the selected caret indicates the mention included in the first excerpt (Jannink: Fig:4 and [0053]; when a user places a cursor (within the graphical user interface of the centralized website 8) over or in the near vicinity of a triangle 58 (or other element indicating the location of a search term) or a comment 60, the graphical user interface of the website 8 will automatically publish a temporary text box 62 in which the search term may be viewed, along with a limited number of words before and after the search term (i.e., the context in which the search term is used), which were transcribed by the system from the media file, further see into [0055]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Houh to include the feature of Jannink. One would have been motivated to make such a combination to provide new and powerful ways of labeling a plurality of media files with various relevant attributes, and to identify and even suggest to a user of the system which key word may be relevant to the user.

Regarding claim 12, which depends from claim 1, Houh and Jannink disclose the non-transitory machine-readable storage medium of claim 1,
wherein the operations further comprise: performing, for the user, a search for audio recordings that include one or more mentions of the keyword of interest, wherein the selection of the first audio recording for playback is from results of the search (Houh: Fig:8 and [0094]; At step 910, the search engine 720 conducts a keyword search of the index 730 for metadata enhanced for audio/video search that satisfies a search query. Subsequently, the search engine 720, or alternatively the snippet generator 740 itself, downloads a set of metadata information or instructions to enable presentation of a set of search snippets at the client 710, further see into [0095]; At step 915, the client 710, under the direction of a user, selects a number of the content segments to merge for playback by selecting the corresponding snippets).

Claim 13 is similar in scope to claim 1 and is rejected similarly.
Claim 17 is similar in scope to claim 5 and is rejected similarly.
Claim 18 is similar in scope to claim 6 and is rejected similarly.
Claim 19 is similar in scope to claim 9 and is rejected similarly.


Claims 2 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Houh in view of Jannink, as applied above to claim 1, and further in view of Yousef Shemisa et al. (US Pub. No. 20060198504 A1; hereinafter Shemisa).
Regarding claim 2, which depends from claim 1, Houh and Jannink disclose the non-transitory machine-readable storage medium of claim 1.
However, Houh and Jannink do not appear to expressly disclose:
wherein the first audio recording and the second audio recording are each of a conversation between an agent of a call center and a caller.
However, Shemisa teaches:
wherein the first audio recording and the second audio recording are each of a conversation between an agent of a call center and a caller (Shemisa: Fig:30 and [0188]; FIG. 30 is a screen shot showing an example of a results list of call records returned for playback and review in the viewer according to the invention).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method for generating and using enhanced metadata in search-driven applications as taught by Houh and Jannink with the digital voice recording platform used for live monitoring of calls, voice logging, incident reconstruction, compliance and liability reduction as taught by Shemisa. One of ordinary skill in the art would have been motivated to make such a combination to provide access to a customizable variety of call recording options and modalities, a fast and effective search, and full archiving capabilities and management (Shemisa: [0045]).

Claim 14 is similar in scope to claim 2 and is rejected similarly.


Claims 10-11 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Houh in view of Jannink, as applied above to claim 1, and further in view of Mertens; Timo et al. (US Pub. No. 20190259387 A1; hereinafter Mertens).
Regarding claim 10, which depends from claim 1, Houh and Jannink disclose the non-transitory machine-readable storage medium of claim 1.
However, Houh and Jannink do not appear to expressly disclose:
wherein the operations further comprise: creating a transcript for the playlist, the creating the transcript including: adding, to the transcript for the playlist, a first portion of a transcript for the first audio recording that corresponds to the first excerpt of the first audio recording, and adding, to the transcript for the playlist, a second portion of a transcript for the second audio recording that corresponds to the second excerpt of the second audio recording.
However, Mertens teaches:
wherein the operations further comprise: creating a transcript for the playlist, the creating the transcript including: adding, to the transcript for the playlist, a first portion of a transcript for the first audio recording that corresponds to the first excerpt of the first audio recording, and adding, to the transcript for the playlist, a second portion of a transcript for the second audio recording that corresponds to the second excerpt of the second audio recording (Mertens: Fig:7 and [0155]; search query 715, the collaborative content management system 130 can perform a full document search (i.e., a search of the text transcript, of text included elsewhere within the collaboration document, and the like). The collaborative content management system 130 can return search results, some of which include portions of the text transcript that correspond to portions of the audio data, further see into [0156]; FIG. 7, two search results 725 are displayed: "RYAN: What flavor do you eat first in Neapolitan ice cream?" and "TOREY: I don't know, I don't eat Neapolitan." In some embodiments, each search result 725 is formatted to include a link to a location of the search result within the text transcript included within the document. In other embodiments, portions of the search results 725 including text of the search query 715 may be formatted (e.g., by bolding, highlighting, or underlining the result) to emphasize the search query text within each search result. For example, in the example of FIG. 7, the search term "Neapolitan" is bolded within each search result).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine a method for generating and using enhanced metadata in search-driven applications as taught in Houh and Jannink with a method to access audio data associated with a collaborative document as taught by Mertens. One of ordinary skill in the art would have been motivated to make such a combination to provide systems to capture and manipulate audio data within or in conjunction with collaborative documents to make professional collaborations, such as meetings, more efficient for users.

Regarding claim 11, which depends on claim 10, Houh and Jannink disclose the non-transitory machine-readable storage medium of claim 10.
However, Houh and Jannink do not appear to expressly disclose:
wherein the first portion of the transcript for the first audio recording includes a sentence that includes the mention of the first keyword of interest.
However, Mertens teaches:
wherein the first portion of the transcript for the first audio recording includes a sentence that includes the mention of the first keyword of interest (Mertens: Fig. 7 and [0156]; portions of the search results 725 including text of the search query 715 may be formatted (e.g., by bolding, highlighting, or underlining the result) to emphasize the search query text within each search result. For example, in the example of FIG. 7, the search term "Neapolitan" is bolded within each search result).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine a method for generating and using enhanced metadata in search-driven applications as taught in Houh and Jannink with a method to access audio data associated with a collaborative document as taught by Mertens. One of ordinary skill in the art would have been motivated to make such a combination to provide systems to capture and manipulate audio data within or in conjunction with collaborative documents to make professional collaborations, such as meetings, more efficient for users.

Claim 20 is similar in scope to claim 10 and is rejected similarly.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
A). US 20160283796 A1: Systems, methods, and non-transitory computer-readable media can identify a set of video segments that represents a video.
B). US 20150312652 A1: A system and method are disclosed for automatically generating a highlight reel of video content. Segments from a segment list may be associated with, or indexed to, corresponding sequences from a video of an event for which the segment list is prepared.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PRITISHA N PARBADIA whose telephone number is (571) 270-0683. The examiner can normally be reached Monday 9am-5pm and Friday 9am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Adam Queler, can be reached at (571) 272-4140. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ADAM M QUELER/
Supervisory Patent Examiner, Art Unit 2145