Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This is in response to Applicant’s Remarks filed on 02/08/2021 regarding application 16/550,776, which was filed on 08/26/2019.
Claims 1-19 are currently pending for consideration.

Response to Amendment and Remark
	The applicant's amendments and remarks have been fully and carefully considered, with Examiner's response set forth below.
Applicant remarks that the “35 U.S.C. § 103 Rejections” with “Independent Claims 1,10, and 19… creating a log file associated with the digital audio stream; and…
recording each text string and its associated unique timestamp to the log file…
Applicant respectfully disagrees. The digest file described in Church is an audio or video file, not a text-based log file described in the present application…”
The examiner respectfully disagrees and asserts that Church discloses “capture real-time events (i.e. unique timestamp) occurring within a recording (i.e. audio stream) capture area of the recording device, receiving a set of content-related parameter, applying automated content identifier to identify and label (i.e. text string) portions (e.g., time interval) of the recording with a content type… to assign a priority to some portions of the recording… to generate a sub-sample of clusters to be included in the digest file that has runtime (i.e. unique timestamp)… convert the recorded speech into text, e.g., a transcript, (i.e. text string)… use converted text, to identify portions of the recording to be excluded from or included into a digest file in step 402-408 of 
The examiner notes that paragraph [0015] of the specification recites “accepts as its input a digital audio stream and a set of one or more time intervals in the audio stream for which the speech therein shall be transcribed as text data...using the predefined interval… such as every 5 seconds, which minimizes the number of timestamps and thus the amount of timestamp data recorded in the log file” and further recites “provides as its output a log file containing the transcribed text along with one or more timestamps that link the transcribed text with its corresponding position in the audio stream”. Accordingly, based on the specification and the logical sequence of the claim limitations, the log file is a consolidated audio file to which the transcribed text and its timestamps are appended.
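For illustration, the input/output relationship quoted from [0015] can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's or Church's implementation: the names `LogEntry`, `build_log`, and `to_json`, the JSON serialization, and the `transcribe` callable (a stand-in for any speech-to-text step) are all hypothetical. Only the 5-second predefined interval is taken from the quoted passage.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, List, Tuple


@dataclass
class LogEntry:
    timestamp_s: float  # position in the audio stream, in seconds
    text: str           # transcribed text for the span starting at timestamp_s


def build_log(
    intervals: List[Tuple[float, float]],
    transcribe: Callable[[float, float], str],  # hypothetical speech-to-text stand-in
    step_s: float = 5.0,                        # predefined interval, e.g. every 5 seconds
) -> List[LogEntry]:
    """Walk each requested time interval in steps of step_s and record one
    (timestamp, text) entry per step."""
    entries: List[LogEntry] = []
    for start, end in intervals:
        t = start
        while t < end:
            seg_end = min(t + step_s, end)
            entries.append(LogEntry(t, transcribe(t, seg_end)))
            t = seg_end
    return entries


def to_json(entries: List[LogEntry]) -> str:
    """Serialize the log as text; JSON is an assumed format chosen for the sketch."""
    return json.dumps([asdict(e) for e in entries])
```

In this sketch each timestamp links a text string to its position in the audio stream, and a larger `step_s` yields fewer timestamps in the log.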
	
Applicant remarks that the “35 U.S.C. § 103 Rejections” with “Dependent Claims 2-9, and 11-18… Claims 2-9 depend on independent claim 1 and include all the limitations of claim 1, as well as additional limitations. Claims 11-18 depend on independent claim 10 and include all the limitations of claim 10, as well as additional limitations. Accordingly, dependent claims 2-9 and 11-18 are patentable for at least the same reasons set forth above with respect to independent claims 1 and 10…”

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-19 are rejected under 35 U.S.C. 103 as being unpatentable over Church et al. (US 20190370283 A1, “Church”) in view of Pokharel et al. (US 20200126583 A1, “Pokharel”).
As to claim 1, Church discloses A method for providing audio highlighter functionality comprising the steps of: (Church discloses [0082] a user is with a complete or partial transcript generated by a speech-to-text API, that highlights emotional speech, selected keywords, or any parameters) 
receiving a digital audio stream synchronously from a digital audio playback application; (Church discloses [0094, 0033] networked recording devices and playback devices are strategically positioned in an area such as a living space or office space to capture the speech (i.e. digital audio stream) and other real-time events (i.e. synchronously)… receiving a recording (e.g., Audio and/or video file) that comprises recorded speech (i.e. digital audio stream)).
starting a timer that measures a current playback position in the digital audio stream; (Church discloses [0032] starting a runtime that is measured in units of minutes… corresponds to the target runtime (i.e. position) of the recorded speech (i.e. digital audio stream) playback).
creating a log file associated with the digital audio stream; and (Church discloses [0054] derived from time-domain waveforms or by examining frequency-domain spectrograms that are derived from such waveforms of an audio or video file comprises recorded speech (i.e. 
transcribing the digital audio stream to text; (Church discloses [0009] The speech-to-text engine creates a transcript (i.e. text) of the recording (i.e. digital audio stream)).
	recording each text string and its associated unique timestamp to the log file. (Church discloses [0006, 0054] capture real-time events (i.e. unique timestamp) occurring within a recording capture area of the recording device, receiving a set of content-related parameter, applying automated content identifier to identify and label (i.e. text string) portions of the recording with a content type… to assign a priority to some portions of the recording… to generate a sub-sample of clusters to be included in the digest file that has runtime (i.e. unique timestamp)… convert the recorded speech into text, e.g., a transcript, (i.e. text string)… use converted text, to identify portions of the recording to be excluded from or included into a digest file in FIG. 4).
However, Church may not explicitly disclose all aspects of the limitation wherein the step of transcribing the digital audio stream to text comprises the substeps of: 
dividing the digital audio stream into a plurality of digital audio chunks;
associating a unique timestamp with each digital audio chunk;
converting each digital audio chunk into a corresponding text string;
associating the unique timestamp of each digital audio chunk to the corresponding text string; and
Pokharel discloses wherein the step of transcribing the digital audio stream to text comprises the substeps of: 
dividing the digital audio stream into a plurality of digital audio chunks; (Pokharel discloses [0063] the master audio track is divided into a plurality of audio segments (i.e. chunks) that are sent to a transcription service).
associating a unique timestamp with each digital audio chunk; (Pokharel discloses [0082] linear interpolation is added throughout an entire media chunk to realign (i.e. associating) the timing (i.e. unique timestamp) data).
Pokharel discloses converting each digital audio chunk into a corresponding text string; (Pokharel discloses [0063] the master audio track is divided into a plurality of audio segments (i.e. chunks) that are sent to a transcription service to be transcribed (i.e. converted into text string)).
Pokharel discloses associating the unique timestamp of each digital audio chunk to the corresponding text string; and (Pokharel discloses [0065] The identified times (i.e. unique timestamp) then are used to align a portion of the transcript (i.e. text string) with a corresponding portion (i.e. chunk) of the master audio track).
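The four substeps mapped above (dividing, timestamping, converting, and associating) can be sketched generically as follows. This is an illustrative sketch only and does not reproduce Pokharel's or Church's implementation: the function names, the raw PCM byte representation, and the `speech_to_text` callable are assumptions introduced for the example.

```python
from typing import Callable, List, Tuple


def chunk_stream(
    samples: bytes,
    sample_rate: int,
    bytes_per_sample: int,
    chunk_s: float,
) -> List[Tuple[float, bytes]]:
    """Divide raw mono PCM audio into fixed-length chunks and pair each chunk
    with a unique timestamp (its start offset in seconds)."""
    chunk_bytes = int(chunk_s * sample_rate) * bytes_per_sample
    if chunk_bytes <= 0:
        raise ValueError("chunk_s is too small for the given sample rate")
    bytes_per_second = sample_rate * bytes_per_sample
    chunks: List[Tuple[float, bytes]] = []
    for offset in range(0, len(samples), chunk_bytes):
        timestamp_s = offset / bytes_per_second
        chunks.append((timestamp_s, samples[offset:offset + chunk_bytes]))
    return chunks


def transcribe_chunks(
    chunks: List[Tuple[float, bytes]],
    speech_to_text: Callable[[bytes], str],  # hypothetical converter
) -> List[Tuple[float, str]]:
    """Convert each chunk to a text string and carry the chunk's timestamp
    over to that string."""
    return [(timestamp_s, speech_to_text(data)) for timestamp_s, data in chunks]
```

Because each timestamp is derived from the chunk's byte offset, no two chunks of the same stream share a timestamp.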
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that Church and Pokharel both disclose marking highlights in audio with transcription and are therefore analogous art from the “same field of endeavor”, and that, when Pokharel’s discovering of highlights in transcribed, divided audio segments is combined with Church's summarizing of recorded audio content with transcription, the claimed limitation wherein the step of transcribing the digital audio stream to text comprises the substeps of: 
dividing the digital audio stream into a plurality of digital audio chunks;
associating a unique timestamp with each digital audio chunk;
converting each digital audio chunk into a corresponding text string;
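associating the unique timestamp of each digital audio chunk to the corresponding text string would have been obvious. The motivation to combine Church and Pokharel is to effectively synchronize the time-coded written transcript with an audio or video file. (See Pokharel [0011]).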
As to claim 2, Church in view of Pokharel discloses The method of claim 1 wherein the step of transcribing the digital audio stream to text is started in response to user input. (Pokharel discloses [0119] a closed captioning (i.e. transcribing) button for turning on (i.e. start) or off the appearance of closed captioning text in the playback pane… which is in response to the user input).
The examiner notes that paragraphs [0029-0030] of the specification recite “the application provides a user interface for the user to control the start and stop of the transcription ‘on demand’ during playback of the audio stream so that the user may choose to transcribe selected portions of the audio stream… transcription control button 205 is toggleable between ‘on’ and ‘off’ states. The start signal is generated in response to the user pressing and releasing transcription control button 205, and the stop signal is generated in response to the user pressing and releasing transcription control button 205 a second time”. Pokharel’s closed captioning button is likewise toggleable between on and off, allowing the user to start or stop the transcribing on demand during playback of the media.
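The toggle behavior quoted from [0029-0030] can be sketched as a two-state control. This is a minimal, hypothetical sketch: the class name `TranscriptionToggle`, the `press` method, and the recording of (start, stop) playback positions are assumptions made for illustration and are not taken from the specification or from Pokharel.

```python
from typing import List, Optional, Tuple


class TranscriptionToggle:
    """Two-state control: the first press starts transcription and the
    second press stops it."""

    def __init__(self) -> None:
        self.active: bool = False
        self.intervals: List[Tuple[float, float]] = []  # completed (start, stop) spans
        self._start: Optional[float] = None

    def press(self, position_s: float) -> str:
        """Handle one press-and-release at the current playback position.
        Returns the signal generated: "start" or "stop"."""
        if not self.active:
            self.active = True
            self._start = position_s
            return "start"
        self.active = False
        assert self._start is not None
        self.intervals.append((self._start, position_s))
        self._start = None
        return "stop"
```

Each completed start/stop pair in `intervals` corresponds to one user-selected portion of the audio stream.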
As to claim 3, Church in view of Pokharel discloses The method of claim 2 wherein the step of transcribing the digital audio stream to text is stopped in response to user input. (Pokharel discloses [0119] a closed captioning (i.e. transcribing) button for turning on or off (i.e. stopped) the appearance of closed captioning text in the playback pane… which is in response to the user input).
As to claim 4, Church in view of Pokharel discloses The method of claim 1 further comprising the step of providing a first user interface to display a graphical timeline representation of the digital audio stream, (Pokharel discloses [Claim 11] displays the respective media source in the media player in the first pane of the first interface time-aligned (i.e. timeline) with the respective text string excerpt in the respective synchronized transcript).
wherein the graphical timeline representation comprises at least one highlight mark indicating a position in the digital audio stream of the unique timestamp associated with the corresponding text string. (Church discloses [0090, 0011, 0053] providing user to select from a user-interface… Labeling portions of the recording (i.e. digital audio stream)... tagging a location (i.e. position) in a portion of the recording with a marker (i.e. highlight mark) that identifies a location (i.e. position with the unique timestamp) in the portion of the recording as substantive content… with a speech-to-text converter generated transcript (i.e. text string)).
As to claim 5, Church in view of Pokharel discloses The method of claim 4 further comprising the step of starting playback of the digital audio stream from one of the unique timestamps in response to user selection of the corresponding highlight mark in the first user interface. (Pokharel discloses [0108] The playback control allows the user to playback the new highlight in the media source highlighting interface. In response to user selection of the playback control, the media player plays the portion of the video or audio media file corresponding to the transcript text (i.e. text string) of the new highlight and, at the same timeline (i.e. timestamp)).
As to claim 6, Church in view of Pokharel discloses The method of claim 1 further comprising the step of providing a second user interface to display the at least one text string and its associated unique timestamp. (Pokharel discloses [0106] The media source highlighting interface includes a media player pane for playing video content of the selected media source and a transcript pane for displaying the corresponding synchronized transcript of the selected media source… display the transcript, the current playback time (i.e. unique timestamp) in the media source).
As to claim 7, Church in view of Pokharel discloses The method of claim 6 further comprising the step of starting playback of the digital audio stream from one of the unique timestamps in response to user selection of the corresponding text string in the second user interface. (Pokharel discloses [0107-0108] The media source highlighting interface (i.e. second user interface) enable a user to select source media to playback… in response to user’s selection of the text as a highlight …The playback control allows the user to playback the new 
As to claim 8, Church in view of Pokharel discloses The method of claim 1 wherein the step of converting each digital audio chunk into a corresponding text string comprises the substeps of: 
sending the digital audio chunk to a speech-to-text converter; (Church discloses [0053] the filtered and selected sections (i.e. chunks) are served as input to a speech-to-text converter).
transcribing the digital audio chunk into its corresponding text string with the speech-to-text converter; and (Church discloses [0053] a speech-to-text converter is used to generate a transcript from the filtered and selected sections (i.e. chunks) of an audio or video recording... converts the speech to text).
 	receiving the text string from the speech-to-text converter. (Church discloses [0006, 0053] supplying the digest file to a recipient comprising sample clusters of speech content… with the speech-to-text converter generated transcript).
As to claim 9, Church in view of Pokharel discloses The method of claim 8 wherein the speech-to-text converter is located on a server computer system, (Church discloses [0040] a processing system that is located remotely from the recording device (e. g., cloud computing system 708 and/or virtual family server 710 in FIG. 7)… comprises a coding and speech recognition application… a speech-to-text converter).
wherein the step of transcribing the digital audio chunk into its corresponding text string with the speech-to-text converter is performed by the server computer system, and (Church discloses [0053-0054] a speech-to-text converter is performed by the cloud computing 
wherein the remaining method steps are performed by a mobile device. (Church discloses [0087] an information handling system may be a mobile device (e.g., personal digital assistant or smart phone)… record audio/video to capture content are relevant and playback).
Regarding claims 10-18, these claims recite a system that performs the method of claims 1-9, respectively; therefore, the same rationale of rejection is applicable.

As to claim 19, Church discloses A method for providing audio highlighter functionality comprising the steps of: (Church discloses [0082] a user is with a complete or partial transcript generated by a speech-to-text API, that highlights emotional speech, selected keywords, or any parameters)
receiving a digital audio stream synchronously from a digital audio playback application; (Church discloses [0094, 0033] networked recording devices and playback devices are strategically positioned in an area such as a living space or office space to capture the speech (i.e. digital audio stream) and other real-time events (i.e. synchronously)… receiving a recording (e.g., Audio and/or video file) that comprises recorded speech (i.e. digital audio stream)).
starting a timer that measures a current playback position in the digital audio stream; (Church discloses [0032] starting a runtime that is measured in units of minutes… corresponds to the target runtime (i.e. position) of the recorded speech (i.e. digital audio stream) playback).
creating a log file associated with the digital audio stream; (Church discloses [0033] generate a digest file (i.e. log file) that has a reduced content and start a runtime that matches the target runtime of the recorded speech (i.e. digital audio stream)).
wherein the graphical timeline representation comprises at least one highlight mark indicating a position in the digital audio stream of the unique timestamp associated with the corresponding text string; (Church discloses [0090, 0011, 0053] providing user to select from a user-interface… Labeling portions of the recording (i.e. digital audio stream)... tagging a location (i.e. position) in a portion of the recording with a marker (i.e. highlight mark) that identifies a location (i.e. position with the unique timestamp) in the portion of the recording as substantive content… with a speech-to-text converter generated transcript (i.e. text string)).
recording each text string and its associated unique timestamp to the log file; and (Church discloses [0006] generating a digest file with the transcript of the recording... Using priorities comprise sampling clusters (i.e. chunks) of speech content to generate a sub-sample of clusters to be included in the digest file… associated with events associated with a speaker, a location… grouping the labeled (i.e. text string) portions of the recording into clusters of speech).
wherein the step of converting each digital audio chunk into a corresponding text string comprises the substeps of: 
sending the digital audio chunk to a speech-to-text converter located on a server computer system; (Church discloses [0040] a processing system that is located remotely from the recording device (e. g., cloud computing system 708 and/or virtual family server 710 in FIG. 7)… comprises a coding and speech recognition application… a speech-to-text converter).
transcribing the digital audio chunk into its corresponding text string with the speech-to-text converter on the server computer system; and (Church discloses [0053-0054] a speech-to-text converter is performed by the cloud computing system 708 and/or virtual family server… analyze a speech to identify portions to be included and convert to text e.g., to generate a transcript).
receiving the text string from the speech-to-text converter. (Church discloses [0006, 0053] supplying the digest file to a recipient comprising sample clusters of speech content… with the speech-to-text converter generated transcript).

However, Church may not explicitly disclose all aspects of the limitations of transcribing the digital audio stream to text in response to user input;
providing a first user interface to display a graphical timeline representation of the digital audio stream,
starting playback of the digital audio stream from one of the unique timestamps in response to user selection of the corresponding highlight mark in the first user interface;
providing a second user interface to display the at least one text string and its associated unique timestamp; and
starting playback of the digital audio stream from one of the unique timestamps in response to user selection of the corresponding text string in the second user interface;
wherein the step of transcribing the digital audio stream to text comprises the substeps of: 
dividing the digital audio stream into a plurality of digital audio chunks; 
associating a unique timestamp with each digital audio chunk;
converting each digital audio chunk into a corresponding text string;
associating the unique timestamp of each digital audio chunk to the corresponding text string; and

Pokharel discloses transcribing the digital audio stream to text in response to user input; (Pokharel discloses [0119] a closed captioning (i.e. transcribing) button for turning on (i.e. start) or off the appearance of closed captioning text in the playback pane… which is in response to the user input).
providing a first user interface to display a graphical timeline representation of the digital audio stream, (Pokharel discloses [Claim 11] displays the respective media source in the media player in the first pane of the first interface time-aligned (i.e. timeline) with the respective text string excerpt in the respective synchronized transcript).
Pokharel discloses starting playback of the digital audio stream from one of the unique timestamps in response to user selection of the corresponding highlight mark in the first user interface; (Pokharel discloses [0108] The playback control allows the user to playback the new highlight in the media source highlighting interface. In response to user selection of the playback control, the media player plays the portion of the video or audio media file corresponding to the transcript text (i.e. text string) of the new highlight and, at the same timeline (i.e. timestamp)).
Pokharel discloses providing a second user interface to display the at least one text string and its associated unique timestamp; and (Pokharel discloses [0106] The media source highlighting interface includes a media player pane for playing video content of the selected media source and a transcript pane for displaying the corresponding synchronized transcript of the selected media source… display the transcript, the current playback time (i.e. unique timestamp) in the media source).
Pokharel discloses starting playback of the digital audio stream from one of the unique timestamps in response to user selection of the corresponding text string in the second user interface; (Pokharel discloses [0107-0108] The media source highlighting interface (i.e. second user interface) enable a user to select source media to playback… in response to user’s selection of the text as a highlight …The playback control allows the user to playback the new highlight in the media source highlighting interface. In response to user selection of the playback control, the media player plays the portion of the video or audio media file corresponding to the transcript text (i.e. text string) of the new highlight and, at the same timeline (i.e. timestamp)).
wherein the step of transcribing the digital audio stream to text comprises the substeps of: 
Pokharel discloses dividing the digital audio stream into a plurality of digital audio chunks; (Pokharel discloses [0063] the master audio track is divided into a plurality of audio segments (i.e. chunks) that are sent to a transcription service).
Pokharel discloses associating a unique timestamp with each digital audio chunk; (Pokharel discloses [0063] the master audio track is divided into a plurality of audio segments (i.e. chunks) that are sent to a transcription service).
Pokharel discloses converting each digital audio chunk into a corresponding text string; (Pokharel discloses [0063] the master audio track is divided into a plurality of audio segments (i.e. chunks) that are sent to a transcription service to be transcribed (i.e. converted into text string)).
Pokharel discloses associating the unique timestamp of each digital audio chunk to the corresponding text string; and (Pokharel discloses [0065] The identified times (i.e. unique timestamp) then are used to align a portion of the transcript (i.e. text string) with a corresponding portion (i.e. chunk) of the master audio track).
Thus, one of ordinary skill in the art before the effective filing date of the claimed invention would have recognized that Church and Pokharel both disclose marking highlights in audio with transcription and are therefore analogous art from the “same field of endeavor”, and that, when Pokharel’s discovering of highlights in transcribed, divided audio segments is combined with Church's summarizing of recorded audio content with transcription, the claimed limitations of transcribing the digital audio stream to text in response to user input;
providing a first user interface to display a graphical timeline representation of the digital audio stream,
starting playback of the digital audio stream from one of the unique timestamps in response to user selection of the corresponding highlight mark in the first user interface;
providing a second user interface to display the at least one text string and its associated unique timestamp; and
starting playback of the digital audio stream from one of the unique timestamps in response to user selection of the corresponding text string in the second user interface;
wherein the step of transcribing the digital audio stream to text comprises the substeps of: 
dividing the digital audio stream into a plurality of digital audio chunks; 
associating a unique timestamp with each digital audio chunk;
converting each digital audio chunk into a corresponding text string;
associating the unique timestamp of each digital audio chunk to the corresponding text string would have been obvious. The motivation to combine Church and Pokharel is to effectively synchronize the time-coded written transcript with an audio or video file. (See Pokharel [0011]).

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kavita Padmanabhan, can be reached at 571-272-8352.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/JENQ-KANG CHU/Examiner, Art Unit 2176               

/KAVITA STANLEY/Supervisory Patent Examiner, Art Unit 2176