DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-15 are pending in this application.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 02/27/2020 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-15 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding claim 1, the limitation(s) of “receive a first command”, “determine a start position”, “output…an audio”, “receive a second command”, “record…the comment”, “receive a third command”, “modify at least one format”, “receive a fourth command”, and “output…the audio reading”, as drafted, are processes that, under broadest reasonable interpretation, cover performance of the limitation in the mind but for the recitation of generic computer components. More specifically, the mental process of a first human hearing a second human ask the first human to read a text, the first 
Regarding claim 7, the limitation(s) of “converting a text”, “providing the audio”, “receiving at least one…event”, “determining at least one note”, “generating a document”, and “providing the document”, as drafted, are processes that, under broadest reasonable interpretation, cover performance of the limitation in the mind but for the recitation of generic computer components. More specifically, the mental process of a first human marking on the printed text demarcation marks of the text and speaking the text aloud to a second human, hearing the second human ask for a formatting change or note to be added to the text and recognizing that the time of the request corresponds to a specific demarcation mark, deciding how and where to write the requested change on the text document, writing the changes into the document, and handing the updated document to the second human.
Regarding claim 11, the limitation(s) of “record an audio file”, “receive an input”, “determining an event time”, “receive the audio file”, “receive the at least one…event”, “split the audio file”, “convert the audio file”, “determine at least one note”, “generate a document”, and “provide the document”, as drafted, are processes that, under broadest 
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the --Mental Processes-- grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
This judicial exception is not integrated into a practical application because the recitation of “a computing device”, “speaker”, “microphone”, “memory”, and “processor” in claim 1, a “user device” of claim 7, and a “system”, “user device”, “microphone”, “first memory”, “first processor”, “server device”, “second memory”, and “second processor” in claim 11 reads to generalized computer components, based upon the claim interpretation wherein the structure is interpreted using page 35, lines 4-9, and page 38, lines 4-17, in the specification. Accordingly, these additional elements do not integrate 
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using generalized computer components to perform the listed limitations amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim is not patent eligible.
	With respect to claims 2, 4, and 5, the claims recite “the first/second/fourth command is a voice command”, which reads on a human making the request aloud. The additional recitation of a microphone reads to a generalized computer component as previously presented.

	With respect to claim 3, the claim recites “the start position…is identified in the first command”, which reads on a second human asking the first human to begin reading the text from a specific position. No additional limitations are present.

	With respect to claim 6, the claim recites “generate a new text file” and “provide the new text file”, which reads on a first human writing down the comment made by the second human onto the paper containing the text being read and handing the paper to the second human. No additional limitations are present.

	With respect to claim 8, the claim recites “receiving the text or a selection of the text”, which reads on a human being handed the text printed on paper from another human. The additional recitation of a user device reads to a generalized computer component as previously presented.

	With respect to claim 9, the claim recites “receiving a voice command”, which reads on a human making the request aloud. The additional recitation of a microphone reads to a generalized computer component as previously presented.

	With respect to claim 10, the claim recites “identifying a time”, “determining a text position”, and “generating the at least one note”, which reads on a human recognizing when during the text being read aloud the request for an edit was made, identifying the location in the printed text that corresponds to what was read, and writing down the edit at the location of the words the requesting human wanted the edit to be associated with. No additional limitations are present.

	With respect to claim 12, the claim recites “input…is received as a voice command”, which reads on a human making the request aloud. The additional recitation of a microphone reads to a generalized computer component as previously presented. 

	With respect to claim 13, the claim recites “receive a tag” and “modify the at least one note”, which reads on a human telling another human the category of the edit, and 

	With respect to claim 14, the claim recites “generate a text version”, “augment the text version”, and “provide the augmented text version”, which reads on a human writing down the dictated speech, writing down notes and edits based on the edit requests on the written speech, and handing the edited document to the speaker. The additional recitation of a user device reads to a generalized computer component as previously presented.

	With respect to claim 15, the claim recites “generating a new audio file”, which reads on a human recognizing a larger portion of the dictated speech as being related to the editing request. No additional limitations are present.

These claims further do not remedy the judicial exception being integrated into a practical application and further fail to include additional elements that are sufficient to amount to significantly more than the judicial exception.
	
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1, 2, and 4-6 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Duncan et al. (US Patent No. 7107533), hereinafter Duncan.

Regarding claim 1, Duncan teaches
A computing device (an electronic book device, i.e. computing device (3:4)), comprising:
a speaker to output audio signals (the book includes a second output device, such as an audio speaker, i.e. a speaker, that presents content in audio form, i.e. output audio signals (3:18-24,34-36));
a microphone to receive audio signals (an input device, i.e. receive, that includes a microphone, i.e. microphone, as part of an audio user interface, i.e. audio signals (3:18-26));
a memory that stores instructions and text (a data storage device, i.e. memory, that may contain computer-executable instructions, i.e. instructions (3:39-4:8), and one or more books in electronic form where the files contain the text content of the book, i.e. text (4:23-28)); and
a processor that executes the instructions to (a microprocessor, i.e. processor, executes the computer-executable instructions (3:27-4:1)):
receive a first command from a user to read the text (a voice command, i.e. first command, from a user, such as “begin speak”, can invoke recitation of an electronic book which has text content, i.e. read the text (4:23-28),(7:16-19), where the audio user interface receives the user command, i.e. receive (5:44-48));
determine a start position for reading the text (the abstract interface stores information such as the current position in the content being rendered, or the start position of the text being displayed, i.e. determine a start position, where displaying the text may occur by way of audio output via a speaker, i.e. reading the text (5:55-67));
output, via the speaker, an audio reading of the text to the user beginning at the start position (recitation may be invoked, i.e. output an audio, where the system begins reading the current page from the beginning, i.e. reading of the text to the user beginning at the start position (5:55-67),(7:4-19), where the audio speaker presents content, i.e. output, via the speaker (3:18-24,34-36));
receive a second command from the user to provide a comment (a user clicks on an annotation button, i.e. receive a second command, to put the system into text-note mode for entering an annotation, i.e. provide a comment (8:3-8));
record, via the microphone, the comment provided by the user at a current reading position in the text (a text-note mode provides the user with a text box in which to type an annotation associated with a position in the text, i.e. comment provided by the user at a current reading position in the text (7:60-8:11) an annotation ;
receive a third command from the user to format the text, wherein the third command is a voice command received via the microphone (presentation buttons can be invoked by a click, i.e. receive a third command from the user, that can control the size of the fonts used to display the text, i.e. format the text (7:16-21), and where the user commands able to be input in graphics-related ways can also be input using the voice recognition module that has a vocabulary for recognizing the command, i.e. command is a voice command (Fig 3),(6:45-51), and where the voice recognition module includes a microphone, i.e. received via the microphone (3:21-24));
modify at least one format characteristic of at least a portion of the text based on the third command received from the user (presentation buttons control the size of the font, i.e. modify at least one format characteristic, used to display the text, i.e. at least a portion of the text, where the presentation buttons can be invoked by a click or a voice command, i.e. based on the third command received from the user (Fig 3),(6:45-51),(7:16-21));
receive a fourth command from the user to modify the current reading position in the text (command buttons include navigation buttons, i.e. receive a fourth command from the user, that can be used to change pages, navigate up or down a single page, or go to the end or start of a book (6:59-62), where navigation by the user can alter the location of recitation to a different page or chapter, i.e. modify the current reading position in the text (11:36-51)); and
 output, via the speaker, the audio reading of the text to the user from the modified reading position (when a user changes pages or chapters, i.e. modified reading position, while the system is speaking a page, the system will create a new speech thread, where the thread is provided with the text content of the current page, i.e. the audio reading of the text (11:26-51), and where the audio speaker presents content, i.e. output, via the speaker (3:18-24,34-36)).  

Regarding claim 2, Duncan teaches claim 1, and further teaches
the first command is a voice command received via the microphone from the user to initiate the audio reading of the text (a voice command, i.e. first command is a voice command, from a user, such as “begin speak”, can invoke recitation of an electronic book which has text content, i.e. initiate the audio reading of the text (4:23-28),(7:16-19), where the audio user interface receives the user command from an input device that includes a microphone, i.e. received via the microphone from the user (3:18-26), (5:44-48)).  

	Regarding claim 4, Duncan teaches claim 1, and further teaches 
	the second command is a voice command received via the microphone from the user to input an audible comment (a user speaks a command such as “annotate”, i.e. second command is a voice command…from the user, followed by a spoken annotation, i.e. audible comment (7:50-59), and where an input device includes a microphone as part of an audio user interface, i.e. received via the microphone (3:18-26)).

	Regarding claim 5, Duncan teaches claim 1, and further teaches 
	the fourth command is a voice command received via the microphone to modify the current reading position in the text (a user can speak a command that is recognized as a navigation command by the audio interface, i.e. the fourth command is a voice command (7:41-49), that can be used to change pages, navigate up or down a single page, or go to the end or start of a book (6:59-62), where navigation by the user can alter the location of recitation to a different page or chapter, i.e. modify the current reading position in the text (11:36-51), and where an input device includes a microphone as part of an audio user interface, i.e. received via the microphone (3:18-26)).  

	Regarding claim 6, Duncan teaches claim 1, and further teaches 
generate a new text file to include at least one of (character offsets are used to indicate annotation positions in the rendered text to tell the JEditorPane where to put the highlights and annotation marks when displaying annotated text, i.e. generate a new text file (9:35-52)):
a text version of the comment provided by the user (an annotation can be a text note, i.e. text version of the comment, that is typed or dictated by the user, i.e. provided by the user (7:60-8:11), and stored as an annotation associated with the document (9:35-52)), the portion of the text with the modified at least one format characteristic, or a text version associated with the at least one format characteristic; and
 provide the new text file to the user (the annotations associated with the content, i.e. new text file (9:35-52), may be displayed with the content, such as through a GUI, i.e. provide…to the user (6:30-44)).  
	
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 3 is/are rejected under 35 U.S.C. 103 as being unpatentable over Duncan, and further in view of Shih et al. (U.S. PG Pub No. 2010/0153114), hereinafter Shih.

Regarding claim 3, Duncan teaches claim 1.
While Duncan provides the system recognizing any changes in page or chapter that would require a new audio stream, Duncan does not specifically teach that the navigation command is part of the command to begin reading, and thus does not teach
the start position for reading the text is identified in the first command.  
Shih, however, teaches the start position for reading the text is identified in the first command (voice commands for playback, i.e. reading the text, can include locations in the text, such as “Repeat Sentence”, “Repeat Paragraph”, “Next Paragraph”, “Next Chapter”, “Page N”, or “Restart”, where the playback will jump to the location in the command and start reading, i.e. start position…is identified in the first command [0035-6]).  
Duncan and Shih are analogous art because they are from a similar field of endeavor in enabling voice control of a system that audibly reads text for a user. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system recognizing any changes in page or chapter that would require a new audio stream teachings of Duncan with the voice command including language that identifies a location to start reading as taught by Shih. The motivation to do so would have been to achieve a predictable result of enabling the user to control play of an audio document, including reading specific portions of the document (Shih [0004]).

Claim(s) 7-10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Duncan, and further in view of Ganong, III (U.S. PG Pub No. 2014/0278354), hereinafter Ganong.

Regarding claim 7, Duncan teaches
A method (a method for presenting content (2:33-35)), comprising:
converting text to an audio version and a plurality of speech marks (a graphics output thread and audio output thread are run simultaneously so that each thread is at the same location in the content as the other thread, i.e. plurality of speech marks (2:41-49), where the audio speech thread is created from the text content of the current page as seen on the screen, i.e. converting text to an audio version (11:16-35));
providing the audio version to a user device (the audio output thread can be displayed to the user, i.e. providing the audio version, by an audio speaker included in the device, i.e. user device (1:56-59),(2:41-49) for outputting content);
receiving at least one highlight or vocabulary event from the user device, the at least one highlight or vocabulary event includes an event time position associated with the audio file (a user highlights text and clicks on an annotation button, i.e. receive at least one…vocabulary event from the user device, to put the system into text-note mode for entering an annotation (8:3-8), where an annotation is associated with a particular location in the text as represented by a highlight starting and ending offset, i.e. includes an event time position (9:35-67), and when a chapter file is read, the corresponding annotation file is read, where the graphics output thread and audio output thread are run simultaneously so that each thread is at the same location in the content as the other thread, i.e. associated with the audio file (2:41-49),(9:65-67));
determining at least one note from the text based on the event time position and the plurality of speech marks (an annotation is associated with a particular location in the text as represented by a highlight starting and ending offset, which is tied to a location in an audio output thread, i.e. based on the event time ;
generating a document with the at least one note (when retrieving the annotation, character offsets are used to tell the JEditorPane where to put the highlights and annotation marks when displaying the rendered annotated text, i.e. generating a document, and where displaying the annotation and associated content can be done using a graphical user interface, i.e. with the at least one note (2:30-32)); and
providing the document to the user device (output modes can include graphics and sound via a visual display and audio speaker of the device, i.e. user device (1:56-61), where a graphics output thread and audio output thread are run simultaneously and the audio speech thread is created from the text content of the current page as seen on the screen (2:41-49),(11:16-35), including displaying the annotation and associated content graphically, as well as reading the chapter file and corresponding annotation file, i.e. providing the document (2:30-32),(9:65-67)).  
While Duncan provides the recognition of location in an audio output thread, Duncan does not specifically teach that the location is a time.
Ganong, however, teaches that the source position in an audio representation may be represented in any suitable manner, such as a time into the audio representation where the same position is found [0052].


Regarding claim 8, Duncan in view of Ganong teaches claim 7, and Duncan further teaches
receiving the text or a selection of the text from the user device (the annotation is associated with the text clicked on or dragged over, i.e. selection of the text, by the user through the GUI, i.e. receiving…from the user device (7:55-8:6)).  

Regarding claim 9, Duncan in view of Ganong teaches claim 7, and Duncan further teaches
receiving a voice command from a user of the user device to obtain a portion of the text for the document (user input commands from the audio interface can be received, i.e. receiving a voice command from a user of the user device, and will update the graphics user interface in response (2:12-21), where a grammar for commands can include phrases such as “find (a word/a passage/ a phrase)”, i.e. obtain a portion of the text for the document (10:35-60)).  

Regarding claim 10, Duncan in view of Ganong teaches claim 7, and Duncan further teaches
identifying a time in the plurality of speech marks that matches the event time position (a graphics output thread and audio output thread are run simultaneously so that each thread is at the same location in the content as the other thread, i.e. plurality of speech marks (2:41-49), and where the abstract interface stores information such as the current position in the content being rendered, where rendering may be in a display or through the speaker, i.e. identifying a time (5:55-67),(6:26-29), and where commands, such as the spoken command “annotate”, are processed by the abstract interface to update the other interfaces, associating the annotation with the portion of the content being displayed, where displaying also refers to audio output, i.e. event time position (5:55-67),(7:50-59));
Where Ganong teaches that the location in an audio representation is specifically a time value [0052].
determining a text position in the text that corresponds to the identified time (commands, such as the spoken command “annotate”, are processed by the abstract interface to update the other interfaces, associating the annotation with the portion of the content being displayed, where displaying refers to visual, i.e. determining a text position in the text, and audio output (5:55-67),(7:50-59), and where the abstract interface stores information such as the current position in the content being rendered, where rendering may be in a display or through the speaker, i.e. identified time, (5:55-67),(6:26-29)); and
 generating the at least one note based on an identified number of sentences or an identified word in the text associated with the determined text position (commands can include phrases such as “find (a word/a passage/ a phrase)”, i.e. an identified number of sentences or an identified word in the text (10:35-60), and the spoken command “annotate”, can be processed by the abstract interface to update the other interfaces, associating the annotation with the portion of the content being displayed, where displaying refers to visual, i.e. determined text position in the text, and audio output (5:55-67),(7:50-59), where the user can type or dictate an annotation associated with the chosen text, i.e. generating the at least one note (7:60-8:11)).  
The motivation to combine is the same as previously presented.

Claim(s) 11-14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Quidilig et al. (U.S. PG Pub No. 2012/0004910), hereinafter Quidilig, in view of Lee et al. (U.S. PG Pub No. 2013/0311186), hereinafter Lee, and further in view of Duncan.

Regarding claim 11, Quidilig teaches,
A system, comprising (a system [0010]):
a user device that includes (the user connects to the server of the system using one of a number of devices, such as a telephone, a cellphone, or a computer, i.e. user device [0027]):
record an audio file…(the user speaks into a device such as the cellular telephone, and the sound is converted into a stream of digitized electrical signals, i.e. record an audio file [0037:8-15])
receive an input from a user identifying at least one highlight or vocabulary event associated with the audio file (as the user listens to the echo audio stream, i.e. associated with the audio file, corrections can be made, such as providing an alternate to the text, or making text bold/underlined/italicized, i.e. at least one highlight or vocabulary event, based on the user speaking a command, i.e. receive an input from a user [0069, including table of commands]); and
determining an event time position associated with each of the at least one highlight or vocabulary event (the user command to make a correction to the echo audio stream, i.e. associated with each of the at least one highlight or vocabulary event, is associated with a particular audio segment from the echo audio stream, i.e. determining an event time position [0069]); and
a server device that includes (the network connects the server, i.e. server device, to a plurality of users [0027]):
a second memory that stores second instructions (the server includes program code storage, i.e. second memory, that includes instructions, i.e. second instructions [0087]); and
a second processor that executes the second instructions to (the server includes a processor, i.e. second processor, that executes instructions, i.e. second instructions [0087]):
receive the audio file from the user device (the input audio stream input to the user’s cellular telephone, i.e. audio file from the user device, is sent over the network to the server, i.e. receive [0037:8-15]);
receive the at least one highlight or vocabulary event associated with the audio file from the user device (editing commands spoken by the user, i.e. the at least one highlight or vocabulary event, as the user listens to the echo audio stream, i.e. associated with the audio file, are part of the user’s input audio stream that is sent to the server, i.e. receive…from the user device [0037:8-15],[0038],[0040]).
While Quidilig provides the input of audio and edit commands at a user device and sending the information to a server for further processing, Quidilig does not specifically teach splitting the audio file into separate audio files for each input command, or the features of the user device, and thus does not teach
a microphone to receive audio signals;
a first memory that stores first instructions;
a first processor that executes the first instructions to:
split the audio file into separate audio files for each of the at least one highlight or vocabulary event based on the event time position for each of the at least one highlight or vocabulary event;
convert the separate audio files into separate text files;
determine at least one note for each separate text file;
generate a document with the at least one note; and
provide the document to the user device.  
Lee, however, teaches a microphone to receive audio signals (the AV input unit of a mobile terminal may include a microphone for receiving external audio signals [0402],[0415],[0417]);
a first memory that stores first instructions (the mobile terminal includes a memory, i.e. first memory, that may store a program, i.e. first instructions [0402],[0439]);
a first processor that executes the first instructions to (a method may be implemented as codes, i.e. first instructions, readable by a processor, i.e. first processor [0519]):
split the audio file into separate audio files for each of the … event based on the event time position for each of the … event (user may select an audio section, such as selecting a portion of a progress bar corresponding to the audio file, i.e. event…based on the event time position for each…event [0298], and a separate audio file is generated from the partial audio section, i.e. split the audio file into separate audio files [0307]);
Where Quidilig teaches that the event is a highlight or vocabulary event [0069].
convert the separate audio files into separate text files (when storing an audio file, i.e. separate audio files, a text file containing the text generated by STT may be also stored along with the audio file, i.e. convert…into separate text files [0184]).
Quidilig and Lee are analogous art because they are from a similar field of endeavor in enabling a user to edit dictated information. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the input of audio and edit commands at a user device and sending the information to a server for further processing teachings of Quidilig with the generation of new audio files based on selected content as taught by Lee. The motivation to do so would have been to achieve a predictable result of enabling a user 
While Quidilig in view of Lee provides the generation of a separate audio file and subsequent conversion into text, Quidilig does not specifically teach the creation of notes from the text files, and thus does not teach
determine at least one note for each separate text file;
generate a document with the at least one note; and
provide the document to the user device.  
Duncan, however, teaches determine at least one note for each separate text file (a text-note annotation made by the user is identified as such, i.e. determine at least one note (8:3-10), where an annotation is stored as an annotation file, i.e. for each separate text file (9:35-52));
generate a document with the at least one note (when retrieving the annotation, character offsets are used to tell the JEditorPane where to put the highlights and annotation marks when displaying the rendered annotated text, i.e. generating a document, and where displaying the annotation and associated content can be done using a graphical user interface, i.e. with the at least one note (2:30-32),(9:35-52)); and
provide the document to the user device (output modes can include graphics and sound via a visual display and audio speaker of the device, i.e. user device (1:56-61), where a graphics output thread and audio output thread are run simultaneously and the audio speech thread is created from the text content of the current page as seen on the screen (2:41-49),(11:16-35), including displaying the annotation and associated .  
Quidilig, Lee, and Duncan are analogous art because they are from a similar field of endeavor in enabling a user to edit information. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the generation of a separate audio file and subsequent conversion into text teachings of Quidilig, as modified by Lee, with the display of an annotation as stored in a file as taught by Duncan. The motivation to do so would have been to achieve a predictable result of allowing an annotation to be read along with the associated text (Duncan (9:63-67)).

	Regarding claim 12, Quidilig in view of Lee and Duncan teaches claim 11, and Quidilig further teaches
	the input received from the user identifying the at least one highlight or vocabulary event is received as a voice command … (as the user listens to the echo audio stream, corrections can be made, such as providing an alternate to the text, or making text bold/underlined/italicized, i.e. at least one highlight or vocabulary event, based on the user speaking a command, i.e. input received from the user…is received as a voice command [0069, including table of commands]).  
	And Lee teaches a microphone (the AV input unit of a mobile terminal may include a microphone for receiving external audio signals, i.e. received…via a microphone [0402],[0415],[0417]).
	Where the motivation to combine is the same as previously presented.

receive a tag provided by the user of the user device identifying a category associated with the at least one highlight or vocabulary event (the user input to create an annotation includes clicking on different buttons, i.e. provided by the user of the user device…associated with the at least one highlight or vocabulary event, where each button identifies the type of annotation, i.e. identifying a category, where the type of annotation is used as a header, i.e. a tag, in the annotation file (7:60-8:13),(9:54-10:15)); and
modify the at least one note to include the tag (the type of annotation is used as a header, i.e. include the tag, in the annotation file, i.e. modify the at least one note (7:60-8:13),(9:54-10:15)).  
Where the motivation to combine is the same as previously presented.

Regarding claim 14, Quidilig in view of Lee and Duncan teaches claim 11, and Quidilig further teaches
generate a text version of the audio file (the user’s speech as an input audio stream, i.e. audio file, is obtained by the server and processed by a speech to text function to convert the audio into text, i.e. generate a text version [0037:8-15],[0038]);
augment the text version based on the at least one highlight or vocabulary event (corrections can be made to the user input text, i.e. augment the text version, such as providing an alternate to the text, or making text bold/underlined/italicized, i.e. .
And Duncan further teaches provide the augmented text version to the user device (output modes can include graphics via a visual display of the device, i.e. user device (1:56-61), including displaying the annotation and associated content graphically, i.e. providing the augmented text version (2:30-32),(9:65-67)).  
Where the motivation to combine is the same as previously presented.

Claim(s) 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Quidilig, in view of Lee, in view of Duncan, and further in view of Ganong.

Regarding claim 15, Quidilig in view of Lee and Duncan teaches claim 11.
While Quidilig in view of Lee and Duncan provides the creation of a new audio file based on user-selected input, Quidilig in view of Lee and Duncan does not specifically teach that the chosen amount of the audio section includes a period of time before and after the input event, and thus does not teach
		the splitting of the audio file into separate audio files for each of the at least one highlight or vocabulary event includes generating a new audio file for each of the at least one highlight or vocabulary event to include a first portion of time prior to a corresponding event time position and a second portion of time after the corresponding event time position.
Ganong, however, teaches the splitting of the audio file into separate audio files for each of the at least one highlight or vocabulary event includes generating a new audio file for each of the at least one highlight or vocabulary event to include a first portion of time prior to a corresponding event time position and a second portion of time after the corresponding event time position (a server may execute a position determination engine (PDE) upon receiving a request from a reader to find a particular location, i.e. each of the at least one highlight or vocabulary event, in an audiobook, i.e. audio file [0039], where the request identifies a source position in the audio representation, where the source position may be represented as time into the audio representation where the same position is found, i.e. corresponding event time position [0052], and the audio segment, i.e. separate audio file, used to identify and confirm the location of the source position may be a longer segment that includes the source position at a specific position within the audio segment, where the segment may include time before, i.e. first portion of time prior, and after the source position, i.e. a second portion of time after [0063]).
Quidilig, Lee, Duncan, and Ganong are analogous art because they are from a similar field of endeavor in enabling a user to find specific information in audio files. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the creation of a new audio file based on user-selected input teachings of Quidilig, as modified by Lee and Duncan, with the use of different lengths of audio segments surrounding a particular source position as taught by Ganong. The motivation to do so would have been to achieve a predictable result of enabling a method to identify a target audio position in an audio representation of a work (Ganong [0002]).


Conclusion

	
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICOLE A K SCHMIEDER whose telephone number is (571)270-1474.  The examiner can normally be reached on 8:00 - 5:00 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached on (571) 272-7799.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access 






/NICOLE A K SCHMIEDER/Examiner, Art Unit 2659

/PIERRE LOUIS DESIR/Supervisory Patent Examiner, Art Unit 2659