DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This action is made final.
Claims 1-7, 10, 12, 14-18, and 20-25 are pending in the case. Claims 1, 10, and 15 are independent claims. Claims 8, 9, 11, 13, and 19 have been canceled.

Priority
	This application is a Continuation-In-Part of application 15/852,350 filed 12/22/2017. However, Applicant is not granted the priority date of 12/22/2017 because the claims are not fully supported by the disclosure of application 15/852,350. For purposes of applying prior art, the effective filing date 05/29/2020 of the instant application is considered.

Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.


The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 15-18, 20, and 22 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Claim 15 recites “audial descriptors, including keyframes for the audio, wherein the keyframes indicate timings for spoken words, spoken characters, spoken sentences, or spoken sentence fragments corresponding to words, characters, sentences, or sentence fragments of the text” from the end of page 7 to the beginning of page 8 of the claims. The claim then recites “wherein the keyframes further indicate beginning timestamps and ending timestamps…based on a visual cue”. The Specification merely refers to such a visual cue in paragraph [0190], where the visual cue is used in a specific embodiment for synchronization applied to a video “with a non-verbal communication method”, as keyframes are generated based on detected visual cues like signed words. Thus, there is a lack of support for a visual cue that determines the timestamps for the keyframes when verbal communication is indeed involved.
Dependent claims 17, 18, 20, and 22 are also rejected due to inheriting the deficiencies of claim 15.


The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 15-18 and 20-22 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant regards as the invention.

Claim 15 recites “audial descriptors, including keyframes for the audio, wherein the keyframes indicate timings for spoken words, spoken characters, spoken sentences, or spoken sentence fragments corresponding to words, characters, sentences, or sentence fragments of the text” from the end of page 7 to the beginning of page 8 of the claims. The claim then recites “wherein the keyframes further indicate beginning timestamps and ending timestamps…based on a visual cue”. The claim is indefinite because, while the keyframes are associated with the audio, the keyframes are also recited to have beginning and ending timestamps based on a visual cue, which contradicts the spoken elements. As a result of the lack of support and the contradiction brought up by the claim’s indefiniteness, Examiner cannot logically include “based on a visual cue” as part of the claim interpretation and the recitation will be considered null. Applicant is advised to review the Specification and amend with the proper support.
Dependent claims 17, 18, 20, and 22 are also rejected due to inheriting the deficiencies of claim 15.

Regarding claim 21, the claim recites “the confidence score” in the last two lines of the claim. There is insufficient antecedent basis for this limitation in the claim. Examiner interprets this limitation as “the confidence value”.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
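A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.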


Claims 1, 3, 4, 23, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Kurzweil et al. (US 2010/0318363 A1), in view of Montiel (US 2018/0165987 A1), in view of Bullock (US 2011/0261030 A1), in view of McQuiggan et al. (US 2014/0013192 A1), in view of Cragun et al. (US 2004/0080532 A1), and in view of Chang et al. (US 2007/0166683 A1).

Regarding claim 1, Kurzweil teaches a system for an improved eReader interface, comprising:
a memory (storage 16 of FIG. 1, [0023], and [0025-0027]);
a processor coupled with the memory (processor 14 of FIG. 1 and [0023]);
text and audio relating to a digital book, wherein the text includes at least two language sets of text ([0025-0026] and [0046-0047]: text may include at least two language sets of text; FIG. 2 and [0028]: the text may be related to a digital book), and wherein the audio includes at least two language sets of audio (end of [0026] to [0027] and [0046-0047]: audio includes at least two language sets of audio corresponding to different languages based on various voice models included in a database; FIG. 2 and [0028]: the audio may be related to a digital book);
a graphical user interface (GUI) (see the GUI shown on user display 51 in FIG. 2 and [0028]); and
keyframes for the audio, wherein the keyframes are derived from force alignment of the audio to the text (FIG. 10 and [0059]: force alignment involves synchronizing the timestamps for spoken words, spoken characters, spoken sentences, or spoken sentence fragments corresponding to words, characters, sentences, or sentence fragments of the text; a time mark is “an indication of an elapsed time period from the start of the audio recording to each word in the sequence of words”);
wherein the processor is operable to:
highlight the words, the characters, the sentences, or the sentence fragments of the text (FIG. 10 and [0059]: the system is operable to highlight aspects such as words of the text);
wherein the system is further operable to:
play back the audio and synchronize the play back with the highlighting (FIG. 10 and [0059]: the system can play back the audio and synchronize the playback with the highlighting. The system may, for example, highlight at least one word of the text for a time);
highlight at least one word, at least one character, at least one sentence, or at least one sentence fragment of the text for a time (FIG. 10 and [0059]: the system can play back the audio and synchronize the playback with the highlighting. The system may, for example, highlight at least one word of the text for a time); wherein the highlighting and the play back occurs based on the language set and a corresponding language set of the at least two language sets of audio (FIG. 10 and [0059]: highlighting and the playback occurs with respect to 
highlight the words, the characters, the sentences, or the sentence fragments based on a word selection, a character selection, a sentence selection, or a sentence fragment selection (FIG. 4 and [0037]: the system is operable to highlight a portion of the text, including words, based on selection of the portion of the text).


Kurzweil does not explicitly teach wherein the processor is operable to display text corresponding to a selected language set of the at least two language sets of text via the GUI;
wherein the system is further operable to:
wherein the highlighting and the playback occurs based on the selected language set and a corresponding language set of the at least two language sets of audio;
play back audio corresponding to the word selection, the character selection, the sentence selection, or the sentence fragment selection based on the keyframes; 
create an alternate language dynamic text container, wherein the alternate language dynamic text container is operable to display an alternate language text corresponding to an alternate selected language set of the at least two language sets of text via the GUI, wherein the alternate language text corresponds to the 
position the alternate language dynamic text container such that the alternate language dynamic text container does not overlap with the text corresponding to the selected language set of the at least two language sets of text; and
reposition the alternate language dynamic text container after the system finishes the highlighting and/or the play back audio corresponding to the word selection, the character selection, the sentence selection, or the sentence fragment selection.

Montiel teaches wherein the processor is operable to:
display text corresponding to a selected language set of the at least two language sets of text (FIG. 2 and [0016]: the system may display, for example, Spanish text that was selected as supported in [0019]);
highlight the words, the characters, the sentences, or the sentence fragments based on a word selection, a character selection, a sentence selection, or a sentence fragment selection (FIGS. 1-4 and [0021]: highlighting and the playback occurs based on the selected language set and a corresponding language set of the at least two language sets of audio)
	wherein the system is further operable to:
the selected language set of the at least two language sets of text and a corresponding language set of the at least two language sets of audio (FIGS. 1-4, [0007], [0017-0019], and [0021]: highlighting and the playback occurs based on the selected language set of text and a corresponding language set of the at least two language sets of audio);
play back audio corresponding to the word selection, the character selection, the sentence selection, or the sentence fragment selection (FIGS. 1-4 and [0021]: playback audio corresponds to, for example, sentence selection);
create an alternate language dynamic text container, wherein the alternate language dynamic text container is operable to display an alternate language text corresponding to an alternate selected language set of the at least two language sets of text via the GUI, wherein the alternate language text further corresponds to the text corresponding to an alternate selected language set of the at least two language sets of text (FIG. 1 and 3, [0016-0017], and [0019]: the dynamic text container is the partitioned region that displays blocks of second language data 20. In this case, the text is in English and corresponds to the text of the selected language set);
position the alternate language dynamic text container such that the alternate language dynamic text container does not overlap with the text corresponding to the selected language set of the at least two language sets of text (FIG. 1 and 3, [0016], and [0019]: the alternate language dynamic text 

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil to incorporate the teachings of Montiel and include displaying text corresponding to a selected language set of the at least two language sets of text; wherein the highlighting and the play back occurs based on the selected language set of the at least two language sets of text and a corresponding language set of the at least two language sets of audio; play back audio corresponding to the word selection, the character selection, the sentence selection, or the sentence fragment selection based on the keyframes; create an alternate language dynamic text container, wherein the alternate language dynamic text container is operable to display an alternate language text corresponding to an alternate selected language set of the at least two language sets of text via the GUI, wherein the alternate language text corresponds to the text corresponding to the selected language set of the at least two language sets of text, wherein the alternate language dynamic text container is operable to display an alternate language text corresponding to an alternate selected language set of the at least two language sets of text; and position the alternate language dynamic text container such that the alternate language dynamic text container does not overlap with the text corresponding to the selected language set of the at least two language sets of text. Doing so would allow the user to switch between language sets of text so that the user is not limited to viewing only a single language set of text at a time. Also, the user would be able to select just a desired, 
Kurzweil in view of Montiel does not explicitly teach repositioning the alternate language dynamic text container after the system finishes the highlighting and/or the play back audio corresponding to the word selection, the character selection, the sentence selection, or the sentence fragment selection.

Bullock teaches create an alternate language dynamic text container, wherein the alternate language dynamic text container is operable to display an alternate language text corresponding to an alternate selected language set of the at least two language sets of text via the GUI, wherein the alternate language text corresponds to the text corresponding to the selected language set of the at least two language sets of text, wherein the alternate language dynamic text container is operable to display an alternate language text corresponding to an alternate selected language set of the at least two language sets of text (FIG. 13 and [0116-0119]: alternate language dynamic text container hosts translated term 46 of alternate selected language, Spanish, and corresponds to selected term 18 in first language, English);

reposition the alternate language dynamic text container after the system finishes the highlighting and/or the play back audio corresponding to the word selection, the character selection, the sentence selection, or the sentence fragment selection (FIG. 17 and [0123]: for example, the alternate language dynamic text container is repositioned after finishing highlighting of word selection for “village”.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the alternate dynamic text container of Kurzweil in view of Montiel to incorporate the teachings of Bullock and reposition the alternate language dynamic text container after the system finishes the highlighting and/or the play back audio corresponding to the word selection, the character selection, the sentence selection, or the sentence fragment selection. Doing so would further ensure that the alternate dynamic text container does not obstruct corresponding text in the first language from the user even after highlighting is done so that, for example, if a new portion of text is highlighted then the alternate dynamic text container remains functional and displayed concurrently with the corresponding original text. By repositioning the alternate dynamic text container, the user may continue to compare words and more easily learn or recognize new words by visually comparing corresponding text of both languages.
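For illustration only, the repositioning rationale above can be sketched in code. The sketch below is not taken from Bullock, Montiel, or the instant application; the names `Rect`, `overlaps`, and `reposition`, the `gap` and `viewport_h` parameters, and the below-then-above placement policy are all hypothetical assumptions chosen to show one simple way a container could be moved so that it does not cover newly highlighted text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in screen coordinates (hypothetical type)."""
    x: float
    y: float
    w: float
    h: float


def overlaps(a: Rect, b: Rect) -> bool:
    """Return True when the two rectangles share any interior area."""
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)


def reposition(container: Rect, highlighted: Rect,
               viewport_h: float, gap: float = 4.0) -> Rect:
    """Move the container off the highlighted text if the two overlap.

    Assumed policy: place the container just below the highlighted
    text; if it would run past the viewport, place it just above.
    """
    if not overlaps(container, highlighted):
        return container  # nothing to do
    below = replace(container, y=highlighted.y + highlighted.h + gap)
    if below.y + below.h <= viewport_h:
        return below
    return replace(container, y=highlighted.y - gap - container.h)
```

Under these assumptions, calling `reposition` each time a new word finishes highlighting keeps the translated text visible beside, rather than on top of, the original text.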

Kurzweil in view of Montiel and in view of Bullock does not explicitly teach wherein the keyframes indicate beginning timestamps and ending timestamps.

McQuiggan teaches wherein the keyframes indicate beginning timestamps and ending timestamps for spoken words, spoken characters, spoken sentences, or spoken sentence fragments corresponding to words, characters, sentences, or sentence fragments of the text (FIG. 14 and [0076-0077]: keyframes include the start time and end time for each word.);
wherein the system is further operable to play back the audio and synchronize the playback with the highlighting, wherein the system is operable to highlight at least one word, at least one character, at least one sentence, or at least one sentence fragment of the text for a time according to the keyframes (FIG. 14, [0072-0073], [0076-0077]: keyframes include the start time and end time for each word. Each word is highlighted as the audio is played back with respect to the keyframes, which maintain synchronization);
wherein the system is further operable to play back audio corresponding to the word selection, the character selection, the sentence selection, or the sentence fragment selection based on the keyframes (FIGS. 16-17 and [0080]: the user is able to select a text sub-element to play back corresponding audio).
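For illustration only, keyframes that carry a start time and an end time for each word can be sketched as follows. This is not McQuiggan's implementation or that of the instant application; the `Keyframe` type, the function names, and the sample timings are hypothetical assumptions. The sketch shows how such keyframes could drive both synchronized highlighting (which word is active at a given playback time) and playback of a selected range of words.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Keyframe:
    """Timing record for one word (hypothetical structure)."""
    word_index: int
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


def active_keyframe(keyframes: Sequence[Keyframe],
                    t: float) -> Optional[Keyframe]:
    """Return the keyframe whose [start, end) interval contains t.

    Assumes keyframes are sorted by start time and do not overlap.
    Returns None during silence between words.
    """
    starts = [k.start for k in keyframes]
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < keyframes[i].end:
        return keyframes[i]
    return None


def playback_range(keyframes: Sequence[Keyframe],
                   first: int, last: int) -> Tuple[float, float]:
    """Return the audio span covering a selection of words first..last."""
    return keyframes[first].start, keyframes[last].end


# Made-up sample timings for three consecutive words.
KEYFRAMES = [
    Keyframe(0, 0.00, 0.40),
    Keyframe(1, 0.45, 0.90),
    Keyframe(2, 0.90, 1.50),
]
```

In this sketch, a player would call `active_keyframe` on each timer tick to decide which word to highlight, and `playback_range` when the user selects a word or sentence to hear.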
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the keyframes of Kurzweil in view of Montiel and in view of Bullock to incorporate the teachings of McQuiggan and have wherein the keyframes indicate beginning timestamps and ending timestamps. Doing so 
Although Bullock teaches that a user may perform word highlighting, sentence highlighting, and sentence fragment highlighting (FIGS. 13 and 17, [0116-0119], [0123], and [0125]), Kurzweil in view of Montiel, in view of Bullock, and in view of McQuiggan does not explicitly teach wherein the system is operable to provide highlighting preference options via the GUI, wherein the highlighting preference options include selections for word highlighting, character highlighting, sentence highlighting, and sentence fragment highlighting; and wherein the system is operable to highlight the words, the characters, the sentences, or the sentence fragments of the text based on a selection of the highlighting preference options received via the GUI.

Cragun teaches wherein the system is operable to provide highlighting preference options via the GUI, wherein the highlighting preference options include selections for word highlighting, sentence highlighting, and sentence fragment highlighting (FIG. 8 and [0044]: see the highlighting preference options/menu 810 in GUI 800, which includes selections for word highlighting, sentence highlighting, and sentence fragment highlighting);

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil in view of Montiel, in view of Bullock, and in view of McQuiggan to incorporate the teachings of Cragun and have wherein the system is operable to provide highlighting preference options via the GUI, wherein the highlighting preference options include selections for word highlighting, sentence highlighting, and sentence fragment highlighting; and wherein the system is operable to highlight the words, the characters, the sentences, or the sentence fragments of the text based on a selection of the highlighting preference options received via the GUI. Doing so would allow the user to adjust highlighting options so that the user may select a highlighting mode that is most effective for that user to read text. For example, a user may prefer highlighting words rather than highlighting sentences as the user can focus on less content at a given time, while highlighting full sentences may detract from being able to focus on key words.

Kurzweil in view of Montiel, in view of Bullock, in view of McQuiggan, and in view of Cragun does not explicitly teach character highlighting as a highlighting preference option.

Chang teaches character highlighting as a highlighting option ([0017] and [0053-0054]: character highlighting may be implemented with audio synchronization.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the highlighting preference options of Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, and in view of Cragun to incorporate the teachings of Chang and include character highlighting. Again, doing so would allow the user to adjust highlighting options so that the user may select a highlighting mode that is most effective for that user to read text. In the case of a beginning reader, for example, it may be beneficial to include character highlighting, as each character is emphasized so that the user may clearly distinguish the corresponding pronunciation, or audio. This could help prevent the user from being overwhelmed when too much text is highlighted at a time, which would prevent the user from being able to follow the audio with the highlighted text.

Regarding claim 3, Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang teaches the system of claim 1. Kurzweil in view of McQuiggan further teaches wherein the system is operable to perform the force alignment, wherein the force alignment includes synthesizing text from the spoken words, spoken characters, spoken sentences, or spoken sentence fragments, matching the synthesized text to the text derived from the digital book, and determining keyframes for the audio corresponding to the matched synthesized text and text derived from the digital book (Kurzweil, FIG. 10 and [0059]: force alignment 

Regarding claim 4, Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang teaches the system of claim 1. McQuiggan further teaches wherein the text is extracted from the digital book, and wherein the system is operable to generate and store textual descriptors for each of the words, characters, sentences, or sentence fragments of the text, including definitions, translations, and/or a number of occurrences (FIG. 10 and [0075]: textual descriptors include definitions).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang to incorporate the further teachings of McQuiggan and have wherein the text is extracted from the digital book, and wherein the system is operable to generate and store textual descriptors for each of the words, characters, sentences, or sentence fragments of the text, including definitions, translations, and/or a number of occurrences. Doing so would allow the system to store more information about text that may be helpful for the user in case the user, for example, does not understand the meaning of certain words.

Regarding claim 23, Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang teaches the system of claim 1. Bullock further teaches wherein the at least one word, the at least one character, the at least one sentence, or the at least one sentence fragment of the text includes at least two words, at least two characters, at least two sentences or at least two sentence fragments of the text, wherein the at least two words include a first word and a second word, wherein the at least two characters include a first character and a second character, wherein the at least two sentences include a first sentence and a second sentence, wherein the at least two sentence fragments include a first sentence fragment and a second sentence fragment (FIG. 13 and [0116-0119], FIG. 16 and [0122]: for example, there are at least two words, including “village”/first word and “confront”/second word), wherein the system is configured to separately highlight the first word, the second word, the first character, the second character, the first sentence, the second sentence, the first sentence fragment, and the second sentence fragment, wherein the system is further configured to reposition the alternate language dynamic text container when highlighting the second word, the second character, the second sentence, or the second sentence fragment after completing highlighting the first word, the first character, the first sentence, or the first sentence fragment to prevent the alternate language dynamic text container from overlapping with the second word, the second character, the second sentence, or the second sentence fragment (FIG. 13 and [0116-0119], FIG. 16 and [0122]: the system separately highlights the first word and the second word and the dynamic text container is repositioned, preventing it from overlapping with the second word “confront” as seen in FIG. 16, as opposed to FIG. 13).

Regarding claim 24, Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang teaches the system of claim 1. Montiel further teaches wherein the system is further configured to simultaneously highlight the alternate language text and the text corresponding to the selected language set of the at least two language sets of text (end of [0007], [0017], FIG. 1, [0020]: text of both languages are simultaneously highlighted).

Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Kurzweil et al. (US 2010/0318363 A1), in view of Montiel (US 2018/0165987 A1), in view of Bullock (US 2011/0261030 A1), in view of McQuiggan et al. (US 2014/0013192 A1), in view of Cragun et al. (US 2004/0080532 A1), in view of Chang et al. (US 2007/0166683 A1), and in view of Hamaker et al. (US 9684641 B1).

Regarding claim 2, Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang teaches the system of claim 1. Kurzweil in view of McQuiggan further teaches wherein the audio includes audial descriptors wherein the audial descriptors include at least the keyframes (Kurzweil, FIG. 10 and [0059]: time marks correspond to keyframes) (McQuiggan, FIG. 14, [0072-0073], and [0076-0077]: keyframes include the start time and end time for each word.), a corresponding word (Kurzweil, FIG. 10 and [0059]: audial descriptors of time marks correspond to their corresponding words), an audial runtime of the corresponding word (Kurzweil, [0060-0063]: an audial runtime of the corresponding word may be a 
While Kurzweil and Montiel teach the text includes textual descriptors including a language, Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang does not explicitly teach wherein the textual descriptors includes at least a page number, a word or character length, and a language, for each of the words, characters, sentences, or sentence fragments of the text.
Hamaker teaches wherein the textual descriptors includes at least a page number (FIG. 5, Col. 12, lines 44-61, and Col. 16, lines 37-41: page number is associated with text in content portion 506, for example.; FIG. 7 and Col. 15, lines 5-33: location information 724 is part of metadata 722), a word or character length (FIG. 8 and Col. 15, line 34 to Col. 16, line 26: a word length is equivalent to the tokens comprising the word; See also FIG. 9 and Col. 17, lines 31-53 for another example with tokens for English and Spanish text), and a language (FIG. 7 and Col. 15, lines 5-33: language information 724 is included), for each of the words, characters, sentences, or sentence fragments of the text.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang to incorporate the teachings of Hamaker and have wherein the textual descriptors includes at least a page number, a word or character length, and a language, for each of the words, characters, sentences, or sentence fragments of the text. Doing so would .

Claims 5 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Kurzweil et al. (US 2010/0318363 A1), in view of Montiel (US 2018/0165987 A1), in view of Bullock (US 2011/0261030 A1), in view of McQuiggan et al. (US 2014/0013192 A1), in view of Cragun et al. (US 2004/0080532 A1), in view of Chang et al. (US 2007/0166683 A1), and in view of Nicol et al. (US 2014/0120503 A1).

Regarding claim 5, Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang teaches the system of claim 1. While Kurzweil teaches loading a digital book and displaying the corresponding digital book (FIG. 2 and [0028]), Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang does not explicitly teach wherein the GUI is further configured to receive an indication of a digital book selection, load a corresponding digital book from the at least one database, and display the corresponding digital book.
Nicol teaches wherein the GUI is further configured to receive an indication of a digital book selection (FIG. 6 and [0062]: a user may select a digital book), load a corresponding digital book from the at least one database (FIG. 2, [0032], and [0036]: dBook file is generated/loaded from a database/memory module 208), and display the 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the GUI of Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang to incorporate the teachings of Nicol and have wherein the GUI is further configured to receive an indication of a digital book selection, load a corresponding digital book from the at least one database, and display the corresponding digital book. Doing so would allow the user to select a desired digital book from a database rather than being limited to certain content locally available.

Regarding claim 6, Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang teaches the system of claim 1. Montiel further teaches wherein the text corresponding to the selected language set is displayed in a dynamic text container, wherein the dynamic text container is configured to display electronic markup, stylesheet, and/or semi-structured data according to textual descriptors of the text (FIGS. 1-4 and [0016-0017]: the dynamic text container is the partitioned region that displays blocks of first language data 18. In this example, the text is in Spanish).
Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang does not explicitly teach textual descriptors including a font size and a typeface.

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang to incorporate the teachings of Nicol and have the textual descriptors include a font size and a typeface. Doing so would allow the system to maintain a defined aesthetic of the text so that the text appears appealing and, more importantly, effectively readable to the user.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Kurzweil et al. (US 2010/0318363 A1), in view of Montiel (US 2018/0165987 A1), in view of Bullock (US 2011/0261030 A1), in view of McQuiggan et al. (US 2014/0013192 A1), in view of Cragun et al. (US 2004/0080532 A1), in view of Chang et al. (US 2007/0166683 A1), in view of Nicol et al. (US 2014/0120503 A1), and in view of Skaggs (US 2007/0206022 A1).

Regarding claim 7, Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, in view of Chang, and in view of Nicol teaches the system of claim 6. Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, in view of Chang, and in view of Nicol does not explicitly teach wherein dimensions of the dynamic text container are preset, and wherein the dynamic text container is further configured to enable scrolling for overflow text within the dynamic text container.
Skaggs teaches wherein the GUI is further operable to display a dynamic text container, wherein the dynamic text container is operable to display text according to the descriptors and provide an interactive scrolling method for text if the text overflows boundaries of the dynamic text container ([0029] and [0036]: after a file 104 provides the bases for rendering document 106, as described in [0016] and FIG. 1, the text component 116 of the file 104 helps render the text 120 of the document 106 defined by an overflow attribute that enables interactive scrolling for text when the text overflows boundaries of the dynamic text container of which size, as a descriptor, is maintained from the text component 116. Note how the rendered document 106 may be a GUI for a webpage as described in [0018]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, in view of Chang, and in view of Nicol to incorporate the teachings of Skaggs and have a dynamic text container that provides a scrolling method when the text overflows the container’s boundaries. Doing so would help retain the layout of the text as originally intended in the descriptors by .

Claims 10 and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Kurzweil et al. (US 2010/0318363 A1), in view of Montiel (US 2018/0165987 A1), in view of McQuiggan et al. (US 2014/0013192 A1), in view of Cragun et al. (US 2004/0080532 A1), in view of Chang et al. (US 2007/0166683 A1), and in view of Kung et al. (US 2010/0153091 A1).

Regarding claim 10, Kurzweil teaches a method for an improved eReader interface, comprising:
receiving text and audio, wherein the text includes at least two language sets of text ([0025-0026] and [0046-0047]: text may include at least two language sets of text; FIG. 2 and [0028]: the text may be related to a digital book), and wherein the audio includes at least two language sets of audio (end of [0026] to [0027] and [0046-0047]: audio includes at least two language sets of audio corresponding to different languages based on various voice models included in a database; FIG. 2 and [0028]: the audio may be related to a digital book);
displaying text corresponding to the 
deriving keyframes for the audio via force alignment of the audio to the text (FIG. 10 and [0059]: force alignment involves synchronizing the text with audio. This can be timestamps for spoken words, spoken characters, spoken sentences, or spoken sentence fragments corresponding to words, characters, sentences, or sentence fragments of the text (FIG. 10 and [0059]: a time mark is “an indication of an elapsed time period from the start of the audio recording to each word in the sequence of words”);
highlighting the words, the characters, the sentences, or the sentence fragments of the text (FIG. 10 and [0059]: the system is operable to highlight aspects such as words of the text);
playing the audio and synchronizing the playing with the highlighting, wherein the highlighting includes highlighting the at least one word, the at least one character, the at least one sentence, or the at least one sentence fragment of the text for a time (FIG. 10 and [0059]: the system can play back the audio and synchronize the playback with the highlighting. The system may, for example, highlight at least one word of the text for a time); wherein the highlighting and the play back occurs based on the language set and a corresponding language set of the at least two language sets of audio (FIG. 10 and [0059]: highlighting and the playback occurs with respect to text and audio, which correspond to the language set of the text and corresponding language set of the audio, respectively);
receiving a word selection, a character selection, a sentence selection, a graphic selection or a sentence fragment selection (FIG. 4 and [0037]: the system is operable to highlight a portion of the text, including words, based on selection of the portion of the text);

playing audio corresponding to the word selection, the character selection, the sentence selection, the graphic selection, or the sentence fragment selection 

Kurzweil does not explicitly teach receiving a selected language set of the at least two language sets of text;
displaying text corresponding to the selected language text;
wherein the highlighting and the playback occurs based on the selected language set and a corresponding language set of the at least two language sets of audio;
playing audio corresponding to the word selection, the character selection, the sentence selection, the graphic selection, or the sentence fragment selection based on the keyframes.

Montiel teaches receiving a selected language set of the at least two language sets of text (FIG. 2, [0016], and [0019]: Spanish selected);

highlighting the words, the characters, the sentences, or the sentence fragments based on a word selection, a character selection, a sentence selection, or a sentence fragment selection (FIGS. 1-4 and [0021]: highlighting and the playback occurs based on the selected language set and a corresponding language set of the at least two language sets of audio);
wherein the highlighting and the playback occurs based on the selected language set and a corresponding language set of the at least two language sets of audio (FIGS. 1-4, [0007], [0017-0019], and [0021]: highlighting and the playback occurs based on the selected language set of text and a corresponding language set of the at least two language sets of audio); and
playing audio corresponding to the word selection, the character selection, the sentence selection, the graphic selection, or the sentence fragment selection based on the keyframes (FIGS. 1-4 and [0021]: playback audio corresponds to, for example, sentence selection);

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil to incorporate the teachings of Montiel and include receiving a selected language set of the at least two language sets of text; displaying text corresponding to the selected language text; wherein the highlighting and the playback occurs based on the selected language set 

Kurzweil in view of Montiel does not explicitly teach wherein the keyframes indicate beginning timestamps and ending timestamps.

McQuiggan teaches wherein the keyframes indicate beginning timestamps and ending timestamps for spoken words, spoken characters, spoken sentences, or spoken sentence fragments corresponding to words, characters, sentences, or sentence fragments of the text (FIG. 14 and [0076-0077]: keyframes include the start time and end time for each word.);
wherein the system is further operable to playback the audio and synchronize the playback with the highlighting, wherein the system is operable to highlight at least one word, at least one character, at least one sentence, or at least one sentence fragment of the text for a time according to the keyframes (FIG. 14, [0072-0073], [0076-0077]: 
wherein the system is further operable to playback audio corresponding to the word selection, the character selection, the sentence selection, or the sentence fragment selection based on the keyframes (FIGS. 16-17 and [0080]: the user is able to select a text sub-element to play back corresponding audio).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the keyframes of Kurzweil in view of Montiel to incorporate the teachings of McQuiggan and have wherein the keyframes indicate beginning timestamps and ending timestamps. Doing so would provide clear definitions for when distinct constituents of text, like words, start and end. This would help prevent the overlapping of words if only the start time is recorded, for example, with the end time not being defined. Since words have keyframes that definitively set a start time and an end time, each word is designated a clear time block that helps prevent words from coinciding with each other, either in storage and/or during playback.
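For illustration only, the rationale above can be sketched in code. The sketch below is not taken from McQuiggan or any other cited reference; the `Keyframe` type, its field names, and the helper functions are hypothetical and assume each word carries both a beginning and an ending timestamp in seconds. It shows how an explicit end time gives each word a bounded time block, so that a playback position falling in a pause maps to no word and overlapping blocks can be detected.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Keyframe:
    """Hypothetical keyframe: one word with a begin and end time in seconds."""
    word: str
    start: float
    end: float


def active_keyframe(keyframes: List[Keyframe], t: float) -> Optional[Keyframe]:
    """Return the keyframe whose half-open interval [start, end) contains t.

    Assumes keyframes are sorted by start time. Returns None when t falls
    before the first word, after the last word, or in a gap between words.
    """
    starts = [k.start for k in keyframes]
    i = bisect_right(starts, t) - 1  # last keyframe starting at or before t
    if i >= 0 and t < keyframes[i].end:
        return keyframes[i]
    return None


def has_overlap(keyframes: List[Keyframe]) -> bool:
    """True if any word's time block runs past the start of the next word."""
    return any(a.end > b.start for a, b in zip(keyframes, keyframes[1:]))


# Illustrative data only; the timings are made up.
kfs = [Keyframe("el", 0.0, 0.3), Keyframe("gato", 0.35, 0.9)]
current = active_keyframe(kfs, 0.5)
```

With only start times recorded, the position 0.32 in this data would still be attributed to the first word; with both timestamps it is attributed to no word, which is the distinction the rationale relies on.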

Kurzweil in view of Montiel and in view of McQuiggan does not explicitly teach providing highlighting preference options via the GUI, wherein the highlighting preference options include selections for word highlighting, character highlighting, sentence highlighting, and sentence fragment highlighting; and receiving a selection of the highlighting preference options.


receiving a selection of the highlighting preference options (FIG. 8 and [0044]: see the highlighting preference options/menu 810 in GUI 800, which includes selections for word highlighting, sentence highlighting, and sentence fragment highlighting. Certain text would be highlighted according to the selected option).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil in view of Montiel and in view of McQuiggan to incorporate the teachings of Cragun and include providing highlighting preference options via the GUI, wherein the highlighting preference options include selections for word highlighting, character highlighting, sentence highlighting, and sentence fragment highlighting; and receiving a selection of the highlighting preference options. Doing so would allow the user to adjust highlighting options so that the user may select a highlighting mode that is most effective for that user to read text. For example, a user may prefer highlighting words rather than highlighting sentences as the user can focus on less content at a given time, while highlighting full sentences may detract from being able to focus on key words.



Chang teaches character highlighting as a highlighting option ([0017] and [0053-0054]: character highlighting may be implemented with audio synchronization.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the highlighting preference options of Kurzweil, in view of Montiel, in view of McQuiggan, and in view of Cragun to incorporate the teachings of Chang and include character highlighting. Again, doing so would allow the user to adjust highlighting options so that the user may select a highlighting mode that is most effective for that user to read text. In the case of a beginning reader, for example, it may be beneficial to include character highlighting, as each character is emphasized so that the user may clearly distinguish the corresponding pronunciation, or audio. This could help prevent the user from being overwhelmed when too much text is highlighted at a time, which would prevent the user from being able to follow the audio with the highlighted text.

Kurzweil, in view of Montiel, in view of McQuiggan, in view of Cragun, and in view of Chang, does not explicitly teach automatically or manually inserting non-printing characters between at least two of the at least one word, the at least one character, the at least one sentence, or the at least one sentence fragment of the text, wherein the non-printing characters do not increase a distance between the at least one word, the at least one character, the at least one sentence, or the at least one sentence fragment of 
outputting computer-readable code that separates the at least one word, the at least one character, the at least one sentence, or the at least one sentence fragment of the text into individually highlightable elements based on the non-printing character.

Kung teaches automatically or manually inserting non-printing characters between at least two of the at least one word, the at least one character, the at least one sentence, or the at least one sentence fragment of the text, wherein the non-printing characters do not increase a distance between the at least one word, the at least one character, the at least one sentence, or the at least one sentence fragment of the text on the GUI, and wherein the non-printing characters are not visible via the GUI ([0018], [0021-0022], [0027], [0030], [0035-0036], [0049-0052]: non-printing characters/separators, like backslashes, may be inserted between at least two characters, for example, to perform word-breaking that separates the at least one character into individually highlightable elements based on the backslashes. The non-printing characters would not be visible as they act merely as separators that indicate separation of words for system processing. The result is distinguished words within a string of characters); and
outputting computer-readable code that separates the at least one word, the at least one character, the at least one sentence, or the at least one sentence fragment of the text into individually highlightable elements based on the non-printing character (FIG. 8 and [0051-0052]: individual words are highlightable).
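For illustration only, the claimed separation step can be sketched as follows. The sketch is not drawn from Kung or from the application; it assumes, hypothetically, that the non-printing separator is the Unicode zero-width space (U+200B) and that the "computer-readable code" is HTML in which each separated element is wrapped in its own span. The function name, the `hl` class, and the `data-index` attribute are likewise assumptions made for the example.

```python
import html

# Assumed separator: zero-width space, which occupies no visible width.
ZWSP = "\u200b"


def to_highlightable_html(text: str, sep: str = ZWSP) -> str:
    """Split text on the non-printing separator and wrap each piece in a span.

    Each span carries an index so that a highlighter could address one
    element at a time. Empty pieces (e.g., from doubled separators) are
    dropped, and the content is HTML-escaped.
    """
    parts = [p for p in text.split(sep) if p]
    return "".join(
        f'<span class="hl" data-index="{i}">{html.escape(p)}</span>'
        for i, p in enumerate(parts)
    )


# Illustrative input: two words with one invisible separator between them.
marked = "你好" + ZWSP + "世界"
markup = to_highlightable_html(marked)
```

Because the separator has no width, the marked string renders the same as the unmarked one, while the generated spans give each word its own highlightable element.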


Regarding claim 25, Kurzweil, in view of Montiel, in view of McQuiggan, in view of Cragun, in view of Chang, and in view of Kung teaches the method of claim 10. Montiel further teaches creating an alternate language dynamic text container, wherein the alternate language dynamic text container is operable to display an alternate language text corresponding to an alternate selected language set of the at least two language sets of text via the GUI (FIG. 1 and 3, [0016-0017], and [0019]: the dynamic text container is the partitioned region that displays blocks of second language 
positioning the alternate language dynamic text container such that the alternate language dynamic text container does not overlap with the text corresponding to the selected language set of the at least two language sets of text (FIGS. 1 and 3, [0016], and [0019]: the alternate language dynamic text container corresponding to English does not overlap with the selected language set of Spanish).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil, in view of Montiel, in view of McQuiggan, in view of Cragun, in view of Chang, and in view of Kung to incorporate the further teachings of Montiel and include creating an alternate language dynamic text container, wherein the alternate language dynamic text container is operable to display an alternate language text corresponding to an alternate selected language set of the at least two language sets of text via the GUI; and positioning the alternate language dynamic text container such that the alternate language dynamic text container does not overlap with the text corresponding to the selected language set of the at least two language sets of text. Doing so would prevent obscuring of the text of the first language so that the user may compare words and more easily learn or recognize new words by visually comparing corresponding text of both languages.

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Kurzweil et al. (US 2010/0318363 A1), in view of Montiel (US 2018/0165987 A1), in view of McQuiggan et al. (US 2014/0013192 A1), in view of Cragun et al. (US 2004/0080532 A1), in view of Chang et al. (US 2007/0166683 A1), in view of Kung et al. (US 2010/0153091 A1), in view of Pegg et al. (US 2014/0123311 A1), and in view of Morris et al. (US 2014/0325407 A1).

Regarding claim 12, Kurzweil, in view of Montiel, in view of McQuiggan, in view of Cragun, in view of Chang, and in view of Kung teaches the method of claim 10. Kurzweil, in view of Montiel, in view of McQuiggan, in view of Cragun, in view of Chang, and in view of Kung does not explicitly teach the method further comprising tracking an amount of time the text is displayed via the GUI and determining a number of digital books read.
	Pegg teaches tracking an amount of time the text is displayed via the GUI ([0048]: an amount of time text is displayed via the GUI, for example the time spent on each page, is tracked).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil, in view of Montiel, in view of McQuiggan, in view of Cragun, in view of Chang, and in view of Kung to incorporate the teachings of Pegg and track an amount of time the text is displayed via the GUI. Doing so would allow the system to discern the reading level of the user so as to recommend more relevant content for the user (see Pegg [0026-0029]) to support the growth of the user’s reading skills.
Kurzweil, in view of Montiel, in view of McQuiggan, in view of Cragun, in view of Chang, in view of Kung, and in view of Pegg does not explicitly teach determining a number of digital books read.

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil, in view of Montiel, in view of McQuiggan, in view of Cragun, in view of Chang, in view of Kung, and in view of Pegg to incorporate the teachings of Morris and determine a number of digital books read. Doing so would allow the system to collect more information that more accurately represents the user to, for example, help target relevant content (see [0004-0005] of Morris).

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Kurzweil et al. (US 2010/0318363 A1), in view of Montiel (US 2018/0165987 A1), in view of McQuiggan et al. (US 2014/0013192 A1), in view of Cragun et al. (US 2004/0080532 A1), in view of Chang et al. (US 2007/0166683 A1), in view of Kung et al. (US 2010/0153091 A1), and in view of Nagata (US 2017/0078504 A1).

Regarding claim 14, Kurzweil, in view of Montiel, in view of McQuiggan, in view of Cragun, in view of Chang, and in view of Kung teaches the method of claim 10. Kurzweil, in view of Montiel, in view of McQuiggan, in view of Cragun, in view of Chang, and in view of Kung does not explicitly teach reversing a layout of graphics and/or mirroring the graphics, wherein the selected language set is a right-to-left language set.

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil, in view of Montiel, in view of McQuiggan, in view of Cragun, in view of Chang, and in view of Kung to incorporate the teachings of Nagata and reverse a layout of graphics and/or mirror the graphics, wherein the selected language set is a right-to-left language set. Doing so would maintain the intended directional viewing of graphics in the same manner as originally designed. For example, the graphics, if left unreversed for a right-to-left language set, could unintentionally offer spoilers for what may be described in the text. By reversing the graphics, the original reading experience can be maintained. In this way, a timeline of graphics is prevented from being displayed out-of-sequence.

Claims 15, 16, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Kurzweil et al. (US 2010/0318363 A1), in view of Montiel (US 2018/0165987 A1), in view of Cragun et al. (US 2004/0080532 A1), and in view of McQuiggan et al. (US 2014/0013192 A1).

Regarding claim 15, Kurzweil teaches a system for an improved eReader interface, comprising:
a memory (storage 16 of FIG. 1, [0023], and [0025-0027]);

text and audio, wherein the text includes at least one language set of text ([0025-0026] and [0046-0047]: text may include at least two language sets of text; FIG. 2 and [0028]: the text may be related to a digital book), and wherein the audio includes at least one language set of audio (end of [0026] to [0027] and [0046-0047]: audio includes at least two language sets of audio corresponding to different languages based on various voice models included in a database; FIG. 2 and [0028]: the audio may be related to a digital book);
a graphical user interface (GUI) (see the GUI shown on user display 51 in FIG. 2 and [0028]); and
audial descriptors, including keyframes for the audio, wherein the keyframes indicate timings for spoken words, spoken characters, spoken sentences, or spoken sentence fragments corresponding to words, characters, sentences, or sentence fragments of the text  (FIG. 10 and [0059]: force alignment involves synchronizing the text with audio. This can be done through speech recognition process, which generates output files denoting the time marks/keyframes for the audio);
wherein the processor is operable to:
highlight the words, the characters, the sentences, or the sentence fragments of the text (FIG. 10 and [0059]: the system is operable to highlight aspects such as words of the text);
play back the audio and synchronize the play back with the highlighting (FIG. 10 and [0059]: the system can play back the audio and synchronize the 
wherein the system is operable to:
highlight at least one word, at least one character, at least one sentence, or at least one sentence fragment of the text for a time according to the keyframes (FIG. 10 and [0059]: the system can play back the audio and synchronize the playback with the highlighting. The system may, for example, highlight at least one word of the text for a time); and
wherein the highlighting and the play back occurs based on the language set of the at least one language set of text and a corresponding language set of the at least one language set of audio (FIG. 10 and [0059]: highlighting and the playback occurs with respect to text and audio, which correspond to the language set of the text and corresponding language set of the audio, respectively).
Kurzweil does not explicitly teach displaying text corresponding to a selected language set of the at least one language set of text, and wherein the highlighting and the playback occurs based on the selected language set and a corresponding language set of the at least one language set of audio; and wherein the system is further operable to playback audio corresponding to the word selection, the character selection, the sentence selection, or the sentence fragment selection based on the keyframes.
Montiel teaches displaying text corresponding to a selected language set of the at least two language sets of text (FIG. 2 and [0016]: the system may display, for example, Spanish text that was selected as supported in [0019]);
the selected language set and a corresponding language set of the at least two language sets of audio (FIGS. 1-4, [0007], [0017-0019], and [0021]: highlighting and the playback occurs based on the selected language set of text and a corresponding language set of the at least two language sets of audio);
wherein the system is further operable to highlight the words, the characters, the sentences, or the sentence fragments based on a word selection, a character selection, a sentence selection, or a sentence fragment selection (FIGS. 1-4 and [0021]: highlighting and the playback occurs based on the selected language set and a corresponding language set of the at least two language sets of audio); and
wherein the system is further operable to play back audio corresponding to the word selection, the character selection, the sentence selection, or the sentence fragment selection (FIGS. 1-4 and [0021]: playback audio corresponds to, for example, sentence selection).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil to incorporate the teachings of Montiel and include displaying text corresponding to a selected language set of the at least two language sets of text; wherein the highlighting and the play back occurs based on the selected language set and a corresponding language set of the at least two language sets of audio; wherein the system is further operable to play back audio corresponding to the word selection, the character selection, the sentence selection, or the sentence fragment selection based on the keyframes. Doing so would allow the user to switch between language sets of text so that the user is not limited to 

Kurzweil in view of Montiel does not explicitly teach wherein the system is operable to highlight the words, the characters, the sentences, or the sentence fragments of the text based on a highlighting preference selection received via the GUI.
Cragun teaches wherein the system is operable to highlight the words, the characters, the sentences, or the sentence fragments of the text based on a selection of the highlighting preference options received via the GUI (FIG. 8 and [0044]: see the highlighting preference options/menu 810 in GUI 800, which includes selections for word highlighting, sentence highlighting, and sentence fragment highlighting. Certain text would be highlighted according to the selected option).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil in view of Montiel to incorporate the teachings of Cragun and have wherein the system is operable to highlight the words, the characters, the sentences, or the sentence fragments of the text based on a selection of the highlighting preference options received via the GUI. Doing so would allow the user to adjust highlighting options so that the user may select a highlighting mode that is most effective for that user to read text. For example, a user may prefer highlighting words rather than highlighting sentences as the user can focus on less 

Kurzweil, in view of Montiel, and in view of Cragun does not explicitly teach wherein the keyframes indicate beginning timestamps and ending timestamps.

McQuiggan teaches wherein the keyframes indicate beginning timestamps and ending timestamps for spoken words, spoken characters, spoken sentences, or spoken sentence fragments corresponding to words, characters, sentences, or sentence fragments of the text (FIG. 14 and [0076-0077]: keyframes include the start time and end time for each word.);
wherein the system is further operable to playback the audio and synchronize the playback with the highlighting, wherein the system is operable to highlight at least one word, at least one character, at least one sentence, or at least one sentence fragment of the text for a time according to the keyframes (FIG. 14, [0072-0073], [0076-0077]: keyframes include the start time and end time for each word. Each word is highlighted as the audio is played back with respect to the keyframes, which maintain synchronization);
wherein the system is further operable to playback audio corresponding to the word selection, the character selection, the sentence selection, or the sentence fragment selection based on the keyframes (FIGS. 16-17 and [0080]: the user is able to select a text sub-element to play back corresponding audio).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the keyframes of Kurzweil, in view 

Regarding claim 16, Kurzweil, in view of Montiel, in view of Cragun, and in view of McQuiggan teaches the system of claim 15. Montiel further teaches wherein the system is further operable to receive a second language selection and modify the selected language set to be a second language set of the at least two language sets of the text (FIGS. 2 and 4 and [0019]: a second language is selected to modify the selected language set).

Regarding claim 22, Kurzweil, in view of Montiel, in view of Cragun, and in view of McQuiggan teaches the system of claim 15. Montiel further teaches wherein the at least one language set of text further includes at least two language sets of text (FIGS. 1 and 3 and [0016-0017]: there are at least two language sets of text included), wherein the system further includes an alternate language dynamic text container, wherein the alternate language dynamic text container is operable to display an alternate language text corresponding to an alternate selected language set of the at least two language 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil, in view of Montiel, in view of Cragun, and in view of McQuiggan to incorporate the further teachings of Montiel and have wherein the at least one language set of text further includes at least two language sets of text, wherein the system further includes an alternate language dynamic text container, wherein the alternate language dynamic text container is operable to display an alternate language text corresponding to an alternate selected language set of the at least two language sets of text, wherein the system is further configured to simultaneously highlight the alternate language text and the text corresponding to the selected language set of the at least two language sets of text, wherein the system is configured to provide play back for the text corresponding to the selected language set of text or the second language set of text. Doing so would allow the user to switch .

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Kurzweil et al. (US 2010/0318363 A1), in view of Montiel (US 2018/0165987 A1), in view of Cragun et al. (US 2004/0080532 A1), in view of McQuiggan et al. (US 2014/0013192 A1), in view of Pegg et al. (US 2014/0123311 A1), and in view of Morris et al. (US 2014/0325407 A1).

Regarding claim 17, Kurzweil, in view of Montiel, in view of Cragun, and in view of McQuiggan teaches the system of claim 15. Kurzweil, in view of Montiel, in view of Cragun, and in view of McQuiggan does not explicitly teach wherein the system is further operable to track metrics for reading the text, including a number of books read, a reading time, and activity related to the text.
	Pegg teaches wherein the system is further operable to track metrics for reading the text, including a reading time and activity related to the text ([0048]: an amount of time text is displayed via the GUI, for example the time spent on each page, is tracked).

Kurzweil, in view of Montiel, in view of Cragun, in view of McQuiggan, and in view of Pegg does not explicitly teach wherein the system is further operable to track metrics for reading the text, including a number of books read.
Morris teaches wherein the system is further operable to track metrics for reading the text, including a number of books read and activity related to the text ([0056]: metrics for reading the text include a number of books read and a reading speed, corresponding to reading the text).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil, in view of Montiel, in view of Cragun, in view of McQuiggan, and in view of Pegg to incorporate the teachings of Morris and have wherein the system is further operable to track metrics for reading the text, including a number of books read. Doing so would allow the system to collect more information that more accurately represents the user to, for example, help target relevant content (see [0004-0005] of Morris).

Claims 18 and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kurzweil et al. (US 2010/0318363 A1), in view of Montiel (US 2018/0165987 A1), in view of Cragun et al. (US 2004/0080532 A1), in view of McQuiggan et al. (US 2014/0013192 A1), and in view of Mbenkum et al. (US 2012/0324355 A1).

Regarding claim 18, Kurzweil in view of Montiel, in view of Cragun, and in view of McQuiggan teaches the system of claim 15. Kurzweil in view of Montiel, in view of Cragun, and in view of McQuiggan does not explicitly teach wherein the system is further operable to record video with sound and store the video with the sound, wherein the video with the sound is associated with a page or spread of the GUI.
Mbenkum teaches wherein the system is further operable to record video with sound and store the video with the sound, wherein the video with the sound is associated with a page or spread of the GUI (FIG. 5, [0051-0054], and [0036-0038]: the system can record video with sound and store the video with sound as part of a recorded session. The video with sound is associated with a page of a GUI.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil, in view of Montiel, in view of Cragun, and in view of McQuiggan to incorporate the teachings of Mbenkum and have wherein the system is further operable to record video with sound and store the video with the sound, wherein the video with the sound is associated with a page or spread of the GUI. Doing so would allow the user to not only record audio but include video that could help supplement understanding of the content displayed on the page. 

Regarding claim 20, Kurzweil in view of Montiel, in view of Cragun, in view of McQuiggan, and in view of Mbenkum teaches the system of claim 18. Montiel further teaches wherein at least one graphic is stored with corresponding descriptors related to at least one audio file or at least one audio clip, and wherein upon receiving a selection of the at least one graphic via the GUI, the system is operable to playback the at least one audio file or the at least one audio clip based on the corresponding descriptors (FIGS. 1-4, [0017], and [0021]: for example, a graphic corresponding to a play button in control input 14, when selected, allows for playback of at least one audio file based on descriptors from logic and/or set of rules for synchronous audio output).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Kurzweil, in view of Montiel, in view of Cragun, in view of McQuiggan, and in view of Mbenkum to incorporate the further teachings of Montiel and have wherein at least one graphic is stored with corresponding descriptors related to at least one audio file or at least one audio clip, and wherein upon receiving a selection of the at least one graphic via the GUI, the system is operable to playback the at least one audio file or the at least one audio clip based on the corresponding descriptors. Doing so would allow the user to not only interact with text, but also graphics that could play back audio. In this particular example of the play button graphic, the user does not have to select specific text to hear audio playback, which may require multiple inputs, but can simply select the play button to .

Claim 21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kurzweil et al. (US 2010/0318363 A1), in view of Montiel (US 2018/0165987 A1), in view of Bullock (US 2011/0261030 A1), in view of McQuiggan et al. (US 2014/0013192 A1), in view of Cragun et al. (US 2004/0080532 A1), in view of Chang et al. (US 2007/0166683 A1), and in view of Port et al. (US 2019/0362732 A1).

Regarding claim 21, Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang teaches the system of claim 1. Kurzweil further teaches wherein the system is further configured to capture user audio data, wherein the user audio data includes spoken words, spoken characters, spoken phrases, a spoken sentence, a spoken sentence fragment and/or a spoken paragraph related to the text from the digital book ([0057-0065]: user audio is captured, the user audio data including spoken words related to the text from the digital book) (See also McQuiggan, FIG. 4, [0065-0067]: user audio is captured, including spoken words).
Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang does not explicitly teach wherein the system is further configured to generate a confidence value for each word, each character, each punctuation mark, each syllable, each phrase, each sentence, each sentence fragment, or each paragraph based on a comparison of the user audio data and the audio relating to the digital book, and wherein the system is further configured to match the user audio 
Port teaches wherein the system is further configured to generate a confidence value for each word, each character, each punctuation mark, each syllable, each phrase, each sentence, each sentence fragment, or each paragraph based on a comparison of the user audio data and the audio, and wherein the system is further configured to match the user audio data with the text relating to the digital book when the confidence score is above a threshold (FIG. 1, [0028-0031], [0034-0037], FIG. 5 and [0050-0052]: for example, words or even phonemes are compared between the synthesized dialogue/audio relating to the text and the audio signal/user audio data. A speech similarity score corresponding to a confidence score is measured between such synchronized speech and the audio signal. When this score exceeds a certain threshold, then the audio data is matched with the text, as exemplified in [0056]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the user audio data and the audio relating to the digital book of Kurzweil, in view of Montiel, in view of Bullock, in view of McQuiggan, in view of Cragun, and in view of Chang to incorporate the teachings of Port and have the system generate a confidence value for each word, each character, each punctuation mark, each syllable, each phrase, each sentence, each sentence fragment, or each paragraph based on a comparison of the user audio data and the audio relating to the digital book, and wherein the system is further configured to match the user audio data with the text relating to the digital book when the confidence score is above a threshold. Doing so would improve the accuracy of user audio data and prevent the .

Response to Arguments
Applicant’s arguments with respect to the claims have been considered but are moot because the new ground of rejection does not rely on the combination of references applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Please see the attached PTO-892 Notice of References Cited Form for additional prior art.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KENNY NGUYEN whose telephone number is (571)272-4980.  The examiner can normally be reached on M-Th 7AM to 5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar, can be reached on 571-270-3169.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/K.N./Examiner, Art Unit 2171
/ABDULLAH AL KAWSAR/Supervisory Patent Examiner, Art Unit 2171