DETAILED ACTION

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
This communication is responsive to the applicant's amendment dated 09/08/2022.  The applicant(s) amended claims 1-2, 4-5 and 9-10 (see the amendment: pages 2-5).
The examiner has withdrawn the previous claim rejection under 35 USC 112(b), because the applicant amended the corresponding claim(s).
	
Response to Arguments
Applicant's arguments filed on 09/08/2022 with respect to the claim rejection under 35 USC 102 have been fully considered but are moot in view of the new ground(s) of rejection, since the amended claims introduce new issues and/or change the scope of the claims. Accordingly, the response to the applicant’s arguments (see Remarks: page 6, paragraph 6 to page 7, paragraph 2) based on the amended claims is directed to the new claim rejection on the necessitated new ground(s) (see details below).  It is also noted that the previously cited references are still applicable to the amended claims for the prior art rejection on the necessitated new ground(s) (which may include newly combined teachings and/or interpretations) (see the detailed rejection below).

Claim Rejections - 35 USC § 112
Claim 10 is rejected under 35 U.S.C. 112(b), as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Regarding claim 10, it recites the limitation of “the hardware processor on the display device (Note: the underlined portion is the limitation newly added by the applicant's amendment).”  There is insufficient antecedent basis for this limitation in the claim(s).

Claim Rejections - 35 USC § 102
Claims 1-2 and 4-11 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by LATORRE-MARTINEZ et al. (US 2015/0042662) hereinafter referenced as LATORRE.
As per claim 10, as best understood in view of claim rejection under 112 (b) see above, LATORRE discloses ‘synthetic audiovisual storyteller’ (title) including a method and system (apparatus) for ‘animating a computer generation of a head (read on ‘nonverbal information generation’) and displaying the text of an electronic book’ (abstract, p(paragraph)40, p74), comprising:
partitioning (‘dividing’) text (such as ‘input text’) into predetermined units (read on ‘acoustic units’ including but not limited to ‘phonemes or graphemes’, or ‘each section of text’ with ‘text display indicators’ including ‘single words’ or ‘longer passage’ within/of the/said ‘text’, in a broad sense), (p42, p46, p52, p159); 
displaying (or ‘outputting…as video’) the text partitioned into the predetermined units (same above), (p40, p46, p49-p53, p57); and 
making nonverbal information (read on ‘animating a computer generation and displaying the text of an electronic book’, or ‘sequence of image vectors as video such that the mouth of said head moves’) that represents first information about behavior (including but not limited to: ‘lip movement of the head’ or ‘an animated head’) of a verbal output agent (read on ‘the speaker’, ‘the character speaking the text’) or second information (same/similar as stated above) about behavior of a receiver (read on ‘listener’, or ‘each character on receipt of a…input text’) of verbal information (read on related ‘text’ and ‘speech’ data, in light of the specification: p414) of the verbal output agent (same above), the nonverbal information (same above) corresponding to the predetermined units of the text (same above) when the verbal output agent outputs the verbal information visible in association with the predetermined units of the text (same above, wherein the ‘outputting’ includes ‘sequence of text display indicator’ comprising ‘single words of text’ as ‘video’ and/or ‘sequence of speech vectors’ as ‘audio’ and either or both of them is/are ‘synchronized with the lip movement of the head’) (Figs. 8, 9A and 9B, p40-p57, p110, p70-p74),
wherein the predetermined units of the text and the nonverbal information (same above) are displayed (or ‘output’/‘render’ as ‘text’ such as ‘word’ and/or ‘video’ such as ‘a talking head’, or ‘mouth’ or ‘lip movement of head’), by the hardware processor (‘processor’) on the display device (‘displaying unit’) at the same time (read on ‘simultaneously’ or being ‘synchronized’), (p40, p46-p52, p57-p58, also see Figs. 6 and 10, p114-p115, p135-p138).
	As per claim 1, it recites an apparatus. The rejection is based on the same reason described for claim 10, because the claim recites/includes the same/similar limitation(s) as claim 10, wherein limitations regarding “display device” and “a hardware processor” are also disclosed by LATORRE (Figs. 6 and 10, p114-p115, p135-p138: ‘display unit’ and ‘processor’ executing ‘program’ (instructions) stored in ‘storage or memory’).
As per claim 2 (depending on claim 1), LATORRE further discloses “…controls the display device so as to generate (‘produce’) the nonverbal information (same above, such as talking head) on a basis of feature quantities (read on ‘text display vector’, ‘text display indicators’ or related ‘timing and duration of the display of each section of the text’) of the text or feature quantities (read on ‘speech vectors’ or converted ‘speech parameters’) of voice (‘speech’) corresponding to the text (same above) and a learned (trained) nonverbal information generation model (read on ‘statistic model’ used to ‘convert said sequence of acoustic unit to a sequence of image vectors’ comprising ‘parameters’ to ‘define a face of said head’, or related ‘image model’), and cause the display device to display the nonverbal information and the text”, (Figs. 27, 19, p40, p49, p57-p58, p72-p94, p133, p161, p248).
As per claim 4 (depending on claim 2), LATORRE further discloses that “wherein time information (‘time’ or ‘duration’) representing times of the predetermined units of the text (same above) is assigned to the text (read on ‘determine the timing and duration of the display of each section of text’ in a broad sense), and the hardware processor controls the display device so as to cause the display device to display the nonverbal information (same above) in association (synchronized) with the predetermined units of the text (such as ‘outputting said sequence of text display indicators (with text) as video which is synchronized with the lip movement of the head’), on a basis of time-information-stamped nonverbal information (such as ‘timed sound effects and an animated head’) generated on a basis of time-information-stamped feature quantities of the text (such as ‘timed text display indicator’ related to ‘text display vectors’) or the voice (such as ‘speech’ related to ‘speech vectors’) corresponding to the text, the learned nonverbal information generation model (same above), and the time information representing the times assigned to the predetermined units of the text (same above)” (LATORRE: p46, p56-p58, p72, p145, p356).
As per claim 5 (depending on claim 2), LATORRE further discloses that “wherein the display device displays in a state in which a setting of additional information (read on ‘sound effects’, ‘text display indicator’ with ‘subtitles’, ‘information about how the head should output speech from a further source’, or ‘additional information’ given ‘in the input to allow expression to be selected’ such as from ‘the user interface system’) is receivable, and the hardware processor, upon receiving the setting of the additional information, controls the display device so as to cause the display device to display nonverbal information generated on a basis of the feature quantities of the text or the feature quantities of the voice corresponding to the text (same as stated for claim 4), the additional information, and the learned nonverbal information generation model, and the text (same as stated for claim 4)” (LATORRE: p123, p147, p154; also see p46, p56-p58, p72, p145, p356).
As per claim 6 (depending on claim 1), LATORRE further discloses “…displays (‘output’ with ‘movement of the face’) in a state (situation) in which a change instruction (read on recognized ‘command’ regarding ‘change the weighting to introduce a new expression’, or received ‘instructions’ regarding ‘emotion or expression’ for collecting ‘audio and video data’ during ‘training image’ including ‘pose change’) of the nonverbal information (same above) is receivable” (LATORRE: p149-p152, p387-p391).
As per claim 7 (depending on claim 1), LATORRE further discloses “…upon receiving a change instruction of the nonverbal information (same as stated for claim 6), generates a combination (or synchronization, or read on an operation on ‘video mixer’) of the feature quantities of the text or the feature quantities of the voice corresponding to the text (same as stated for claim 2) and the nonverbal information (same as stated for claim 2) changed in accordance with the change instruction (same/similar as stated for claim 6) as learning (training) data for learning (training) the nonverbal information generation model (same as stated for claim 1, including ‘audiovisual model’)” (LATORRE: p58, p101, p120, p124-p131, p200, p248, p400).
As per claim 8 (depending on claim 7), LATORRE further discloses “…a relearning instruction (similar to change instruction as stated for claims 6 and 7, for which the command/instruction regarding ‘re-estimate model’, ‘re-estimate weights’, ‘re-compute/re-build/reconstruct decision trees’ for ‘the training of a system for a head generation system where the weightings are factorised’) of the nonverbal information generation model (same above) and a setting of a weight (such as ‘a set of expression dependent weights’) for the generated learning (training) data are receivable, and the hardware processor, upon receiving the relearning instruction and the setting of the weight (same above), uses (‘using’ or ‘applying’) the generated learning data and the weight that has been set to cause a nonverbal information generation model learning apparatus to learn (‘training’) the nonverbal information generation model (same above)”, (LATORRE: Figs. 19, 23-25, p29, p197, p215-p248, p287, p387-p389).
As per claim 9 (depending on claim 4), LATORRE further discloses “wherein the time information is assigned on a basis of a result of partitioning a range (such as ‘duration’ as stated above) of time when the text is output in accordance with the number of partitions (read on ‘number of syllables in the word’) when the text has been partitioned in the predetermined units”, (LATORRE: p46, p57, p145-p147, p159).
As per claim 11, it recites a non-transitory computer-readable medium.  The rejection is based on the same reason described for claim 1, because the claim recites/includes the same/similar limitation(s) as claim 1 (see above).

Claim Rejections - 35 USC § 103
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over LATORRE in view of TILTON et al. (US 9,582,080) hereinafter referenced as TILTON.
As per claim 3 (depending on claim 2), even though LATORRE discloses “an expression device (read on ‘video mixer’, ‘user interface’ or a mechanism associated with an expression, in a broad sense) that expresses the behavior (read on but not limited to: outputting ‘audio-visual output’ regarding ‘the audio-visual book’ comprising ‘computer generated, animated, expressive head and text’, ‘to generate standard theatre scripts with annotation…to guide actor regarding expression or style…’, or using ‘computer generated speech’ to ‘guide an actor on how to deliver the text information…’, in a broad sense), wherein the display device displays in a state (or ‘situation’ or ‘mode’) in which an instruction (or ‘command’ recognized from the ‘text’ and ‘opposed to the narrator’) [to start, stop, fast-forward, or rewind expression of the behavior by the expression device] is receivable (read on received from ‘director’ or recognized from the ‘text’), and the hardware processor, upon receiving the instruction (same above), controls (such as generating ‘scripts with annotation’, providing ‘computer generated speech’, and/or displaying ‘the desired expression’ including ‘animated head’ and synchronizing related ‘text’, ‘speech’ and ‘movement’) the expression of the behavior by the expression device in accordance with the instruction” (Figs.1 5 and 8, p120-p126, p414, also see p65, p133, p291), LATORRE does not expressly disclose the receivable instruction (or command, or guide or operation) as “to start, stop, fast-forward, or rewind expression of the behavior” by the expression device. However, the same/similar concept/feature is well known in the art as evidenced by TILTON, who discloses ‘methods and apparatus for learning sensor data patterns for gesture-based input’, comprising ‘learning, classification and analysis of real-world cyclic patterns’, and that ‘temporal cyclic pattern are gestures/physical activities’ including ‘head gestures’ and ‘vocal gesture (e.g. voice command)’ (col. 4, lines 9-23), and that ‘audible gestures include a spoken word’ or ‘phrase’, such as ‘start’, ‘stop’, ‘back’ (rewind) and/or ‘fast forward’ (col. 5, lines 23-49).  Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LATORRE and TILTON by providing a mechanism for handling/processing a received vocal/audible gesture as a voice/speech command/instruction to output and/or display, including to start, stop, back (rewind), or fast forward, expression(s) of a behavior (such as a generated expression based on computer generated speech including narrator or annotation, animated head including lip movement of the face, related text, or a combination thereof) by the mixer (or a user interface as expression unit/device), as claimed, for the purpose (motivation) of offering a better method/system for learning, recognition, classification, and analysis of real-world cyclic patterns and/or being able to produce a unique identifier for a particular sound-based and/or gesture-based user input (TILTON: col. 4, lines 9-34).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 


Any inquiry concerning this communication or earlier communications from the examiner should be directed to QI HAN whose telephone number is (571)272-7604.  The examiner can normally be reached on 9-19:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir can be reached on 571-272-7799.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

QH/qh
November 20, 2022
/QI HAN/Primary Examiner, Art Unit 2659