Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-4, 7-9, 11, and 18-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Adams (U.S. Patent No. 10,770,092), referred to herein as Adams.
Regarding claim 1, Adams teaches a method for controlling avatar motion, performed by one or more processors, the method comprising: receiving input audio by an audio sensor; and controlling, by the one or more processors, a motion of a first user avatar based on the input audio (col 8, lines 35-42; col 9, lines 8-15 and 58-66; col 10, lines 50-60; col 41, lines 33-37).
Regarding claim 2, Adams teaches the method according to claim 1, wherein the controlling includes: detecting a text string from the input audio through speech recognition; and controlling a body motion of the first user avatar based on the detected text string (col 22, lines 1-12; col 24, line 64 through col 25, line 3; col 29, lines 30-47).
Regarding claim 3, Adams teaches the method according to claim 2, wherein the controlling a body motion of the first user avatar includes: searching for an avatar motion associated with the detected text string based on a similarity score between the detected text string and registered instructions by using a mapping table in which the registered instructions and avatar motions are mapped to each other; and controlling the body motion of the first user avatar based on the searched avatar motion (col 22, lines 1-12; col 22, line 66 through col 23, line 11; col 24, line 64 through col 25, line 3; [database associating motions with instructions based on text]; col 29, lines 30-47; col 30, lines 19-27 and 42-51; col 31, lines 5-14; [similarity scores]).
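For illustration only, the mapping-table search recited in claim 3 can be sketched as follows. This is a minimal sketch under stated assumptions and is not a characterization of the implementation in Adams or in the present application: the table contents, the function names `similarity` and `search_motion`, the 0.6 threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity score are all hypothetical choices made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical mapping table: registered instruction -> avatar motion name.
MAPPING_TABLE = {
    "wave hello": "wave",
    "jump up": "jump",
    "clap hands": "clap",
}


def similarity(a: str, b: str) -> float:
    """Similarity score in [0, 1] between two strings (assumed metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def search_motion(text, table=MAPPING_TABLE, threshold=0.6):
    """Return the motion mapped to the registered instruction most similar
    to the detected text string, or None if nothing scores above threshold."""
    best_instruction = max(table, key=lambda instr: similarity(text, instr))
    if similarity(text, best_instruction) < threshold:
        return None
    return table[best_instruction]
```

In this sketch, a slightly misrecognized string such as "Wave helo" still resolves to the "wave" motion, while an unrelated string returns no motion.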
Regarding claim 4, Adams teaches the method according to claim 2, wherein the controlling a body motion of the first user avatar includes: detecting a first avatar motion and a second avatar motion from the detected text string by using a mapping table in which instructions and avatar motions are mapped to each other; and in response to determining that the first avatar motion and the second avatar motion are applicable in an overlapping manner, applying the first avatar motion and the second avatar motion to the first user avatar in the overlapping manner (col 22, lines 1-12; col 22, line 66 through col 23, line 11; col 24, line 64 through col 25, line 3; [database associating motions with instructions based on text]; col 29, lines 30-47; col 30, lines 19-27 and 42-51; col 31, lines 5-14; [similarity scores]; [continuous series of overlapping visemes creating the visual sequence]).
Regarding claim 7, Adams teaches the method according to claim 2, wherein the controlling a motion of a first user avatar further includes controlling a lip motion of the first user avatar based on the detected text string (col 10, lines 17-24; col 24, line 64 through col 25, line 3).
Regarding claim 8, Adams teaches the method according to claim 7, wherein the controlling a motion of a first user avatar further includes controlling a facial expression of the first user avatar based on a speech tone detected from the input audio (col 8, lines 35-42 and 57-62; col 30, lines 19-32; col 39, lines 55-63; col 41, lines 55-63).
Regarding claim 9, Adams teaches the method according to claim 1, wherein the controlling further includes controlling a body motion of the first user avatar based on at least one of a tempo or a melody code of music detected from the input audio (col 11, lines 52-65; col 40, lines 50-61).
Regarding claim 11, Adams teaches the method according to claim 9, wherein the controlling a motion of a first user avatar further includes: detecting a text string from the input audio through speech recognition; and controlling a shape of a mouth of the first user avatar based on the detected text string (col 8, lines 35-42; col 22, lines 1-12; col 24, line 64 through col 25, line 3; col 29, lines 30-47).
Regarding claim 18, Adams teaches the method according to claim 1, wherein the controlling further includes: analyzing the input audio to recognize a song; and applying a choreography associated with the recognized song to the first user avatar (col 7, lines 25-31 and 42-45; col 11, lines 52-65; col 17, lines 18-21 and 30-36; col 40, lines 50-61).
Regarding claim 19, Adams teaches a non-transitory computer-readable recording medium storing instructions for execution by one or more processors that, when executed by the one or more processors, cause an apparatus including the one or more processors to perform the method according to claim 1 (col 4, lines 17-21 and 56-61).
Regarding claim 20, the limitations of this claim substantially correspond to the limitations of claim 1 (except for the memory, processor, and instructions, which are taught by Adams, col 4, lines 17-21 and 56-61); thus, claim 20 is rejected on similar grounds.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 5 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Adams, in view of Carson (U.S. Patent Application Publication No. 2013/0141643), referred to herein as Carson.
Regarding claim 5, Adams teaches the method according to claim 2, wherein the controlling a body motion of the first user avatar includes: detecting a first avatar motion and a second avatar motion from the detected text string by using a mapping table in which instructions and avatar motions are mapped to each other; and synchronizing a timing of applying the second avatar motion such that it is applied to the first user avatar after application of the first avatar motion (col 9, lines 58-66; col 10, lines 50-60; col 22, lines 1-12; col 22, line 66 through col 23, line 11; col 24, line 64 through col 25, line 3).
Adams does not explicitly teach, in response to determining that the first avatar motion and the second avatar motion are not applicable in an overlapping manner, delaying a timing of applying the second avatar motion such that the second avatar motion is applied to the first user avatar after application of the first avatar motion is finished. Carson teaches a method for controlling animated avatar motion based on input audio using viseme and phoneme synchronization (para 55, lines 1-5; para 56, lines 1-14; para 59, lines 1-9), wherein in response to determining that a first avatar motion and a second avatar motion are not applicable in an overlapping manner, delaying a timing of applying the second avatar motion such that the second avatar motion is applied to the first user avatar after application of the first avatar motion is finished (para 85, lines 1-9; para 86). It would have been obvious to one of ordinary skill in the art to delay the timing in this manner because, as taught by Carson, audio and video can become out of sync for a variety of reasons, which can have a detrimental effect on the resulting output video; thus, synchronizing the audio and video in this manner helps to ensure that this does not happen and that the video output is of the highest level of quality (see, for example, Carson, paras 4 and 6).
Regarding claim 6, Adams teaches the method according to claim 2, wherein the controlling a body motion of the first user avatar includes: detecting a first avatar motion and a second avatar motion from the detected text string by using a mapping table in which instructions and avatar motions are mapped to each other; and synchronizing a playback time of the first avatar motion (col 9, lines 58-66; col 10, lines 50-60; col 22, lines 1-12; col 22, line 66 through col 23, line 11; col 24, line 64 through col 25, line 3).
Adams does not explicitly teach, in response to determining that the first avatar motion and the second avatar motion are not applicable in an overlapping manner, shortening a playback time of the first avatar motion. Carson teaches a method for controlling animated avatar motion based on input audio using viseme and phoneme synchronization (para 55, lines 1-5; para 56, lines 1-14; para 59, lines 1-9), wherein in response to determining that a first avatar motion and a second avatar motion are not applicable in an overlapping manner, shortening a playback time of the first avatar motion (para 85, lines 1-9; para 86). It would have been obvious to one of ordinary skill in the art to shorten the playback in this manner because, as taught by Carson, audio and video can become out of sync for a variety of reasons, which can have a detrimental effect on the resulting output video; thus, synchronizing the audio and video in this manner helps to ensure that this does not happen and that the video output is of the highest level of quality (see, for example, Carson, paras 4 and 6).
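For illustration only, the three timing behaviors at issue in claims 4, 5, and 6 (overlapping application, delaying the second motion, and shortening the first motion) can be sketched as follows. This is a minimal sketch under stated assumptions and does not describe the implementation of Adams, Carson, or the present application: the `Motion` type, the `can_overlap` test (motions are treated as overlappable when they drive disjoint body parts), the `policy` parameter, and the `shorten_to` factor are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Motion:
    name: str
    duration: float        # seconds
    body_parts: frozenset  # body parts the motion drives (assumed model)


def can_overlap(a: Motion, b: Motion) -> bool:
    # Assumption: two motions are applicable in an overlapping manner
    # only when they drive disjoint sets of body parts.
    return a.body_parts.isdisjoint(b.body_parts)


def schedule(first: Motion, second: Motion, policy="delay", shorten_to=0.5):
    """Return a list of (name, start, end) tuples for the two motions."""
    if can_overlap(first, second):
        # Overlapping application: both motions start together.
        return [(first.name, 0.0, first.duration),
                (second.name, 0.0, second.duration)]
    if policy == "delay":
        # Delay the second motion until the first has finished.
        first_end = first.duration
    elif policy == "shorten":
        # Shorten the playback time of the first motion.
        first_end = first.duration * shorten_to
    else:
        raise ValueError(f"unknown policy: {policy}")
    return [(first.name, 0.0, first_end),
            (second.name, first_end, first_end + second.duration)]
```

Under these assumptions, a wave and a nod would be scheduled together, whereas a wave and a punch that both use the right arm would be sequenced, either by delaying the punch or by cutting the wave short.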

Claims 10 and 12-17 are rejected under 35 U.S.C. 103 as being unpatentable over Adams, in view of Hingorani (U.S. Patent Application Publication No. 2018/0214777), referred to herein as Hingorani.
Regarding claim 10, Adams teaches the method according to claim 9, wherein the controlling a body motion of the first user avatar includes: searching for an avatar motion associated with the music by using a mapping table in which instructions and avatar motions are mapped to each other (col 22, lines 1-12; col 22, line 66 through col 23, line 11; col 24, line 64 through col 25, line 3); determining a playback speed of the searched avatar motion based on the detected tempo of the music; and applying the searched avatar motion to the first user avatar (col 11, lines 52-65; col 40, lines 50-61).
Adams does not explicitly teach detecting and utilizing melody codes that are associated with the avatar motions. Hingorani teaches a method for controlling motion of an avatar based on audio input (para 38; paras 67 and 69), comprising detecting and utilizing melody codes that are associated with the avatar motions (para 70; para 77, the last 4 lines). It would have been obvious to one of ordinary skill in the art to utilize melody codes because, as shown in Hingorani, this customizes the avatar motion such that a user’s experience with the video output can be enhanced (see, for example, Hingorani, paras 77 and 80).
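For illustration only, the step of determining a playback speed of an avatar motion based on a detected tempo, as recited in claim 10, can be sketched as follows. This is a minimal sketch under stated assumptions and does not describe the implementation of Adams, Hingorani, or the present application: the function name `playback_speed`, the notion of an authored tempo of 120 BPM for the stored motion, and the clamping limits are all hypothetical.

```python
def playback_speed(detected_bpm, authored_bpm=120.0,
                   min_speed=0.5, max_speed=2.0):
    """Scale a motion's playback speed by the ratio of the detected music
    tempo to the tempo the motion was authored for, clamped to a range."""
    if detected_bpm <= 0 or authored_bpm <= 0:
        raise ValueError("tempo must be positive")
    return max(min_speed, min(max_speed, detected_bpm / authored_bpm))
```

Under these assumptions, music detected at 90 BPM would play a motion authored at 120 BPM at three-quarters speed, and very fast music would be capped at double speed.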
Regarding claim 12, Adams teaches the method according to claim 1, further comprising displaying a second user avatar associated with another user terminal or the first user avatar on a screen, wherein a motion of the second user avatar is controlled based on another input audio received by the another user terminal (col 23, lines 12-22; col 24, lines 35-38; col 26, lines 41-52).
Adams teaches that more than one digital rendering may be presented (col 12, lines 43-48) but does not explicitly teach displaying the first and second user avatars together on a screen. Hingorani teaches a method for controlling motion of an avatar based on audio input (para 38; paras 67 and 69), comprising displaying the first and second user avatars together on a screen (paras 106 and 108). It would have been obvious to one of ordinary skill in the art to display the avatars together because, as shown in Hingorani, this enhances a user’s experience by enabling interactions with other individuals, thus making the avatar motion process more accessible to other platforms such as games, social media, and so on (see, for example, Hingorani, paras 109 and 110).
Regarding claim 13, Adams in view of Hingorani teaches the method according to claim 12, wherein the controlling further includes: detecting a text string from the input audio through speech recognition; searching for an avatar motion from the detected text string by using a mapping table in which instructions and avatar motions are mapped to each other; and in response to the searched avatar motion being determined as a group motion, applying the searched avatar motion to the first user avatar and the second user avatar (Adams, col 22, lines 1-12; col 22, line 66 through col 23, line 11; col 24, line 64 through col 25, line 3; Hingorani, para 60, the last 5 lines; para 61; para 108).
Regarding claim 14, Adams teaches the method according to claim 1, further comprising: transmitting a request to participate in an event, and in response to accepting the request to participate in the event, displaying the first user avatar or a second user avatar associated with the another user terminal on a screen of the first user terminal (col 10, lines 17-49; col 25, lines 24-32; col 26, lines 41-52; col 34, lines 41-48 and 55-67).
Adams does not explicitly teach searching for another user terminal in a vicinity of a first user terminal of the first user through short-range communication; transmitting a request to participate in an event to the another user terminal; and in response to the another user terminal accepting the request to participate in the event, displaying the first user avatar and a second user avatar associated with the another user terminal together on a screen of the first user terminal. Hingorani teaches a method for controlling motion of an avatar based on audio input (para 38; paras 67 and 69), comprising searching for another user terminal in a vicinity of a first user terminal of the first user through short-range communication, transmitting a request to participate in an event to the another user terminal, and in response to the another user terminal accepting the request to participate in the event, displaying the first user avatar and a second user avatar associated with the another user terminal together on a screen of the first user terminal (para 49; paras 106 and 108). It would have been obvious to one of ordinary skill in the art to enable joining events and to display the avatars together because, as shown in Hingorani, this enhances a user’s experience by enabling interactions with other individuals, thus making the avatar motion process more accessible to other platforms such as games, social media, and so on (see, for example, Hingorani, paras 109 and 110).
Regarding claim 15, Adams in view of Hingorani teaches the method according to claim 14, further comprising: receiving an input video by an image sensor; and displaying the first user avatar and the second user avatar on the input video (Adams, col 38, lines 61-63; Hingorani, para 62; para 108).
Regarding claim 16, Adams in view of Hingorani teaches the method according to claim 14, further comprising: controlling the motion of the first user avatar and a motion of the second user avatar together based on the input audio received by the audio sensor (Adams, col 8, lines 35-42; col 9, lines 8-15 and 56-66; col 26, lines 41-52; Hingorani, para 60, the last 4 lines; para 61; paras 106 and 108).
Regarding claim 17, Adams in view of Hingorani teaches the method according to claim 14, further comprising: controlling a motion of the second user avatar based on the input audio received by the another user terminal (Adams, col 25, lines 24-32; col 26, lines 41-42; Hingorani, para 60, the last 4 lines; para 61; paras 106 and 108).

Conclusion
The following prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Cosatto (U.S. Patent Application Publication No. 2005/0057570); Audio-visual selection process for the synthesis of photo-realistic talking-head animations.
Ma (U.S. Patent Application Publication No. 2006/0009978); Methods and systems for synthesis of accurate visible speech via transformation of motion capture data.
Cheiky (U.S. Patent No. 7,027,054); Do-it-yourself photo realistic talking head creation system and method.
Cooper (U.S. Patent Application Publication No. 2007/0153125); Method, system, and program product for measuring audio video synchronization.
Cooper (U.S. Patent Application Publication No. 2008/0111887); Method, system, and program product for measuring audio video synchronization independent of speaker characteristics.
Ach (U.S. Patent Application Publication No. 2009/0278851); Method and system for animating an avatar in real time using the voice of a speaker.
Smith (U.S. Patent Application Publication No. 2010/0007665); Do-it-yourself photo realistic talking head creation system and method.
McCoy (U.S. Patent Application Publication No. 2015/0199978); Methods and apparatuses for use in animating video content to correspond with audio content.
Navaratnam (U.S. Patent Application Publication No. 2017/0011745); Virtual photorealistic digital actor system for remote service of customers.
Li (U.S. Patent Application Publication No. 2017/0243387); High-fidelity facial and speech animation for virtual reality head mounted displays.
Roche (U.S. Patent No. 10,521,946); Processing speech to drive animations on avatars.
Roche (U.S. Patent No. 10,586,369); Using dialog and contextual data of a virtual reality environment to create metadata to drive avatar animation.
Roche (U.S. Patent No. 10,732,708); Disambiguation of virtual reality information using multi-modal data including speech.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID T WELCH whose telephone number is (571)270-5364.  The examiner can normally be reached on Monday-Thursday, 8:30-5:30 EST, and alternate Fridays, 9:00-2:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Xiao Wu, can be reached at 571-272-7761. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



DAVID T. WELCH
Primary Examiner
Art Unit 2613



/DAVID T WELCH/ Primary Examiner, Art Unit 2613