DETAILED ACTION
This final action is in response to the application filed on 03/10/2021. In this amendment, claims 1-2, 4-7, 9-12 and 14-15 have been amended. Claims 1-15 are pending, of which claims 1, 6 and 11 are independent claims.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
This application claims priority to Chinese Application No. 201910573596.7, filed on June 28, 2019.

Response to Arguments
Objection to Drawings and Specification
Objections have been withdrawn in view of amended drawings and specification.
Claim Objections
Objections have been withdrawn in view of amended claims.
Claim Rejections - 35 USC § 103
Applicant’s arguments with respect to the 35 USC § 103 rejections have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 5-6, 10-11 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Wu (US 2018/0150749, Pub. May 31, 2018), in view of Anders et al. (US 2019/0325864, filed Apr. 16, 2018), and further in view of Walsh (US 2018/0342095, Pub. Nov. 29, 2018).
As per claim 1, Wu discloses a method for generating information (Wu-Abstract, method for providing a conversation session with an artificial intelligence entity), comprising: 
receiving a video and an audio of a user that are sent by a client (Wu- Fig.3, Para. [0097], the query component 300 may include a user interface 310 that provides an input area in which the user can provide input to the query component 300. The input may include typed text, provide or otherwise attach an image file, provide voice input, select emoji symbols, make an audio or voice call, and/or initiate a video conversation with the artificial intelligence entity advertisement system 200) by instant communication (See Wu-Fig.4, instant communication between the user and the artificial intelligence); 
generating user feature information (Wu-Para. [0047], the artificial intelligence entity advertisement system 140 may store additional information about the user based on the interaction with the artificial intelligence entity. This information may then be used to determine demographic information about the user, interests and hobbies of the user, favorite foods of the user and so on) and text reply information according to the video and the audio (Wu-Para. [0100], the worker 340 may also be associated with a speech worker 344 that recognizes sounds and other voice input and converts it text … Once the sound input is converted to text, the speech worker 344 may provide the newly converted text to the chat worker 342 in order to determine the context of the converted text and may request information on how to respond to respond to the input 320. Once a response is generated, the speech worker 344 may convert the text to speech and provide the response back to the worker 340; Wu-Para. [0101], an image worker 346 may be used to determine subject matter contained in a received image or video file … a received image may need to be decoded into text in order for an appropriate response to be generated);
generating a reply audio according to the text reply information (Wu-Para. [0100], Once a response is generated, the speech worker 344 may convert the text to speech and provide the response back to the worker 340);
generating a video based on the reply audio (Wu-Para. [0100], once a response is generated, the speech worker 344 may convert the text to speech and provide the response back to the worker 340; Wu-Para. [0103], the worker 340 may be used to generate a response 370; Wu-Para. [0106], the response 370 may be an audio (speech) response, a text response, a video response, an image response or a combination thereof);
transmitting the video to the client by instant communication, for the client to present to the user (Wu-Para. [0097], the user interface 310 may also be used to receive responses from the artificial intelligence entity. As with the input provided by the user, the response provided by the artificial intelligence entity may include text, images, sound, video and so on; See Wu-Fig.4, instant communication between the user and the artificial intelligence).
Wu does not explicitly disclose:
generating a control parameter for a three-dimensional virtual portrait according to the user feature information; a characteristic of the reply audio being associated with the user feature information, the characteristic of the reply audio comprising at least one of tone, speech rate, or timbre;
generating a video of the three-dimensional virtual portrait based on the control parameter;
transmitting the video of the three-dimensional virtual portrait.
Anders teaches:
a characteristic of the reply audio being associated with the user feature information (Anders-Para. [0016], the automated assistant may select, from a plurality of candidate voice synthesis models, a given voice synthesis model that is associated with the predetermined age group [user feature information] predicted for the user), the characteristic of the reply audio comprising at least one of tone, speech rate, or timbre (Anders-Para. [0016], a voice synthesis model employed by an automated assistant when engaged with a child user may be the voice of a cartoon character, and/or may speak at a relatively slower pace).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wu in view of Anders for a characteristic of the reply audio being associated with the user feature information, the characteristic of the reply audio comprising at least one of tone, speech rate, or timbre.
One of ordinary skill in the art would have been motivated because it offers the advantage of providing output that is suitable for a particular user (see Anders-Para. [0007, 0111]).
Wu-Anders does not explicitly disclose:
generating a control parameter for a three-dimensional virtual portrait according to the user feature information;
generating a video of the three-dimensional virtual portrait based on the control parameter;
transmitting the video of the three-dimensional virtual portrait.
Walsh teaches:
generating a control parameter for a three-dimensional virtual portrait (Walsh-Para. [0022], configuring gesture sets functions to define visual actions, animations, or motions. The gesture sets [control parameter] preferably provide variations of a character's expressed behavior in response to stimuli; Walsh-Para. [0018], the method could be used for animating and automating responses of characters within 3D virtual world) according to the user feature information (Walsh-Para. [0042], the system can facilitate emotion sensing that tells the system what a user or other character is saying and feeling ... The sensing data also drives the thought AI that can drive the gestures and responses of the virtual character; Walsh-Para. [0024], Setting a gesture could additionally include setting a rigging specification ... The riggings could be manually animated, but could alternatively be motion captured or scanned from a person or face. In some variations, video of a person could be used as a provided gesture);
generating a video of the three-dimensional virtual portrait (Walsh-Para. [0046], the personality engine may be used for procedural generated content in animated video production; Walsh-Para. [0018], the method could be used for animating and automating responses of characters within 3D virtual world) based on the control parameter (Walsh-Para. [0022], configuring gesture sets functions to define visual actions, animations, or motions. The gesture sets [control parameter] preferably provide variations of a character's expressed behavior in response to stimuli; Walsh-Para. [0038], if the input indicates a user is mad and frustrated, the virtual character may be configured to respond with a soothing and empathetic delivery of a spoken response);
transmitting the video of the three-dimensional virtual portrait (Walsh-Para. [0039], character responses are preferably modified in real-time to reflect the appropriate reactions as the virtual character receives input … if a user were to say "I won the lottery today while I was at the store". The method may trigger a surprise character response on detecting the word "won" and then a happy character response on the subsequent word "lottery").

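It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Wu in view of Walsh for generating a control parameter for a three-dimensional virtual portrait according to the user feature information, generating a video of the three-dimensional virtual portrait based on the control parameter, and transmitting the video of the three-dimensional virtual portrait.
One of ordinary skill in the art would have been motivated because it offers the advantage of expressing realistic and responsive representations of a character's thoughts, emotions, and moods (Walsh-Para. [0007]).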

As per claim 5, Wu-Anders-Walsh discloses the method according to claim 1 as set forth above. Wu-Anders-Walsh also discloses wherein the generating of the control parameter and the reply audio for the three-dimensional virtual portrait according to the user feature information and the text reply information comprises:
generating the reply audio according to the text reply information (Wu-Para. [0100], the worker 340 may also be associated with a speech worker 344 that recognizes sounds and other voice input and converts it text. The speech worker 344 may utilize a speech recognition API to perform the conversion. Once a response is generated, the speech worker 344 may convert the text to speech and provide the response back to the worker 340).
Wu-Anders does not explicitly disclose:
the user feature information comprises a user expression, 
generating the control parameter for the three-dimensional virtual portrait according to the user expression and the reply audio.

Walsh teaches:
the user feature information comprises a user expression (Walsh-Para. [0034], the image-based input could be images of a user's face. Facial expressions, gaze analysis, and/or other attributes of the face may be used to determine the state of the user),
generating the control parameter for the three-dimensional virtual portrait according to the user expression and the reply audio (Walsh-Para. [0022], configuring gesture sets functions to define visual actions, animations, or motions. The gesture sets [parameter] preferably provide variations of a character's expressed behavior in response to stimuli; Walsh-Para. [0038], if the input indicates a user is mad and frustrated, the virtual character may be configured to respond with a soothing and empathetic delivery of a spoken response; Walsh-Para. [0030], a character could be executed within a 2D or 3D simulated environment).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Wu in view of Walsh for the user feature information comprising a user expression and generating the control parameter for the three-dimensional virtual portrait according to the user expression and the reply audio.
One of ordinary skill in the art would have been motivated because it offers the advantage of expressing realistic and responsive representations of a character's thoughts, emotions, and moods (Walsh-Para. [0007]).

As per claim 6, Wu discloses an apparatus for generating information (Wu-Abstract, method for providing a conversation session with an artificial intelligence entity), comprising:
at least one processor (Wu-Fig. 13, Processing Unit 1310); and
a memory (Wu-Fig. 13, System Memory 1315) storing instructions, the instructions when executed by the at least one processor, cause the at least one processor to perform operations (Wu-Para. [0155], the system memory 1315 may include an operating system 1325 and one or more program modules 1320 suitable for parsing received input, determining subject matter of received input, recommending artificial intelligence entities and so on; Wu-Para. [0158], a number of program modules and data files may be stored in the system memory 1315), the operations comprising:
receiving a video and an audio of a user that are sent by a client (Wu- Fig.3, Para. [0097], the query component 300 may include a user interface 310 that provides an input area in which the user can provide input to the query component 300. The input may include typed text, provide or otherwise attach an image file, provide voice input, select emoji symbols, make an audio or voice call, and/or initiate a video conversation with the artificial intelligence entity advertisement system 200) by instant communication (See Wu-Fig.4, instant communication between the user and the artificial intelligence);
generating user feature information (Wu-Para. [0047], the artificial intelligence entity advertisement system 140 may store additional information about the user based on the interaction with the artificial intelligence entity. This information may then be used to determine demographic information about the user, interests and hobbies of the user, favorite foods of the user and so on) and text reply information according to the video and the audio (Wu-Para. [0100], the worker 340 may also be associated with a speech worker 344 that recognizes sounds and other voice input and converts it text … Once the sound input is converted to text, the speech worker 344 may provide the newly converted text to the chat worker 342 in order to determine the context of the converted text and may request information on how to respond to respond to the input 320. Once a response is generated, the speech worker 344 may convert the text to speech and provide the response back to the worker 340; Wu-Para. [0101], an image worker 346 may be used to determine subject matter contained in a received image or video file … a received image may need to be decoded into text in order for an appropriate response to be generated);
generating a reply audio according to the text reply information (Wu-Para. [0100], Once a response is generated, the speech worker 344 may convert the text to speech and provide the response back to the worker 340); 
generating a video based on the reply audio (Wu-Para. [0100], once a response is generated, the speech worker 344 may convert the text to speech and provide the response back to the worker 340; Wu-Para. [0103], the worker 340 may be used to generate a response 370; Wu-Para. [0106], the response 370 may be an audio (speech) response, a text response, a video response, an image response or a combination thereof); and
transmitting the video to the client by the instant communication, for the client to present to the user (Wu-Para. [0097], the user interface 310 may also be used to receive responses from the artificial intelligence entity. As with the input provided by the user, the response provided by the artificial intelligence entity may include text, images, sound, video and so on; See Wu-Fig.4, instant communication between the user and the artificial intelligence).
Wu does not explicitly disclose:
generating a control parameter for a three-dimensional virtual portrait according to the user feature information; a characteristic of the reply audio being associated with the user feature information, the characteristic of the reply audio comprising at least one of tone, speech rate, or timbre;
generating a video of the three-dimensional virtual portrait by an animation engine based on the control parameter;
transmitting the video of the three-dimensional virtual portrait.
Anders teaches:
a characteristic of the reply audio being associated with the user feature information (Anders-Para. [0016], the automated assistant may select, from a plurality of candidate voice synthesis models, a given voice synthesis model that is associated with the predetermined age group [user feature information] predicted for the user), the characteristic of the reply audio comprising at least one of tone, speech rate, or timbre (Anders-Para. [0016], a voice synthesis model employed by an automated assistant when engaged with a child user may be the voice of a cartoon character, and/or may speak at a relatively slower pace).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wu in view of Anders for a characteristic of the reply audio being associated with the user feature information, the characteristic of the reply audio comprising at least one of tone, speech rate, or timbre.
One of ordinary skill in the art would have been motivated because it offers the advantage of providing output that is suitable for a particular user (see Anders-Para. [0007, 0111]).
Wu-Anders does not explicitly disclose:
generating a control parameter for a three-dimensional virtual portrait according to the user feature information;
generating a video of the three-dimensional virtual portrait by an animation engine based on the control parameter;
transmitting the video of the three-dimensional virtual portrait.
Walsh teaches:
generating a control parameter for a three-dimensional virtual portrait (Walsh-Para. [0022], configuring gesture sets functions to define visual actions, animations, or motions. The gesture sets [control parameter] preferably provide variations of a character's expressed behavior in response to stimuli; Walsh-Para. [0018], the method could be used for animating and automating responses of characters within 3D virtual world) according to the user feature information (Walsh-Para. [0042], the system can facilitate emotion sensing that tells the system what a user or other character is saying and feeling ... The sensing data also drives the thought AI that can drive the gestures and responses of the virtual character; Walsh-Para. [0024], Setting a gesture could additionally include setting a rigging specification ... The riggings could be manually animated, but could alternatively be motion captured or scanned from a person or face. In some variations, video of a person could be used as a provided gesture);
generating a video of the three-dimensional virtual portrait by an animation engine (Walsh-Para. [0046], the personality engine [animation engine] may be used for procedural generated content in animated video production; Walsh-Para. [0018], the method could be used for animating and automating responses of characters within 3D virtual world) based on the control parameter (Walsh-Para. [0022], configuring gesture sets functions to define visual actions, animations, or motions. The gesture sets [control parameter] preferably provide variations of a character's expressed behavior in response to stimuli; Walsh-Para. [0038], if the input indicates a user is mad and frustrated, the virtual character may be configured to respond with a soothing and empathetic delivery of a spoken response);
transmitting the video of the three-dimensional virtual portrait (Walsh-Para. [0039], character responses are preferably modified in real-time to reflect the appropriate reactions as the virtual character receives input … if a user were to say "I won the lottery today while I was at the store". The method may trigger a surprise character response on detecting the word "won" and then a happy character response on the subsequent word "lottery").

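It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Wu in view of Walsh for generating a control parameter for a three-dimensional virtual portrait according to the user feature information, generating a video of the three-dimensional virtual portrait by an animation engine based on the control parameter, and transmitting the video of the three-dimensional virtual portrait.
One of ordinary skill in the art would have been motivated because it offers the advantage of expressing realistic and responsive representations of a character's thoughts, emotions, and moods (Walsh-Para. [0007]).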

Claim 10 is an apparatus claim reciting similar subject matter to those recited in the method claim 5, and is rejected under similar rationale.

Claims 11 and 15 are computer readable medium claims reciting similar subject matter to those recited in the method claims 1 and 5 respectively, and are rejected under similar rationale. Wu also discloses:
A non-transitory computer readable medium, storing a computer program (Wu-Para. [0007], a computer-readable storage medium storing computer executable instructions), wherein the computer program, when executed by a processor, causes the processor to perform operations (Wu-Para. [0007], the computer executable instructions are executed by a processing unit, the processing unit performs a method for providing a conversation session with an artificial intelligence entity associated with a business entity).

Claims 2-4, 7-9 and 12-14 are rejected under 35 U.S.C. 103 as being unpatentable over Wu (US 2018/0150749, Pub. May 31, 2018), in view of Anders et al. (US 2019/0325864, filed Apr. 16, 2018), in view of Walsh (US 2018/0342095, Pub. Nov. 29, 2018), and further in view of Brown et al. (US 2014/0317502, Pub. Oct. 23, 2014).
As per claim 2, Wu-Anders-Walsh discloses the method according to claim 1 as set forth above. Wu-Anders also discloses wherein the generating of the user feature information and the text reply information according to the video and the audio comprises:
identifying the audio to obtain text information (Wu-Para. [0066], when the input is sound, voice or image input, the input is converted into text using a voice-to-text API or an image-to-text API);
generating the text reply information based on the text information (Wu-Para. [0100], once the sound input is converted to text, the speech worker 344 may provide the newly converted text to the chat worker 342 in order to determine the context of the converted text and may request information on how to respond to respond to the input 320. Once a response is generated, the speech worker 344 may convert the text to speech and provide the response back to the worker 340).
Wu-Anders does not explicitly disclose:
identifying the video to obtain the user feature information; and
acquiring relevant information, the relevant information comprising historical user feature information and historical text information; and 
generating reply information based on the user feature information and the relevant information.
Walsh teaches:
identifying the video to obtain the user feature information (Walsh-Para. [0024], video of a person could be used as a provided gesture; Walsh-Para. [0031], Block S210, which includes receiving persona stimulus input ... The input could be text-based, audio-based, image-based, motion-based, or any suitable medium of input; Walsh-Para. [0034], receiving personal stimulus input includes processing image-based input. The image-based input could be images of a user's face. Facial expressions, gaze analysis, and/or other attributes of the face may be used to determine the state of the user. Posture, hand gestures, and other forms of body language could be detected);
acquiring relevant information, the relevant information comprising historical user feature information (see Walsh-Para. [0029, 0046], Version history, a new virtual character initialized from a set of existing virtual configurations [historical user feature information]);
generating reply information based on the user feature information (Walsh-Para. [0038], if the input indicates a user is mad and frustrated, the virtual character may be configured to respond with a soothing and empathetic delivery of a spoken response; Walsh Para. [0030], The character could alternatively be executed as an audio or text based character; Walsh Para. [0038], a text-to-speech engine used by the virtual character could be modulated to alter the speech properties).

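It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Wu in view of Walsh for identifying the video to obtain the user feature information, acquiring relevant information comprising historical user feature information, and generating reply information based on the user feature information.
One of ordinary skill in the art would have been motivated because it offers the advantage of expressing realistic and responsive representations of a character's thoughts, emotions, and moods (Walsh-Para. [0007]).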
Wu-Anders-Walsh does not explicitly disclose:
the relevant information comprising historical text information; and 
generating reply information based on the relevant information.
Brown teaches:
the relevant information comprising historical text information (see Brown-Para. [0115]: the smart device 104 may identify contextual information that is related to the conversation. The contextual information may comprise conversation history of the user with the virtual assistant in a current conversation, conversation history of the user with virtual assistant in a previous conversation; Brown-Para. [0031], The client application 110 may also provide any type of response, such as audio, text); and
generating reply information based on the relevant information (Brown-Para. [0046], contextual information [relevant information] may comprise any type of information that aids in understanding the meaning of a query of a user and/or in formulating a response for a virtual assistant).

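It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Wu in view of Brown for the relevant information comprising historical text information and generating reply information based on the relevant information.
One of ordinary skill in the art would have been motivated because it offers the advantage of providing a response to the user that more closely emulates human-to-human interaction (Brown-Para. [0044]).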

As per claim 3, Wu-Anders-Walsh-Brown discloses the method according to claim 2 as set forth above. Wu-Anders does not explicitly disclose:
storing the user feature information and the text information in association into a session information set that is set for a current session.
Walsh teaches:
storing the user feature information (see Walsh-Para. [0029], Version history, a new virtual character initialized from a set of existing virtual configurations [user feature information]; Walsh-Para. [0046], the system could output a virtual character configuration object that can be stored locally).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Wu in view of Walsh for storing the user feature information.
One of ordinary skill in the art would have been motivated because it offers the advantage of expressing realistic and responsive representations of a character's thoughts, emotions, and moods (Walsh-Para. [0007, 0029]).

Wu-Anders-Walsh does not explicitly disclose:
storing the text information in association into a session information set that is set for a current session.
Brown teaches:
storing the text information in association into a session information set that is set for a current session (see Brown-Para. [0046-0047]: contextual information may be stored in a context data store 138 and may include conversation history during a current session).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Wu in view of Brown for storing the text information in association into a session information set that is set for a current session.
One of ordinary skill in the art would have been motivated because it offers the advantage of generating a virtual character for providing a response to the user that more closely emulates human-to-human interaction (Brown-Para. [0044]).

As per claim 4, Wu-Anders-Walsh-Brown discloses the method according to claim 3 as set forth above. Wu-Anders-Walsh discloses wherein the acquiring of the relevant information comprises acquiring the relevant information (see Walsh-Para. [0029, 0046], Version history, a new virtual character initialized from a set of existing virtual configurations [relevant information]). The same rationale as in claim 2 applies.
Wu-Anders-Walsh does not explicitly disclose:
acquiring the relevant information from the session information set. 

Brown teaches:
acquiring the relevant information from the session information set (see Brown-Para. [0115]: the smart device 104 may identify contextual information that is related to the conversation. The contextual information may comprise conversation history of the user with the virtual assistant in a current conversation, conversation history of the user with virtual assistant in a previous conversation).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Wu in view of Brown for acquiring the relevant information from the session information set.
One of ordinary skill in the art would have been motivated because it offers the advantage of providing a response to the user that more closely emulates human-to-human interaction (Brown-Para. [0044]).

Claims 7-9 are apparatus claims reciting similar subject matter to those recited in the method claims 2-4 respectively, and are rejected under similar rationale.

Claims 12-14 are computer readable medium claims reciting similar subject matter to those recited in the method claims 2-4 respectively, and are rejected under similar rationale.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Leahy et al. (US 8082501) System and Method for Enabling Users to Interact In a Virtual Space;
Kumar (US 20190189117) System and Methods for In-Meeting Group Assistance Using a Virtual Assistant;
Miyajima (US 20180241701) Information Processing System and Information Processing Method;
Xu et al. (US 20120130717) Real-time Animation for an Expressive Avatar.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 


Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KAMAL B DIVECHA can be reached on (571)272-5863.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.








/KAMAL B DIVECHA/Supervisory Patent Examiner, Art Unit 2453