Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Applicant’s amendment filed on 6/14/2021 has been entered. Claims 1-15 remain pending in the present application.

Response to Arguments
Applicant argued: “Bouguerra is putting some image/video showing emoticon on the face image/video which is not like generating a realistic lip sync on an image uploaded by user which requires trained model to make lips move in sync on the text/voice data and generate expression on face. It is simply putting that effect in overlay,” (Remarks, page 10).
Examiner’s response: A claim is rejected based on limitations explicitly recited in it, not based on limitations described in the specification but not recited in the claim. In other words, limitations from the specification are not read into the claim (see MPEP, section 2111 Claim Interpretation; Broadest Reasonable Interpretation [R-07.2015]). In this case, claim 1 did not recite a “realistic lip sync” or a “trained model”. Therefore, the argument is not persuasive. If it is believed that these limitations are novel and inventive, they should be incorporated into claim 1. For example, claim 1 could be amended to recite 

Applicant argued: “It is pertinent to be noted that, in Karikatos [sic], the body model generation is not about only the image/video uploading, but through different pose the dimensions/points of edge or contours are mapped,” (Remarks, page 14), and “FIG 4 of Karikatos [sic] clearly states that user need to give full body photo. It can’t create full body photo from a selfie and it talks about measuring and provide training for leg and arm length and else,” and “Simply Karikatos [sic] need lot of data for training, user input is much complicated and can't produce similar results as we do. Technique of the current invention is much simple and easy. In Karikatos [sic], user need to provide training for everything like expression, voice and movement which is frustrating while in the current invention, its a quick thing. In the current invention, input, database and step are different and also same results can't be produce by above mentioned citations,” (Remarks, page 15).
Examiner’s response: The argument that the technique of the present invention is much simpler than that of Karakotsios is moot because claim 1 did not recite in detail how this technique was performed. Again, Applicant should keep in mind that a claim is rejected based on limitations explicitly recited in it, not based on limitations described in the specification but not recited in the claim. Since claim 1 recites “processing the one or more images of the first person... to generate a body model of the first person,” and Karakotsios teaches generating a body model of a user based on the user’s image(s), it could be said that Karakotsios teaches this limitation. If it is believed that the simpler technique of the present invention is novel and inventive, this technique should be described in the claims in such a way that it would overcome the Karakotsios reference.

Applicant argued: “in the current invention, an image is an input which is mostly a face image while in Corazza, uploading a 3d model is required. To animate a 3d model is very easy and trivial so this is a different invention altogether,” (Remarks, page 16).
Examiner’s response: Corazza was cited only for teaching “a character creation system that creates a photorealistic, animatable mesh model of a user which may be made up of a head model of the user and a body model of another user, and the head and body models may be based on one or more images or photographs of the users,” (see page 8 of the previous office action). Therefore, the argument that Corazza requires uploading a 3D model instead of inputting an image is moot.

Claim Objections
Regarding claim 5, the examiner suggests the following amendment: “generating an animation of the body model of the first person enacting the message with lip sync”.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 6-8, 10 and 15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claims 6 and 7 recite “the first persons” (note the plural form) throughout the claims, while their parent claim (claim 1) recites only one first person. It is not clear whether the “first persons” recited in these claims are different from the first person recited in their parent claim. Clarification is required.
Claim 8 recites “using a human body information to identify requirement of the other body part/s”. It is not clear whether “the other body part/s” means body part/s of the person whose face is present in the image, or body part/s of the other person.
Claims 10 and 15 recite “using human body information related to each of the persons whose image/s is received to identify requirement of the other body part/s for each of the persons”. It is not clear whether “the other body part/s” means body part/s of the same person, or body part/s of a different person.
Corrections are required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-4, 7-9, 13 and 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bouguerra (Pub. No. US 2012/0069028), in view of Karakotsios (Pat. No. US 9,479,736), and further in view of Corazza et al. (Pub. No. US 2016/0027200).

Regarding claim 1, Bouguerra discloses a method for providing visual sequences using one or more images comprising:
receiving one or more images of a first person showing at least one face of the first person (Par. 49: “In one embodiment, video chat client 243 may support video-chat sessions, wherein a video of a user may be captured using video capture device 259 and streamed to another user for display with display 254. Additionally or alternatively, a video of the other user may be captured and streamed to video chat client device 200 for display with display 254”),


,
receiving a message to be enacted by the first person, wherein the message comprises at least a text or an emotional and movement command (Par. 61: “The video emoticon may be selected from a menu, or a video emoticon may be selected through text input. For example, a ‘smiley’ video emoticon may be selected by typing ":-)" into a chat window associated with the video-chat. Additionally or alternatively, a video emoticon may be selected from a graphical interface”. In particular, the smiley video emoticon is an emotional and movement command because it results in an emotion (happiness) and facial movement (animation of the lips). It could be inputted in textual form (such as “:-)”) or graphical form as a message to be enacted by the user. See also par. 74: “Video emoticons menu 506 may be used to select a video emoticon. Chat box 508 provides an alternative means for a user to select a video emoticon, as discussed above”),
processing the message to extract or receive (Par. 65),
processing (Pars. 64, 69 and 75),
wherein the emotional and movement command is a GUI or multimedia-based instruction to invoke the generation of one or more facial expression and/or one or more body parts movement (Par. 74: “Video emoticons menu 506 may be used to select a video emoticon. Chat box 508 provides an alternative means for a user to select a video emoticon”. See also pars. 75-76).
Bouguerra does not disclose the above strike-through limitations.
In the same field of endeavor, Karakotsios teaches a video conferencing system wherein a message from a user is processed to extract audio data related to a voice of the user (Col. 9, ll. 2-5). Karakotsios also teaches generating a photorealistic body model of the user based on one or more images of the user captured by a camera (Col. 2, ll. 6-52 and col. 10, line 46 continuing to col. 11, line 34).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to incorporate the teaching of Karakotsios into Bouguerra by generating a photorealistic body model of the user based on one or more images of the user, processing a message of the user to extract audio data related to a speech of the user, and transmitting the body model together with the extracted audio data to a remote user participating in the video chat. The motivation would have been to reduce the computational cost associated with high bandwidth consumption in video transmission (Karakotsios, col. 1, ll. 26-33), and also to provide speech-to-text and translation features based on the extracted audio data (Karakotsios, col. 9, ll. 2-5).
Furthermore, Corazza teaches a character creation system that creates a photorealistic, animatable mesh model of a user which may be made up of a head model of the user and a body model of another user, and the head and body models may be based on one or more images or photographs of the users (See Abstract and pars. 10, 34 and 35).
Corazza and Bouguerra are in the same field of computer graphics. Therefore, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to incorporate the character creation system of Corazza into Bouguerra such that a photorealistic, animatable mesh model of the user would be generated by merging a head model of the user with a body model of another user according to human body information, wherein the head and body models are based on one or more images or photographs of the user and the other user. The motivation would have been to allow the user to correlate the head and body models together in order to mix and match different body parts (Corazza, par. 36).

Regarding claim 2, Bouguerra in view of Karakotsios and Corazza teaches the method of claim 1, wherein the message is received as an input from the first person (Bouguerra, par. 61: “The video emoticon may be selected from a menu, or a video emoticon may be selected through text input. For example, a ‘smiley’ video emoticon may be selected by typing ":-)" into a chat window associated with the video-chat. Additionally or alternatively, a video emoticon may be selected from a graphical interface”).

Regarding claim 3, Bouguerra in view of Karakotsios and Corazza teaches the method according to claim 2, wherein the message comprises the audio data (Karakotsios, col. 9, ll. 2-5).

Regarding claim 4, Bouguerra in view of Karakotsios and Corazza teaches the method according to claim 1, wherein the message comprises a body movement data related to movement of the one or more body parts of the first person (Bouguerra, par. 60: “Each video emoticon is associated with a predefined set of features. In the case of a `surprise` video emoticon, the predefined features may include a pair of eyes. However, other features are similarly contemplated, including a nose, ears, mouth, chin, teeth, neck, hair, torso, arms, legs, hand, fingers, thumb, and/or wrist. In addition to body parts, features such as a dog's face, a vacuum cleaner, a car, or virtually any other object is similarly contemplated”), the method further comprises:
processing the body model, the audio data, the body movement data and the facial movement data, and generating an animation of the body model of the first person enacting the message with the movement of the one or more body parts (Bouguerra, pars. 60, 64, 69 and 75).

Regarding claim 7, Bouguerra in view of Karakotsios and Corazza teaches the method according to claim 1, comprising:
- receiving a selection input to select one or more persons with one or more faces from the one or more images of the first person received, 
- generating a scene showing one or more body models of the one or more persons with the one or more faces based on the selection input, 
- processing, the scene, the audio data, and the facial movement data, and generating an animation of the body models of the persons enacting the message (Bouguerra, par. 64: “In one embodiment, when a user of a client device selects an animated video emoticon, the animated video emoticon is applied to the video captured by the first client device before it is transmitted to another client device. However, the user of the client device may additionally or alternatively select to apply an animated video emoticon to a video stream received from the other client device. For example, a first friend may want to see what his video-chat buddy would look like `surprised`, and so the first friend may invoke the `surprised` video emoticon on the video stream depicting his buddy”. In particular, instead of applying the animation to the user’s own images, the user can select a friend that he or she is chatting with, and apply the animation to the friend’s images).

Claim 8 recites similar limitations as claim 1, and further recites additional limitations related to a chat environment established between two users. Since Bouguerra also teaches a video chat environment (see Fig. 1), claim 8 can be rejected under the same rationale set forth in the rejection of claim 1.

Regarding claim 9, Bouguerra in view of Karakotsios and Corazza teaches the method according to claim 8, wherein the chat message from a first computing device is received at a second computing device (Fig. 1 of Bouguerra suggests this limitation because Bouguerra teaches a video chat environment. For example, a user of video chat client device 101 can send a message with emoticons to video chat client device 102), and processing the body models, the audio data, and the facial movement data, and generating an animation of the first person enacting the chat message in the chat environment, and displaying the animation on a display of the second computing device in the chat environment (Bouguerra, pars. 64, 69 and 75).

Regarding claim 13, Bouguerra in view of Karakotsios and Corazza teaches the method according to claim 1, comprising: 
receiving an image of a cloth and combining the body model of the person and the image of the cloth to show the body model of the person wearing the cloth (Karakotsios, Figs. 4A-4B, and column 11, starting at line 35 to column 12, ending at line 6).

Regarding claim 14, Bouguerra in view of Karakotsios and Corazza teaches the method according to claim 1, comprising: 
receiving an animation input related to nodes of skeleton of the body model, wherein the skeleton of the body model is thinned down structure of the body model (Corazza, pars. 44 and 75);
processing the body model, the audio data, and the facial movement data, and generating an animation of the body model of the person enacting the message (Bouguerra, pars. 64, 69 and 75).

Claim(s) 5 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bouguerra in view of Karakotsios and Corazza as applied to claim 1 above, and further in view of Wang et al. (HIGH QUALITY LIP-SYNC ANIMATION FOR 3D PHOTO-REALISTIC TALKING HEAD, 2012).

Regarding claim 5, Bouguerra in view of Karakotsios and Corazza teaches the method according to claim 1, comprising:


In the same field of endeavor, Wang teaches producing lip sync based on audio data (See Abstract and section 3).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to further modify Bouguerra by processing the message and its audio data to produce lip sync data, and processing the body model, the audio data, the facial movement data, and the lip sync data to generate an animation of the body model of the user enacting the message with lip sync. The motivation would have been to provide a photo-realistic talking head (Wang, Abstract).

Claim(s) 6 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bouguerra in view of Karakotsios and Corazza as applied to claim 1 above, and further in view of Winchester (Pub. No. US 2011/0304629).

Regarding claim 6, Bouguerra in view of Karakotsios and Corazza teaches the method according to claim 1, 



Bouguerra in view of Karakotsios and Corazza does not expressly teach the above receiving and processing steps when the one or more images of the first person comprise faces of more than one person. However, it should be noted that Bouguerra in view of Karakotsios and Corazza teaches these steps when the one or more images comprise only the face of the first person, as pointed out in the rejection of claim 1.
In the same field of endeavor, Winchester teaches animating facial expressions of two different users who are captured in the same image (Fig. 4 and par. 37: “In the game being played by the individuals 402 and 404, two avatars 412 and 414 that represent the individuals 402 and 404 are displayed and utilized during game play. Specifically, the avatar 412 can represent the first individual 402 and the avatar 414 can represent the second individual 404. As the individuals 402 and 404 are playing the game, they can ascertain how their co-player/competitor is emoting by watching the facial expression animated on the avatars 412-414. This can enhance game play by providing the players with realistic emotions captured in real time by the sensor unit 410”. Note that if user 402 enacts a first message at time t0, and user 404 enacts a second message at time t1 (t1 > t0), avatar 412 of user 402 will be animated first, and avatar 414 of user 404 will be animated second. This could be interpreted as receiving the first and second messages in an order, and processing the first and second messages and generating an animation of the body models of the users 402 and 404 in the respective order).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to further modify Bouguerra to perform the above receiving and processing steps when the one or more images of the first person comprise faces of more than one person, as suggested by Winchester. The motivation would have been to enhance users' experience by providing the users with realistic emotions captured in real time.
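The ordering interpretation applied to Winchester above (a message enacted at t0 is animated before a message enacted at t1 > t0) can be illustrated with a minimal sketch. The names `Message`, `AVATAR_FOR` and `animate_in_order` are hypothetical and do not appear in Winchester or in the claims; the sketch assumes only that each enacted message carries a timestamp and that avatars are animated in timestamp order. The reference numerals 402/404 and 412/414 are those quoted from Winchester, par. 37.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    user_id: int      # individual enacting the message (e.g. 402 or 404)
    timestamp: float  # time at which the message is enacted
    content: str      # hypothetical payload, e.g. an expression label


# Avatar representing each individual, per the quoted passage of Winchester.
AVATAR_FOR = {402: 412, 404: 414}


def animate_in_order(messages):
    """Return (avatar_id, content) pairs in the order the messages were enacted."""
    ordered = sorted(messages, key=lambda m: m.timestamp)
    return [(AVATAR_FOR[m.user_id], m.content) for m in ordered]
```

Under these assumptions, a message from user 404 at t1 and a message from user 402 at t0 < t1 yield an animation of avatar 412 followed by an animation of avatar 414, regardless of the order in which the messages are passed to the function.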

Claim(s) 10 and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bouguerra in view of Karakotsios and Corazza as applied to respective claims 9 and 1 above, further in view of Winchester, and still further in view of Sun et al. (Pub. No. US 2007/0216675).

Regarding claim 10, Bouguerra in view of Karakotsios and Corazza teaches the method according to claim 9, comprising:



processing the image/s of the first person with the image/s of other human body part/s using the human body information to generate a body model of the first person for whom the other body part/s are required, the body model comprises face of the first person, 
processing the body model, images of the first person for whom the other body parts were not required and generating an image showing the first person in the chat environment, 
receiving a message from the first person in the chat environment, wherein the message comprises at least a text or an emotional and movement command, 
processing the message to extract or receive the audio data related to voice of the first person, and the facial movement data related to expression to be carried on face of the first person, 
processing the audio data, and the facial movement data, and generating an animation of the first person enacting the message in the chat environment,
wherein emotional and movement command is a GUI or multimedia based instruction to invoke the generation of facial expression/s and or body part/s movement (For the limitations that are not struck through, see the rejection of claim 1 above).
In the same field of endeavor, Winchester teaches animating facial expressions of two avatars representing two different users who are captured in the same image (Fig. 4 and par. 37).
In light of the above teaching of Winchester, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to further modify Bouguerra to receive at least one image representative of more than one user in the chat environment, and generate a body model for each of the users using the technique taught by Corazza. The motivation would have been to enhance users' experience by providing the users with realistic emotions captured in real time.
Also in the same field of endeavor, Sun teaches generating a scene in a chat environment (Fig. 2 and pars. 24 and 29. In particular, a user can select a background image (scene) to replace the current background).
In light of the above teaching of Sun, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to further modify Bouguerra by generating a scene image showing a plurality of users in the chat environment, and processing the scene image, the audio data, and facial movement data, and generating an animation of the person enacting the message in the chat environment. The motivation would have been to give the user an option to replace the current background with a scene image that he or she likes.

Claim 15 recites similar limitations as claim 1 with additional limitations related to generating body models for a plurality of users in an image and generating a scene showing the plurality of users. Since Winchester teaches animating facial expressions of two avatars representing two different users who are captured in the same image, and Sun teaches generating a scene in a chat environment, claim 15 can be rejected using the same rationales set forth in the rejection of claims 1 and 10 above.

Claim(s) 11 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bouguerra in view of Karakotsios and Corazza as applied to claim 1 above, and further in view of Sun.

Regarding claim 11, Bouguerra in view of Karakotsios and Corazza teaches the method according to claim 1, comprising:





In the same field of video chat, Sun teaches receiving a wearing input related to a body part of a user in a video stream onto which a fashion accessory is to be worn, processing the wearing input and identifying body parts of the user onto which the fashion accessory is to be worn, receiving an image of the accessory according to the wearing input, and processing the identified body parts of the user and the image of the accessory and generating a view showing the user wearing the fashion accessory (See pars. 30-31 and Fig. 3).
In light of the above teaching of Sun, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to further modify Bouguerra by receiving a wearing input related to a body part of the body model onto which a fashion accessory is to be worn, processing the wearing input and identifying body parts of the body model onto which the fashion accessory is to be worn, receiving an image of the accessory according to the wearing input, processing the identified body parts of the body model and the image of the accessory and generating a view showing the body model wearing the fashion accessory, and processing the view, the audio data, and the facial movement data, and generating an animation of the persons enacting the message wearing the fashion accessory. The motivation would have been to give the user an option to incorporate digital effects into existing video streams.

Claim(s) 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bouguerra in view of Karakotsios and Corazza as applied to claim 1 above, and further in view of Afifi et al. (“Video Face Replacement System Using a Modified Poisson Blending Technique”, 2014).

Regarding claim 12, Bouguerra in view of Karakotsios and Corazza teaches the method according to claim 1, comprising:



In the same field of video editing, Afifi teaches generating a morphed video showing a user’s face from a first video on a user’s body from a second video (See Abstract and Fig. 2).
In light of the above teaching of Afifi, it would have been obvious to one skilled in the art before the effective filing date of the claimed invention to further modify Bouguerra by receiving a target image showing a face of another person or animal, processing the body model and the target image to generate a morphed body model showing the face from the target image on the person's body model, processing the morphed body model, the audio data, and the facial movement data, and generating an animation of the morphed body model enacting the message. The motivation would have been to provide a means for face replacement (Afifi, section I. Introduction, 1st paragraph).

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
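The timing rule in the preceding paragraph can be expressed as a short calculation. The following is a simplified sketch only: the function names `add_months` and `shortened_period_expiry` are hypothetical, month arithmetic clamps to the last day of a shorter month (an assumption), and weekend or holiday adjustments and extension-of-time fees are not modeled.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Add calendar months to a date, clamping to the last day of the month."""
    idx = d.month - 1 + months
    year = d.year + idx // 12
    month = idx % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def shortened_period_expiry(mailing, first_reply=None, advisory_mailed=None):
    """Expiry of the shortened statutory period under the rule stated above."""
    three_months = add_months(mailing, 3)
    six_months = add_months(mailing, 6)
    expiry = three_months
    # A first reply within two months, followed by an advisory action mailed
    # after the three-month period, moves expiry to the advisory mailing date.
    if (first_reply is not None and advisory_mailed is not None
            and first_reply <= add_months(mailing, 2)
            and advisory_mailed > three_months):
        expiry = advisory_mailed
    # In no event later than six months from the mailing date.
    return min(expiry, six_months)
```

For a hypothetical mailing date of July 1, 2021, the function returns October 1, 2021 when no reply is filed within two months, and the advisory action's mailing date when a reply is filed by September 1, 2021 and the advisory action is mailed after October 1, 2021, capped at January 1, 2022.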
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PHONG X NGUYEN whose telephone number is (571)270-1591.  The examiner can normally be reached on Mon-Fri 8am - 5pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Xiao Wu can be reached on (571)272-7761.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/PHONG X NGUYEN/           Primary Patent Examiner, Art Unit 2613