Response to Arguments
Applicant's arguments filed 09/24/2021 have been fully considered but they are not persuasive.
Applicant argues that Li does not disclose the recited claim element(s) of selectively acquiring second information associated with additional interaction with the individual in the context based at least in part on the one or more deficiencies. See page 10, para. 1 of Applicant’s Remarks of 09/24/2021. Applicant points to para. 58 of Li and argues that what is stated there is not the same as what is recited in the claim. See page 10, para. 1 of Applicant’s Remarks of 09/24/2021 (stating that “[t]he continuous acquisition and training disclosed in Li et al. is not the same as the recited selectively acquiring second information based at least in part on the one or more deficiencies”).
Examiner respectfully disagrees. As Fig. 3 of Li details, the tracking process 300 begins with initial data 302 first entering the tracking pipeline, in which rigid motion tracking 306 is used to fit a blendshape 308 with the linear adaptive PCA subspace 310. Li then states in para. 0058, “The out-of-adaptive space deformations for feeding the incremental PCA process includes training the correctives 320 of the adaptive PCA model (M) 316 in the adaptive PCA space. For example, the anchor shapes 318 may be initialized with A=23 orthonormalized vectors from the initial blendshapes, and the K corrective shapes 320 may be learned to improve the fitting accuracy over time. To train the correctives 320, new expression samples S of the subject may first be collected that fall outside of the currently used adaptive PCA space.” (emphasis added); see also pages 5-6 of Examiner’s Non-Final Office Action dated 06/24/2021.
Accordingly, Li discloses selectively acquiring second information (i.e., new expression samples S) associated with additional interaction with the individual in the context (i.e., samples S of the subject collected that fall outside of the currently used adaptive PCA space) based at least in part on the one or more deficiencies (i.e., the K corrective shapes 320 may be learned to improve the fitting accuracy over time).
Applicant argues that Li does not disclose the recited claim element(s) wherein acquiring the second information involves provoking specific responses from the individual based at least in part on the one or more deficiencies. See page 10, para. 2 of Applicant’s Remarks of 09/24/2021. In support of this argument, Applicant admits that “[w]hile Li … [does] disclose[] collecting new expression samples that fall outside of the currently used adaptive space, this is not a driven process.” Id. (emphasis added). Applicant then goes on to argue that “Li…does not disclose or suggest that specific responses (e.g. specific new expressions) are provoked based at least in part on the one or more deficiencies.” Id. See page 10, para. 3 of Applicant’s Remarks of 09/24/2021 (stating that “[t]here is no disclosure or suggestion that there is a trigger or an instruction that intentionally provokes or results in this new expression based at least part on the one or more deficiencies”) (emphasis added); see also page 11, para. 1 of Applicant’s Remarks of 09/24/2021 (stating that “[t]his indicates that the occurrence of a new expression is a random process, not a causal one that is intentionally triggered or driven based at least part on the one or more deficiencies”) (emphasis added).
Examiner respectfully disagrees. In response to Applicant's argument that Li fails to show certain features of Applicant’s invention, it is noted that the features upon which Applicant relies (i.e., “a driven process,” “specific new expressions,” “a trigger or an instruction that intentionally provokes or results in this new expression,” and “a causal one that is intentionally triggered or driven”) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993) (Claims to a superconducting magnet which generates a "uniform magnetic field" were not limited to the degree of magnetic field uniformity required for Nuclear Magnetic Resonance (NMR) imaging. Although the specification disclosed that the claimed magnet may be used in an NMR apparatus, the claims were not so limited.); Constant v. Advanced Micro-Devices, Inc., 848 F.2d 1560, 1571-72, 7 USPQ2d 1057, 1064-1065 (Fed. Cir.), cert. denied, 488 U.S. 892 (1988) (Various limitations on which appellant relied were not stated in the claims; the specification did not provide evidence indicating these limitations must be read into the claims to give meaning to the disputed terms.).
The limitation that Applicant argues Li does not teach is stated as follows in Applicant’s claims of 08/03/2018: acquiring the second information involves provoking specific responses from the individual based at least in part on the one or more deficiencies. Nowhere does the recited limitation mention a driven process, specific new expressions, a trigger or an instruction that intentionally provokes or results in a new expression, or a causal process that is intentionally triggered or driven.
In addition to para. 0058, para. 0093 of Li states, “FIG. 8 illustrates two examples of the evolution of the correctives 320…For example, when the subject performs a new expression that could not be captured by the current adaptive PCA model for the previous frame, a large error may be measured between the adaptive PCA fit and the final tracking.” (emphasis added). Accordingly, under the broadest reasonable interpretation of the claims in light of the specification, Li teaches that acquiring the second information (i.e., new expression samples S) involves provoking specific responses from the individual (i.e., when the subject performs a new expression) based at least in part on the one or more deficiencies (i.e., the K corrective shapes 320 may be learned to improve the fitting accuracy over time).

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
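3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.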

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al. (US 10,504,268 B1, “Huang”) in view of LI et al. (US 2015/0084950 A1, “LI”).
Regarding claim 1, Huang teaches: a computer system, comprising: a computation device; memory configured to store program instructions, wherein, when executed by the computation device, the program instructions cause the computer system to perform one or more operations (Huang, col.6, lines 13-24, fig. 6, “FIG. 6 is a diagram of an example system 60 for generating facial expressions in an avatar-based user interface for a computing device 62. The computing device 62 includes a digital camera 64, a display device 66, and a user interface application 68 that is stored on a non-transitory computer readable medium and executable by a processor.”) comprising: receiving information associated with an interaction with an individual in a context; analyzing the information to extract features associated with one or more attributes of the individual (Huang, col.6, lines 39-46, fig. 7, “FIG. 7 is a flow diagram of an example method 90 for generating facial expressions in an avatar-based user interface for a computing device… At 92, image data is captured over time, for example using a digital camera in the computing device. At 94, a plurality of facial expression descriptor vectors are generated based on the image data.”); generating, based at least in part on the extracted features and using a group of behavioral agents in a multi-layer hierarchy, a dynamic virtual representation that automatically mimics the one or more attributes of the individual, wherein a given behavioral agent receives one or more inputs and provides an output corresponding to one or more of the extracted features, and wherein the inputs to at least some of the behavioral agents include outputs from one or more of the other behavioral agents (Huang, col.6, lines 46-52, fig. 7, “At 96, an expressive facial sketch image of an avatar is generated based on the plurality of facial expression descriptor vectors, for example using a first DC-GAN model as described above. At 98, a facial expression image is generated from the expressive facial sketch image, for example using a second DC-GAN model as described above. At 100, the facial expression image is displayed as a user interface avatar on a display device.”).
Huang does not teach: calculating one or more performance metrics associated with the dynamic virtual representation and the one or more attributes; determining, based at least in part on the one or more performance metrics one or more deficiencies in the extracted features; and selectively acquiring second information associated with additional interaction with the individual in the context based at least in part on the one or more deficiencies, wherein the second information at least in part corrects for the one or more deficiencies, and wherein acquiring the second information involves provoking specific responses from the individual based at least in part on the one or more deficiencies.
However, LI teaches: calculating one or more performance metrics associated with the dynamic virtual representation and the one or more attributes; determining, based at least in part on the one or more performance metrics one or more deficiencies in the extracted features (LI, pg. 11, para. 0096, fig. 12, fig. 3, “FIG. 12 graphically illustrates the fast convergence of the incremental PCA process 314 by showing the error as time progresses. As described above, the final facial tracking output includes the vertices of the mesh $v^{4}$ that result from the out-of-adaptive space deformation 312. The error shown in green includes the average distance between these vertices and the anchor shape space $M_{A}$, while the error shown in blue includes the full adaptive PCA space M.”); and selectively acquiring second information associated with additional interaction with the individual in the context based at least in part on the one or more deficiencies, wherein the second information at least in part corrects for the one or more deficiencies, and wherein acquiring the second information involves provoking specific responses from the individual based at least in part on the one or more deficiencies (LI, pg. 7, para. 0058, fig. 3, fig. 8, fig. 12, “Using the out-of-adaptive space deformation of 312, the PCA space is continuously trained as the subject is being tracked. The out-of-adaptive space deformations for feeding the incremental PCA process includes training the correctives 320 of the adaptive PCA model (M) 316 in the adaptive PCA space. For example, the anchor shapes 318 may be initialized with A=23 orthonormalized vectors from the initial blendshapes, and the K corrective shapes 320 may be learned to improve the fitting accuracy over time. To train the correctives 320, new expression samples S of the subject may first be collected that fall outside of the currently used adaptive PCA space. These samples are obtained by warping the result of the initial blendshape fit to fit the current input depth map and 2D facial features using a per-vertex Laplacian deformation algorithm. These samples are used to refine the corrective shapes 320 using the incremental PCA technique at 314.” & see also LI, pg. 7, para(s) 62-66, fig. 3, detailing the use of the EM algorithm to update the corrective shapes 320 in $M_{k}$ to reduce the overall error in the adaptive PCA model 316).
Accordingly, one of ordinary skill in the art would modify Huang’s computer system  in view of LI to teach: calculating one or more performance metrics associated with the dynamic virtual representation and the one or more attributes; determining, based at least in part on the one or more performance metrics one or more deficiencies in the extracted features; and selectively acquiring second information associated with additional interaction with the individual in the context based at least in part on the one or more deficiencies, wherein the second information at least in part corrects for the one or more deficiencies, and wherein 
Regarding claim 2, Huang in view of LI teaches the computer system of claim 1, wherein the one or more operations comprise: analyzing the second information to extract second features associated with one or more attributes of the individual (LI, pg. 5, para. 0043, fig. 3, “At step 308, the input data is fit with one or more initial blendshapes… an initial blendshape fit may be performed for each input frame using 3D point constraints on the input scans (the 3D depth map data) and 2D facial features constraints (the video data). The 2D facial feature constraints enable the identification of a correct shape of the subject that cannot be easily identified using the depth data.”); generating, based at least in part on the second extracted features, a revised dynamic virtual representation that automatically mimics the one or more attributes of the individual (LI, pg. 5, para(s). 0045-0046, fig. 3, “The 2D facial feature constraints may then be pre-associated to a fixed set of mesh vertices of the tracked 3D model. A facial feature fitting term is then formulated as one or more vectors between the 2D facial features and their corresponding mesh vertices in camera space: $c_{j}^{F}(x) = \begin{bmatrix} 1 & 0 & -u_{j}^{x} \\ 0 & 1 & -u_{j}^{y} \end{bmatrix} P \, v_{j}^{1}(x)$ where $u_{j} = [u_{j}^{x}, u_{j}^{y}]$ is the j-th 2D facial feature position and $P_{3 \times 3}$ is the camera projection matrix…[a]fter the aforementioned processing, the appropriate initial blendshape expression is fit to the current frame of the input data 302.”); calculating one or more second performance metrics associated with the revised dynamic virtual representation and the one or more attributes; and determining, based at least in part on the one or more second performance metrics, one or more second deficiencies in the second extracted features (LI, pg. 5, para. 0047, fig. 3, “The fitting from step 308 is then refined at step 310 by fitting or warping the blendshape model to a linear subspace (e.g., an adaptive PCA space, or the like) using a progressively updated adaptive model 316 (e.g., an adaptive PCA model). As noted above, the initial blendshapes 304 are personalized and coarse approximations of various real expressions of the subject. For example, one blendshape may approximate how the subject would smile. By performing step 310, and fitting the blendshape model to a linear adaptive PCA subspace, expressions that are personalized to the subject may be obtained. In order to create personalized expressions, the PCA space extends the space that is spanned by the initial blendshapes with PCA modes. In essence, the PCA modes describe the difference between the real expressions of the subject and the expressions that have been approximated by the initial blendshapes. By determining the difference between the real and approximated expressions, the linear PCA space more accurately describes the actual expression of the subject….”).
Regarding claim 3, Huang in view of LI teaches the computer system of claim 1, wherein the context comprises interacting with the individual (Huang, col.3, lines 5-11, fig. 1, “The present disclosure utilizes the study of human dyadic interactions to address the problem of facial expression generation in human-avatar dyadic interactions using conditional Generative Adversarial Networks (GANs). In this way, a model may be constructed that takes into account ).
Regarding claim 4, Huang in view of LI teaches the computer system of claim 1, wherein the information comprises one or more of: one or more images, sound, writing, an anatomic response of the individual, a selection from a human interface, neuronal signals, or a second type of measurement (LI, pg. 4, para. 0036, fig. 1, “The input sensor 104 may be used to capture input data of the subject 108. The input data may include scans, depth data (e.g., one or more depth maps), video with 2D features, and/or the like. In some embodiments, the input sensor 104 may be used to capture scans and/or depth data, and a separate camera may be used to capture video data. In some embodiments, the input sensor 104 may include a depth sensor (e.g., KINECT depth sensor, a short range Primesense Carmine 1.09 depth sensor, or the like). A depth sensor may also be referred to herein as a camera. The input sensor 104 may run in real or substantial real-time along with the computer 102 as the subject 108 is making facial expressions. For example, the user may make a facial expression 110 that is captured by the input sensor 104. Using the input data, a facial performance capture program running on one or more processors of computer 102 may perform the techniques described in further detail below to render the output 112 including a model of the subject's facial expression.”).1
Regarding claim 5, Huang in view of LI teaches the computer system of claim 1, wherein the one or more attributes comprise one or more of: a behavior, an emotion, a type of humor, a mannerism, a style of speech, a memory or a thought process (LI, pg. 11, para. 0095, fig. 10, fig. 11, “FIGS. 10 and 11 illustrate how the facial performance capture techniques described herein using an adaptive tracking model 316 with corrective shapes 320 conforms to subject ).2
Regarding claim 6, Huang in view of LI teaches the computer system of claim 1, wherein a given behavioral agent comprises an artificial neural network (Huang, col.4, lines 26-42, fig. 1, “With reference again to FIG. 1, the disclosed example 10 addresses this as a tractable optimization problem that divides the model into two stages 12, 18. The first stage 12 is a conditional deep convolutional generative adversarial network (DC-GAN) designed to produce expressive facial sketch images 14 of the interviewer that are conditioned on the interviewee's facial expressions. The second stage 18 is another DC-GAN to transfer refined sketch images 14 into real facial expression images 20. On both stages 12, 18, generator and discriminator architectures may be adapted from and use modules of the form convolution-BatchNorm-ReLu to stabilize optimization. A training phase may utilize a mini-batch SGD and apply the Adam solver. To avoid the fast convergence of discriminators, generators may be updated twice for
Regarding claim 7, Huang in view of LI teaches the computer system of claim 1, wherein the information is associated with an electronic device (Huang, col.6, lines 13-24, fig. 6, “FIG. 6 is a diagram of an example system 60 for generating facial expressions in an avatar-based user interface for a computing device 62. The computing device 62 includes a digital camera 64, a display device 66, and a user interface application 68 that is stored on a non-transitory computer readable medium and executable by a processor.”).
Regarding claim 8, Huang in view of LI teaches the computer system of claim 1, wherein the features comprise one or more of: spoken or written communication of the individual, an emotion of the individual, non-verbal communication by the individual, a tone, a style or manner of speaking, a gesture, facial expression, a vital sign, body language, folded arms or a posture, an eyebrow position or motion, a sudden motion, a rate or frequency of blinking, a twitch, a gaze direction and/or emotional prosody (LI, pg. 11, para. 0095, fig. 10, fig. 11, “FIGS. 10 and 11 illustrate how the facial performance capture techniques described herein using an adaptive tracking model 316 with corrective shapes 320 conforms to subject input expressions on-the-fly. FIG. 10 compares the techniques described herein with less-effective methods using six basic human emotional expressions and shows that the adaptive tracking model 316 can be used to accurately capture these characteristics. As can be seen by the various models 1002-1012 illustrating the different human expressions, the facial performance capture processes described herein produce a better fit to the subject than traditional data driven techniques, which are typically confined to learned motion or expressions received prior to tracking. The adaptive model 316 accurately captures emotions more effectively than other methods without requiring a ).3
Regarding claim 9, Huang in view of LI teaches the computer system of claim 1, wherein at least some of the operations of the computer system are performed by a discriminator in a generative adversarial network (Huang, col.4, lines 26-42, fig. 1, “With reference again to FIG. 1, the disclosed example 10 addresses this as a tractable optimization problem that divides the model into two stages 12, 18. The first stage 12 is a conditional deep convolutional generative adversarial network (DC-GAN) designed to produce expressive facial sketch images 14 of the interviewer that are conditioned on the interviewee's facial expressions. The second stage 18 is another DC-GAN to transfer refined sketch images 14 into real facial expression images 20. On both stages 12, 18, generator and discriminator architectures may be adapted from and use modules of the form convolution-BatchNorm-ReLu to stabilize optimization. A training phase may utilize a mini-batch SGD and apply the Adam solver. To avoid the fast convergence of discriminators, generators may be updated twice for each discriminator update, which differs from the original setting in that the discriminator and generator update alternately.”).
Regarding claim 10, Huang teaches: a non-transitory computer-readable storage medium for use in conjunction with a computer system, the computer-readable storage medium configured to store program instructions that, when executed by the computer system, cause the computer system to perform one or more operations (Huang, col.6, lines 13-24, fig. 6, “FIG. 6 is a diagram of an example system 60 for generating facial expressions in an ) comprising: receiving information associated with an interaction with an individual in a context; analyzing the information to extract features associated with one or more attributes of the individual (Huang, col.6, lines 39-46, fig. 7, “FIG. 7 is a flow diagram of an example method 90 for generating facial expressions in an avatar-based user interface for a computing device… At 92, image data is captured over time, for example using a digital camera in the computing device. At 94, a plurality of facial expression descriptor vectors are generated based on the image data.”); generating, based at least in part on the extracted features and using a group of behavioral agents in a multi-layer hierarchy, a dynamic virtual representation that automatically mimics the one or more attributes of the individual, wherein a given behavioral agent receives one or more inputs and provides an output corresponding to one or more of the extracted features, and wherein the inputs to at least some of the behavioral agents include outputs from one or more of the other behavioral agents (Huang, col.6, lines 46-52, fig. 7, “At 96, an expressive facial sketch image of an avatar is generated based on the plurality of facial expression descriptor vectors, for example using a first DC-GAN model as described above. At 98, a facial expression image is generated from the expressive facial sketch image, for example using a second DC-GAN model as described above. At 100, the facial expression image is displayed as a user interface avatar on a display device.”).
Huang does not teach: calculating one or more performance metrics associated with the dynamic virtual representation and the one or more attributes; determining, based at least in part on the one or more performance metrics one or more deficiencies in the extracted features; and 
However, LI teaches: calculating one or more performance metrics associated with a dynamic virtual representation and the one or more attributes; determining, based at least in part on the one or more performance metrics, one or more deficiencies in the extracted features (LI, pg. 11, para. 0096, fig. 12, fig. 3, “FIG. 12 graphically illustrates the fast convergence of the incremental PCA process 314 by showing the error as time progresses. As described above, the final facial tracking output includes the vertices of the mesh $v^4$ that result from the out-of-adaptive space deformation 312. The error shown in green includes the average distance between these vertices and the anchor shape space $M_A$, while the error shown in blue includes the full adaptive PCA space M.”); and selectively acquiring second information associated with additional interaction with the individual in the context based at least in part on the one or more deficiencies, wherein the second information at least in part corrects for the one or more deficiencies, and wherein acquiring the second information involves provoking specific responses from the individual based at least in part on the one or more deficiencies (LI, pg. 7, para. 0058, fig. 3, fig. 8, fig. 12, “Using the out-of-adaptive space deformation of 312, the PCA space is continuously trained as the subject is being tracked. The out-of-adaptive space deformations for feeding the incremental PCA process includes training the correctives 320 of the adaptive PCA model (M) 316 in the adaptive PCA space. For example, the anchor shapes 318 may be initialized with A=23 orthonormalized vectors from the … $M_k$ to reduce the overall error in the adaptive PCA model 316).
Accordingly, one of ordinary skill in the art would modify Huang’s non-transitory computer-readable storage medium in view of LI to teach: calculating one or more performance metrics associated with the dynamic virtual representation and the one or more attributes; determining, based at least in part on the one or more performance metrics, one or more deficiencies in the extracted features; and selectively acquiring second information associated with additional interaction with the individual in the context based at least in part on the one or more deficiencies, wherein the second information at least in part corrects for the one or more deficiencies, and wherein acquiring the second information involves provoking specific responses from the individual based at least in part on the one or more deficiencies. The motivation to do so would be to have a real-time animation system that corrects for facial expressions as the animation is rendered (LI, pg. 1, para. 0005, “Using the neutral scan, a three-dimensional (3D) tracking model may be generated and used to track input data (video and depth data) of the subject. The tracking can be refined over time using an adaptive principal component analysis (PCA) model in order to incrementally improve the 3D model of the subject. Specifically, the adaptive PCA model utilizes shape correctives that are adjusted on-the-fly to the …”).
Regarding claim 11, Huang in view of LI teaches the computer-readable storage medium of claim 10, wherein the one or more operations comprise: analyzing the second information to extract second features associated with one or more attributes of the individual (LI, pg. 5, para. 0043, fig. 3, “At step 308, the input data is fit with one or more initial blendshapes… an initial blendshape fit may be performed for each input frame using 3D point constraints on the input scans (the 3D depth map data) and 2D facial features constraints (the video data). The 2D facial feature constraints enable the identification of a correct shape of the subject that cannot be easily identified using the depth data.”); generating, based at least in part on the second extracted features, a revised dynamic virtual representation that automatically mimics the one or more attributes of the individual (LI, pg. 5, para(s). 0045-0046, fig. 3, “The 2D facial feature constraints may then be pre-associated to a fixed set of mesh vertices of the tracked 3D model. A facial feature fitting term is then formulated as one or more vectors between the 2D facial features and their corresponding mesh vertices in camera space: $c_j^F(x) = \begin{bmatrix} 1 & 0 & -u_j^x \\ 0 & 1 & -u_j^y \end{bmatrix} P \, v_j^1(x)$ where $u_j = [u_j^x, u_j^y]$ is the j-th 2D facial feature position and $P_{3 \times 3}$ is the camera projection matrix…[a]fter the aforementioned processing, the appropriate initial blendshape expression is fit to the current frame of the input data 302.”); calculating one or more second performance metrics associated with the revised dynamic virtual representation and the one or more attributes; and determining, based at least in part on the one or more second performance metrics, one or more second deficiencies in the second extracted features (LI, pg. 5, para. 0047, fig. 3, “The fitting from step 308 is then refined at step 310 by fitting or warping the blendshape model to a linear subspace (e.g., an adaptive PCA space, or the like) using a …”).
Regarding claim 12, Huang in view of LI teaches the computer-readable storage medium of claim 10, wherein the context comprises interacting with the individual(Huang, col.3, lines 5-11, fig.1,  “The present disclosure utilizes the study of human dyadic  interactions to address the problem of facial expression generation in human-avatar dyadic interactions using conditional Generative Adversarial Networks (GANs). In this way, a model may be constructed that takes into account behavior of one individual in generating a valid facial expression response in their virtual dyad partner.”).
Regarding claim 13, Huang in view of LI teaches the computer-readable storage medium of claim 10, wherein the information comprises one or more of: one or more images, sound, writing, an anatomic response of the individual, a selection from a human interface, neuronal signals, or a second type of measurement (LI, pg. 4, para. 0036, fig. 1, “The input sensor 104 may be used to capture input data of the subject 108. The input data may include scans, depth data (e.g., one or more depth maps), video with 2D features, and/or the like. In some …”).4
Regarding claim 14, Huang in view of LI teaches the computer-readable storage medium of claim 10, wherein the one or more attributes comprise one or more of: a behavior, an emotion, a type of humor, a mannerism, a style of speech, a memory or a thought process (LI, pg. 11, para. 0095, fig. 10, fig. 11, “FIGS. 10 and 11 illustrate how the facial performance capture techniques described herein using an adaptive tracking model 316 with corrective shapes 320 conforms to subject input expressions on-the-fly. FIG. 10 compares the techniques described herein with less-effective methods using six basic human emotional expressions and shows that the adaptive tracking model 316 can be used to accurately capture these characteristics. As can be seen by the various models 1002-1012 illustrating the different human expressions, the facial performance capture processes described herein produce a better fit to the subject than traditional data driven techniques, which are typically confined to learned motion or expressions received prior to tracking. The adaptive model 316 accurately captures …”).5
Regarding claim 15, Huang in view of LI teaches the computer-readable storage medium of claim 10, wherein a given behavioral agent comprises an artificial neural network(Huang, col.4, lines 26-42, fig. 1, “With reference again to FIG. 1, the disclosed example 10 addresses this as a tractable optimization problem that divides the model into two stages 12, 18. The first stage 12 is a conditional deep convolutional generative adversarial network (DC-GAN) designed to produce expressive facial sketch images 14 of the interviewer that are conditioned on the interviewee's facial expressions. The second stage 18 is another DC-GAN to transfer refined sketch images 14 into real facial expression images 20. On both stages 12, 18, generator and discriminator architectures may be adapted from and use modules of the form convolution-BatchNorm-ReLu to stabilize optimization. A training phase may utilize a mini-batch SGD and apply the Adam solver. To avoid the fast convergence of discriminators, generators may be updated twice for each discriminator update, which differs from the original setting in that the discriminator and generator update alternately.”).
Regarding claim 16, Huang in view of LI teaches the computer-readable storage medium of claim 10, wherein the information is associated with an electronic device (Huang, col.6, lines 13-24, fig. 6, “FIG. 6 is a diagram of an example system 60 for generating facial expressions in an avatar-based user interface for a computing device 62. The computing device …”).
Regarding claim 17, Huang in view of LI teaches the computer-readable storage medium of claim 10, wherein the features comprise one or more of: spoken or written communication of the individual, an emotion of the individual, non-verbal communication by the individual, a tone, a style or manner of speaking, a gesture, facial expression, a vital sign, body language, folded arms or a posture, an eyebrow position or motion, a sudden motion, a rate or frequency of blinking, a twitch, a gaze direction and/or emotional prosody(LI, pg. 11, para. 0095, fig. 10, fig.11,  “FIGS. 10 and 11 illustrate how the facial performance capture techniques described herein using an adaptive tracking model 316 with corrective shapes 320 conforms to subject input expressions on-the-fly. FIG. 10 compares the techniques described herein with less-effective methods using six basic human emotional expressions and shows that the adaptive tracking model 316 can be used to accurately capture these characteristics. As can be seen by the various models 1002-1012 illustrating the different human expressions, the facial performance capture processes described herein produce a better fit to the subject than traditional data driven techniques, which are typically confined to learned motion or expressions received prior to tracking. The adaptive model 316 accurately captures emotions more effectively than other methods without requiring a training session. FIG. 11 compares standard tracking and retargeting with tracking and retargeting that is based on the adaptive model 316 with correctives 320. It can be see that the adaptive model 316 leads to more accurate expressions being tracked and retargeted to the various characters.”).6
Regarding claim 18, Huang in view of LI teaches the computer-readable storage medium of claim 10, wherein at least some of the operations of the computer system are performed by a discriminator in a generative adversarial network (Huang, col.4, lines 26-42, fig. 1, “With reference again to FIG. 1, the disclosed example 10 addresses this as a tractable optimization problem that divides the model into two stages 12, 18. The first stage 12 is a conditional deep convolutional generative adversarial network (DC-GAN) designed to produce expressive facial sketch images 14 of the interviewer that are conditioned on the interviewee's facial expressions. The second stage 18 is another DC-GAN to transfer refined sketch images 14 into real facial expression images 20. On both stages 12, 18, generator and discriminator architectures may be adapted from and use modules of the form convolution-BatchNorm-ReLu to stabilize optimization. A training phase may utilize a mini-batch SGD and apply the Adam solver. To avoid the fast convergence of discriminators, generators may be updated twice for each discriminator update, which differs from the original setting in that the discriminator and generator update alternately.”).
Regarding claim 19, Huang teaches a method for dynamically and intuitively aggregating a training dataset, wherein the method comprises: by a computer system (Huang, col.6, lines 13-24, fig. 6, “FIG. 6 is a diagram of an example system 60 for generating facial expressions in an avatar-based user interface for a computing device 62. The computing device 62 includes a digital camera 64, a display device 66, and a user interface application 68 that is stored on a non-transitory computer readable medium and executable by a processor.”): receiving information associated with an interaction with an individual in a context; analyzing the information to extract features associated with one or more attributes of the individual (Huang, col.6, lines 39-46, fig. 7, “FIG. 7 is a flow diagram of an example method 90 for generating facial … At 92, image data is captured over time, for example using a digital camera in the computing device. At 94, a plurality of facial expression descriptor vectors are generated based on the image data.”); generating, based at least in part on the extracted features and using a group of behavioral agents in a multi-layer hierarchy, a dynamic virtual representation that automatically mimics the one or more attributes of the individual, wherein a given behavioral agent receives one or more inputs and provides an output corresponding to one or more of the extracted features, and wherein the inputs to at least some of the behavioral agents include outputs from one or more of the other behavioral agents (Huang, col.6, lines 46-52, fig. 7, “At 96, an expressive facial sketch image of an avatar is generated based on the plurality of facial expression descriptor vectors, for example using a first DC-GAN model as described above. At 98, a facial expression image is generated from the expressive facial sketch image, for example using a second DC-GAN model as described above. At 100, the facial expression image is displayed as a user interface avatar on a display device.”).
Huang does not teach: calculating one or more performance metrics associated with the dynamic virtual representation and the one or more attributes; determining, based at least in part on the one or more performance metrics one or more deficiencies in the extracted features; and selectively acquiring second information associated with additional interaction with the individual in the context based at least in part on the one or more deficiencies, wherein the second information at least in part corrects for the one or more deficiencies, and wherein acquiring the second information involves provoking specific responses from the individual based at least in part on the one or more deficiencies.
However, LI teaches: calculating one or more performance metrics associated with a dynamic virtual representation and the one or more attributes; determining, based at least in part on the one or more performance metrics, one or more deficiencies in the extracted features (LI, pg. 11, para. 0096, fig. 12, fig. 3, “FIG. 12 graphically illustrates the fast convergence of the incremental PCA process 314 by showing the error as time progresses. As described above, the final facial tracking output includes the vertices of the mesh $v^4$ that result from the out-of-adaptive space deformation 312. The error shown in green includes the average distance between these vertices and the anchor shape space $M_A$, while the error shown in blue includes the full adaptive PCA space M.”); and selectively acquiring second information associated with additional interaction with the individual in the context based at least in part on the one or more deficiencies, wherein the second information at least in part corrects for the one or more deficiencies, and wherein acquiring the second information involves provoking specific responses from the individual based at least in part on the one or more deficiencies (LI, pg. 7, para. 0058, fig. 3, fig. 8, fig. 12, “Using the out-of-adaptive space deformation of 312, the PCA space is continuously trained as the subject is being tracked. The out-of-adaptive space deformations for feeding the incremental PCA process includes training the correctives 320 of the adaptive PCA model (M) 316 in the adaptive PCA space. For example, the anchor shapes 318 may be initialized with A=23 orthonormalized vectors from the initial blendshapes, and the K corrective shapes 320 may be learned to improve the fitting accuracy over time. To train the correctives 320, new expression samples S of the subject may first be collected that fall outside of the currently used adaptive PCA space. These samples are obtained by warping the result of the initial blendshape fit to fit the current input depth map and 2D facial features using a per-vertex Laplacian deformation algorithm. These samples are used to $M_k$ to reduce the overall error in the adaptive PCA model 316).
Accordingly, one of ordinary skill in the art would modify Huang’s method in view of LI to teach: calculating one or more performance metrics associated with the dynamic virtual representation and the one or more attributes; determining, based at least in part on the one or more performance metrics, one or more deficiencies in the extracted features; and selectively acquiring second information associated with additional interaction with the individual in the context based at least in part on the one or more deficiencies, wherein the second information at least in part corrects for the one or more deficiencies, and wherein acquiring the second information involves provoking specific responses from the individual based at least in part on the one or more deficiencies. The motivation to do so would be to have a real-time animation system that corrects for facial expressions as the animation is rendered (LI, pg. 1, para. 0005, “Using the neutral scan, a three-dimensional (3D) tracking model may be generated and used to track input data (video and depth data) of the subject. The tracking can be refined over time using an adaptive principal component analysis (PCA) model in order to incrementally improve the 3D model of the subject. Specifically, the adaptive PCA model utilizes shape correctives that are adjusted on-the-fly to the subject's expressions through incremental PCA-based learning. As a result, the animation accuracy of the subject's expressions over time can be improved.”).
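For illustration only, the following Python sketch shows one way the mapping relied on above could be read: an average per-vertex distance between a tracked mesh and its reconstruction in a linear subspace serves as the performance metric, and only samples whose error exceeds a threshold are kept as new samples. This is a simplified sketch and is not taken from LI. The function names, the fixed threshold, and the toy two-vertex data are assumptions made for the example, and LI's per-vertex Laplacian deformation and incremental PCA update are not modeled.

```python
import math


def project(sample, mean, basis):
    """Reconstruct a flattened mesh from its projection onto an orthonormal basis.

    `sample` and `mean` are flat lists [x0, y0, z0, x1, y1, z1, ...];
    `basis` is a list of orthonormal vectors of the same length (assumed).
    """
    centered = [s - m for s, m in zip(sample, mean)]
    recon = list(mean)
    for b in basis:
        coeff = sum(c * bi for c, bi in zip(centered, b))
        for i, bi in enumerate(b):
            recon[i] += coeff * bi
    return recon


def mean_vertex_error(sample, recon):
    """Average Euclidean distance between corresponding 3D vertices."""
    n = len(sample) // 3
    total = 0.0
    for v in range(n):
        total += math.dist(sample[3 * v:3 * v + 3], recon[3 * v:3 * v + 3])
    return total / n


def select_out_of_space(samples, mean, basis, threshold):
    """Keep only samples that the current subspace reconstructs poorly.

    The threshold is a hypothetical tuning parameter, not a value from LI.
    """
    selected = []
    for s in samples:
        err = mean_vertex_error(s, project(s, mean, basis))
        if err > threshold:
            selected.append((s, err))
    return selected


# Toy data: two vertices, a one-dimensional subspace along the first coordinate.
mean = [0.0] * 6
basis = [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
inside = [2.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # lies in the subspace
outside = [2.0, 0.0, 0.0, 0.0, 3.0, 0.0]  # second vertex leaves the subspace
picked = select_out_of_space([inside, outside], mean, basis, threshold=0.5)
```

In this toy case only the second sample is retained, because its second vertex sits 3 units from the reconstruction, giving an average error of 1.5 over the two vertices.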
Regarding claim 20, Huang in view of LI teaches the method of claim 19, wherein at least some of the operations of the computer system are performed by a discriminator in a generative adversarial network (Huang, col.4, lines 26-42, fig. 1, “With reference again to FIG. 1, the disclosed example 10 addresses this as a tractable optimization problem that divides …”).

Conclusion

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
 

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J Huntley can be reached on (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
Adam Clark Standke
Examiner
Art Unit 2129
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129


        1 According to the broadest reasonable interpretation (BRI), the use of alternative language amounts to the claim requiring one or more elements but not all.  
        2 According to the broadest reasonable interpretation (BRI), the use of alternative language amounts to the claim requiring one or more elements but not all.  
        3 According to the broadest reasonable interpretation (BRI), the use of alternative language amounts to the claim requiring one or more elements but not all.  
        4 According to the broadest reasonable interpretation (BRI), the use of alternative language amounts to the claim requiring one or more elements but not all.  
        5 According to the broadest reasonable interpretation (BRI), the use of alternative language amounts to the claim requiring one or more elements but not all.  
        6 According to the broadest reasonable interpretation (BRI), the use of alternative language amounts to the claim requiring one or more elements but not all.