DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 10 June 2022 has been entered.
 
Claims 1, 6, 7, 12, 16 and 19-22 are currently amended, claims 4 and 5 are as previously presented, claims 8-11 and 13-15 are as originally presented, and claims 2, 3, 17 and 18 were previously cancelled.  In summary, claims 1, 4-16 and 19-22 are pending in the application.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 4-10, 12-16, 19 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Williams et al. (U.S. Patent Application Publication 2013/0243255 A1, already of record, hereafter ‘255) in view of Zhang et al. (“Dynamic Facial Expression Analysis and Synthesis With MPEG-4 Facial Animation Parameters”).

Regarding claim 1 (Currently Amended), Williams teaches an image processing method (‘255; ¶ 0007; image processing method described), comprising: obtaining an image (‘255; ¶ 0007; receiving image data from the capture device); obtaining a feature of a part of a target based on the image (‘255; figs. 3-5; ¶ 0078; texture features to identify parts of a human body contained within the received image data from which additional features may be determined - Exemplar, which is a known technique for receiving a two-dimensional depth texture image and generating body part proposals as probabilities as to the proper identification of specific body parts within the image to identify limbs; ¶ 0067; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least features of at least a part of a target; the three-dimensional positions of j2, j18 and j20 of fig. 4, for example, are tracked and updated at real-time rates providing at least three features relating to two particular parts, upper and lower left arm parts of a human target satisfying the requirement of obtaining a feature of a part of a target based on the image); determining movement information of the part based on the feature (‘255; figs. 3-5; ¶ 0081; trajectories of joint locations are repeatedly sampled and updated over time - movement information of the left upper arm connection portions, j2 and j18, and left lower arm connection portions at j18 and j20 are tracked as movement information of the connection portion based on the features of j2, j18 and j20 as various traction actions along the left arm ultimately determine the movement information of j18, a connection portion of the part for which movement information of the part based on the feature is determined in real-time); and controlling movement of a corresponding part in a controlled model according to the movement information (‘255; fig. 1A; avatar 19 follows the limb, connection portion j18 and other body components of user 18; ¶ 0039; The user's movements are tracked and used to animate the movements of the avatar 19. In embodiments, the avatar 19 mimics the movements of the user 18 in real world space so that the user 18 may perform movements and gestures which control the movements and actions of the avatar 19 on the display 14); wherein obtaining the feature of the part of the target based on the image comprises: obtaining a first-type feature (‘255; figs. 3-5; bp1 is a head; ¶ 0030; ¶ 0035; user head gesture as an expression feature – a yes and a no head/facial movement are examples of a first-type feature) of a first-type part of the target (‘255; figs. 3-5; bp1 is the head of the body which is a first-type part of the target as defined by the specification of the instant application) based on the image (‘255; ¶ 0007; received image data from the capture device); and obtaining a second-type feature of a second-type part of the target based on the image (‘255; figs. 3-5; ¶ 0078; texture features to identify parts of a human body contained within the received image data from which additional features may be determined - Exemplar, which is a known technique for receiving a two-dimensional depth texture image and generating body part proposals as probabilities as to the proper identification of specific body parts within the image to identify limbs; ¶ 0067; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least features of at least a part of a target; the three-dimensional positions of j2, j18 and j20 of fig. 4, for example, are tracked and updated at real-time rates providing at least three features, defined by the specification of the instant application as second-type features, relating to two particular parts, upper and lower left arm parts of a human target, defined by the specification of the instant application as second-type parts, satisfying the requirement of obtaining a second-type feature of a second-type part of a target based on the image) by obtaining position information of a key point of the second-type part of the target (‘255; figs. 3-5; ¶ 0060; ¶ 0078; Exemplar, which is a known technique for receiving a two-dimensional depth texture image and generating body part proposals as probabilities as to the proper identification of specific body parts within the image; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the location of 31 three-dimensional points); wherein the first-type part and the second-type part are different types of parts (‘255; figs. 3-5; second type part is a limb; ¶ 0078; first type part, bp1 is a head; ¶ 0030; ¶ 0035), and the different types of parts comprise parts with different amplitudes of movement (‘255; figs. 3-5; second type part is a limb; ¶ 0078; first type part, bp1 is a head; ¶ 0030; ¶ 0035; parts with different amplitudes of movement) or parts with different movement fineness (‘255; figs. 3-5; second type part is a limb; ¶ 0078; first type part, bp1 is a head; ¶ 0030; ¶ 0035; parts with different fineness of movement). Williams does not expressly teach obtaining an expression feature of a face of a head and an intensity coefficient of the expression feature, wherein the expression feature comprises movements of first-type features of first-type parts that indicate facial expressions of the target, and the intensity coefficient represents a strength of an expression action corresponding to each of the facial expressions.
Zhang, working in the same field of endeavor, however, teaches obtaining an expression feature of a face of a head (Zhang; Abstract; page 1383, column 2, lines 5-22 and lines 41-47, The MPEG-4 visual standard specifies a set of facial definition parameters (FDPs) and facial animation parameters (FAPs) for facial animation; page 1385, column 1, lines 7-13; Video Analysis: Video analysis is to generate the measurements of FAPs and face pose. The use of 3-D facial shape model and eye detection technique makes our facial feature detection and pose estimation robust under the head motion and non-rigid facial expression. The detected facial feature points are used to produce measurements for face pose and the FAPs as defined in MPEG-4 visual standard) and an intensity coefficient of the expression feature (Zhang; Abstract; page 1389, column 1, ln. 8-31; The two BNs are coupled to unify the facial expression analysis and synthesis into one coherent structure so that the visual evidences observed at the analysis end can be propagated directly to the synthesizer for reconstructing the FAPs and their intensity; page 1390, column 1; equation (9) and associated descriptive text), wherein the expression feature comprises movements of first-type features of first-type parts that indicate facial expressions of the target (Zhang; Abstract; page 1388, column 2, lines 4-16), and the intensity coefficient represents a strength of an expression action corresponding to each of the facial expressions (Zhang; Abstract; page 1388, column 2, lines 4-16; intensity and amplitude) for the benefit of providing important movement metrics to incorporate into an image processing system which recognizes and tracks a user’s movements and displays a simulation of the user (avatar) performing the captured motions in a game or virtual reality environment.
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to have combined the teachings of Zhang for determining an intensity coefficient of an expression feature of a face of a head based on the image with the image processing methods taught by Williams for the benefit of providing important movement metrics to incorporate into an image processing system which recognizes and tracks a user’s movements and displays a simulation of the user (avatar) performing the captured motions in a game or virtual reality environment.
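For illustration only, the role of an intensity coefficient in driving a controlled model can be sketched as follows. The sketch assumes a hypothetical representation in which each facial point has a neutral position and a full-strength displacement for a given expression. The names ExpressionFeature and apply_expression are not taken from Williams or Zhang, and the linear scaling is an assumption rather than the method of either reference.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class ExpressionFeature:
    """Hypothetical container: an expression label plus its intensity coefficient."""
    name: str
    intensity: float  # assumed to lie in [0, 1]; 0 = neutral, 1 = full expression


def apply_expression(
    neutral: Dict[str, Vec3],
    full_offsets: Dict[str, Vec3],
    feature: ExpressionFeature,
) -> Dict[str, Vec3]:
    """Move each model point from its neutral position by the full-strength
    displacement scaled by the intensity coefficient (assumed linear)."""
    k = min(max(feature.intensity, 0.0), 1.0)  # clamp to the assumed range
    posed: Dict[str, Vec3] = {}
    for point, (x, y, z) in neutral.items():
        # Points without a displacement for this expression stay in place.
        dx, dy, dz = full_offsets.get(point, (0.0, 0.0, 0.0))
        posed[point] = (x + k * dx, y + k * dy, z + k * dz)
    return posed


# Example: a half-strength expression moves the point halfway.
neutral_face = {"mouth_corner_left": (1.0, 0.0, 0.0)}
smile_offsets = {"mouth_corner_left": (0.0, 0.5, 0.0)}
posed_face = apply_expression(
    neutral_face, smile_offsets, ExpressionFeature("smile", 0.5)
)
```

Under these assumptions, a larger intensity coefficient produces a proportionally stronger expression action in the controlled model, while the expression feature selects which displacement is applied.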

Claims 2 and 3 (Cancelled).

In regard to claim 4 (Previously Presented), Williams and Zhang teach the method according to claim 1 and further teach wherein obtaining the intensity coefficient of the expression feature comprises: obtaining, based on the image, an intensity coefficient that represents each sub-part in the first-type part (Zhang; Abstract; page 1386, fig. 2, Tables 1 and 2; section B; cast the FAPs and facial action coding system (FACS) into a dynamic Bayesian network (DBN) to account for uncertainties in FAP extraction and to model the dynamic evolution of facial expressions. At the synthesizer, a static BN reconstructs the FAPs and their intensity).

Regarding claim 5 (Previously Presented), Williams and Zhang teach the method according to claim 1 and further teach wherein determining the movement information of the part based on the feature comprises: determining movement information of the head based on the expression feature (Zhang; Abstract; page 1383, column 2, lines 5-22 and lines 41-47, The MPEG-4 visual standard specifies a set of facial definition parameters (FDPs) and facial animation parameters (FAPs) for facial animation; page 1385, column 1, lines 7-13; Video Analysis: Video analysis is to generate the measurements of FAPs and face pose. The use of 3-D facial shape model and eye detection technique makes our facial feature detection and pose estimation robust under the head motion and non-rigid facial expression. The detected facial feature points are used to produce measurements for face pose and the FAPs as defined in MPEG-4 visual standard) and the intensity coefficient (Zhang; Abstract; page 1389, column 1, ln. 8-31; The two BNs are coupled to unify the facial expression analysis and synthesis into one coherent structure so that the visual evidences observed at the analysis end can be propagated directly to the synthesizer for reconstructing the FAPs and their intensity; page 1390, column 1; equation (9) and associated descriptive text); and controlling the movement of the corresponding part in the controlled model according to the movement information comprises: controlling an expression change of a head in the controlled model according to the movement information of the head (Zhang; Abstract; page 1385, fig. 1; facial animation block; page 1392-1394, starting at column 2, section B; figs 14-16).

In regard to claim 6 (Currently Amended), Williams and Zhang teach the method according to claim 1 and further teach wherein three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least features of at least a part of a target; the three-dimensional position of j2, j18 and j20 of fig. 4 for example, are tracked and updated at real-time rates providing at least three features, defined by the specification of the instant application as second-type features, relating to two particular parts, upper and lower left arm parts of a human target, defined by the specification of the instant application as second-type parts, satisfying the requirement of obtaining a second-type feature of a second-type part of a target based on the image).

Regarding claim 7 (Currently Amended), Williams and Zhang teach the method according to claim 1 and further teach wherein obtaining the position information of the key point of the second-type part of the target based on the image comprises: obtaining a first coordinate of a support key point of the second-type part of the target based on the image (‘255; figs. 3-5; ¶ 0078; texture features to identify parts of a human body contained within the received image data from which additional features may be determined - Exemplar, which is a known technique for receiving a two-dimensional depth texture image and generating body part proposals as probabilities as to the proper identification of specific body parts within the image to identify limbs; ¶ 0067; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least features of at least a part of a target; j2, j18 and j20 define an upper limb, where the three-dimensional position of j2 is a first coordinate of a support key point of the second-type part of the target based on the image); and obtaining a second coordinate based on the first coordinate (‘255; ¶ 0067; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least coordinate or position features of at least a part of a target; the three-dimensional positions of j2, j18 and j20 of fig. 4, for example, are tracked and updated at real-time rates, thereby obtaining a second coordinate based on the first coordinate as the coordinates of the skeletal representation are continually updated).

In regard to claim 8 (Original), Williams and Zhang teach the method according to claim 7 and further teach wherein obtaining the first coordinate of the support key point of the second-type part of the target based on the image comprises: obtaining a first 2-Dimensional (2D) coordinate of the support key point of the second-type part based on a 2D image (‘255; figs. 3-5; ¶ 0078; texture features to identify parts of a human body contained within the received image data from which additional features may be determined - Exemplar, which is a known technique for receiving a two-dimensional depth texture image and generating body part proposals as probabilities as to the proper identification of specific body parts within the image to identify limbs; ¶ 0067; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least features of at least a part of a target; j2, j18 and j20 define an upper limb, where the three-dimensional position of j2 is a first coordinate of a support key point of the second-type part of the target based on the image); and obtaining the second coordinate based on the first coordinate comprises: obtaining a first 3-Dimensional (3D) coordinate corresponding to the first 2D coordinate based on the first 2D coordinate and a conversion relationship between a 2D coordinate and a 3D coordinate (‘255; ¶ 0067; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least coordinate or position features of at least a part of a target; the three-dimensional positions of j2, j18 and j20 of fig. 4, for example, are tracked and updated at real-time rates, thereby obtaining a second coordinate based on the first coordinate as the coordinates of the skeletal representation are continually updated).
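For illustration only, one possible conversion relationship between a 2D coordinate and a 3D coordinate is pinhole back-projection of a pixel with a known depth value. The sketch below assumes hypothetical camera intrinsics (fx, fy, cx, cy) and a depth value for the key point; it is not asserted to be the conversion used by Williams or by the instant application.

```python
from typing import Tuple


def backproject(
    u: float, v: float, depth: float,
    fx: float, fy: float, cx: float, cy: float,
) -> Tuple[float, float, float]:
    """Convert a 2D pixel coordinate (u, v) with a depth value into a 3D
    camera-space coordinate using the standard pinhole camera model.

    fx, fy are focal lengths in pixels; cx, cy is the principal point.
    All intrinsics here are assumed values chosen for the example.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# Example: a key point 100 pixels right of the principal point, 2 units away.
point_3d = backproject(420.0, 240.0, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

A key point at the principal point maps onto the optical axis, and horizontal or vertical pixel offsets scale with depth, which is why a depth value (or an equivalent constraint) is needed to recover a 3D coordinate from a 2D one.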

Regarding claim 9 (Original), Williams and Zhang teach the method according to claim 7 and further teach wherein obtaining the first coordinate of the support key point of the second-type part of the target based on the image comprises: obtaining a second 3D coordinate of the support key point of the second-type part of the target based on a 3D image (‘255; figs. 3-5; ¶ 0078; texture features to identify parts of a human body contained within the received image data from which additional features may be determined - Exemplar, which is a known technique for receiving a two-dimensional depth texture image and generating body part proposals as probabilities as to the proper identification of specific body parts within the image to identify limbs; ¶ 0067; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least features of at least a part of a target; j2, j18 and j20 define an upper limb, where the three-dimensional position of j2 is a first coordinate of a support key point of the second-type part of the target based on the image); and obtaining the second coordinate based on the first coordinate comprises: obtaining a third 3D coordinate based on the second 3D coordinate (‘255; ¶ 0067; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least coordinate or position features of at least a part of a target; the three-dimensional positions of j2, j18 and j20 of fig. 4, for example, are tracked and updated at real-time rates, thereby obtaining a second coordinate based on the first coordinate as the coordinates of the skeletal representation are continually updated).

In regard to claim 10 (Original), Williams and Zhang teach the method according to claim 9 and further teach wherein obtaining the third 3D coordinate based on the second 3D coordinate comprises: correcting, based on the second 3D coordinate, a 3D coordinate of a support key point corresponding to an occluded portion of the second-type part in the 3D image, to obtain the third 3D coordinate (‘255; ¶ 0107; As noted above, it may happen that a given joint was not identified either due to occlusion, failure in another subsystem, or some other problem. The centroid-based joint fusion skeletal generator 194a handled this situation with a null candidate. Volumetric model-based tracking expert 194e is a further example of an expert where missing joints and other body parts may be "grown." That is, where there is no good Exemplar and/or historical data for an intermediate joint or an extremity, the neighboring joints and depth data may be examined to interpolate the data for the missing body part to, in effect, grow the body part).
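For illustration only, correcting the coordinate of an occluded support key point from its neighbors can be sketched as a linear interpolation between the adjacent joints. The function name, the interpolation weight t, and the use of a straight-line estimate are assumptions; ¶ 0107 of ‘255 states only that neighboring joints and depth data may be examined to interpolate the data for the missing body part.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def fill_occluded_joint(
    parent: Vec3,
    child: Vec3,
    observed: Optional[Vec3],
    t: float = 0.5,
) -> Vec3:
    """Return the observed 3D coordinate of an intermediate joint, or, when
    the joint is occluded (observed is None), estimate it on the segment
    between its two neighboring joints.

    t is an assumed fractional position along the parent-to-child segment
    (0.5 places the estimate at the midpoint).
    """
    if observed is not None:
        return observed
    px, py, pz = parent
    cx, cy, cz = child
    return (px + t * (cx - px), py + t * (cy - py), pz + t * (cz - pz))


# Example: the middle joint is occluded, so it is estimated from its neighbors.
estimated = fill_occluded_joint((0.0, 0.0, 0.0), (2.0, 4.0, 6.0), None)
```

A fuller treatment would also constrain the estimate with known bone lengths and the depth data, which this sketch omits.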

In regard to claim 12 (Currently Amended), Williams and Zhang teach the method according to claim 1 and further teach wherein obtaining the position information of the key point of the second-type part of the target based on the image comprises: obtaining first position information of a support key point of a first part in the second-type part (‘255; figs. 3-5; ¶ 0078; texture features to identify parts of a human body contained within the received image data from which additional features may be determined - Exemplar, which is a known technique for receiving a two-dimensional depth texture image and generating body part proposals as probabilities as to the proper identification of specific body parts within the image to identify limbs; ¶ 0067; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least features of at least a part of a target; j2, j18 and j20 define an upper limb, where the three-dimensional position of j2 is a first coordinate of a support key point of the second-type part, based on the coordinate of j2 as part of the torso, a first part, all based on the image of the user target); and obtaining second position information of a support key point of a second part in the second-type part (‘255; figs. 3-5; ¶ 0078; texture features to identify parts of a human body contained within the received image data from which additional features may be determined - Exemplar, which is a known technique for receiving a two-dimensional depth texture image and generating body part proposals as probabilities as to the proper identification of specific body parts within the image to identify limbs; ¶ 0067; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least features of at least a part of a target; j2, j18 and j20 define an upper limb and provide second position information of a support key point of a second part in the second-type part of the target based on the image).

Regarding claim 13 (Original), Williams and Zhang teach the method according to claim 12 and further teach wherein determining the movement information of the second-type part based on the position information comprises: determining movement information of the first part according to the first position information (‘255; figs. 3-5; ¶ 0081; trajectories of joint locations are repeatedly sampled and updated over time - movement information of the left upper arm connection portions, j2 and j18, is tracked and, with j2 as part of the torso, reflects the movement information of the first part according to the first position information, for which movement information of the part based on the feature is determined in real-time); and determining movement information of the second part according to the second position information (‘255; figs. 3-5; ¶ 0081; trajectories of joint locations are repeatedly sampled and updated over time - movement information of the left upper arm connection portions, j2 and j18, and left lower arm connection portions at j18 and j20 are tracked as movement information of the connection portion based on the features of j2, j18 and j20 as various traction actions along the left arm ultimately determine the movement information of j18, a connection portion of the part for which movement information of the part based on the feature is determined in real-time).

In regard to claim 14 (Original), Williams and Zhang teach the method according to claim 13 and further teach wherein controlling the movement of the corresponding part in the controlled model according to the movement information comprises: controlling movement of a part in the controlled model corresponding to the first part according to the movement information of the first part (Zhang; Abstract; page 1385, fig. 1; facial animation block; page 1392-1394, starting at column 2, section B; figs 14-16); and controlling movement of a part in the controlled model corresponding to the second part according to the movement information of the second part (‘255; fig. 1A; avatar 19 follows the limbs and other body components of user 18, including part bp1, j32 and j33; ¶ 0039; The user's movements are tracked and used to animate the movements of the avatar 19. In embodiments, the avatar 19 mimics the movements of the user 18 in real world space so that the user 18 may perform movements and gestures which control the movements and actions of the avatar 19 on the display 14).

Regarding claim 15 (Original), Williams and Zhang teach the method according to claim 12 and further teach wherein the first part is a torso (‘255; figs. 3-5; ¶ 0078; ¶ 0095); and/or the second part is an upper limb (‘255; fig. 3; elements bp4-bp5 and bp8-bp9; upper limbs; ¶ 0058-0059), a lower limb (‘255; fig. 3; elements bp11-bp14; lower limbs; ¶ 0058-0059), or four limbs (‘255; fig. 3; elements bp8-bp9 and bp11-bp14; ¶ 0058-0059).

In regard to claim 16 (Currently Amended), Williams teaches an image device (‘255; fig. 19B; ¶ 0172), comprising: a memory (‘255; fig. 19B, element 722; ¶ 0172; RAM and ROM memory) storing computer-executable instructions (‘255; ¶ 0174); and a processor (‘255; fig. 19B, element 759; ¶ 0172) coupled to the memory (‘255; fig. 19B, element 722; ¶ 0172; RAM and ROM memory), wherein the processor (‘255; fig. 19B, element 759; ¶ 0172) is configured to obtain an image (‘255; ¶ 0007; receiving image data from the capture device); obtain a feature of a part of a target based on the image (‘255; figs. 3-5; ¶ 0078; texture features to identify parts of a human body contained within the received image data from which additional features may be determined - Exemplar, which is a known technique for receiving a two-dimensional depth texture image and generating body part proposals as probabilities as to the proper identification of specific body parts within the image to identify limbs; ¶ 0067; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least features of at least a part of a target; the three-dimensional positions of j2, j18 and j20 of fig. 4, for example, are tracked and updated at real-time rates providing at least three features relating to two particular parts, upper and lower left arm parts of a human target satisfying the requirement of obtaining a feature of a part of a target based on the image); determine movement information of the part based on the feature (‘255; figs. 3-5; ¶ 0081; trajectories of joint locations are repeatedly sampled and updated over time - movement information of the left upper arm connection portions, j2 and j18, and left lower arm connection portions at j18 and j20 are tracked as movement information of the connection portion based on the features of j2, j18 and j20 as various traction actions along the left arm ultimately determine the movement information of j18, a connection portion of the part for which movement information of the part based on the feature is determined in real-time); and control movement of a corresponding part in a controlled model according to the movement information (‘255; fig. 1A; avatar 19 follows the limb, connection portion j18 and other body components of user 18; ¶ 0039; The user's movements are tracked and used to animate the movements of the avatar 19. In embodiments, the avatar 19 mimics the movements of the user 18 in real world space so that the user 18 may perform movements and gestures which control the movements and actions of the avatar 19 on the display 14); wherein obtaining the feature of the part of the target based on the image comprises: obtaining a first-type feature (‘255; figs. 3-5; bp1 is a head; ¶ 0030; ¶ 0035; user head gesture as an expression feature – a yes and a no head/facial movement are examples of a first-type feature) of a first-type part of the target (‘255; figs. 3-5; bp1 is the head of the body which is a first-type part of the target as defined by the specification of the instant application) based on the image (‘255; ¶ 0007; received image data from the capture device); and obtain a second-type feature of a second-type part of the target based on the image (‘255; figs. 3-5; ¶ 0078; texture features to identify parts of a human body contained within the received image data from which additional features may be determined - Exemplar, which is a known technique for receiving a two-dimensional depth texture image and generating body part proposals as probabilities as to the proper identification of specific body parts within the image to identify limbs; ¶ 0067; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least features of at least a part of a target; the three-dimensional positions of j2, j18 and j20 of fig. 4, for example, are tracked and updated at real-time rates providing at least three features, defined by the specification of the instant application as second-type features, relating to two particular parts, upper and lower left arm parts of a human target, defined by the specification of the instant application as second-type parts, satisfying the requirement of obtaining a second-type feature of a second-type part of a target based on the image) by obtaining position information of a key point of the second-type part of the target (‘255; figs. 3-5; ¶ 0060; ¶ 0078; Exemplar, which is a known technique for receiving a two-dimensional depth texture image and generating body part proposals as probabilities as to the proper identification of specific body parts within the image; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the location of 31 three-dimensional points); wherein the first-type part and the second-type part are different types of parts (‘255; figs. 3-5; second type part is a limb; ¶ 0078; first type part, bp1 is a head; ¶ 0030; ¶ 0035), and the different types of parts comprise parts with different amplitudes of movement (‘255; figs. 3-5; second type part is a limb; ¶ 0078; first type part, bp1 is a head; ¶ 0030; ¶ 0035; parts with different amplitudes of movement) or parts with different movement fineness (‘255; figs. 3-5; second type part is a limb; ¶ 0078; first type part, bp1 is a head; ¶ 0030; ¶ 0035; parts with different fineness of movement). Williams does not expressly teach obtaining an expression feature of a face of a head and an intensity coefficient of the expression feature, wherein the expression feature comprises movements of first-type features of first-type parts that indicate facial expressions of the target, and the intensity coefficient represents a strength of an expression action corresponding to each of the facial expressions.
Zhang, working in the same field of endeavor, however, teaches obtaining an expression feature of a face of a head (Zhang; Abstract; page 1383, column 2, lines 5-22 and lines 41-47, The MPEG-4 visual standard specifies a set of facial definition parameters (FDPs) and facial animation parameters (FAPs) for facial animation; page 1385, column 1, lines 7-13; Video Analysis: Video analysis is to generate the measurements of FAPs and face pose. The use of 3-D facial shape model and eye detection technique makes our facial feature detection and pose estimation robust under the head motion and non-rigid facial expression. The detected facial feature points are used to produce measurements for face pose and the FAPs as defined in MPEG-4 visual standard) and an intensity coefficient of the expression feature (Zhang; Abstract; page 1389, column 1, ln. 8-31; The two BNs are coupled to unify the facial expression analysis and synthesis into one coherent structure so that the visual evidences observed at the analysis end can be propagated directly to the synthesizer for reconstructing the FAPs and their intensity; page 1390, column 1; equation (9) and associated descriptive text), wherein the expression feature comprises movements of first-type features of first-type parts that indicate facial expressions of the target (Zhang; Abstract; page 1388, column 2, lines 4-16), and the intensity coefficient represents a strength of an expression action corresponding to each of the facial expressions (Zhang; Abstract; page 1388, column 2, lines 4-16; intensity and amplitude) for the benefit of providing important movement metrics to incorporate into an image processing system which recognizes and tracks a user’s movements and displays a simulation of the user (avatar) performing the captured motions in a game or virtual reality environment.
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to have combined the teachings of Zhang for determining an intensity coefficient of an expression feature of a face of a head based on the image with the image processing methods taught by Williams for the benefit of providing important movement metrics to incorporate into an image processing system which recognizes and tracks a user’s movements and displays a simulation of the user (avatar) performing the captured motions in a game or virtual reality environment.

Claims 17 and 18 (Cancelled).

Regarding claim 19 (Currently Amended), Williams and Zhang teach the device according to claim 16 and further teach wherein three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least features of at least a part of a target; the three-dimensional position of j2, j18 and j20 of fig. 4 for example, are tracked and updated at real-time rates providing at least three features, defined by the specification of the instant application as second-type features, relating to two particular parts, upper and lower left arm parts of a human target, defined by the specification of the instant application as second-type parts, satisfying the requirement of obtaining a second-type feature of a second-type part of a target based on the image).
In regard to claim 20 (Currently Amended), Williams teaches a non-transitory computer storage medium (‘255; fig. 19B, elements 738, 753 and 754; ¶ 0173) storing computer-executable instructions (‘255; ¶ 0174) that are executed by a processor (‘255; fig. 19B, element 759; ¶ 0172) to: obtain an image (‘255; figs. 3-5; ¶ 0078; texture features to identify parts of a human body contained within the received image data from which additional features may be determined - Exemplar, which is a known technique for receiving a two-dimensional depth texture image and generating body part proposals as probabilities as to the proper identification of specific body parts within the image to identify limbs; ¶ 0067; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least features of at least a part of a target; the three-dimensional position of j2, j18 and j20 of fig. 4 for example, are tracked and updated at real-time rates providing at least three features relating to two particular parts, upper and lower left arm parts of a human target satisfying the requirement of obtaining a feature of a part of a target based on the image); determine movement information of the part based on the feature (‘255; figs. 
3-5; ¶ 0081; trajectories of joint locations are repeatedly sampled and updated over time - movement information of the left upper arm connection portions, j2 and j18, and left lower arm connection portions at j18 and j20 are tracked as movement information of the connection portion based on the features of j2, j18 and j20 as various traction actions along the left arm ultimately determine the movement information of j18, a connection portion of the part for which movement information of the part based on the feature is determined in real-time); and control movement of a corresponding part in a controlled model according to the movement information (‘255; fig. 1A; avatar 19 follows the limb, connection portion j18 and other body components of user 18; ¶ 0039; The user's movements are tracked and used to animate the movements of the avatar 19. In embodiments, the avatar 19 mimics the movements of the user 18 in real world space so that the user 18 may perform movements and gestures which control the movements and actions of the avatar 19 on the display 14); wherein obtaining the feature of the part of the target based on the image comprises: obtaining a first-type feature (‘255; figs. 3-5; bp1 is a head; ¶ 0030; ¶ 0035; user head gesture as an expression feature – a yes and a no head/facial movement are examples of a first-type feature) of a first-type part of the target (‘255; figs. 3-5; bp1 is the head of the body which is a first-type part of the target as defined by the specification of the instant application) based on the image (‘255; ¶ 0007; received image data from the capture device) obtain a second-type feature of a second-type part of the target based on the image (‘255; figs. 
3-5; ¶ 0078; texture features to identify parts of a human body contained within the received image data from which additional features may be determined - Exemplar, which is a known technique for receiving a two-dimensional depth texture image and generating body part proposals as probabilities as to the proper identification of specific body parts within the image to identify limbs; ¶ 0067; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the three-dimensional location of 31 three-dimensional points, corresponding to locations on the human body thus providing at least features of at least a part of a target; the three-dimensional position of j2, j18 and j20 of fig. 4 for example, are tracked and updated at real-time rates providing at least three features, defined by the specification of the instant application as second-type features, relating to two particular parts, upper and lower left arm parts of a human target, defined by the specification of the instant application as second-type parts, satisfying the requirement of obtaining a second-type feature of a second-type part of a target based on the image) by obtaining position information of a key point of the second-type part of the target (‘255; figs. 3-5; ¶ 0060; ¶ 0078; Exemplar, which is a known technique for receiving a two-dimensional depth texture image and generating body part proposals as probabilities as to the proper identification of specific body parts within the image; ¶ 0080-0081; a state estimate vector x.sub.t which contains the three-dimensional position of every tracked point. In embodiments, the present system may track the location of 31 three-dimensional points); wherein the first-type part and the second-type part are different types of parts (‘255; figs. 
3-5; second type part is a limb; ¶ 0078; first type part, bp1 is a head; ¶ 0030; ¶ 0035), and the different types of parts comprise parts with different amplitudes of movement (‘255; figs. 3-5; second type part is a limb; ¶ 0078; first type part, bp1 is a head; ¶ 0030; ¶ 0035; parts with different amplitudes of movement) or parts with different movement fineness (‘255; figs. 3-5; second type part is a limb; ¶ 0078; first type part, bp1 is a head; ¶ 0030; ¶ 0035; parts with different fineness of movement), and the intensity coefficient represents a strength of an expression action corresponding to each of the facial expressions.
Zhang, working in the same field of endeavor, however, teaches obtaining an expression feature of a face of a head (Zhang; Abstract; page 1383, column 2, lines 5-22 and lines 41-47, The MPEG-4 visual standard specifies a set of facial definition parameters (FDPs) and facial animation parameters (FAPs) for facial animation; page 1385, column 1, lines 7-13; Video Analysis: Video analysis is to generate the measurements of FAPs and face pose. The use of 3-D facial shape model and eye detection technique makes our facial feature detection and pose estimation robust under the head motion and non-rigid facial expression. The detected facial feature points are used to produce measurements for face pose and the FAPs as defined in MPEG-4 visual standard) and an intensity coefficient of the expression feature (Zhang; Abstract; page 1389, column 1, ln. 8-31; The two BNs are coupled to unify the facial expression analysis and synthesis into one coherent structure so that the visual evidences observed at the analysis end can be propagated directly to the synthesizer for reconstructing the FAPs and their intensity; page 1390, column 1; equation (9) and associated descriptive text), wherein the expression feature comprises movements of first-type features of first-type parts that indicate facial expressions of the target (Zhang; Abstract; page 1388, column 2, lines 4-16), and the intensity coefficient represents a strength of an expression action corresponding to each of the facial expressions (Zhang; Abstract; page 1388, column 2, lines 4-16; intensity and amplitude) for the benefit of providing important movement metrics to incorporate into an image processing system which recognizes and tracks a user’s movements and displays a simulation of the user (avatar) performing the captured motions in a game or virtual reality environment.
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to have combined the teachings of Zhang for determining an intensity coefficient of an expression feature of a face of a head based on the image with the image processing methods taught by Williams for the benefit of providing important movement metrics to incorporate into an image processing system which recognizes and tracks a user’s movements and displays a simulation of the user (avatar) performing the captured motions in a game or virtual reality environment.

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Williams et al. (U. S. Patent Application Publication 2013/0243255 A1, already of record, hereafter ‘255) as applied to claims 1, 4-10, 12-16, 19 and 20 above, and in view of Zhang et al. (“Dynamic Facial Expression Analysis and Synthesis With MPEG-4 Facial Animation Parameters”) as applied to claims 1, 4-10, 12-16, 19 and 20 above, and further in view of Matsumiya et al. (U. S. Patent 10,022,628 B1, already of record, hereafter ‘628).

Regarding claim 11 (Original), Williams and Zhang teach the method according to claim 6 but do not teach wherein determining the movement information of the second-type part based on the position information comprises: determining a quaternion of the second-type part based on the position information.
Matsumiya, working in the same field of endeavor, however, teaches wherein determining the movement information of the second-type part based on the position information comprises: determining a quaternion of the second-type part based on the position information (‘628; col. 10, ln. 45-54; the game engine 102 may use the character movement engine 110 to determine the initial position of the movable character. Determining the initial position of the movable character can include determining a position of a number of joints of the character or a position of number of joints of a skeleton of the character. Further, determining the initial position of movable character can include determining a rotation of the joints of the character, such as a quaternion rotation) for the benefit of the increased computational efficiency provided by the compactness of rotation representations as expressed in quaternion form.
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to have combined the teachings of Matsumiya for representing movement information comprising a quaternion with the image processing methods taught by Williams in view of Zhang for the benefit of the increased computational efficiency provided by the compactness of rotation representations as expressed in quaternion form.
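For illustration only, the limitation of determining a quaternion of a part based on position information of its key points can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Matsumiya, Williams, Zhang or the instant application: the function names, the rest direction of the bone and the use of a shortest-arc rotation are all hypothetical choices made for the example.

```python
import math

def normalize(v):
    # Scale a 3-vector to unit length.
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / n for c in v)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def shortest_arc_quaternion(src, dst):
    """Unit quaternion (w, x, y, z) that rotates direction src onto dst."""
    u, v = normalize(src), normalize(dst)
    d = dot(u, v)
    if d < -1.0 + 1e-9:
        # Opposite directions: rotate 180 degrees about any axis
        # perpendicular to u.
        axis = cross(u, (1.0, 0.0, 0.0))
        if dot(axis, axis) < 1e-12:
            axis = cross(u, (0.0, 1.0, 0.0))
        x, y, z = normalize(axis)
        return (0.0, x, y, z)
    c = cross(u, v)
    q = (1.0 + d, c[0], c[1], c[2])
    n = math.sqrt(sum(k * k for k in q))
    return tuple(k / n for k in q)

def bone_quaternion(parent_joint, child_joint,
                    rest_direction=(0.0, -1.0, 0.0)):
    # Hypothetical convention: the bone points along rest_direction in
    # the model's rest pose; the observed bone direction comes from two
    # tracked key point positions (for example, an elbow and a wrist).
    bone = tuple(c - p for p, c in zip(parent_joint, child_joint))
    return shortest_arc_quaternion(rest_direction, bone)

def rotate(q, v):
    # Apply unit quaternion q = (w, x, y, z) to the 3-vector v.
    w, r = q[0], q[1:]
    t = tuple(2.0 * c for c in cross(r, v))
    rt = cross(r, t)
    return tuple(v[i] + w * t[i] + rt[i] for i in range(3))
```

Under these assumptions, bone_quaternion((0, 0, 0), (2, 0, 0)) yields a 90-degree rotation about the z axis, and applying that quaternion to the rest direction with rotate recovers the observed bone direction. A shortest-arc rotation leaves twist about the bone axis undetermined, so a practical system would need further key points or constraints to fix it.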

Claims 21 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Williams et al. (U. S. Patent Application Publication 2013/0243255 A1, already of record, hereafter ‘255) as applied to claims 1, 4-16, 19 and 20 above, and in view of Zhang et al. (“Dynamic Facial Expression Analysis and Synthesis With MPEG-4 Facial Animation Parameters”) as applied to claims 1, 4-16, 19 and 20 above, and further in view of Cook (U. S. Patent 6,657,628 A1, already of record, hereafter ‘628).

Regarding claim 21 (Currently Amended), Williams and Zhang teach the method according to claim 1 but do not teach wherein obtaining the expression feature of the face of the head and the intensity coefficient of the expression feature further comprises: obtaining mesh information representing an expression change of the head, wherein the mesh information is formed by a predetermined number of face key points, and change in a position of an intersection point of a mesh represents the expression change; and obtaining the expression feature of the face of the head and the intensity coefficient of the expression feature based on the mesh information.
Cook, working in the same field of endeavor, however, teaches wherein obtaining the expression feature of the face of the head and the intensity coefficient of the expression feature further comprises: obtaining mesh information representing an expression change of the head (‘628; fig. 7; col. 2, ln. 50-51; a polygon mesh as an underlying model of a physical structure of a character to be animated; col. 8, ln. 39-43; base sets of facial actions allow for the representation of most natural facial expressions. Other parameter sets specifying amounts, degrees, or other motion values of the models may also be utilized; col. 8, ln. 48-56; a smile for example having several muscles (control points) that function together (movement of polygons on the mesh of FIG. 7, for example) to produce the smile), wherein the mesh information is formed by a predetermined number of face key points (‘628; fig. 8, simple facial objects; col. 8, ln. 9-14), and change in a position of an intersection point of a mesh represents the expression change (‘628; fig. 8; col. 8, ln. 1-6; Animation, particularly facial, is mesh deformation achieved through movement of one or more vertices in a specific direction (both unidirectional and bi-directional). Such movement is specified by sets of parameters that describe the range of movement or articulation for a particular segment or control point outlined in the model); and obtaining the expression feature of the face of the head and the intensity coefficient of the expression feature based on the mesh information (‘628; fig. 8; col. 8, ln. 1-6; Animation, particularly facial, is mesh deformation achieved through movement of one or more vertices in a specific direction (both unidirectional and bi-directional); col. 8, ln. 29-37; obtaining the expression feature of the head and the intensity coefficient of the expression feature based on the mesh information) for the benefit of enhancing the communicative abilities of an animated character.
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to have combined the methods for expression feature of the head and the intensity coefficient of the expression feature comprising a mesh representation and feature points for representing movement points as taught by Cook with the image processing methods taught by Williams in view of Zhang for the benefit of enhancing the communicative abilities of an animated character.

In regard to claim 22 (Currently Amended), Williams and Zhang teach the device according to claim 16 but do not teach wherein obtaining the expression feature of the face of the head and the intensity coefficient of the expression feature further comprises: obtaining mesh information representing an expression change of the head, wherein the mesh information is formed by a predetermined number of face key points, and change in a position of an intersection point of a mesh represents the expression change; and obtaining the expression feature of the face of the head and the intensity coefficient of the expression feature based on the mesh information.
Cook, working in the same field of endeavor, however, teaches wherein obtaining the expression feature of the face of the head and the intensity coefficient of the expression feature further comprises: obtaining mesh information representing an expression change of the head (‘628; fig. 7; col. 2, ln. 50-51; a polygon mesh as an underlying model of a physical structure of a character to be animated; col. 8, ln. 39-43; base sets of facial actions allow for the representation of most natural facial expressions. Other parameter sets specifying amounts, degrees, or other motion values of the models may also be utilized; col. 8, ln. 48-56; a smile for example having several muscles (control points) that function together (movement of polygons on the mesh of FIG. 7, for example) to produce the smile), wherein the mesh information is formed by a predetermined number of face key points (‘628; fig. 8, simple facial objects; col. 8, ln. 9-14), and change in a position of an intersection point of a mesh represents the expression change (‘628; fig. 8; col. 8, ln. 1-6; Animation, particularly facial, is mesh deformation achieved through movement of one or more vertices in a specific direction (both unidirectional and bi-directional). Such movement is specified by sets of parameters that describe the range of movement or articulation for a particular segment or control point outlined in the model); and obtaining the expression feature of the face of the head and the intensity coefficient of the expression feature based on the mesh information (‘628; fig. 8; col. 8, ln. 1-6; Animation, particularly facial, is mesh deformation achieved through movement of one or more vertices in a specific direction (both unidirectional and bi-directional); col. 8, ln. 29-37; obtaining the expression feature of the head and the intensity coefficient of the expression feature based on the mesh information) for the benefit of enhancing the communicative abilities of an animated character.
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the invention to have combined the methods for expression feature of the head and the intensity coefficient of the expression feature comprising a mesh representation and feature points for representing movement points as taught by Cook with the image processing methods taught by Williams in view of Zhang for the benefit of enhancing the communicative abilities of an animated character.

Response to Arguments
Applicant’s arguments with respect to claims 1, 4-16 and 19-22 have been considered but are moot because the arguments apply to the amended independent claims and do not reflect the current combination of references and citations used in the current prior art rejection presented above that includes a new ground of rejection necessitated by the Applicant’s amendment.

The Applicant’s arguments filed 10 June 2022 are primarily based upon the extensively amended claim features incorporated into independent claims 1, 16 and 20.

The Examiner respectfully submits that, when Applicant argued against the references as applied to the independent claims, Applicant was arguing against limitations that had not previously been claimed and therefore had been neither examined nor addressed in the previous Office action. Applicant is requested to refer to the rejections set forth above, in which these newly added limitations have now been examined and addressed. In particular, the newly cited Zhang reference is now relied upon to show many of the features added to independent claims 1, 16 and 20.

Independent claims 1, 16 and 20 are rejected as shown in the first claim rejection section above, and the arguments directed to them are addressed immediately above.

Dependent claims 4-15, 19, 21 and 22 are rejected for being dependent upon a rejected base claim and for the additional features that they add as shown in the claim rejection sections above.

Conclusion
The following prior art, made of record, was not relied upon but is considered pertinent to applicant's disclosure:
Tian et al.	"Facial Expression Analysis;" Chapter 11 in "Handbook of Face Recognition" – Facial expression analysis includes both measurement of facial motion and recognition of expression. The general approach to automatic facial expression analysis (AFEA) consists of three steps (Fig. 11.1): face acquisition, facial data extraction and representation, and facial expression recognition.

Lien et al.	"Subtly different facial expression recognition and expression intensity estimation" - We have developed a computer vision system, including both facial feature extraction and recognition, that automatically discriminates among subtly different facial expressions. Expression classification is based on Facial Action Coding System (FACS) action units (AUs), and discrimination is performed using Hidden Markov Models (HMMs). Three methods are developed to extract facial expression information for automatic recognition. The first method is facial feature point tracking using a coarse-to-fine pyramid method. This method is sensitive to subtle feature motion and is capable of handling large displacements with sub-pixel accuracy. The second method is dense flow tracking together with principal component analysis (PCA), where the entire facial motion information per frame is compressed to a low-dimensional weight vector. The third method is high gradient component (i.e., furrow) analysis in the spatiotemporal domain, which exploits the transient variation associated with the facial expression. Upon extraction of the facial information, non-rigid facial expression is separated from the rigid head motion component, and the face images are automatically aligned and normalized using an affine transformation. This system also provides expression intensity estimation, which has significant effect on the actual meaning of the expression.

Mahoor et al.	"A framework for automated measurement of the intensity of non-posed Facial Action Units" - This paper presents a framework to automatically measure the intensity of naturally occurring facial actions. Naturalistic expressions are non-posed spontaneous actions. The Facial Action Coding System (FACS) is the gold standard technique for describing facial expressions, which are parsed as comprehensive, nonoverlapping Action Units (AUs). AUs have intensities ranging from absent to maximal on a six-point metric (i.e., 0 to 5). Despite the efforts in recognizing the presence of non-posed action units, measuring their intensity has not been studied comprehensively. In this paper, we develop a framework to measure the intensity of AU12 (Lip Corner Puller) and AU6 (Cheek Raising) in videos captured from infant-mother live face-to-face communications. The AU12 and AU6 are the most challenging case of infant’s expressions (e.g., low facial texture in infant’s face). One of the problems in facial image analysis is the large dimensionality of the visual data. Our approach for solving this problem is to utilize the spectral regression technique to project high dimensionality facial images into a low dimensionality space. Represented facial images in the low dimensional space are utilized to train Support Vector Machine classifiers to predict the intensity of action units. Analysis of 18 minutes of captured video of non-posed facial expressions of several infants and mothers shows significant agreement between a human FACS coder and our approach, which makes it an efficient approach for automated measurement of the intensity of non-posed facial action units.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Edward Martello whose telephone number is (571) 270-1883.  The examiner can normally be reached on M-F 7:30-5:00 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Xiao Wu can be reached on (571) 272-7761.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/EDWARD MARTELLO/
Primary Examiner, Art Unit 2613