Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 7 and 15-16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.

Claims 7 and 15 depend from dependent claims 6 and 14, respectively (and indirectly from independent claims 1 and 9). They recite “the global position decoder machine-learned model …. the inverse kinematics decoder machine-learned model …”. However, none of the claims from which they depend defines those machine-learned models, so the recited terms lack antecedent basis. These limitations render claims 7 and 15 indefinite.

Dependent claim 16 is rejected because it depends upon dependent claim 15.
	
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 6, 9, 14 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Yang et al. (U.S. Patent Application Publication 2019/0295305 A1) in view of Akhoundi (U.S. Patent Application Publication 2021/0308580 A1).

	Regarding claim 1, Yang discloses a system (Paragraph [0006], the disclosed systems use a neural network with a forward kinematics layer to generate a motion sequence for a target skeleton based on an initial motion sequence for an initial skeleton …) comprising: 
one or more computer processors (Paragraph [0213], FIG. 11 illustrates a block diagram of exemplary computing device 1100 that may be configured to perform one or more of the processes described above. The computing device 1100 can comprise a processor 1102); 
one or more computer memories (Paragraph [0213], the computing device 1100 can comprise a memory 1104); 
a set of instructions stored in the one or more computer memories (Paragraph [0214], the memory 1104 may be a volatile or non-volatile memory used for storing data, metadata, and programs), the set of instructions configuring the one or more computer processors to perform operations (Paragraph [0214], the processor 1102 may retrieve (or fetch) the instructions from an internal register, an internal cache, the memory 1104, or the storage device 1106 and decode and execute them), the operations comprising: 
a single input (Paragraph [0059], FIG. 3A illustrates the retargeted motion system using the motion synthesis neural network 300 to generate predicted joint features of a training target skeleton B based on training input joint features of a training initial skeleton A. FIG. 3B illustrates the retargeted motion system using the motion synthesis neural network 300 to determine a cycle consistency loss; paragraph [0061], input joint features 302a include joint positions 304a and global-motion parameters 306a into an encoder RNN 308), the variable numbers and types of supplied inputs (Paragraph [0061], joint positions 304a and global-motion parameters 306a) corresponding to one or more effector constraints for one or more joints (Paragraph [0061], joint positions 304a for joints of the training initial skeleton A and global-motion parameters 306a for a root joint of the training initial skeleton A) of a character (FIG. 1; paragraph [0045], character 104); 
transforming the single input into a pose embedding (Paragraph [0062], the encoder RNN 308 generates an encoded feature vector 310a … the encoded feature vector 310a comprises encoded representations of the training input joint features 302a), the pose embedding being a machine-learned representation of the single input (Paragraph [0063], the encoder RNN 308 outputs the encoded feature vector 310a as an input for the decoder RNN 312); and 
expanding the pose embedding into a pose representation output (Paragraph [0064], regardless of the kind of reference positions, by analyzing the encoded feature vector 310a, the decoder RNN 312 generates predicted joint rotations 316a for joints of the training target skeleton B; paragraph [0066], after the decoder RNN 312 generates the predicted joint rotations 316a, the forward kinematics layer 318 receives the predicted joint rotations 316a as inputs; paragraph [0067], the forward kinematics layer 318 applies a predicted rotation matrix to each joint of the training target skeleton B to generate predicted joint features 320a), the pose representation output (Paragraph [0068], the predicted joint features 320a) including local rotation data (Paragraph [0068], predicted joint positions 322a for joints of the training target skeleton B; paragraph [0067], the forward kinematics layer 318 applies the predicted joint rotations 316a to joints of the training target skeleton B with the reference joint positions 314) and global position data for the one or more joints of the character (Paragraph [0068], global-motion parameters 324a for a root joint of the training target skeleton B; paragraph [0044], the term “global-motion parameters” refers to velocities and rotation of a skeleton's root joint. In some embodiments, for example, the term “global-motion parameters” refers to velocities in three dimensions (x, y, and z directions) and a rotation of a skeleton's root joint with respect to an axis perpendicular to the ground … As used in this disclosure, the term “root joint” refers to a joint within a skeleton that functions as a reference for other joints within the skeleton. In particular, the term “root joint” refers to a joint within a skeleton having a higher position of hierarchy than all other joints within the skeleton's hierarchy).
However, Yang does not specifically disclose combining variable numbers and types of supplied inputs into a single input.
In a similar field of endeavor, Akhoundi discloses (Abstract, systems and methods are provided for enhanced pose generation based on generative modeling. An example method includes accessing an autoencoder trained based on poses of real-world persons, each pose being defined based on location information associated with joints, with the autoencoder being trained to map an input pose to a feature encoding associated with a latent feature space …; FIG. 1 shows pose representation system 100; paragraph [0043], the pose representation system 100 may train a machine learning model (e.g., an autoencoder) based on the multitude of poses. Thus, the multitude of poses may represent a batch of poses. In some embodiments, there may be a multitude of batches) combining variable numbers and types of supplied inputs into a single input (Paragraph [0038], two poses 102A-102B are illustrated as being included in the pose information 102. While two poses are illustrated, it may be appreciated that thousands, hundreds of thousands, millions, and so on, poses may be input to the pose representation system 100; paragraph [0042], pose A 102A includes joint A 104A, which corresponds to an elbow, and joint B 104B, which corresponds to a knee; paragraph [0043], in some embodiments, a multitude of poses (e.g., hundreds, thousands, and so on) may be provided to the pose representation system 100. As will be described below, the pose representation system 100 may train a machine learning model (e.g., an autoencoder) based on the multitude of poses. Thus, the multitude of poses may represent a batch of poses. Thus, the multitude of input poses is combined into pose information 102 and provided to the pose representation system).
Yang and Akhoundi are analogous art because both pertain to using machine learning to generate the pose of a character. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the system taught by Yang to incorporate the teachings of Akhoundi, applying the pose input taught by Akhoundi to combine a multitude of pose inputs into one set of pose information and to provide it as a single input to the pose representation system for generating reconstructed pose information. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Yang according to the relied-upon teachings of Akhoundi to obtain the invention as specified in claim 1.

	Regarding claim 6, the combination of Yang in view of Akhoundi discloses everything claimed as applied above (see claim 1), and Yang further discloses (Paragraph [0059], FIGS. 3A-3C depict the retargeted motion system training a motion synthesis neural network 300 … FIG. 3C illustrates the retargeted motion system using a discriminator neural network to determine an adversarial loss) wherein the transforming includes applying a machine-learned model (Paragraph [0090], the retargeted motion system includes a discriminator neural network 336) that, during training (Paragraph [0090], the retargeted motion system inputs the predicted joint features 320a and joint offsets 334a into the discriminator neural network 336), uses a combination loss function (Paragraph [0090], the retargeted motion system applies a loss function 340 to determine an adversarial loss 342), the combination loss function including rotation and position error terms via randomized weights (Paragraphs [0096]-[0099], the discriminator neural network 336 determines the realism score 338a using equation (14) … the discriminator neural network 336 determines the realism score 338b using equation (15); paragraph [0101], the retargeted motion system uses equation (16) to switch between adversarial loss and square loss; the retargeted motion system randomly selects training target skeleton B as the training initial skeleton Ẋ1:TB; paragraph [0099], Ẋ1:TB  =[ṗ_hd 1:TB, ṽ1:TB]; ṗ2:TB- ṗ1:T-1B represent predicted positions for joints of the training target skeleton B corresponding to each time of the training target motion sequence; ṽ1:T-1B represents global motion parameters (e.g., velocities and a rotation of the training target skeleton B's root joint) for each time of the training target motion sequence through T-1) based on randomly-generated effector tolerance levels (Paragraph [0102], β represents a balancing term that regulates the strength of a discriminator signal to modify the parameters of a motion synthesis neural network f to fool the discriminator neural network g. In some instances, for example, β=0.001).

	Regarding claim 9, Yang discloses a non-transitory computer-readable storage medium storing a set of instructions (Paragraph [0213], FIG. 11 illustrates a block diagram of exemplary computing device 1100 that may be configured to perform one or more of the processes described above. The computing device 1100 can comprise a processor 1102 and a memory 1104; paragraph [0214], the memory 1104 may be a volatile or non-volatile memory used for storing data, metadata, and programs) that, when executed by one or more computer processors, causes the one or more computer processors to perform operations (Paragraph [0214], the processor 1102 may retrieve (or fetch) the instructions from an internal register, an internal cache, the memory 1104, or the storage device 1106 and decode and execute them), the operations comprising: 
a single input (Paragraph [0059], FIG. 3A illustrates the retargeted motion system using the motion synthesis neural network 300 to generate predicted joint features of a training target skeleton B based on training input joint features of a training initial skeleton A. FIG. 3B illustrates the retargeted motion system using the motion synthesis neural network 300 to determine a cycle consistency loss; paragraph [0061], input joint features 302a include joint positions 304a and global-motion parameters 306a into an encoder RNN 308), the variable numbers and types of supplied inputs (Paragraph [0061], joint positions 304a and global-motion parameters 306a) corresponding to one or more effector constraints for one or more joints (Paragraph [0061], joint positions 304a for joints of the training initial skeleton A and global-motion parameters 306a for a root joint of the training initial skeleton A) of a character (FIG. 1; paragraph [0045], character 104); 
transforming the single input into a pose embedding (Paragraph [0062], the encoder RNN 308 generates an encoded feature vector 310a … the encoded feature vector 310a comprises encoded representations of the training input joint features 302a), the pose embedding being a machine-learned representation of the single input (Paragraph [0063], the encoder RNN 308 outputs the encoded feature vector 310a as an input for the decoder RNN 312); and 
expanding the pose embedding into a pose representation output (Paragraph [0064], regardless of the kind of reference positions, by analyzing the encoded feature vector 310a, the decoder RNN 312 generates predicted joint rotations 316a for joints of the training target skeleton B; paragraph [0066], after the decoder RNN 312 generates the predicted joint rotations 316a, the forward kinematics layer 318 receives the predicted joint rotations 316a as inputs; paragraph [0067], the forward kinematics layer 318 applies a predicted rotation matrix to each joint of the training target skeleton B to generate predicted joint features 320a), the pose representation output (Paragraph [0068], the predicted joint features 320a) including local rotation data (Paragraph [0068], predicted joint positions 322a for joints of the training target skeleton B; paragraph [0067], the forward kinematics layer 318 applies the predicted joint rotations 316a to joints of the training target skeleton B with the reference joint positions 314) and global position data for the one or more joints of the character (Paragraph [0068], global-motion parameters 324a for a root joint of the training target skeleton B; paragraph [0044], the term “global-motion parameters” refers to velocities and rotation of a skeleton's root joint. In some embodiments, for example, the term “global-motion parameters” refers to velocities in three dimensions (x, y, and z directions) and a rotation of a skeleton's root joint with respect to an axis perpendicular to the ground … As used in this disclosure, the term “root joint” refers to a joint within a skeleton that functions as a reference for other joints within the skeleton. In particular, the term “root joint” refers to a joint within a skeleton having a higher position of hierarchy than all other joints within the skeleton's hierarchy).
However, Yang does not specifically disclose combining variable numbers and types of supplied inputs into a single input.
In a similar field of endeavor, Akhoundi discloses (Abstract, systems and methods are provided for enhanced pose generation based on generative modeling. An example method includes accessing an autoencoder trained based on poses of real-world persons, each pose being defined based on location information associated with joints, with the autoencoder being trained to map an input pose to a feature encoding associated with a latent feature space …; FIG. 1 shows pose representation system 100; paragraph [0043], the pose representation system 100 may train a machine learning model (e.g., an autoencoder) based on the multitude of poses. Thus, the multitude of poses may represent a batch of poses. In some embodiments, there may be a multitude of batches) combining variable numbers and types of supplied inputs into a single input (Paragraph [0038], two poses 102A-102B are illustrated as being included in the pose information 102. While two poses are illustrated, it may be appreciated that thousands, hundreds of thousands, millions, and so on, poses may be input to the pose representation system 100; paragraph [0042], pose A 102A includes joint A 104A, which corresponds to an elbow, and joint B 104B, which corresponds to a knee; paragraph [0043], in some embodiments, a multitude of poses (e.g., hundreds, thousands, and so on) may be provided to the pose representation system 100. As will be described below, the pose representation system 100 may train a machine learning model (e.g., an autoencoder) based on the multitude of poses. Thus, the multitude of poses may represent a batch of poses. Thus, the multitude of input poses is combined into pose information 102 and provided to the pose representation system).
Yang and Akhoundi are analogous art because both pertain to using machine learning to generate the pose of a character. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the system taught by Yang to incorporate the teachings of Akhoundi, applying the pose input taught by Akhoundi to combine a multitude of pose inputs into one set of pose information and to provide it as a single input to the pose representation system for generating reconstructed pose information. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Yang according to the relied-upon teachings of Akhoundi to obtain the invention as specified in claim 9.

	Regarding claim 14, the combination of Yang in view of Akhoundi discloses everything claimed as applied above (see claim 9), and Yang further discloses (Paragraph [0059], FIGS. 3A-3C depict the retargeted motion system training a motion synthesis neural network 300 … FIG. 3C illustrates the retargeted motion system using a discriminator neural network to determine an adversarial loss) wherein the transforming includes applying a machine-learned model (Paragraph [0090], the retargeted motion system includes a discriminator neural network 336) that, during training (Paragraph [0090], the retargeted motion system inputs the predicted joint features 320a and joint offsets 334a into the discriminator neural network 336), uses a combination loss function (Paragraph [0090], the retargeted motion system applies a loss function 340 to determine an adversarial loss 342), the combination loss function including rotation and position error terms via randomized weights (Paragraphs [0096]-[0099], the discriminator neural network 336 determines the realism score 338a using equation (14) … the discriminator neural network 336 determines the realism score 338b using equation (15); paragraph [0101], the retargeted motion system uses equation (16) to switch between adversarial loss and square loss; the retargeted motion system randomly selects training target skeleton B as the training initial skeleton Ẋ1:TB; paragraph [0099], Ẋ1:TB  =[ṗ_hd 1:TB, ṽ1:TB]; ṗ2:TB- ṗ1:T-1B represent predicted positions for joints of the training target skeleton B corresponding to each time of the training target motion sequence; ṽ1:T-1B represents global motion parameters (e.g., velocities and a rotation of the training target skeleton B's root joint) for each time of the training target motion sequence through T-1) based on randomly-generated effector tolerance levels (Paragraph [0102], β represents a balancing term that regulates the strength of a discriminator signal to modify the parameters of a motion synthesis neural network f to fool the discriminator neural network g. In some instances, for example, β=0.001).

	Regarding claim 17, Yang discloses a method (Paragraph [0006], this disclosure describes one or more embodiments of methods) comprising: 
a single input (Paragraph [0059], FIG. 3A illustrates the retargeted motion system using the motion synthesis neural network 300 to generate predicted joint features of a training target skeleton B based on training input joint features of a training initial skeleton A. FIG. 3B illustrates the retargeted motion system using the motion synthesis neural network 300 to determine a cycle consistency loss; paragraph [0061], input joint features 302a include joint positions 304a and global-motion parameters 306a into an encoder RNN 308), the variable numbers and types of supplied inputs (Paragraph [0061], joint positions 304a and global-motion parameters 306a) corresponding to one or more effector constraints for one or more joints (Paragraph [0061], joint positions 304a for joints of the training initial skeleton A and global-motion parameters 306a for a root joint of the training initial skeleton A) of a character (FIG. 1; paragraph [0045], character 104); 
transforming the single input into a pose embedding (Paragraph [0062], the encoder RNN 308 generates an encoded feature vector 310a … the encoded feature vector 310a comprises encoded representations of the training input joint features 302a), the pose embedding being a machine-learned representation of the single input (Paragraph [0063], the encoder RNN 308 outputs the encoded feature vector 310a as an input for the decoder RNN 312); and 
expanding the pose embedding into a pose representation output (Paragraph [0064], regardless of the kind of reference positions, by analyzing the encoded feature vector 310a, the decoder RNN 312 generates predicted joint rotations 316a for joints of the training target skeleton B; paragraph [0066], after the decoder RNN 312 generates the predicted joint rotations 316a, the forward kinematics layer 318 receives the predicted joint rotations 316a as inputs; paragraph [0067], the forward kinematics layer 318 applies a predicted rotation matrix to each joint of the training target skeleton B to generate predicted joint features 320a), the pose representation output (Paragraph [0068], the predicted joint features 320a) including local rotation data (Paragraph [0068], predicted joint positions 322a for joints of the training target skeleton B; paragraph [0067], the forward kinematics layer 318 applies the predicted joint rotations 316a to joints of the training target skeleton B with the reference joint positions 314) and global position data for the one or more joints of the character  (Paragraph [0068], global-motion parameters 324a for a root joint of the training target skeleton B; paragraph [0044], the term “global-motion parameters” refers to velocities and rotation of a skeleton's root joint. In some embodiments, for example, the term “global-motion parameters” refers to velocities in three dimensions (x, y, and z directions) and a rotation of a skeleton's root joint with respect to an axis perpendicular to the ground … As used in this disclosure, the term “root joint” refers to a joint within a skeleton that functions as a reference for other joints within the skeleton. In particular, the term “root joint” refers to a joint within a skeleton having a higher position of hierarchy than all other joints within the skeleton's hierarchy).
However, Yang does not specifically disclose combining variable numbers and types of supplied inputs into a single input.
In a similar field of endeavor, Akhoundi discloses (Abstract, systems and methods are provided for enhanced pose generation based on generative modeling. An example method includes accessing an autoencoder trained based on poses of real-world persons, each pose being defined based on location information associated with joints, with the autoencoder being trained to map an input pose to a feature encoding associated with a latent feature space …; FIG. 1 shows pose representation system 100; paragraph [0043], the pose representation system 100 may train a machine learning model (e.g., an autoencoder) based on the multitude of poses. Thus, the multitude of poses may represent a batch of poses. In some embodiments, there may be a multitude of batches) combining variable numbers and types of supplied inputs into a single input (Paragraph [0038], two poses 102A-102B are illustrated as being included in the pose information 102. While two poses are illustrated, it may be appreciated that thousands, hundreds of thousands, millions, and so on, poses may be input to the pose representation system 100; paragraph [0042], pose A 102A includes joint A 104A, which corresponds to an elbow, and joint B 104B, which corresponds to a knee; paragraph [0043], in some embodiments, a multitude of poses (e.g., hundreds, thousands, and so on) may be provided to the pose representation system 100. As will be described below, the pose representation system 100 may train a machine learning model (e.g., an autoencoder) based on the multitude of poses. Thus, the multitude of poses may represent a batch of poses. Thus, the multitude of input poses is combined into pose information 102 and provided to the pose representation system).
Yang and Akhoundi are analogous art because both pertain to using machine learning to generate the pose of a character. It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the system taught by Yang to incorporate the teachings of Akhoundi, applying the pose input taught by Akhoundi to combine a multitude of pose inputs into one set of pose information and to provide it as a single input to the pose representation system for generating reconstructed pose information. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Yang according to the relied-upon teachings of Akhoundi to obtain the invention as specified in claim 17.

Allowable Subject Matter
Claims 2-5, 8, 10-13 and 18-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Dependent claims 2, 10 and 18 depend from independent claims 1, 9 and 17, respectively. They recite the additional limitations “wherein the expanding of the pose embedding into a pose representation output includes the following operations: translating the pose embedding into unconstrained predictions of 3D joint positions for the one or more joints, the translating using a global position decoder machine learned model; and generating the local rotation angles for the one or more joints, the generating using an inverse kinematic decoder machine-learned model”, which further define “the expanding of the pose embedding into a pose representation output” recited in independent claims 1, 9 and 17.
The examiner identified the prior art reference MANGALAM et al. (U.S. Patent Application Publication 2021/0097266 A1). MANGALAM discloses a network architecture for completing detected human poses and for disentangling global and local streams (as shown in FIGS. 4, 5A and 5B). However, MANGALAM does not specifically disclose “generating the local rotation angles for the one or more joints, the generating using an inverse kinematic decoder machine-learned model”. The search results fail to establish that the claims as a whole would have been obvious. None of the cited prior art, alone or in combination, teaches these limitations or provides a motivation to arrive at them.

Dependent claims 3-4 depend from dependent claim 2; dependent claim 8 depends from dependent claim 3; dependent claims 11-12 depend from dependent claim 10; and dependent claims 19-20 depend from dependent claim 18. These claims are objected to for the same reasons, at least because of their respective dependencies on an objected-to claim.

Dependent claims 5 and 13 depend from independent claims 1 and 9, respectively. They recite the additional limitations “wherein the transforming of the single input is performed by a pose encoder machine learned model, wherein the pose encoder machine learned model includes a connected residual neural network topology that includes a plurality of blocks connected using residual connections, and with each block including a prototype layer to combine outputs from the plurality of blocks, and wherein a block of the plurality of blocks includes a plurality of connected neural network layers with residual connections” to specify that “the transforming of the single input is performed by a pose encoder machine learned model” and to describe a connected residual neural network topology for the pose encoder machine learned model.
The search results fail to establish that the claims as a whole would have been obvious. None of the cited prior art, alone or in combination, teaches the limitations recited in claims 5 and 13 or provides a motivation to arrive at them.

	
Examiner’s Comment
Claims 7 and 15-16 have no art rejection but are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph. A final determination of patentability, after further search, will be made upon resolution of the above 35 U.S.C. 112 rejection.

	
Conclusion

	
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Xilin Guo whose telephone number is (571)272-5786. The examiner can normally be reached Monday - Friday 9:00 AM-5:30 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at 571-272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/XILIN GUO/
Primary Examiner, Art Unit 2616