Detailed Action
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1 – 20 are rejected under 35 U.S.C. 103 as being unpatentable over Colbert (US PGPub: US 2019/0287301 A1, Sep. 19, 2019) in view of Lin (US PGPub: US 2019/0043269 A1, Feb. 7, 2019).

Regarding claims 1, 11, 20, Colbert discloses,
a computer-implemented method for generating training items for three-dimensional (3D), one or more non-transitory computer readable media including instructions that, when executed by one or more processors and a system comprising: one or more memories storing instructions; and one or more processors coupled to the one or more memories (synthesizing images of apparel ensembles on models, where neural networks of suitable topology are trained with sets of images, where one image of each set depicts a garment and another pair of images of each set depicts an item of apparel from multiple viewpoints, and a final image of each set depicts a model wearing the garment and the other item of apparel. Once trained, the neural network can synthesize a new image based on input images including an image of a garment and a pair of images of another item of apparel. Quantitative parameters controlling the image synthesis permit adjustment of features of the synthetic image, including skin tone, body shape and pose of the model, as well as characteristics of the garment and other items of apparel. The synthesis process exposes controls that can be used to adjust characteristics of the synthesized image. For example, the skin tone, body shape, pose and other characteristics of the model in the synthesized image may be varied over a useful range. This permits a user - e.g., a customer, to adjust a synthetic “as-worn” image to resemble the user more closely – ABSTRACT, Figs. 1, 2, 4, 7, 8, paragraphs 0007, 0017, 0062), the method comprising:

generating a plurality of posed 3D models based on a plurality of 3D poses and a 3D model of a first person wearing a first costume associated with a plurality of visual attributes (the user constructs three-dimensional models of several items of apparel. These may be, for example, shirts, pants, jackets, dresses, scarves, hats or shoes. The models may include information to support rendering photo-realistic images, such as material characteristics, weaves, colors, patterns, and so on. The models can be automatically manipulated, for example, by simulating their physical characteristics and the effect of forces such as gravity and inertia on their materials– Fig. 8/800, paragraph 0053. Render images of 3D models and construct poseable 3D model of human figure. This figure may include information like height, weight, body-part lengths and girths, hair and skin color, and so on – Fig. 8/820, paragraphs 0054, 0055); 

for each posed 3D model, performing at least one rendering operation to generate at least one synthetic image (a neural network is trained with pairs of images, where one image of a pair shows a garment, and the other image shows a model wearing the garment. Then, a new image is presented to the trained network, and a synthetic image that might be the new image's pair is automatically generated. The synthetic image is displayed to a user – paragraph 0007. These photographic images are provided to the trained neural network, which delivers a corresponding synthetic image that appears to show a model dressed in the apparel of the photographic images – Figs. 4/410, 5/510, 8/880, paragraph 0058); and 

for each synthetic image, generating a training item to include in a synthetic training dataset based on the synthetic image and a 3D pose associated with the posed 3D model from which the synthetic image was rendered (a photo-realistic image of the posed and dressed model is created by rendering 850, and the input images from 810, and the posed and dressed “target” image from 850 are used to train the neural network. Additional training images can be made by re-posing the human-figure model, dressing it again, and rendering another target image – Fig. 8/860, paragraph 0057. The output of network 230, once the GAN has been trained, is a synthetic image 240 that represents what garment 210 may look like when worn by a model – Fig. 2, paragraph 0024), 

but does not disclose pose estimation and
wherein the synthetic training dataset is tailored for training a machine-learning model to compute estimated 3D poses of persons from two-dimensional (2D) input images.
Lin teaches a method for modeling garments using single-view images. The method includes receiving an image depicting a person wearing at least one garment. The method also includes constructing a body model based on the person in the image and a template from a body model database. The method further includes constructing at least one garment model based on the at least one garment in the image and at least one template from a garment model database. The method also includes constructing a combined model based on the body model and the at least one garment model. The method further includes adjusting the combined model by modifying body pose parameters and determining garment material properties and sizing parameters (ABSTRACT, Figs. 2 – 5, 17, 18, paragraphs 0005, 0006).

To estimate the clothing model, we first compute a semantic parse of the garments in the image to identify and localize depicted clothing item (Fig. 2, paragraphs 0035, 0036).
The subject matter described herein synthesize different ideas and extend these methods to process 2D input image and fluidly transfer the results to the simulation of 3D garments. Aspects of the subject matter described herein also include functionality for editing the 2D sewing patterns with information extracted from a single-view image, which can be used to guide the generation of garments of various sizes and styles (paragraph 0041).
Our algorithm, on the other hand, is able to fit the human body from a single 2D image with an optimized virtual outfit recovered from other images. We provide the optimized design pattern together with a 3D view of the garment fitted to the human body (paragraph 0095).

Capturing clothing from images or videos (paragraphs 0044, 0109).
Method for real-time single RGBD image human pose and shape estimation (paragraphs 0042, 0048).

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system for synthesizing images of apparel ensembles on models of Colbert (Colbert, ABSTRACT, Figs. 1, 2, 4, 7, 8, paragraphs 0007, 0017, 0062) to incorporate the teaching of Lin of processing a 2D input image and fluidly transferring the results to the simulation of 3D garments (Lin, ABSTRACT, Figs. 2 – 5, 17, 18, paragraphs 0035, 0036, 0041, 0042, 0044), for application to virtual try-on of garments using a joint material-pose optimization framework that reconstructs both body and cloth models with material properties from a single image (Lin, paragraphs 0002, 0037).
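
For illustration only, the data flow recited in the last three limitations of claim 1 (fit a costumed 3D model to each 3D pose, render each posed model, and pair each synthetic image with the 3D pose it was rendered from) may be sketched as follows. This sketch is not taken from the application, Colbert, or Lin. The names PosedModel, TrainingItem, fit_to_pose, render, and build_dataset are hypothetical, and the renderer is a toy placeholder that only marks projected joint positions on a small grid.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical representation: a 3D pose is one (x, y, z) triple per joint,
# with x and y assumed to be normalized to the range [0, 1].
Pose3D = Tuple[Tuple[float, float, float], ...]


@dataclass(frozen=True)
class PosedModel:
    costume: str   # stand-in for the costumed 3D model of the first person
    pose: Pose3D   # the 3D pose the model was fitted to


@dataclass(frozen=True)
class TrainingItem:
    image: List[List[int]]  # stand-in for a rendered synthetic 2D image
    pose: Pose3D            # 3D pose label associated with the image


def fit_to_pose(costume: str, pose: Pose3D) -> PosedModel:
    """Placeholder for fitting the costumed 3D model to a 3D pose."""
    return PosedModel(costume=costume, pose=pose)


def render(model: PosedModel, size: int = 4) -> List[List[int]]:
    """Toy renderer: drop z and mark each joint on a size x size grid."""
    image = [[0] * size for _ in range(size)]
    for x, y, _z in model.pose:
        col = min(size - 1, max(0, int(x * size)))
        row = min(size - 1, max(0, int(y * size)))
        image[row][col] = 1
    return image


def build_dataset(costume: str, poses: List[Pose3D]) -> List[TrainingItem]:
    """One posed model per pose, one image per posed model, one item per image."""
    items = []
    for pose in poses:
        posed = fit_to_pose(costume, pose)
        image = render(posed)
        items.append(TrainingItem(image=image, pose=posed.pose))
    return items


poses = [((0.1, 0.1, 0.0), (0.6, 0.6, 0.5)), ((0.9, 0.2, 0.3),)]
dataset = build_dataset("full suit", poses)
```

Each resulting item carries both the 2D image and the 3D pose used to produce it, which is the pairing that the final limitation of claim 1 recites for training a pose-estimation model.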

Regarding claims 2, 12, Colbert discloses,

the computer-implemented method of claim 1, wherein each visual attribute included in the plurality of visual attributes comprises a color (garment color – paragraph 0025. Synthetic color image – paragraph 0032. Garments may be grouped and presented by color, style, weight, designer, size, price, or any other desired arrangement – paragraph 0036. The models may include information to support rendering photo-realistic images, such as material characteristics, weaves, colors, patterns, and so on – paragraph 0053), a pattern (garments may be grouped and presented by color, style, weight, designer, size, price, or any other desired arrangement – paragraph 0036. The models may include information to support rendering photo-realistic images, such as material characteristics, weaves, colors, patterns, and so on – paragraph 0053), or a texture.

Regarding claim 3, Colbert discloses,

the computer-implemented method of claim 1, wherein the plurality of visual attributes differentiates between one or more body parts of the first person (a 3D poseable model of a human figure is created 820. This figure may include information like height, weight, body-part lengths and girths, hair and skin color, and so on. In fact, many different human-figure models may be created – Fig. 7, paragraph 0055. Model pose including whether the model is facing the camera or turned to one side, or the position of the arms or legs – paragraph 0026. The “garment” network could produce an image of a model wearing the input garment, and the “shoe” network could produce an image of a model's leg or a pair of legs wearing the input shoe – Fig. 7, paragraph 0047).

Regarding claims 4, 14, Colbert discloses,

the computer-implemented method of claim 1, wherein the first costume comprises at least one of a full suit, pants (pants 712 whose image was provided through input neural network 722 – Fig. 7/712, paragraph 0047. The user constructs three-dimensional models of several items of apparel 800. These may be, for example, shirts, pants, jackets, dresses, scarves, hats or shoes – paragraph 0053), an arm band, or a lower sleeve (Z-Vector elements may also control details such as dress length, sleeve length, or collar style – paragraphs 0026, 0030, 0048).

Regarding claim 5, Colbert discloses,

the computer-implemented method of claim 1, wherein generating the plurality of posed 3D models comprises, for a first 3D pose included in the plurality of 3D poses, fitting the 3D model of the first person wearing the first costume to the first 3D pose to generate a first posed 3D model (3D poseable model of a human figure is created 820. This figure may include information like height, weight, body-part lengths and girths, hair and skin color, and so on. In fact, many different human-figure models may be created – Fig. 7, paragraph 0055).

Regarding claims 6, 15, Colbert discloses,

the computer-implemented method of claim 1, wherein generating the plurality of posed 3D models comprises, for a first 3D pose included in the plurality of 3D poses: modifying at least one shape associated with the 3D model of the first person wearing the first costume to generate a modified 3D model (3D poseable model of a human figure is created 820. This figure may include information like height, weight, body-part lengths and girths, hair and skin color, and so on. In fact, many different human-figure models may be created – Fig. 7, paragraph 0055); and fitting the modified 3D model to the first 3D pose to generate a second posed 3D model (the “garment” network could produce an image of a model wearing the input garment, and the “shoe” network could produce an image of a model's leg or a pair of legs wearing the input shoe – Fig. 7, paragraph 0047).

Regarding claims 7, 16, Colbert discloses,

the computer-implemented method of claim 1, wherein, for a first posed 3D model, performing the at least one rendering operation comprises: rendering the first posed 3D model based on first environmental data to generate a first synthetic image; and rendering the first posed 3D model based on second environmental data to generate a second synthetic image (the “garment” network could produce an image of a model wearing the input garment, and the “shoe” network could produce an image of a model's leg or a pair of legs wearing the input shoe – Fig. 7, paragraph 0047).

Regarding claim 8, Colbert discloses,

the computer-implemented method of claim 1, wherein the machine-learning model comprises a convolutional neural network (neural network – ABSTRACT. Convolutional networks are known to those of ordinary skill in the art, and can be used to good effect by adhering to the principles and approaches described herein – paragraph 0031).

Regarding claims 9, 18, Colbert discloses,

the computer-implemented method of claim 1, further comprising performing at least one training operation on the machine-learning model based on the synthetic training dataset to generate a trained machine-learning model (a method of training a neural network. These training images can be created without lighting artifacts, minor pose variations and other defects that would be present in actual photos of a live model. With this fine-grained control over training images and the ability to automatically produce an arbitrary number of images, the neural networks can be trained efficiently, and with less risk of inadvertently training the network to recognize or respond to irrelevant information – Fig. 8, paragraphs 0015, 0052).

Regarding claim 10, Colbert discloses,

the computer-implemented method of claim 9, further comprising inputting a 2D input image that depicts a second person wearing a second costume associated with the plurality of visual attributes into the trained machine-learning model to generate an estimated 3D pose of the second person (the “garment” network could produce an image of a model wearing the input garment, and the “shoe” network could produce an image of a model's leg or a pair of legs wearing the input shoe – Fig. 7, paragraph 0047. A neural network is trained with pairs of images, where one image of a pair shows a garment, and the other image shows a model wearing the garment. Then, a new image is presented to the trained network, and a synthetic image that might be the new image's pair is automatically generated. The synthetic image is displayed to a user – paragraph 0007).

The examiner notes that once a system is able to display a synthetic image to a user, it can also perform the same steps for the second person and generate an estimated 3D pose for the second person.

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system for synthesizing images of apparel ensembles on models of combined Colbert and Lin (combined Colbert and Lin, ABSTRACT, Figs. 1, 2, 4, 7, 8, paragraphs 0007, 0017, 0062) so that the system and method are used for a single person or a group of persons, according to engineering requirements, system requirements, and/or design options.

Regarding claim 13, Colbert discloses,

the one or more non-transitory computer readable media of claim 11, wherein the plurality of visual attributes differentiates between at least two of a portion of the left arm of the first person, a portion of the right arm of the first person, a portion of the chest of the first person, or a portion of the back of the first person (3D poseable model of a human figure is created 820. This figure may include information like height, weight, body-part lengths and girths, hair and skin color, and so on. In fact, many different human-figure models may be created – Fig. 7, paragraph 0055. Model pose including whether the model is facing the camera or turned to one side, or the position of the arms or legs – paragraph 0026. The “garment” network could produce an image of a model wearing the input garment, and the “shoe” network could produce an image of a model's leg or a pair of legs wearing the input shoe – Fig. 7, paragraph 0047).

Regarding claim 17, Colbert discloses,

the one or more non-transitory computer readable media of claim 16, wherein the first environmental data is associated with at least one of a lighting (these training images can be created without lighting artifacts, minor pose variations and other defects that would be present in actual photos of a live model – paragraph 0052), a camera viewpoint (control model pose - including whether the model is facing the camera or turned to one side, or the position of the arms or legs – paragraph 0026), or a texture.

Regarding claim 19, Colbert discloses,

the one or more non-transitory computer readable media of claim 18, further comprising inputting a 2D input image that depicts a second person wearing a second costume associated with the plurality of visual attributes into the trained machine-learning model to generate an estimated 3D pose of the second person (the “garment” network could produce an image of a model wearing the input garment, and the “shoe” network could produce an image of a model's leg or a pair of legs wearing the input shoe – Fig. 7, paragraph 0047. A neural network is trained with pairs of images, where one image of a pair shows a garment, and the other image shows a model wearing the garment. Then, a new image is presented to the trained network, and a synthetic image that might be the new image's pair is automatically generated. The synthetic image is displayed to a user – paragraph 0007).

The examiner notes that once a system is able to display a synthetic image to a user, it can also perform the same steps for the second person and generate an estimated 3D pose for the second person.

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system for synthesizing images of apparel ensembles on models of combined Colbert and Lin (combined Colbert and Lin, ABSTRACT, Figs. 1, 2, 4, 7, 8, paragraphs 0007, 0017, 0062) so that the system and method are used for a single person or a group of persons, according to engineering requirements, system requirements, and/or design options.

However, Colbert does not disclose wherein the 2D input image comprises a frame of a video sequence acquired via a video camera or a red green blue image acquired via a still camera.

Lin teaches capturing clothing from images or videos (paragraphs 0044, 0109).
Lin also teaches a method for real-time single RGBD image human pose and shape estimation (paragraphs 0042, 0048).

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the system for synthesizing images of apparel ensembles on models of Colbert (Colbert, ABSTRACT, Figs. 1, 2, 4, 7, 8, paragraphs 0007, 0017, 0062) to incorporate the teaching of Lin of processing a 2D input image and fluidly transferring the results to the simulation of 3D garments (Lin, ABSTRACT, Figs. 2 – 5, 17, 18, paragraphs 0035, 0036, 0041, 0042, 0044), for application to virtual try-on of garments using a joint material-pose optimization framework that reconstructs both body and cloth models with material properties from a single image (Lin, paragraphs 0002, 0037).

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

Chen US PGPub: US 2018/0181802 A1 Jun. 28, 2018.
Recognizing combinations of body shape, pose and clothing in three-dimensional input images (ABSTRACT, Figs. 2, 3, 10, paragraph 0007).
The machine learning algorithm 106 used by the training module 102 is an auto-encoder neural network model 500. The auto-encoder neural network model 500 is trained to learn feature descriptors 108 for synthetic training images. The training process involves iteratively modifying a structure of the auto-encoder neural network model 500 to encode different features of a depth map as dimensions in a feature vector. For example, a training process for the auto-encoder neural network model 500 involves one or more modifications such as (but not limited to) changing the number of nodes in the auto-encoder neural network model 500, changing the number of layers in the auto-encoder neural network model 500, changing one or more mapping functions used in the auto-encoder neural network model 500, changing the number of dimensions included in the feature vector 506 that is extracted by the encoder module 504, adjusting the number of dimensions from the feature vector 506 that are used to encode a certain variation (e.g., body pose, viewpoint, body shape, etc.), modifying connections between layers in auto-encoder neural network model 500 (e.g., adding connections between non-successive layers), etc. The auto-encoder neural network model 500 is thereby configured to detect body shape, body pose, and clothing from a depth map of an input image (e.g., a 3D scan) and to generate a feature descriptor describing the body shape, body pose, and clothing (Fig. 5, paragraph 0068).

Kopeinigg US PGPub: US 2020/0312037 A1 Oct. 1, 2020.
Method and system for generating an animated 3D model based on a 2D image. An illustrative volumetric capture system accesses a two-dimensional (“2D”) image captured by a capture device and depicting a first subject of a particular subject type. The volumetric capture system generates a custom three-dimensional (“3D”) model of the first subject by identifying a parameter representative of a characteristic of the first subject, applying the parameter to a parametric 3D model to generate a custom mesh, and applying a custom texture based on the 2D image to the custom mesh (ABSTRACT, Figs. 2, 3, 4, 9, paragraph 0014).

Bleiweiss US PGPub: US 2017/0169620 A1 Jun. 15, 2017.
Bleiweiss teaches generation of synthetic 3-dimensional object images, including techniques for rendering multiple variations of 3-dimensional (3D) object images based on a 3D model of the object. Each rendered 3D image comprises a pair of 2-dimensional (2D) images: one of which provides a color image, where each pixel may have a standard red-green-blue (RGB) value; and the other provides a depth image, where each pixel encodes depth as a grayscale value. 3D images may be referred to as RGB-D images herein to emphasize that they are represented by a pairing of a color image and a depth image. The generated synthetic 3D images can then be used as input to a machine learning system, which in turn outputs a classifier for real world object recognition (ABSTRACT, Figs. 1, 3A, 3B, 5B, paragraph 0013). Camera 690 is configured to provide 2D or 3D images or scans of an object from which a 3D model of the object is generated (Fig. 6/690, paragraph 0050).
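
For illustration only, the RGB-D pairing described above (a color image plus a depth image whose pixels encode depth as grayscale) may be sketched as follows. This sketch is not taken from Bleiweiss. The names RGBDImage, encode_depth_as_gray, and make_rgbd are hypothetical, and the linear mapping of depth to an 8-bit gray value between assumed near and far limits is only one possible encoding.

```python
from dataclasses import dataclass
from typing import List, Tuple

RGB = Tuple[int, int, int]


@dataclass
class RGBDImage:
    color: List[List[RGB]]  # one (red, green, blue) value per pixel
    depth: List[List[int]]  # one grayscale value (0..255) per pixel


def encode_depth_as_gray(depth: List[List[float]], near: float, far: float) -> List[List[int]]:
    """Assumed encoding: map depth linearly from [near, far] to [0, 255], clamped."""
    gray = []
    for row in depth:
        gray_row = []
        for d in row:
            value = 255 * (d - near) / (far - near)
            gray_row.append(int(min(255, max(0, round(value)))))
        gray.append(gray_row)
    return gray


def make_rgbd(color: List[List[RGB]], depth: List[List[float]], near: float, far: float) -> RGBDImage:
    """Pair a color image with a depth image of the same dimensions."""
    if len(color) != len(depth) or any(len(c) != len(d) for c, d in zip(color, depth)):
        raise ValueError("color and depth images must have the same dimensions")
    return RGBDImage(color=color, depth=encode_depth_as_gray(depth, near, far))


color = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (10, 20, 30)]]
depth = [[0.0, 5.0], [2.0, 7.0]]
rgbd = make_rgbd(color, depth, near=0.0, far=5.0)
```

Depth values outside the assumed range are clamped, so the pixel at depth 7.0 receives the same gray value as one at the far limit.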

Kristal US PGPub: US 2018/0047192 A1 Feb. 15, 2018.
The User Handler Module 4 prepares a user image, which can be taken from different client types, for the virtual dressing process. Using one or more of image processing, computer vision and machine learning, this module only needs a single user image that may be invariant, but is not limited to quality and camera angles in order to create a 3D estimation of the user (Fig. 4, paragraphs 0081, 0088).
An electronic device 110 running a client application 110, and may be further shared or sent by the user to selected recipients, such as via the social networks 112. The entity that receives the output of the Universal Dressing Module 8 could also be a system from which a product image was obtained, such as a vendor web server hosting an e-commerce website that offers the product for sale. In that case, a user interface element, such as a “Try It On” button, could be incorporated into the webpage to allow users to try on the product using the functionality of the system 2 (Fig. 38, paragraph 0123). The product images include the previously discussed “Try it On” user interface element that activates the process of Fig. 37 and virtually dresses the product on an image of a user (paragraph 0449).

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NIMESH PATEL whose telephone number is (571)270-1228. The examiner can normally be reached Monday thru Friday: 6:30 AM - 3:30 PM EST. 

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Rafael Perez-Gutierrez can be reached on 571-272-7915. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/NIMESH PATEL/           Primary Examiner, Art Unit 2642