Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Priority
Receipt is acknowledged of certified copies of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 13-15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention.

Claim 13 recites the limitation "VGG-19 neural network" in line 1.  The term "VGG-19" is a relative term which renders the claim indefinite. The term "VGG-19" is not defined by the claim. There is insufficient antecedent basis for this limitation in the claim.



Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 1-3 and 16-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Ayush et al. (U.S. Patent Application Publication 2021/0142539 A1).

	Regarding claim 1, Ayush discloses a server (FIG. 1 server 104) for generating an virtual clothing wearing image based on deep-learning, comprising: 
a communicator (Paragraph [0143], FIG. 14 shows an example computing device 1400 (e.g., the computing device 1100, the client device 108, and/or the server(s) 104), the computing device can comprise a communication interface 1410; paragraph [0149], the communication interface 1410 can provide one or more interfaces for communication (such as, for example, packet-based communication) between the computing device and one or more other computing devices 1400 or one or more networks; paragraph [0043], as shown, the system environment includes server(s) 104, a client device 108 and a network 112.  Each of the components of the system environment can communicate via the network 112, and the network 112 may be any suitable network over which computing devices can communicate.  Example networks are discussed in more detail below in relation to FIG. 14) configured to receive a user image and a clothing image for virtual wearing (Paragraph [0044], the client device 108 can communicate with the server(s) 104 via the network 112.  For example, the client device 108 can receive user input from a user interacting with the client device 108 (e.g., via the client application 110) to request generation of a virtual try-on digital image; paragraph [0045], as shown, the client device 108 includes a client application 110.  In particular, the client application 110 may be a web application, a native application installed on the client device 108 (e.g., a mobile application, a desktop application, etc.), or a cloud-based application where all or part of the functionality is performed by the server(s) 104. For example, the client application 110 can present an online catalog of model digital images and product digital images for browsing.  A user can interact with the client application 110 to provide user input to, for example, change a product worn by a model in a model digital image (e.g., an uploaded digital image of the user). 
Thus,  an uploaded digital image of the user is a user image “for example, as shown in FIG. 3, 302; paragraph [0055], the virtual try-on digital image generation system 102 receives the model digital image 302 as an upload from the client device 108 or captures the model digital image 302 via the client device 108” and a selected product digital image is a clothing image for wearing “for example, as shown in FIG. 3, 304”); 
a memory (Paragraph [0112], FIG. 11 illustrates an example schematic diagram of the virtual try-on digital image generation system 102. the virtual try-on digital image generation system 102 may include a storage manager 1110) configured to store a virtual clothing wearing deep-learning model (Paragraph [0112], the storage manager 1110 can include one or more memory devices that store various data such as model digital images, product digital images, warped versions of a product digital images, neural networks, and/or warping parameters; paragraph [0038], the term "neural network" refers to a machine learning model that can be trained and/or tuned based on inputs to determine classifications or approximate unknown functions.  In particular, the term neural network can include a model of interconnected artificial neurons (e.g., organized in layers) that communicate and learn to approximate complex functions and generate outputs (e.g., determinations of digital image classes) based on a plurality of inputs provided to the neural network.  In addition, a neural network can refer to an algorithm (or set of algorithms) that implements deep learning techniques to model high-level abstractions in data) including a first deep-learning model (Paragraph [0054], FIG. 3 shows the virtual try-on digital image generation system 102 utilizes a coarse regression neural network 308 and fine regression neural network 320 as a multi-stage coarse-to-fine process for generating the fine warped product digital image 322) and a second deep-learning model (Paragraph [0097], FIG. 8 shows the virtual try-on digital image generation system 102 can utilize a neural network 804 to generate the virtual try-on digital image 814); 
a processor (Paragraph [0118], the components of the virtual try-on digital image generation system 102 can include one or more instructions stored on a computer-readable storage medium and executable by processors of one or more computing devices (e.g., the computing device 1100)) configured to generate a virtual wearing image of virtually dressing a cloth (Paragraph [0118], when executed by the one or more processors, the computer-executable instructions of the virtual try-on digital image generation system 102 can cause the computing device 1100 to perform the methods described herein), included in the clothing image, on a user, included in the user image, using the virtual clothing wearing deep-learning model (Paragraph [0097], the virtual try-on digital image generation system 102 can utilize a neural network 804 to generate the virtual try-on digital image 814), wherein the processor is configured to: 
generate, by the first deep-learning model (Paragraph [0054], FIG. 3 show the virtual try-on digital image generation system 102 utilizes a coarse regression neural network 308 and fine regression neural network 320 as a multi-stage coarse-to-fine process for generating the fine warped product digital image 322; paragraph [0060], the virtual try-on digital image generation system 102 inputs the digital image priors 306 and the product digital image 304 into a coarse regression neural network 308), an image (Paragraph [0054], in one or more embodiments, the description of FIG. 3, including the disclosed algorithms, provide the corresponding structure for performing a step for coarse-to-fine warping of the product digital image 304 to align with the model digital image 302) of a transformed virtual wearing clothing by transforming the received clothing image (Paragraph [0060], based on analyzing the digital image priors 306 and the product digital image 304 using its various constituent components/layers, the coarse regression neural network 308 outputs a coarse warped product digital image 318.  In particular, the virtual try-on digital image generation system 102 generates the coarse warped product digital image 318 by (coarsely) modifying one or more portions of the product digital image 304 in accordance with coarse transformation parameters learned by the coarse regression neural network 308.  For example, the virtual try-on digital image generation system 102 modifies the product digital image 304 by moving portions to align with a shape and a pose of the model digital image 302 (as indicated by the digital image priors 306); paragraph [0063], the virtual try-on digital image generation system 102 utilizes the coarse regression neural network 308 to generate the coarse transformation parameters ɵ in the form of a coarse offset matrix.  
In particular, the virtual try-on digital image generation system 102 generates a coarse offset matrix that includes coarse modifications for modifying portions of the product digital image to align with a pose and a shape of the model digital image.  Indeed, different fields of the coarse offset matrix can include different offsets or other transformation parameters that indicate how to modify respective portions of the product digital image 304; paragraph [0065], based on generating the coarse warped product digital image 318 (Istn0), the virtual try-on digital image generation system 102 can further generate a fine warped product digital image 322) in accordance with a body of the user included in the received user image (Paragraph [0056], the virtual try-on digital image generation system 102 accesses or determines digital image priors 306 (Ipriors) for the model digital image 302.  For example, the virtual try-on digital image generation system 102 determines shape priors as an outline of a shape of a model in the model digital image 302 and pose priors as locations of anchor points for joints or other portions of the model in the model digital image 302.  As shown in FIG. 3, the virtual try-on digital image generation system 102 determines shape priors in the form of a white silhouette outlining the shape of the model in the model digital image 302.  The virtual try-on digital image generation system 102 also determines pose priors in the form of points indicating particular portions of the model in the model digital image 302 such as a chin, a head, shoulders, hands, and hips to give an indication of the pose of the model.  The digital image priors 306 can leave out effects of clothes (like color, texture, and shape), while preserving the person's face, hair, body shape, and pose), and 
generate, by the second deep-learning model (Paragraph [0097], FIG. 8 shows the virtual try-on digital image generation system 102 can utilize a neural network 804 to generate the virtual try-on digital image 814), the virtual wearing image (Paragraph [0099], the virtual try-on digital image generation system 102 further generates the virtual try-on digital image 814 by combining the composition mask 810, the rendered person image 812, and the fine warped product digital image 322) by dressing the transformed virtual wearing clothing (Paragraph [0098], the fine warped product digital image 322), generated by the first deep-learning model, on the body of the user of the received user image (Paragraph [0097], for example, the virtual try-on digital image generation system 102 inputs the fine warped product digital image 322, the corrected segmentation mask 608, and texture translation priors 802 of the model digital image 302 into the neural network 804.  The virtual try-on digital image generation system 102 generates or identifies the texture translation priors 802 which can include pixels of the model digital image 302 that are unaffected such as face pixels and pixels of a product not being replaced in the model digital image 302 (e.g., pants in the illustrated case); paragraph [0098], as shown, the virtual try-on digital image generation system 102 utilizes a convolutional encoder 806 to extract features relating to the texture translation priors 802, the corrected segmentation mask 608, and the fine warped product digital image 322.  The virtual try-on digital image generation system 102 further passes these features through an upsampling convolutional decoder 808 (and/or other components/layers) to generate two outputs--an RGB rendered person image 812 and a composition mask 810.  
For example, the neural network 804 produces a 4-channel output where three channels are the R, G, and B values of the rendered person image 812, and the fourth channel is the composite mask 810).

	Regarding claim 2, Ayush discloses everything claimed as applied above (see claim 1), and Ayush further discloses wherein the first deep-learning model includes a first-1 deep-learning model (FIG. 3; paragraph [0054], the virtual try-on digital image generation system 102 utilizes a coarse regression neural network 308) and a first-2 deep-learning model (Paragraph [0054], fine regression neural network 320), 
wherein the first-1 deep-learning model is configured to generate a first-1 transformation virtual wearing clothing image (Paragraph [0055], the virtual try-on digital image generation system 102 identifies or receives a model digital image 302 (Im) and a product digital image 304 (Ip); paragraph [0060], based on analyzing the digital image priors 306 and the product digital image 304 using its various constituent components/layers, the coarse regression neural network 308 outputs a coarse warped product digital image 318) by performing Perspective Transformation to the received clothing image to match with a direction of the body of the user included in the received user image (Paragraph [0056], the virtual try-on digital image generation system 102 accesses or determines digital image priors 306 (Ipriors) for the model digital image 302.  For example, the virtual try-on digital image generation system 102 determines shape priors as an outline of a shape of a model in the model digital image 302 and pose priors as locations of anchor points for joints or other portions of the model in the model digital image 302.  As shown in FIG. 3, the virtual try-on digital image generation system 102 determines shape priors in the form of a white silhouette outlining the shape of the model in the model digital image 302.  The virtual try-on digital image generation system 102 also determines pose priors in the form of points indicating particular portions of the model in the model digital image 302 such as a chin, a head, shoulders, hands, and hips to give an indication of the pose of the model.  The digital image priors 306 can leave out effects of clothes (like color, texture, and shape), while preserving the person's face, hair, body shape, and pose; paragraphs [0057]-[0059, the digital image priors 306 are a 19-channel map of pose and body-shape map …The pose heatmap can comprise an 18-channel feature map with each channel corresponding to a human pose keypoint.  
To leverage the spatial layout, the virtual try-on digital image generation system 102 can transform each keypoint into a heatmap, with an 11x11 neighborhood around the keypoint filled with ones and zeroes everywhere else…; paragraph [0060], the virtual try-on digital image generation system 102 generates the coarse warped product digital image 318 by (coarsely) modifying one or more portions of the product digital image 304 in accordance with coarse transformation parameters learned by the coarse regression neural network 308.  For example, the virtual try-on digital image generation system 102 modifies the product digital image 304 by moving portions to align with a shape and a pose of the model digital image 302 (as indicated by the digital image priors 306); paragraph [0063], the virtual try-on digital image generation system 102 utilizes the coarse regression neural network 308 to generate the coarse transformation parameters ɵ in the form of a coarse offset matrix.  In particular, the virtual try-on digital image generation system 102 generates a coarse offset matrix that includes coarse modifications for modifying portions of the product digital image to align with a pose and a shape of the model digital image.  Indeed, different fields of the coarse offset matrix can include different offsets or other transformation parameters that indicate how to modify respective portions of the product digital image 304), and 
wherein the first-2 deep-learning model is configured to generate an image of first-2 transformation virtual wearing clothing (Paragraph [0065], the virtual try-on digital image generation system 102 can determine fine modifications or fine transformations to make to the coarse warped product digital image 318 to more closely align the depicted product with the shape and the pose of the digital image priors 306 (or, by association, the model digital image 302).  As shown in FIG. 3, the virtual try-on digital image generation system 102 inputs the coarse warped product digital image 318 into a fine regression neural network 320 to generate the fine warped product digital image 322 (Istn1)) by transforming the first-1 transformation virtual wearing clothing (Paragraph [0060], the coarse regression neural network 308 outputs a coarse warped product digital image 318), included in the image of the first-1 transformation virtual wearing clothing, to be matched with a shape of the body of the user included in the received user image (Paragraph [0066]-[0067], the virtual try-on digital image generation system 102 inputs the coarse warped product digital image 318 into a convolutional encoder 323 of the fine regression neural network 320.  The convolutional encoder 323, in turn, encodes or generates feature representations of the coarse warped product digital image 318 including observable features and/or hidden latent features.  The virtual try-on digital image generation system 102 passes the features through a feature correlator 324 along with the digital image priors 306 to determine relationships or correlations between the features of the coarse warped product digital image 318 and the digital image priors 306. Based on these relationships, the virtual try-on digital image generation system 102 can determine how much warping or transformation is still required to align with the digital image priors 306.  
Indeed, the virtual try-on digital image generation system 102 passes the correlated features/relationships to a regressor 326 to determine a difference or a change of the coarse warped product digital image 318 still required to align with the digital image priors 306).
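
For illustration only, the pose priors that the rejection cites from paragraphs [0057]-[0059] of Ayush (an 18-channel keypoint heatmap within a 19-channel prior map, each keypoint marked by an 11x11 neighborhood of ones) can be sketched as follows. This is a minimal sketch, not Ayush's implementation: the function name `pose_heatmaps`, the image size, the keypoint coordinates, and the treatment of the remaining channel as a single body-shape mask are all assumptions.

```python
import numpy as np

def pose_heatmaps(keypoints, height, width, size=11):
    """Build one binary channel per keypoint: ones in a size x size window, zeros elsewhere."""
    half = size // 2
    maps = np.zeros((len(keypoints), height, width), dtype=np.float32)
    for channel, (x, y) in enumerate(keypoints):
        # Clip the window to the image bounds so keypoints near an edge stay valid.
        top, bottom = max(0, y - half), min(height, y + half + 1)
        left, right = max(0, x - half), min(width, x + half + 1)
        maps[channel, top:bottom, left:right] = 1.0
    return maps

# Hypothetical input: 18 (x, y) keypoints on a 256 x 192 image.
keypoints = [(10 * i + 5, 12 * i + 6) for i in range(18)]
pose = pose_heatmaps(keypoints, height=256, width=192)

# Assumed single body-shape channel (left empty here) stacked with the pose channels.
shape_mask = np.zeros((1, 256, 192), dtype=np.float32)
priors = np.concatenate([pose, shape_mask], axis=0)  # 19-channel prior map
```

Clipping the window at the image border is a design choice of this sketch; the cited paragraphs do not say how keypoints near an edge are handled.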

Regarding claim 3, Ayush discloses everything claimed as applied above (see claim 2), and Ayush further discloses wherein the second deep-learning model includes a second-1 deep-learning model (Paragraph [0097], FIG. 8 shows the virtual try-on digital image generation system 102 can utilize a neural network 804) configured to generate a synthesis mask image (Paragraph [0098], the virtual try-on digital image generation system 102 further passes these features through an upsampling convolutional decoder 808 (and/or other components/layers) to generate two outputs-- a composition mask 810) and an intermediate person image (Paragraph [0098], an RGB rendered person image 812. For example, the neural network 804 produces a 4-channel output where three channels are the R, G, and B values of the rendered person image 812) based on the image of first-2 transformation virtual wearing clothing and the received user image (Paragraph [0097], the virtual try-on digital image generation system 102 inputs the fine warped product digital image 322), and 
wherein the second deep-learning model is configured to generate a first virtual wearing person image (Paragraph [0099], using these two outputs, the virtual try-on digital image generation system 102 further generates the virtual try-on digital image 814) by synthesizing the synthesis mask image, the intermediate person image and the image of first-2 transformation virtual wearing clothing (Paragraph [0099], the virtual try-on digital image generation system 102 further generates the virtual try-on digital image 814 by combining the composition mask 810, the rendered person image 812, and the fine warped product digital image 322).  
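
For illustration only, the combination of the composition mask 810, the rendered person image 812, and the fine warped product digital image 322 cited from paragraphs [0098]-[0099] can be sketched as below. The cited passages state only that the three are combined; the per-pixel blend used here (mask times warped product plus one minus mask times rendered person) is an assumption, as are the function name `composite_try_on`, the channel-last layout, and the sample values.

```python
import numpy as np

def composite_try_on(network_output, warped_product):
    """Split an (H, W, 4) output into RGB and mask, then blend with the warped product."""
    rendered_person = network_output[..., :3]   # first three channels: R, G, B
    mask = network_output[..., 3:4]             # fourth channel, kept 3-D for broadcasting
    # Assumed blend: mask selects the warped product, (1 - mask) selects the rendered person.
    return mask * warped_product + (1.0 - mask) * rendered_person

# Hypothetical 4 x 4 example: mask is 1 on the left half and 0 on the right half.
network_output = np.full((4, 4, 4), 0.2, dtype=np.float32)
network_output[:, :2, 3] = 1.0
network_output[:, 2:, 3] = 0.0
warped_product = np.full((4, 4, 3), 0.8, dtype=np.float32)

try_on = composite_try_on(network_output, warped_product)
```

With these sample values, left-half pixels take the warped product value and right-half pixels take the rendered person value.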

Regarding claim 16, Ayush discloses a terminal (FIG. 1 Client device 108), comprising: 
a communicator (Paragraph [0143], FIG. 14 shows an example computing device 1400 (e.g., the computing device 1100, the client device 108, and/or the server(s) 104), the computing device can comprise a communication interface 1410; paragraph [0149], the communication interface 1410 can provide one or more interfaces for communication (such as, for example, packet-based communication) between the computing device and one or more other computing devices 1400 or one or more networks; paragraph [0043], as shown, the system environment includes server(s) 104, a client device 108 and a network 112.  Each of the components of the system environment can communicate via the network 112, and the network 112 may be any suitable network over which computing devices can communicate.  Example networks are discussed in more detail below in relation to FIG. 14) configured to transmit a user image and a clothing image for virtual wearing (Paragraph [0044], the client device 108 can communicate with the server(s) 104 via the network 112.  For example, the client device 108 can receive user input from a user interacting with the client device 108 (e.g., via the client application 110) to request generation of a virtual try-on digital image; paragraph [0045], as shown, the client device 108 includes a client application 110.  In particular, the client application 110 may be a web application, a native application installed on the client device 108 (e.g., a mobile application, a desktop application, etc.), or a cloud-based application where all or part of the functionality is performed by the server(s) 104. For example, the client application 110 can present an online catalog of model digital images and product digital images for browsing.  A user can interact with the client application 110 to provide user input to, for example, change a product worn by a model in a model digital image (e.g., an uploaded digital image of the user). 
Thus,  an uploaded digital image of the user is a user image “for example, as shown in FIG. 3, 302; paragraph [0055], the virtual try-on digital image generation system 102 receives the model digital image 302 as an upload from the client device 108 or captures the model digital image 302 via the client device 108” and a selected product digital image is a clothing image for wearing “for example, as shown in FIG. 3, 304”); 
at least one processor (Paragraph [0143], the computing device can comprise a processor 1402) configured to provide a virtual wearing image of virtually dressing a cloth, included in the clothing image, to a user, included in the user image (Paragraph [0045], as shown, the client device 108 includes a client application 110.  In particular, the client application 110 may be a web application, a native application installed on the client device 108 (e.g., a mobile application, a desktop application, etc.), or a cloud-based application where all or part of the functionality is performed by the server(s) 104. For example, the client application 110 can present an online catalog of model digital images and product digital images for browsing.  A user can interact with the client application 110 to provide user input to, for example, change a product worn by a model in a model digital image (e.g., an uploaded digital image of the user). Thus,  an uploaded digital image of the user is a user image “for example, as shown in FIG. 3, 302), Paragraph [0045], as shown, the client device 108 includes a client application 110); and 
a memory (Paragraph [0143], the computing device can comprise memory 1404) configured to store the virtual clothing wearing service request program that (Paragraph [0145], the memory 1404 may be used for storing data, metadata, and programs for execution by the processor(s)), if executed by the at least one processor, configure the at least one processor to: 
select the user image and the clothing image for virtual wearing (Paragraph [0045], as shown, the client device 108 includes a client application 110.  In particular, the client application 110 may be a web application, a native application installed on the client device 108 (e.g., a mobile application, a desktop application, etc.), or a cloud-based application where all or part of the functionality is performed by the server(s) 104. For example, the client application 110 can present an online catalog of model digital images and product digital images for browsing.  A user can interact with the client application 110 to provide user input to, for example, change a product worn by a model in a model digital image (e.g., an uploaded digital image of the user). Thus, an uploaded digital image of the user is a user image “for example, as shown in FIG. 3, 302; paragraph [0055], the virtual try-on digital image generation system 102 receives the model digital image 302 as an upload from the client device 108 or captures the model digital image 302 via the client device 108” and a selected product digital image is a clothing image for wearing “for example, as shown in FIG. 3, 304”), transmit the selected user image and the selected clothing image using the communicator (Paragraph [0044], the client device 108 can communicate with the server(s) 104 via the network 112.  For example, the client device 108 can receive user input from a user interacting with the client device 108 (e.g., via the client application 110) to request generation of a virtual try-on digital image), and 
receive a virtual wearing person image generated by a virtual clothing wearing server based on deep-learning through the communicator (Paragraph [0046], the server(s) 104 may receive data from the client device 108 in the form of a request to generate a virtual try-on digital image.  In addition, the server(s) 104 can transmit data to the client device 108 to provide a virtual try-on digital image), 
wherein the virtual clothing wearing server based on the deep-learning (Paragraph [0112], the storage manager 1110 can include one or more memory devices that store various data such as model digital images, product digital images, warped versions of a product digital images, neural networks, and/or warping parameters; paragraph [0038], the term "neural network" refers to a machine learning model that can be trained and/or tuned based on inputs to determine classifications or approximate unknown functions.  In particular, the term neural network can include a model of interconnected artificial neurons (e.g., organized in layers) that communicate and learn to approximate complex functions and generate outputs (e.g., determinations of digital image classes) based on a plurality of inputs provided to the neural network.  In addition, a neural network can refer to an algorithm (or set of algorithms) that implements deep learning techniques to model high-level abstractions in data) includes a first deep-learning model (Paragraph [0054], FIG. 3 shows the virtual try-on digital image generation system 102 utilizes a coarse regression neural network 308 and fine regression neural network 320 as a multi-stage coarse-to-fine process for generating the fine warped product digital image 322) and a second deep-learning model (Paragraph [0097], FIG. 8 shows the virtual try-on digital image generation system 102 can utilize a neural network 804 to generate the virtual try-on digital image 814), the first deep-learning model (Paragraph [0054], FIG. 
3 shows the virtual try-on digital image generation system 102 utilizes a coarse regression neural network 308 and fine regression neural network 320 as a multi-stage coarse-to-fine process for generating the fine warped product digital image 322; paragraph [0060], the virtual try-on digital image generation system 102 inputs the digital image priors 306 and the product digital image 304 into a coarse regression neural network 308) is configured to generate an image (Paragraph [0054], in one or more embodiments, the description of FIG. 3, including the disclosed algorithms, provide the corresponding structure for performing a step for coarse-to-fine warping of the product digital image 304 to align with the model digital image 302) of a transformed virtual wearing clothing by transforming the clothing image (Paragraph [0060], based on analyzing the digital image priors 306 and the product digital image 304 using its various constituent components/layers, the coarse regression neural network 308 outputs a coarse warped product digital image 318.  In particular, the virtual try-on digital image generation system 102 generates the coarse warped product digital image 318 by (coarsely) modifying one or more portions of the product digital image 304 in accordance with coarse transformation parameters learned by the coarse regression neural network 308.  For example, the virtual try-on digital image generation system 102 modifies the product digital image 304 by moving portions to align with a shape and a pose of the model digital image 302 (as indicated by the digital image priors 306); paragraph [0063], the virtual try-on digital image generation system 102 utilizes the coarse regression neural network 308 to generate the coarse transformation parameters ɵ in the form of a coarse offset matrix.  
In particular, the virtual try-on digital image generation system 102 generates a coarse offset matrix that includes coarse modifications for modifying portions of the product digital image to align with a pose and a shape of the model digital image.  Indeed, different fields of the coarse offset matrix can include different offsets or other transformation parameters that indicate how to modify respective portions of the product digital image 304; paragraph [0065], based on generating the coarse warped product digital image 318 (Istn0), the virtual try-on digital image generation system 102 can further generate a fine warped product digital image 322) in accordance with a body of the user included in the user image (Paragraph [0056], the virtual try-on digital image generation system 102 accesses or determines digital image priors 306 (Ipriors) for the model digital image 302.  For example, the virtual try-on digital image generation system 102 determines shape priors as an outline of a shape of a model in the model digital image 302 and pose priors as locations of anchor points for joints or other portions of the model in the model digital image 302.  As shown in FIG. 3, the virtual try-on digital image generation system 102 determines shape priors in the form of a white silhouette outlining the shape of the model in the model digital image 302.  The virtual try-on digital image generation system 102 also determines pose priors in the form of points indicating particular portions of the model in the model digital image 302 such as a chin, a head, shoulders, hands, and hips to give an indication of the pose of the model.  The digital image priors 306 can leave out effects of clothes (like color, texture, and shape), while preserving the person's face, hair, body shape, and pose), and the second deep-learning model (Paragraph [0097], FIG. 
8 shows the virtual try-on digital image generation system 102 can utilize a neural network 804 to generate the virtual try-on digital image 814) is configured to generate the virtual wearing person image (Paragraph [0099], the virtual try-on digital image generation system 102 further generates the virtual try-on digital image 814 by combining the composition mask 810, the rendered person image 812, and the fine warped product digital image 322) by dressing the transformed virtual wearing clothing (Paragraph [0098], the fine warped product digital image 322), generated by the first deep-learning model, on the body of the user included in the user image (Paragraph [0097], for example, the virtual try-on digital image generation system 102 inputs the fine warped product digital image 322, the corrected segmentation mask 608, and texture translation priors 802 of the model digital image 302 into the neural network 804.  The virtual try-on digital image generation system 102 generates or identifies the texture translation priors 802 which can include pixels of the model digital image 302 that are unaffected such as face pixels and pixels of a product not being replaced in the model digital image 302 (e.g., pants in the illustrated case); paragraph [0098], as shown, the virtual try-on digital image generation system 102 utilizes a convolutional encoder 806 to extract features relating to the texture translation priors 802, the corrected segmentation mask 608, and the fine warped product digital image 322.  The virtual try-on digital image generation system 102 further pass these features through an upsampling convolutional decoder 808 (and/or other components/layers) to generate two outputs--an RGB rendered person image 812 and a composition mask 810.  For example, the neural network 804 produces a 4-channel output where three channels are the R, G, and B values of the rendered person image 812, and the fourth channel is the composite mask 810).

	Regarding claim 17, Ayush discloses a method for providing a virtual clothing wearing service by a virtual clothing wearing server based on deep-learning, the method comprising: 
obtaining a user image and a clothing image for virtual wearing (FIGS. 1, 11 and 14; paragraph [0044], the client device 108 can communicate with the server(s) 104 via the network 112.  For example, the client device 108 can receive user input from a user interacting with the client device 108 (e.g., via the client application 110) to request generation of a virtual try-on digital image; paragraph [0045], as shown, the client device 108 includes a client application 110.  In particular, the client application 110 may be a web application, a native application installed on the client device 108 (e.g., a mobile application, a desktop application, etc.), or a cloud-based application where all or part of the functionality is performed by the server(s) 104. For example, the client application 110 can present an online catalog of model digital images and product digital images for browsing.  A user can interact with the client application 110 to provide user input to, for example, change a product worn by a model in a model digital image (e.g., an uploaded digital image of the user). Thus,  an uploaded digital image of the user is a user image “for example, as shown in FIG. 3, 302; paragraph [0055], the virtual try-on digital image generation system 102 receives the model digital image 302 as an upload from the client device 108 or captures the model digital image 302 via the client device 108” and a selected product digital image is a clothing image for wearing “for example, as shown in FIG. 3, 304”); 
inputting the user image and the clothing image to a first deep-learning model (Paragraph [0112], the storage manager 1110 can include one or more memory devices that store various data such as model digital images, product digital images, warped versions of a product digital images, neural networks, and/or warping parameters; paragraph [0038], the term "neural network" refers to a machine learning model that can be trained and/or tuned based on inputs to determine classifications or approximate unknown functions.  In particular, the term neural network can include a model of interconnected artificial neurons (e.g., organized in layers) that communicate and learn to approximate complex functions and generate outputs (e.g., determinations of digital image classes) based on a plurality of inputs provided to the neural network.  In addition, a neural network can refer to an algorithm (or set of algorithms) that implements deep learning techniques to model high-level abstractions in data; paragraph [0054], FIG. 3 shows the virtual try-on digital image generation system 102 utilizes a coarse regression neural network 308 and fine regression neural network 320 as a multi-stage coarse-to-fine process for generating the fine warped product digital image 322; paragraph [0060], the virtual try-on digital image generation system 102 inputs the digital image priors 306 and the product digital image 304 into a coarse regression neural network 308); 
outputting, by the first deep-learning model, an image (Paragraph [0054], FIG. 3 shows the virtual try-on digital image generation system 102 utilizes a coarse regression neural network 308 and fine regression neural network 320 as a multi-stage coarse-to-fine process for generating the fine warped product digital image 322; paragraph [0060], the virtual try-on digital image generation system 102 inputs the digital image priors 306 and the product digital image 304 into a coarse regression neural network 308) of a transformed virtual wearing clothing which is transformed (Paragraph [0060], based on analyzing the digital image priors 306 and the product digital image 304 using its various constituent components/layers, the coarse regression neural network 308 outputs a coarse warped product digital image 318.  In particular, the virtual try-on digital image generation system 102 generates the coarse warped product digital image 318 by (coarsely) modifying one or more portions of the product digital image 304 in accordance with coarse transformation parameters learned by the coarse regression neural network 308.  For example, the virtual try-on digital image generation system 102 modifies the product digital image 304 by moving portions to align with a shape and a pose of the model digital image 302 (as indicated by the digital image priors 306); paragraph [0063], the virtual try-on digital image generation system 102 utilizes the coarse regression neural network 308 to generate the coarse transformation parameters ɵ in the form of a coarse offset matrix.  In particular, the virtual try-on digital image generation system 102 generates a coarse offset matrix that includes coarse modifications for modifying portions of the product digital image to align with a pose and a shape of the model digital image.  
Indeed, different fields of the coarse offset matrix can include different offsets or other transformation parameters that indicate how to modify respective portions of the product digital image 304; paragraph [0065], based on generating the coarse warped product digital image 318 (Istn0), the virtual try-on digital image generation system 102 can further generate a fine warped product digital image 322) in accordance with a body of the user included in the user image by the first deep-learning model (Paragraph [0056], the virtual try-on digital image generation system 102 accesses or determines digital image priors 306 (Ipriors) for the model digital image 302.  For example, the virtual try-on digital image generation system 102 determines shape priors as an outline of a shape of a model in the model digital image 302 and pose priors as locations of anchor points for joints or other portions of the model in the model digital image 302.  As shown in FIG. 3, the virtual try-on digital image generation system 102 determines shape priors in the form of a white silhouette outlining the shape of the model in the model digital image 302.  The virtual try-on digital image generation system 102 also determines pose priors in the form of points indicating particular portions of the model in the model digital image 302 such as a chin, a head, shoulders, hands, and hips to give an indication of the pose of the model.  The digital image priors 306 can leave out effects of clothes (like color, texture, and shape), while preserving the person's face, hair, body shape, and pose); 
inputting the user image and the image of the transformed virtual wearing clothing to the second deep-learning model (Paragraph [0097], FIG. 8 shows the virtual try-on digital image generation system 102 can utilize a neural network 804 to generate the virtual try-on digital image 814; for example, the virtual try-on digital image generation system 102 inputs the fine warped product digital image 322, the corrected segmentation mask 608, and texture translation priors 802 of the model digital image 302 into the neural network 804); and 
outputting, by the second deep-learning model, a virtual wearing person image (Paragraph [0099], the virtual try-on digital image generation system 102 further generates the virtual try-on digital image 814 by combining the composition mask 810, the rendered person image 812, and the fine warped product digital image 322) by dressing the transformed virtual wearing clothing (Paragraph [0098], the fine warped product digital image 322), outputted by the first deep-learning model, on the body of the user included in the user image (Paragraph [0097], for example, the virtual try-on digital image generation system 102 inputs the fine warped product digital image 322, the corrected segmentation mask 608, and texture translation priors 802 of the model digital image 302 into the neural network 804.  The virtual try-on digital image generation system 102 generates or identifies the texture translation priors 802 which can include pixels of the model digital image 302 that are unaffected such as face pixels and pixels of a product not being replaced in the model digital image 302 (e.g., pants in the illustrated case); paragraph [0098], as shown, the virtual try-on digital image generation system 102 utilizes a convolutional encoder 806 to extract features relating to the texture translation priors 802, the corrected segmentation mask 608, and the fine warped product digital image 322.  The virtual try-on digital image generation system 102 further passes these features through an upsampling convolutional decoder 808 (and/or other components/layers) to generate two outputs--an RGB rendered person image 812 and a composition mask 810.  For example, the neural network 804 produces a 4-channel output where three channels are the R, G, and B values of the rendered person image 812, and the fourth channel is the composite mask 810).
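
For illustration only, the combining step cited from paragraphs [0098]-[0099] of Ayush can be sketched as follows. The cited passages state that the neural network 804 produces a 4-channel output (R, G, B, and a composition mask) and that the three images are combined, but they do not state the blending rule. The per-pixel blend below (mask times warped product plus one minus mask times rendered person) is an assumption, and the function names `split_output` and `compose` are hypothetical.

```python
def split_output(pixel):
    """Split one 4-channel output pixel into (rgb, mask)."""
    r, g, b, m = pixel
    return (r, g, b), m


def compose(mask, rendered, warped):
    """Blend per pixel: mask * warped + (1 - mask) * rendered.

    This blending rule is an assumption; the cited passages only say
    that the three images are combined.
    """
    out = []
    for m_row, r_row, w_row in zip(mask, rendered, warped):
        out_row = []
        for m, r_px, w_px in zip(m_row, r_row, w_row):
            # Blend each colour channel of the two source pixels.
            out_row.append(tuple(m * w + (1.0 - m) * r
                                 for r, w in zip(r_px, w_px)))
        out.append(out_row)
    return out
```

Under this assumed rule, a mask value of 1 keeps the warped product pixel and a mask value of 0 keeps the rendered person pixel.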

	Regarding claim 18, Ayush discloses everything claimed as applied above (see claim 17), and Ayush further discloses wherein the outputting, by the first deep-learning model, of the image of the transformed virtual wearing clothing includes: 
generating, by a first-1 deep-learning model (FIG. 3; paragraph [0054], the virtual try-on digital image generation system 102 utilizes a coarse regression neural network 308), a first-1 transformation virtual wearing clothing image (Paragraph [0055], the virtual try-on digital image generation system 102 identifies or receives a model digital image 302 (Im) and a product digital image 304 (Ip); paragraph [0060], based on analyzing the digital image priors 306 and the product digital image 304 using its various constituent components/layers, the coarse regression neural network 308 outputs a coarse warped product digital image 318) by performing Perspective Transformation to the clothing image to match with a Paragraph [0056], the virtual try-on digital image generation system 102 accesses or determines digital image priors 306 (Ipriors) for the model digital image 302.  For example, the virtual try-on digital image generation system 102 determines shape priors as an outline of a shape of a model in the model digital image 302 and pose priors as locations of anchor points for joints or other portions of the model in the model digital image 302.  As shown in FIG. 3, the virtual try-on digital image generation system 102 determines shape priors in the form of a white silhouette outlining the shape of the model in the model digital image 302.  The virtual try-on digital image generation system 102 also determines pose priors in the form of points indicating particular portions of the model in the model digital image 302 such as a chin, a head, shoulders, hands, and hips to give an indication of the pose of the model.  
The digital image priors 306 can leave out effects of clothes (like color, texture, and shape), while preserving the person's face, hair, body shape, and pose; paragraphs [0057]-[0059], the digital image priors 306 are a 19-channel map of pose and body-shape map … The pose heatmap can comprise an 18-channel feature map with each channel corresponding to a human pose keypoint.  To leverage the spatial layout, the virtual try-on digital image generation system 102 can transform each keypoint into a heatmap, with an 11x11 neighborhood around the keypoint filled with ones and zeroes everywhere else…; paragraph [0060], the virtual try-on digital image generation system 102 generates the coarse warped product digital image 318 by (coarsely) modifying one or more portions of the product digital image 304 in accordance with coarse transformation parameters learned by the coarse regression neural network 308.  For example, the virtual try-on digital image generation system 102 modifies the product digital image 304 by moving portions to align with a shape and a pose of the model digital image 302 (as indicated by the digital image priors 306); paragraph [0063], the virtual try-on digital image generation system 102 utilizes the coarse regression neural network 308 to generate the coarse transformation parameters ɵ in the form of a coarse offset matrix.  In particular, the virtual try-on digital image generation system 102 generates a coarse offset matrix that includes coarse modifications for modifying portions of the product digital image to align with a pose and a shape of the model digital image.  Indeed, different fields of the coarse offset matrix can include different offsets or other transformation parameters that indicate how to modify respective portions of the product digital image 304).
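
For illustration only, the pose heatmap cited from paragraphs [0057]-[0059] of Ayush (one channel per keypoint, with an 11x11 neighborhood of ones around the keypoint and zeroes elsewhere) can be sketched as follows. The function names, the (x, y) keypoint convention, and the clipping of the neighborhood at the image border are assumptions not stated in the cited passages.

```python
def keypoint_heatmap(height, width, keypoint, size=11):
    """Return a height x width map with ones in a size x size box
    centred on keypoint (x, y) and zeroes everywhere else."""
    half = size // 2
    kx, ky = keypoint
    heat = [[0] * width for _ in range(height)]
    # Clip the box to the image bounds (assumed border handling).
    for y in range(max(0, ky - half), min(height, ky + half + 1)):
        for x in range(max(0, kx - half), min(width, kx + half + 1)):
            heat[y][x] = 1
    return heat


def pose_heatmaps(height, width, keypoints):
    """Stack one heatmap per keypoint, e.g. 18 channels for 18 keypoints."""
    return [keypoint_heatmap(height, width, kp) for kp in keypoints]
```

With 18 keypoints this yields the 18 pose channels; the cited 19-channel map would add one body-shape channel, which is not sketched here.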

	Regarding claim 19, Ayush discloses everything claimed as applied above (see claim 18), and Ayush further discloses wherein the outputting, by the first deep-learning model, of the image of the transformed virtual wearing clothing further includes: 
generating, by a first-2 deep-learning model (FIG. 3; paragraph [0054], fine regression neural network 320), a first-2 transformation virtual wearing clothing image (Paragraph [0065], the virtual try-on digital image generation system 102 can determine fine modifications or fine transformations to make to the coarse warped product digital image 318 to more closely align the depicted product with the shape and the pose of the digital image priors 306 (or, by association, the model digital image 302).  As shown in FIG. 3, the virtual try-on digital image generation system 102 inputs the coarse warped product digital image 318 into a fine regression neural network 320 to generate the fine warped product digital image 322 (Istn1)) by transforming the first-1 transformation virtual wearing clothing (Paragraph [0060], the coarse regression neural network 308 outputs a coarse warped product digital image 318), included in the image of the first-1 transformation virtual wearing clothing, to be matched with a shape of the body of the user, included in the obtained user image (Paragraph [0066]-[0067], the virtual try-on digital image generation system 102 inputs the coarse warped product digital image 318 into a convolutional encoder 323 of the fine regression neural network 320.  The convolutional encoder 323, in turn, encodes or generates feature representations of the coarse warped product digital image 318 including observable features and/or hidden latent features.  The virtual try-on digital image generation system 102 passes the features through a feature correlator 324 along with the digital image priors 306 to determine relationships or correlations between the features of the coarse warped product digital image 318 and the digital image priors 306. Based on these relationships, the virtual try-on digital image generation system 102 can determine how much warping or transformation is still required to align with the digital image priors 306.  
Indeed, the virtual try-on digital image generation system 102 passes the correlated features/relationships to a regressor 326 to determine a difference or a change of the coarse warped product digital image 318 still required to align with the digital image priors 306).

	Regarding claim 20, Ayush discloses everything claimed as applied above (see claim 17), and Ayush discloses further comprising transmitting the virtual wearing image, outputted by the second deep-learning model, to a terminal of the user (FIG. 1; paragraph [0046], the server(s) 104 may receive data from the client device 108 in the form of a request to generate a virtual try-on digital image.  In addition, the server(s) 104 can transmit data to the client device 108 to provide a virtual try-on digital image).


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 4-15 are rejected under 35 U.S.C. 103 as being unpatentable over Ayush et al. (U.S. Patent Application Publication 2021/0142539 A1).

	Regarding claim 4, Ayush discloses everything claimed as applied above (see claim 3), and Ayush further discloses wherein the synthesis mask image is an image that a position of the first-2 transformation virtual wearing clothing is territorialized on the received user image (Paragraph [0087], FIG. 6 illustrates utilizing a neural network 602 to generate a 
corrected segmentation mask 608 (e.g., an expected segmentation map Mexp) in accordance with one or more embodiments.  For instance, to transfer the texture of the fine warped product digital image 322 onto the model digital image 302, the virtual try-on digital image generation system 102 can generate a segmentation mask for the model digital image 302 that indicates a portion (e.g., a number of pixels) of the model digital image 302 that are to be replaced with the texture of the fine warped product digital image 322.  As shown in FIG. 6, the virtual try-on digital image generation system 102 thus generates a corrected segmentation mask 608 for the model digital image 302; FIG. 8; paragraph [0097], the virtual try-on digital image generation system 102 inputs the fine warped product digital image 322, the corrected segmentation mask 608, and texture translation priors 802 of the model digital image 302 into the neural network 804; paragraph [0098], the virtual try-on digital image generation system 102 utilizes a convolutional encoder 806 to extract features relating to the texture translation priors 802, the corrected segmentation mask 608, and the fine warped product digital image 322. The virtual try-on digital image generation system 102 further pass these features through an upsampling convolutional decoder 808 (and/or other components/layers) to generate a composition mask 810), and 
wherein the intermediate person image is an image of an arm and/or a hand of the body of the user (Paragraph [0041], a segmentation mask delineates bounds 
between a portion of a digital image to be replaced (e.g., pixels that depict a shirt to be replaced by a different shirt of a product digital image) and other portions not to be replaced (e.g., pixels that depict pants or arms or neck).  A "corrected segmentation mask (or "conditional segmentation mask") refers to a segmentation mask that has been corrected or conditioned based on a product digital image.  For example, a corrected segmentation mask includes a segmentation mask where the initial portion to be replaced depicted a short-sleeve shirt and the corrected portion to be replaced depicts a long sleeve shirt; paragraph [0089], to generate the corrected segmentation mask 608, the virtual try-on digital image generation system 102 determines how to modify a segmentation mask 610 associated with the digital image priors 306 (or the model digital image 302).  Indeed, the virtual try-on digital image generation system 102 
generates, identifies, or accesses the segmentation mask 610 that, based on the digital image priors 306, indicates portions of the model digital image 302 that are covered by (or depict) a particular product … In one or more embodiments, the virtual try-on digital image generation system generates the segmentation mask 610 using a human parser to compute a segmentation map, where different regions represent different parts of the human body (e.g., arms, shirt, pants, legs); paragraph [0098], the virtual try-on digital image generation system 102 utilizes a convolutional encoder 806 to extract features relating to the texture translation priors 802, the corrected segmentation mask 608, and the fine warped product digital image 322. The virtual try-on digital image generation system 102 further pass these features through an upsampling convolutional decoder 808 (and/or other components/layers) to generate an RGB rendered person image 812) which is generated based on the first-2 transformation virtual wearing clothing (Paragraph [0087], for instance, to transfer the texture of the fine warped product digital image 322 onto the model digital image 302, the virtual try-on digital image generation system 102 can generate a segmentation mask for the model digital image 302 that indicates a portion (e.g., a number of pixels) of the model digital image 302 that are to be replaced with the texture of the fine warped product digital image 322.  As shown in FIG. 6, the virtual try-on digital image generation system 102 thus generates a corrected segmentation mask 608 for the model digital image 302).
	Ayush does not specifically disclose a length of the first-2 transformation virtual wearing clothing.
	However, paragraph [0088] of Ayush discloses “The corrected segmentation mask 608 indicates those pixels of the model digital image 302 that are to be replaced by the fine warped product digital image 322” and the virtual try-on digital image generation system taught by Ayush utilizes a convolutional encoder to extract features relating to the texture translation priors, the corrected segmentation mask and the fine warped product digital image to generate an RGB rendered person image. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to understand that “the corrected segmentation mask indicates those pixels of the model digital image that are to be replaced by the fine warped product digital image” defines a length of the fine warped product digital image to be used for computing different regions represent different parts of the human body “arms” and replacing those pixels of model digital image indicated by the corrected segmentation mask. Thus, the image of arms of the body is generated based on the 

	Regarding claim 5, Ayush discloses everything claimed as applied above (see claim 4), and Ayush further discloses wherein the second deep-learning model further includes a second-2 deep-learning model configured to generate a second virtual wearing person image (Paragraph [0100], FIG. 9 illustrates training the neural network 804 in accordance with one or more embodiments; paragraphs [0101]-[0104], the virtual try-on digital image generation system 102 further performs a comparison 912 to compare the predicted composite mask 908, the predicted virtual try-on digital image 910, and a ground truth segmentation mask 710 (accessed from the database 406) … As shown, the virtual try-on digital image generation system 102 also performs back propagation 914 to modify weights or parameters associated with the neural network 804.  By modifying the weights/parameters, the virtual try-on digital image generation system 102 changes how the neural network 804 analyzes the inputs to generate outputs. Thus, the second predicted virtual try-on digital image 910 is generated) which is generated by a plurality of dilated convolutions and based on the user image and the first virtual wearing person image (Paragraph [0100], the virtual try-on digital image generation system 102 accesses training data such as a fine warped product digital image 902, a corrected segmentation mask 904, and a model digital image 906 (or texture translation priors of the model digital image 906) to input into the neural network 804.  Based on analyzing these three inputs, the neural network 804 generates a predicted composite mask 908 and (as described above in relation to FIG. 8) a predicted virtual try-on digital image 910 in accordance with the weights and parameters of the components and layers of the neural network 804).  

	Regarding claim 6, Ayush discloses everything claimed as applied above (see claim 5), and Ayush further discloses wherein the communicator is configured to receive (FIG. 1; paragraph [0044], the client device 108 can communicate with the server(s) 104 via the network 112.  For example, the client device 108 can receive user input from a user interacting with the client device 108 (e.g., via the client application 110) to request generation of a virtual try-on digital image; FIG. 9 illustrates training the neural network 804 of the virtual try-on digital image generation system 102) a training data set including a person, a cloth for virtual wearing, a transformation clothing truth image when wearing (Paragraph [0100], the virtual try-on digital image generation system 102 accesses training data such as a fine warped product digital image 902, a corrected segmentation mask 904, and a model digital image 906 (or texture translation priors of the model digital image 906) to input into the neural network 804), and a Truth Label for the person dressing the cloth for virtual wearing (Paragraph [0102], the virtual try-on digital image generation system 102 further performs a comparison 912 to compare the predicted composite mask 908, the predicted virtual try-on digital image 910, and a ground truth segmentation mask 710 (accessed from the database 406).  In particular, the virtual try-on digital image generation system 102 performs the comparison 912 by utilizing one or more loss functions such as a texture translation loss function and/or a dueling triplet loss function; paragraph [0102], the virtual try-on digital image generation system 102 can implement a texture transfer loss, which includes other loss components such as a perceptual distance loss and a mask loss.  
For example, Ll1=|Itry-on-Im| represents an L1 distance loss; Itry-on represents the predicted virtual try-on digital image 910; paragraph [0106], as training progresses, this online hard negative training strategy helps the virtual try-on digital image generation system 102 push the results closer to the ground truth by updating the negative at discrete step intervals (T steps)).  

	Regarding claim 7, Ayush discloses everything claimed as applied above (see claim 6), and Ayush further discloses perform training of the first-1 deep-learning model using a first-1 model loss of comparing the first-1 transformation virtual wearing clothing image generated by the first-1 deep-learning model and the transformation clothing truth image when virtually wearing for the person and the cloth for virtual wearing of the training data set (Paragraph [0079], as illustrated in FIG. 5, the virtual try-on digital image generation system 102 further compares the feature representations of these three digital images utilizing a perceptual geometric matching loss.  More particularly, the virtual try-on digital image generation system 102 subjects the interim Istn0 (the predicted coarse warped product digital image 408); paragraph [0080], to elaborate, the virtual try-on digital image generation system 102 can utilize a warp loss function that includes a perceptual geometric matching loss, as represented by: Ls0=|Igt-warp-Istn0|), and 
perform training of the first deep-learning model using a first-2 model loss of comparing the image of the first-2 transformation virtual wearing clothing generated by the first-2 deep-learning model and the transformation clothing truth image when virtually wearing for the person and the cloth for virtual wearing of the training data set, when training the first-1 deep-learning model (Paragraph [0079], the virtual try-on digital image generation system 102 further compares the feature representations of these three digital images utilizing a perceptual geometric matching loss.  More particularly, the virtual try-on digital image generation system 102 subjects final Istn1 (the predicted fine warped product digital image 410); paragraph [0080], to elaborate, the virtual try-on digital image generation system 102 can utilize a warp loss function that includes a perceptual geometric matching loss, as represented by: Ls1=|Igt-warp-Istn1|).
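
For illustration only, the two loss terms quoted from paragraph [0080] of Ayush, Ls0=|Igt-warp-Istn0| and Ls1=|Igt-warp-Istn1|, can be sketched as L1 distances. The sketch assumes each image is flattened to a list of values and that the reduction is a plain sum; neither assumption is stated in the cited passages, and the function names are hypothetical. The perceptual geometric matching component in the VGG-19 feature space is not sketched.

```python
def l1_distance(image_a, image_b):
    """Sum of absolute per-value differences between two equally sized images."""
    if len(image_a) != len(image_b):
        raise ValueError("images must have the same number of values")
    return sum(abs(a - b) for a, b in zip(image_a, image_b))


def warp_losses(gt_warp, coarse, fine):
    """Return (Ls0, Ls1) for the coarse and fine warped predictions,
    each measured against the ground truth warped image."""
    return l1_distance(gt_warp, coarse), l1_distance(gt_warp, fine)
```

In this sketch, a fine prediction that lies closer to the ground truth than the coarse prediction gives Ls1 smaller than Ls0.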

	Regarding claim 8, Ayush discloses everything claimed as applied above (see claim 6), and Ayush further discloses wherein the processor is configured to perform training of the first deep-learning model using a grid interval consistency loss (Paragraph [0079], as illustrated in FIG. 5, the virtual try-on digital image generation system 102 further compares the feature representations of these three digital images utilizing a perceptual geometric matching loss.  More particularly, the virtual try-on digital image generation system 102 subjects the interim Istn0 (the predicted coarse warped product digital image 408) and final Istn1 (the predicted fine warped product digital image 410) output to a warp loss Lwarp against Igt-warp (the ground truth warped product digital image 414).  The warp loss Lwarp includes a perceptual geometric matching loss component Lpgm.  By utilizing the warp loss, the virtual try-on digital image generation system 102 causes the fine regression neural network 320 to incrementally improve upon the warping modifications (e.g., the coarse transformation parameters ɵ) of the coarse regression neural network 308) based on a distance between pixels of an image of the cloth for virtual wearing (Paragraph [0080], indeed, FIG. 5 illustrates the respective feature representations of Istn0, Istn1, and Igt-warp in a VGG-19 feature space, where d0, d1, and d01 represent distances or difference vectors between the feature representations in the feature space as shown.  To elaborate, the virtual try-on digital image generation system 102 can utilize a warp loss function that includes a perceptual geometric matching loss).

	Regarding claim 9, Ayush discloses everything claimed as applied above (see claim 8), and Ayush further discloses wherein the processor is configured to generate an occlusion clothing image on which an occlusion part is removed from the transformation clothing truth image through an occlusion process (Paragraph [0087], FIG. 6 illustrates utilizing a neural network 602 to generate a corrected segmentation mask 608 (e.g., an expected segmentation map Mexp) in accordance with one or more embodiments.  For instance, to transfer the texture of the fine warped product digital image 322 onto the model digital image 302, the virtual try-on digital image generation system 102 can generate a segmentation mask for the model digital image 302 that indicates a portion (e.g., a number of pixels) of the model digital image 302 that are to be replaced with the texture of the fine warped product digital image 322.  As shown in FIG. 6, the virtual try-on digital image generation system 102 thus generates a corrected segmentation mask 608 for the model digital image 302; FIG. 7 illustrates training the neural network 602 in accordance with one or more embodiments; paragraph [0093], the virtual try-on digital image generation system 102 performs a comparison 708 to compare the predicted segmentation mask 706 with a ground truth segmentation mask 710.  Indeed, the virtual try-on digital image generation system 102 accesses a ground truth segmentation mask 710 that corresponds to the model digital image priors 702 and/or the product digital image 704 from the database 406.  Thus, the virtual try-on digital image generation system 102 can utilize a cross entropy loss function to compare the predicted segmentation mask 706 with the ground truth segmentation mask 710 to thereby determine a measure of loss associated with the neural network 602.  
For instance, the virtual try-on digital image generation system 102 can utilize a cross entropy loss function for semantic segmentation with increased weights for skin classes (to better handle occlusion cases) and background classes (to stem bleeding of skin pixels into other pixels)), and 
wherein the first deep-learning model is configured to use the occlusion clothing image when training using the first-2 model loss (Paragraph [0079], the virtual try-on digital image generation system 102 further compares the feature representations of these three digital images utilizing a perceptual geometric matching loss.  More particularly, the virtual try-on digital image generation system 102 subjects final Istn1 (the predicted fine warped product digital image 410); paragraph [0080], to elaborate, the virtual try-on digital image generation system 102 can utilize a warp loss function that includes a perceptual geometric matching loss, as represented by: Ls1=|Igt-warp-Istn1|).
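For illustration only, the L1 term quoted above from paragraph [0080] of Ayush, Ls1=|Igt-warp-Istn1|, can be sketched as a per-pixel absolute difference between two images. The function name `l1_loss`, the nested-list image representation, and the averaging over pixels are assumptions made for this sketch; they are not taken from Ayush or from the claims.

```python
def l1_loss(pred, target):
    """Mean absolute difference between two equally shaped 2-D images.

    Both images are lists of rows of numbers (an assumed, simplified format).
    Averaging over pixels is an assumption; a plain sum is equally plausible.
    """
    same_rows = len(pred) == len(target)
    if not same_rows or any(len(p) != len(t) for p, t in zip(pred, target)):
        raise ValueError("images must have the same shape")
    # Accumulate |target - pred| over every pixel position.
    total = sum(abs(t - p)
                for pred_row, target_row in zip(pred, target)
                for p, t in zip(pred_row, target_row))
    count = sum(len(row) for row in pred)
    return total / count


# Hypothetical stand-ins for Istn1 (predicted warp) and Igt-warp (ground truth).
predicted_warp = [[0.0, 1.0], [2.0, 3.0]]
ground_truth_warp = [[0.0, 0.0], [2.0, 5.0]]
loss = l1_loss(predicted_warp, ground_truth_warp)
```

In this toy case two of the four pixels differ, by 1 and by 2, so the averaged loss is 0.75.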

	Regarding claim 10, Ayush discloses everything claimed as applied above (see claim 9), and Ayush further discloses wherein the processor is configured to: 
generate a first discrimination image based on the image of the first-2 transformation virtual wearing clothing (Paragraph [0087], FIG. 6 illustrates utilizing a neural network 602 to generate a corrected segmentation mask 608 (e.g., an expected segmentation map Mexp) in accordance with one or more embodiments.  For instance, to transfer the texture of the fine warped product digital image 322 onto the model digital image 302, the virtual try-on digital image generation system 102 can generate a segmentation mask for the model digital image 302 that indicates a portion (e.g., a number of pixels) of the model digital image 302 that are to be replaced with the texture of the fine warped product digital image 322.  As shown in FIG. 6, the virtual try-on digital image generation system 102 thus generates a corrected segmentation mask 608 for the model digital image 302), and 
perform training of the first deep-learning model using a first adversarial loss based on the first discrimination image (Paragraph [0092], FIG. 7 illustrates training the neural network 602 in accordance with one or more embodiments.  As shown, the virtual try-on digital image generation system 102 inputs, from the database 406, model digital image priors and a product digital image 704 into the neural network 602.  The neural network 602 analyzes the model digital image priors 702 and the product digital image 704 to generate a predicted segmentation mask 706.  The predicted segmentation mask 706 represents a prediction of what the neural network 602 expects for a segmentation mask according to its various weights and parameters; paragraph [0093], the virtual try-on digital image generation system 102 performs a comparison 708 to compare the predicted segmentation mask 706 with a ground truth segmentation mask 710.  Indeed, the virtual try-on digital image generation system 102 accesses a ground truth segmentation mask 710 that corresponds to the model digital image priors 702 and/or the product digital image 704 from the database 406.  Thus, the virtual try-on digital image generation system 102 can utilize a cross entropy loss function to compare the predicted segmentation mask 706 with the ground truth segmentation mask 710 to thereby determine a measure of loss associated with the neural network 602.  For instance, the virtual try-on digital image generation system 102 can utilize a cross entropy loss function for semantic segmentation with increased weights for skin classes (to better handle occlusion cases) and background classes (to stem bleeding of skin pixels into other pixels)).
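For illustration only, the weighted cross entropy comparison cited above from paragraph [0093] of Ayush can be sketched as follows. The function name `weighted_cross_entropy`, the class ids, the weight values, and the normalization by the sum of weights are all assumptions made for this sketch; Ayush states only that skin and background classes receive increased weights.

```python
import math


def weighted_cross_entropy(probs, labels, class_weights):
    """Class-weighted cross entropy over a flat list of pixels.

    probs[i] is the predicted class distribution for pixel i,
    labels[i] is the ground-truth class id for pixel i, and
    class_weights[c] is the (assumed) weight for class c.
    """
    total = 0.0
    weight_sum = 0.0
    for dist, label in zip(probs, labels):
        weight = class_weights[label]
        # Clamp the probability to avoid log(0).
        total += -weight * math.log(max(dist[label], 1e-12))
        weight_sum += weight
    # Normalizing by the weight sum is an assumption of this sketch.
    return total / weight_sum


# Hypothetical class ids: 0 = clothing, 1 = skin (given a larger weight).
pixel_probs = [[0.5, 0.5], [0.25, 0.75]]
pixel_labels = [0, 1]
weights = [1.0, 3.0]
loss = weighted_cross_entropy(pixel_probs, pixel_labels, weights)
```

Raising the weight of a class makes errors on pixels of that class contribute more to the loss, which is the effect the cited passage attributes to the increased skin and background weights.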

	Regarding claim 11, Ayush discloses everything claimed as applied above (see claim 10), and Ayush further discloses wherein the processor is configured to generate a second virtual wearing person image (Paragraph [0100], FIG. 9 illustrates training the neural network 804 in accordance with one or more embodiments; paragraphs [0101]-[0104], the virtual try-on digital image generation system 102 further performs a comparison 912 to compare the predicted composite mask 908, the predicted virtual try-on digital image 910, and a ground truth segmentation mask 710 (accessed from the database 406) … As shown, the virtual try-on digital image generation system 102 also performs back propagation 914 to modify weights or parameters associated with the neural network 804.  By modifying the weights/parameters, the virtual try-on digital image generation system 102 changes how the neural network 804 analyzes the inputs to generate outputs. Thus, the second predicted virtual try-on digital image 910 is generated) by the second-2 deep-learning model using the image of first-2 transformation virtual wearing clothing generated by the first-2 deep-learning model for the person and the cloth for virtual wearing of the training data set (Paragraph [0100], the virtual try-on digital image generation system 102 accesses training data such as a fine warped product digital image 902, a corrected segmentation mask 904, and a model digital image 906 (or texture translation priors of the model digital image 906) to input into the neural network 804.  Based on analyzing these three inputs, the neural network 804 generates a predicted composite mask 908 and (as described above in relation to FIG. 8) a predicted virtual try-on digital image 910 in accordance with the weights and parameters of the components and layers of the neural network 804).

	Regarding claim 12, Ayush discloses everything claimed as applied above (see claim 11), and Ayush further discloses wherein the processor is configured to perform FIG. 9 illustrates training the neural network 804 of the virtual try-on digital image generation system 102; paragraph [0100], the virtual try-on digital image generation system 102 accesses training data such as a fine warped product digital image 902, a corrected segmentation mask 904, and a model digital image 906 (or texture translation priors of the model digital image 906) to input into the neural network 804) and the Truth Label for the person dressing the cloth for virtual wearing (Paragraph [0102], the virtual try-on digital image generation system 102 further performs a comparison 912 to compare the predicted composite mask 908, the predicted virtual try-on digital image 910, and a ground truth segmentation mask 710 (accessed from the database 406).  In particular, the virtual try-on digital image generation system 102 performs the comparison 912 by utilizing one or more loss functions such as a texture translation loss function and/or a dueling triplet loss function; paragraph [0102], the virtual try-on digital image generation system 102 can implement a texture transfer loss, which includes other loss components such as a perceptual distance loss and a mask loss.  For example, Ll1=|Itry-on-Im| represents an L1 distance loss; Itry-on represents the predicted virtual try-on digital image 910; paragraph [0106], as training progresses, this online hard negative training strategy helps the virtual try-on digital image generation system 102 push the results closer to the ground truth by updating the negative at discrete step intervals (T steps)).  

	Regarding claim 13, Ayush discloses everything claimed as applied above (see claim 12), and Ayush further discloses wherein the memory further includes VGG-19 neural network (Paragraph [0079], As illustrated in FIG. 5, the virtual try-on digital image generation system 102 further compares the feature representations of these three digital images utilizing a perceptual geometric matching loss.  More particularly, the virtual try-on digital image generation system 102 subjects the interim Istn0 (the predicted coarse warped product digital image 408) and final Istn1 (the predicted fine warped product digital image 410) output to a warp loss Lwarp against Igt-warp (the ground truth warped product digital image 414).  The warp loss Lwarp includes a perceptual geometric matching loss component Lpgm.  By utilizing the warp loss, the virtual try-on digital image generation system 102 causes the fine regression neural network 320 to incrementally improve upon the warping modifications (e.g., the coarse transformation parameters ɵ) of the coarse regression neural network 308; paragraph [0080], FIG. 5 illustrates the respective feature representations of Istn0, Istn1, and Igt-warp in a VGG-19 feature space), and 
wherein the processor is configured to generate a layer property map for the second virtual wearing person image for the person and the cloth for virtual wearing of the training data set and a layer property map for the Truth Label for the person dressing the cloth for virtual wearing (Paragraph [0102], to elaborate, the virtual try-on digital image generation system 102 can implement a texture transfer loss, which includes other loss components such as a perceptual distance loss and a mask loss.  For example, the virtual try-on digital image generation system 102 can determine a texture transfer loss as given by: … where VGG(Itry-on) represents a VGG-19 feature space representation of the predicted virtual try-on digital image 910, VGG(Im) represents a VGG-19 feature space representation of the model digital image 906).

	Regarding claim 14, Ayush discloses everything claimed as applied above (see claim 13), and Ayush further discloses wherein the processor is configured to perform training of the second deep-learning model using a perceptual loss by comparing the layer property map for the second virtual wearing person image for the person and the cloth for virtual wearing of the training data set and the layer property map for the Truth Label for the person dressing the cloth for virtual wearing (Paragraph [0102], the virtual try-on digital image generation system 102 further performs a comparison 912 to compare the predicted composite mask 908, the predicted virtual try-on digital image 910, and a ground truth segmentation mask 710 (accessed from the database 406).  In particular, the virtual try-on digital image generation system 102 performs the comparison 912 by utilizing one or more loss functions such as a texture translation loss function and/or a dueling triplet loss function; paragraph [0102], the virtual try-on digital image generation system 102 can implement a texture transfer loss, which includes other loss components such as a perceptual distance loss and a mask loss.  For example, Lpercep=|VGG(Itry-on)-VGG(Im)| represents a perceptual distance loss, VGG(Itry-on) represents a VGG-19 feature space representation of the predicted virtual try-on digital image 910, VGG(Im) represents a VGG-19 feature space representation of the model digital image 906; paragraph [0106], as training progresses, this online hard negative training strategy helps the virtual try-on digital image generation system 102 push the results closer to the ground truth by updating the negative at discrete step intervals (T steps)).
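For illustration only, the perceptual distance loss quoted above, Lpercep=|VGG(Itry-on)-VGG(Im)|, compares two images in a feature space rather than pixel by pixel. The sketch below substitutes a hypothetical `toy_features` function (horizontal neighbor differences) for the VGG-19 network, which is not reproduced here; the function names, the nested-list image format, and the averaging are assumptions of this sketch and are not taken from Ayush.

```python
def toy_features(image):
    """Hypothetical stand-in for a VGG-19 feature map (NOT the real network).

    Returns, for each row, the differences between horizontally adjacent pixels.
    """
    return [[row[i + 1] - row[i] for i in range(len(row) - 1)] for row in image]


def perceptual_loss(pred, target, features=toy_features):
    """Mean absolute difference between the feature maps of two images."""
    pred_feats = features(pred)
    target_feats = features(target)
    diffs = [abs(a - b)
             for pred_row, target_row in zip(pred_feats, target_feats)
             for a, b in zip(pred_row, target_row)]
    return sum(diffs) / len(diffs)


# Hypothetical stand-ins for Itry-on and Im.
try_on = [[0.0, 1.0, 3.0]]
shifted = [[1.0, 2.0, 4.0]]      # same structure, uniformly brighter
different = [[0.0, 2.0, 2.0]]    # different structure

same_structure_loss = perceptual_loss(try_on, shifted)
different_structure_loss = perceptual_loss(try_on, different)
```

Under this toy extractor the uniformly brighter image has zero loss because its neighbor differences are unchanged, while the structurally different image does not. That is the general motivation for a feature-space comparison, though the features of the actual VGG-19 network behave differently from this stand-in.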

	Regarding claim 15, Ayush discloses everything claimed as applied above (see claim 14), and Ayush further discloses wherein the processor is configured to: 
generate a second discrimination image based on the second virtual wearing person image through the second deep-learning model (Paragraph [0100], FIG. 9 illustrates training the neural network 804 in accordance with one or more embodiments; paragraphs [0101]-[0103], the virtual try-on digital image generation system 102 further performs a comparison 912 to compare the predicted composite mask 908, the predicted virtual try-on digital image 910, and a ground truth segmentation mask 710 (accessed from the database 406) … As shown, the virtual try-on digital image generation system 102 also performs back propagation 914 to modify weights or parameters associated with the neural network 804.  By modifying the weights/parameters, the virtual try-on digital image generation system 102 changes how the neural network 804 analyzes the inputs to generate outputs. The comparison 912 and the back propagation 914 are repeated over multiple successive iterations or epochs of training with different inputs to continually modify the weights/parameters. Thus, the second discrimination image is generated at comparison 912), and 
Paragraph [0103], upon multiple successive iterations or epochs of training with different inputs, repeating the comparison 912 and the back propagation 914 to continually modify the weights/parameters, the virtual try-on digital image generation system 102 reduces or minimizes the loss associated with the neural network 804 until it satisfies a threshold (and the neural network 804 therefore generates accurate predictions of composite masks and rendered person images)).

	Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Ayush et al. (U.S. Patent Application Publication 2021/0133919 A1).
	
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Xilin Guo whose telephone number is (571)272-5786. The examiner can normally be reached Monday - Friday 9:00 AM-5:30 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Gregory J Tryder, can be reached at 571-270-7365. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

/XILIN GUO/Primary Examiner, Art Unit 2616