DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-24 are pending in the application.

CLAIM INTERPRETATION

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:

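(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 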
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “a server” in claims 1 and 13 and their dependent claims. 
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 8-10, 13-17 and 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (Han et al., “VITON: An Image-based Virtual Try-on Network,” arXiv:1711.08447v4 [cs.CV] 12 Jun 2018, hereafter Han),  in view of Guay et al. (US Patent 10,885,708 B2, hereafter Guay).
As per claim 1, Han teaches the invention substantially as claimed including a method (Abstract) comprising: 
determining, a first semantic segmentation image of a first image, wherein the first image includes at least a portion of a person wearing a first fashion item (Fig. 1 and Fig. 5 left column “Reference Image”; Fig. 3; page 3 right col. Human body representation: “a human segmentation map”); 
Pose heatmap: “18 keypoints”); 
using the determined first semantic segmentation image, the determined keypoints, and a second image that includes a second fashion item, generating a second semantic segmentation image of the person in the first image with the second fashion item of the second image (Han generates a Person Representation “p” by combining a pose map (keypoints), a body shape map, and face and hair segments (corresponding to the first semantic segmentation image) (Fig. 3; page 3-4 bridging para.). Han further generates a coarse synthesized image I’ (corresponding to the second semantic segmentation image) through reconstruction such that a natural transfer from c (a target clothing item to be transferred onto the person’s body, corresponding to the second image, see Fig. 1 top row; Fig. 5 2nd col.) to the corresponding region of p (person representation) can be learned. See Fig. 2 top part coarse result I’ and page 4 section 3.2); 
masking, the first image to occlude pixels of the first fashion item that is to be replaced with the second fashion item (Fig. 2 top part clothing mask M; page 4 section 3.2; Fig. 5 4th col. and Fig. 7 4th col. showing occluded pixels of the first fashion item that is to be replaced with the second fashion item. More results can be found in the supplemental material at pages 12-13, Figs. 3-4); and
using the masked first image, the second semantic segmentation image, and the second image that includes the second fashion item, generating a third image that includes the person with the second fashion item (Han generates a warped clothing item c’ by using a foreground mask of c (c being the second image) and a TPS transform between c and M (M being the masked first image) (see Fig. 4; page 4-5 Warped clothing item), and generates a refined final image (corresponding to the third image) by combining the warped item c’ and the coarse synthesized image I’ (I’ being the second semantic segmentation image). See Fig. 2 bottom part and page 5 Learn to composite).  
Han teaches every limitation as analyzed above except for a server for performing the acts and transmitting, via a communications network coupled to the server, the generated third image. 
Guay discloses an automated costume augmentation system (ABSTRACT). The system comprises a server 102, a remote communication device 140 and a communication network 120 connecting the server and the remote communication device (FIG. 1). The server includes a hardware processor that performs a plurality of acts for fitting a costume on a posed figure, and outputs an enhanced image including the posed figure augmented with the fitted costume to the remote communication device through the communication network (ABSTRACT; FIG. 1 #138; FIG. 2 #238; col. 2 lines 30-40; col. 4 lines 9-16).  
Taking the combined teachings of Han and Guay as a whole, it would have been obvious for a person with ordinary skill in the art before the effective filing date of the claimed invention to consider including a computing system comprising a server and a communications network coupled to the server as disclosed by Guay in order to improve image processing efficiency. A server usually has greater computing capability than a user device. Directing complicated image processing tasks to the server and transmitting the result back to the user device makes efficient use of the system’s computing power.
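The combining of the warped item c’ with the coarse image I’ relied upon above can be pictured with a short sketch. This is a minimal illustration only, assuming a per-pixel composition mask with values in [0, 1]; the names alpha, warped, coarse and composite are hypothetical and are not taken from Han, and single-channel images are used for brevity.

```python
def composite(alpha, warped, coarse):
    """Blend two same-sized images pixel by pixel.

    Each output pixel is alpha * warped + (1 - alpha) * coarse, so a mask
    value of 1 keeps the warped clothing pixel and a value of 0 keeps the
    coarse-result pixel. The mask form is an assumption for illustration.
    """
    return [
        [a * w + (1.0 - a) * c for a, w, c in zip(row_a, row_w, row_c)]
        for row_a, row_w, row_c in zip(alpha, warped, coarse)
    ]


# Hypothetical 2x2 single-channel example.
alpha = [[1.0, 0.0], [0.5, 0.0]]
warped = [[0.8, 0.8], [0.8, 0.8]]
coarse = [[0.2, 0.2], [0.2, 0.2]]
result = composite(alpha, warped, coarse)
```

In this sketch the top-left pixel comes entirely from the warped item, the right-hand pixels come entirely from the coarse image, and the bottom-left pixel is an equal mix of the two.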

As per claim 2, dependent upon claim 1, Han in view of Guay teaches that the masking of the first image to occlude pixels of the first fashion item comprises: deleting, at the server, minimal sub-images whose pixels are to be changed during the transference of the image of the second fashion item onto the image of the first fashion item on the person (Han: Fig. 5 4th col. and Fig. 7 4th col. show minimal deleted pixels. More results can be found in the supplemental material at pages 12-13, Figs. 3-4). 

As per claim 3, dependent upon claim 1, Han in view of Guay teaches: 
masking, at the server, parts of the body of the person in the first image to be changed during a transference of the second fashion item for the first fashion item in the first semantic segmentation image (Han Fig. 5 4th col. and Fig. 7 4th col. show masked pixels of parts of the body of the person. More results can be found in the supplemental material at pages 12-13, Figs. 3-4). 
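The masking operation discussed in claims 1-3 can be sketched as follows. This is a minimal illustration, assuming the segmentation is available as a per-pixel label map of the same size as the image; the label id CLOTHING_LABEL, the fill value and the function name mask_item are hypothetical and are not drawn from Han or from the claims.

```python
CLOTHING_LABEL = 5  # hypothetical label id for the fashion item to replace


def mask_item(image, segmentation, label=CLOTHING_LABEL, fill=0):
    """Occlude every pixel whose segmentation label equals `label`.

    Pixels with any other label are copied unchanged, so only the region
    that would change during the transfer is removed from the image.
    """
    return [
        [fill if s == label else p for p, s in zip(img_row, seg_row)]
        for img_row, seg_row in zip(image, segmentation)
    ]


# Hypothetical 2x2 single-channel image and its label map.
image = [[10, 20], [30, 40]]
segmentation = [[0, 5], [5, 1]]
masked = mask_item(image, segmentation)
```

Only the two pixels labeled as the fashion item are occluded; pixels carrying other labels (for example, background or body parts) are retained.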

As per claim 4, dependent upon claim 3, Han in view of Guay teaches that, when the parts of the body of the person in the first image overlap or self-occlude, the pixels of the overlapped or self-occluded parts are retained in the first semantic segmentation image (Han Figs. 1-2 and 5-7 show that pixels of the overlapped or self-occluded parts are retained. Because the final images in Figs. 1-2 and 5-7 are refined from the coarse result (first semantic segmentation image), the pixels of the overlapped or self-occluded parts are retained in the first semantic segmentation image).


using the first semantic segmentation image, the determined keypoints, and the second image that includes the second fashion item, generating at the server a second semantic segmentation image that includes the person of the first image with the second fashion item (see rejections applied to claim 1). 

As per claim 8, dependent upon claim 1, Han in view of Guay teaches: 
masking, at the server, the first image to occlude pixels of the first fashion item to be replaced to form a second masked image (See rejections applied to claim 1. Han teaches generating second masked image (Fig. 5 4th column)). 

As per claim 9, dependent upon claim 8, Han in view of Guay teaches:
deleting, at the server, minimal sub-images whose pixels are to be changed during the transference of the image of the second fashion item onto the image of the first fashion item on the person (See rejections applied to claim 2). 

As per claim 10, dependent upon claim 8, Han in view of Guay teaches: 
generating, at the server, a fourth image of the person that includes the second fashion item by using the first semantic segmentation image, the second masked image, and the second image that includes the second fashion item and the determined keypoints of the body (See rejections applied to claim 1. Han teaches generating a fourth image (Fig. 1 and 5 last columns)). 



Claim 14, dependent upon claim 13, is rejected as applied to claim 2 above.

Claim 15, dependent upon claim 13, is rejected as applied to claim 3 above.

Claim 16, dependent upon claim 15, is rejected as applied to claim 4 above.

Claim 17, dependent upon claim 15, is rejected as applied to claim 5 above.

Claim 20, dependent upon claim 13, is rejected as applied to claim 8 above.

Claim 21, dependent upon claim 20, is rejected as applied to claim 9 above.

Claim 22, dependent upon claim 21, is rejected as applied to claim 10 above.

Claims 6-7 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (Han et al., “VITON: An Image-based Virtual Try-on Network,” arXiv:1711.08447v4 [cs.CV] 12 Jun 2018, hereafter Han), in view of Guay et al. (US Patent 10,885,708 B2, hereafter Guay), as applied above to claims 5 and 17 respectively, and further in view of Wang et al. (US Publication 2020/0342646 A1, hereafter Wang).
As per claim 6, dependent upon claim 5, Han in view of Guay teaches determining a loss between the second semantic segmentation image and the first semantic segmentation image (Han page 5 left col. last 5 lines including eqn. (3)), but does not teach determining an adversarial loss. 
Wang teaches a method for generating a video of a body moving in synchronization with music (ABSTRACT). When training a neural network, Wang uses an objective function including multiple loss terms. For example, a combination of L1 pixel loss, Very Deep Convolutional Networks for Large-Scale Image Recognition (VGG) perceptual loss, pose consistency loss, and generative adversarial loss is used as a training objective to minimize the difference between predicted and ground-truth video frames (para. [0053]).
Taking the combined teachings of Han, Guay and Wang as a whole, it would have been obvious for a person with ordinary skill in the art before the effective filing date of the claimed invention to consider determining an adversarial loss as disclosed by Wang in order to improve the quality of the generated third image. Adversarial loss is a function that estimates the probability of error and can be used for judging how close a generated image is to a ground-truth image.
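A training objective that adds an adversarial term to a pixel loss, as discussed above, can be sketched as follows. This is a minimal sketch under stated assumptions: the adversarial term uses the common generator form, the negative mean log of discriminator scores in (0, 1], and the weights w_l1 and w_adv, as well as all function names, are hypothetical and are not taken from Wang or Han.

```python
import math


def l1_loss(pred, target):
    """Mean absolute difference between predicted and ground-truth pixels."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def adversarial_loss(disc_scores):
    """Generator-side adversarial term: -mean(log D(G(x))).

    `disc_scores` are assumed discriminator outputs in (0, 1], where 1 means
    the generated image was judged indistinguishable from a real one.
    """
    eps = 1e-12  # guards against log(0)
    return -sum(math.log(max(d, eps)) for d in disc_scores) / len(disc_scores)


def combined_loss(pred, target, disc_scores, w_l1=1.0, w_adv=0.1):
    """Weighted sum of the pixel term and the adversarial term."""
    return w_l1 * l1_loss(pred, target) + w_adv * adversarial_loss(disc_scores)


# Hypothetical values: two pixels and two discriminator scores.
pixel_term = l1_loss([1.0, 2.0], [1.0, 4.0])
total = combined_loss([1.0, 2.0], [1.0, 4.0], [0.5, 0.5])
```

The adversarial term shrinks toward zero as the discriminator scores approach 1, so minimizing the combined objective pushes generated images toward both the ground-truth pixels and the appearance of real images.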

As per claim 7, depending upon claim 6, Han in view of Guay and Wang teaches:
training, at the server, by convolutional back-propagation (Wang para. [0053]). 



Claim 19, dependent upon claim 18, is rejected as applied to claim 7 above.

Claims 11-12 and 23-24 are rejected under 35 U.S.C. 103 as being unpatentable over Han et al. (Han et al., “VITON: An Image-based Virtual Try-on Network,” arXiv:1711.08447v4 [cs.CV] 12 Jun 2018, hereafter Han),  in view of Guay et al. (US Patent 10,885,708 B2, hereafter Guay), as applied above to claims 10 and 22 respectively, and further in view of Wang et al. (US Publication 2020/0342646 A1, hereafter Wang) and Vo et al. (US Publication 2020/0302168 A1, hereafter Vo).
As per claim 11, dependent upon claim 10, Han in view of Guay teaches determining a loss function combining perceptual loss and feature matching loss (Han page 4 eqn. (1), the first term being perceptual loss and the second term being feature matching loss (how close a generated clothing mask M is to ground truth clothing mask M0)).  Han in view of Guay, however, does not disclose that the combined loss function includes an adversarial loss. 
Wang teaches a method for generating a video of a body moving in synchronization with music (ABSTRACT). When training a neural network, Wang uses an objective function including multiple loss terms. For example, a combination of L1 pixel loss, Very Deep Convolutional Networks for Large-Scale Image Recognition (VGG) perceptual loss, pose consistency loss, and generative adversarial loss is used as a training objective to minimize the difference between predicted and ground-truth video frames (para. [0053]).


Han in view of Guay and Wang does not further teach determining an error gradient of the combined loss function. 
Vo evidences that calculating an error gradient of a loss function when training a neural network is well known and commonly practiced (para. [0108]).
Taking the combined teachings of Han, Guay, Wang and Vo as a whole, it would have been obvious for a person with ordinary skill in the art before the effective filing date of the claimed invention to consider calculating an error gradient of a loss function in order to train a neural network efficiently.

As per claim 12, dependent upon claim 11, Han in view of Guay, Wang and Vo further teaches: 
training, at the server, by back-propagation of the error gradient (Vo para. [0108]). 
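Determining an error gradient of a loss function and training by back-propagation of that gradient, as discussed for claims 11-12, can be pictured with a toy example. This is a minimal sketch, assuming a two-weight scalar network y = w2 * (w1 * x) with a squared-error loss; the model, the learning rate and all names are hypothetical and are not taken from Vo.

```python
def forward(w1, w2, x):
    """Two-layer scalar network: hidden h = w1 * x, output y = w2 * h."""
    h = w1 * x
    y = w2 * h
    return h, y


def loss_fn(y, t):
    """Squared-error loss between output y and target t."""
    return 0.5 * (y - t) ** 2


def backward(w1, w2, x, t):
    """Back-propagate the error gradient from the loss to both weights."""
    h, y = forward(w1, w2, x)
    dy = y - t          # dL/dy
    grad_w2 = dy * h    # dL/dw2, since y = w2 * h
    dh = dy * w2        # dL/dh, passed back to the first layer
    grad_w1 = dh * x    # dL/dw1, since h = w1 * x
    return grad_w1, grad_w2


def train(w1, w2, x, t, lr=0.1, steps=50):
    """Plain gradient descent using the back-propagated gradients."""
    for _ in range(steps):
        g1, g2 = backward(w1, w2, x, t)
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2


# Hypothetical starting weights, input and target.
initial_loss = loss_fn(forward(0.5, 0.5, 1.0)[1], 1.0)
w1, w2 = train(0.5, 0.5, 1.0, 1.0)
final_loss = loss_fn(forward(w1, w2, 1.0)[1], 1.0)
```

Each step computes the gradient of the loss at the output and propagates it backward through the layers by the chain rule, which is the sense in which the error gradient drives the training.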

Claim 23, dependent upon claim 22, is rejected as applied to claim 11 above.



Contact
Any inquiry concerning this communication or earlier communications from the examiner should be directed to XUEMEI G CHEN whose telephone number is (571)270-3480.  The examiner can normally be reached on Monday-Friday 9am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nay Maung, can be reached at 571-272-7882.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.