Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This non-final rejection is in response to the application filed on: 04/04/2019.
Claims 1-20 are pending. Claims 1, 11, 18, 19 and 20 are independent claims.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 04/04/2019 is being considered by the examiner.

Drawings
The drawings filed on: 04/04/2019 are accepted.

Allowable Subject Matter
Claims 2-6, 8, 9, 12-14, and 16-17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 11, 18, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Dance (US Application: US 2005/0169558, published: Aug. 4, 2005, filed: Jan. 30, 2004) in view of Magnani et al (US Application: US 2018/0121533, published: May 3, 2018, filed: Oct. 31, 2016) and further in view of Gao et al (US Application: US 2017/0061250, published: Mar. 2, 2017, filed: Aug. 28, 2015).

With regards to claim 1, Dance teaches a method for training an inference model using a computing device, comprising: providing the inference model; providing a second number of fashion entries, wherein the … entries are not labeled (paragraph 0042: entries that include at least an image and text are obtained); separating each of the second number of … entries into a target image and target text (paragraph 0042: image and text data are considered separate data that undergoes classification); converting the target text into a category vector (paragraph 0059: “the text data stored in the repository can also be classified according to the same categories as the image classification is performed”) … wherein the category vector comprises a plurality of dimensions corresponding to categories of (paragraph 0024: “In another embodiment, classifying an image can comprise classifying the image according to a predetermined set of subcategories within a category.”); processing the target image using the inference model to obtain processed target image and target image label (paragraph 0024: “In another embodiment, classifying an image can comprise classifying the image according to a predetermined set of subcategories within a category.”); comparing the category vector to the target image label; when the category vector matches the target image label (paragraph 0059: “In the simplest case, the text data stored in the repository can also be classified according to the same categories as the image classification is performed. In this case, all text data falling under the same category as the image match the image's category”).

However, Dance does not expressly teach the entries can be fashion entries. Also, Dance does not expressly teach providing a text-to-vector converter; and pre-training the inference model using a first number of labeled fashion entries; and an attribute vector using the text-to-vector converter, …, … categories of fashion, and the attribute vector comprises a plurality of dimensions corresponding to attributes of fashion; when the category vector matches the target image label, updating the target image label based on the category vector and the attribute vector to obtain updated label; and retraining the inference model using the processed target image and the updated label.

Yet Magnani et al teaches the entries can be fashion entries (paragraph 0029: entries can be fashion products such as boots). Magnani et al also teaches providing a text-to-vector converter (paragraph 0041: text is encoded into vectors); and pre-training the inference model using a first number of labeled fashion entries (paragraphs 0055 and 0057: the model is trained with labeled entries); and an attribute vector using the text-to-vector converter, …, … categories of fashion (paragraphs 0055, 0056: classes of the product are implemented), and the attribute vector comprises a plurality of dimensions corresponding to attributes of fashion (paragraphs 0055, 0056, 0061, 0067: the metadata set for the product provides dimensions of data and also product attributes).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dance’s ability to match category vectors and image labels using an inference model, such that the inference model is modified to be pretrained and also to encode entries that are of fashion products, as taught by Magnani et al. The combination would have allowed Dance to classify products in an efficient manner even when there are a large number of new products uploaded daily and dynamic categories (Magnani et al, paragraph 0004).
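
For illustration only, the text-to-vector conversion recited in claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not the method of the application or of any cited reference: the vocabularies CATEGORIES and ATTRIBUTES, the function name text_to_vectors, and the simple whitespace tokenization are all hypothetical.

```python
# Hypothetical vocabularies; neither the claims nor the cited references
# enumerate specific categories or attributes.
CATEGORIES = ["dress", "boots", "jacket"]
ATTRIBUTES = ["red", "leather", "long-sleeve", "ankle"]


def text_to_vectors(text):
    """Convert target text into (category_vector, attribute_vector).

    Each vector has one dimension per vocabulary term; a dimension is 1
    when the term appears in the text and 0 otherwise (assumed encoding).
    """
    tokens = set(text.lower().replace(",", " ").split())
    category_vector = [1 if c in tokens else 0 for c in CATEGORIES]
    attribute_vector = [1 if a in tokens else 0 for a in ATTRIBUTES]
    return category_vector, attribute_vector


cat_vec, attr_vec = text_to_vectors("Red leather ankle boots")
print(cat_vec)   # [0, 1, 0]
print(attr_vec)  # [1, 1, 0, 1]
```

In this sketch each dimension of the category vector corresponds to one assumed category of fashion, and each dimension of the attribute vector corresponds to one assumed attribute of fashion.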

However, the combination does not expressly teach when the category vector matches the target image label, updating the target image label based on the category vector and the attribute vector to obtain updated label; and retraining the inference model using the processed target image and the updated label.

Yet Gao et al teaches when the category vector matches the target image label, updating the target image label based on the category vector and the attribute vector to obtain updated label; and retraining the inference model using the processed target image and the updated label (paragraphs 0058-0060: semantic vectors acting as semantic categories are matched against a target image’s label description and those instances are merged within positive bags; the model is retrained using the updated positive labels).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dance and Magnani et al’s ability to match category vectors of target text to a target image label using a trained inference model, such that the inference model could have been retrained upon desired matching/alignment conditions, as taught by Gao et al. The combination would have implemented an improved process to “discover semantic similarities between images and text” (Gao et al, paragraph 0002).
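
For illustration only, the compare, update, and retrain-selection steps addressed above can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not the implementation of the application or of any cited reference: CATEGORIES, update_label_if_match, collect_retraining_set, and the stub predictor are hypothetical, and the sketch assumes that "updating" means attaching the attribute vector to the matched category and that non-matching entries are simply skipped.

```python
# Hypothetical category vocabulary, assumed for this sketch only.
CATEGORIES = ["dress", "boots", "jacket"]


def update_label_if_match(category_vector, attribute_vector, image_label):
    """Return an updated label when the text category matches the image label.

    Returns None when there is no match (assumed handling).
    """
    text_categories = {c for c, bit in zip(CATEGORIES, category_vector) if bit}
    if image_label not in text_categories:
        return None
    # Assumed form of the updated label: matched category plus attributes.
    return {"category": image_label, "attributes": list(attribute_vector)}


def collect_retraining_set(entries, predict_label):
    """Build (image, updated_label) pairs that could be used for retraining."""
    retraining_set = []
    for image, category_vector, attribute_vector in entries:
        image_label = predict_label(image)  # label from the inference model
        updated = update_label_if_match(
            category_vector, attribute_vector, image_label
        )
        if updated is not None:
            retraining_set.append((image, updated))
    return retraining_set


# Stub standing in for a pre-trained inference model (assumption).
def stub_predict(image):
    return "boots"


entries = [
    ("img_a", [0, 1, 0], [1, 1, 0, 1]),  # text says "boots": match
    ("img_b", [1, 0, 0], [0, 0, 0, 0]),  # text says "dress": no match
]
print(collect_retraining_set(entries, stub_predict))
```

Only the first entry is retained in this sketch, because its category vector agrees with the label the stub predictor assigns to the image.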

With regards to claim 11, the combination of Dance, Magnani et al, and Gao et al teaches a system for training an inference model, comprising: a computing device, comprising a processor and a storage device storing computer executable code, wherein the computer executable code comprises a text-to-vector converter, the inference model, a first number of labeled fashion entries, and a second number of fashion entries that are not labeled, and the computer executable code, when executed at the processor, is configured to: pre-train the inference model using the first number of labeled fashion entries; separate each of the second number of fashion entries into a target image and target text; convert the target text into a category vector and an attribute vector using the text-to-vector converter, wherein the category vector comprises a plurality of dimensions corresponding to categories of fashion, and the attribute vector comprises a plurality of dimensions corresponding to attributes of fashion; process the target image using the inference model to obtain processed target image and target image label; compare the category vector to the target image label; when the category vector matches the target label, update the target label based on the category vector and the attribute vector to obtain updated label; and retrain the inference model using the processed image and the updated label, as similarly explained in the rejection of claim 1, and is rejected under similar rationale.

With regards to claim 18, the combination of Dance, Magnani et al, and Gao et al teaches a method for training an inference model using a computing device, comprising: providing a text-to-vector converter; providing the inference model and pre-training the inference model using a first number of labeled entries, wherein labels of the labeled entries are categories of the entries; providing a second number of entries, wherein the entries are not labeled; separating each of the second number of entries into a target image and target text; converting the target text into a category vector and an attribute vector using the text-to-vector converter, wherein the category vector comprises a plurality of dimensions corresponding to the categories of the entries, and the attribute vector comprises a plurality of dimensions corresponding to attributes of the entries; processing the target image using the inference model to obtain processed target image and target image label; comparing the category vector to the target image label; when the category vector matches the target image label, updating the target image label based on the category vector and the attribute vector to obtain updated label, as similarly explained in the rejection of claim 1, and is rejected under similar rationale.

With regards to claim 19, the combination of Dance, Magnani et al, and Gao et al teaches a non-transitory computer readable medium storing computer executable code, wherein the computer executable code, when executed at a processor of a computing device, is configured to perform the method of claim 18, as similarly explained in the rejection of claim 18, and is rejected under similar rationale.

With regards to claim 20, the combination of Dance, Magnani et al, and Gao et al teaches a system for training an inference model, comprising a computer device, the computer device comprising a processor and a storage device storing computer executable code, wherein the computer executable code, when executed at the processor, is configured to perform the method of claim 18, as similarly explained in the rejection of claim 18, and is rejected under similar rationale.

Claims 7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Dance (US Application: US 2005/0169558, published: Aug. 4, 2005, filed: Jan. 30, 2004) in view of Magnani et al (US Application: US 2018/0121533, published: May 3, 2018, filed: Oct. 31, 2016) and further in view of Gao et al (US Application: US 2017/0061250, published: Mar. 2, 2017, filed: Aug. 28, 2015) in view of Gokturk et al (US Application: US 2008/0082426, published: Apr. 3, 2008, filed: Jul. 13, 2007).

With regards to claim 7. The method of claim 1, the combination of Dance, Magnani et al and Gao et al teaches wherein each of the labeled fashion entries comprises a label, as similarly explained in the rejection of claim 1, and is rejected under similar rationale.

However, the combination does not expressly teach … wherein the label is a word related to a fashion feature of an image.

Yet Gokturk et al teaches … wherein the label is a word related to a fashion feature of an image (paragraph 0064: labels for features can include labels that relate to fashion features such as shoes, straps of shoes, heels, etc.).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dance, Magnani et al and Gao et al’s ability to label features, such that the labels would have been associated with words of fashions in an image, as taught by Gokturk et al. The combination would have allowed Dance, Magnani et al and Gao et al to match text … to content associated with images (Gokturk et al, paragraph 0009).

With regards to claim 15. The system of claim 11, the combination of Dance, Magnani et al, Gao et al and Gokturk et al teaches wherein each of the labeled fashion entries comprises a label, wherein the label is a word related to a fashion feature of an image, as similarly explained in the rejection of claim 7, and is rejected under similar rationale.

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Dance (US Application: US 2005/0169558, published: Aug. 4, 2005, filed: Jan. 30, 2004) in view of Magnani et al (US Application: US 2018/0121533, published: May 3, 2018, filed: Oct. 31, 2016) and further in view of Gao et al (US Application: US 2017/0061250, published: Mar. 2, 2017, filed: Aug. 28, 2015) in view of Dligach et al (“Semi-supervised Learning for Phenotyping Tasks”, published: 2015, pages 1-17).

With regards to claim 10. The method of claim 1, the combination of Dance, Magnani et al and Gao et al teaches wherein the first number … and the second number …, as similarly explained in the rejection of claim 1, and is rejected under similar rationale.

However, the combination does not expressly teach … wherein the first number is about or less than 2000, and the second number is greater than 1 million.

Yet Dligach et al teaches wherein the first number is about or less than 2000, and the second number is greater than 1 million (Abstract, page 1, page 7: a labeled set of a few hundred (600 patients) is used, and over 6 million patients yet to be labeled are processed).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dance, Magnani et al, and Gao et al’s ability to use machine learning to accept labeled and unlabeled data, such that the numbers/sample size in machine learning could have been less than 2000 labeled and more than a million unlabeled, as taught by Dligach et al. The combination would have allowed Dance, Magnani et al and Gao et al to generate accurate model(s) with only a few hundred labeled examples for a large number of unlabeled examples (Dligach et al, Abstract).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 

Jin et al (US Application: US 2017/0206435): This reference teaches an embedding space for images with multiple text labels, and semantically encoding text labels in the embedding space.
Wang et al (US Application: US 2019/0205393): This reference teaches training a semantic matching model based on logistic regression using image and text features from training data.
Zhai et al (US Application: US 2019/0095466): This reference teaches an image processing process that generates stored feature vectors and labels representative of segments and objects of stored images. 
Ha et al (WO 2018/208871 A1): This reference teaches concatenating features from two different modalities into a concatenation layer.
Akata et al (US Application: US 2014/0376804): This reference teaches image classification and each class of a set of classes being embedded in an attribute space where each dimension of the attribute space corresponds to a class attribute.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILSON W TSUI whose telephone number is (571)272-7596. The examiner can normally be reached Monday - Friday 9 am - 6 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Stephen Hong, can be reached at 571-272-4124. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/WILSON W TSUI/Primary Examiner, Art Unit 2178