Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 3-12, 14, and 16-21 are rejected under 35 U.S.C. 103 as being unpatentable over Shih et al. (“Shih,” US 2016/0217157 A1), published on July 28, 2016, in view of Lavergne (US 10,515,125 B1), published on December 24, 2019.
As to claim 1, Shih teaches “generating, by a computer system, an attention score for an item attribute in an image of one or more images for an item based at least in part on a neural network that uses the one or more images, the attention score identifying a likelihood of the item attribute being present in the image” in par. 0029 (the score for each proposed match corresponds to an attention score for an item attribute in an image of one or more images for an item) and in par. 0017 (potential matches are found by the use of direct image classification, e.g., using a deep convolutional neural network (CNN), which is a neural network that uses the one or more images).
It appears Shih does not explicitly teach “identifying, by the computer system, the item attribute associated with text of a query based at least in part on a text classifier configured to use a string of the query to classify the item attribute associated with the text”.
However, Lavergne teaches “identifying, by the computer system, the item attribute associated with text of a query based at least in part on a text classifier configured to use a string of the query to classify the item attribute associated with the text” in col. 9: 43-48 (“To classify the text segment 201, the system 100A makes predictive inferences on attributes associated with the text segment 201 using, for example, NLP techniques to linguistically parse the contents of the text segment 201, e.g., terms included in the query, to develop semantic understanding of the text segment 201”).
Shih and Lavergne are analogous art because they are in the same field of endeavor, image query processing. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the classification of the text segment included in the image, as disclosed by Shih, to include “identifying, by the computer system, the item attribute associated with text of a query based at least in part on a text classifier configured to use a string of the query to classify the item attribute associated with the text” in order to automatically classify the text segment without human intervention (see Lavergne col. 9).
Shih teaches “determining, by the computer system, items from a catalog of items based at least in part on the item attribute associated with the items, an individual item of the items associated with a plurality of images” in par. 0031.
Shih teaches “ranking, by the computer system, the plurality of images of the individual item based at least in part on corresponding attention scores associated with each image of the plurality of images” in par. 0029.
Shih teaches “and presenting, by the computer system, a user interface as a result to the query that includes a highest ranked image of the plurality of images for each item of the items based at least in part on the ranking of the plurality of images” in par. 0030.
As to claim 3, Shih teaches “updating, by the computer system, the user interface to present the first ranked image of the plurality of images in response to receiving input of an interaction with a particular item of the items” in pars. 0030-0031 (the user interface accepts an image query and presents image results to the user).
As to claim 4, Shih teaches “updating the user interface includes presenting an individual image of the plurality of images for each item and scrolling to the first ranked image of the plurality of images in response to receiving the input of the interaction with the particular item of the items”
As to claim 5, it is rejected for reasons similar to those given for claim 1.
As to claim 6, Shih teaches “a user interface that further presents information about the plurality of items in response to obtaining the query; and updating, by the computer system, the user interface to present item details for a specific item of the plurality of items based at least in part on receiving input via the user interface, wherein updating the user interface includes presenting a highest ranked image of the plurality of images for the specific item” in fig. 6, par. 0039.
As to claim 7, Shih teaches “obtaining, by the computer system, reviewer provided images for the item” in fig. 6, par. 0039.
As to claim 8, Shih teaches “wherein the model uses the one or more images of the item and the reviewer provided images for the item to generate the attention score for the item attribute in the image and each reviewer provided image of the reviewer provided images” in par. 0029 (the input image corresponds to a reviewer provided image).
As to claim 9, Shih teaches “presenting, by the computer system, a user interface that further presents the plurality of images of the item according to the ranking by the corresponding attention scores” in par. 0030.
Shih teaches “and updating, by the computer system, the user interface to replace an individual image of the plurality of images with a reviewer provided image of the item based at least in part on an associated attention score for the reviewer provided image and the associated attention score for the individual image” in fig. 6, par. 0039 (a reviewer provided image is interpreted as the image that has the highest matching score because it is actually provided by the image input query).
As to claim 10, Shih teaches “identifying, by the computer system, a pixel range in each image of the plurality of images based at least in part on a class activation image processing algorithm, the pixel range associated with the item attribute identified by the model” in par. 0049.
As to claim 11, Shih teaches “updating, by the computer system, a user interface that presents information about the plurality of items in response to the query to include a view of the pixel range associated with the item attribute in each ranked image of the plurality of images of the item” in par. 0049 and in fig. 6.
As to claim 12, Shih teaches “maintaining, by the computer system, the attention score for the item attribute in the image of the plurality of images associated with the query” in par. 0041.
As to claim 14, Shih teaches “generate an attention score for an item attribute in an image of one or more images of an item based at least in part on a model that uses the one or more images” in fig. 7, step 720.
Shih teaches “receive text associated with a query for the item; identify a result set of items based at least in part on the text; present a user interface with a representative image for each item of the result set of items” in fig. 7, fig. 8.
Shih teaches “receive first input, via the user interface, that identifies the item attribute associated with each item of the result set of items” in fig. 7 (text extracted from the input image corresponds to input that identifies the item attribute associated with each item of the result set of items).
Shih teaches “in response to receiving the first input: update the user interface from presenting the representative image for each item to present a particular image of the one or more images for each item of the result set of items based at least in part on the attention score for the item attribute in the particular image” in fig. 6 and par. 0039.
It appears Shih does not explicitly teach “determine item attributes in the result set of items based at least in part on a text classifier configured to use a string of the query to classify the item attributes associated with the text of the query”.
However, Lavergne teaches “determine item attributes in the result set of items based at least in part on a text classifier configured to use a string of the query to classify the item attributes associated with the text of the query” in col. 9: 43-48 (“To classify the text segment 201, the system 100A makes predictive inferences on attributes associated with the text segment 201 using, for example, NLP techniques to linguistically parse the contents of the text segment 201, e.g., terms included in the query, to develop semantic understanding of the text segment 201”).
Shih and Lavergne are analogous art because they are in the same field of endeavor, image query processing. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the classification of the text segment included in the image, as disclosed by Shih, to include “determine item attributes in the result set of items based at least in part on a text classifier configured to use a string of the query to classify the item attributes associated with the text of the query” in order to automatically classify the text segment without human intervention (see Lavergne col. 9).

As to claim 16, Shih teaches “wherein determining the item attributes includes using a convolutional neural network that uses images of the result set of items” in par. 0017 (CNN is a deep convolutional neural network).
As to claim 17, Shih teaches “wherein the processor is further configured to receive second input that associates a tag to a portion of the particular image for the attribute indicated by the first input” in fig. 5, par. 0027 (text extracted as a tag to a portion of the particular image).
As to claim 18, Shih teaches “wherein the processor is further configured to generate a first user interface element that corresponds to the portion of the particular image for the attribute based at least in part on the tag” in fig. 6.
As to claim 19, Shih teaches “generate a second user interface element that includes the first user interface element interleaved into another image” in fig. 6.
As to claim 20, Shih teaches “generate a new data object that includes the first user interface element interleaved into the second user interface element, where the new data object is configured to be transmitted to one or more social media platforms or image sharing applications” in fig. 6, par. 0018 (mobile phone corresponds to image sharing applications).
As to claim 21, Shih teaches “the first input identifies a plurality of item attributes for the item, and wherein updating the user interface to present the particular image for each item of the result set of items is further based at least in part on an aggregated attention score for the plurality of item attributes in the particular image, the aggregated attention score generated by the model that uses the one or more images for the item” in par. 0047, par. 0048 (a vector that combines a set of similar measures corresponds to the aggregated attention score generated by the model that uses the one or more images for the item).

Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Shih et al. (“Shih,” US 2016/0217157 A1), published on July 28, 2016, in view of Lavergne (US 10,515,125 B1), published on December 24, 2019, and in further view of Goens et al. (“Goens,” US 2017/0243275 A1), published on August 24, 2017.
As to claim 2, it appears Shih and Lavergne do not explicitly teach “wherein generating the attention score for the attribute occurs periodically”.
However, Goens teaches “wherein generating the attention score for the attribute occurs periodically” in par. 0032 (“a clarity score is calculated periodically based on then-current information and/or dynamically…”).
Shih, Lavergne and Goens are analogous art because they are in the same field of endeavor, image processing. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include “wherein generating the attention score for the attribute occurs periodically” in order to increase the accuracy of scoring (see Goens par. 0032).

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Shih et al. (“Shih,” US 2016/0217157 A1), published on July 28, 2016, in view of Lavergne (US 10,515,125 B1), published on December 24, 2019, and in further view of Sha et al. (“Sha,” US 2020/0004918 A1), published on January 02, 2020.
As to claim 13, it appears Shih and Lavergne do not explicitly teach “associating, by the computer system, metadata with each image of the plurality of images that identifies each item attribute and an associated attention score for each item attribute, and a 2-D heat map for each item attribute generated by a class activation image processing algorithm”.
However, Sha teaches “associating, by the computer system, metadata with each image of the plurality of images that identifies each item attribute and an associated attention score for each item attribute, and a 2-D heat map for each item attribute generated by a class activation image processing algorithm” in par. 0034 (“The generation of the heat map may be performed by any appropriate process including, e.g., gradient-weighted class activation mapping, which is based on the trained weights of a neural network in the model to localize the hotspots.  The heat map, which may be represented as a class activation heat map, is a two-dimensional grid of scores associated with a specific output class (e.g., hotspot-free, hotspot-containing, etc.)”).
Shih, Lavergne and Sha are analogous art because they are in the same field of endeavor, image processing. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the creation of image metadata, as disclosed by Shih, to include “associating, by the computer system, metadata with each image of the plurality of images that identifies each item attribute and an associated attention score for each item attribute, and a 2-D heat map for each item attribute generated by a class activation image processing algorithm” in order to classify patterns of image attributes utilizing a hotspot model.



Response to Arguments
The applicant’s arguments have been considered but are rendered moot by the new grounds of rejection.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Loc Tran whose telephone number is (571)272-8485.  The examiner can normally be reached on Mon - Fri (8:00 am - 5:00 pm).
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kerzhner Aleksandr can be reached on 571-272-36760.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


/LOC TRAN/
Primary Examiner, Art Unit 2165