DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to the Communication filed on 12/31/2020.   Claims 1-20, 35 and 36 have been canceled previously. New claims 41-42 have been added. Therefore, claims 21-34 and 37-42 are pending in this office action, of which claims 21, 27 and 34 are independent claims.

Response to Arguments
Applicant’s arguments, see page 9, filed 12/31/2020, with respect to the double patenting rejection have been fully considered and the rejection has been respectfully maintained.
Applicant’s arguments, see pages 9-13, filed 12/31/2020, with respect to the rejection(s) of claims 21-34 and 37-40 under 35 USC 103 have been fully considered but are not persuasive.
Examiner, in her previous office action, gave a detailed explanation of the claimed limitations and pointed out the exact locations in the cited prior art that teach them.
Examiner is entitled to give claim limitations their broadest reasonable interpretation in light of the specification. See MPEP 2111 [R-1].
	Interpretation of Claims-Broadest Reasonable Interpretation
	During patent examination, the pending claims must be ‘given the broadest reasonable interpretation consistent with the specification.’  Applicant always has the opportunity to 
 
Applicant argues:
a.	Nowhere does Luo disclose, teach, or suggest querying an index to determine an image that does not include an object having a second visual similarity to the second object, as recited in independent claim 21, and as similarly recited in independent claims 27 and 34 (page 11). 
	In response to applicant's argument a: The argument is that Luo describes creating an image product (i.e., a collage) by combining cutouts of objects of interest (i.e., object cutouts), and that Luo does not teach determining an image that does not include an object having a second visual similarity to the second object.
	Examiner respectfully disagrees. Luo teaches in para 0025 that the data processing system stores digital content records in the processor-accessible memory system, and in para 0028-0029 splitting an image collection into groups of related images by analyzing this metadata or visual similarity by a clustering process, e.g., k-means, which is well known in the art. A group of topically related photos is provided by such a clustering process 200. The next step includes selecting a seed image from each group of images and displaying it to a user 210. One way to do this is to pick an image at random out of the group (i.e., index). One can also employ a heuristic that selects that image which includes the most non-informative distribution of visual features, which can be indicated by a high entropy on the distribution, among all images in the group. Visual features here might include color features, such as pixel RGB or the well-known 
	Note: since images are clustered into groups (i.e., an index) based on visual similarity, the seed image (i.e., second image) selected from each group does not include objects that are visually similar to those in the seed image of another group. Fig. 3 provides a group of topically related photos 500. According to para 0028, if the object in these images is a tree, then these three images include visually different objects. Therefore, Luo clearly teaches the argued limitation of querying (i.e., picking or selecting) an index (i.e., visually similar groups) to determine an image that does not include an object (e.g., a tree, person, or background as shown in the Fig. 3 seed images) having a second visual similarity to the second object.
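	For illustration only, the seed-image heuristic quoted above from Luo para 0028-0029 (selecting, within a group, the image whose distribution of visual features has the highest entropy) can be sketched as follows. This is a minimal sketch and not Luo's implementation: the function names, the representation of each image as a pre-binned histogram of feature counts, and the sample group are assumptions made here.

```python
import math
from typing import Dict, List


def shannon_entropy(histogram: List[float]) -> float:
    """Entropy (in bits) of a histogram of visual-feature counts."""
    total = sum(histogram)
    if total <= 0:
        return 0.0
    entropy = 0.0
    for count in histogram:
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def select_seed_image(group: Dict[str, List[float]]) -> str:
    """Return the id of the image whose feature histogram has the highest entropy."""
    if not group:
        raise ValueError("group must contain at least one image")
    return max(group, key=lambda image_id: shannon_entropy(group[image_id]))


# Hypothetical group of topically related images (image id -> feature histogram).
example_group = {
    "img_a": [10, 0, 0, 0],  # all mass in one bin: lowest entropy
    "img_b": [4, 3, 2, 1],
    "img_c": [5, 5, 5, 5],   # uniform distribution: highest entropy
}
seed = select_seed_image(example_group)
```

	In this hypothetical group the uniform histogram has the highest entropy, so that image is the one selected as the seed.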
b.	 There is nothing in Keating that discloses, teaches, or suggests determining a combined similarity score based on a combination of similarity scores that correspond to a visual similarity of certain objects, as recited in dependent claim 41, and as similarly recited in dependent claim 42. (page 11-12).
	In response to applicant's argument b:  The argument is that Keating is limited to combining multiple versions (e.g., with different scaling) of the same image.
	Examiner respectfully disagrees. As claimed, the visual similarity is based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the object. Accordingly, Keating discloses in para 0059-0060 that a visual image can also be divided in color space. For example, average color can be computed for each spatial region (e.g., block) of a visual image, and the regions put into standard bins based on the computed average colors. If we suppose that there are 8 bins for 
	Para 0084-0085 also teach that each derivative "image" can be termed a "response image." The values in each response image can be put into a histogram. Each such histogram is a representation of the statistical distribution of values within any response image, and the process-response statistical model for a visual image is the collection of histograms for the visual image and any response images. Statistical models other than histograms can be used to represent the distribution of values for a given response image and combined to produce a process-response statistical model for a visual image. See also para 0088.
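	For illustration only, the block-wise color binning described in the passage cited from Keating para 0059-0060 (computing an average color per spatial block and placing each block into a bin based on that average) can be sketched as follows. This is a minimal sketch and not Keating's implementation: the function names, the per-channel quantization scheme, the `bins_per_channel` parameter, and the sample image are assumptions made here, since the cited passage is not reproduced in full above.

```python
from collections import Counter
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0-255


def block_average(image: List[List[Pixel]], top: int, left: int, size: int) -> Pixel:
    """Average color of the size x size block whose top-left corner is (top, left)."""
    pixels = [
        image[r][c]
        for r in range(top, min(top + size, len(image)))
        for c in range(left, min(left + size, len(image[0])))
    ]
    n = len(pixels)
    return tuple(sum(p[ch] for p in pixels) // n for ch in range(3))


def color_bin(color: Pixel, bins_per_channel: int) -> int:
    """Map an RGB color to a bin index by quantizing each channel (assumed scheme)."""
    width = 256 / bins_per_channel
    r, g, b = (min(int(v / width), bins_per_channel - 1) for v in color)
    return (r * bins_per_channel + g) * bins_per_channel + b


def block_color_histogram(
    image: List[List[Pixel]], block_size: int, bins_per_channel: int = 2
) -> Counter:
    """Count how many blocks of the image fall into each color bin."""
    hist: Counter = Counter()
    for top in range(0, len(image), block_size):
        for left in range(0, len(image[0]), block_size):
            avg = block_average(image, top, left, block_size)
            hist[color_bin(avg, bins_per_channel)] += 1
    return hist


# Hypothetical 2 x 4 image: a red 2 x 2 block next to a blue 2 x 2 block.
RED, BLUE = (255, 0, 0), (0, 0, 255)
example_image = [[RED, RED, BLUE, BLUE], [RED, RED, BLUE, BLUE]]
histogram = block_color_histogram(example_image, block_size=2)
```

	Two images could then be compared by comparing their histograms; the choice of comparison measure is left open in this sketch.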
	In view of the above, the examiner contends that all limitations as recited in the claims have been addressed in this Action. For the above reasons, Examiner believes that the rejection in the last Office action was proper.

Claim Objections
Claim 21 is objected to because of the following informalities: the “query at least one index…” limitation recites “image-indicated” and “image-including” where “image indicated” and “image including” are expected, consistent with the other independent claims.
Claim 41 depends on claim 21, and its “receive” step recites “receive a second selection of a fourth object”; however, claim 21 already recites “receive a second selection of a second object”. Thus it is unclear which selection the “second selection” of claim 41 refers to.
Appropriate correction is required.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees.  A nonstatutory double patenting rejection is appropriate where the claims at issue are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the reference application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the 
The USPTO internet Web site contains terminal disclaimer forms which may be used.  Please visit http://www.uspto.gov/forms/.  The filing date of the application will determine what form should be used.  A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission.  For more information about eTerminal Disclaimers, refer to http://www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.  
Claims 21-34 and 37-40 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 4-8, 10, 13-15, and 18-19 respectively of copending Application No. 14/279,871. Although the claims at issue are not identical, they are not patentably distinct from each other.
This is a nonstatutory provisional obviousness-type double patenting rejection because the patentably indistinct claims have not in fact been patented. 
See the following chart for claim correspondences:
Instant Application # 15/491,951
Co-pending Application # 14/279,871
Claim 21. (New) A computing system, comprising: at least one processor; memory including instructions that, when executed by the at least one processor, cause the computing system to at least:
process a first image that includes a plurality of objects to detect each of the plurality of objects;
for each of the plurality of objects, associate a token with the object;
present the first image that includes the plurality of objects;
receive a first selection of a first object of the plurality of objects from the first image;
receive a second selection of a second object of the plurality of objects from the first image;
determine a first token corresponding to the first object and a second token corresponding to the second object;
query at least one index maintained in a data store to determine, based at least in part on the first token or the second token, a second image indicated in the at least one index, the second image including a third object that is visually similar to the first object and does not include an object that is visually similar to the second object; and
present at least a portion of the second image, wherein the at least a portion of the second image includes the second object.

Claim 22. (New) The computing system of claim 21, wherein the at least a portion of the second image is presented concurrently with the first image. 
Claim 23. (New) The computing system of claim 21, wherein the instructions when executed further cause the computing system to at least: for each of the plurality of objects: index the object using a text-based retrieval technique such that the object is included in at least one index maintained in the data store. 
Claim 24. (New) The computing system of claim 21, wherein the instructions when executed further cause the computing system to at least:
receive a third selection of a fourth object represented in the first image of the plurality of objects; and 
wherein the second image further includes a fifth object that is visually similar to the fourth object.

Claim 25. (New) The computing system of claim 21, wherein the instructions when executed further cause the computing system to at least: present on the first image, a graphical representation of a selection control identifying that an object included in the first image may be selected.

Claim 26. (New) The computing system of claim 21, wherein the instructions when executed further cause the computing system to at least: 
receive a third selection of a fourth object of the plurality of objects from the first image; 
determine that the second image includes a fifth object that is visually similar to the fourth object; and 
wherein the presented at least a portion of the second image includes at least a portion of the fifth object.

Claim 27. (New) A computer-implemented method, comprising:
processing a first image that includes a plurality of objects to detect each of the plurality of objects;
for each of the plurality of objects, associating a token with the object:
receiving a first selection of a first location within the first image;
receiving a second selection of a second location within the first image;
determining a first token corresponding to a first object represented at the first location;
determining a second token corresponding to a second object represented at the second location;
querying at least one index maintained in a data store to determine, based at least in part on the first token and the second token, a plurality of additional images indicated in the at least one index, each of the plurality of additional images including a representation of an additional object that is visually similar to the first object and does not include an object that is visually similar to the second object; and
presenting, concurrently with the first image, at least a portion of each of the plurality of additional images.

Claim 28. (New) The computer-implemented method of claim 27, wherein the first selection is at least one of a touch-based input on a display, a determined position of a user's gaze, or an input from an input component.

Claim 29. (New) The computer-implemented method of claim 27, further comprising: determining that the first selection is a positive selection; and determining that the second selection is a negative selection. (part of claim 1 of ‘871).

Claim 30. (New) The computer-implemented method of claim 27, further comprising:
presenting at least one selection control at a location on the first image to identify an object represented in the first image that is selectable.

Claim 33. (New) The computer-implemented method of claim 27, further comprising:
receiving a third selection of a second image from the plurality of additional images;
removing the presentation of the first image; and presenting the second image.

34. (New) A non-transitory computer-readable storage medium storing instructions that, when executed by at least one processor of a computing system, cause the computing system to at least:
receive a selection of a first representation of a first object included in a first image presented in a first window;
in response to the selection, process a first region of the first image corresponding to the first object to form a first token representative of the first object;
query at least one index maintained in a data store to determine, based at least in part on the first token, a second image indicated in the at least one index, the second image including a second representation of a second object that is visually similar to the first object; and
present, in a second window, at least a portion of the second image.

35. (New) The non-transitory computer-readable storage medium of claim 34, wherein an object is determined visually similar based at least in part on a shape, size, color, or brand of the object.

36. (New) The non-transitory computer-readable storage medium of claim 34, further comprising:
receive a second selection of a third representation of a third object included in the first image;
in response to the second selection, process a second region of the first image to form a second token; and
wherein the query of the at least one index is further to determine, based at least in part on the second token, the second representation such that the second representation does not include a representation of an object that is visually similar to the third object.

37. (New) The non-transitory computer-readable storage medium of claim 34, further comprising:
receive a second selection of a third representation of a third object included in the first image;
in response to the second selection, process a second region of the first image to form a second token; and
wherein the query of the at least one index is further to determine, based at least in part on the second token, the second representation such that the second representation includes a representation of an object that is visually similar to the third object.

38. (New) The non-transitory computer-readable storage medium of claim 34, further comprising: indexing the first object in the at least one data store. 


Claim 1. (Currently amended) A computing system, comprising: at least one processor; memory including instructions that, when executed by the at least one processor, cause the computing system to at least:
process a first image that includes a plurality of objects to determine each of the plurality of objects; 
for each of the plurality of objects: associate a token with the object, the token representative of an object type of the object; and 
index the object using a text-based retrieval technique
present on a display the first image that includes the plurality of objects; 
receive a first user input selecting a first object of the plurality of objects from the first image; 
receive a second user input selecting a second object of the plurality of objects from the first image;
determine that the first user input selecting the first object of the plurality of objects is a positive selection; 
determine that the second user input selecting the second object of the plurality of objects is a negative selection;
determine a first token indicating a first object type of the first object and a second token indicating a second object type of the second object;
query, with the first token, at least one index maintained in the data store in which at least one of the first object is indexed or the second object is indexed to determine a first plurality of images corresponding to the first object type;
assign a positive weighting to each of the first plurality of images to adjust a similarity score for each image of the first plurality of images; 
query, with the second token, at least one index maintained in the data store in which at least one of the first object is indexed or the second object is indexed to determine a second plurality of images corresponding to the second object type;
assign a negative weighting to each of the second plurality of images to adjust a similarity score for each image of the second plurality of images;
rank, based at least in part on the similarity score of each image, the first plurality of images and the second plurality of images to produce a ranked plurality of images; and
send for presentation, concurrently with the first image, at least a portion of a highest ranked image that includes the first object type and does not include the second object type.

Part of claim 1

Part of claim 1

Claim 5. (Previously presented) The computing system of claim 1, wherein the instructions when executed further cause the computing system to at least: present on the first image, a graphical representation of a selection control identifying that an object included in the first image may be selected by a user. 

    Claim 6. (Currently amended) The computing system of claim 1, wherein the instructions when executed further cause the computing system to at least:
receive a third user input selecting a third object of the plurality of objects from the first image; 
determine a third object type of the third object; 
determine that the highest ranked image includes a fourth object 
wherein a presented at least a portion of the highest ranked image includes at least a portion of the fourth object.

7.	(Currently amended) A computer-implemented method, comprising: processing a first image that includes a plurality of objects to determine each of the plurality of objects; 
for each of the plurality of objects: associating a token with the object, the token representative of an object type of the object; and indexing the object using a text-based retrieval technique 
receiving a first selection of a first location within the first image; 
receiving a second selection of a second location within the first image; 
determining that the first selection is a positive selection;
determining that the second selection is a negative selection; 
determining a first token corresponding to a first object represented at the first selected location; 
determining a second token corresponding to a second object represented at the second selected location; 
querying at least one index maintained in the data store in which at least one of the first object is indexed or the second object is indexed to determine, based at least in part on the first token and the second token, a plurality of additional images indicated in the at least one index, each of the plurality of additional images including a representation of an additional object that is a same object type as the first object or a representation of an object that is a same object type as the second object; 
assigning a positive weighting to a similarity score of each image of the plurality of additional images that include a representation of an object that is the same object type as the first object;
assigning a negative weighting to a similarity score of each image of the plurality of additional images that include a representation of an object that is the same object type as the second object; 
ranking, based at least in part on the similarity score of each image, the plurality of additional images to produce a ranked plurality of images; and
sending for presentation, concurrently with the first image, at least a portion of a highest ranked image that includes the representation of an object that is the same object type as the first object and does not include the representation of an object that is the same object type as the second object. 


Claim 8. (Previously presented) The computer-implemented method of claim 7, wherein the first selection is at least one of a touch-based input on a display, a determined position of a user’s gaze, or an input from an input component. 

Part of claim 1

Claim 10. (Original) The computer-implemented method of claim 7, further comprising:
presenting at least one selection control at a location on the first image to identify an object represented in the first image that is selectable. 

Claim 13. (Previously presented) The computer-implemented method of claim 7, further comprising:
receiving a third selection of a second image from the plurality of additional images; 
removing the presentation of the first image; and presenting the second image. 

     14. 	(Currently amended) A non-transitory computer-readable storage medium storing instructions that, when executed by at least one processor of a computing system, cause the computing system to at least:
receive a selection of a first representation of a first object included in a first image presented in a first window;
in response to the selection, process a first region of the first image corresponding to the first object to form a first token representative of a first object type of the first object;
receive a second selection of a second representation of a second object included in the first image presented in the first window;
in response to the second selection, process a second region of the first image corresponding to the second object to form a second token representative of a second object type of the second object;
determine that the first selection is a positive selection;
determine that the second selection is a negative selection;
query at least one index maintained in a data store in which at least one of the first object is indexed or the second object is indexed to determine, based at least in part on the first token or the second token a plurality of additional images indicated in the at least one index, each of the plurality of additional images including a representation of an additional object that is a same object type as the first object or a representation of an object that is a same object type as the second object; 
assigning a positive weighting to a similarity score of each image of the plurality of additional images that include a representation of an object that is the same object type as the first object;
assigning a negative weighting to a similarity score of each image of the plurality of additional images that include a representation of an object that is the same object type as the second object; 
ranking, based at least in part on the similarity score of each image, the plurality of additional images to produce a ranked plurality of images; and
send for presentation, in a second window, at least a portion of a highest ranked image that includes the first object type and does not include the second object type.

15. 	(Previously presented) The non-transitory computer-readable storage medium of claim 14, wherein each similarity score is determined based at least in part on a shape, a size, a color, or a brand of the object. 

18. 	(Currently amended) The non-transitory computer-readable storage medium of claim 14, further comprising:
receive a third selection of a third representation of a third object included in the first image presented in the first window; 
determine that the third selection is a positive selection; 
determine a third object type of the third object; and wherein:
        the query includes a query of at least one index maintained in the data store in which at least one of the first object is indexed, the second object is indexed, or the third object is indexed; and
          the highest ranked image further includes 

19. 	(Previously presented) The non-transitory computer-readable storage medium of claim 14, wherein the instructions when executed further cause the computing system to at least:
receive a third selection of the presented at least a portion of the highest ranked image; and 
present the at least a portion of the highest ranked image in the first window.

The limitation of claim 38 of ‘951 is included in claim 1 of ‘871. 




Claims 21-34 and 37-40 are provisionally rejected on the ground of nonstatutory obviousness-type double patenting for the reasons set forth both below in the claims and above in the previous double patenting rejections.  The claims of the ‘871 application anticipate the claims of the Instant Application with minor phrase changes.  Therefore, the claims in the Instant Application are obvious in view of the claims of the ‘871 Application.  This is a provisional obviousness-type double patenting rejection.
It would have been obvious to a person of ordinary skill in the art at the time the invention was made to modify the '951 system claims with the '871 system claims by including the claimed features of determining a first token indicating a first object type and a second token indicating a second object type, assigning a positive weighting or a negative weighting to each of the first plurality of images to adjust a similarity score, querying the index to determine first and second pluralities of images with positive and negative weighting, and ranking the plurality of images based on the similarity score, in order to achieve the same functions performed by both systems.
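For illustration only, the positive and negative weighting and ranking steps recited in claim 1 of the '871 application can be sketched as follows. This is a minimal sketch and not the applicant's implementation: the function name, the representation of the index as a mapping from a token to a set of image identifiers, the starting score of zero, and the weight values are assumptions made here.

```python
from typing import Dict, List, Set, Tuple


def rank_images(
    index: Dict[str, Set[str]],
    positive_token: str,
    negative_token: str,
    positive_weight: float = 1.0,
    negative_weight: float = -1.0,
) -> List[Tuple[str, float]]:
    """Query the index with each token, weight the results, and rank by score."""
    scores: Dict[str, float] = {}
    # Images matching the positively selected object type are boosted.
    for image_id in index.get(positive_token, set()):
        scores[image_id] = scores.get(image_id, 0.0) + positive_weight
    # Images matching the negatively selected object type are penalized.
    for image_id in index.get(negative_token, set()):
        scores[image_id] = scores.get(image_id, 0.0) + negative_weight
    # Highest score first; ties broken by image id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical index: token (object type) -> images containing that object type.
example_index = {
    "lamp": {"img1", "img2"},
    "sofa": {"img2", "img3"},
}
ranked = rank_images(example_index, positive_token="lamp", negative_token="sofa")
top_image = ranked[0][0]
```

In this hypothetical index, the highest ranked image is the one that includes the positively selected object type and does not include the negatively selected object type.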

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 21-34 and 37-40 are rejected under 35 U.S.C. 103 as being unpatentable over Gokturk et al., US 2013/0132236 (hereinafter “Gokturk”) in view of Luo et al., US 2010/0226566 A1 (hereinafter “Luo”).

As to claim 21,
 
Gokturk teaches a computing system, comprising: at least one processor; memory including instructions that, when executed by the at least one processor, cause the computing system to at least: 
process a first image that includes a plurality of objects to detect each of the plurality of objects (Gokturk teaches in paragraph 0037, Embodiments described herein enable programmatic detection and/or identification of various types and classes of objects from images, including objects that are items of commerce or merchandise); 
for each of the plurality of objects, associate a token with the object (Gokturk teaches in Para 0064, the text translator 135 may convert the feature extraction 134 into the text "orange" and assign the text value to a field that is designated for the type of feature.);

present the first image that includes the plurality of objects; receive a first selection of a first object of the plurality of objects from the first image (See Gokturk Para 0192, the user-interface 1110 may be configured to enable a user to select or specify a portion of a processed image as the search input 1112.  The processed image may be existing, or it may be analyzed on the fly in the case of trigger inputs 134 and user-inputs 136.  In either case, portions of the image may be made selectable, or at least identifiable with user input.  Once selected, the portion of the image can form the basis of the query, or the user may manipulate or specify additional features that are to be combined with that portion.  For example, the user may specify a portion of an object image, then specify a color that the portion of the object is to possess.);
receive a second selection of a second object of the plurality of objects from the first image (Gokturk teaches in Para 0192, the user-interface 1110 may be configured to enable a user to select or specify a portion of a processed image as the search input 1112.  The processed image may be existing, or it may be analyzed on the fly in the case of trigger inputs 134 and user-inputs 136.  In either case, portions of the image may be made selectable, or at least identifiable with user input.  Once selected, the portion of the image can form the basis of the query, or the user may manipulate or specify additional features that are to be combined with that portion.  For example, the user may specify a portion of an object image, then specify a color that the portion of the object is to possess); 
determine a first token corresponding to the first object and a second token corresponding to the second object (Gokturk teaches in Para 0064, a text translator 136 that converts class-specific feature data 134 (or signatures 128) into text, so that local and/or global features may be represented by text data 129.);

query at least one index maintained in a data store to determine, based at least in part on the first token or the second token, a second image indicated in the at least one index, the second image including a third object having a first visual similarity to the first object (Gokturk teaches in Para 0161, the system searches for top N most similar key points and/or regions.  The similarity is defined by one of the above described similarity metrics.  Each of these N nearest neighbors are mapped to the corresponding image in the database.  These mapped images form the potential search results.  A default rank of 0 is assigned to each of these images.  Sometimes multiple points or regions from an image in the database matches to a query point or region.  This results in many to many mappings of the points),  wherein the first visual similarity is based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the first object (Gokturk teaches in paragraph 0290: In response to the query, step 2340 provides that the visual search engine returns images of objects that correspond to or are otherwise determined to be similar in appearance or design or even style, as the object of the user's selection); 
present at least a portion of the second image, wherein the at least a portion of the second image includes the third object (Gokturk teaches in Fig. 16-18).
Even though Gokturk teaches query at least one index maintained in a data store to determine, based at least in part on the first token or the second token, a second image indicated in the at least one index, the second image including a third object having a first visual similarity to the first object, Gokturk does not explicitly teach the second image does not include an object having a second visual similarity to the second object, wherein the second visual similarity is based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the second object.
However, Luo teaches the second image does not include an object having a second visual similarity to the second object (Luo, para 0033, the object cutouts can be used 260 to create image products, including: creating a collage from the object cutouts in a group, removing unwanted object and filling in the resulting blank space by texture synthesis or in-painting, compositing one or more of the object cutouts on a new image background), wherein the second visual similarity is based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the second object (Luo, para 0029, Visual features here might include color features, such as pixel RGB or the well-known HSV (hue-saturation-value) or CIELAB responses, or texture features, such as filter bank responses, or any other image descriptors).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Gokturk by including a method for extracting an object out of each image in a group of digital images that contain the object, where the object cutouts can be used to create image products, including by removing an unwanted object, in order to improve the processing of a large number of images, as taught by Luo.
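
For illustration only, the query limitation discussed above (a second image that includes an object similar to the first selected object and does not include an object similar to the second selected object) can be sketched as a lookup in a token-keyed inverted index. This is a minimal sketch of the claim language under stated assumptions, not an implementation disclosed by Gokturk or Luo; the index contents, the token strings, and the function name query_index are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical inverted index: token -> ids of images that contain an object
# mapped to that token. The contents are invented for illustration.
INDEX: Dict[str, Set[str]] = {
    "lamp": {"img1", "img2", "img3"},
    "sofa": {"img2", "img4"},
}


def query_index(index: Dict[str, Set[str]],
                positive_token: str,
                negative_token: str) -> List[str]:
    """Return sorted ids of images that match the positive token
    and do not match the negative token."""
    with_positive = index.get(positive_token, set())
    with_negative = index.get(negative_token, set())
    # Set difference keeps images with the wanted object and drops any
    # image that also contains the unwanted object.
    return sorted(with_positive - with_negative)


print(query_index(INDEX, "lamp", "sofa"))
```

In this sketch, img2 is dropped because it is indexed under both tokens.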

As to claim 22,
The combination of Gokturk and Luo teaches the at least a portion of the second image is presented concurrently with the first image (Gokturk teaches in Fig. 16-18 concurrently presenting images with the first image).  

As to claim 23,
The combination of Gokturk and Luo teaches the instructions when executed further cause the computing system to at least: for each of the plurality of objects: index the object using a text-based retrieval technique such that the object is included in at least one index maintained in the data store (See Gokturk Para 0178, an index 1150 which stores index data 1152 generated from the analysis of images and content item.  The index data 1152 may be generated as a result of the performance of the content analysis system 1140.  System 1100 may include a record data store 1160 that holds records 1162 that include content items analyzed by the content analysis system 1140.  Procurement 1130 may retrieve and populate records 1162 with record data 1164. Also see Para 0217, The searchable text of the index of other data structure may include text that is characteristic of an image, and programmatically determined from performing analysis or recognition of the image).  

As to claim 24,
The combination of Gokturk and Luo teaches the instructions when executed further cause the computing system to at least: receive a third selection of a fourth object represented in the first image of the plurality of objects; and wherein the second image further includes a fifth object having a third visual similarity to the fourth object, the third visual similarity being based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the fourth object (Gokturk teaches in paragraph 0191: search module may implement one or more filters to identify items by category classification and display the results. See also paragraph 0290: In response to the query, step 2340 provides that the visual search engine returns images of objects that correspond to or are otherwise determined to be similar in appearance or design or even style, as the object of the user's selection. So style is the visual similarity between the objects).  
As to claim 25,
The combination of Gokturk and Luo teaches the instructions when executed further cause the computing system to at least: present on the first image, a graphical representation of a selection control identifying that an object included in the first image may be selected (Gokturk teaches in paragraph 0030: FIG. 18 illustrates implementation of a selector graphic feature for enabling a user to select a portion of an image of an object. Paragraph 0244 also teaches: An example of a selector graphic feature 1810 for enabling the user to make the local region selection is shown with an embodiment of FIG. 18.  The feature 1810 may be graphic, and sizeable (expand and contract) over a desired position of a processed image 1820.  In the example provided, a rectangle may be formed by the user over a region of interest of the image-that being the face of a watch).  

As to claim 26,
The combination of Gokturk and Luo teaches the instructions when executed further cause the computing system to at least: receive a third selection of a fourth object of the plurality of objects from the first image; determine that the second image includes a fifth object having a third  visual similarity to the fourth object, the third visual similarity being based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the fourth object (Gokturk teaches in paragraph 0290: In response to the query, step 2340 provides that the visual search engine returns images of objects that correspond to or are otherwise determined to be similar in appearance or design or even style, as the object of the user's selection. So style is the visual similarity between the objects); and wherein the presented at least a portion of the second image includes at least a portion of the fifth object (Gokturk teaches in paragraph 0191: search module may implement one or more filters to identify items by category classification and display the results).  

As to claim 27,
The combination of Gokturk and Luo teaches a computer-implemented method, comprising: 
processing a first image that includes a plurality of objects to detect each of the plurality of objects; (See Gokturk Para 0037, Embodiments described herein enable programmatic detection and/or identification of various types and classes of objects from images, including objects that are items of commerce or merchandise.); 
for each of the plurality of objects, associating a token with the object: receiving a first selection of a first location within the first image; (Gokturk Para 0064, the text translator 135 may convert the feature extraction 134 into the text "orange" and assign the text value to a field that is designated for the type of feature.);  
receiving a second selection of a second location within the first image; determining a first token corresponding to a first object represented at the first location; determining a second token corresponding to a second object represented at the second location; (See Gokturk Para 0192, the user-interface 1110 may be configured to enable a user to select or specify a portion of a processed image as the search input 1112.  The processed image may be existing, or it may be analyzed on the fly in the case of trigger inputs 134 and user-inputs 136.  In either case, portions of the image may be made selectable, or at least identifiable with user input.  Once selected, the portion of the image can form the basis of the query, or the user may manipulate or specify additional features that are to be combined with that portion.  For example, the user may specify a portion of an object image, then specify a color that the portion of the object is to possess.);
querying at least one index maintained in a data store to determine, based at least in part on the first token and the second token, a plurality of additional images indicated in the at least one index, each of the plurality of additional images including a representation of an additional object having a first visual similarity to the first object (Gokturk teaches in Para 0161, the system searches for top N most similar key points and/or regions.  The similarity is defined by one of the above described similarity metrics.  Each of these N nearest neighbors are mapped to the corresponding image in the database.  These mapped images form the potential search results.  A default rank of 0 is assigned to each of these images.  Sometimes multiple points or regions from an image in the database matches to a query point or region.  This results in many to many mappings of the points) and does not include an object having a second visual similarity to the second object,
wherein the first visual similarity is based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the first object (Gokturk teaches in paragraph 0290: In response to the query, step 2340 provides that the visual search engine returns images of objects that correspond to or are otherwise determined to be similar in appearance or design or even style, as the object of the user's selection); and 
presenting, concurrently with the first image, at least a portion of each of the plurality of additional images (Gokturk teaches in Fig. 16-18).
Even though Gokturk teaches querying at least one index maintained in a data store to determine, based at least in part on the first token and the second token, a plurality of additional images indicated in the at least one index, each of the plurality of additional images including a representation of an additional object having a first visual similarity to the first object, Gokturk does not explicitly teach each of the plurality of additional images does not include an object having a second visual similarity to the second object, wherein the second visual similarity is based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the second object. 

However, Luo teaches each of the plurality of additional images does not include an object having a second visual similarity to the second object (Luo, para 0033, the object cutouts can be used 260 to create image products, including: creating a collage from the object cutouts in a group, removing unwanted object and filling in the resulting blank space by texture synthesis or in-painting, compositing one or more of the object cutouts on a new image background), wherein the second visual similarity is based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the second object (Luo, para 0029, Visual features here might include color features, such as pixel RGB or the well-known HSV (hue-saturation-value) or CIELAB responses, or texture features, such as filter bank responses, or any other image descriptors).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Gokturk by including a method for extracting an object out of each image in a group of digital images that contain the object, where the object cutouts can be used to create image products, including by removing an unwanted object, in order to improve the processing of a large number of images, as taught by Luo.

As to claim 28,
The combination of Gokturk and Luo teaches the first selection is at least one of a touch-based input on a display, a determined position of a user's gaze, or an input from an input component (Gokturk teaches in paragraph 0049: human input). 
As to claim 29,
The combination of Gokturk and Luo teaches determining that the first selection is a positive selection (Gokturk teaches in Para 0192, the user-interface 1110 may be configured to enable a user to select or specify a portion of a processed image as the search input 1112.  The processed image may be existing, or it may be analyzed on the fly in the case of trigger inputs 134 and user-inputs 136.  In either case, portions of the image may be made selectable, or at least identifiable with user input.  Once selected, the portion of the image can form the basis of the query, or the user may manipulate or specify additional features that are to be combined with that portion.  For example, the user may specify a portion of an object image, then specify a color that the portion of the object is to possess); 
Gokturk does not explicitly teach determining that the second selection is a negative selection.
However, Luo teaches determining that the second selection is a negative selection (Luo, para 0033, the object cutouts can be used 260 to create image products, including: creating a collage from the object cutouts in a group, removing unwanted object and filling in the resulting blank space by texture synthesis or in-painting, compositing one or more of the object cutouts on a new image background). Athorus Matter No.: 127.0023-US-CON1 4 4831-6961-7479, v. 1 U.S. Patent Application No.: 15/491,951 Preliminary Amendment  

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Gokturk by including a method for extracting an object out of each image in a group of digital images that contain the object, where the object cutouts can be used to create image products, including by removing an unwanted object, in order to improve the processing of a large number of images, as taught by Luo.

As to claim 30,
The combination of Gokturk and Luo teaches presenting at least one selection control at a location on the first image to identify an object represented in the first image that is selectable (Gokturk teaches in paragraph 0030: FIG. 18 illustrates implementation of a selector graphic feature for enabling a user to select a portion of an image of an object. Paragraph 0244 also teaches: An example of a selector graphic feature 1810 for enabling the user to make the local region selection is shown with an embodiment of FIG. 18.  The feature 1810 may be graphic, and sizeable (expand and contract) over a desired position of a processed image 1820.  In the example provided, a rectangle may be formed by the user over a region of interest of the image-that being the face of a watch).
As to claim 31,
The combination of Gokturk and Luo teaches presenting adjacent each of the plurality of objects a selection control indicating that the respective object may be selected (Gokturk teaches in paragraph 0030: FIG. 18 illustrates implementation of a selector graphic feature for enabling a user to select a portion of an image of an object. Paragraph 0244 also teaches: An example of a selector graphic feature 1810 for enabling the user to make the local region selection is shown with an embodiment of FIG. 18.  The feature 1810 may be graphic, and sizeable (expand and contract) over a desired position of a processed image 1820.  In the example provided, a rectangle may be formed by the user over a region of interest of the image-that being the face of a watch).
As to claim 32,
The combination of Gokturk and Luo teaches determining for each of the plurality of additional images a respective similarity score (Gokturk teaches in paragraph 0163: The algorithm then converts distances returned by similarity metrics to goodness score); 
ranking each of the plurality of additional images based at least in part on the respective similarity scores (Gokturk teaches in paragraph 0168: the system returns the most similarly ranked images); and 
wherein presenting includes presenting, concurrently with the first image, at least a portion of the plurality of additional images in an ordered arrangement based at least in part on the respective similarity scores (Gokturk teaches in paragraph 0168: one or more embodiments enable the user to fuse search results by defining multiple selections. Here the user selects an image region and the system returns the most similarly ranked images. Now the user can select another image region and ask the system to return intersection or union of the two search results. More than two selections can be defined in a similar fashion).  
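
For illustration only, the cited operations (converting distances to goodness scores per paragraph 0163, ranking, and fusing two search results by intersection or union per paragraph 0168) can be sketched as follows. The cited passages do not give a conversion formula or a fusion rule; the mapping 1 / (1 + distance), the summing of scores across result sets, and the names goodness, rank, and fuse are assumptions made for this sketch.

```python
from typing import Dict, List


def goodness(distance: float) -> float:
    # Assumed monotone mapping: a smaller distance gives a higher score.
    return 1.0 / (1.0 + distance)


def rank(distances: Dict[str, float]) -> List[str]:
    """Order image ids from most to least similar (ties broken by id)."""
    return sorted(distances, key=lambda img: (-goodness(distances[img]), img))


def fuse(first: Dict[str, float],
         second: Dict[str, float],
         mode: str = "intersection") -> List[str]:
    """Fuse two result sets (image id -> distance) and rank the fused set."""
    if mode == "intersection":
        ids = set(first) & set(second)
    elif mode == "union":
        ids = set(first) | set(second)
    else:
        raise ValueError("mode must be 'intersection' or 'union'")

    def combined(img: str) -> float:
        # Assumption: scores from each result set that contains the image are summed.
        return sum(goodness(r[img]) for r in (first, second) if img in r)

    return sorted(ids, key=lambda img: (-combined(img), img))


# Hypothetical distances returned for two separately selected image regions.
first_results = {"a": 0.0, "b": 1.0, "c": 3.0}
second_results = {"b": 0.0, "c": 1.0, "d": 1.0}

print(rank(first_results))
print(fuse(first_results, second_results, "intersection"))
print(fuse(first_results, second_results, "union"))
```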

As to claim 33,
The combination of Gokturk and Luo teaches receiving a third selection of a second image from the plurality of additional images (Gokturk teaches in paragraph 0192: additional features added); 
removing the presentation of the first image (Gokturk teaches in paragraph 0192); and presenting the second image (Gokturk teaches in Fig. 15, 16 and 17).
As to claim 34,
The combination of Gokturk and Luo teaches a non-transitory computer-readable storage medium storing instructions that, when executed by at least one processor of a computing system, cause the computing system to at least: 
receive a selection of a first representation of a first object included in a first image presented in a first window (See Gokturk Para 0037, Embodiments described herein enable programmatic detection and/or identification of various types and classes of objects from images, including objects that are items of commerce or merchandise.); 
receive a second selection of a third representation of a third object included in the first image (Gokturk teaches in Para 0192, the user-interface 1110 may be configured to enable a user to select or specify a portion of a processed image as the search input 1112.  The processed image may be existing, or it may be analyzed on the fly in the case of trigger inputs 134 and user-inputs 136.  In either case, portions of the image may be made selectable, or at least identifiable with user input.  Once selected, the portion of the image can form the basis of the query, or the user may manipulate or specify additional features that are to be combined with that portion.  For example, the user may specify a portion of an object image, then specify a color that the portion of the object is to possess);
in response to the selection and the second selection, process a first region of the first image corresponding to the first object to form a first token representative of the first object (See Gokturk Para 0060, Feature extraction 120 may detect or determine features of either the segmented image 114, or a normalized segmented image 115) and process a second region of the first image to form a second token representative of the third object (Gokturk teaches in Para 0064, a text translator 136 that converts class-specific feature data 134 (or signatures 128) into text, so that local and/or global features may be represented by text data 129.);
query at least one index maintained in a data store to determine, based at least in part on the first token and the second token, a second image indicated in the at least one index, the second image including a second representation of a second object having a first visual similarity to the first object (See Gokturk Para 0161, the system searches for top N most similar key points and/or regions.  The similarity is defined by one of the above described similarity metrics.  Each of these N nearest neighbors are mapped to the corresponding image in the database.  These mapped images form the potential search results.  A default rank of 0 is assigned to each of these images.  Sometimes multiple points or regions from an image in the database matches to a query point or region.  This results in many to many mappings of the points), wherein the first visual similarity is based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the first object (Gokturk teaches in paragraph 0290: In response to the query, step 2340 provides that the visual search engine returns images of objects that correspond to or are otherwise determined to be similar in appearance or design or even style, as the object of the user's selection);
present, in a second window, at least a portion of the second image (Gokturk teaches in Fig. 16-18).  
Even though Gokturk teaches query at least one index maintained in a data store to determine, based at least in part on the first token and the second token, a second image indicated in the at least one index, the second image including a second representation of a second object having a first visual similarity to the first object, Gokturk does not explicitly teach the second image does not include a representation of an object having a second visual similarity to the third object, wherein the second visual similarity is based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the second object;
However, Luo teaches the second image does not include a representation of an object having a second visual similarity to the third object (Luo, para 0033, the object cutouts can be used 260 to create image products, including: creating a collage from the object cutouts in a group, removing unwanted object and filling in the resulting blank space by texture synthesis or in-painting, compositing one or more of the object cutouts on a new image background), wherein the second visual similarity is based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the second object (Luo, para 0029, Visual features here might include color features, such as pixel RGB or the well-known HSV (hue-saturation-value) or CIELAB responses, or texture features, such as filter bank responses, or any other image descriptors).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Gokturk by including a method for extracting an object out of each image in a group of digital images that contain the object, where the object cutouts can be used to create image products, including by removing an unwanted object, in order to improve the processing of a large number of images, as taught by Luo.

As to claim 37,
The combination of Gokturk and Luo teaches receive a second selection of a third representation of a third object included in the first image; in response to the second selection, process a second region of the first image to form a second token (See Gokturk Para 0192, the user-interface 1110 may be configured to enable a user to select or specify a portion of a processed image as the search input 1112.  The processed image may be existing, or it may be analyzed on the fly in the case of trigger inputs 134 and user-inputs 136.  In either case, portions of the image may be made selectable, or at least identifiable with user input.  Once selected, the portion of the image can form the basis of the query, or the user may manipulate or specify additional features that are to be combined with that portion.  For example, the user may specify a portion of an object image, then specify a color that the portion of the object is to possess); and 
wherein the query of the at least one index is further to determine, based at least in part on the second token, the second representation such that the second representation includes a representation of an object having a third visual similarity to the third object, the third visual similarity being based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the third object (See Gokturk Para 0161, the system searches for top N most similar key points and/or regions.  The similarity is defined by one of the above described similarity metrics.  Each of these N nearest neighbors are mapped to the corresponding image in the database.  These mapped images form the potential search results.  A default rank of 0 is assigned to each of these images.  Sometimes multiple points or regions from an image in the database matches to a query point or region.  This results in many to many mappings of the points).  

As to claim 38,
The combination of Gokturk and Luo teaches indexing the first object in the at least one data store (See Gokturk Para 0178, an index 1150 which stores index data 1152 generated from the analysis of images and content item.  The index data 1152 may be generated as a result of the performance of the content analysis system 1140.  System 1100 may include a record data store 1160 that holds records 1162 that include content items analyzed by the content analysis system 1140.  Procurement 1130 may retrieve and populate records 1162 with record data 1164. Also see Para 0217, The searchable text of the index of other data structure may include text that is characteristic of an image, and programmatically determined from performing analysis or recognition of the image).  
As to claim 39,
The combination of Gokturk and Luo teaches the instructions when executed further cause the computing system to at least: receive a second selection of the presented at least a portion of the second image; and present the second image in the first window (See Gokturk Para 0037, Embodiments described herein enable programmatic detection and/or identification of various types and classes of objects from images, including objects that are items of commerce or merchandise).  
As to claim 40,
The combination of Gokturk and Luo teaches the instructions when executed further cause the computing system to at least: determine a third image that includes a third representation of a third object having a third visual similarity to the first object (Gokturk teaches in paragraph 0168: one or more embodiments enable the user to fuse search results by defining multiple selections. Here the user selects an image region and the system returns the most similarly ranked images), the third visual similarity being based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the first object (Gokturk teaches in paragraph 0290: In response to the query, step 2340 provides that the visual search engine returns images of objects that correspond to or are otherwise determined to be similar in appearance or design or even style, as the object of the user's selection); 
rank the second image and the third image based at least in part on a similarity score associated with each of the second image and the third image (Gokturk teaches in paragraph 0163: The algorithm then converts distances returned by similarity metrics to goodness score. Paragraph 0168 also teaches the system returns the most similarly ranked images); and 
present, in the second window, a ranked representation of at least a portion of the second image and the third image (Gokturk teaches in paragraph 0168: one or more embodiments enable the user to fuse search results by defining multiple selections. Here the user selects an image region and the system returns the most similarly ranked images).

Claims 41-42 are rejected under 35 U.S.C. 103 as being unpatentable over “Gokturk” in view of “Luo” and further in view of Keating et al., US 20060020597 A1 (hereinafter “Keating”).

As to claim 41,
The combination of Gokturk and Luo teaches the instructions, when executed by the at least one processor, further cause the computing system to at least: 
determine a first similarity score corresponding to the first visual similarity of the first object and the third object (Gokturk, para 0110, associates edges with high similarity scores to pairs of adjacent image pixels with similar color and texture (i.e., visually similar)) ; 
receive a second selection of a fourth object of the plurality of objects from the first image (Gokturk teaches in Para 0192, the user-interface 1110 may be configured to enable a user to select or specify a portion of a processed image as the search input 1112.  The processed image may be existing, or it may be analyzed on the fly in the case of trigger inputs 134 and user-inputs 136.  In either case, portions of the image may be made selectable, or at least identifiable with user input.  Once selected, the portion of the image can form the basis of the query, or the user may manipulate or specify additional features that are to be combined with that portion.  For example, the user may specify a portion of an object image, then specify a color that the portion of the object is to possess); 
Although the combination of Gokturk and Luo teaches the invention as claimed above, the combination does not explicitly teach determine a second similarity score corresponding to a third visual similarity of the fourth object and a fifth object included in the second image; and determine a combined similarity score corresponding to the second image, the combined similarity score being based at least in part on a combination of the first similarity score and the second similarity score, wherein the third visual similarity is based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the fourth object.  
However, Keating teaches determine a second similarity score corresponding to a third visual similarity of the fourth object and a fifth object included in the second image (Keating, para 0059-0060 that a visual image can also be divided in color space. For example, average color can be computed for each spatial region (e.g., block) of a visual image, and the regions put into standard bins based on the computed average colors. If we suppose that there are 8 bins for average color (one bit per channel for a three-channel color space, for example), then we can have one process-response statistical model for all regions with average color 0, another for regions with average color 1, and so on. This use of information about the regions (i.e., fourth object and a fifth object) can advantageously enable more separation between statistics to be maintained. Thus, regions that tend to be more similar are compared on a statistical basis (i.e., determine second similarity score) independent of regions that may be quite different. As para 0051 a region can contain information about the relationship between two objects); and 
	determine a combined similarity score corresponding to the second image, the combined similarity score being based at least in part on a combination of the first similarity score and the second similarity score (Keating, para 0059-0060, a visual image can also be divided in color space. For example, average color can be computed for each spatial region (e.g., block) of a visual image, and the regions put into standard bins based on the computed average colors. If we suppose that there are 8 bins for average color (one bit per channel for a three-channel color space, for example), then we can have one process-response statistical model for all regions with average color 0, another for regions with average color 1, and so on. This use of information about the regions can advantageously enable more separation between statistics to be maintained. Thus, regions that tend to be more similar are compared on a statistical basis (i.e., combined similarity score based on combination of similarity) independent of regions that may be quite different. As para 0051 a region can contain information about the relationship between two objects), wherein the third visual similarity is based at least in part on at least one of an object type, a color, a size, a shape, a style, a brand, or a texture of the fourth object(Keating, para 0010: the comparison between visual images is typically done by comparing the feature vectors of the most prominent regions (determined in any of a variety of ways, e.g., by size or shape) in each visual image).  

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of the combination of Gokturk and Luo by including the process-response statistical modeling of visual images that is used in determining similarity between visual images. A similarity score and a quality score can be determined for each of multiple visual images of a group of visual images, the scores can be weighted as deemed appropriate (e.g., the weight of the similarity score can be made greater than that of the quality score), the scores combined, and the visual image having the highest or lowest combined score (depending on whether the increasing desirability of a visual image is represented by a higher or lower score) selected as the keyframe, as taught by Keating.
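For illustration only, the weighting and combining of a similarity score and a quality score described above might be sketched as follows. This is a minimal sketch under stated assumptions and not Keating's implementation: the function names, the linear weighted sum, and the example weights of 0.7 and 0.3 are hypothetical; the passage above states only that the scores can be weighted, combined, and the image with the highest or lowest combined score selected.

```python
def combined_score(similarity, quality, w_similarity=0.7, w_quality=0.3):
    """Weighted combination of two scores.

    The linear form and the default weights are assumptions; the similarity
    weight is simply chosen to be greater than the quality weight.
    """
    return w_similarity * similarity + w_quality * quality


def select_keyframe(candidates, higher_is_better=True):
    """Return the id of the candidate with the best combined score.

    candidates: list of (image_id, similarity, quality) tuples.
    higher_is_better: whether a higher score represents a more desirable image.
    """
    pick = max if higher_is_better else min
    best = pick(candidates, key=lambda c: combined_score(c[1], c[2]))
    return best[0]
```

With candidates ("a", 0.9, 0.2), ("b", 0.5, 0.9) and ("c", 0.3, 0.3), the sketch selects "a" when a higher score is more desirable and "c" when a lower score is more desirable.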

As to claim 42,
The combination of Gokturk, Luo and Keating teaches each of the plurality of additional images includes a plurality of additional objects and the respective similarity score for each of the plurality of additional images includes a combination of a plurality of additional object similarity scores (Keating, para 0059-0060: a visual image can also be divided in color space. For example, average color can be computed for each spatial region (e.g., block) of a visual image, and the regions put into standard bins based on the computed average colors. If we suppose that there are 8 bins for average color (one bit per channel for a three-channel color space, for example), then we can have one process-response statistical model for all regions with average color 0, another for regions with average color 1, and so on. This use of information about the regions can advantageously enable more separation between statistics to be maintained. Thus, regions that tend to be more similar are compared on a statistical basis (i.e., combined similarity score based on combination of similarity) independent of regions that may be quite different. Per para 0051, a region can contain information about the relationship between two objects).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
The reference Anguelov et al. (US 9,098,741 B1) discloses methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for object detection.  Methods can include identifying image filters for each of a plurality of locations in one or more positive images, each image filter representing visual features of a location in a positive image (e.g., an image that includes a particular object).  Positive location feature scores and negative location feature scores are determined for locations within images. 
The reference Wang et al. (US 2012/0123976 A1) discloses methods and systems for object-sensitive image searches. These methods and systems are usable for receiving a query for an image of an object and providing a ranked list of query results to the user based on a ranking of the images.  The object-sensitive image searches may generate a pre-trained multi-instance learning (MIL) model trained from free training data from users sharing images at websites to identify a common pattern of the object, and/or may generate a MIL model "on the fly" trained from pseudo-positive and pseudo-negative samples of query results to identify a common pattern of the object.

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NARGIS SULTANA whose telephone number is (571)272-6350.  The examiner can normally be reached on Monday to Thursday 8:30am to 4:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ashish Thomas, can be reached at 571-272-0631.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





3/31/2021
/NARGIS SULTANA/Examiner, Art Unit 2164        

/ASHISH THOMAS/Supervisory Patent Examiner, Art Unit 2164