Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

Applicant's response filed May 24, 2021 has been considered and entered. Accordingly, claims 1 – 3, 5 – 10, 12 – 17, 19 and 21 – 24 are pending in this application. Additionally, claims 1, 3, 8, 10, 15 and 17 are amended; claims 4, 11 and 18 are cancelled; claim 20 was previously cancelled; and claims 22 – 24 are newly added.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claim(s) 1, 2, 8, 9, 15, 16 and 21 are rejected under 35 U.S.C. 102 (a)(2) as being anticipated by Huang et al. (US 2019/0311070 A1).

As to claim 1, Huang et al. teaches receiving from a user an image and a query associated with one or more objects within the image (figure 3A section 302 [which teaches capturing an image and receiving a speech input. It is noted that the received speech input is being interpreted as the query]), wherein the query comprises an indication of a plurality of possible actions regarding the one or more objects of the image that a user may perform, wherein the plurality of possible actions can be taken with regards to one or more real-world items appearing as the corresponding one or more objects in the image (paragraph [0023] [section 302 teaches capturing an image and a speech signal, a user being able to initiate a visual search and being prompted by a message to provide an image and to speak a search request. In addition, an example of what the speech signal could be, based on the captured picture, is shown in the last 5 lines of paragraph [0023]]);
identifying, from the image, an identified object of the one or more objects associated with the query (figure 3A section 308 [teaches processing of the cropped image to determine a likely identified object]);
retrieving a feature corresponding to the identified object from a database (paragraph [0028] lines 1 – 5 [teaches a classification portion based on the detection service trained by an image database. In addition to the object detection portion, the classification/detection service identifies contiguous boundaries in the images]);
selecting an action of the plurality of possible actions based on a visual feature of the identified object (figure 3A section 314, 316 and 322 [section 314 teaches determining the intent of the text in order to generate a query; in addition, section 316 determines a weight based on the intent of the query generated from the speech]); and
returning, responsive to the query associated with the image, a result of the query including the selected action to be performed by the user on a real world item of the one or more real-world items that appears in the image as the identified object (figure 3A section 324 and paragraph [0044] [teaches displaying a search result based on the combination of the image and the text intent. In addition, paragraph [0044] teaches, based on the combination of the image and the text intent, an example of what the results are for a combined image and text search]).

As to claim 8, Huang et al. teaches receiving from a user an image and a query associated with one or more objects within the image (figure 3A section 302 [which teaches capturing an image and receiving a speech input. It is noted that the received speech input is being interpreted as the query]), wherein the query comprises an indication of a plurality of possible actions regarding the one or more objects of the image that a user may perform, wherein the plurality of possible actions can be taken with regards to one or more real-world items appearing as the corresponding one or more objects in the image (paragraph [0023] [section 302 teaches capturing an image and a speech signal, a user being able to initiate a visual search and being prompted by a message to provide an image and to speak a search request. In addition, an example of what the speech signal could be, based on the captured picture, is shown in the last 5 lines of paragraph [0023]]);
identifying, from the image, an identified object of the one or more objects associated with the query (figure 3A section 308 [teaches processing of the cropped image to determine a likely identified object]);
retrieving a feature corresponding to the identified object from a database (paragraph [0028] lines 1 – 5 [teaches a classification portion based on the detection service trained by an image database. In addition to the object detection portion, the classification/detection service identifies contiguous boundaries in the images]);
selecting an action of the plurality of possible actions based on a visual feature of the identified object (figure 3A section 314, 316 and 322 [section 314 teaches determining the intent of the text in order to generate a query; in addition, section 316 determines a weight based on the intent of the query generated from the speech]); and
returning, responsive to the query associated with the image, a result of the query including the selected action to be performed by the user on a real world item of the one or more real-world items that appears in the image as the identified object (figure 3A section 324 and paragraph [0044] [teaches displaying a search result based on the combination of the image and the text intent. In addition, paragraph [0044] teaches, based on the combination of the image and the text intent, an example of what the results are for a combined image and text search]).

As to claim 15, Huang et al. teaches receiving from a user an image and a query associated with one or more objects within the image (figure 3A section 302 [which teaches capturing an image and receiving a speech input. It is noted that the received speech input is being interpreted as the query]), wherein the query comprises an indication of a plurality of possible actions regarding the one or more objects of the image that a user may perform, wherein the plurality of possible actions can be taken with regards to one or more real-world items appearing as the corresponding one or more objects in the image (paragraph [0023] [section 302 teaches capturing an image and a speech signal, a user being able to initiate a visual search and being prompted by a message to provide an image and to speak a search request. In addition, an example of what the speech signal could be, based on the captured picture, is shown in the last 5 lines of paragraph [0023]]);
identifying, from the image, an identified object of the one or more objects associated with the query (figure 3A section 308 [teaches processing of the cropped image to determine a likely identified object]);
retrieving a feature corresponding to the identified object from a database (paragraph [0028] lines 1 – 5 [teaches a classification portion based on the detection service trained by an image database. In addition to the object detection portion, the classification/detection service identifies contiguous boundaries in the images]);
selecting an action of the plurality of possible actions based on a visual feature of the identified object (figure 3A section 314, 316 and 322 [section 314 teaches determining the intent of the text in order to generate a query; in addition, section 316 determines a weight based on the intent of the query generated from the speech]); and
returning, responsive to the query associated with the image, a result of the query including the selected action to be performed by the user on a real world item of the one or more real-world items that appears in the image as the identified object (figure 3A section 324 and paragraph [0044] [teaches displaying a search result based on the combination of the image and the text intent. In addition, paragraph [0044] teaches, based on the combination of the image and the text intent, an example of what the results are for a combined image and text search]).

As to claims 2, 9 and 16, these claims are rejected for the same reasons as the independent claims above. In addition, Huang et al. teaches identifying a plurality of objects in the image (figure 3A section 304 [section 304 teaches processing said image into categories of content]); selecting one of the objects in the image associated with the query; and identifying the selected object (figure 3A section 306 and 308 [section 306 teaches receiving a selection from the user by cropping the image. In addition, section 308 teaches processing said cropped image to identify object identities which are presented to the user]).

As to claim 21, this claim is rejected for the same reasons as claim 1 above. In addition, Huang et al. teaches receiving, from the user, an indication as to which of the one or more objects of the image on which to perform the query, wherein the indication comprises a user-drawn border around the indicated object (figure 3A section 306 and 308 [section 306 teaches receiving a selection from the user by cropping the image. In addition, section 308 teaches processing said cropped image to identify object identities which are presented to the user]).

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 3, 6, 10, 13, 17 and 22 – 24 are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al. as applied to claims above, and further in view of HUANG et al. (US 2010/0260426 A1).

As to claims 3, 10 and 17, these claims are rejected for the same reasons as the dependent claims above. In addition Huang et al. teaches cropping the selected object from the image (figure 3A section 306).
Huang et al. does not explicitly teach returning the cropped object with the result of the query.
HUANG et al. teaches returning the cropped object with the result of the query (paragraph [0066] lines 1 – 7 [discloses transmitting the visual search result, the query and annotation via the communication protocol]).
Huang et al. teaches a method for using a speech signal to augment a visual search, which includes processing image data to determine image search intent. However, Huang et al. does not fully disclose returning the cropped image with the result. A person of ordinary skill in the art would have been motivated to overcome this deficiency by incorporating the teaching of HUANG et al. of receiving instructions to crop said received image into Huang et al., because Huang et al. shows in both figures 2B and 2C how the reference is able to crop images from the initial image and separate them from the other images, in addition to returning results based on both the image and the text. Thus it could be reasonably implied that Huang et al. would be able to display content in addition to the answer.

As to claims 6 and 13, these claims are rejected for the same reasons as the independent claims above. In addition, Huang et al. does not explicitly teach determining a first category of objects corresponding to a first one of the plurality of actions; determining a second category of objects corresponding to a second one of the plurality of actions; and determining that the object belongs to the first category, wherein the result comprises the first action and an indication of the first category.
HUANG et al. teaches determining a first category of objects corresponding to a first one of the plurality of actions; determining a second category of objects corresponding to a second one of the plurality of actions; and determining that the object belongs to the first category, wherein the result comprises the first action and an indication of the first category (paragraph [0058] lines 6 – 11 and paragraph [0061] lines 9 – 13 [paragraph [0058] lines 6 – 11 discloses determining the detected object categories based on a generated image coefficient. In addition, paragraph [0061] lines 9 – 13 teaches that, in one implementation, mobile device 130 can compare the selected object's feature vector with image coefficients of trained images stored in image coefficient library 262 to recognize or otherwise determine characteristics of the selected object. It is to be noted that the recognition of other characteristics is being interpreted as the action taken]).
The motivation for combining Huang et al. as modified with HUANG et al. is the same as set forth above with respect to claim 3.

As to claims 22, 23 and 24, these claims are rejected for the same reasons as the independent claims above. In addition, Huang et al. does not explicitly teach wherein the image is an augmented reality (AR) image.
HUANG et al. teaches wherein the image is an augmented reality (AR) image (paragraph [0039] [discloses image detection/recognition based on particular algorithms, augmented reality being one of them]).
The motivation for combining Huang et al. as modified with HUANG et al. is the same as set forth above with respect to claim 3.

Claims 5, 7, 12, 14 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al. as applied to claims above, and further in view of Grossman (US 2018/0197223 A1).

As to claims 5, 12 and 19, these claims are rejected for the same reasons as the independent claims above. In addition, Huang et al. does not explicitly teach providing one or more terms associated with the object in the image to a user device; and receiving, from the user device, an indication as to which of the one or more terms correspond to the object.
Grossman teaches providing one or more terms associated with the object in the image to a user device; and receiving, from the user device, an indication as to which of the one or more terms correspond to the object (paragraph [0048] lines 7 – 13, figure 3 section 302 and section 308 [section 302 teaches obtaining an electronic image. In addition, section 308 teaches analyzing the image and obtaining keywords associated with the image. Furthermore, paragraph [0048] lines 7 – 13 teaches presenting options of related alternative keywords in a search option text box which is overlaid on the received image. It is to be noted that the keyword is being interpreted as the query]).
Huang et al. teaches a method for using a speech signal to augment a visual search, which includes processing image data to determine image search intent. However, Huang et al. does not fully disclose providing terms associated with the object to a user device and receiving an indication of which terms correspond to the object. A person of ordinary skill in the art would have been motivated to overcome this deficiency by incorporating Grossman's providing of alternative keywords into Huang et al., because Huang et al. shows, in figure 3A section 320, a request for further clarification from the user. Thus it could be reasonably implied that Huang et al. would be able to provide additional terms in order to further clarify the user intent.

As to claims 7 and 14, these claims are rejected for the same reasons as the independent claims above. In addition, Huang et al. does not explicitly teach receiving a second image associated with the first image; identifying one or more objects of the second image, wherein each object from the second image is associated with one of the plurality of actions; and determining the plurality of actions based on the identified one or more objects of the second image.
Grossman teaches receiving a second image associated with the first image (paragraph [0063] lines 12 – 17 [teaches generating a plurality of subdivided images based on the original image. Each of the subset of relevant images is sent to the image server for evaluation]); identifying one or more objects of the second image (figure 5B section 504 and section 506 [both of these sections are identified objects of said image]), wherein each object from the second image is associated with one of the plurality of actions; and determining the plurality of actions based on the identified one or more objects of the second image (paragraph [0052] lines 6 – 10 [figure 3 section 314 teaches presenting purchase option information to the user based on the search. In addition, paragraph [0052] lines 6 – 10 teaches facilitating a purchase, which includes directing the user to a vendor website associated with the selected product]).
The motivation for combining Huang et al. as modified with Grossman is the same as set forth above with respect to claim 5.

Response to Arguments

Applicant's arguments filed May 24, 2021 have been fully considered and are not persuasive. For the Examiner's response, see the discussion below:

Applicant’s arguments, see pages 8 – 10, with respect to the rejection(s) of claim(s) 1, 8 and 15 under 35 USC § 102 (a)(2) have been fully considered and are not persuasive. Applicant argues that Huang et al. does not teach the newly added limitations:

Regarding the first argument, directed to the limitation wherein the plurality of possible actions can be taken with regards to one or more real-world items appearing as the corresponding one or more objects in the image: Huang et al. teaches, in section 302, capturing an image and a speech signal. A user is able to

Regarding the second argument, directed to the limitation of returning, responsive to the query associated with the image, a result of the query including the selected action to be performed by the user on a real world item of the one or more real-world items that appears in the image as the identified object: Huang et al. teaches displaying a search result based on the combination of the image and the text intent. In addition, paragraph [0044] teaches, based on the combination of the image and the text intent, an example of what the results are for a combined image and text search.

Conclusion

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PEDRO J SANTOS whose telephone number is (571)272-9877.  The examiner can normally be reached on M - F 7:30 AM - 5:00 PM EST (Alternating Fridays).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Robert Beausoliel, can be reached at 571-272-3645.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 






/Pedro J Santos/Examiner, Art Unit 2167

/ROBERT W BEAUSOLIEL JR/Supervisory Patent Examiner, Art Unit 2167