Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The certified copy has been filed in parent Application No. KR10-2019-0099131, filed on 08/13/2019.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 07/29/2020, 07/30/2020, 01/07/2021, and 11/22/2021 have been considered by the examiner.
Drawings
The drawing submitted on 07/29/2020 is considered by the examiner.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim(s) 1-3, 5-6, 8-9, 11, 13-15, 17, and 19-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Badr et al. (US 2018/0336414 A1).
Regarding Claims 1 and 13, Badr et al. teach: An electronic apparatus comprising ([0036] In some of those implementations, the user interface input is explicitly directed to automated assistant 120. For example, one of the message exchange clients 107.sub.1-N may be a personal assistant messaging service dedicated to conversations with automated assistant 120 and user interface input provided via that personal assistant messaging service may be automatically provided to automated assistant 120.): a microphone; a camera ([0031] Each client device 106 may also be equipped with one or more cameras 111 (e.g., a front-facing and/or rear-facing camera in the case of a smart phone or tablet) and/or one or more additional sensors 113. The additional sensors 113 may include, for example, a microphone, a temperature sensor, a weight sensor, etc.); a memory configured to store at least one instruction ([0022] In addition, some implementations include one or more processors of one or more computing devices, where the one or more processors are operable to execute instructions stored in associated memory, and where the instructions are configured to cause performance of any of the aforementioned methods.); and at least one processor connected to the microphone, the camera, and the memory and configured to control the electronic apparatus, wherein the processor, by executing the at least one instruction ([0031] In some implementations, one or more of the additional sensors 113 may be provided as part of a stand-alone peripheral device that is separate from, but is in communication with, one or more corresponding client devices 106 and/or the automated assistant 120. [0037] Each of the client computing devices 106.sub.1-N and automated assistant 120 may include one or more memories for storage of data and software applications, one or more processors for accessing data and executing applications, and other components that facilitate communication over a network.), is configured to control the electronic apparatus to obtain a text corresponding to a voice that is input through the microphone, and provide a reply to a query based on the query being included in the obtained text, and wherein the processor ([0036] In many implementations, automated assistant 120 may engage in interactive voice response (“IVR”), such that the user can utter commands, searches, etc., and the automated assistant 120 may utilize natural language processing and/or one or more grammars to convert the utterances into text, and respond accordingly. [0038] In some implementations, automated assistant 120 generates responsive content in response to various inputs from the client devices 106.sub.1-N during a human-to-computer dialog session with automated assistant 120. Automated assistant 120 provides the responsive content (e.g., over one or more networks when separate from a client device of a user) for presentation to the user as part of the dialog session. For example, automated assistant 120 may generate responsive content in response to free-form natural language input provided via one of the client devices 106.sub.1-N, in response to image(s) captured by one of the cameras 111, and/or in response to additional sensor data captured by one or more of the additional sensor(s) 113. [0041] In some implementations, the natural language processor 122 includes a voice processing module that is configured to process voice (spoken) natural language input. The natural language processor 122 can then operate on the processed voice input (e.g., based on text derived from the processed voice input).) is further configured to: identify a region of interest (object) corresponding to a co-reference in an image obtained through the camera based on the co-reference being included in the query (image relates to a request related to an object captured by the at least one image, captured by a camera of a client device), identify an object referred to by the co-reference, among at least one object included in the identified region of interest based on a dialogue content that includes the query, and provide information on the identified object as the reply ([0017] In some implementations, a method performed by one or more processors is provided that includes: receiving, via an automated assistant interface of a client device, a voice input provided by a user; and determining that the voice input includes a request related to an object in an environment of the client device. [0018] In some implementations, a method performed by one or more processors is provided that includes: receiving at least one image captured by a camera of a client device; and determining that the at least one image relates to a request related to an object captured by the at least one image. The method further includes, in response to determining that the image relates to the request related to the object: causing image processing to be performed on the at least one image. [0063] The request engine 124 provides an indication of the request to the request resolution engine 130. The request resolution engine 130 attempts to resolve the request using the natural language input and the captured image. For example, the request resolution engine 130 can provide the captured image to one or more of the image processing engine(s) 142. The image processing engine(s) 142 can process the captured image to determine a classification attribute of “wine bottle”, and return the classification attribute to the request resolution engine 130.
The request resolution engine 130 can further determine that the request is for a “cost” action (e.g., based on output provided by natural language processor 122, which is based on “how much does this cost?”). Further, for a “cost” action for an object that is a “wine bottle”, request resolution engine 130 can determine that, to resolve the request, attributes need to be resolved for the fields of: brand, wine type, and vintage. For example, the request resolution engine 130 can determine those fields based on looking up defined fields for a “cost” action for a “wine bottle” classification in resources database 148. [0068] The request resolution engine 130 can then determine the request is resolvable based on the additional attributes. For example, the request resolution engine 130 can submit, to one of the agents 146, an agent query that is based on the natural language input and the additional attributes. For example, the request resolution engine 130 can submit an agent query of “cost of vineyard A cabernet sauvignon 2012” and/or a structured agent query such as {intent=“wine—cost”; brand=“vineyard a”; type=“merlot”; vintage=“2012” }. Additional content can be received from the agent in response to the agent query, and at least some of the additional content provided for presentation to the user. For example, output 272B of FIG. 2B can be presented based on the additional content. The output 272B specifies the price range, and also asks if the user would like to see links where the user can purchase the bottle of wine.).

Regarding Claims 2 and 14, Badr et al. teach: The electronic apparatus of claim 1, wherein the processor is further configured to control the electronic apparatus to identify the region of interest based on a distance attribute of the co-reference ([0009] As yet another example, when the user is at a “retail” location, a request related to an environmental object of a captured image can be inferred, whereas no request would be inferred if the user had instead captured the same environmental object at a park (under the assumption that the user is likely seeking shopping intelligence while at the retail location, such as price(s), review(s), etc.). As yet another example, where an initially captured image captures multiple objects all at far distances, a request may not be inferred—whereas the request would have been inferred if the image instead captured only one object at a close distance.).

Regarding Claims 3 and 15, Badr et al. teach: The electronic apparatus of claim 2, wherein the processor is further configured to control the electronic apparatus to: identify a region positioned at a relatively close distance in the obtained image as the region of interest based on the co-reference being a co-reference referring to an object at a close distance, and identify a region positioned at a relatively far distance in the obtained image as the region of interest based on the co-reference being a co-reference referring to an object at a far distance ([0009] As yet another example, when the user is at a “retail” location, a request related to an environmental object of a captured image can be inferred, whereas no request would be inferred if the user had instead captured the same environmental object at a park (under the assumption that the user is likely seeking shopping intelligence while at the retail location, such as price(s), review(s), etc.). As yet another example, where an initially captured image captures multiple objects all at far distances, a request may not be inferred—whereas the request would have been inferred if the image instead captured only one object at a close distance.).


Regarding Claims 5 and 17, Badr et al. teach: The electronic apparatus of claim 1, wherein the processor is further configured to control the electronic apparatus to: identify at least one region from the obtained image in each of which one object is present, and identify a region of interest corresponding to the co-reference based on a density of the identified region in the obtained image ([0009] As yet another example, when the user is at a “retail” location, a request related to an environmental object of a captured image can be inferred, whereas no request would be inferred if the user had instead captured the same environmental object at a park (under the assumption that the user is likely seeking shopping intelligence while at the retail location, such as price(s), review(s), etc.). As yet another example, where an initially captured image captures multiple objects all at far distances, a request may not be inferred—whereas the request would have been inferred if the image instead captured only one object at a close distance.).

Regarding Claim 6, Badr et al. teach: The electronic apparatus of claim 5, wherein the processor is further configured to control the electronic apparatus to: identify a region of the identified region having a relatively low density from the obtained image as the region of interest based on the co-reference referring to a singular object, and identify a region of the identified region having a relatively high density from the obtained image as the region of interest based on the co-reference referring to plural objects ([0009] As yet another example, when the user is at a “retail” location, a request related to an environmental object of a captured image can be inferred, whereas no request would be inferred if the user had instead captured the same environmental object at a park (under the assumption that the user is likely seeking shopping intelligence while at the retail location, such as price(s), review(s), etc.). As yet another example, where an initially captured image captures multiple objects all at far distances, a request may not be inferred—whereas the request would have been inferred if the image instead captured only one object at a close distance.).


Regarding Claims 8 and 19, Badr et al. teach: The electronic apparatus of claim 1, wherein the processor is further configured to control the electronic apparatus to: identify at least one region in which an object is present in the obtained image, identify an object included in the region of interest among the identified regions, acquire information on an object referred to by the co-reference based on a dialogue content including the query, a previous query and a reply to the previous query, and identify an object referred to by the co-reference among objects included in a region in the region of interest based on the obtained information on the object (see rejection of claim 1 and [0036] In many implementations, automated assistant 120 may engage in interactive voice response (“IVR”), such that the user can utter commands, searches, etc., and the automated assistant 120 may utilize natural language processing and/or one or more grammars to convert the utterances into text, and respond accordingly. [0099] In some of those implementations, the target degree of specificity is a target degree of classification of the object in a classification taxonomy and/or is defined with reference to one or more fields to be defined, where the fields for the object can be dependent on a classification (general or specific) of the object. In some of those implementations, the target degree of specificity can additionally or alternatively be determined based on initial natural language input provided by the user, feedback provided by the user, historical interactions of the user and/or other users, and/or location and/or other contextual signals.).

Regarding Claims 9 and 20, Badr et al. teach: The electronic apparatus of claim 8, wherein the processor is further configured to control the electronic apparatus to: output a request for additional information based on an object referred to by the co-reference not being identified from the region of interest based on the obtained information on the object, acquire additional information on the object from the input re-query or reply based on a re-query or a reply being input based on the output request, and identify an object referred to by the co-reference among the objects included in the region of the region of interest based on the obtained additional information (see rejection of claim 1 and [0053] If the request resolution engine 130 determines the request is not resolvable, the request resolution engine 130 can cause the prompt engine 126 to determine one or more prompts to provide for presentation via the client device 106.sub.1. A prompt determined by the prompt engine 126 can instruct a user to capture additional sensor data (e.g., image(s), audio, temperature sensor data, weight sensor data) for the object and/or to move the object (and/or other object(s)) to enable capturing of additional sensor data for the object. The prompt can additionally or alternatively solicit the user to provide user interface input directed to unresolved attributes of the object. [0054] The request resolution engine 130 can then utilize the additional sensor data and/or the user interface input received in response to the prompt, to again attempt to resolve the request. If the request is still not resolvable, the request resolution engine 130 can cause the prompt engine 126 to determine one or more additional prompts to provide for presentation via the client device 106.sub.1. Additional sensor data and/or user interface input received in response to such additional prompt(s) can then be utilized to again attempt to resolve the request.
This can continue until the request is resolved, a threshold number of prompts is reached, a threshold time period has elapsed, and/or until one or more other criteria have been achieved. [0068] The request resolution engine 130 can then determine the request is resolvable based on the additional attributes.).

Regarding Claim 11, Badr et al. teach: The electronic apparatus of claim 1, wherein: the camera is configured to be disposed on the electronic apparatus and to be rotatable, the processor is configured to control the electronic apparatus to: determine a direction of an indication or a direction of a gaze through an image captured through the camera, adjust a capturing direction of the camera based on the determined direction, and identify a region of interest corresponding to the co-reference through an image obtained through the adjusted camera (see rejection of claim 1 and [0066] In FIG. 2B, an additional image is captured by the camera 111.sub.1, where the additional image captures the label of the bottle of wine 261. For example, the additional image can capture an image that conforms to the rendition 261B of the bottle of wine 261 shown in the electronic viewfinder of FIG. 2B. To capture such an image, the user can reposition the bottle of wine 261 (e.g., turn it so the label is visible and/or move it closer to the camera 111.sub.1), can reposition the camera 111.sub.1, and/or can adjust a zoom (hardware and/or software) and/or other characteristic of the camera 111.sub.1 (e.g., via a “pinching out” gesture on the touchscreen 160). In some implementations, the additional image can be captured in FIG. 2B in response to selection of second graphical element 166.sub.2 by the user. In some other implementations, the additional image can be captured automatically.).
Allowable Subject Matter
Claims 4, 7, 10, 12, 16, and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The prior art of record, Rakshit et al. (US 2020/0394436 A1), teaches an invention that generally relates to devices that are connected to the Internet, and more particularly to employing devices that are connected to the Internet to locate missing items.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD K ISLAM whose telephone number is (571) 270-5878. The examiner can normally be reached Monday-Friday, EST (IFP).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MOHAMMAD K ISLAM/
Primary Examiner, Art Unit 2656