Notice of Pre-AIA or AIA Status
The present application is being examined under the pre-AIA first to invent provisions.
Terminal Disclaimer
The terminal disclaimer filed on 10/05/2020 disclaiming the terminal portion of any patent granted on this application which would extend beyond the expiration date of US PATENT #: 10,303,433 has been reviewed and is accepted.  The terminal disclaimer has been recorded.

Reason for Allowance
2.	The following is an examiner’s statement of reasons for allowance:
3.	Claims 1-18 are allowed.
4.	Independent claims 1, 7 and 13 claim a portable terminal device with an information processing system and method which includes a camera and a microphone. Data of obtained images and voice are transmitted to a server that identifies operations to be executed based on the received voice and image data. The server transmits an identification of one or more results of the plurality of operations to the portable terminal device. When the portable terminal device receives only one result from the server, an operation corresponding to the one result is executed, and when a plurality of results is received, the portable terminal device displays information corresponding to the plurality of results as candidates. Additional voice is captured for selecting one of the plurality of results during the displaying of the information. A determination of one result from the plurality of results is made based on the captured voice, and an operation corresponding to the determined result is executed, in a manner not disclosed or suggested in any prior art.
	The representative closest prior art is Hart et al., US Patent (8,700,392), hereinafter “Hart”, and Cha et al., US Patent Application (20100009720), hereinafter “Cha”, which do not teach the features claimed in independent claim 1 and similarly worded claims 7 and 13: “1. A portable terminal device comprising: a camera that captures images of an operator; a microphone that captures voice instructions of the operator; a controller which is programmed to execute a plurality of operations; a communication interface that transmits and receives data with an external server; and wherein the controller is further programmed to: when the images are obtained from the camera and a single voice instruction is obtained from the microphone, control the communication interface to transmit data of the obtained images and the obtained single voice instruction to the external server, and when the communication interface receives information from the external server including one or more results identified by the external server based on the transmitted data, wherein, when the communication interface receives only one result for operation identified in response to the transmitted data by the external server as operation based on the single voice instruction, the controller is further programmed to execute an operation corresponding to the one result, wherein, when the communication interface receives a plurality of results for operation identified in response to the transmitted data by the external server as operation based on the single voice instruction, the controller is further programmed to: display a plurality of operation options to be operable by the portable terminal device, the plurality of operation options relating to each of the plurality of results for operation based on the single voice instruction, capture additional voice instruction for selecting one operation option from the plurality of the operation options corresponding to the single voice instruction to be operable by the portable terminal device during displaying of the plurality of operation options based on the single voice instruction to, determine one operation option from the plurality of the operation options to be operable by the portable terminal device based on the captured voice instruction, and execute an operation corresponding to the determined one operation option”.
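For illustration only, the branching recited in claim 1 (execute directly when the server returns one result; otherwise display the options, capture an additional voice instruction, and execute the selected option) can be sketched as follows. This is a minimal sketch and not the applicant's implementation: the names `OperationResult` and `handle_server_results`, the callback parameters, and the matching of the additional voice instruction against option names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class OperationResult:
    """One operation identified by the external server (hypothetical structure)."""
    name: str


def handle_server_results(
    results: Sequence[OperationResult],
    execute: Callable[[OperationResult], None],
    display_options: Callable[[Sequence[OperationResult]], None],
    capture_additional_voice: Callable[[], str],
) -> Optional[OperationResult]:
    """Return the operation that was executed, or None if nothing was executed."""
    if not results:
        return None

    # Only one result received: execute the corresponding operation directly.
    if len(results) == 1:
        execute(results[0])
        return results[0]

    # A plurality of results received: display them as operation options,
    # then capture an additional voice instruction while they are displayed.
    display_options(results)
    spoken = capture_additional_voice().strip().lower()

    # Assumption: the additional voice instruction selects an option by name.
    for option in results:
        if option.name.lower() == spoken:
            execute(option)
            return option
    return None
```

In this sketch the camera, microphone, display, and server communication are represented by caller-supplied callbacks, so only the single-result versus multiple-result branching is shown.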

With regard to claims 1, 7 and 13, the representative prior art is Hart and Cha. Hart discloses that a user can provide input to a computing device through various combinations of speech, movement, and/or gestures. A computing device can analyze captured audio data to determine any speech information in that data. The computing device can simultaneously capture image or video information. For example, image information is utilized by the device to determine when someone is speaking, and the movement of the person's lips can be analyzed to assist in determining the words that were spoken. Any gestures or other motions can assist in the determination as well. By combining various types of data to determine user input, the accuracy of a process such as speech recognition can be improved, and the need for lengthy application training processes can be avoided. Hart also illustrates an example of an environment 1200 for implementing aspects in accordance with various embodiments. As will be appreciated, although a Web-based environment is used for purposes of explanation, different environments may be used, as appropriate, to implement various embodiments. The system includes an electronic client device 1202, which can include any appropriate device operable to send and receive requests, messages or information over an appropriate network 1204 and convey information back to a user of the device. The illustrative environment includes at least one application server 1208 and a data store 1210. The application server 1208 can include any appropriate hardware and software for integrating with the data store 1210 as needed to execute aspects of one or more applications for the client device and handling a majority of the data access and business logic for an application. The application server provides access control services in cooperation with the data store and is able to generate content such as text, graphics, audio and/or video to be transferred to the user, which may be served to the user by the Web server 1206. The handling of all requests and responses, as well as the delivery of content between the client device 1202 and the application server 1208, can be handled by the Web server 1206.
Cha discloses a method for recognizing a voice and converting it into a text, and more particularly, a method for preferentially voice-recognizing a word starting with a character inputted through a keypad before voice is inputted and converting it into a text, and a mobile terminal implementing such method. The mobile terminal includes a keypad to receive an input of a specific character from a user; a controller to recognize voice input from the user based on words starting with the specific character, and to convert the recognized voice into text; and a display to display the text under a control of the controller.

With regard to claims 1, 7 and 13, Hart and Cha, alone or in combination, do not provide a teaching, a suggestion or a motivation that could be found either in the art or within the skill of one of ordinary skill in the art prior to the effective time of the invention to modify or combine the prior art to disclose the cited claim limitations above and more specifically “wherein, when the communication interface receives a plurality of results for operation identified in response to the transmitted data by the external server as operation based on the single voice instruction, the controller is further programmed to: display a plurality of operation options to be operable by the portable terminal device, the plurality of operation options relating to each of the plurality of results for operation based on the single voice instruction, capture additional voice instruction for selecting one operation option from the plurality of the operation options corresponding to the single voice instruction to be operable by the portable terminal device during displaying of the plurality of operation options based on the single voice instruction to, determine one operation option from the plurality of the operation options to be operable by the portable terminal device based on the captured voice instruction, and execute an operation corresponding to the determined one operation option” of the claimed invention. Claims 2-6, 8-12 and 14-18 depend from claims 1, 7 and 13, respectively, and as such are allowed.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ROBERT J MICHAUD whose telephone number is (571)270-3981. The examiner can normally be reached 8:30 - 5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Patrick Edouard, can be reached on 571-272-7603. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ROBERT J MICHAUD/Examiner, Art Unit 2694