DETAILED ACTION
Claims 1 and 11 are amended. Claims 9, 19, and 21-50 are cancelled. Claims 1-8, 10-18, 20, 51, and 52 are pending in the application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Examiner’s Notes
The Examiner cites particular sections in the references as applied to the claims below for the convenience of the applicant(s). Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant(s) fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, Applicant's submission filed on 01/25/2021 has been entered.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines which form should be used.
Claims 1-8, 10-18, 20, 51, and 52 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-8, 10-18, 20, 51, and 52 of copending Application No. 16/430,719 (from IDS filed on 08/13/2019; hereinafter “reference application”) in view of Ganz et al. (US 2020/0066261 A1; hereinafter Ganz). 
Although the claims at issue are not identical, they are not patentably distinct from each other because the method disclosed in claims 1-8, 10, 51 and the system disclosed in claims 11-18, 20, 52 of the reference application anticipate the method and system disclosed in claims 1-8, 10-18, 20, 51, and 52 of the instant application, respectively, except for the features of transmitting a request including an image to a server.
However, Ganz teaches: 
transmitting, by the control circuitry of the device (see e.g. Ganz, Fig. 1: “User Device 106”), via a network (see e.g. Ganz, Fig. 1: “Network 108”), the API request (see e.g. Ganz, paragraph 59: “aesthetics analysis request”) including the image from the device (see e.g. Ganz, paragraph 60: “receiving a digital image from a user to provide an initial aesthetic analysis of the digital image”; and paragraph 38: “a user of the client device 106 may designate a digital image in the user interface module 114 that the user would like to edit”) to a server (see e.g. Ganz, Fig. 4: “Imageserver 406”) remote from the device (see e.g. Ganz, paragraph 60: “image server 406 receives the request to aesthetically analyze a digital image”);
The reference application and Ganz are analogous art because they are in the same field of endeavor: interpreting user commands, such as voice commands, in association with a corresponding image.
This is a provisional nonstatutory double patenting rejection.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-8, 10-18, 20, 51, and 52 are rejected under 35 U.S.C. 103 as being unpatentable over Kam et al. (US 2017/0031652 A1; hereinafter Kam) in view of An (US 2018/0204602 A1) and Ganz et al. (US 2020/0066261 A1; hereinafter Ganz).

With respect to claim 1, Kam teaches: A method for facilitating communications (see e.g. Kam, Fig. 7-8)…, the method comprising: 
generating for display, by control circuitry of a device (see e.g. Kam, paragraph 33; and Fig. 1), a user interface on a display screen (see e.g. Kam, paragraph 60: “a web page is being displayed on the screen”; paragraph 81: “a web page 50 of a portal site illustrated as being displayed on a screen”; and Fig. 4A-6C); 
while the user interface is displayed, receiving, by the control circuitry of the device, a command (see e.g. Kam, paragraph 60: “the user inputs the voice command "search for cars" while a web page is being displayed on the screen”; and paragraph 36: “receives an input of a voice command (hereinafter, referred to as a "primary command") regarding navigation of the screen”); 
in response to receiving the command, capturing, by the control circuitry of the device, an image of the user interface (see e.g. Kam, paragraph 99: “the content on the screen may be analyzed in response to the input user command”; and paragraph 38: “analyze content displayed on a screen of a display device, or the display 120, and generate a content analysis result. The content may include any entity displayed on the screen, such as various applications, messages, emails, documents, songs, videos, images, and other entities (e.g., text input windows, click buttons, dropdown menus, etc.)”); 
generating, at the device, an [application programming interface ("API")] request for interpreting the command, wherein the [API] request includes the image (see e.g. Kam, paragraph 56: “interpret the voice command based on the content analysis result from the screen analyzer 200 and then convert said voice command into a navigation command”); 
caching, by the control circuitry of the device, the image in the [API] request (see e.g. Kam, paragraph 52: “in the case where a semantic map for the content has previously been created on the screen and each piece of content has been previously defined with a description”; and paragraph 117); and 
Since Kam discloses utilizing previously analyzed screen content and storing associated data in storage media, Kam inherently discloses caching the screen content for later use.
receiving, by the control circuitry of the device, an [API] response … to the [API] request, wherein the [API] response is customized … based on the image (see e.g. Kam, paragraph 76: “executes the command created by the command composer 300 of FIG. 1 to perform a corresponding navigation operation on the screen. For example, in response to the navigation command created by the command composer 300, the command executer 400 may highlight a specific keyword on the screen or navigate the screen to search for a new keyword. The command executer 400 may also carry out web browsing or a move to a previous or next page of the current page. In addition, the command executer 400 may zoom in on a particular area of the screen, open a link, or navigate files to play voice/image/video files”; and paragraph 102: “executes the composed command to perform various navigation operations, such as highlighting a keyword, zoom-in, search, and moving to a previous/next page”).
Kam does not explicitly disclose utilizing APIs.
However, An teaches:
using application programming interfaces ("APIs") (see e.g. An, paragraph 45: “The API 145 may be an interface through which the application program 147 controls a function provided by the kernel 141 or the middleware 143, and may include, for example, at least one interface or function (e.g., an instruction) for a file control, a window control, image processing, a character control, or the like”; and paragraph 187: “The API 1260 (e.g., an API 145) may be, for example, a set of programming functions and may be provided with a configuration which is variable depending on an OS”)
Kam and An are analogous art because they are in the same field of endeavor: interpreting user commands, such as voice commands, in association with a corresponding image. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kam with the teachings of An. The motivation/suggestion would be to simplify the software development process associated with Kam’s system, thus improving the overall software development process.
Furthermore, even though Kam discloses utilizing distributed processing of the abovementioned steps within a network, Kam does not explicitly disclose a server for receiving a request including an image and/or sending a customized response to the request based on the image.
However, Ganz teaches:
transmitting, by the control circuitry of the device (see e.g. Ganz, Fig. 1: “User Device 106”), via a network (see e.g. Ganz, Fig. 1: “Network 108”), the API request (see e.g. Ganz, paragraph 59: “aesthetics analysis request”) including the image from the device (see e.g. Ganz, paragraph 60: “receiving a digital image from a user to provide an initial aesthetic analysis of the digital image”; and paragraph 38: “a user of the client device 106 may designate a digital image in the user interface module 114 that the user would like to edit”) to a server (see e.g. Ganz, Fig. 4: “Imageserver 406”) remote from the device (see e.g. Ganz, paragraph 60: “image server 406 receives the request to aesthetically analyze a digital image”);
from the server (see e.g. Ganz, paragraph 59: “response 500 is a JSON response generated by the image server 406”; and paragraph 60: “response 500 may be generated responsive to receiving a digital image from a user to provide an initial aesthetic analysis of the digital image”)
by the server (see e.g. Ganz, paragraph 60: “When the image server 406 receives the request to aesthetically analyze a digital image, the image server generates various values corresponding to the different aesthetic features, and incorporates the values into the response 500. The particular response 500 has returned values for a subset of the aesthetic attributes, in this case, depth of field, motion blur, and object emphasis”)
Kam and Ganz are analogous art because they are in the same field of endeavor: interpreting user commands, such as voice commands, in association with a corresponding image. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kam with the teachings of Ganz. The motivation/suggestion would be to reduce the workload on the user’s device.

With respect to claim 2, Kam as modified teaches: The method of claim 1, wherein the API response is customized based on the image by: 
determining an object in the image (see e.g. Kam, paragraph 82: “determine a type of each piece of content on the screen, such as text type, an icon type, a table type, and a link type”; and Fig. 5A-5D); 
determining a context for the user interface based on the object (see e.g. Kam, paragraph 49: “the semantic map generator 220 may obtain the meanings of pieces of content based on recognition of context information”; paragraph 84: “define each area 51, 52, 53, 54, 55, and 56 of FIG. 5A and each piece of content based on context information”; and paragraph 98: “the meaning of each content may be determined based on the contextual information”); and 
generating the API response based on the context (see e.g. Kam, paragraph 76: “in response to the navigation command created by the command composer 300, the command executer 400 may highlight a specific keyword on the screen or navigate the screen to search for a new keyword. The command executer 400 may also carry out web browsing or a move to a previous or next page of the current page. In addition, the command executer 400 may zoom in on a particular area of the screen, open a link, or navigate files to play voice/image/video files”; and paragraph 102: “executes the composed command to perform various navigation operations, such as highlighting a keyword, zoom-in, search, and moving to a previous/next page”).

With respect to claim 3, Kam as modified teaches: The method of claim 2, wherein the object in the image is determined by:
determining boundaries of objects in the image (see e.g. Kam, paragraph 81: “analyze the screen and define each area 51, 52, 53, 54, 55, and 56 and each piece of content displayed on the screen”; and Fig. 5A-D); 
matching the boundaries of objects to a user interface template of a plurality of user interface templates, wherein each of the plurality of user interface templates corresponds to a respective context (see e.g. Kam, paragraph 83: “define the meaning of each piece of content using image analysis, text analysis, object extraction, classification and naming technologies”; paragraph 84: “define each area 51, 52, 53, 54, 55, and 56 of FIG. 5A and each piece of content based on context information”; and paragraph 85: “The semantic map generator 220 may define the meaning of each piece of content displayed on the screen by synthesizing analysis results obtained by various schemes, as shown in FIGS. 5B to 5D”); and 
determining the context for the user interface based on the respective context for the user interface template (see e.g. Kam, paragraph 86: “Once the semantic map has been created, the user may easily select specific content displayed on the screen using a natural language command. For example, the user may select a sports newspaper in area 55 by inputting an additional command, "newspaper, the third one on the top". Thereafter, the user may carry out various operations by further inputting primary commands. For example, the user may display or zoom in on the content of the sports newspaper or display the previous/next page of the newspaper”).
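
For illustration only, the limitation of matching object boundaries to one of a plurality of user interface templates, each corresponding to a respective context, can be sketched as follows. This is a hypothetical sketch and is not asserted to be the method of the instant application or of Kam: the use of intersection-over-union (IoU) as the matching score, the box representation, and the template and context names are all assumptions.

```python
from typing import Dict, List, Tuple

# A box is (left, top, right, bottom) in pixels; this representation is an assumption.
Box = Tuple[int, int, int, int]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    left, top = max(a[0], b[0]), max(a[1], b[1])
    right, bottom = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, right - left) * max(0, bottom - top)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def template_score(boxes: List[Box], template_boxes: List[Box]) -> float:
    """Average, over the template's boxes, of the best IoU with any detected box."""
    if not template_boxes:
        return 0.0
    total = sum(max((iou(t, b) for b in boxes), default=0.0) for t in template_boxes)
    return total / len(template_boxes)


def match_context(boxes: List[Box], templates: Dict[str, List[Box]]) -> str:
    """Return the context whose template best matches the detected boundaries."""
    return max(templates, key=lambda context: template_score(boxes, templates[context]))
```

Each key of `templates` stands for a context (for example, a hypothetical "search_page"), and the returned key is the context determined for the user interface.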

With respect to claim 4, Kam as modified teaches: The method of claim 1, wherein the API response is customized based on the image by: 
determining an object in the image (see e.g. Kam, paragraph 82: “determine a type of each piece of content on the screen, such as text type, an icon type, a table type, and a link type”; and Fig. 5A-D); 
determining a position of the object in the user interface (see e.g. Kam, paragraph 81: “analyze the screen and define each area 51, 52, 53, 54, 55, and 56 and each piece of content displayed on the screen”; paragraph 82: “designate areas 51 and 54 as input windows 51a and 54a and designate areas 53 and 56 as images 53a and 56b”; and Fig. 5A-D); and 
generating the API response based on the position (see e.g. Kam, paragraph 86: “Once the semantic map has been created, the user may easily select specific content displayed on the screen using a natural language command. For example, the user may select a sports newspaper in area 55 by inputting an additional command, "newspaper, the third one on the top"”; paragraph 76: “in response to the navigation command created by the command composer 300, the command executer 400 may highlight a specific keyword on the screen or navigate the screen to search for a new keyword. The command executer 400 may also carry out web browsing or a move to a previous or next page of the current page. In addition, the command executer 400 may zoom in on a particular area of the screen, open a link, or navigate files to play voice/image/video files”; and paragraph 102: “executes the composed command to perform various navigation operations, such as highlighting a keyword, zoom-in, search, and moving to a previous/next page”).
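
For illustration only, selecting content by its position in the user interface, as in the command "newspaper, the third one on the top" quoted from Kam above, can be sketched as follows. This is a hypothetical sketch and is not Kam's implementation: the `Item` structure, the row tolerance, and the reading of the command as "third item, left to right, in the topmost row" are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    label: str  # hypothetical description of the content
    kind: str   # hypothetical content type, e.g. "newspaper" or "image"
    left: int   # x coordinate of the item's left edge, in pixels
    top: int    # y coordinate of the item's top edge, in pixels


def select_by_position(items: List[Item], kind: str, ordinal: int,
                       row_tolerance: int = 10) -> Item:
    """Return the ordinal-th (1-based) item of a kind in the topmost row, left to right."""
    candidates = [item for item in items if item.kind == kind]
    if not candidates:
        raise LookupError(f"no item of kind {kind!r}")
    top_edge = min(item.top for item in candidates)
    # Items whose top edge lies within the tolerance of the topmost edge form the top row.
    top_row = sorted(
        (item for item in candidates if item.top - top_edge <= row_tolerance),
        key=lambda item: item.left,
    )
    if not 1 <= ordinal <= len(top_row):
        raise LookupError(f"top row has only {len(top_row)} item(s)")
    return top_row[ordinal - 1]
```

A response generated from the selected item would then depend on the position of the object in the user interface.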

With respect to claim 5, Kam as modified teaches: The method of claim 1, wherein the API response is customized based on the image by: 
determining an object in the image (see e.g. Kam, paragraph 82: “determine a type of each piece of content on the screen, such as text type, an icon type, a table type, and a link type”; and Fig. 5A-D); 
(see e.g. Kam, paragraph 83: “define the meaning of each piece of content using… text analysis”; and paragraph 97: “define meanings of particular content by… key-word extraction through text analysis”); and 
generating the API response based on the word (see e.g. Kam, paragraph 76: “in response to the navigation command created by the command composer 300, the command executer 400 may highlight a specific keyword on the screen or navigate the screen to search for a new keyword. The command executer 400 may also carry out web browsing or a move to a previous or next page of the current page. In addition, the command executer 400 may zoom in on a particular area of the screen, open a link, or navigate files to play voice/image/video files”; and paragraph 102: “executes the composed command to perform various navigation operations, such as highlighting a keyword, zoom-in, search, and moving to a previous/next page”).

With respect to claim 6, Kam as modified teaches: The method of claim 1, wherein the API response is customized based on the image by interpreting the command based on an object in the image (see e.g. Kam, paragraph 82: “determine a type of each piece of content on the screen, such as text type, an icon type, a table type, and a link type”; paragraphs 83-85; paragraph 86: “Once the semantic map has been created, the user may easily select specific content displayed on the screen using a natural language command”; paragraph 76: “in response to the navigation command created by the command composer 300, the command executer 400 may highlight a specific keyword on the screen or navigate the screen to search for a new keyword. The command executer 400 may also carry out web browsing or a move to a previous or next page of the current page. In addition, the command executer 400 may zoom in on a particular area of the screen, open a link, or navigate files to play voice/image/video files”; and paragraph 102: “executes the composed command to perform various navigation operations, such as highlighting a keyword, zoom-in, search, and moving to a previous/next page”).

With respect to claim 7, Kam as modified teaches: The method of claim 1, wherein the command is a vocal search command, and the API request is for a voice recognition application (see e.g. Kam, paragraph 36: “receives an input of a voice command (hereinafter, referred to as a "primary command") regarding navigation of the screen”; and paragraph 59: “procedures may include conversion of a voice command into a predefined format, recognition of a voice command and conversion of recognized speech into text, extraction of keywords from a voice command and understanding the meaning of extracted keywords, determinations regarding a voice command”).

With respect to claim 8, Kam as modified teaches: The method of claim 1, further comprising transmitting, by the control circuitry, the API request from a first device to a second device (see e.g. Kam, paragraph 117: “the instructions or software and any associated data, data files, and data structures are distributed over network-coupled computer systems so that the instructions and software and any associated data, data files, and data structures are stored, accessed, and executed in a distributed fashion by the one or more processors or computers”; paragraph 115: “The methods illustrated in FIGS. 4A-8 that perform the operations described in this application are performed by computing hardware, for example, by one or more processors or computers, implemented as described above executing instructions or software to perform the operations described in this application that are performed by the methods”; and paragraph 33).
Since Kam discloses implementing network-coupled computer systems for distributed processing of the methods disclosed in Fig. 4A-8 and the screen navigation apparatus 1 being capable of network communications, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kam to implement distributed processing for the components of the screen navigation apparatus 1 via network communications. The motivation/suggestion for such a modification would be to improve hardware implementation flexibility and user friendliness.

With respect to claim 10, Kam as modified teaches: The method of claim 1, wherein the image is captured prior to modifying the user interface in response to the command (see e.g. Kam, paragraph 99: “Operations of analyzing the content on the screen, as depicted in operation 710 and receiving the input command, as depicted in operation 720, are not limited to any particular order. That is, the user may input an intended command based on the content analysis result, or the content on the screen may be analyzed in response to the input user command. Alternatively, the content on the screen may be analyzed while the user is inputting the user command, or vice versa”; and Fig. 7, steps 710, 720).

With respect to claims 11-18 and 20: Claims 11-18 and 20 are directed to a system comprising control circuitry configured to implement active steps corresponding to the method disclosed in claims 1-8 and 10, respectively; please see the rejections directed to claims 1-8 and 10 above, which also cover the limitations recited in claims 11-18 and 20. Note that Kam further discloses a system comprising control circuitry (see e.g. Kam, paragraph 114; Fig. 1) to implement the method disclosed in claims 1-8 and 10.

With respect to claim 51, Kam as modified teaches: The method of claim 1, further comprising generating a command response based on the command and the image (see e.g. Kam, paragraph 76: “executes the command created by the command composer 300 of FIG. 1 to perform a corresponding navigation operation on the screen. For example, in response to the navigation command created by the command composer 300, the command executer 400 may highlight a specific keyword on the screen or navigate the screen to search for a new keyword. The command executer 400 may also carry out web browsing or a move to a previous or next page of the current page. In addition, the command executer 400 may zoom in on a particular area of the screen, open a link, or navigate files to play voice/image/video files”; and paragraph 102: “executes the composed command to perform various navigation operations, such as highlighting a keyword, zoom-in, search, and moving to a previous/next page”).

With respect to claim 52: Claim 52 is directed to a system comprising control circuitry configured to implement active steps corresponding to the method disclosed in claim 51; please see the rejection directed to claim 51 above, which also covers the limitations recited in claim 52.

Response to Arguments
Applicant’s arguments, filed on 12/23/2020, with respect to claim(s) 1 and 11 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

CONCLUSION
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
U.S. Patent No. 9,684,826 B2 by Dubuque.
U.S. Patent No. 10,796,690 B2 by Gantz et al.


Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Umut Onat whose telephone number is (571)270-1735.  The examiner can normally be reached on M-Th 9:00-7:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Dennis Chow, can be reached at (571) 272-7767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/UMUT ONAT/Primary Examiner, Art Unit 2194