DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 4-5, 7, 9-11, 13-14, 16-17 & 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over KANG et al. (U.S. Pub 2020/0020334) hereinafter Kang, in view of Gibson et al. (U.S. Pat 10,944,845) hereinafter Gibson.

As per Claim 1, Kang teaches determining a user intent to interact with a particular graphical user interface ("GUI") based at least in part on a free-form natural language input; based on the user intent, performing a function (Fig. 4, Fig. 5A, ¶112 wherein the electronic device 101 may receive a first user utterance 501 through the microphone 280 and provide data about the first user utterance 501 to an external server including an ASR system and an intelligence system. The intelligence system may apply natural-language understanding to a text obtained by, e.g., the ASR system and determine, e.g., the user's intent, thereby generating a command including a task corresponding thereto)
and automatically populating the identified interactive element with data determined from the user intent. (Fig. 5A, ¶113 wherein the electronic device 101 may obtain the first user utterance 501 “Register Study schedule on February second.” The electronic device 101 may send the data about the first user utterance 501 to the external server, and the external server may apply ASR to the received data, thus obtaining the text “Register, Study schedule on February, second.” The external server may generate a command including tasks to execute a schedule management application and register the schedule of “Study” on February 2 on the schedule management application, corresponding to the first user utterance 501 from the obtained text using the intelligence system. The external server may send the generated command to the electronic device 101, and the electronic device 101 may perform the task included in the command. As shown on the right side of FIG. 5A, the electronic device 101 may display an execution screen 520 of the schedule management application and display the result of registering the schedule 522 of “Study” on the February 2 item 521)
Kang teaches performing a function based on the user intent. However, Kang does not explicitly teach identifying a target visual cue to be located in the GUI; performing object recognition processing on a screenshot of the GUI to determine a location of a detected instance of the target visual cue in the screenshot; and based on the location of the detected instance of the target visual cue, identifying an interactive element of the GUI.
Gibson teaches, based on the user intent, identifying a target visual cue to be located in the GUI; (Fig. 8, col. 12 lines 37-58 wherein the top of a screen may contain interface elements (e.g., search bar with a magnifying glass icon) that can be clicked on. When the interface elements (e.g., the magnifying glass icon) is selected or clicked on, the
performing object recognition processing on a screenshot of the GUI to determine a location of a detected instance of the target visual cue in the screenshot; based on the location of the detected instance of the target visual cue, identifying an interactive element of the GUI; and (Fig. 8, col. 12 lines 14-35 wherein the aggregation application may display a tab (e.g., located just below the Favorite Articles tab titled “Add a new artist”) that a user may select. In embodiments, this tab also creates the same search page as another interface element (e.g., the magnifying glass icon on the top left of the home screen). Therefore the aggregation application provides the user with two different ways of getting into a search page. For example, when the “Add a new Artist” tab or the magnifying glass icon on the top left are selected by the user, the aggregation application directs the screen interface elements to slide (e.g., slide over from left to right) and bring up the search bar populated with artists, bands, and festivals from the aggregation application database)
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to utilize the teaching of consolidated content aggregation of Gibson with the teaching of processing user speech of Kang because Gibson teaches that content aggregation 760 may be performed during the process and that information from the tracking may be used in a feedback loop to improve selection of particular content for the user (col. 1 lines 18-21, col. 9 lines 25-29).

As per Claim 2, the rejection of claim 1 is hereby incorporated by reference; Kang as modified further teaches wherein the GUI comprises an interactive webpage. (Fig. 5A, ¶106, ¶110 wherein in a case where the first application program is a web browsing application, a screen downloaded from a server corresponding to an access URL may be transitorily or non-transitorily stored, and the first user interface may be included in the downloaded screen wherein the second-type user input is entered while the first user interface is displayed, the electronic device 101 may perform the second operation on a user utterance 501 input through the microphone 280; as taught by Kang)

As per Claim 4, the rejection of claim 2 is hereby incorporated by reference; Kang as modified further teaches further comprising: automatically submitting the data determined from the user intent; and receiving a subsequent webpage that is generated at least in part on the data determined from the user intent. (Fig. 5A, ¶113 wherein the electronic device 101 may obtain the first user utterance 501 “Register Study schedule on February second.” The electronic device 101 may send the data about the first user utterance 501 to the external server, and the external server may apply ASR to the received data, thus obtaining the text “Register, Study schedule on February, second.” The external server may generate a command including tasks to execute a schedule management application and register the schedule of “Study” on February 2 on the schedule management application, corresponding to the first user utterance 501 from the obtained text using the intelligence system. The external server may send the generated command to the electronic device 101, and the electronic device 101 may perform the task included in the command. As shown on the right side of FIG. 5A, the electronic device 101 may display an execution screen 520 of the schedule management application and display the result of registering the schedule 522 of “Study” on the February 2 item 521; as taught by Kang)

As per Claim 5, the rejection of claim 4 is hereby incorporated by reference; Kang as modified further teaches further comprising searching a uniform resource locator ("URL") or content of the subsequent webpage to determine an outcome of the automatic submitting. (Fig. 5A, ¶106, ¶110 wherein in a case where the first application program is a web browsing application, a screen downloaded from a server corresponding to an access URL may be transitorily or non-transitorily stored, and the first user interface may be included in the downloaded screen wherein the second-type user input is entered while the first user interface is displayed, the electronic device 101 may perform the second operation on a user utterance 501 input through the microphone 280; as taught by Kang)

As per Claim 7, the rejection of claim 1 is hereby incorporated by reference; Kang as modified further teaches wherein the free-form natural language input takes the form of a speech input captured at a microphone, and (Fig. 4, Fig. 5A, ¶112 wherein the electronic device 101 may receive a first user utterance 501 through the microphone 280 and provide data about the first user utterance 501 to an external server including an ASR system and an intelligence system. The intelligence system may apply natural-language understanding to a text obtained by, e.g., the ASR system and determine, e.g., the user's intent, thereby generating a command including a task corresponding thereto; as taught by Kang)
the method further includes performing speech recognition processing on the speech input to generate textual output. (Fig. 5B, ¶113, ¶116 wherein the electronic device 101 may obtain the second user utterance 503 “Register Study schedule on February second.” The electronic device 101 may send the data about the second user utterance 503 to the external server, and the external server may apply ASR to the received data, the external server may include an automatic speech recognition (ASR) system capable of generating text using data about an utterance and an intelligence system capable of natural-language understanding text, grasping the meaning of the text, and generating a command corresponding to the text, thus obtaining the text “Register, Study schedule on February, second.” The external server may send the obtained text to the electronic device 101, and the electronic device 101 may display at least part 513 of the obtained text in the text box 511 as shown on the right side of FIG. 5B. According to various embodiments of the present invention, the electronic device 101 may be configured to input the text received from the external server to the first user interface based on the state information indicating that the first user interface is being displayed; as taught by Kang)

As per Claim 9, the rejection of claim 1 is hereby incorporated by reference; Kang as modified further teaches wherein the user intent comprises submission of a search query using the GUI, and the target visual cue comprises a magnifying glass. (Fig. 8, col. 12 lines 37-58 wherein the top of a screen may contain interface elements (e.g., search bar with a magnifying glass icon) that can be clicked on. When the interface elements (e.g., the magnifying glass icon) is selected or clicked on, the aggregation application may direct a portion of the home screen (e.g., on the right) to disappear wherein as the user begins typing, the aggregation application will cause the suggested search queries to populate according to the text the user has typed in; as taught by Gibson)

As per Claim 10, the rejection of claim 1 is hereby incorporated by reference; Kang as modified further teaches further comprising generating, based on the identified interactive element, (Fig. 8, col. 12 lines 14-35 wherein when the “Add a new Artist” tab or the magnifying glass icon on the top left are selected by the user, the aggregation application directs the screen interface elements to slide (e.g., slide over from left to right) and bring up the search bar populated with artists, bands, and festivals from the aggregation application database; as taught by Gibson)
a script that is subsequently executable in association with the GUI and a subsequent free-form natural language input to trigger automatic population of the identified interactive element with data determined from a subsequent user intent determined from the subsequent free-form natural language input and submission of the data determined from the user intent via the GUI. (Fig. 5B, ¶113, ¶116 wherein the electronic device 101 may obtain the second user utterance 503 “Register Study schedule on February second.” The electronic device 101 may send the data about the second user utterance 503 to the external server, and the external server may apply ASR to the received data, the external server may include an automatic speech recognition (ASR) system capable of generating text using data about an utterance and an intelligence system capable of natural-language understanding text, grasping the meaning of the text, and generating a command corresponding to the text, thus obtaining the text “Register, Study schedule on February, second.” The external server may send the obtained text to the electronic device 101, and the electronic device 101 may display at least part 513 of the obtained text in the text box 511 as shown on the right side of FIG. 5B. According to various embodiments of the present invention, the electronic device 101 may be configured to input the text received from the external server to the first user interface based on the state information indicating that the first user interface is being displayed; as taught by Kang)

As per Claim 11, the rejection of claim 1 is hereby incorporated by reference; Kang as modified further teaches wherein the subsequent automatic population and submission is performed without one or more of identifying the target visual cue, performing the object recognition, or identifying the interactive element of the GUI. (Fig. 5B, ¶113, ¶116 wherein the electronic device 101 may obtain the second user utterance 503 “Register Study schedule on February second.” The electronic device 101 may send the data about the second user utterance 503 to the external server, and the external server may apply ASR to the received data, the external server may include an automatic speech recognition (ASR) system capable of generating text using data about an utterance and an intelligence system capable of natural-language understanding text, grasping the meaning of the text, and generating a command corresponding to the text, thus obtaining the text “Register, Study schedule on February, second.” The external server may send the obtained text to the electronic device 101, and the electronic device 101 may display at least part 513 of the obtained text in the text box 511 as shown on the right side of FIG. 5B. According to various embodiments of the present invention, the electronic device 101 may be configured to input the text received from the external server to the first user interface based on the state information indicating that the first user interface is being displayed; as taught by Kang; Examiner interprets reliance on Kang alone, and not Gibson, to teach the above as performing the automatic population without the elements that are taught in Gibson)

Claim 13 is similar in scope to Claim 1; therefore, Claim 13 is rejected under the same rationale as Claim 1.

Claim 14 is similar in scope to Claim 2; therefore, Claim 14 is rejected under the same rationale as Claim 2.

Claim 16 is similar in scope to Claim 4; therefore, Claim 16 is rejected under the same rationale as Claim 4.

Claim 17 is similar in scope to Claim 5; therefore, Claim 17 is rejected under the same rationale as Claim 5.

Claim 19 is similar in scope to Claim 7; therefore, Claim 19 is rejected under the same rationale as Claim 7.

Claim 20 is similar in scope to Claim 1; therefore, Claim 20 is rejected under the same rationale as Claim 1.

Claims 3 & 15 are rejected under 35 U.S.C. 103 as being unpatentable over Kang in view of Gibson as applied to claims 2 & 14 above, and further in view of Lorimor et al. (U.S. Pat 10,037,552) hereinafter Lori.

As per Claim 3, the rejection of claim 2 is hereby incorporated by reference; Kang as modified previously taught the interactive webpage, the location of the detected instance of the target visual cue. However, Kang as modified does not explicitly teach wherein the interactive element of the GUI is identified by comparing a document object model ("DOM") of the interactive webpage with the location of the detected instance of the target visual cue.
Lori teaches wherein the interactive element of the GUI is identified by comparing a document object model ("DOM") of the interactive webpage with the location of the detected instance of the target visual cue. (Fig. 3, col. 2 lines 19-29 wherein the advertisement discovery equipment may determine from the Document Object Model (DOM) associated with a publisher web page that a particular advertisement is located at a particular location on a web page, determine one or more test points within the location of the advertisement, obtain the visible element at each test point (e.g., by requesting the visible element at
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to utilize the teaching of discovery and tracking of obscured web-based advertisements of Lori with the teaching of processing user speech of Kang as modified because Lori teaches providing improved systems for discovering and tracking internet-based advertisements that can distinguish between obscured and unobscured advertisements by determining a score for the advertisement. If the visible percentage is below a threshold, the advertisement discovery equipment may determine that the advertisement is obscured and may take suitable action for an obscured advertisement. If the visible percentage is above the threshold, the advertisement discovery equipment may determine that the advertisement is visible and may take suitable action for a visible advertisement (col. 1 lines 45-47, col. 2 lines 35-50).

Claim 15 is similar in scope to Claim 3; therefore, Claim 15 is rejected under the same rationale as Claim 3.

Claims 6, 8 & 18 are rejected under 35 U.S.C. 103 as being unpatentable over Kang in view of Gibson as applied to claims 1, 5 & 16 above, and further in view of PHAM et al. (U.S. Pub 2021/0081475) hereinafter Pham.

As per Claim 6, the rejection of claim 5 is hereby incorporated by reference; Kang as modified previously taught the automatic submitting. However, Kang as modified does not explicitly teach wherein the object recognition is performed using a machine learning model, and the method further includes training the machine learning model based on the outcome of the automatic submitting.
Pham teaches wherein the object recognition is performed using a machine learning model, and the method further includes training the machine learning model based on the outcome of the automatic submitting. (Fig. 1, ¶46, ¶47, ¶62 wherein content generation subsystem 116 may train a prediction model, such as a machine learning model, wherein the prediction model may be trained using training data including the initially accessed websites, the selected text from the initially accessed websites, and the subsequently accessed websites, wherein the prediction model may include one or more neural networks, wherein image item 204 may include an image of an object. Image item 204 may be analyzed using an object recognition computer vision model to determine the object included within the image, and a topic associated with the object may be determined by the object recognition computer vision model. In some embodiments, the object recognition computer vision model may be a convolutional neural network)
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to utilize the teaching of integrating content into web pages of Pham with the teaching of processing user speech of Kang as modified because Pham teaches integrating content into one or more online resources, including, for example, embedding hyperlinks into

As per Claim 8, the rejection of claim 1 is hereby incorporated by reference; Kang as modified previously taught the object recognition. However, Kang as modified does not explicitly teach wherein the object recognition is performed using a convolutional neural network.
Pham teaches wherein the object recognition is performed using a convolutional neural network. (Fig. 2B, Fig. 6, ¶62 wherein image item 204 may include an image of an object. Image item 204 may be analyzed using an object recognition computer vision model to determine the object included within the image, and a topic associated with the object may be determined by the object recognition computer vision model. In some embodiments, the object recognition computer vision model may be a convolutional neural network (CNN))
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to utilize the teaching of integrating content into web pages of Pham with the teaching of processing user speech of Kang as modified because Pham teaches integrating content into one or more online resources, including, for example, embedding hyperlinks into webpage content based on prior user interactions, allowing users to personalize documents based on their preferences. (¶1, ¶2)

Claim 18 is similar in scope to Claim 6; therefore, Claim 18 is rejected under the same rationale as Claim 6.

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Kang in view of Gibson as applied to claim 11 above, and further in view of Nolan et al. (U.S. Pat 9,954,729) hereinafter Nolan.

As per Claim 12, the rejection of claim 11 is hereby incorporated by reference; Kang as modified previously taught submission of the data determined from the user intent. However, Kang as modified does not explicitly teach further comprising validating that submission of the data determined from the user intent resulted in a desired outcome, wherein the script is generated based on the validating.
Nolan teaches further comprising validating that submission of the data determined from the user intent resulted in a desired outcome, wherein the script is generated based on the validating. (Fig. 5, col. 8 lines 38-67 wherein a test is conducted to determine whether the results of the validation rules is positive and if the results of the validation are positive, a user at the client 102 can optionally request the generation of the script subsequent to a confirmation of a valid processing)
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to utilize the teaching of provisioning and configuration of network infrastructure of Nolan with the teaching of processing user speech of Kang as modified because Nolan teaches a tool that utilizes pre-configured templates to collect information utilized in the configuration of the infrastructure equipment and automatically generate configuration

Related Art
The following related art is made of record and not relied upon:
Lee et al. (U.S. Pub 2020/0326832) for teaching that an AI system may be a neural network-based system wherein, when an electronic device receives a user utterance associated with an object on an image, the electronic device may recognize the object on the image by analyzing the image through a vision server, may generate information associated with the recognized object to provide the user with the information, and may organically process the image displayed on a screen and a user utterance.
Grant et al. (U.S. Pub 2003/0164855) for teaching a content management system for providing content that is specific to the execution context of a user application. The invention relates specifically to applications that have a display in a browser that generates a document object model. The system comprises a development tool that is operative to record a specific execution context of the user application. The development tool makes a record that includes at least part of a document object model constructed by the browser and associates content with the record.

Inquiry
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANGIE BADAWI whose telephone number is (571) 270-7590. The examiner can normally be reached Monday through Wednesday, 9:00am - 5:00pm EST, with Thursdays and Fridays off.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Renee Chavez, can be reached at (571) 270-1104. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ANGIE BADAWI/            Primary Examiner, Art Unit 2179