Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim(s) 18 is/are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 18 requires automatically performing the action on the matched interactable element to include “performing one of” a plurality of alternative actions, such as selecting a link or selecting an option among dropdown list options. However, claim 18 also requires the additional limitation of “selecting the option among the dropdown list options” to include additional recited steps. When a claim covers several alternatives, the claim is deemed anticipated if any of the alternatives within the scope of the claim are known in the prior art.  See Brown v. 3M, 265 F.3d 1349, 1351, 60 USPQ2d 1375, 1376 (Fed. Cir. 2001).  However, because claim 18 further defines the “selecting the option among the dropdown list options” alternative, it is unclear whether a prior art reference teaching, for example, selecting a link would anticipate the claim, or if the claim requires “selecting the option among the dropdown list options”, despite this being one of a plurality of alternatives.
For the purposes of examination, the broadest reasonable interpretation of claim 18 will be applied. That is, a reference teaching any one of the recited alternative actions will be considered to anticipate the claim. To expedite prosecution, it is suggested to cancel the “selecting the option among the dropdown list options” limitation from claim 18 and add a new dependent claim reciting the limitation.



Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1-5, 7-8, 10-12, and 18-20 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by James et al. (U.S. Patent No. 7,036,080, hereinafter “James”).
In regard to claim 1, James discloses a method, comprising: 
providing web content with a speech interaction user interface capability (a client computer accesses voice-enabled web content, column 4, lines 32-44); 
identifying interactable elements of the web content (HTML elements associated with an event and an action are identified, column 4, lines 45-56); 
for each of the interactable elements, determining one or more associated identifiers and associating in a data structure the determined one or more identifiers with a corresponding interactable element of the identified interactable elements (see Fig. 4B, a voice command corresponding to identifying text is assigned to each element, and a data structure comprising the element data, the associated voice command, the event, and the action is generated, column 4, lines 55-65 and column 5, lines 13-22); 
receiving a speech input from a user (a speech recognition engine recognizes received speech from a user, column 5, lines 23-30); 
using the data structure, matching one of the interactable elements to the received speech input (an element that matches a registered voice command from the data structure is selected, column 5, lines 30-33); and 
automatically performing an action on the matched interactable element (once an element is selected, an action or function associated with the element is performed, column 5, lines 34-37).
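For illustration only, the sequence of steps recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from James or from the application: the Element class, build_index, handle_speech, and the sample labels are hypothetical names, and the recognized speech is assumed to arrive as plain text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Element:
    """Hypothetical stand-in for an interactable element of web content."""
    tag: str
    label: str
    action: Callable[[], str]


def build_index(elements: List[Element]) -> Dict[str, Element]:
    """Associate one identifier (the lowercased label) with each element."""
    return {el.label.strip().lower(): el for el in elements}


def handle_speech(index: Dict[str, Element], recognized_text: str) -> Optional[str]:
    """Match recognized speech to an element and perform its action."""
    element = index.get(recognized_text.strip().lower())
    if element is None:
        return None  # no element matched the speech input
    return element.action()


# Assumed sample content: one button and one link.
elements = [
    Element("button", "Submit", lambda: "clicked Submit"),
    Element("a", "Contact Us", lambda: "followed Contact Us"),
]
index = build_index(elements)
result = handle_speech(index, "contact us")
```

In this sketch the dictionary plays the role of the recited data structure, and calling the stored callable stands in for automatically performing the action.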

In regard to claim 2, James discloses the web content is a webpage (column 4, lines 11-14).

In regard to claim 3, James discloses providing the web content includes inserting code into the web content provided by a server from a web content source to enable the speech interaction user interface capability (an extension adds data to the HTML content to voice-enable the web page, column 4, lines 23-31).

In regard to claim 4, James discloses providing the web content includes using a web browser plugin or add-on to enable the speech interaction user interface capability (an extension module within a browser voice-enables the web page, column 4, lines 32-44).

In regard to claim 5, James discloses identifying the interactable elements of the web content includes identifying Hypertext Markup Language (HTML) elements with tags associated with elements that a user is able to interact with (HTML elements are identified and identifying text is extracted, column 4, lines 45-56).

In regard to claim 7, James discloses identifying the interactable elements of the web content includes querying a document object model of the web content (a parser parses the DOM of the web content, column 4, lines 45-56).

In regard to claim 8, James discloses the interactable elements include one or more of the following elements: a link element, a button element, a textbox element, a dropdown list element, a checkbox element, or a radio button element (e.g., a selectable button, column 5, lines 55-63).

In regard to claim 10, James discloses determining the one or more associated identifiers for each of the interactable elements includes extracting one or more attribute content or tagged content from a specification of the corresponding interactable element (identifying text is extracted from a label tag of an HTML element, column 4, lines 45-65).

In regard to claim 11, James discloses associating in the data structure the determined one or more identifiers with the corresponding interactable element includes storing a key-value entry in the data structure that includes a normalized version of at least one of the one or more identifiers as a key of the key-value entry and a reference to the corresponding interactable element as a value of the key-value entry (see Fig. 4B, a voice command for an HTML element is stored in the data structure as a key, and ID/Reference values, Event values, and Action values are stored as values in the data structure, column 5, lines 10-22).

In regard to claim 12, James discloses matching one of the interactable elements to the received speech input includes comparing content of the speech input with key values of entries in the data structure (the voice commands stored in the data structure act as a grammar to which the input speech is compared, column 5, lines 13-33).
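For illustration only, the key-value arrangement addressed in claims 11 and 12 might look like the sketch below. It is offered under assumptions and does not reproduce Fig. 4B of James: the normalization rule (lowercasing, removing punctuation, collapsing whitespace), the function names, and the selector-style element references are all hypothetical.

```python
import re
from typing import Dict, List


def normalize(identifier: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace (assumed rule)."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", identifier.lower())
    return " ".join(cleaned.split())


def add_entries(table: Dict[str, str], element_ref: str, identifiers: List[str]) -> None:
    """Store one key-value entry per identifier: normalized key, element reference value."""
    for identifier in identifiers:
        key = normalize(identifier)
        if key:
            table[key] = element_ref


table: Dict[str, str] = {}
add_entries(table, "#search-button", ["Search!", "  Find   Items "])
add_entries(table, "#email-field", ["E-mail"])

# Comparing the content of a speech input with the stored keys.
spoken = "find items"
matched_ref = table.get(normalize(spoken))
```

Normalizing both the stored identifiers and the speech input with the same function is what lets a plain dictionary lookup serve as the comparison in this sketch.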

In regard to claim 18, James discloses automatically performing the action on the matched interactable element includes performing one of the following: selecting a link, selecting a button, selecting a checkbox, selecting a radio button, selecting an option among dropdown list options, or inputting text content into a textbox; and wherein selecting the option among the dropdown list options includes determining based on the received speech input a corresponding similarity score for each option among the dropdown list options that have been identified from the web content and selecting the option with the best similarity score among the dropdown list options (e.g., selecting a button, column 5, lines 34-54).

In regard to claim 19, James discloses a system (Fig. 2, 200), comprising: 
one or more processors (processor 205) configured to: 
provide web content with a speech interaction user interface capability (a client computer accesses voice-enabled web content, column 4, lines 32-44); 
identify interactable elements of the web content (HTML elements associated with an event and an action are identified, column 4, lines 45-56); 
for each of the interactable elements, determine one or more associated identifiers and associate in a data structure the determined one or more identifiers with a corresponding interactable element of the identified interactable elements (see Fig. 4B, a voice command corresponding to identifying text is assigned to each element, and a data structure comprising the element data, the associated voice command, the event, and the action is generated, column 4, lines 55-65 and column 5, lines 13-22); 
receive a speech input from a user (a speech recognition engine recognizes received speech from a user, column 5, lines 23-30); 
use the data structure to match one of the interactable elements to the received speech input (an element that matches a registered voice command from the data structure is selected, column 5, lines 30-33); and 
automatically perform an action on the matched interactable element (once an element is selected, an action or function associated with the element is performed, column 5, lines 34-37); and
a memory coupled to the one or more processors and configured to provide the one or more processors with instructions (memory 225).

In regard to claim 20, James discloses a computer program product, the computer program product being embodied in a non-transitory computer readable storage medium and comprising computer instructions (column 7, lines 1-10) for:
providing web content with a speech interaction user interface capability (a client computer accesses voice-enabled web content, column 4, lines 32-44); 
identifying interactable elements of the web content (HTML elements associated with an event and an action are identified, column 4, lines 45-56); 
for each of the interactable elements, determining one or more associated identifiers and associating in a data structure the determined one or more identifiers with a corresponding interactable element of the identified interactable elements (see Fig. 4B, a voice command corresponding to identifying text is assigned to each element, and a data structure comprising the element data, the associated voice command, the event, and the action is generated, column 4, lines 55-65 and column 5, lines 13-22); 
receiving a speech input from a user (a speech recognition engine recognizes received speech from a user, column 5, lines 23-30); 
using the data structure, matching one of the interactable elements to the received speech input (an element that matches a registered voice command from the data structure is selected, column 5, lines 30-33); and 
automatically performing an action on the matched interactable element (once an element is selected, an action or function associated with the element is performed, column 5, lines 34-37).


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 6 is/are rejected under 35 U.S.C. 103 as being unpatentable over James, in view of Thrift et al. (U.S. Patent No. 6,188,985, hereinafter “Thrift”).
In regard to claim 6, the elements identified by James in the exemplary web content (Fig. 3) are all visible. However, James does not expressly disclose identifying the interactable elements of the web content includes determining not to identify non-visible Hypertext Markup Language (HTML) elements.
Thrift discloses a method for providing web content with a speech interaction user interface capability, wherein identifying the interactable elements of the web content includes determining not to identify non-visible Hypertext Markup Language (HTML) elements (when making speakable links for a web page, only HTML links that are visible are made speakable, column 5, lines 26-31).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to determine not to identify non-visible HTML elements, because a web page in HTML format can have any length, and not identifying non-visible HTML elements reduces the amount of required memory, as taught by Thrift (column 5, lines 26-31 and lines 48-52).


Claim(s) 9 is/are rejected under 35 U.S.C. 103 as being unpatentable over James, in view of Profit et al. (WO 9948088, hereinafter “Profit”).
In regard to claim 9, James discloses determining the one or more associated identifiers for each of the interactable elements includes extracting the one or more associated identifiers from a specification of the corresponding interactable element (identifying text is extracted from HTML elements, column 4, lines 45-56).
James does not disclose using a rule specifically selected for the corresponding interactable element among a plurality of different rules based on a tag type of the corresponding interactable element.
Profit discloses a method for determining the one or more associated identifiers which includes using a rule specifically selected for the corresponding interactable element among a plurality of different rules based on a tag type of the corresponding interactable element (an HTML tag type is determined and speakable entities are determined according to rules for the tag, page 13, lines 1-17).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to use a rule specifically selected for the corresponding interactable element among a plurality of different rules based on a tag type of the corresponding interactable element, because it would allow the user to manipulate a variety of user interface controls by the use of voice commands, as suggested by Profit (page 3, lines 15-18).
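For illustration only, selecting an identifier-extraction rule based on tag type could be sketched as below. This is a hedged sketch and is not the rule set of Profit: the ElementSpec view of an element, the individual rules, and the attribute choices are assumptions made for the example.

```python
from typing import Callable, Dict, List

# Hypothetical flat view of an element: its tag name, attributes, and inner text.
ElementSpec = Dict[str, str]


def link_rule(spec: ElementSpec) -> List[str]:
    """Links: use the visible link text, then the title attribute."""
    return [spec[k] for k in ("text", "title") if spec.get(k)]


def input_rule(spec: ElementSpec) -> List[str]:
    """Text inputs have no inner text, so fall back to descriptive attributes."""
    return [spec[k] for k in ("aria-label", "placeholder", "name") if spec.get(k)]


def default_rule(spec: ElementSpec) -> List[str]:
    """Any other tag: use the inner text if present."""
    return [spec["text"]] if spec.get("text") else []


# One rule per tag type; tags without an entry use the default rule.
RULES: Dict[str, Callable[[ElementSpec], List[str]]] = {
    "a": link_rule,
    "input": input_rule,
}


def identifiers_for(spec: ElementSpec) -> List[str]:
    """Select the rule by tag type and apply it to the element."""
    rule = RULES.get(spec["tag"], default_rule)
    return rule(spec)


link_ids = identifiers_for({"tag": "a", "text": "Contact Us", "title": "Reach support"})
input_ids = identifiers_for({"tag": "input", "placeholder": "Search products"})
button_ids = identifiers_for({"tag": "button", "text": "Submit"})
```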


Claim(s) 13-15 is/are rejected under 35 U.S.C. 103 as being unpatentable over James, in view of Zeigler et al. (U.S. Patent Application Pub. No. 2014/0350928, hereinafter “Zeigler”).
In regard to claim 13, James does not disclose comparing content of the speech input with the key values of entries in the data structure includes performing n-gram matching.
Zeigler discloses a method for speech enabling a user interface, wherein comparing content of the speech input with the key values of entries in the data structure includes performing n-gram matching (grammars created from an HTML document comprise n-gram sub-phrases, paragraphs [0060]-[0061]).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to compare content of the speech input with the key values of entries in the data structure by performing n-gram matching, because n-grams allow additional phrases to be associated with an element, which improves the accuracy of the voice interface, as taught by Zeigler (paragraph [0005]).
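For illustration only, one way n-gram sub-phrases can associate additional phrases with an element is sketched below. This is a minimal sketch assuming word-level n-grams and exact lookup; it is not the grammar construction of Zeigler, and word_ngrams and build_ngram_index are hypothetical names.

```python
from typing import Dict, List, Set


def word_ngrams(phrase: str) -> List[str]:
    """Return every contiguous word sub-phrase of the phrase, shortest first."""
    words = phrase.lower().split()
    return [
        " ".join(words[start:start + size])
        for size in range(1, len(words) + 1)
        for start in range(len(words) - size + 1)
    ]


def build_ngram_index(keys: List[str]) -> Dict[str, Set[str]]:
    """Map each n-gram to the set of full keys that contain it."""
    index: Dict[str, Set[str]] = {}
    for key in keys:
        for gram in word_ngrams(key):
            index.setdefault(gram, set()).add(key)
    return index


index = build_ngram_index(["add to cart", "view cart"])
candidates_for_cart = index.get("cart", set())   # ambiguous: two keys share "cart"
candidates_for_add = index.get("add to", set())  # unambiguous partial phrase
```

A partial utterance such as "add to" resolves to a single key, while "cart" alone yields two candidates and would need further disambiguation.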

In regard to claim 14, James does not disclose comparing content of the speech input with the key values of entries in the data structure includes determining a similarity score for each of the entries in the data structure and comparing the similarity scores.
Zeigler discloses comparing content of the speech input with the key values of entries in the data structure includes determining a similarity score for each of the entries in the data structure and comparing the similarity scores (input voice is compared to phrases in the grammar and confidence scores are determined indicating a match between the voice input and the grammar, paragraph [0063]).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to compare content of the speech input with the key values of entries in the data structure by determining a similarity score for each of the entries in the data structure and comparing the similarity scores, because this would improve the accuracy of the voice interface, as taught by Zeigler (paragraph [0005]).

In regard to claim 15, James does not disclose the matched one of the interactable elements is determined to be associated with a highest one of the determined similarity scores that is above a threshold value.
Zeigler discloses the matched one of the interactable elements is determined to be associated with a highest one of the determined similarity scores that is above a threshold value (a determination is made if there are multiple good matches, or a single good match, paragraph [0063]).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to determine that the matched one of the interactable elements is associated with a highest one of the determined similarity scores that is above a threshold value, because the action associated with the element would then be performed automatically only if there were no additional similarly scored matches, as suggested by Zeigler (paragraph [0063]).
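For illustration only, the scoring and threshold limitations of claims 14 and 15 can be sketched as below. The sketch assumes a character-level ratio from Python's difflib as the similarity score and an arbitrary threshold of 0.6; neither choice comes from Zeigler, whose confidence scores are produced by a speech recognizer, and the function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional


def similarity(a: str, b: str) -> float:
    """Assumed similarity score in [0, 1] based on matching character runs."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(spoken: str, table: Dict[str, str], threshold: float = 0.6) -> Optional[str]:
    """Score every entry, then return the top-scoring element only if it exceeds the threshold."""
    scores = {key: similarity(spoken, key) for key in table}
    if not scores:
        return None
    best_key = max(scores, key=lambda key: scores[key])
    return table[best_key] if scores[best_key] > threshold else None


table = {"submit order": "#submit", "cancel order": "#cancel"}
exact = best_match("submit order", table)
near = best_match("submit the order", table)
unrelated = best_match("weather", table)
```

Returning no match when the best score does not exceed the threshold keeps an unrelated utterance from triggering an action.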

Claim(s) 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over James, in view of Guy (U.S. Patent Application Pub. No. 2020/0111491).
In regard to claim 16, James does not disclose automatically performing the action on the matched interactable element includes identifying the action for the matched interactable element and determining whether sufficient information has been specified in the speech input to perform the action.
Guy discloses a method for performing an action on a matched interactable element which includes identifying the action for the matched interactable element and determining whether sufficient information has been specified in the speech input to perform the action (for example, the system determines whether all the information required to fill a selected form using voice commands has been received, paragraph [0172]).
It would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to identify the action for the matched interactable element and determine whether sufficient information has been specified in the speech input to perform the action, because this can make the overall interaction experience with the content more seamless, as taught by Guy (paragraph [0169]).
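For illustration only, a check for whether a speech input supplies enough information to perform an identified action could be sketched as below. This is a hedged sketch and not the form-filling logic of Guy: the action names, the REQUIRED_INFO table, and the representation of the parsed speech input as a dictionary are assumptions.

```python
from typing import Dict, List

# Hypothetical table of the information each action needs before it can run.
REQUIRED_INFO: Dict[str, List[str]] = {
    "click": [],
    "input_text": ["text"],
    "select_option": ["option"],
}


def missing_information(action: str, provided: Dict[str, str]) -> List[str]:
    """Return the names of required items that the speech input did not supply."""
    return [name for name in REQUIRED_INFO[action] if not provided.get(name)]


def has_sufficient_information(action: str, provided: Dict[str, str]) -> bool:
    """True when nothing required for the action is missing."""
    return not missing_information(action, provided)


ready = has_sufficient_information("input_text", {"text": "blue shoes"})
not_ready = has_sufficient_information("select_option", {})
still_needed = missing_information("select_option", {})
```

When the check fails, a system of this kind could prompt the user for the missing item rather than performing the action.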


Allowable Subject Matter
Claim 17 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
While the prior art of record discloses various methods for extracting text content for input to a textbox, James and the additional prior art of record do not disclose or suggest identifying parts of speech for words in the speech input, determining a character position within content of the speech input associated with a longest common string between the content of the speech input and an identifier of the matched interactable element, and extracting text content to be inputted in a textbox of the matched interactable element based on the identified parts of speech and the determined character position, as required by claim 17.
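For illustration only, the character-position portion of this limitation can be sketched as below. The sketch covers only locating the longest common string between a speech input and an identifier; the parts-of-speech step is omitted, and the sketch is not represented to be the applicant's implementation. The function name and sample utterance are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Tuple


def longest_common_position(speech: str, identifier: str) -> Tuple[int, int]:
    """Return (start index in speech, length) of the longest common substring."""
    s, i = speech.lower(), identifier.lower()
    matcher = SequenceMatcher(None, s, i, autojunk=False)
    match = matcher.find_longest_match(0, len(s), 0, len(i))
    return match.a, match.size


speech = "enter blue shoes in search box"
start, size = longest_common_position(speech, "Search Box")

# Text preceding the identifier; further filtering (for example, by part of
# speech) would be needed to isolate the content to be entered.
remainder = speech[:start].strip()
```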


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Mauro et al., Koehler, Ito et al., Katsuranis, Mahajan, Moore et al., Reich et al., Ringuette, Stent et al., and Thrift et al. disclose additional methods of voice enabling web pages.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRIAN LOUIS ALBERTALLI whose telephone number is (571)272-7616. The examiner can normally be reached Mon-Thurs 9AM-3PM (Part time).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





BLA 5/19/22
/BRIAN L ALBERTALLI/               Primary Examiner, Art Unit 2656