DETAILED ACTION
This action is in response to the reply filed 3/16/2021.
Claims 2-21 are currently pending.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments, see pages 2-4 of the reply filed 3/16/2021, with respect to the rejection of claims 1, 5, 9-12, 15, and 19-21 under 35 U.S.C. 103 as being unpatentable over Kiss in view of Marila have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground of rejection is made in view of Blumenberg.

The Double Patenting rejections are being deferred per applicant’s request.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For

Claims 2 and 12 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 4, 12, and 15 of U.S. Patent No. 10,048,842, hereinafter ‘842.

Instant Claim 2 is taught by Claim 1 of ‘842 as follows:
Instant Claim 2: A method comprising:
Claim 1 of ‘842: A computer-implemented method comprising:

Instant Claim 2: receiving, at a user device, audio data corresponding to an utterance spoken by a user of the user device;
Instant Claim 2: processing, by the user device, the audio data to generate a transcription of the utterance, the transcription comprising a plurality of words,
Instant Claim 2: displaying, by the data processing hardware, the transcription of the utterance on a proximity-sensitive display of the user device;

Instant Claim 2: receiving, at the user device, data indicating a touch received on the proximity-sensitive display;
Claim 1 of ‘842: receiving data indicating a touch received at a second location on the proximity-sensitive display;

Instant Claim 2: each of the plurality of words in the transcription comprising a respective confidence value indicating a likelihood that the corresponding word is correct;
Instant Claim 2: in response to receiving the data indicating the touch, selecting, by the user device, one of the plurality of words in the transcription displayed on the proximity-sensitive display based on the respective confidence value of each word of the plurality of words and a touch location on the proximity-sensitive display where the touch is received; and
Claim 1 of ‘842: determining a confidence value that reflects an input-to-text engine's confidence that text associated with the text object accurately represents an input; determining whether the touch received through the proximity-sensitive display represents a selection of the text based at least on (i) the confidence value that reflects the input-to-text engine's confidence that the text is an accurate representation of the input, (ii) the first location of the text object on the proximity-sensitive display, and (iii) the second location of the touch on the proximity-sensitive display; and

Instant Claim 2: augmenting, by the user device, the selected one of the plurality of words in the transcription displayed on the proximity-sensitive display.

Instant claim 12 is taught by claims 12 and 15 of ‘842 similarly to instant claim 2, and is rejected using identical logic.

Claims 2 and 12 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-3 and 8-10 of U.S. Patent No. 10,545,647, hereinafter ‘647.

Instant Claim 2 is taught by Claim 1 of ‘647 as follows:
Instant Claim 2: A method comprising:
Claim 1 of ‘647: A computer-implemented method performed using a system that processes selections of items, the method comprising:

Instant Claim 2: receiving, at a user device, audio data corresponding to an utterance spoken by a user of the user device;
Instant Claim 2: displaying, by the data processing hardware, the transcription of the utterance on a proximity-sensitive display of the user device;

Instant Claim 2: receiving, at the user device, data indicating a touch received on the proximity-sensitive display;
Claim 1 of ‘647: receiving data for a user input that includes image data and that was received at a second location on the display;
Claim 1 of ‘647: generating an object tag for the item of information based on objects depicted in the image data,

Instant Claim 2: each of the plurality of words in the transcription comprising a respective confidence value indicating a likelihood that the corresponding word is correct;
Claim 1 of ‘647: determining a confidence value that indicates a confidence that the item of information accurately represents a particular type of input based on a recognition process that was used to generate the object tag;
Claim 1 of ‘647: determining whether the user input represents a selection of the item of information based at least on: (i) the confidence value, and (ii) the second location of the user input;

Instant Claim 2: augmenting, by the user device, the selected one of the plurality of words in the transcription displayed on the proximity-sensitive display.
Claim 1 of ‘647: and providing an indication of whether the user input represents selection of the item of information.


While claim 1 of ‘647 does not explicitly teach that the input is that of receiving audio data and performing a transcription, claims 2 and 3 of ‘647 teach this feature, and are dependent upon claim

Instant claim 12 is taught by claims 8-10 of ‘647 similarly to instant claim 2, and is rejected using identical logic.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 5, 9-12, 15, and 19-21 are rejected under 35 U.S.C. 103 as being unpatentable over Kiss et al. (US 2006/0293889), hereinafter Kiss, in view of Marila et al. (US 2009/0326938), hereinafter Marila, and further in view of Blumenberg et al. (US 2008/0165133), hereinafter Blumenberg.

As per claim 2, Kiss teaches the following:
a method comprising: 
receiving, at a user device, audio data corresponding to an utterance spoken by a user of the user device.  See Fig. 3a, step 301; 
processing, by the user device, the audio data to generate a transcription of the utterance, the transcription comprising a plurality of words, each of the plurality of words in the transcription comprising a respective confidence value indicating a likelihood that the corresponding word is correct.  As Kiss teaches in paragraph [0068], and corresponding Fig. 3a, steps 302 and 303, speech recognition is performed on the input speech to obtain a sequence of words.  As Kiss teaches in the abstract, the words each
displaying, by the data processing hardware, the transcription of the utterance on a proximity-sensitive display of the user device.  See Fig. 3a, step 304; 
receiving, at the user device, data indicating a touch received on the proximity-sensitive display.  As Kiss shows in Fig. 3a, step 305, a word is selected;
in response to receiving the data indicating the touch, selecting, by the user device, one of the plurality of words in the transcription displayed on the proximity-sensitive display based on the respective confidence value of each word of the plurality of words and a touch location on the proximity-sensitive display where the touch is received.  As Kiss shows in Fig. 3a, steps 305 and 306, a word is selected; and
augmenting, by the user device, the selected one of the plurality of words in the transcription displayed on the proximity-sensitive display.  As Kiss shows in Fig. 3b, step 326, and Fig. 6, a selected word is “augmented” with the presentation of word candidates.
However, Kiss does not explicitly teach a touch-sensitive display.  In a similar field of invention, Marila teaches in the abstract a method for correcting text in a speech recognition system.  Marila teaches in paragraph [0021] that a touch/proximity screen may be utilized for user input.  Marila further teaches in paragraph [0024] that the user may touch words for selection, such as touching a single letter to select an entire word.
It would have been obvious to one of ordinary skill in the art at the time the application was filed to have modified the selection method of Kiss with the touch screen of Marila. One of ordinary skill would have been motivated to have made such 
Furthermore, Kiss does not explicitly teach the word selection being based on the touch location in combination with the confidence score.  Blumenberg teaches in the abstract selecting graphical objects based upon the proximity of the object to a determined touch point.  It would have been obvious to one of ordinary skill in the art at the time the application was filed to have modified the word selection of Kiss in view of Marila with the proximity value of Blumenberg.  One of ordinary skill would have been motivated to have made such modification because, as Blumenberg teaches in paragraph [0004], such proximity selection benefits the user of small displays where desired objects for selection may be small and clustered closely together.
Upon the further modification of Kiss in view of Blumenberg, the combined method would utilize the confidence values of Kiss and proximity values of Blumenberg.  One of ordinary skill would have been motivated to have utilized both values because as Kiss teaches in paragraph [0006], properly allowing for user desired text benefits the user in correcting errors.
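For illustration only, the combined use of a confidence value and a proximity value discussed above may be sketched as follows.  The scoring formula, the weighting parameter alpha, the scale parameter, and all names below are assumptions made for this sketch; they are not taken from Kiss, Marila, or Blumenberg, and none of those references is cited for any particular formula.

```python
import math
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    confidence: float  # likelihood that the recognized word is correct, 0..1
    x: float           # displayed location of the word (assumed units: pixels)
    y: float


def select_word(words, touch_x, touch_y, alpha=0.5, scale=100.0):
    """Pick the word most likely intended for correction (hypothetical scoring)."""
    def score(word):
        distance = math.hypot(word.x - touch_x, word.y - touch_y)
        proximity = 1.0 / (1.0 + distance / scale)  # 1.0 at the touch point, smaller farther away
        doubt = 1.0 - word.confidence               # low-confidence words are likelier targets
        return alpha * proximity + (1.0 - alpha) * doubt

    return max(words, key=score)


words = [Word("recognize", 0.95, 100, 50), Word("beach", 0.40, 190, 50)]
# The touch lands slightly closer to "recognize" than to "beach".
selected = select_word(words, touch_x=140, touch_y=50)
```

In this sketch the touch is nearer the high-confidence word, yet the low-confidence word is selected because its confidence term outweighs the small difference in proximity.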

Regarding claim 5, modified Kiss teaches the method of claim 2 as described above.  Kiss further teaches the following:
each word of the plurality of words in the transcription are displayed at a respective location on the proximity-sensitive display; and selecting the one of the plurality of words in the transcription displayed on the proximity-sensitive display is further based on the respective location each word in the transcription is displayed on the proximity-sensitive display.  As Kiss shows in Fig. 4, and corresponding paragraph [0070], each word is presented at a respective location for selection.
Regarding claim 9, modified Kiss teaches the method of claim 2 as described above.  However, Kiss does not explicitly teach highlighting the selected word.  Marila further teaches the following:
augmenting the selected one of the plurality of words in the transcription displayed on the proximity-sensitive display comprises highlighting the selected one of the plurality of words.  As Marila teaches in paragraph [0022], a selected text may be indicated by highlighting.
It would have been obvious to one of ordinary skill in the art at the time the application was filed to have modified the selection method of Kiss with the highlighting of selected text of Marila. One of ordinary skill would have been motivated to have made such modification because such highlighting would benefit the user in providing visual confirmation of a selection.

Regarding claim 10, modified Kiss teaches the method of claim 2 as described above.  Kiss further teaches the following:
wherein augmenting the selected one of the plurality of words in the transcription displayed on the proximity-sensitive display comprises superimposing a menu of alternative hypothesized words.  As Kiss shows in Fig. 3b, step 326, and Fig. 6, a selected word is “augmented” with the presentation of word candidates.

Regarding claim 11, modified Kiss teaches the method of claim 10 as described above.  Kiss further teaches the following:
receiving, at the user device, a user input indication indicating selection of an alternative hypothesized word in the menu of alternative hypothesized words; and replacing, by the user device, the selected one of the plurality of words in the transcription displayed on the proximity-sensitive display with the selected alternative hypothesized word.  As Kiss teaches in the abstract, a candidate word replaces a selected word in response to user input.

As per claim 12, Kiss teaches the following:
a user device comprising: 
data processing hardware, (see Fig. 1, 100); and 
memory hardware in communication with the data processing hardware and storing instructions, (see Fig. 1, 101).
The remaining limitations of claim 12 are substantially similar to those of claim 2 and are rejected using identical reasoning.

Regarding claim 15, modified Kiss teaches the device of claim 12 as described above.  The remaining limitations of claim 15 are substantially similar to those of claim 5 and are rejected using identical reasoning.

Regarding claims 19-21, modified Kiss teaches the device of claim 12 as described above.  The remaining limitations of claims 19-21 are substantially similar to those of claims 9-11 respectively and are rejected using identical reasoning.

Claims 3, 4, 13, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Kiss in view of Marila in view of Blumenberg as applied to claims 2 and 12 above, and further in view of Waibel et al. (US 5,712,957), hereinafter Waibel.

Regarding claim 3, modified Kiss teaches the method of claim 2 as described above.  However, Kiss does not explicitly teach of a speech recognizer generating a word lattice.  Waibel teaches the following:
processing the audio data to generate the transcription of the utterance comprises executing a speech recognizer on the user device, the speech recognizer configured to: generate a word lattice for the utterance; and generate the transcription of the utterance based on the word lattice.  As Waibel teaches in column 3, lines 15-17, a lattice of hypotheses is utilized by a recognition engine in generating a transcription of a primary utterance.
It would have been obvious to one of ordinary skill in the art at the time the application was filed to have modified the speech recognition of Kiss with the lattice of Waibel.  One of ordinary skill would have been motivated to have made such 

Regarding claim 4, modified Kiss teaches the method of claim 3 as described above.  However, as described above, Kiss does not explicitly teach of a speech recognizer generating a word lattice.  Waibel teaches the following:
the word lattice comprises multiple hypotheses for the transcription of the utterance.  As Waibel teaches in column 3, lines 15-17, a lattice of hypotheses is utilized by a recognition engine in generating a transcription of a primary utterance.
It would have been obvious to one of ordinary skill in the art at the time the application was filed to have modified the speech recognition of Kiss with the lattice of Waibel.  One of ordinary skill would have been motivated to have made such modification because, as Waibel teaches in column 3, lines 10-30, such a recognition technique benefits a user in better repairing text translations of recognized speech.
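For illustration only, a word lattice comprising multiple hypotheses for a transcription may be sketched as follows.  The edge representation, the log scores, and the example words are assumptions made for this sketch and are not taken from Waibel or Kiss.

```python
from collections import defaultdict

# Hypothetical lattice: each edge is (from_node, to_node, word, log_score).
EDGES = [
    (0, 1, "recognize", -0.1),
    (0, 1, "wreck a nice", -1.2),
    (1, 2, "speech", -0.2),
    (1, 2, "beach", -0.9),
]


def hypotheses(edges, start, end):
    """Enumerate every path through the lattice as (words, total_log_score)."""
    outgoing = defaultdict(list)
    for src, dst, word, score in edges:
        outgoing[src].append((dst, word, score))

    def walk(node, words, total):
        if node == end:
            yield words, total
            return
        for dst, word, score in outgoing[node]:
            yield from walk(dst, words + [word], total + score)

    return list(walk(start, [], 0.0))


all_hyps = hypotheses(EDGES, start=0, end=2)
# The transcription is taken from the highest-scoring path.
best_words, best_score = max(all_hyps, key=lambda h: h[1])
transcription = " ".join(best_words)
```

The lower-scoring paths remain available in all_hyps, which is what would allow alternative words to be offered when a displayed word is later selected for correction.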

Regarding claims 13 and 14, modified Kiss teaches the device of claim 12 as described above.  The remaining limitations of claims 13 and 14 are substantially similar to those of claims 3 and 4 and are rejected using identical reasoning.

Claims 6, 7, 16, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Kiss in view of Marila in view of Blumenberg as applied to claims 2, 5, 12, and 15 above, and further in view of Pohjola et al. (US 2009/0006958), hereinafter Pohjola.

Regarding claim 6, modified Kiss teaches the method of claim 5 as described above.  However, Kiss does not explicitly teach of determining a distance between touch location and words for selection.  Pohjola teaches the following:
for each word of the plurality of words in the transcription: determining, by the user device, a distance between the touch location on the proximity-sensitive display where the touch is received and the respective location the corresponding word in the transcription is displayed on the proximity-sensitive display; and when the distance satisfies a threshold distance, excluding, by the user device, the corresponding word from consideration as the selected one of the plurality of words in the transcription displayed on the proximity-sensitive display.  As Pohjola teaches in paragraph [0036], a distance from a touch event to candidate targets may be utilized for making a selection.  Pohjola further teaches in paragraph [0037] that a “predetermined threshold distance” may be utilized in selecting candidates.
It would have been obvious to one of ordinary skill in the art at the time the application was filed to have modified the speech recognition of Kiss with the proximity selection of Pohjola.  One of ordinary skill would have been motivated to have made such modification because, as Pohjola teaches in paragraph [0006], such proximity selection benefits a user in selection accuracy on small screen devices.

Regarding claim 7, modified Kiss teaches the method of claim 6 as described above.  However, as described above, Kiss does not explicitly teach of determining a distance between touch location and words for selection.  Pohjola teaches the following:
the threshold distance is based on a size and resolution of the proximity-sensitive display.  As Pohjola teaches in paragraph [0037], the threshold distance may also be determined based on screen size or resolution of the touch screen display.
It would have been obvious to one of ordinary skill in the art at the time the application was filed to have modified the speech recognition of Kiss with the proximity selection of Pohjola.  One of ordinary skill would have been motivated to have made such modification because, as Pohjola teaches in paragraph [0006], such proximity selection benefits a user in selection accuracy on small screen devices.
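For illustration only, the threshold-distance limitations of claims 6 and 7 may be sketched as follows.  The rule for deriving the threshold from screen width and pixel density, the default parameter values, and all names are assumptions made for this sketch; Pohjola is cited above only for the use of a predetermined threshold distance that may depend on screen size or resolution.

```python
import math


def threshold_distance(screen_width_px, pixels_per_inch, fraction=0.125, min_inches=0.25):
    """Hypothetical rule: a fraction of screen width, but never less than min_inches on the glass."""
    return max(fraction * screen_width_px, min_inches * pixels_per_inch)


def candidates_near_touch(word_locations, touch, threshold):
    """Keep only words displayed within the threshold of the touch; exclude the rest."""
    touch_x, touch_y = touch
    kept = []
    for word, (x, y) in word_locations.items():
        if math.hypot(x - touch_x, y - touch_y) <= threshold:
            kept.append(word)
    return kept


locations = {"recognize": (100, 50), "speech": (190, 50), "today": (400, 50)}
limit = threshold_distance(screen_width_px=480, pixels_per_inch=160)
nearby = candidates_near_touch(locations, touch=(150, 50), threshold=limit)
```

Here "today" lies farther from the touch than the computed threshold and is excluded from consideration, while the two nearer words remain candidates.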

Regarding claims 16 and 17, modified Kiss teaches the device of claim 15 as described above.  The remaining limitations of claims 16 and 17 are substantially similar to those of claims 6 and 7 and are rejected using identical reasoning.

Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Kiss in view of Marila in view of Blumenberg as applied to claims 2, 5, 12, and 15 above, and further in view of Roth et al. (US 2011/0043455), hereinafter Roth.

Regarding claim 8, modified Kiss teaches the method of claim 5 as described above.  However, Kiss does not explicitly teach that the respective locations correspond to a centroid of the word.  Roth teaches the following:
the respective location each word in the transcription is displayed on the proximity-sensitive display corresponds to a respective location of a centroid of the corresponding word displayed on the proximity-sensitive display.  As Roth teaches in paragraph [0033], word locations are calculated via the words’ center.
It would have been obvious to one of ordinary skill in the art at the time the application was filed to have modified the word locations of Kiss with the word centers of Roth.  One of ordinary skill would have been motivated to have made such modification because, as Roth teaches in paragraph [0030], such centroid analysis of words benefits a user through more accurate interpretation of word locations.
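For illustration only, locating each displayed word by its centroid may be sketched as follows.  The bounding-box representation (left, top, width, height) and the example coordinates are assumptions made for this sketch and are not taken from Roth.

```python
def centroid(left, top, width, height):
    """Center point of a word's on-screen bounding box."""
    return (left + width / 2.0, top + height / 2.0)


# Hypothetical bounding boxes in pixels: (left, top, width, height).
word_boxes = {"recognize": (60, 40, 80, 20), "speech": (150, 40, 60, 20)}
word_locations = {word: centroid(*box) for word, box in word_boxes.items()}
```

Using a single center point per word gives each word one well-defined location from which a distance to the touch location can be measured.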

Regarding claim 18, modified Kiss teaches the device of claim 15 as described above.  The remaining limitations of claim 18 are substantially similar to those of claim 8 and are rejected using identical reasoning.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GREGORY A DISTEFANO whose telephone number is (571)270-1644.  The examiner can normally be reached on Monday - Friday: 9 am - 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/GREGORY A. DISTEFANO/
Examiner
Art Unit 2175



/WILLIAM L BASHORE/
Supervisory Patent Examiner, Art Unit 2175