DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Response to Amendment
This communication is responsive to the applicant’s amendment dated 07/27/2022.  The applicant(s) amended claims 1, 8, and 15.

Response to Arguments
Applicant's arguments with respect to claims 1, 8, and 15 have been considered but are moot in view of the new ground(s) of rejection because the arguments pertain to the newly amended limitations.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1, 8, and 15 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Regarding claims 1, 8, and 15, the claims recite “selecting individual portions of the one or more portions of the first audio stream to be combined with select individual portions of the one or more portions of the second audio stream based upon, at least in part, a weighted signal-to-noise ratio for each of the one or more portions of the first audio stream and each of the one or more portions of the second audio stream, wherein the selected individual portions of the one or more portions of the first audio stream is less than the first audio stream and wherein the select individual portions of the one or more portions of the second audio stream is less than the second audio stream.” However, the specification and drawings do not show any support for a weighted signal-to-noise ratio. At best, there is support for weighting the first audio stream and the second audio stream based upon, at least in part, a signal-to-noise ratio for the first audio stream and a signal-to-noise ratio for the second audio stream, thus defining a first audio stream weight and a second audio stream weight (par. 0164 of the published specification). There is no evidence for a scaled or weighted signal-to-noise ratio value.

Claim Rejections - 35 USC § 103
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Contolini et al. (US 20160125882 A1) in view of Leppanen et al. (US 20150228274 A1).

Regarding claims 1, 8, and 15, Contolini teaches:
“receiving audio encounter information from a first microphone system, thus defining a first audio stream” (par. 0026; ‘During a medical procedure, operator 110 issues operator speech 112 that is received by microphone arrays 120,120′. Voice interpreting module 135 in voice controlled medical system 105 interprets operator speech 112 from array signals 125,125′ to identify commands 115.’);
“receiving audio encounter information from a second microphone system, thus defining a second audio stream” (par. 0026; ‘During a medical procedure, operator 110 issues operator speech 112 that is received by microphone arrays 120,120′. Voice interpreting module 135 in voice controlled medical system 105 interprets operator speech 112 from array signals 125,125′ to identify commands 115.’);
“detecting speech activity in one or more portions of the first audio stream, thus defining one or more speech portions of the first audio stream” (par. 0034; ‘Comparison module 160 determines that beamed signals 142,142′ likely contains desired commands 115 (i.e. is operator speech 112) if data from both arrays 120,120′ (i.e. beamed signals 142,142′) are sufficiently correlated, and the determined location of the sound source is within the command area 180.’);
“detecting speech activity in one or more portions of the second audio stream, thus defining one or more speech portions of the second audio stream, wherein the speech activity from the first audio stream and the second audio stream is detected based upon a threshold amount of correlation among the audio encounter information received from the first microphone system and the second microphone system;” (par. 0034; ‘In other words, comparison module 160 compares beamed signals 142,142′ (from first array 120 and second array 120′, respectively) and determines if they are sufficiently highly correlated to exceed a correlation threshold 165. If comparison module 160 determines that a desired command 115 is likely contained within beamed signals 142,142′ (i.e. that their correlation exceeds threshold 165), it will send beamed signals 142,142′ to voice interpreting module 135 which will perform speech recognition on the beamed signals 142,142′ and determine what, if any, commands 115 are contained therein.’).
Contolini does not explicitly teach:
“aligning the first audio stream and the second audio stream based upon, at least in part, the one or more speech portions of the first audio stream and the one or more speech portions of the second audio stream” and
“selecting individual portions of the one or more portions of the first audio stream to be combined with select individual portions of the one or more portions of the second audio stream based upon, at least in part, a weighted signal-to-noise ratio for each of the one or more portions of the first audio stream and each of the one or more portions of the second audio stream, wherein the selected individual portions of the one or more portions of the first audio stream is less than the first audio stream and wherein the select individual portions of the one or more portions of the second audio stream is less than the second audio stream”.
Leppanen teaches:
“aligning the first audio stream and the second audio stream based upon, at least in part, the one or more speech portions of the first audio stream and the one or more speech portions of the second audio stream” (par. 0030; ‘Multi-device speech recognition apparatus 200 may utilize the timestamps of audio samples 500, 502, and 504 to divide each of the samples into multiple frames, the frames corresponding to portions of time over which audio samples 500, 502, and 504 were captured.’; par. 0031; ‘The frames determined to be most suitable for speech recognition for their respective period of time may then be combined to form hybrid sample 506.’); and
“selecting individual portions of the one or more portions of the first audio stream to be combined with select individual portions of the one or more portions of the second audio stream based upon, at least in part, a weighted signal-to-noise ratio for each of the one or more portions of the first audio stream and each of the one or more portions of the second audio stream, wherein the selected individual portions of the one or more portions of the first audio stream is less than the first audio stream and wherein the select individual portions of the one or more portions of the second audio stream is less than the second audio stream” (par. 0031; ‘Multi-device speech recognition apparatus 200 may analyze each of the frames to identify a preferred frame for each portion of time based on their suitability for speech recognition (e.g., based on one or more of the frames' signal-to-noise ratios, amplitude levels, gain levels, or phoneme recognition levels).’ ‘The frames determined to be most suitable for speech recognition for their respective period of time may then be combined to form hybrid sample 506.’).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify Contolini’s voice-controlled medical system by incorporating Leppanen’s multi-device speech recognition methods (identifying frames most suitable for speech recognition) in order to increase the probability that a high-quality audio sample will be available for speech recognition by utilizing multiple devices in physical proximity to the user to capture multiple audio samples (Leppanen: par. 0017).
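
For illustration only, the per-frame selection described in the quoted passage of Leppanen (par. 0031) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from either reference or from the application: the function names `frame_snr_db` and `build_hybrid` are hypothetical, the streams are assumed to be already time-aligned lists of samples, and the noise power of each stream is assumed to be known in advance. The sketch uses an unweighted per-frame signal-to-noise ratio, as in the quoted passage.

```python
import math


def frame_snr_db(frame, noise_power):
    """Rough SNR estimate in dB: mean frame power over an assumed noise power."""
    frame_power = sum(x * x for x in frame) / len(frame)
    # Floor the power so a silent frame does not cause log10(0).
    return 10.0 * math.log10(max(frame_power, 1e-12) / noise_power)


def build_hybrid(streams, noise_powers, frame_len):
    """For each time slot, keep the frame from the stream with the highest SNR.

    Returns the concatenated hybrid sample and the index of the stream
    chosen for each slot (hypothetical helper, for illustration).
    """
    n_frames = min(len(s) for s in streams) // frame_len
    hybrid, chosen = [], []
    for i in range(n_frames):
        start, end = i * frame_len, (i + 1) * frame_len
        candidates = [s[start:end] for s in streams]
        snrs = [frame_snr_db(f, p) for f, p in zip(candidates, noise_powers)]
        best = max(range(len(streams)), key=lambda k: snrs[k])
        chosen.append(best)
        hybrid.extend(candidates[best])
    return hybrid, chosen


# Two aligned streams; each is stronger in a different time slot.
first = [0.9, 0.9, 0.1, 0.1]
second = [0.2, 0.2, 0.8, 0.8]
hybrid, chosen = build_hybrid([first, second], [0.01, 0.01], frame_len=2)
```

In this sketch only part of each stream reaches the hybrid sample, since each time slot is filled from a single stream.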

Regarding claims 2 (dep. on claim 1), 9 (dep. on claim 8), and 16 (dep. on claim 15), the combination of Contolini in view of Leppanen further teaches:
“wherein the first microphone system includes a microphone array” (Contolini: par. 0026; ‘During a medical procedure, operator 110 issues operator speech 112 that is received by microphone arrays 120,120′.’).

Regarding claims 3 (dep. on claim 1), 10 (dep. on claim 8), and 17 (dep. on claim 15), the combination of Contolini in view of Leppanen further teaches:
“wherein the second microphone system includes a mobile electronic device” (Leppanen: par. 0015; ‘For example, principal device 104 may be a smartphone, tablet computer, laptop computer, desktop computer, or other similar device capable of utilizing a text string produced via speech recognition.’).

Regarding claims 4 (dep. on claim 1), 11 (dep. on claim 8), and 18 (dep. on claim 15), the combination of Contolini in view of Leppanen further teaches:
“in response to aligning the first audio stream and the second audio stream, processing the first audio stream and the second audio stream with one or more speech processing systems” (Leppanen: par. 0031; ‘The frames determined to be most suitable for speech recognition for their respective period of time may then be combined to form hybrid sample 506.’).

Regarding claims 5 (dep. on claim 4), 12 (dep. on claim 11), and 19 (dep. on claim 18), the combination of Contolini in view of Leppanen further teaches:
“wherein processing the first audio stream and the second audio stream with one or more speech processing systems includes weighting the first audio stream and the second audio stream based upon, at least in part, the signal-to-noise ratio for the one or more portions of the first audio stream and the signal-to-noise ratio for the one or more portions of the second audio stream, thus defining a first audio stream weight and a second audio stream weight” (Leppanen: par. 0031; ‘Multi-device speech recognition apparatus 200 may analyze each of the frames to identify a preferred frame for each portion of time based on their suitability for speech recognition (e.g., based on one or more of the frames' signal-to-noise ratios, amplitude levels, gain levels, or phoneme recognition levels).’).

Regarding claims 6 (dep. on claim 5), 13 (dep. on claim 12), and 20 (dep. on claim 19), the combination of Contolini in view of Leppanen further teaches:
“wherein processing the first audio stream and the second audio stream with one or more speech processing systems includes processing the first audio stream and the second audio stream with a single speech processing system based upon, at least in part, the first audio stream weight and the second audio stream weight” (Leppanen: par. 0031; ‘Multi-device speech recognition apparatus 200 may then perform speech recognition on hybrid sample 506, generating output text string 508, which may be communicated to primary device 104.’).

Regarding claims 7 (dep. on claim 5) and 14 (dep. on claim 12), the combination of Contolini in view of Leppanen further teaches:
“wherein processing the first audio stream and the second audio stream with one or more speech processing systems includes: processing the first audio stream with a first speech processing system, thus defining a first speech processing output” (Leppanen: par. 0035; ‘For example, audio samples 400, 402, and 404 may respectively be received from secondary devices 106, 108, and 110. In step 706, the received audio samples may be compared to a reference sample of user 102's voice to identify samples or portions of samples that contain voices other than user 102's voice, and the extraneous samples (or extraneous portions of the samples) may be discarded.’);
“processing the second audio stream with a second speech processing system, thus defining a second speech processing output” (Leppanen: par. 0035; ‘For example, audio samples 400, 402, and 404 may respectively be received from secondary devices 106, 108, and 110. In step 706, the received audio samples may be compared to a reference sample of user 102's voice to identify samples or portions of samples that contain voices other than user 102's voice, and the extraneous samples (or extraneous portions of the samples) may be discarded.’); and
“combining the first speech processing output with the second speech processing output based upon, at least in part, the first audio stream weight and the second audio stream weight” (Leppanen: par. 0031; ‘The frames determined to be most suitable for speech recognition for their respective period of time may then be combined to form hybrid sample 506.’).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARK VILLENA whose telephone number is (571) 270-3191. The examiner can normally be reached 10 am - 6 pm EST Monday through Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

MARK VILLENA
Examiner
Art Unit 2658



/MARK VILLENA/Examiner, Art Unit 2658