DETAILED ACTION
1.	This communication is in response to the Amendments and Arguments filed on 2/26/2021. In particular, dependent claim 7 is amended. Claims 1-15 are pending and have been examined.  
Response to Amendments and Arguments
2.	With respect to the claim rejections under 35 USC 103, the applicant argues in particular that the references do not teach multiple limitations recited in original Claim 1.
In response, the examiner respectfully disagrees. Because Claim 1 has not been amended, the reasons for rejecting each limitation are further elaborated, and the key points are highlighted in underlined bold text, as detailed in Section 3 below.
As for the amended dependent Claim 7, the amendments and the reasons for rejection are detailed below (also see more details stated in Claim 1):
As per claim 7 (dependent on claim 1), EIDE in view of KATURI further discloses “performing testing periodically, including daily, weekly, or monthly, wherein the testing comprises the submitting of hundreds of speech samples to multiple speech recognition engines, and the analyzing of how transcription error rates of the speech recognition engines vary with sound and speech characteristics of the speech samples (EIDE, [0038], A speech recognizer selection test is performed <where when to do testing is a design choice>; [0003], a speech recognition engine .. generates a textual transcript using a combination of acoustic and language model scores to determine the best word or phrase for each portion of the input audio stream <where ‘input audio stream’ reads on submitting any number of speech samples and ‘scores’ read on analyzing transcription error rates>; [0010], the characteristic-specific speech recognition system .. utilizes a plurality of prioritized speech recognition methods <read on multiple speech recognition engines> that exhibit varying degrees of improved speech recognition for a certain characteristic of the input speech).”
Claim Rejections - 35 USC § 103
3.	Claims 1-3, 6-15 are rejected under 35 U.S.C. 103 as being unpatentable over Eide, et al. (US 20030115053; hereinafter EIDE) in view of Katuri, et al. (US 20150194152; hereinafter KATURI).
As per claim 1, EIDE (Title: Methods and apparatus for improving automatic digitization techniques using recognition metrics) discloses “A computer-implemented method of [ cloud-based speech recognition ] from audio channels without prior training to adapt to speaker(s) in the audio channels (EIDE, [0003], a speech recognition engine .. generates a textual transcript using a combination of acoustic and language model scores to determine the best word or phrase for each portion of the input audio stream <read on audio channel>; Abstract, digitization system recognizes the input information using a general recognizer that performs well for typical input information <read on ‘without prior training to adapt to a particular speaker’>; [0020], computer), the method including: 
submitting hundreds of speech samples to multiple speech recognition engines, and analyzing how transcription error rates of the speech recognition engines vary with sound and speech characteristics of the speech samples (EIDE, [0003], a speech recognition engine .. generates a textual transcript using a combination of acoustic and language model scores to determine the best word or phrase for each portion of the input audio stream <where ‘input audio stream’ reads on submitting any number of speech samples and ‘scores’ read on analyzing transcription error rates>; [0010], the characteristic-specific speech recognition utilizes a plurality of prioritized speech recognition methods <read on ‘multiple speech recognition engines’ where it is apparent that different speech recognition methods mean different speech recognition engines, unless the applicant defines ‘engine’ very specifically> that exhibit varying degrees of improved speech recognition for a certain characteristic of the input speech); 
receiving an audio channel and qualifying the speech recognition engines as capable of transcribing the audio channel and/or its parts, taking into account at least a recording codec of the audio channel, available transcoding from the recording codec to a speech recognition engine supported codec, a length of the audio channel, and a language of the audio channel (EIDE, [0010], utilizes a plurality of prioritized speech recognition methods that exhibit varying degrees of improved speech recognition for a certain characteristic of the input speech <read on any characteristic of the audio channel including codec, language, and so on>; [0020], The input information may be, for example, speech, handwriting, printed text, pictures or text in some language); 
applying an audio channel analyzer to the audio channel to characterize audio fidelity, background noise, concurrent speech by multiple speakers, timbre, pitch, and audio distortion of the audio channel (EIDE, [0009], analyzes the speech being processed and classifies each speech phrase according to whether or not the speech phrase exhibits the physical parameter .. Once classified, the speech may be recognized using a general or characteristic-specific speech recognizer, as appropriate; [0010], for a certain characteristic of the input speech <read on any characteristic of the audio channel and the audio/speech signal itself>; [0005], the accuracy of ; 
selecting a speech recognition engine that is qualified as capable of transcribing the audio channel and/or its parts or a transcoded version of the audio channel and/or its parts, taking into account the audio fidelity, background noise, concurrent speech by multiple speakers, timbre, pitch, and audio distortion of the audio channel and how transcription error rates of the speech recognition engines vary, based on the analyzing of the hundreds of speech samples; and submitting the audio channel and/or its parts to the selected speech recognition engine (EIDE, [0010], The input information is recognized in parallel using each of the prioritized speech recognition methods and the best performing speech recognizer is selected for each phrase).”  
EIDE does not explicitly disclose “cloud-based speech recognition ..” However, the feature is taught by KATURI (Title: Far-field speech recognition systems and methods).  
In the same field of endeavor, KATURI teaches: [0008] “The number of voice recognition devices can send the voice commands to a computing hub (e.g., computing device, computer, cloud computing network, etc.)” and [0030] “The far-field speech recognition system 210 can include a network 222 (e.g., cloud computing network, LAN, WAN, Internet, etc.)”
Therefore, it would have been obvious to one of ordinary skill in the art at the time before the effective filing date of the claimed invention to incorporate the teachings of KATURI in a speech recognition system (as taught by EIDE) to provide cloud-based speech recognition.
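For illustration only, the qualifying and selecting steps recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's or EIDE's implementation: the `Engine` record, the `qualify` and `select` helpers, the noise buckets, and all error-rate figures are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Engine:
    """Hypothetical record for one speech recognition engine."""
    name: str
    codecs: set
    languages: set
    # Error rate measured on test samples, keyed by a noise bucket (made-up data).
    wer_by_noise: dict = field(default_factory=dict)


def qualify(engines, codec, language, transcodable=()):
    """Keep engines that accept the recording codec (directly or after
    transcoding) and support the language of the audio channel."""
    usable = {codec, *transcodable}
    return [e for e in engines if e.codecs & usable and language in e.languages]


def select(engines, noise_bucket):
    """Pick the engine with the lowest measured error rate for this bucket."""
    candidates = [e for e in engines if noise_bucket in e.wer_by_noise]
    if not candidates:
        return None
    return min(candidates, key=lambda e: e.wer_by_noise[noise_bucket])


engines = [
    Engine("engine_a", {"wav"}, {"en"}, {"quiet": 0.08, "noisy": 0.30}),
    Engine("engine_b", {"wav", "mp3"}, {"en", "es"}, {"quiet": 0.10, "noisy": 0.18}),
    Engine("engine_c", {"flac"}, {"en"}, {"quiet": 0.05, "noisy": 0.12}),
]

# An mp3 recording in English that can also be transcoded to wav.
qualified = qualify(engines, codec="mp3", language="en", transcodable=("wav",))
chosen = select(qualified, "noisy")
print([e.name for e in qualified], chosen.name)
```

In this sketch, engine_c is excluded at the qualifying step despite its lower error rates, because no supported codec is available for it.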
As per claim 2 (dependent on claim 1), EIDE in view of KATURI further discloses “using the multiple speech recognition engines on the audio channel, including using the speech recognition engines sequentially when a first speech recognition engine reports a low confidence score on some or all of its transcription (EIDE, [0010], utilizes a plurality of prioritized speech recognition methods that exhibit varying degrees of improved speech recognition for a certain characteristic of the input speech .. the best performing speech recognizer is selected for each phrase <read on a ready mechanism to realize the recited limitations, where running a plurality of speech recognizers sequentially or in another manner is a system design choice>).”  
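For illustration only, sequential use of engines upon a low confidence score, as recited in claim 2, can be sketched as follows. The function name, the 0.8 threshold, and the stand-in engines are assumptions made for this sketch; neither the claims nor EIDE specify them.

```python
def transcribe_with_fallback(audio, engines, min_confidence=0.8):
    """Try engines in priority order and stop at the first confident result.
    If none is confident, return the highest-confidence result seen."""
    best = None
    for engine in engines:
        text, confidence = engine(audio)
        if best is None or confidence > best[1]:
            best = (text, confidence)
        if confidence >= min_confidence:
            break
    return best


# Stand-in engines: each returns (transcript, confidence) with made-up values.
def first_engine(audio):
    return ("helo world", 0.55)


def second_engine(audio):
    return ("hello world", 0.91)


result = transcribe_with_fallback(b"audio-bytes", [first_engine, second_engine])
print(result)
```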
As per claim 3 (dependent on claim 1), EIDE in view of KATURI further discloses “using the multiple speech recognition engines on all or separate parts of the audio channel, including using the speech recognition engines when voting on transcription results is used, when different speakers on different tracks of the audio channel, and when different speakers take turns during segments of the audio channel (EIDE, [0010], utilizes a plurality of prioritized speech recognition methods that exhibit varying degrees of improved speech recognition for a certain characteristic of the input speech .. the best performing speech recognizer is selected for each phrase <read on using multiple speech recognizers no matter how and by whom the input speech is produced, and where ‘the best performing speech recognizer is selected’ reads on ‘voting’. Note that ‘voting’ is not clearly described in the Specification>).”  
As per claim 6 (dependent on claim 1), EIDE in view of KATURI further discloses “applying a silence analyzer to the speech samples and the audio channel prior to submission to parse out silent parts of speech (KATURI, [0020], adaptive spatial signal processing can be utilized to focus the sound reception of the plurality of sound recognition devices and remove .”
As per claim 7 (dependent on claim 1), EIDE in view of KATURI further discloses “performing testing periodically, including daily, weekly, or monthly, wherein the testing comprises the submitting of hundreds of speech samples to multiple speech recognition engines, and the analyzing of how transcription error rates of the speech recognition engines vary with sound and speech characteristics of the speech samples (EIDE, [0038], A speech recognizer selection test is performed <where when to do testing is a design choice>; [0003], a speech recognition engine .. generates a textual transcript using a combination of acoustic and language model scores to determine the best word or phrase for each portion of the input audio stream <where ‘input audio stream’ reads on submitting any number of speech samples and ‘scores’ read on analyzing transcription error rates>; [0010], the characteristic-specific speech recognition system .. utilizes a plurality of prioritized speech recognition methods <read on multiple speech recognition engines> that exhibit varying degrees of improved speech recognition for a certain characteristic of the input speech).”
Claim 8 (similar in scope to claim 1, except changing “submitting hundreds of speech samples” to “submitting thousands of speech samples”) is rejected under the same rationale as applied above for claim 1. Claim 8 recites “based on the analyzing of the hundreds of speech samples ..” which must be corrected.
Claim 9 (similar in scope to claim 1, except changing “submitting hundreds of speech samples” to “submitting dozens of speech samples”) is rejected under the same rationale as applied above for claim 1. Claim 9 recites “based on the analyzing of the hundreds of speech samples ..” which must be corrected.
Claim 10 (similar in scope to claim 1) is rejected under the same rationale as applied above for claim 1. Note that EIDE further teaches: [0020] “The characteristic-specific digitization system 100 converts input information into words or characters in a computer-readable format” which reads on processor and memory. 
Claim 11 (similar in scope to claim 8) is rejected under the same rationale as applied above for claim 8. Note that EIDE further teaches: [0020] “The characteristic-specific digitization system 100 converts input information into words or characters in a computer-readable format” which reads on processor and memory. 
Claim 12 (similar in scope to claim 9) is rejected under the same rationale as applied above for claim 9. Note that EIDE further teaches: [0020] “The characteristic-specific digitization system 100 converts input information into words or characters in a computer-readable format” which reads on processor and memory. 
Claim 13 (similar in scope to claim 1) is rejected under the same rationale as applied above for claim 1. Note that EIDE further teaches: [0020] “The characteristic-specific digitization system 100 converts input information into words or characters in a computer-readable format” which reads on processor and memory. 
Claim 14 (similar in scope to claim 8) is rejected under the same rationale as applied above for claim 8. Note that EIDE further teaches: [0020] “The characteristic-specific digitization system 100 converts input information into words or characters in a computer-readable format” which reads on processor and memory. 
Claim 15 (similar in scope to claim 9) is rejected under the same rationale as applied above for claim 9. Note that EIDE further teaches: [0020] “The characteristic-specific digitization system 100 converts input information into words or characters in a computer-readable format” which reads on processor and memory. 
Note that Claims 10-15 further recite “computer instructions to securely authenticate a recording file from initial collection through post-production and distribution ..” However, no clear description of this limitation is found in the Specification.
4.	Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over EIDE in view of KATURI, and further in view of Aronowitz (US 20110119060; hereinafter ARONOWITZ).
As per claim 4 (dependent on claim 1), EIDE in view of KATURI further discloses “applying the method to [ separation/identification (diarization) engines ].”  
EIDE in view of KATURI does not explicitly disclose “separation/identification (diarization) engines ..” However, the feature is taught by ARONOWITZ (Title: Method and system for speaker diarization).
In the same field of endeavor, ARONOWITZ teaches: [0019] “Speaker diarization methods can be summarized in the following steps: Process an input signal by computing an acoustic feature vector based on spectral characteristics every frame or time period (for example, every 10 msec); Segment the input signal by finding time frames of speaker changes using statistical measures of distances between distributions of feature vectors; Cluster resulting segments using the feature vectors.”
Therefore, it would have been obvious to one of ordinary skill in the art at the time before the effective filing date of the claimed invention to incorporate the teachings of ARONOWITZ in a speech recognition system (as taught by EIDE and KATURI) to provide speaker diarization.
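For illustration only, the three diarization steps summarized in ARONOWITZ [0019] (per-frame features, segmentation at speaker changes, clustering of segments) can be sketched as follows. This is a toy sketch, not ARONOWITZ's method: mean absolute amplitude stands in for spectral feature vectors, a difference of window means stands in for a statistical distance between distributions, and all function names and parameter values are hypothetical.

```python
from statistics import mean


def frame_features(signal, frame_len):
    """Step 1: one feature per frame (toy stand-in for spectral features)."""
    return [mean(abs(x) for x in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def change_points(features, window, threshold):
    """Step 2: flag a speaker change where adjacent windows differ strongly."""
    points = []
    for i in range(window, len(features) - window + 1):
        left = mean(features[i - window:i])
        right = mean(features[i:i + window])
        # Keep at most one change point per window-sized neighborhood.
        if abs(left - right) > threshold and (not points or i - points[-1] >= window):
            points.append(i)
    return points


def cluster_segments(features, points, tolerance):
    """Step 3: greedily group segments whose mean feature is close."""
    bounds = [0, *points, len(features)]
    centroids, labels = [], []
    for start, end in zip(bounds, bounds[1:]):
        seg_mean = mean(features[start:end])
        for label, centroid in enumerate(centroids):
            if abs(seg_mean - centroid) <= tolerance:
                labels.append((start, end, label))
                break
        else:
            # No existing cluster is close enough: start a new one.
            centroids.append(seg_mean)
            labels.append((start, end, len(centroids) - 1))
    return labels


# Synthetic signal: quiet speaker, loud speaker, quiet speaker again.
signal = [1, -1] * 4 + [5, -5] * 4 + [1, -1] * 4
features = frame_features(signal, frame_len=2)
points = change_points(features, window=2, threshold=3)
segments = cluster_segments(features, points, tolerance=1)
print(points, segments)
```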
5.	Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over EIDE in view of KATURI, and further in view of Tang, et al. (US 20020069055; hereinafter TANG). 
As per claim 5 (dependent on claim 1), EIDE in view of KATURI further discloses “applying the method to [ auto-punctuation engines ].”  
EIDE in view of KATURI does not explicitly disclose “auto-punctuation engines ..” However, the feature is taught by TANG (Title: Apparatus and method for automatically generating punctuation marks in continuous speech recognition).  
In the same field of endeavor, TANG teaches: [Abstract] “An apparatus for automatically generating punctuation marks in a continuous speech recognition system, comprises means (1, 2, 3, 5) for recognizing user speech and converting the user speech into words, characterized in that means (1, 2, 3, 5) for recognizing user speech is further used to recognize pseudo noises in the user speech; and the apparatus characterized by further comprising: means (9) for marking pseudo noises in output results of means (1, 2, 3, 5) for recognizing user speech; means (10, 14, 13) for generating punctuation marks by finding most likely pseudo punctuation marks at locations of pseudo noises marked by the means (9) for marking pseudo noises based on a language model containing pseudo punctuation marks.”
Therefore, it would have been obvious to one of ordinary skill in the art at the time before the effective filing date of the claimed invention to incorporate the teachings of TANG in a speech recognition system (as taught by EIDE and KATURI) for application to auto-punctuation engines.
Conclusion 
6.	THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).   
	A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 		
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FENG-TZER TZENG whose telephone number is (571)272-4609. The examiner can normally be reached on M-F (8:30-5:00). The fax phone number where this application or proceeding is assigned is 571-273-4609.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir (SPE) can be reached on 571-272-7799. 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 

/FENG-TZER TZENG/
Primary Examiner, Art Unit 2659
3/2/2021