DETAILED ACTION
This communication is in response to the Application filed on 03/13/2020. Claims 1-20 are pending and have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 03/13/2020, 05/29/2020, and 08/11/2020 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Claim Interpretation
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “a whispered speech acoustic feature acquiring unit”, “a preliminary recognition result acquiring unit”, “a whispered speech converting unit”, “a final recognition result determining unit”, “a first preliminary recognition result acquiring subunit”, “a lip shape image data acquiring unit”, “a second preliminary recognition acquiring subunit”, “a third preliminary recognition result acquiring subunit”, “a lip detecting unit”, “an image processing unit”, “a frame processing unit”, “a pre-emphasis processing unit”, “a spectrum feature extracting unit”, “a recursive processing unit”, “a first codec processing subunit”, “a second codec processing subunit”, “a third codec processing subunit”, “a fourth codec processing subunit”, “a normal speech recognition unit”, “a first result determining unit”, “a normal speech recognition unit”, “an iteration determining unit”, “a second result determining unit”, and “a third result determining unit” in claims 11-20.

If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.



Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 7, 9, 11-12, 17, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Whitmire (US 10,665,243) in view of Zhang (“Whisper Island Detection Based on Unsupervised Segmentation with Entropy Based Speech Feature Processing”) in view of Rakshit (US 2019/0189145).
As to claims 1 and 11, Whitmire teaches a method for converting a whispered speech, comprising:
acquiring a whispered speech acoustic feature of whispered speech data (see col. 4, lines 19-25, where NAM microphone 130 is used to measure vibrations during the user's mouthed and sub-vocalized speech), and acquiring a preliminary recognition result of the whispered speech data (see col. 3, lines 49-51, where an air microphone is used to detect the user's whisper and other types of subvocalized speech); and
inputting the whispered speech acoustic feature and the preliminary recognition result into a preset whispered speech converting model (see Figure 3, where the outputs of air mic 115 and NAM mic 130 are input to controller 310 and then into machine learning module 330) to acquire a normal speech acoustic feature outputted by the whispered speech converting model (see col. 6, lines 54-56, where a sequence of the data 320 is classified as a phoneme), wherein
the whispered speech converting model is trained in advance (see col. 6, lines 59-62, where the machine learning module comprises acoustic and language models trained for sub-audible and subvocalized).
However, Whitmire does not specifically teach by using recognition results of whispered speech training data and whispered speech training acoustic 
Zhang does teach that the whispered speech converting model is trained in advance (see page 886, right column, last paragraph, where a GMM is trained) using whispered speech training acoustic features of the whispered speech training data as samples (see page 886, right column, last paragraph, where the GMM is trained with whisper speech data) and using normal speech acoustic features of normal speech data parallel to the whispered speech training data as sample labels (see page 886, right column, last paragraph, where neutral speech data is used as part of training) (see page 891, VI, sect. B, 1st para., where each speech audio is manually labelled).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the whisper recognition as taught by Whitmire with the use of parallel data during training as taught by Zhang for the purpose of enabling correct detection of neutral and whisper change points (see Zhang, page 893, VIII, 2nd full paragraph).
However, Whitmire in view of Zhang does not specifically teach training in advance by using recognition results of whispered speech training data.
Rakshit does teach training in advance by using recognition results of whispered speech training data (see [0029] and Figure 4, where a user whispers based on displayed text in order to create training data, i.e., a recognition result).

As to claim 11, apparatus claim 11 and method claim 1 are related as an apparatus and the method of using the same, with each claimed element's function corresponding to the claimed method step. Accordingly, claim 11 is similarly rejected under the same rationale as applied above with respect to method claim 1. Furthermore, Whitmire teaches the units by way of a computing device (see col. 8, lines 14-17, computing device and program).

As to claims 2 and 12, Whitmire in view of Zhang in view of Rakshit teaches all of the limitations as in claims 1 and 11.
Furthermore, Whitmire does teach further comprising: determining a final recognition result of the whispered speech data based on the normal speech acoustic feature (see col. 7, lines 1-4, where a command is determined based on whispered speech data (see col. Lines 54-67)).
Furthermore, Zhang teaches the obtaining of normal speech acoustic features (see page 886, right column, last paragraph, where neutral speech data is used as part of training) (see page 891, VI, sect. B, 1st para., where each speech audio is manually labelled).

As to claims 7 and 17, Whitmire in view of Zhang in view of Rakshit teaches all of the limitations as in claims 1 and 11.
Furthermore, Whitmire does teach wherein the inputting the whispered speech acoustic feature and the preliminary recognition result into the preset whispered speech converting model to acquire the normal speech acoustic feature outputted by the whispered speech converting model comprises: inputting the whispered speech acoustic feature and the preliminary recognition result into a whispered speech converting model having a recurrent neural network type, to acquire the normal speech acoustic feature outputted by the whispered speech converting model (see Figure 3, where the outputs of air mic 115 and NAM mic 130 are input to controller 310 and then into machine learning module 330, and see col. 6, lines 63-65, where the machine learning model is an RNN).

As to claims 9 and 19, Whitmire in view of Zhang in view of Rakshit teaches all of the limitations as in claims 2 and 12.
Furthermore, Whitmire does teach wherein the determining the final recognition result of the whispered speech data based on the normal speech acoustic feature 
determining the normal speech recognition result as the final recognition result of the whispered speech data (see col. 7, lines 5-7, where sequence mapped to a command).

Claims 6 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Whitmire in view of Zhang in view of Rakshit as applied to claims 1 and 11 above, and further in view of Braida (US 6,317,716).
As to claims 6 and 16, Whitmire in view of Zhang in view of Rakshit teaches all of the limitations as in claims 1 and 11.
However, Whitmire in view of Zhang in view of Rakshit does not specifically teach wherein the acquiring the whispered speech acoustic feature of the whispered speech data comprises: segmenting the whispered speech data into frames to acquire a plurality of frames of whispered speech data; performing a pre-emphasis process on each frame of whispered speech data to acquire a frame of pre-emphasis processed whispered speech data; and extracting a spectrum feature of each frame of pre-emphasis processed whispered speech data, wherein the spectrum feature comprises one or more of a Log Filter Bank Energy feature, a Mel Frequency Cepstrum Coefficient feature, or a Perceptual Linear Predictive feature.
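For illustration only, the limitation recited above (framing, per-frame pre-emphasis, and extraction of a log filter bank energy feature) can be sketched as follows. This is a minimal sketch and is not taken from the application or from any cited reference: the function names, the 25 ms / 10 ms framing at an assumed 16 kHz sampling rate, the pre-emphasis coefficient of 0.97, and the 26-filter mel filter bank are all assumptions chosen for the example.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames; a trailing partial frame is dropped."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def pre_emphasize(frames, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1] within each frame; the first sample is unchanged."""
    out = frames.astype(float).copy()
    out[:, 1:] -= alpha * frames[:, :-1]
    return out

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Build triangular filters spaced evenly on the mel scale (assumed layout)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sample_rate).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):   # rising edge of the triangle
            fb[i, k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling edge of the triangle
            fb[i, k] = (hi - k) / (hi - mid)
    return fb

def log_filterbank_energy(frames, n_fft=512, n_filters=26, sample_rate=16000):
    """Power spectrum per frame, pooled by the mel filters, then log-compressed."""
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return np.log(energies + 1e-10)  # small offset avoids log(0)

# Synthetic one-second signal standing in for whispered speech data (assumption).
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
frames = frame_signal(signal, frame_len=400, hop=160)    # 25 ms frames, 10 ms hop
features = log_filterbank_energy(pre_emphasize(frames))  # one feature row per frame
```

Each row of `features` corresponds to one frame of pre-emphasis processed data. A Mel Frequency Cepstrum Coefficient feature would typically add a discrete cosine transform over these log energies; that step is omitted here.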

Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the whisper speech detection as taught by Whitmire in view of Zhang in view of Rakshit with the features as taught by Braida in order to provide a more suitable representation for recognition (see Braida, col. 7, lines 8-9).

Claims 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Whitmire in view of Zhang in view of Rakshit as applied to claims 2 and 12 above, and further in view of White (US 6,594,632).
As to claims 10 and 20, Whitmire in view of Zhang in view of Rakshit teaches all of the limitations as in claims 2 and 12.

inputting the normal speech acoustic feature into a preset normal speech recognition model to acquire a normal speech recognition result outputted by the normal speech recognition model (see col. 7, lines 1-2, where phoneme sequence inputted to language model);
determining the normal speech recognition result as the final recognition result of the whispered speech data…(see col. 7, lines 1-4, where command determined)...
process of inputting the whispered speech acoustic feature and the preliminary recognition result into the preset whispered speech converting model (see Figure 3, where the outputs of air mic 115 and NAM mic 130 are input to controller 310 and then into machine learning module 330).
However, Whitmire in view of Zhang in view of Rakshit does not specifically teach determining the normal speech recognition result as the final recognition result of the whispered speech data, in a case that the preset iteration termination condition is met; and determining the normal speech recognition result as the preliminary recognition result and returning to perform the process of inputting the whispered speech acoustic feature and the preliminary recognition result into the preset whispered speech converting model, in a case that the preset iteration termination condition is not met.
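For illustration only, the iterative limitation recited above can be sketched as a loop in which the recognition result is fed back as the preliminary recognition result until a termination condition is met. This is a minimal sketch and is not taken from the application or from any cited reference: the function `iterative_recognition`, the `convert` and `recognize` callables, and the termination condition used here (the result stops changing, or an assumed maximum number of passes is reached) are all hypothetical.

```python
def iterative_recognition(whisper_feature, preliminary_result, convert, recognize, max_iters=3):
    """Repeat conversion and recognition, feeding each result back as the preliminary result."""
    result = preliminary_result
    for _ in range(max_iters):
        # Converting model: whispered feature + current preliminary result -> normal speech feature.
        normal_feature = convert(whisper_feature, result)
        # Normal speech recognition model: normal speech feature -> recognition result.
        new_result = recognize(normal_feature)
        if new_result == result:
            # Assumed termination condition: the result no longer changes.
            return new_result
        # Condition not met: the new result becomes the preliminary result for the next pass.
        result = new_result
    # Assumed fallback termination: the iteration budget is exhausted.
    return result

# Toy stand-ins for the two models (hypothetical, for demonstration only).
corrections = {"hxllo": "hallo", "hallo": "hello", "hello": "hello"}
convert = lambda feature, prelim: prelim          # passes the preliminary text through
recognize = lambda normal_feature: corrections[normal_feature]

final = iterative_recognition(None, "hxllo", convert, recognize)
```

With these toy stand-ins, the result is refined on each pass and the loop exits once a pass leaves the result unchanged.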
White does teach determining the normal speech recognition result as the final recognition result of the whispered speech data (see col. 5, lines 35-37, where the 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the whisper speech recognition as taught by Whitmire in view of Zhang in view of Rakshit with the termination as taught by White in order to allow the start and stop of a voice recognition system without specific spoken commands (see White, col. 3, lines 5-7).

Allowable Subject Matter
Claims 3-5, 8, 13-15, and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. 
The following is a statement of reasons for the indication of allowable subject matter:  None of the cited prior art of record teaches or makes obvious the combination .

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Chen (US 2016/0360372) is cited to teach detection of whispered speech (see abstract). Hong (US 2016/0019886) is cited to teach recognition of a whisper action (see abstract).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PARAS D SHAH whose telephone number is (571)270-1650.  The examiner can normally be reached on Monday-Thursday 7:30AM-2:30PM, 5PM-7PM (EST), Friday 8AM-noon (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir can be reached on 571-272-7799.  The fax phone 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/Paras D Shah/
Primary Examiner, Art Unit 2659
12/05/2021