DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 08/15/2022 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
	
Response to Amendments and Arguments
Regarding the rejections under 35 U.S.C. §102 and §103, applicant substantially amended all claims by adding new limitations. Applicant also removed a few previously presented limitations. The inventions defined by the amended claims were shifted to inventions distinct from those of the previously presented claims. The previously claimed inventions were based on features shown in Fig. 3 (and the relevant sections of the specification). The amended claims are based on beamforming functions using a microphone array, which are mainly shown in Fig. 4 (Spec. [0042-0048]).
 Although it is improper to shift a claimed invention to a distinct invention after receiving an office action (See MPEP 821.03), for the purpose of compact prosecution, the examiner treats the amendment as bona fide and examines the amended claims. The examiner has performed an extensive search and discovered several relevant references. The examiner rejects the amended claims using the newly discovered references. Applicant’s arguments (Remarks, pages 14-15) are moot because the arguments do not apply to the new references being used in the current rejection.  
  
Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 

In an amendment filed on 09/15/2022, applicant deleted many limitations and added different limitations. Since the claimed invention was shifted to a distinct invention, the abstract presented on 02/17/2021 does not correspond to the instant claims. A new abstract is required. See MPEP § 608.01(b) for guidelines for the preparation of patent abstracts.

Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), first paragraph:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1, 3-5, 8-10, 12-14, 17, and 19-22 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for pre-AIA the inventor(s), at the time the application was filed, had possession of the claimed invention.

Applicant amended independent claims by adding new limitations:

“wherein the enhanced speech information includes first enhanced speech information in a first enhancement direction and second enhanced speech information in a second enhancement direction, the first enhanced speech information includes a first portion (k11) of the first-path audio data and a first portion (k21) of the second-path audio data, and the second enhanced speech information includes a second portion (k12) of the first-path audio data and a second portion (k22) of the second-path audio data, wherein the k11 is different than k12 or k21 is different than k22;”

The newly added limitations were based on the specification ([0048], duplicated below). However, the specification does NOT provide adequate support for the newly added limitations. The specification only shows combining audio signals from multiple microphones of a microphone array to form a beam toward a selected direction D1, D2 or D3. The disclosure does NOT describe a portion of audio data and therefore does not support the claimed “a first portion of the first-path audio data” or “a second portion of the first-path audio data”.



[Image media_image1.png: reproduction of specification paragraph [0048]]

If applicant believes the original disclosure supports the newly added limitations, the examiner suggests that applicant point out the exact sections of the disclosure that provide adequate support for the new limitations. In the following rejection over prior art references, the examiner interprets the limitations according to the features described in the specification.

	Claim Rejections - 35 USC § 103
 Claims 1, 3, 4, 8-10, 12, 13, 17 and 19-22 are rejected under 35 U.S.C. 103 as being unpatentable over Pogue et al. (US PG Pub. 2015/0006176, referred to as Pogue) in view of Benesty et al. (“On Microphone-Array Beamforming From a MIMO Acoustic Signal Processing Perspective”, IEEE, 2007, referred to as Benesty).

Examiner Notes
Applicant substantially amended the independent claims by adding new limitations related to a beamforming function using a microphone array, based on the illustration in Fig. 4. The added limitations (those designated k11, k12, k21 and k22) are related to a typical beamforming method using a delay-and-sum procedure.

Pogue discloses using a voice-controlled device (Fig. 1). The device has a built-in microphone array (Fig. 2, #204, Fig. 3, #108). The system detects a user’s wake expression (Fig. 3, #136). The system also performs a beamforming function to reduce echo from a loudspeaker (Fig. 3, #126, #110, [0018-0019]). Pogue discloses all features defined in the independent claims, although some features are implicitly disclosed.

Benesty discloses details of beamforming functions of a microphone array. Benesty discloses using a microphone array to beam to a desired sound source direction (Section II, Fig. 4).

Regarding claims 1, 10 and 17, Pogue discloses a method, an apparatus and a storage medium, performed by an audio data processing device in communication with a terminal, the terminal including a first microphone and a second microphone (Fig. 1, #108, [0015], detecting a user’s wake expression when using a voice-controlled device; the device has a microphone array, which has many microphones as shown in Fig. 2, #204), comprising:

obtaining multi-path audio data by collecting via the first microphone first-path audio data and collecting via the second microphone second-path audio data;
generating enhanced speech information corresponding to the multi-path audio data ([0015], [0018-0019], [0030-0032], Fig. 2, #204; Examiner note: each microphone in a microphone array is located at a different position, therefore the recorded audio is “multi-path”);
according to the first and the second enhanced speech information determining one of the first or the second enhancement direction as a target audio direction (Fig. 1, #122, Fig. 3, #110, [0018], [0035], [0048], [0073], [0082-0083], cancelling echo sound from a loudspeaker, beamforming toward the direction of the user’s voice);
determining a probability of existence of a target matching word in the target audio direction ([0032], [0036], [0038], detecting probabilities of user’s wake expression voice command).

Pogue discloses detecting a user’s voice command having a wake expression / trigger expression ([0008]) by using a microphone array with beamforming functions (Fig. 3). Since beamforming with a microphone array is a mature technology, Pogue does not describe the beamforming details. Pogue does not explicitly disclose the added limitations: “wherein the enhanced speech information includes first enhanced speech information in a first enhancement direction and second enhanced speech information in a second enhancement direction, the first enhanced speech information includes a first portion (k11) of the first-path audio data and a first portion (k21) of the second-path audio data, and the second enhanced speech information includes a second portion (k12) of the first-path audio data and a second portion (k22) of the second-path audio data, wherein the k11 is different than k12 or k21 is different than k22;”

The above limitations are related to a typical procedure of delay-and-sum beamforming operations. Benesty gives details of delay-and-sum operations when performing a beamforming function using a microphone array (Section I, Fig. 1, a delay-and-sum beamformer pointing to a desired direction of an audio signal; Section II, Fig. 4; please note that the coefficients hij of matrix H correspond to k11, k12, k21 and k22).
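
For illustration only, the delay-and-sum procedure referred to above can be sketched as follows. This is a generic textbook form with whole-sample delays and equal averaging; it is not taken from Benesty, Pogue or the specification. The function names, the toy signals and the steering delays are all hypothetical.

```python
from typing import Dict, List, Sequence


def delay(signal: Sequence[float], samples: int) -> List[float]:
    """Delay a signal by a non-negative whole number of samples (zero-padded)."""
    if samples < 0:
        raise ValueError("delay must be non-negative")
    return ([0.0] * samples + list(signal))[: len(signal)]


def delay_and_sum(channels: Sequence[Sequence[float]],
                  delays: Sequence[int]) -> List[float]:
    """Align each microphone channel by its steering delay, then average."""
    if len(channels) != len(delays):
        raise ValueError("one delay is required per channel")
    aligned = [delay(ch, d) for ch, d in zip(channels, delays)]
    return [sum(frame) / len(aligned) for frame in zip(*aligned)]


# Toy data (assumed): a pulse reaches microphone 1 one sample before microphone 2.
mic1 = [0.0, 1.0, 0.0, 0.0, 0.0]
mic2 = [0.0, 0.0, 1.0, 0.0, 0.0]

# Hypothetical steering delays, one set per enhancement direction.
steering: Dict[str, List[int]] = {
    "direction_1": [1, 0],  # delays mic1 so the two pulses line up
    "direction_2": [0, 1],  # delays mic2, pushing the pulses further apart
}

enhanced = {name: delay_and_sum([mic1, mic2], d) for name, d in steering.items()}
energy = {name: sum(x * x for x in out) for name, out in enhanced.items()}
```

In this toy case the output steered toward direction_1 adds the two pulses coherently and carries more energy than the output steered toward direction_2, which is the sense in which a beamformer "points" at a source.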

It would have been obvious to a person having ordinary skill in the art at the time the invention was filed to combine Pogue’s teaching with Benesty’s teaching to obtain more details about delay-and-sum operations when performing beamforming using a microphone array. One having ordinary skill in the art would have been motivated to make such a modification to reject noise (Benesty, Introduction). In addition, all the claimed elements were known in the prior art and one skilled in the art could have combined the elements as claimed by known methods, and in the combination each element merely would have performed the same function as it did separately. “A combination of familiar elements according to known methods is likely to be obvious when it does no more than yield predictable results.” KSR, 550 U.S. ___, 82 USPQ2d at 1395 (2007). One of ordinary skill in the art would have recognized that the results of the combination were predictable.

Regarding claims 3, 12 and 19, Pogue in view of Benesty further discloses the first-path audio data or the second-path audio data includes a first speech signal and a second speech signal, the first speech signal being a sound signal that is transmitted by a user, the second speech signal being a sound signal that is transmitted by the terminal (Fig. 1, #110, a loudspeaker inside a voice device; Fig. 3, microphone array #108, user voice command #104, echo from a loudspeaker in the voice-controlled device #110; [0008-0009], [0049], avoiding acceptance of a wake expression generated by a loudspeaker by using echo cancellation).

Regarding claims 4, 13 and 20, Pogue in view of Benesty further discloses the enhanced speech information is generated using a beamformer (Pogue, Fig. 1, #126, [0019], [0029]).

Regarding claims 8 and 9, Pogue in view of Benesty further discloses waking up the terminal in response to determining the degree of matching is greater than or equal to a matching threshold corresponding to the target matching word (Pogue, [0010], [0038], [0043], a detected wake expression exceeds a confidence threshold; no wake-up when the confidence is below the threshold; Examiner note: the detection of a wake expression is based on comparing a confidence level or a probability with a threshold to determine whether a user’s voice command contains the wake expression).
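
For illustration only, the threshold comparison described above can be sketched as follows. The function name and the numeric values are hypothetical; neither the claims nor Pogue, as cited here, specifies a particular threshold value.

```python
def should_wake(match_probability: float, matching_threshold: float) -> bool:
    """Return True when the degree of matching meets or exceeds the threshold."""
    if not 0.0 <= match_probability <= 1.0:
        raise ValueError("match_probability must lie in [0, 1]")
    return match_probability >= matching_threshold


# Assumed example values: a threshold of 0.8 and two candidate detections.
wake_at_threshold = should_wake(0.8, 0.8)   # equal to the threshold: wake up
wake_below = should_wake(0.79, 0.8)         # below the threshold: stay asleep
```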

Regarding claim 21, Pogue in view of Benesty further discloses reducing the second speech signal by using an echo canceler, prior to generating the enhanced speech information (Pogue, [0018], [0048], Fig. 2, #404(b)).

Regarding claim 22, Pogue in view of Benesty further discloses wherein the first-path audio data is represented by B1 and the second-path audio data is represented by B2, and wherein the first enhanced speech information D1 is represented by (B1*k11+B2*k21), and the second-path audio data is represented by (B1*k12+B2*k22) (Benesty, Fig. 4, equation 2; Note: the claimed equation is a matrix multiplication, which is shown in Benesty, Eq. 2, matrix H and signal S(k)).
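
For illustration only, the two expressions recited above can be written as a single matrix product. The label D2 for the second expression is an assumption made here for readability; the limitation as quoted does not name it. The block merely restates the recited sums and is not a reproduction of Benesty's Eq. 2.

```latex
\[
\begin{bmatrix} D_1 & D_2 \end{bmatrix}
=
\begin{bmatrix} B_1 & B_2 \end{bmatrix}
\begin{bmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{bmatrix},
\qquad
D_1 = B_1 k_{11} + B_2 k_{21},
\quad
D_2 = B_1 k_{12} + B_2 k_{22}.
\]
```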

Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Pogue in view of Benesty and further in view of Visser et al. (US PG Pub. 2007/0021958, referred to as Visser).

Pogue in view of Benesty discloses noise reduction and echo cancellation when detecting a user’s wake expression for controlling a voice-controlled device (Pogue, Fig. 1, #122, #124). Pogue further discloses that there may be more users (Pogue, [0015], the audio device is interacting with other users). Pogue discloses reducing ambient noise in an environment ([0015]), but does not explicitly disclose inhibiting speech from a second user (claim limitation: the first-path audio data is a sound sub-signal transmitted by a first user, the first microphone also collects a sound sub-signal transmitted by a second user, and the method further comprises: enhancing the sound sub-signal transmitted by the first user in the speech data set, and inhibiting interference data generated by the sound sub-signal transmitted by the second user).

Visser discloses using a microphone array and beamforming technology to separate speech from noise such as a background speaker or background noise ([0082]). The background speaker corresponds to the claimed “sound sub-signal transmitted by the second user”.

It would have been obvious to a person having ordinary skill in the art at the time the invention was filed to combine Pogue in view of Benesty’s teaching with Visser’s teaching to remove background noise from a background speaker (i.e., “speech from the second user”). One having ordinary skill in the art would have been motivated to make such a modification to output high quality speech (Visser, Abstract).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jialong He, whose telephone number is (571) 270-5359.  The examiner can normally be reached on Monday – Friday, 8:00AM – 4:30PM, EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Pierre Desir, can be reached on (571) 272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JIALONG HE/Primary Examiner, Art Unit 2659