DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Response to Amendments and Arguments
Applicant's arguments filed 02/15/2021 have been fully considered but they are not persuasive. 

Regarding the two rejections under 35 U.S.C. §102(a)(1), each over one of two published papers co-authored by the first inventor, applicant amended independent claims 1, 10 and 18 by adding the limitation: “detecting overlapped speech during a period of time;”

Applicant argued (Remarks, page 6) that neither Yoshioka I nor Yoshioka II teaches the newly added limitation.

The examiner has determined that the amendment fails to distinguish the claims from the cited references. The examiner notes that the claims, and only the claims, form the metes and bounds of the invention.

Both Yoshioka_A and Yoshioka_B (referred to in the Remarks as “Yoshioka I” and “Yoshioka II”) are published research papers co-authored by the first inventor (Yoshioka).
 
Yoshioka_A discloses separating overlapped speech signals recorded from a meeting (See Abstract, Fig. 1). During the meeting, two speakers talk at the same time (See Fig. 1: in the same time period, speech from speaker A overlaps with speech from speaker B). Yoshioka_A further discloses an experiment comparing speech recognition on overlapped speech segments with speech recognition on segments from a single speaker (Section 3.3, Results). Since Yoshioka experimented with overlapped speech segments recorded during a meeting, Yoshioka_A meets the broadly recited limitation: detecting overlapped speech during a period of time.

Yoshioka_B also discloses separating an overlapped speech signal using a neural network and recognizing the separated speech signal (Abstract, Fig. 3). In particular, Yoshioka_B discloses an experiment using fully overlapped speech, partially overlapped speech, or speech from a single speaker during a period of time (See Section 4.1, Setup, and Fig. 3). Yoshioka_B therefore meets the broadly recited limitation: detecting overlapped speech during a period of time.



To expedite prosecution, the examiner also performed an updated search and discovered several references related to detecting overlapping speech and/or separating the overlapped speech. These references are listed on the attached PTO-892 form.

Claim Rejections - 35 USC § 102
Claims 1-7, 9-15 and 17-19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Yoshioka et al. (“Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks”, published Oct. 8, 2018, hereinafter referred to as Yoshioka_A).


Regarding claims 1, 10 and 18, Yoshioka_A discloses a method, a storage device and a device (Abstract, Section 3, computer implemented experiments using BLSTM neural networks to separate overlapped meeting speeches from multiple talkers at different meeting rooms / locations) comprising: 

receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices (Section 2.2, Section 3.2, receiving overlapped meeting speeches from different meeting rooms);

detecting overlapped speech during a period of time (Fig. 1, during a meeting, two speakers are talking at the same time; Section 3.3, experiments using overlapped speech segments and single speaker segments);

performing for the period of time, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech in response to detecting the overlapped speech (Section 2, unmixing / separating the overlapped meeting speeches using BLSTM neural networks; See illustrations in Fig. 2 and Fig. 3; Section 3.3, experimenting with overlapped speech signals as well as speech signals from a single speaker); and 
providing the separated speech on a fixed number of separate output audio channels (Abstract, Fig. 1, generating a fixed number of separated speech signals from different talkers).

Regarding claims 2 and 11, Yoshioka_A further discloses that the continuous speech separation is performed by the neural network model trained using permutation invariant training (Section 1, multi-microphone speech separation neural network using permutation invariant training (PIT); see also Section 2.2.2).
 
Regarding claims 3, 12 and 19, Yoshioka_A further discloses the neural network model is configured to receive a varying number of inputs to support a dynamic change in a number of audio signals and locations of distributed devices during a meeting between multiple users (Abstract, Section 1 and section 3, receiving meeting audios from different locations containing overlapped speeches of multiple speakers).

Regarding claims 4 and 13, Yoshioka_A further discloses the multiple devices capture the audio signals during an ad-hoc meeting (Section 3, meeting transcription experiments using overlapped speech from different speakers).

Regarding claims 5 and 14, Yoshioka_A further discloses the audio signals are received at a meeting server coupled to the distributed devices via a network (Section 3, transcribing meeting speeches transmitted from different locations).

Regarding claims 6 and 15, Yoshioka_A further discloses generating a transcript based on the separate audio channels (Section 3, transcribing meeting speeches from different speakers separated by BLSTM neural networks).

Regarding claim 7, Yoshioka_A further discloses including speaker attribution in the generated transcript (Section 3, meeting speech transcriptions using separated speeches from different speakers; Fig. 1, each segment is labelled by speaker, e.g., “Utt1 by spkrA”, “Utt4 by SpkrC”).

Regarding claims 9 and 17, Yoshioka_A further discloses at least two of the audio streams are provided by an ambient capture device having an array of microphones (Section 3, using a circular microphone array in each meeting room).


Claims 1, 10 and 18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Yoshioka et al. (“MULTI-MICROPHONE NEURAL SPEECH SEPARATION FOR FAR-FIELD MULTI-TALKER SPEECH RECOGNITION”, IEEE, published April 15, 2018, hereinafter referred to as Yoshioka_B).

Regarding claims 1, 10 and 18, Yoshioka_B discloses a method, a storage device and a device (Abstract, Section 4, computer implemented experiments using BLSTM neural networks to separate an audio signal containing multiple talkers) comprising: 

receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices (Abstract; Section 1, Introduction; Section 3, separating far-field speeches from multiple microphones; Note: the multiple microphones correspond to the claimed “multiple distributed devices”);
detecting overlapped speech during a period of time (Fig. 3, Section 4.1, using fully overlapped speech, partially overlapped speech and non-overlapped speech in speech separation experiments);
performing for the period of time, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech in response to detecting the overlapped speech (Section 3, Fig. 3, using a BLSTM neural network to separate overlapped speech; Section 4.1, using fully overlapped speech, partially overlapped speech and non-overlapped speech in speech separation experiments); and
providing the separated speech on a fixed number of separate output audio channels (Section 2, Section 3, Fig. 2, separating speech on a fixed number of channels).

Claim Rejections - 35 USC § 103
Claims 8, 16 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Yoshioka_A in view of Frankel et al. (US PG Pub. 2011/0112833). 

Regarding claims 8, 16 and 20, Yoshioka_A discloses transcribing meeting speeches from multiple speakers (Yoshioka_A, Section 3). Yoshioka_A does not explicitly disclose sending the transcript to one or more of the distributed devices. 

Frankel discloses transcribing conference calls in real time and displaying the generated transcripts on meeting participants’ computer screens ([0048-0049], Fig. 6).

It would have been obvious to a person having ordinary skill in the art at the time the invention was filed to combine Yoshioka_A’s teaching with Frankel’s teaching to send and display speech transcripts on each participant’s screen. One having ordinary skill in the art would have been motivated to make such a modification so that a meeting participant can review what he or she missed in the meeting (Frankel, [0004]).

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jialong He, whose telephone number is (571) 270-5359.  The examiner can normally be reached on Monday – Friday, 8:00AM – 4:30PM, EST.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Pierre Desir, can be reached at (571) 272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  

/JIALONG HE/
Primary Examiner, Art Unit 2659