Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 6/18/2021 has been entered.
 
Response to Arguments
Applicant's arguments filed 6/18/2021 have been fully considered but they are not persuasive. 
Interview
On August 9, 10, and 11, 2021, an interview was conducted by the examiner, with suggestions on claim language in hopes of placing the case in condition for allowance. No immediate agreement was reached regarding amendment language. After further search and consideration, the examiner determined that the proposed language does not place the case in condition for allowance. At this time, consideration of the claim language on the record is found below.


35 USC 112
Regarding the 35 USC 112(a) rejection of claims 1 and 24, the applicant contends that the amendments to claim 1 satisfy the requirements under 35 USC 112(a).
The examiner disagrees. The amendment corrects a portion of the limitation, but the 35 USC 112 rejection is directed towards another portion of the limitation (as highlighted below), which the amendments do not correct. The limitation (currently amended) recites “... applying to a representation of the combined sound signal at least one of the estimated one or more separation or extraction filters according to the obtained neural signals to produce a resultant filtered signal corresponding to the one or more of the multiple sound sources …”. The highlighted portion of the limitation is focused on application of a representation of the combined sound signal to at least one of the estimated one or more separation or extraction filters, which produces a resultant filtered signal. As indicated in the applicant’s remarks, “As noted, the procedure 2000 may be included as part of the procedure 300 to facilitate attentional selection based on neural signals measured for a user … The procedure 300 illustrated in Fig. 3 separates a combined signal to its source components, and then uses neural signals of the person to select one of these components.” 
The highlighted portion of the limitation above recites that application of a representation of the combined signal to at least one separation or extraction filter is according to obtained neural signals, as opposed to the selection of the output of the extraction or separation filter or the output from “applying, …, a neural network based speech separation processing to the combined signal …” as per the specification and the applicant’s remarks. For these reasons, the recited limitation is not in alignment with the specification, hence contains subject matter which was not described in the 
Regarding claims 2-5,6,10-23, such claims are dependent on respective independent claims 

Prior art Rejection
The applicant contends Lunner teaches away from the recited limitations of the claim due to paragraphs 74 and 75. Such paragraphs are directed towards determining the similarity measure between the brainwave signal and the audio signal, as opposed to the speech separation performed by label IU of Fig. 3a,3b. Paragraphs 87-90 disclose microphone techniques, such as the Bell et al reference (included in the office action below), which discloses blind source separation for separating signals found in the mixture or combined signal. Such is a model that does not include speaker specific training data as per the recited limitations. Please see the office action below for further explanation regarding Bell et al and the recited limitation. Such paragraphs indicate that there is more than one manner in which source separation is performed, and one well known method or element is blind source separation as disclosed by Bell et al. 

The applicant contends Barker et al fails to disclose the newly recited limitation. Applicant’s arguments with respect to Barker et al have been considered but are moot because the new ground of rejection does not rely on the Barker et al reference applied 

The applicant contends Yu et al fails to disclose the newly amended limitation of the independent claim. Such reference is directed to limitations pertaining to the dependent claims (as indicated below) as opposed to the newly recited limitation. Please see the office action below.

The applicant presents contentions regarding the references Gupta, Visser, and Mowlace, which were provided in the interview conducted on 6/7/2021. The new ground of rejection does not rely on such references for any teaching or matter specifically challenged in the argument. 

New Claims
The applicant contends newly added claims 31-32 are similar to features of now cancelled claim 9. Such newly added claims are considered below.

The applicant further contends Lunner fails to disclose any approach to separate a mixed (combined) signal into its sources.
The examiner disagrees. Fig. 3 shows the components of the hearing device, wherein label IU indicates the source separation unit. Furthermore, as indicated in the applicant’s remarks, paragraph 88 discloses the Bell et al reference, which discloses the process of blind source separation. Although Lunner does not disclose all the details of 

The applicant contends Lunner fails to disclose “separating the combined sound signals into source spectrograms, much less discusses such speech separation processing being based on a neural network implementation.” 
The office action below clearly describes the correlation between the references and the recited limitation. The office action below addresses the limitation regarding neural networks and source separation. Please see the office action below. 

The applicant further contends Lunner fails to disclose “a comparison of spectrograms (the digitized form of the received source signals does not imply that the received signals are provided as a spectrogram representation).” 
The examiner disagrees. Paragraph 147 discloses 
“Additionally, the input unit IU may comprise time to time-frequency conversion units (e.g. (digital) Fourier transformation units (e.g. FFT, e.g. DFT) or (digital) filter banks) to provide each of the electric (microphone) input signals m.sub.p (or weighted combinations thereof) and/or separated source signals s.sub.i in a time frequency representation (m.sub.p[n,k] and s.sub.i[n,k], respectively), where k is a frequency index. Preferably, the time to time-frequency conversion units may be configurable, to allow activation or deactivation of one or more, such as all, time to time-frequency units, and/or partial activation (e.g. to allow analysis of only pre-selected frequency bands). In an embodiment, the model unit MOD is configurable. In an embodiment, the model unit MOD is configured to work in the time domain in a first specific mode of operation (e.g. a normal mode) and in the (time-)frequency domain in a second specific mode of operation (e.g. in a learning mode), where analysis of signal spectra are enabled.”
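For illustration only, the time to time-frequency conversion described in the quoted paragraph can be sketched as a short-time Fourier transform that maps a time-domain signal m_p[n] to a time-frequency representation m_p[n,k], whose squared magnitude is commonly referred to as a spectrogram. The sketch below is not the implementation of Lunner et al or of the applicant; the function name, frame length, hop size, window, and test tone are all hypothetical choices made for this example.

```python
import numpy as np

def time_frequency_representation(signal, frame_len=256, hop=128):
    """Hypothetical helper: windowed-FFT (STFT) of a 1-D signal.

    Returns a complex array indexed [n, k], where n is the frame (time)
    index and k is the frequency index, with shape
    (num_frames, frame_len // 2 + 1).
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(num_frames)
    ])
    # Real-input FFT along each frame gives the frequency axis k.
    return np.fft.rfft(frames, axis=1)

# Assumed test input: one second of a 1000 Hz tone sampled at 8000 Hz.
fs = 8000
t = np.arange(fs) / fs
m_p = np.sin(2 * np.pi * 1000 * t)

M = time_frequency_representation(m_p)      # m_p[n, k]
spectrogram = np.abs(M) ** 2                # power per time-frequency bin
peak_bin = int(np.argmax(spectrogram.mean(axis=0)))
peak_hz = peak_bin * fs / 256               # bin spacing is fs / frame_len
```

With these assumed parameters the frequency bins are spaced fs / frame_len = 31.25 Hz apart, so the 1000 Hz tone falls on frequency index k = 32.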



	The applicant further contends Lunner fails to disclose the “applying, by the device, a neural network based speech separation processing ….” and “selecting one of the multiple resultant speaker spectrograms ….”. The examiner disagrees. The office action below clearly indicates the correlation between the references and recited limitations. Please see the office action below.

Claim Objections
Claim 31 is objected to because of the following informalities:  Claim 31 recites “a the neural-network …”.  The highlighted portion is grammatically incorrect. Please choose one. Note: The term “the” indicates reference to a previously recited limitation. The limitation “neural-network” was not previously recited.
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

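The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.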

Claims 1-4,6,10-24,26-30 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. 
Claim 1 recites the limitation “wherein selecting the one of the plurality of signals comprises applying to a representation of the combined sound signal at least one of the estimated one or more separation or extraction filters, …, according to the obtained neural signals to produce a resultant filtered signal corresponding to the one or more of the multiple sound sources to which the person is attentive to”.
The highlighted portion of the limitation is focused on application of a representation of the combined sound signal to at least one of the estimated one or more separation or extraction filters, which produces a resultant filtered signal. As indicated in the applicant’s remarks, “As noted, the procedure 2000 may be included as part of the procedure 300 to facilitate attentional selection based on neural signals measured for a user … The procedure 300 illustrated in Fig. 3 separates a combined signal to its source components, and then uses neural signals of the person to select one of these components.” 
The highlighted portion of the limitation above recites that application of a representation of the combined signal to at least one separation or extraction filter is according to obtained neural signals, as opposed to the selection of the output of the extraction or separation filter or the output from “applying, …, a neural network based speech separation processing to the combined signal …” as per the specification and the applicant’s remarks. For these reasons, the recited limitation is not in alignment with the specification, hence contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 USC 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claims 2-4 and 6-23 are dependent on claim 1, hence incorporate the limitations of independent claim 1. The grounds for rejection are as indicated for claim 1.

Claim 24 recites the limitation “wherein the controller configured to select the one of the plurality of signals is configured to apply to a representation of the combined sound signal at least one of the estimated one or more separation or extraction filters, …, according to the obtained neural signals to produce a resultant filtered signal corresponding to the one or more of the multiple sound sources to which the person is attentive to”.
The highlighted portion of the limitation is focused on application of a representation of the combined sound signal to at least one of the estimated one or more separation or extraction filters, which produces a resultant filtered signal. As indicated in the applicant’s remarks, “As noted, the procedure 2000 may be included as part of the procedure 300 to facilitate attentional selection based on neural signals measured for a user … The procedure 300 illustrated in Fig. 3 separates a combined signal to its source components, and then uses neural signals of the person to select one of these components.” 
The highlighted portion of the limitation above recites that application of a representation of the combined signal to at least one separation or extraction filter is according to obtained neural signals, as opposed to the selection of the output of the extraction or separation filter or the output from “applying, …, a neural network based speech separation processing to the combined signal …” as per the specification and the applicant’s remarks. For these reasons, the recited limitation is not in alignment with the specification, hence contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 USC 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claims 25-29 are dependent on claim 24, hence incorporate the limitations of independent claim 24. The grounds for rejection are as indicated for claim 24.

Claim 30 recites the limitation “wherein the instructions to select the one of the plurality of signals comprise further instructions to apply to a representation of the combined sound signal at least one of the estimated one or more separation or extraction filters, …, according to the obtained neural signals to produce a resultant filtered signal corresponding to the one or more of the multiple sound sources to which the person is attentive to”.
The highlighted portion of the limitation is focused on application of a representation of the combined sound signal to at least one of the estimated one or more separation or extraction filters, which produces a resultant filtered signal. As indicated in the applicant’s remarks, “As noted, the procedure 2000 may be included as part of the procedure 300 to facilitate attentional selection based on neural signals measured for a user … The procedure 300 illustrated in Fig. 3 separates a combined signal to its source components, and then uses neural signals of the person to select one of these components.” 
The highlighted portion of the limitation above recites that application of a representation of the combined signal to at least one separation or extraction filter is according to obtained neural signals, as opposed to the selection of the output of the extraction or separation filter or the output from “applying, …, a neural network based speech separation processing to the combined signal …” as per the specification and the applicant’s remarks. For these reasons, the recited limitation is not in alignment with the specification, hence contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 USC 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Claims 1-4,6,10-24,26-30 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that 
Claim 1 recites “applying the speech separation processing, implemented without speaker-dependent training, ….” The highlighted limitation does not match the disclosure.
Claim 24 recites “apply the speech separation processing is implemented without speaker dependent training …” 
Claim 30 recites “wherein the speech separation processing is implemented without speaker-dependent training …”. 
Paragraph 211 discloses “The neural network models used in the example implementation that are tested and evaluated were trained by mixing speech utterances from Wall Street Journal corpus, specifically the WSJ0-2mix and WSJ0-3mix datasets, which contain 30 hours of training, 10 hours of validation, and 5 hours of test data.” The highlighted portion indicates the neural networks are trained by mixing speech utterances, wherein such speech utterances are specific to a speaker because each speaker has a unique voice and speech. The claimed language recites that the speech separation processing, which is performed using a neural network, is implemented without speaker dependent training, but paragraph 211 discloses the use of mixed speech utterances, wherein an utterance is specific to a specific speaker given that each speaker has a unique voice and speech. This indicates the recited claimed language does not match the disclosure.
The applicant’s remarks filed 6/18/2021 point to paragraphs 8, 144, 170, and 210-211. Paragraph 170 discloses “An example system implementation was evaluated for a two-randomly selecting utterances from different speakers in Wall Street Journal (WSJ0) training set si_tr_s, and mixing them at random signal to noise ratios (SNR) between 0 dB and 5 dB. Five hours of evaluation set was generated in the same way, using utterances from 16 unseen speakers from si_dt_05 and si_et_05 in the WSJ0 dataset. …” 
	The highlighted portions of the paragraph above indicate that the neural network or speech separation process is trained using “randomly selected utterances from different speakers in Wall Street Journal …”, which includes “utterances from 16 unseen speakers” in the WSJ0 dataset, which is also found in the Wall Street Journal. Such indicates that training of the speech separation process neural network uses speaker dependent training, since utterances from different speakers (as highlighted above) indicate speech dependent on speakers, wherein each speaker has a unique voice and speech. 
For these reasons, the recited claimed language contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. 
	Claims 2-4,6,10-23,26-29 are rejected as per the respective independent claim.



Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4,6-7,24,30 is/are rejected under 35 U.S.C. 103 as being unpatentable over Lunner et al (US Publication No.: 20140098981) in view of Bell et al (Title: “An Information-Maximization Approach to Blind Separation and Blind Deconvolution”).
Claim 1, Lunner et al discloses 
	obtaining, by a device, a combined sound signal for signals combined from multiple sound sources in an area in which a person is located (Fig. 3a-4, label M1-Mu are microphones or multiple sound sources that receive multiple sound signals in an area of a person (person wearing device HD. Example shown in Fig. 4, label user). Paragraph 147 discloses the input processing comprises a weighting unit for combining the input signals (audio signals received via the microphones).);
	applying, by a device, speech separation processing to the combined sound signal from the multiple sound sources to derive a plurality of separated signals (Fig. 3a, label sep. Paragraph 147 discloses the input unit (Fig. 3a, label IU) receives a plurality of 
	obtaining, by the device, neural signals for the person (Fig. 1, label Brain, E1-Ep are neural signals for the user such as the user shown in Fig. 4. Label BWM outputs the brain waves, e1-ej (paragraph 147).), the neural signals being indicative of one or more of the multiple sound sources the person is attentive to (Fig. 1, label 13,14,20 are electrodes connected to the person or user of the hearing device 1 (Fig. 1). Such device records the brain wave of the user or person as the person or user wears the hearing device and the microphones detect or receive the sound or audio in the environment or room of the user or person. Fig. 3a shows an example of the speakers or sound that is received via the microphones of the hearing device, label HD.); and
	selecting one of the plurality of separated signals based on the obtained neural signals for the person (Paragraph 73 discloses “The goal is to find a pattern in the EEG signals ej(t) (j=1,…,J) that correlates strongly to s1(t) and much less to si(t), I not equal to 1.” Paragraph 149 discloses the hearing device is configured to “select a preferred one 
wherein selection of the one of the plurality of signals (paragraph 73,149) comprises applying to a representation of the combined sound signal (Paragraph 147 discloses “the input unit IU may comprise time to time-frequency conversion units (e.g. (digital) Fourier transformation units (e.g. FFT, e.g. DFT) or (digital) filter banks) to provide each of the electric (microphone) input signals m.sub.p (or weighted combinations thereof) and/or separated source signals s.sub.i in a time frequency representation (m.sub.p[n,k] and s.sub.i[n,k], respectively) …”.) to the source separation unit (Fig. 3a, label SEP), according to the obtained neural signals (Paragraph 73 discloses “The goal is to find a pattern in the EEG signals ej(t) (j=1,…,J) that correlates strongly to s1(t) and much less to si(t), I not equal to 1.” Paragraph 149 discloses the hearing device is configured to “select a preferred one (or a preferred combination) of the directly received input audio signals by comparison with the concurrently recorded brainwave signals.”) to produce a resultant filtered signal corresponding to the one or more of the multiple sound sources to which the person is attentive to (Such is a result of the selection. By selecting the separated sound source according to the obtained neural signals as indicated above, the resultant filtered signal as recited is produced.).
Lunner et al discloses Bell et al “An information maximization approach to blind separation and blind deconvolution” regarding microphone array techniques, but fails to disclose the speech separation processing is a neural network based speech separation processing, wherein applying the speech separation processing, without speaker dependent training comprises estimating separation or extraction filters derived based on the combined sound signal and wherein applying to a representation of the combined 
Bell et al discloses a source separation unit (Fig. 3, Section 2) comprises blind separation and blind deconvolution that performs speech separation processing on a combined signal or signal with multiple sources (Abstract discloses “We apply the network to the source separation (or cocktail party) problem, successfully separating unknown mixtures of up to 10 speakers.” Such indicates speech separation processing is performed on combined signal or mixture of multiple sound sources or speakers. Page 1133 discloses “… to see the advantages of this approach in artificial neural networks, we now analyze the case of multidimensional inputs and outputs.” Section 2.23,2.24 are learning rules used to perform blind deconvolution where speech signals were convolved with various filters (Section 5.2). Section 2.14,2.15 discloses blind separation rules.) as a neural network based speech separation processing (Section 2 discloses “The basic problem tackled here is how to maximize the mutual information that the output Y of a neural network processor contains about its input X. …”), 
wherein applying the speech separation processing (Section 3 discloses application of speech separation processing on a set of sources mixed together linearly by a matrix A.), implemented without speaker dependent training (Section 3 discloses “We do not know anything about the sources or the mixing process.” Such disclosure indicates the speech separation processing is performed without knowledge of the sources or mixing process or blindly without data pertaining to the sound in the environment. ) comprises estimating separation or extraction filters derived based on the combined sound signal (Section 5 discloses blind deconvolution includes convolving 
wherein applying to a representation of the combined sound signal (Section 2.2 discloses x as the input vector, and x(t), a representation of the input vector x (Equation 2.17), is applied to the filter w(t) or source separation unit (Fig. 3, label W and deconvolution).) to the source separation unit (Fig. 3 shows the source separation unit comprising blind separation and blind deconvolution.) includes applying the representation of the combined signal (Section 2.2 discloses x as the input vector, and x(t), a representation of the input vector x (Equation 2.17), is applied to the filter w(t) or source separation unit (Fig. 3, label W and deconvolution).) to at least one of the estimated one or more separation or extraction filters (Fig. 3, label W shows the one or more estimated separation or extraction filters. Section 5.1 discloses fig. 3a, 2.14 and 2.15 for blind separation using weight matrix WA. Section 5.2 discloses “Speech signals were convolved with various filters and the learning rules in 2.23 and 2.24 were used to perform blind deconvolution.” Fig. 7 shows the filters used for blind deconvolution.).
It would be obvious to one skilled in the art before the effective filing date of the application to simply substitute one well known element of speech separation as disclosed by Lunner with another well known element of blind source separation (i.e., blindly performing speech separation), a microphone array technique, as disclosed by Bell et al, so as to obtain predictable results of separated source signals. 
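For illustration only, the neural-signal-based selection relied upon in the claim 1 mapping (paragraph 73 of Lunner et al, finding the separated signal that correlates strongly with the EEG signals and much less with the others; paragraph 149, selecting a preferred input audio signal by comparison with the recorded brainwave signals) can be sketched as picking the separated signal whose envelope has the highest correlation with a neural reference. The sketch below is a simplified assumption, not the method of Lunner et al or of the applicant: it assumes a single neural reference envelope has already been derived from the EEG, uses the absolute value as a crude envelope, and runs on synthetic data. The function and variable names are hypothetical.

```python
import numpy as np

def select_attended_source(separated, neural_envelope):
    """Hypothetical helper: return the index of the separated signal whose
    envelope correlates most strongly with the neural reference, together
    with the correlation value for every separated signal."""
    correlations = []
    for s in separated:
        envelope = np.abs(s)  # crude amplitude envelope (assumption)
        r = np.corrcoef(envelope, neural_envelope)[0, 1]
        correlations.append(r)
    correlations = np.array(correlations)
    return int(np.argmax(correlations)), correlations

# Synthetic stand-ins for two separated source signals s_0(t), s_1(t):
# noise carriers with different slow amplitude modulations.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 2000, endpoint=False)
env0 = 1.0 + np.sin(2 * np.pi * 2 * t)
env1 = 1.0 + np.sin(2 * np.pi * 5 * t)
s0 = env0 * rng.standard_normal(t.size)
s1 = env1 * rng.standard_normal(t.size)

# Assumed neural reference: tracks the envelope of source 1 plus noise,
# standing in for a listener attending to source 1.
neural_envelope = env1 + 0.1 * rng.standard_normal(t.size)

selected, correlations = select_attended_source([s0, s1], neural_envelope)
```

In this synthetic setup the neural reference follows the modulation of the second signal, so the sketch selects index 1.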
Claim 2, Lunner et al discloses wherein obtaining the neural signals for the person comprises: obtaining one or more of: electrocorticography (ECoG) signals for the 
	Claim 3, Lunner et al discloses processing the selected one of the plurality of separated sound signals, including performing one or more of: amplifying the selected one of the plurality of separated signals, or attenuating at least one non-selected signal from the plurality of separated signals. (Fig. 1, label 9,4 amplifies the input from the microphone and the modified audio signal from the signal processor, label 8. Paragraph 114 discloses the signal processor modifies the combined signal (input sound signals) to improve the hearing capability of the user and/or amplify or convey a received audio signal to the user.).
	Claim 4, Lunner et al discloses obtaining the combined sound signal for the multiple sound sources comprises: receiving the combined sound signal for the multiple sound sources at a single microphone coupled to the device. (Fig. 1, label 3. Paragraph 147 discloses N sound sources are received via microphones, M1 …. Mm, wherein depending on the value of m, the number of microphones can be 1.)
Claim 6, Bell et al discloses applying the neural network based speech separation processing to the combined sound signal from the multiple sound sources comprises providing the combined sound signal from the multiple sound sources to a deep neural network (DNN) configured to identify individual sound sources from the combined sound signal (Introduction and Section 2 discloses source separation is performed using a neural network, wherein DNN is a type of artificial neural network. Fig. 3 shows processing the combined sound signal (label s as the sources or sound received, x as the combined signal). Label W of Fig. 3a, Fig. 3b shows the artificial neural network 
Although Bell et al discloses a neural network, it does not specify that the neural network is a DNN. However, a DNN is a type of neural network and, depending on the layers of the neural network disclosed by Bell et al, it would be obvious to one skilled in the art before the effective filing date of the application to simply substitute one well known element of neural network for another well known element DNN so as to obtain predictable results of source separation.
Claim 7, Lunner et al discloses
generating a sound spectrogram from the combined sound signal (Paragraph 43,44 discloses calculating the input audio signal that has the highest or largest resulting correlation measurement between the audio signal, si and  the eeg, ej. Paragraph 50-53 discloses such correlation includes calculation of the coherence that determines the spectral density in frequency of two signals, wherein such signals can be brainwave signal and target sound signal (paragraph 54).); and 
	applying the speech separation processing to the generated spectrogram to derive multiple resultant speaker spectrograms (Paragraph 46 discloses the source separation unit performs speech separation of the one or more sound sources. Paragraph 43 discloses identifying one of the input audio signals that has the highest probability of being a target signal for the individual wearing the hearing device. Paragraph 42 discloses performing correlation measurement for the input audio signals si (each individual audio signal from individual speakers). Such paragraphs indicate that correlation is applied to individual audio signals from individual speakers, wherein such correlation includes generation of a spectrogram as disclosed in paragraph 53.).

	Barker et al discloses sound separation unit where the sound separation is based on deep neural network (Col. 4, lines 10-15). Lunner et al discloses separation of the combined sound signal (Fig. 3a, label sep) and Barker et al discloses how the combined sound signal is separated (Fig. 1a,1b), hence it would be obvious to one skilled in the art before the effective filing date of the application to modify Lunner et al by performing separation of the combined sound signal as disclosed by Barker et al so to provide enhanced separation and improve presenting the listener with a single speaker at a time.
Claim 24, Lunner et al discloses 
	at least one microphone (Fig. 1, label 3, 3a, label IU,M1-MM) to obtain a combined sound signal (Paragraph 147 discloses the input processing comprises a weighting unit for combining the input signals (audio signals received via the microphones).) for signals combined from multiple sound sources in an area in which a person is located (Fig. 2a, label S1-SN); 
	one or more neural sensors (Fig. 1, label 13,14,20 are electrodes connected to the person or user of the hearing device 1 (Fig. 1).) to obtain neural signals for the person (Fig. 3a, label Brain, E1-Ep,e1-ej), the neural signals being indicative of one or more of the multiple sound sources the person is attentive to (Fig. 1, label 13,14,20 are electrodes connected to the person or user of the hearing device 1 (Fig. 1). Such device records the brain wave of the user or person as the person or user wears the hearing device and the microphones detect or receive the sound or audio in the environment or room of the user or person. Fig. 3a shows an example of the speakers or sound that is received via the microphones of the hearing device, label HD.); and

	apply speech separation processing to the combined sound signal from the multiple sound sources to derive a plurality of separated signals (Fig. 3a, label sep. Paragraph 147 discloses the input unit (Fig. 3a, label IU) receives a plurality of sound signals from the microphones M1-MM, combines the input signals and outputs a number of output signals, s1-sN.) that each contains signals corresponding to different groups of multiple sound signals (paragraph 147 discloses the unit SEP “receives M electric input signals and provides as output N separated source signals s1, …, sN, ideally representing the speech signals provided by the N speakers.” Paragraph 46 discloses the source separation unit for separating one or more sound sources s in the sound field based on electric input signals and providing respective separated input audio signals Xs, from said one or more sound sources s. These paragraphs indicate that each signal corresponds to one or more sound sources or different groups of multiple sound signals.); and
	select one of the plurality of separated signals based on the obtained neural signals for the person (Paragraph 73 discloses “The goal is to find a pattern in the EEG signals ej(t) (j=1,…,J) that correlates strongly to s1(t) and much less to si(t), I not equal to 1.” Paragraph 149 discloses the hearing device is configured to “select a preferred one (or a preferred combination) of the directly received input audio signals by comparison with the concurrently recorded brainwave signals.”) and 
)the source separation unit (Fig. 3a, label SEP,), according to the obtained neural signals (Paragraph 73 discloses “The goal is to find a pattern in the EEG signals ej(t) (j=1,…,J) that correlates strongly to s1(t) and much less to si(t), I not equal to 1.” Paragraph 149 discloses the hearing device is configured to “select a preferred one (or a preferred combination) of the directly received input audio signals by comparison with the concurrently recorded brainwave signals.”) to produce a resultant filtered signal corresponding to the one or more of the multiple sound sources to which the person is attentive to (Such is a result of the selection. By selecting the separated sound source according to the obtained
Lunner et al discloses Bell et al “An information maximization approach to blind separation and blind deconvolution” regarding microphone array techniques, but fails to disclose the speech separation processing is a neural network based speech separation processing, wherein applying the speech separation processing, without speaker dependent training comprises estimating separation or extraction filters derived based on the combined sound signal and wherein applying to a representation of the combined sound signal to the source separation unit includes applying the representation of the combined signal to at least one of the estimated one or more separation or extraction filters.

wherein applying the speech separation processing (Section 3 discloses application of speech separation processing on a set of sources mixed together linearly by a matrix A.), implemented without speaker dependent training (Section 3 discloses “We do not know anything about the sources or the mixing process.” Such disclosure indicates the speech separation processing is performed without knowledge of the sources or mixing process or blindly without data pertaining to the sound in the environment. ) comprises estimating separation or extraction filters derived based on the combined sound signal (Section 5 discloses blind deconvolution includes convolving the mixed signal or combined signal with multiple filters. Fig. 7 shows the multiple filters. Section 5.2 disclose blind deconvolution is performed using the learning rules 
wherein applying to a representation of the combined sound signal (Section 2.2 discloses x as the input vector; x(t), a representation of the input vector x (Equation 2.17), is applied to the filter w(t) or source separation unit (Fig. 3, label W and deconvolution).) to the source separation unit (Fig. 3 shows the source separation unit comprising blind separation and blind deconvolution.) includes applying the representation of the combined signal (Section 2.2 discloses x as the input vector; x(t), a representation of the input vector x (Equation 2.17), is applied to the filter w(t) or source separation unit (Fig. 3, label W and deconvolution).) to at least one of the estimated one or more separation or extraction filters (Fig. 3, label W shows the one or more estimated separation or extraction filters. Section 5.1 discloses Fig. 3a, 2.14 and 2.15 for blind separation using weight matrix WA. Section 5.2 discloses “Speech signals were convolved with various filters and the learning rules in 2.23 and 2.24 were used to perform blind deconvolution.” Fig. 7 shows the filters used for blind deconvolution.).
It would be obvious to one skilled in the art before the effective filing date of the application to simply substitute one well-known element, the speech separation disclosed by Lunner, with another well-known element, blind source separation (i.e., blindly performed speech separation), a microphone array technique as disclosed by Bell et al, so as to obtain the predictable result of separated source signals.
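For illustration only, the step of applying a representation of the combined signal to a separation filter can be sketched as a matrix applied to each mixed sample, u = Wx. This is a minimal sketch under stated assumptions and is not taken from either reference: the two-source mixing matrix A, the sample values, and the helper names invert_2x2 and apply_matrix are hypothetical, and W is set to the exact inverse of A rather than learned. In blind separation, W would instead be estimated from the mixed signal alone and would generally recover the sources only up to scaling and ordering.

```python
def invert_2x2(m):
    """Invert a 2x2 matrix given as a list of two rows (hypothetical helper)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular and cannot be inverted")
    return [[d / det, -b / det], [-c / det, a / det]]


def apply_matrix(m, frames):
    """Apply a 2x2 matrix to every two-channel sample in frames."""
    return [
        [sum(m[i][j] * frame[j] for j in range(2)) for i in range(2)]
        for frame in frames
    ]


# Assumed example data: three samples of two independent sources.
sources = [[1.0, 0.0], [0.5, -1.0], [-0.25, 2.0]]

# Assumed linear mixing matrix A (the "combined" signal is x = A s).
A = [[1.0, 0.6], [0.4, 1.0]]
mixed = apply_matrix(A, sources)

# Idealized separation filter: W is the exact inverse of A.
# A blind method would have to estimate W from `mixed` alone.
W = invert_2x2(A)
separated = apply_matrix(W, mixed)
```

With the idealized W, separated matches sources to within floating-point rounding; the sketch shows only the filter application, not the estimation of the filter.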
	Claim 25, Lunner et al discloses
generate a sound spectrogram from the combined sound signal (Paragraph 43,44 discloses calculating the input audio signal that has the highest or largest resulting correlation measurement between the audio signal, si and  the eeg, ej. Paragraph 50-53 
	apply the speech separation processing to the generated spectrogram to derive multiple resultant speaker spectrograms (Paragraph 46 discloses the source separation unit performs speech separation of the one or more sound sources. Paragraph 43 discloses identifying one of the input audio signals that has the highest probability of being a target signal for the individual wearing the hearing device. Paragraph 42 discloses performing correlation measurement for the input audio signals si (each individual audio signal from individual speakers). Such paragraphs indicate that correlation is applied to individual audio signals from individual speakers, wherein such correlation includes generation of a spectrogram as disclosed in paragraph 53.); and
	wherein the controller (Fig. 3a, label BWM,MOD,SEP,SPU. Fig. 1, label 2) configured to select one of the plurality of separated sound signals (Paragraph 149 discloses the hearing device is configured to “select a preferred one (or a preferred combination) of the directly received input audio signals by comparison with the concurrently recorded brainwave signals.”) comprises: 
generate an attended speaker spectrogram based on the neural signals for the person (paragraphs 50-54,42-45 discloses the correlation between brainwave, ej and audio signal si is performed for the audio input signal.);
compare the attended speaker spectrogram to the derived multiple resultant speaker spectrograms to select one of the multiple resultant speaker spectrograms (paragraph 43 discloses “identify one of the input audio signals that has  the highest probability of being a target signal for the individual wearing the hearing device.” This 
transform the selected one of the multiple resultant speaker spectrograms into an acoustic signal (Fig. 3, label OU. Paragraph 147 discloses “the output signal are fed to an output unit for being presented to a user and perceived as sound (and/or for being transmitted to another device.)”).
Lunner et al fails to disclose the speech separation is a neural network speech separation processing, implemented without speaker dependent training.
Bell et al discloses a source separation unit (Fig. 3, Section 2) comprises blind separation and blind deconvolution that performs speech separation processing on a combined signal or signal with multiple sources (Abstract discloses “We apply the network to the source separation (or cocktail party) problem, successfully separating unknown mixtures of up to 10 speakers.” Such indicates speech separation processing is performed on combined signal or mixture of multiple sound sources or speakers. Page 1133 discloses “… to see the advantages of this approach in artificial neural networks, we now analyze the case of multidimensional inputs and outputs.” Section 2.23,2.24 are learning rules used to perform blind deconvolution where speech signals were convolved with various filters (Section 5.2). Section 2.14,2.15 discloses blind separation rules.) as a neural network based speech separation processing (Section 2 discloses “The basic problem tackled here is how to maximize the mutual information that the output Y of a neural network processor contains about its input X. …”), implemented  without speaker dependent training (Section 3 discloses “We do not know anything about the 
Lunner et al cites the Bell et al reference as a microphone array technique, and Bell et al discloses source separation comprising a neural network speech separation process implemented without speaker dependent training (see above). Hence it would be obvious to one skilled in the art before the effective filing date of the application to simply substitute one well-known element, the speech separation disclosed by Lunner, with another well-known element, blind source separation (i.e., blindly performed speech separation), a microphone array technique as disclosed by Bell et al, so as to obtain the predictable result of separated source signals.
Claim 30, Lunner et al discloses
	Preamble: A nontransitory computer readable media programmed with instructions (Fig. 1 shows the digital circuits. Paragraph 143 discloses digital circuits implemented using hardware, firmware, software or combination thereof, wherein non-transitory computer readable media embedded with instructions can be considered hardware, firmware and software.), executable on a processor (Fig. 1, label 8. paragraph 143) to:
obtain, by a device comprising the processor, a combined sound signal for signals combined from multiple sound sources in an area in which a person is located (Fig. 3a-4, label M1-Mu are microphones or multiple sound sources that receive multiple sound signals in an area of a person (person wearing device HD. Example shown in Fig. 4, label user). Paragraph 147 discloses the input processing comprises a weighting unit for combining the input signals (audio signals received via the microphones).);

	obtain, by the device, neural signals for the person (Fig. 1, label Brain, E1-Ep are neural signals for the user such as the user shown in Fig. 4. Label BWM outputs the brain waves, e1-ej (paragraph 147).), the neural signals being indicative of one or more of the multiple sound sources the person is attentive to (Fig. 1, label 13,14,20 are electrodes connected to the person or user of the hearing device 1 (Fig. 1). Such device records the brain wave of the user or person as the person or user wears the hearing device and the microphones detect or receive the sound or audio in the environment or room of the user or person. Fig. 3a shows an example of the speakers or sound that is received via the microphones of the hearing device, label HD.); and
	select one of the plurality of separated signals based on the obtained neural signals for the person (Paragraph 73 discloses “The goal is to find a pattern in the EEG 
Lunner et al discloses Bell et al “An information maximization approach to blind separation and blind deconvolution” regarding microphone array techniques, but fails to disclose the speech separation processing is a neural network based speech separation processing, wherein applying the speech separation processing, without speaker dependent training comprises estimating separation or extraction filters derived based on the combined sound signal and wherein applying to a representation of the combined sound signal to the source separation unit includes applying the representation of the combined signal to at least one of the estimated one or more separation or extraction filters.
Bell et al discloses a source separation unit (Fig. 3, Section 2) comprises blind separation and blind deconvolution that performs speech separation processing on a combined signal or signal with multiple sources (Abstract discloses “We apply the network to the source separation (or cocktail party) problem, successfully separating unknown mixtures of up to 10 speakers.” Such indicates speech separation processing is performed on combined signal or mixture of multiple sound sources or speakers. Page 1133 discloses “… to see the advantages of this approach in artificial neural networks, we now analyze the case of multidimensional inputs and outputs.” Section 2.23,2.24 are learning rules used to perform blind deconvolution where speech signals were convolved with various filters (Section 5.2). Section 2.14,2.15 discloses blind separation rules.) as a neural network based speech separation processing (Section 2 discloses “The 
wherein applying the speech separation processing (Section 3 discloses application of speech separation processing on a set of sources mixed together linearly by a matrix A.), implemented without speaker dependent training (Section 3 discloses “We do not know anything about the sources or the mixing process.” Such disclosure indicates the speech separation processing is performed without knowledge of the sources or mixing process, or blindly, without data pertaining to the sound in the environment.) comprises estimating separation or extraction filters derived based on the combined sound signal (Section 5 discloses blind deconvolution includes convolving the mixed signal or combined signal with multiple filters. Fig. 7 shows the multiple filters. Section 5.2 discloses blind deconvolution is performed using the learning rules 2.23, 2.24. Equations 2.23, 2.24 determine the weight of the filter w(t) based on x, the input vector (Section 2.2). Fig. 3, label a shows x as the input vector, and the sound sources.),
wherein applying to a representation of the combined sound signal (Section 2.2 disclose x as the input vector, x(t) as a representation of the input vector,x (Equation 2.17) is applied to the filter w(t) or source separation unit (Fig. 3, label W and deconvolution).) to the source separation unit (Fig. 3 shows the source separation unit comprising blind separation and blind deconvolution.) includes applying the representation of the combined signal (Section 2.2 disclose x as the input vector, x(t) as a representation of the input vector,x (Equation 2.17) is applied to the filter w(t) or source separation unit (Fig. 3, label W and deconvolution).) to at least one of the estimated one or more separation or extraction filters (Fig. 3, label W shows the one or more estimated separation or extraction filters. Section 5.1 discloses fig. 3a, 2.14 and 
It would be obvious to one skilled in the art before the effective filing date of the application to simply substitute one well-known element, the speech separation disclosed by Lunner, with another well-known element, blind source separation (i.e., blindly performed speech separation), a microphone array technique as disclosed by Bell et al, so as to obtain the predictable result of separated source signals.
Claims 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Lunner et al (US Publication No.: 20140098981) in view of Bell et al (Title: “An Information-Maximization Approach to Blind Separation and Blind Deconvolution”), further in view of Yu (US Patent No.: 9818431).
 	Claim 8, Lunner et al discloses a combined signal of multiple voices (Fig. 3a, label SEP,IU) and selection of one of the plurality of separated sound signals based on the obtained neural signals (Paragraph 73 discloses “The goal is to find a pattern in the EEG signals ej(t) (j=1,…,J) that correlates strongly to s1(t) and much less to si(t), I not equal to 1.” Paragraph 149 discloses the hearing device is configured to “select a preferred one (or a preferred combination) of the directly received input audio signals by comparison with the concurrently recorded brainwave signals.”), but fails to disclose that selecting one of the plurality of separated sound signals based on the obtained neural signals for the person comprises generating an attended speaker spectrogram based on the neural signals for the person; comparing the attended speaker spectrogram to the derived multiple resultant speaker spectrograms to select one of the multiple resultant speaker spectrograms; and transforming the selected one of the multiple resultant speaker spectrograms into an acoustic signal.
	Yu discloses

selecting one of the plurality of separated sound signals based on the obtained neural signals for the person (Paragraph 73 discloses “The goal is to find a pattern in the EEG signals ej(t) (j=1,…,J) that correlates strongly to s1(t) and much less to si(t), I not equal to 1.” Paragraph 149 discloses the hearing device is configured to “select a preferred one (or a preferred combination) of the directly received input audio signals by comparison with the concurrently recorded brainwave signals.”) comprises: 
generating an attended speaker spectrogram based on the neural signals for the person (paragraphs 50-54,42-45 discloses the correlation between brainwave, ej and audio signal si is performed for the audio input signal.);
comparing the attended speaker spectrogram to the derived multiple resultant speaker spectrograms to select one of the multiple resultant speaker spectrograms (paragraph 43 discloses “identify one of the input audio signals that has  the highest probability of being a target signal for the individual wearing the hearing device.” This indicates a comparison of the multiple resultant speakers to select the highest probability of being a target signal. Paragraph 50-53 disclose the generation of correlation includes generating spectrograms or frequency response or response of the audio in frequency.); and
transforming the selected one of the multiple resultant speaker spectrograms into an acoustic signal (Fig. 3, label OU. Paragraph 147 discloses “the output signal are fed to an output unit for being presented to a user and perceived as sound (and/or for being transmitted to another device.)”).
	It would be obvious to one skilled in the art before the effective filing date of the application to modify selection of the separated sound signal as disclosed by Lunner et 
Claim 9, Lunner et al discloses comparing the attended speaker spectrogram to the derived multiple resultant speaker spectrograms using normalized correlation analysis (Paragraphs 42-45 disclose finding the highest probability of being a target signal for the user includes calculating the correlation measure. Paragraphs 50-53 disclose the correlation measure includes generating a power spectral density in the frequency domain of the audio input signals, si, for i = 1…N, each input audio signal (paragraph 41). Paragraphs 51-53 disclose CM includes cross correlation, which includes normalized correlation.).
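For illustration only, a comparison by normalized correlation followed by selection of the best-matching spectrogram can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from Lunner et al, Bell et al, or Yu: the spectrograms are small hypothetical arrays (rows as time frames, columns as frequency bins), the function names are invented for the example, and the mean-removed (Pearson-style) normalization is one common choice among several.

```python
import math


def normalized_correlation(a, b):
    """Mean-removed normalized correlation of two equally shaped spectrograms."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    if len(xs) != len(ys):
        raise ValueError("spectrograms must have the same shape")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(
        sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)
    )
    # A constant spectrogram has no variance; treat it as uncorrelated.
    return num / den if den > 0 else 0.0


def select_attended(attended, candidates):
    """Return the index of the candidate most correlated with `attended`,
    together with the score computed for every candidate."""
    scores = [normalized_correlation(attended, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical separated speaker spectrograms (2 frames x 3 bins).
speaker_a = [[1.0, 0.2, 0.1], [0.9, 0.3, 0.0]]
speaker_b = [[0.1, 0.8, 0.9], [0.0, 0.7, 1.0]]

# Hypothetical attended-speaker spectrogram derived from neural signals.
attended = [[0.2, 0.7, 1.0], [0.1, 0.6, 0.9]]

best_index, scores = select_attended(attended, [speaker_a, speaker_b])
```

In this example the second candidate (index 1) is selected because its energy pattern tracks the attended spectrogram, while the first candidate is anti-correlated with it.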
	Claim 31, Lunner et al discloses
	obtaining, by a device, a combined sound signal for signals combined from multiple sound sources in an area in which a person is located (Fig. 3a-4, label M1-Mu are microphones or multiple sound sources that receive multiple sound signals in an area of a person (person wearing device HD. Example shown in Fig. 4, label user). Paragraph 147 discloses the input processing comprises a weighting unit for combining the input signals (audio signals received via the microphones).);
	generating a combined sound spectrogram from the combined sound signal (paragraph 147 discloses FFT is found in the IU to provide each of the electronic (microphone) inputs mp (or weighted combinations thereof) and/or separated source signals si in a time frequency representation (mp[n,k] and si[n,k]).);
	applying, by a device, speech separation processing to the generated combined sound signal (Paragraph 147 discloses FFT is found in the IU to provide each of the electronic (microphone) inputs mp (or weighted combinations thereof) and/or separated source signals si in a time frequency representation (mp[n,k] and si[n,k]).) to derive multiple resultant speaker spectrograms (Fig. 3a, label sep. Paragraph 147 discloses the input unit (Fig. 3a, label IU) receives a plurality of sound signals from the microphones M1-MM, combines the input signals and outputs a number of output signals, s1-sN.) that each corresponds to different groups of multiple sound signals (paragraph 147 discloses the unit SEP “receives M electric input signals and provides as output N separated source signals s1, …, sN, ideally representing the speech signals provided by the N speakers.” Such paragraph also discloses outputting FFT processed separated source signals. Paragraph 46 discloses the source separation unit for separating one or more sound sources s in the sound field based on electric input signals and providing respective separated input audio signals Xs, from said one or more sound sources s. These paragraphs indicate that each signal corresponds to one or more sound sources or different groups of multiple sound signals.);
obtaining, by the device, neural signals for the person (Fig. 1, label Brain, E1-Ep are neural signals for the user such as the user shown in Fig. 4. Label BWM outputs the brain waves, e1-ej (paragraph 147).), the neural signals being indicative of one or more of the multiple sound sources the person is attentive to (Fig. 1, label 13,14,20 are electrodes connected to the person or user of the hearing device 1 (Fig. 1). Such device records the brain wave of the user or person as the person or user wears the hearing device and the microphones detect or receive the sound or audio in the environment or room of the user or person. Fig. 3a shows an example of the speakers or sound that is received via the microphones of the hearing device, label HD.); and

Lunner et al discloses source separation processing (Fig. 3, label IU) and Bell et al “An information maximization approach to blind separation and blind deconvolution” regarding microphone array techniques (paragraph 90), but fails to disclose the source separation processing is a neural network based speech separation processing. 
Bell et al discloses a neural network based speech separation processing (Section 2 discloses “The basic problem tackled here is how to maximize the mutual information that the output Y of a neural network processor contains about its input X. …”). 
It would be obvious to one skilled in the art before the effective filing date of the application to simply substitute one well-known element, the speech separation disclosed by Lunner, with another well-known element, blind source separation (i.e., blindly performed speech separation), a microphone array technique as disclosed by Bell et al, so as to obtain the predictable result of separated source signals.
Lunner et al discloses a combined signal of multiple voices (Fig. 3a, label SEP,IU) and selection of one of the plurality of separated sound signals based on the obtained neural signals (Paragraph 73 discloses “The goal is to find a pattern in the EEG signals ej(t) (j=1,…,J) that correlates strongly to s1(t) and much less to si(t), I not equal to 1.” Paragraph 149 discloses the hearing device is configured to “select a preferred one (or a 
 selecting one of the plurality of separated sound signals based on the obtained neural signal for the person comprises 
generating an attended speaker spectrogram based on the neural signals for the person; 
comparing the attended speaker spectrogram to the derived multiple resultant speaker spectrograms to select one of the multiple resultant speaker spectrograms; and 
transforming the selected one of the multiple resultant speaker spectrograms into an acoustic signal.
	Yu discloses
selecting one of the plurality of separated sound signals based on the obtained neural signals for the person (Paragraph 73 discloses “The goal is to find a pattern in the EEG signals ej(t) (j=1,…,J) that correlates strongly to s1(t) and much less to si(t), I not equal to 1.” Paragraph 149 discloses the hearing device is configured to “select a preferred one (or a preferred combination) of the directly received input audio signals by comparison with the concurrently recorded brainwave signals.”) comprises: 
generating an attended speaker spectrogram based on the neural signals for the person (paragraphs 50-54,42-45 discloses the correlation between brainwave, ej and audio signal si is performed for the audio input signal.);
comparing the attended speaker spectrogram to the derived multiple resultant speaker spectrograms to select one of the multiple resultant speaker spectrograms (paragraph 43 discloses “identify one of the input audio signals that has  the highest probability of being a target signal for the individual wearing the hearing device.” This 
transforming the selected one of the multiple resultant speaker spectrograms into an acoustic signal (Fig. 3, label OU. Paragraph 147 discloses “the output signal are fed to an output unit for being presented to a user and perceived as sound (and/or for being transmitted to another device.)”).
	It would be obvious to one skilled in the art before the effective filing date of the application to modify selection of the separated sound signal as disclosed by Lunner et al by incorporating the selection of the sound signal as disclosed by Yu et al so as to better provide the listener with the sound from the target speaker, hence enabling the listener to clearly hear the target speaker.
Claim 32, Lunner et al discloses comparing the attended speaker spectrogram to the derived multiple resultant speaker spectrograms using normalized correlation analysis (Paragraphs 42-45 disclose finding the highest probability of being a target signal for the user includes calculating the correlation measure. Paragraphs 50-53 disclose the correlation measure includes generating a power spectral density in the frequency domain of the audio input signals, si, for i = 1…N, each input audio signal (paragraph 41). Paragraphs 51-53 disclose CM includes cross correlation, which includes normalized correlation.).
	


Allowable Subject Matter
Claims 10-23,26-29 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Note: All rejections and objections must be overcome prior to placing the case in condition for allowance.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Sawada et al, A Robust and Precise Method of Solving the Permutation Problem of Frequency-Domain Blind Source Separation, and Zhan et al, Improvement of Mask Based Speech Source Separation Using DNN, both disclose blind source separation of a mixed signal.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to LINDA WONG whose telephone number is (571)272-6044.  The examiner can normally be reached on 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta can be reached on (571) 272-7453.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


/LINDA WONG/Primary Examiner, Art Unit 2656