Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
2.	In response to the office action mailed on 11/23/2021, applicant filed an amendment on 01/21/2022, amending claims 1, 6, 8-13, and 15-20.  Claims 4, 5, 7, and 14 are cancelled.  The pending claims are 1-3, 6, 8-13, and 15-20. 

Response to Arguments
3.	Applicant's arguments filed 01/21/2022 have been fully considered but they are not persuasive.
	As per claim 1, applicant argues that the prior art Chen only receives mixed speech signals from one multi-microphone device, and that Chen does not receive second data from a second multi-microphone device comprising second ambient sound sensed by the second multi-microphone device at a second location.
	The examiner notes that the prior art Chen teaches receiving first and second data from first and second multi-microphone devices comprising first and second ambient sound sensed by the first and second multi-microphone devices at first and second locations, respectively (see Fig. 9 and [0085], wherein microphone array 910 and microphone array 940 have microphones 904 and 905, as well as other microphones not shown in FIG. 9).
Applicant argues that the prior art Chen is limited to just determining a direction of a speaker in relation to the multi-microphone device.
The examiner notes that Chen teaches that the phase coefficients may provide some information as to the relative locations of the speakers. Because the microphones pick up each individual speaker differently, e.g., based on their respective locations, the phase coefficients or information derived therefrom, such as the IPD features as discussed earlier, can improve the ability of a speech separation model to distinguish between different speakers.
As per the rest of the claims, and the combinations of prior art references, applicant has no further arguments besides the ones mentioned above. Therefore, all the combinations of prior art references mentioned above are valid, and all other claims are rejected for the same reasons as set forth above.

Claim Rejections - 35 USC § 103
4.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 6, 8-13, and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Chen (US 2019/0318757) in view of Shin (US 2020/0051566).
As per claim 1, Chen teaches receiving first data from a first multi-microphone device comprising first ambient sound sensed by the first multi-microphone device at a first location (Fig. 9 and [0085], wherein first data is received from microphone array 910 at a first location); 
receiving second data from a second multi-microphone device comprising second ambient sound sensed by the second multi-microphone device at a second location (Fig. 9 and [0085], wherein second data is received from microphone array 940 at a second location); 
separating, from the first data, via a first deep neural network comprising a first predictive audio spectral mask in the first user device, at least one component of the first data selected from the group consisting of amplitude data corresponding to a first amplitude component of the first ambient sound and phase data corresponding to a first phase component of the first ambient sound ([0085]-[0086], which state that the speech separation module of Fig. 9 can perform any or all of the blocks of method 300, and that speech separation processing can be distributed across individual devices of system 900 in any fashion. Thus, the first data received from the first microphone array 910 is processed as in [0021] and [0026]-[0032], using the trained neural network as a speech separation model and applying the time-frequency masks to separate at least one component of the first data selected from the group consisting of amplitude data corresponding to a first amplitude component of the ambient sound sensed at a first microphone of the first multi-microphone device and phase data corresponding to a first phase component of the ambient sound sensed at the first microphone of the first multi-microphone device);
separating, from the second data, via the first deep neural network, at least one component of the second data selected from the group consisting of the amplitude data corresponding to a second amplitude component of the second ambient sound and the phase data ([0085]-[0086], which state that the speech separation module of Fig. 9 can perform any or all of the blocks of method 300, and that speech separation processing can be distributed across individual devices of system 900 in any fashion. Thus, the second data received from the second microphone array 940 is processed as in [0021] and [0026]-[0032], using the trained neural network as a speech separation model and applying the time-frequency masks to separate at least one component of the second data selected from the group consisting of amplitude data corresponding to a second amplitude component of the ambient sound sensed at a microphone of the second multi-microphone device and phase data corresponding to a second phase component of the ambient sound sensed at that microphone);
determining a location of origin of first target speech relative to a location of the first user device based on the at least one component of the first data, the at least one component of the second data, the first location, and the second location (based on [0085]-[0086], which state that the speech separation module of Fig. 9 can perform any or all of the blocks of method 300, and that speech separation processing can be distributed across individual devices of system 900 in any fashion, the speech separation models of [0071] can process various features, including magnitude and phase coefficients of microphone array 910 and microphone array 940 (Fig. 9), to provide location information of the speakers).
Chen does not explicitly disclose displaying, via the first user device, the location of origin of the first target speech relative to the first user device.
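Shin, in the same field of endeavor, teaches machine learning implemented as a deep neural network (DNN) including a plurality of hidden layers among artificial neural networks ([0039]), wherein the artificial intelligence device used can determine a sound source direction of audio data from audio data received from each of a plurality of microphones, and outputs visual information about the determined sound source direction through a display unit ([0248]-[0254]). Therefore, it would have been obvious at the time the application was filed to use Shin’s display feature with the system of Chen, in order to display the location of origin of the first target speech relative to the first user device. This would make the provided information more understandable and efficient.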
As per claim 2, Chen teaches wherein the first user device comprises a mobile device ([0082], mobile device).
As per claim 3, Chen teaches wherein the mobile device comprises a smartphone ([0025], smartphone).
As per claim 6, Chen teaches separating, from the first data, via second deep neural network, at least one additional component of the first data selected from the group consisting of the amplitude data corresponding to a third amplitude component of the first ambient sound and the phase data corresponding to a third phase component of the first ambient sound (in order to provide location information of different speakers as in [0071], processing and separating additional components corresponding to other speakers is necessarily disclosed and performed in the same manner as described above with regard to the process of [0085]-[0086] and [0026]-[0032]);
separating, from the second data, via the second deep neural network, at least one additional component of the second data selected from the group consisting of the amplitude data corresponding to a fourth amplitude component of the second ambient sound and the phase data corresponding to a fourth phase component of the second ambient sound (in order to provide location information of different speakers as in [0071], processing and separating additional components corresponding to other speakers is necessarily disclosed and performed in the same manner as described above with regard to the process of [0085]-[0086] and [0026]-[0032]);
determining, via the first user device and based on the at least one additional component of the first data, the at least one additional component of the second data, the first location, and the second location, the location of origin of second target speech relative to a location of the first user device (based on [0085]-[0086], which state that the speech separation module of Fig. 9 can perform any or all of the blocks of method 300, and that speech separation processing can be distributed across individual devices of system 900 in any fashion, the speech separation models of [0071] can process various features, including magnitude and phase coefficients of microphone array 910 and microphone array 940 (Fig. 9), to provide location information of the speakers).
Chen does not explicitly disclose displaying, via the first user device, the location of origin of the first target speech relative to the first user device.
Shin, in the same field of endeavor, teaches machine learning implemented as a deep neural network (DNN) including a plurality of hidden layers among artificial neural networks ([0039]), wherein the artificial intelligence device used can determine a sound source direction of audio data from audio data received from each of a plurality of microphones, and outputs visual information about the determined sound source direction through a display unit ([0248]-[0254]). Therefore, it would have been obvious at the time the application was filed to use Shin’s display feature with the system of Chen, in order to display the location of origin of the first target speech relative to the first user device. This would make the provided information more understandable and efficient.
As per claims 8-10, 13, system claims 8-10, 13 and method claims 1-2, 6 are related as apparatus and the method of using same, with each claimed element's function corresponding to the claimed method step.  Accordingly, claims 8-10, 13 are similarly rejected under the same rationale as applied above with respect to method claims 1-2, 6.  Further, Chen teaches a memory having instructions therein; and at least one processor in communication with the memory ([0005], [0089]-[0091]).
As per claim 11, Chen teaches receive third data from a third multi-microphone device comprising third ambient sound sensed by the third multi-microphone device at a third location; separate, from the third data, via the first deep neural network, at least one component of the third data selected from the group consisting of the amplitude data corresponding to a third (Fig. 9 and [0085], wherein the speech separation module can receive and process data from microphone array 910, client device 920, and microphone array 940, each of which can have microphones 904 and 905, as well as other microphones not shown in FIG. 9. The rest is rejected for the same reasons as set forth with regard to claim 8).
As per claim 12, Chen does not explicitly disclose wireless communication, from a first user device to a second user device, of a copy of the first deep neural network comprising a copy of the first predictive audio spectral mask.
Shin, in the same field of endeavor, teaches transmitting and receiving a learning model, or learned model (or an artificial neural network, see [0081]), and a control signal to and from external devices ([0056]-[0057], [0080]). Fig. 3, Fig. 9, and [0088]-[0094] show a plurality of devices, including a smartphone, exchanging information, i.e., a learning model, with each other. After receiving the learning model, the receiving device (second device) uses the learning model and generates the response or the control command based on the inference result. See also [0241], which states that speech synthesis server 43 can transmit a model learned through machine learning or deep learning to the terminal 100 periodically or in response to a request. Therefore, it would have been obvious at the time the application was filed to use Shin’s features of wirelessly exchanging neural network learning models between devices with the system of Chen, in order to communicate, to a second user device, a copy of the first deep neural network comprising a copy of the first predictive audio spectral mask; perform the claimed speech separation; and display, via the second user device, the location of origin of the first target speech relative to the second user device, as claimed. This would help distribute the workload among a plurality of devices capable of applying copies of a deep neural network to separate speech rapidly and efficiently.
As per claims 15-20, Chen teaches a computer readable medium ([0005]).  The remaining steps are rejected under the same rationale as applied to the method steps of rejected claims 8-13. 
Conclusion
5.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See PTO-892.
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABDELALI SERROU whose telephone number is (571)272-7638. The examiner can normally be reached M-F 9 AM - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir, can be reached at 571-272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.






/ABDELALI SERROU/            Primary Examiner, Art Unit 2659