Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
2.	The submission of the information disclosure statement (IDS) complies with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Double Patenting
3.	The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees.   A nonstatutory obviousness-type double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and  In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. 
Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer. A terminal disclaimer signed by the assignee must fully comply with 37 CFR 3.73(b).
The USPTO internet Web site contains terminal disclaimer forms which may be used.  Please visit http://www.uspto.gov/forms/.  The filing date of the application will determine what form should be used.  A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission.  For more information about eTerminal Disclaimers, refer to http://www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.  


Claims 1-20 are rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 1-20 of U.S. Patent No. 10777202.  Although the conflicting claims are not identical, they are not patentably distinct from each other because the claims of the instant application merely broaden the scope of the claims of the Patent by eliminating elements of those claims and their functions. It has been held that the omission of an element and its function is an obvious expedient if the remaining elements perform the same function as before. In re Karlson, 136 USPQ 184 (CCPA).  Also note Ex parte Rainu, 168 USPQ 375 (Bd.App.1969); omission of a reference element whose function is not needed would be obvious to one skilled in the art.
Claim 1 from both sets of claims is shown below:

Current Application
1. A method comprising:
    receiving, by a speech presentation system, a simulated binaural audio signal merging together a plurality of concurrent speech instances originating from a plurality of different speakers speaking concurrently;
    receiving, by the speech presentation system, acoustic propagation data associated with the simulated binaural audio signal, the acoustic propagation data representative of respective propagation effects applied, within the simulated binaural audio signal, to each of the concurrent speech instances to simulate propagation of the concurrent speech instances;
    extracting, by the speech presentation system from the simulated binaural audio signal and based on the acoustic propagation data, a different auto-transcribable speech signal for each of the plurality of concurrent speech instances merged together in the simulated binaural audio signal; and
    generating, by the speech presentation system based on the extracted auto-transcribable speech signals, a different closed captioning dataset for each of the plurality of concurrent speech instances merged together in the simulated binaural audio signal.

The patent
    receiving, by a speech presentation system, a simulated binaural audio signal associated with a media player device that is presenting an artificial reality world to a user of the media player device, wherein: the simulated binaural audio signal is representative of a simulation of sound propagating to an avatar representing the user within the artificial reality world, and the simulated binaural audio signal merges from different positions within the artificial reality world;
    receiving, by the speech presentation system, acoustic propagation data representative of a simulated propagation effect that is applied, within the simulated binaural audio signal, to speech originating from the different speakers in order to simulate propagation of the speech to the avatar, the simulated propagation effect including one or more of a reverberation effect to simulate natural echoes or an attenuation effect to simulate a natural drop-off of a volume of the speech;
    extracting, by the speech presentation system from the simulated binaural audio signal and based on the acoustic propagation data, a plurality of auto-transcribable speech signals representative of the speech originating from the different speakers
    generating, by the speech presentation system based on the plurality of auto-transcribable speech signals, a plurality of closed captioning datasets representative of the plurality of concurrent speech instances originating from the plurality of different speakers; and
    providing, by the speech presentation system to the media player device, the plurality of closed captioning datasets.






Claim Rejections - 35 USC § 103
3.	In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
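The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
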
Claims 1-3, 7-9 are rejected under 35 U.S.C. 103 as being unpatentable over Lyren (US 2018/0249274) in view of Chen (US 2019/0318757), and further in view of Ota (US 2017/0277257).

As per claim 1, Lyren teaches receiving, by a speech presentation system, a simulated binaural audio signal merging together a plurality of concurrent speech instances originating from a plurality of different speakers speaking concurrently ([0224], receiving a speech signal originating from a first and second speaker within the Virtual Reality (VR) telephony space);
receiving, by the speech presentation system, acoustic propagation data associated with the simulated binaural audio signal, the acoustic propagation data representative of respective propagation effects applied, within the simulated binaural audio signal, to each of the concurrent speech instances to simulate propagation of the concurrent speech instances (Lyren, [0100], extracting by the performance enhancer aspects affecting propagation of the sound originating from one or more of the virtual sound sources. For example, performance enhancer predicts one or more head paths and one or more virtual sound source paths for each virtual sound source and calculates HRTF paths of the virtual sound sources for each combination); 
extracting, by the speech presentation system from the simulated binaural audio signal and based on the acoustic propagation data, a different auto-transcribable speech signal for each of the plurality of concurrent speech instances merged together in the simulated binaural audio signal ([0100], extracting sound signals originating from one or more of the virtual sound sources for enhancement.  According to applicant’s specification at [0021], an auto-transcribable speech signal may refer to an audio signal that is derived from the simulated binaural audio signal, but that only includes speech from one speaker (rather than multiple speakers who may be speaking concurrently within the artificial reality world), and that has had various aspects affecting the propagation of sound (e.g., noise, echoes, reverberation, distance-based attenuation, etc.) removed).
Lyren may not explicitly disclose an audio signal merging a plurality of concurrent speech instances originating from a plurality of different speakers speaking concurrently.
Chen in the same field of endeavor teaches an audio signal merging a plurality of concurrent speech instances originating from a plurality of different speakers speaking concurrently ([0027], wherein mixed speech signals can be obtained from multiple microphones, and also an individual audio signal from a given microphone may be considered "mixed" by virtue of having utterances, potentially concurrent or overlapping, by multiple speakers in the individual audio signal).  Therefore, it would have been obvious at the time the application was filed to use Chen’s above feature with the system of Lyren, in order to benefit from an automated speech separation system that can separate an audio signal with multiple speakers into separate audio signals for individual speakers, and thereby provide reliable speech recognizers and better quality transcriptions (Chen, [0001]).
Lyren in view of Chen does not explicitly disclose generating, by the speech presentation system based on the extracted auto-transcribable speech signals, a different closed captioning dataset for each of the plurality of concurrent speech instances merged together in the simulated binaural audio signal.
Ota in the same field of endeavor teaches an automatic speech recognition system that generates closed captions of a speaker within a virtual reality world and presents the generated closed caption in the form of a speech bubble above the head of the speaker ([0045], [0052]).
As per claim 2, Lyren teaches wherein the plurality of different speakers are included within an artificial reality world that is presented to a user by a media player device ([0431]); and 
the plurality of different speakers include: a first speaker included on a media content presentation that is presented within the artificial reality world, and a second speaker implemented by an avatar within the artificial reality world, the avatar associated with an additional user to whom the artificial reality world is presented by an additional media player device (additional avatars with additional media players as in paragraphs [0190]-[0191], in consideration of the term definitions provided by paragraphs [0431] and [0469]-[0473], wherein a speaking robot or avatar shaped like a human plays music and watches videos on home entertainment systems).
As per claim 3, Lyren teaches wherein the extracting of the different auto-transcribable speech signals from the simulated binaural audio signal includes identifying a plurality of features of sound represented by the simulated binaural audio signal; and the extracting of the different auto-transcribable speech signals from the simulated binaural audio signal is further based on the identified plurality of features of the sound represented by the simulated binaural audio signal ([0100], extracting sound signals originating from one or more of the virtual sound sources for enhancement; and [0078], the HRTF path includes the information of the motion of the head… transforms the HRTF path to a head path).
As per claim 7, Lyren teaches wherein the respective propagation effects applied to each of the concurrent speech instances include one or more of: a reverberation effect to simulate natural echoes of one of the concurrent speech instances; or an attenuation effect to simulate a natural drop-off of a volume of one of the concurrent speech instances ([0333], wherein a reverberation effect is applied).
As per claim 8, Lyren teaches wherein the simulated binaural audio signal is associated with a media player device that is presenting an artificial reality world to a user of the media player device (media players as in paragraphs [0190]-[0191], in consideration of the term definitions provided by paragraphs [0469]-[0473]); and is representative of a simulation of sound propagating to an avatar representing the user within the artificial reality world (Lyren, [0098], receiving a speech signal originating from a speaker within the Virtual Reality (VR) telephony space).
As per claim 9, Lyren teaches a media player device presenting, on a display screen associated with the media player device and upon which the artificial reality world is presented, the transcription dataset in real time as the speech originates from the speaker within the artificial reality world ([0015]-[0016] and [0020], closed caption above the speaker's head).
Lyren does not explicitly disclose providing, by the speech presentation system to the media player device, the different closed captioning datasets for the plurality of concurrent speech instances; wherein the providing is configured to allow the media player device to present, on a display screen associated with the media player device and upon which the artificial 
Ota in the same field of endeavor teaches an automatic speech recognition system that generates closed captions of a speaker within a virtual reality world and presenting the generated closed caption in the form of a speech bubble above the head of the speaker ([0045], [0052]).  Therefore, it would have been obvious at the time the application was filed to use the transcription feature of Ota with the system of Lyren to perform the above claimed steps, in order to generate a closed captioning dataset representative of the speech originating from the speaker; and provide the closed captioning dataset.  This would increase users’ convenience by adding another way of presenting transcription data via displaying the output data as a readable transcription of the audio data to the user.
As per claims 11-13, 17-19, system claims 11-13, 17-19 and method claims 1-3, 7-9 are related as apparatus and the method of using same, with each claimed element's function corresponding to the claimed method step.  Accordingly, claims 11-13, 17-19 are similarly rejected under the same rationale as applied above with respect to method claims 1-3, 7-9. 
Further Lyren teaches a memory storing instructions, and a processor communicatively coupled to the memory ([0477]).
As per claim 20, Lyren teaches a computer readable medium ([0477]).  The remaining steps are rejected under the same rationale as applied to the method steps of rejected claim 1.

Claims 4 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Lyren in view of Chen and Ota, and further in view of Wu (US 10964315).
As per claims 4 and 14, Lyren in view of Chen and Ota teaches that a plurality of different speakers speaking concurrently includes a first speaker and a second speaker (Lyren, [0224], receiving a speech signal originating from a first and second speaker within the Virtual Reality (VR) telephony space), and using a machine learning technique for associating voices with corresponding speakers (Lyren, [0478]; Chen, [0026], [0049], [0070]).
Lyren in view of Chen and Ota does not explicitly disclose wherein the identified plurality of features of the sound include cepstral coefficients associated with a first voice of the first speaker and cepstral coefficients associated with a second voice of the second speaker; and the extracting of the different auto-transcribable speech signals from the simulated binaural audio signal further includes employing a machine learning technique to: associate the first voice with the first speaker based on the cepstral coefficients associated with the first voice, and associate the second voice with the second speaker based on the cepstral coefficients associated with the second voice. Wu in the same field of endeavor teaches using cepstral coefficients for processing digitized audio signals (col. 3, lines 27-44).  Therefore, it would have been obvious at the time the application was filed to use Wu’s feature of using cepstral coefficients with the system of Lyren in view of Chen and Ota, in order to perform the claimed steps.  This would improve speech signals processing and recognition accuracy, especially in reverberant environments.

Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Lyren in view of Chen and Ota, and further in view of Schultz (US 2005/0084116).
As per claims 5 and 15, Lyren in view of Chen and Ota teaches that a plurality of different speakers speaking concurrently includes a first speaker and a second speaker (Lyren, [0224], receiving a speech signal originating from a first and second speaker within the Virtual Reality (VR) telephony space).
Lyren in view of Chen and Ota does not explicitly disclose that the identified plurality of features of the sound include a first root-mean-square (RMS) magnitude associated with speech from the first speaker and a second RMS magnitude associated with speech from the second speaker, each RMS magnitude indicating a proximity of a particular speaker and a listener to whom speech from the particular speaker propagates; and the extracting of the different auto-transcribable speech signals from the simulated binaural audio signal further includes differentiating the speech from the first speaker and the speech from the second speaker based on the first and second RMS magnitudes.
Schulz in the same field of endeavor teaches a system that uses root-mean-square (RMS) magnitudes to differentiate the speech from a first speaker and speech from a second speaker ([0032]). Therefore, it would have been obvious at the time the application was filed to use Schulz's above features with the system of Lyren in view of Chen and Ota, in order to differentiate the speech from the first speaker and speech from the second speaker, as claimed.  This would reduce the magnitude of the acoustic reflections, and provide strong signals.

Claims 6 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Lyren in view of Chen and Ota, and further in view of Morishita (US 20170257723).
As per claims 6 and 16, Lyren in view of Chen and Ota teaches that a plurality of different speakers speaking concurrently includes a first speaker and a second speaker (Lyren, [0224], receiving a speech signal originating from a first and second speaker within the Virtual Reality (VR) telephony space).

Morishita in the same field of endeavor teaches managing audio signals within a user's perceptible audio environment, wherein an interaural level difference (ILD) cue or an interaural time difference (ITD) cue is used to indicate a direction from which speech from a particular speaker originates with respect to a listener to whom the speech from the particular speaker propagates ([0029], [0048], [0123]). Therefore, it would have been obvious at the time the application was filed to use Morishita's above features with the system of Lyren in view of Chen and Ota, in order to differentiate the speech from the first speaker and speech from the second speaker, as claimed.  This would reduce the magnitude of the acoustic reflections, and provide strong signals.

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Lyren in view of Chen and Ota, and further in view of Chand (US 2019/0130629).
As per claim 10, Lyren in view of Chen and Ota does not explicitly disclose that the particular closed captioning dataset presented on the display screen is for a speech instance originating from an off-screen speaker not depicted on the display screen as the particular closed captioning dataset is presented; and an indicator associated with the particular closed captioning dataset being presented points in a general direction toward the off-screen speaker.
Chand in the same field of endeavor teaches that the particular closed captioning dataset presented on the display screen is for a speech instance originating from an off-screen speaker not depicted on the display screen as the particular closed captioning dataset is presented; and an indicator associated with the particular closed captioning dataset being presented points in a general direction toward the off-screen speaker (Fig. 5, [0048], [0059]). Therefore, it would have been obvious at the time the application was filed to use Chand’s above feature with the system of Lyren in view of Chen and Ota, in order to provide better user interfaces and make existing social networking sites more accurate.
Conclusion
4.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See PTO-892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABDELALI SERROU whose telephone number is (571)272-7638. The examiner can normally be reached M-F, 9 AM - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ABDELALI SERROU/            Primary Examiner, Art Unit 2659