Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
DETAILED ACTION
Claim Interpretation
1.	The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claim (claim 12) in this application is given its broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked. 
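As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 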
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 

Claim Rejections - 35 USC § 103
2.	Claims 1-22 are rejected under 35 U.S.C. 103 as being unpatentable over Ferris et al. (2020/0294500) in view of Uszkoreit et al. (2016/0203127) and further in view of Farrow et al. (2020/0059554).
As to claim 1, Ferris teaches a method of providing a translation-enabled, multiparty communications session, comprising: receiving an audio stream from each of the communication session participants ([0037-0038]); providing the audio stream to the communication session participants ([0024, 0092-0093] – where Ferris discusses a processor that facilitates a collaboration session between participants using audio data, image data, or video data during the collaboration session); generating a participant text stream for each participant by converting the audio stream received from each participant into a participant text stream in a first language using computer automated speech-to-text techniques ([0023-0024, 0038]); creating at least one translated participant text stream for each participant by translating each participant's text stream into a text stream in a second language ([0024, 0036-0038]); and publishing the translated participant text streams (abstract; [0024, 0087-0088]). Ferris does not explicitly discuss receiving a plurality of requests to join a translation-enabled, multiparty 
Uszkoreit teaches receiving a plurality of requests to join a translation-enabled, multiparty communications session from a corresponding plurality of communication session participants (claims 1 & 8; [0005-0006]).
Farrow teaches mixing the received audio streams from the communication session participants to create a mixed audio stream (Fig. 1, audio bridge mixer 124; [0025, 0027, 0031]); the contributions of multiple participants in telephony communications, such as telephone calls and conference calls ([0002]), are mixed and recorded in a single recording track (abstract; [0038]), and it would have been obvious that playing the recording to the communication session participants provides the audio stream to the communication session participants.
It would have been obvious before the effective filing date of the claimed invention to incorporate the teachings of Uszkoreit and Farrow into the teachings of Ferris for the purpose of translating from a source language to the preferred language of the user and mixing audio contributions of participants to record in a track of the multitrack recording in order to later play the mixed recording to the participants.
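For orientation only, the sequence of steps recited in claim 1 (receiving per-participant audio, mixing, speech-to-text conversion, translation, and publication) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of Ferris, Uszkoreit, Farrow, or the present application: the names `AudioChunk`, `GLOSSARY`, `speech_to_text`, `translate`, `mix`, and `run_session` are hypothetical, audio is represented by a list of words, and the translation step is a toy word lookup.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical toy glossary standing in for a machine-translation service.
GLOSSARY: Dict[Tuple[str, str], Dict[str, str]] = {
    ("en", "es"): {"hello": "hola", "everyone": "todos"},
}


@dataclass
class AudioChunk:
    participant: str
    language: str      # language spoken by the participant
    words: List[str]   # stand-in for audio samples (assumption)


def speech_to_text(chunk: AudioChunk) -> str:
    """Placeholder for automated speech-to-text: joins the stand-in words."""
    return " ".join(chunk.words)


def translate(text: str, source: str, target: str) -> str:
    """Word-by-word lookup; words missing from the glossary pass through unchanged."""
    table = GLOSSARY.get((source, target), {})
    return " ".join(table.get(word, word) for word in text.split())


def mix(chunks: List[AudioChunk]) -> List[str]:
    """Placeholder for an audio mixer: concatenates contributions in order."""
    return [word for chunk in chunks for word in chunk.words]


def run_session(chunks: List[AudioChunk], target_language: str):
    """Mix the audio, then build one translated text stream per participant."""
    mixed_audio = mix(chunks)
    published: Dict[str, str] = {}
    for chunk in chunks:
        text = speech_to_text(chunk)  # participant text stream, first language
        published[chunk.participant] = translate(
            text, chunk.language, target_language
        )  # translated participant text stream
    return mixed_audio, published
```

In this sketch each participant's text stream is derived from that participant's own audio contribution, while the mixed stream is produced separately for playback to all participants.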
As to claims 2 and 14, Ferris teaches the method of claim 1 and the system of claim 13, wherein creating at least one translated participant text stream for each participant comprises creating multiple translated participant text streams for each participant by translating each participant's text stream into multiple text streams in different languages ([0036-0038]).

As to claims 4 and 16, Ferris teaches the method of claim 1 and the system of claim 13, including translating and displaying the participant text stream for a communication session participant ([0023-0024, 0036, 0038]); publishing the translated participant text streams (abstract; [0024, 0087-0088]); and providing the audio stream to the communication session participants ([0024, 0092-0093] – where Ferris discusses a processor that facilitates a collaboration session between participants using audio data, image data, or video data during the collaboration session). Uszkoreit teaches receiving a plurality of requests to join a translation-enabled, multiparty communications session from a corresponding plurality of communication session participants (claims 1 & 8; [0005-0006]); and Farrow teaches mixing the received audio streams from the communication session participants to create a mixed audio stream (Fig. 1, audio bridge mixer 124; [0025, 0027]), with the command received via the API of a telephony services provider ([0025, 0027, 0035]); the contributions of multiple participants are mixed and 
As to claim 5, Uszkoreit teaches the method of claim 1, including receiving a plurality of requests to join a translation-enabled, multiparty communications session from a corresponding plurality of communication session participants (claims 1 & 8; [0005-0006]).  Farrow teaches receiving commands from communication session participants via a multiparty communication session application programming interface ([0025, 0027]). It would have been obvious to incorporate the teachings of Farrow into the teachings of Uszkoreit for the purpose of making requests via the API that include the participant’s telephone number as part of the caller ID information, so that the switch is able to identify which audio stream corresponds to the telephone number.
As to claim 6, Ferris teaches the method of claim 5, including receiving an audio stream from each of the communication session participants ([0037-0038]); Farrow teaches the API of a telephony services provider receiving a command from a customer to record the audio of a conference call in a multitrack recording ([0025, 0027]). It would have been obvious to incorporate the teachings of Farrow into the teachings of Ferris for the purpose of receiving the telephone number of each calling party as part of the caller ID information or data via the API.
As to claims 7 and 17, Farrow teaches the method of claim 6 and the system of claim 13, wherein mixing the received audio streams from the communication session participants to create a mixed audio stream (Fig. 1, audio bridge mixer 124; [0025, 
As to claims 8 and 18, Uszkoreit teaches the method of claim 1 and the system of claim 13, receiving a plurality of requests to join a translation-enabled, multiparty communications session from a corresponding plurality of communication session participants (claims 1 & 8; [0005-0006]); and Ferris teaches for each participant, an indication of the language that will be spoken by the participant during the multiparty communication session ([0024, 0035-0036 – determining that two languages are being spoken, 0048-0050 – determining the language being spoken during the meeting]).
As to claims 9 and 19, Ferris teaches the method of claim 1 and the system of claim 13, further comprising monitoring the received audio streams from the multiparty communication session participants to determine, for each participant, the language being spoken by the participant ([0024, 0035-0038]).
As to claim 10, Ferris teaches the method of claim 1, further comprising creating at least one translated participant audio stream for each participant by converting the participant's translated text stream into a translated audio stream using computer automated speech to text techniques ([0023-0024, 0036] – where Ferris discusses performing a speech-to-text conversion of audio data, and translating and displaying text for 
As to claim 11, Ferris teaches the method of claim 10, including creating at least one translated participant audio stream for each participant by converting the participant's translated text stream into a translated audio stream using computer automated speech to text techniques ([0023-0024, 0036] – where Ferris discusses performing a speech-to-text conversion of audio data, and translating and displaying text for a participant); identifying a participant and the number of participants by processing audio data and causing the data associated with the participant to be displayed at the portion of the display (Fig. 13, [0070]); and obtaining audio data, with a display for displaying portions viewable from a side of the device (abstract). Farrow teaches that an application programming interface of a telephony services provider receives a command from a customer to record the audio of a conference call in a multitrack recording and that the audio contributions of all other participants are to be mixed and recorded in a second track ([0025]). It would have been obvious to utilize the application programming interface to make the translated audio streams available to participants for the purpose of utilizing the application programming interface as a central placeholder for translated mixed participant audio streams, so that all participants can access the translated information easily.
Claim 12 is rejected for the same reasons discussed above with respect to claim 1. Furthermore, Uszkoreit teaches means (the virtual participant processor) for receiving 
As to claim 13, Ferris teaches a system for providing a translation-enabled, multiparty communications session, comprising: receiving an audio stream from each of the communication session participants ([0037-0038]); providing the audio stream to the communication session participants ([0024, 0092-0093] – where Ferris discusses a processor that facilitates a collaboration session between participants using audio data, image data, or video data during the collaboration session); a translation service that generates a participant text stream for each participant by converting the audio 
Uszkoreit teaches a first computing device, which includes software and a processor, generating requests to join a translation-enabled, multiparty communications session from a corresponding plurality of communication session participants (claims 1 & 8; [0005-0006]).
Farrow teaches that a telephony services provider includes an API which can be used by customers to interact with the telephony services provider, the customer using the API to issue commands that the telephony services provider then acts upon ([0020]); a media bridge (Fig. 1, media bridge) that is configured to mix the received audio streams from the communication session participants to create a mixed audio stream that enables the communication session API to provide the mixed audio stream to the communication session participants (Fig. 1, audio bridge mixer 124; [0025, 0027, 0031]); the contributions of multiple participants are mixed and recorded (of telephony 
It would have been obvious before the effective filing date of the claimed invention to incorporate the teachings of Uszkoreit and Farrow into the teachings of Ferris for the purpose of translating from a source language to the preferred language of the user and mixing audio contributions of participants to record in a track of the multitrack recording in order to later play the mixed recording to the participants.
Claim 20 is rejected for the same reasons discussed above with respect to claims 10-11.
As to claim 21, Ferris teaches the method of claim 1, wherein the generating step is performed by a transcription service that receives each individual participant's audio stream and that uses each individual participant's audio stream to generate a corresponding individual participant text stream ([0024, 0036, 0038] – processing audio data by performing a speech to text conversion).  
As to claim 22, Ferris teaches the system of claim 13, wherein the translation service receives each individual participant's audio stream and uses each individual participant's audio stream to generate a corresponding individual participant text stream ([0036, 0038] – processing audio data by performing a speech to text conversion).
Response to Arguments
3.	Applicant’s arguments with respect to claim(s) 1-22 have been considered but are moot because the new ground of rejection does not rely on any reference applied in 
With respect to independent claims 1 and 12-13, Applicant mainly argues that Ferris generates a consolidated audio stream that includes the contributions of all participants and does not create individual participant text streams based on individual audio streams received from each participant. Examiner respectfully submits that Ferris teaches parsing the consolidated audio stream to separate out and identify the portions of the consolidated audio stream that were contributed by each individual participant, marking each individual participant’s contribution to indicate the participant to which it belongs ([0037-0038, 0051]), and converting the audio stream received from each participant into a participant text stream in a first language using automated speech-to-text techniques ([0021, 0024, 0036, 0038]). Applicant further agrees and confirms that Ferris carefully parses the consolidated audio stream to separate out and identify the portions of the consolidated audio stream that were contributed by each individual participant, marks each individual participant’s contribution to indicate the participant to which it belongs ([0037-0038, 0051]), and converts the audio stream received from each participant into a participant text stream in a first language using automated speech-to-text techniques ([0021, 0036, 0038]). Hence, Ferris discloses methods that involve generating individual participant text streams based on individual audio streams received from each participant, as recited in the claims.
With respect to claim 8, Applicant argues that Ferris does not teach that the received requests to join a translation-enabled, multiparty communications session include, for each participant, an indication of the language that will be spoken by the participant 
Conclusion
4.	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
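As an illustrative aid only, the reply-period rule stated above can be expressed as a short calculation. This is a minimal sketch under stated assumptions and is not an official tool: the function names `add_months` and `reply_period_end` are hypothetical, clamping to the last day of a shorter month is an assumption of this sketch, and weekends, Federal holidays, and extension fees under 37 CFR 1.136(a) are not modeled.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(start: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to month end (assumption)."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def reply_period_end(
    mailing: date,
    first_reply: Optional[date] = None,
    advisory_mailed: Optional[date] = None,
) -> date:
    """End of the shortened statutory period as described in the paragraph above."""
    three_months = add_months(mailing, 3)
    six_months = add_months(mailing, 6)
    end = three_months
    # First reply within TWO MONTHS and advisory action mailed after the
    # THREE-MONTH period: the period expires on the advisory mailing date.
    if (
        first_reply is not None
        and advisory_mailed is not None
        and first_reply <= add_months(mailing, 2)
        and advisory_mailed > three_months
    ):
        end = advisory_mailed
    # In no event later than SIX MONTHS from the date of the final action.
    return min(end, six_months)
```

For example, with a hypothetical mailing date of January 10, 2023, a first reply on February 20, 2023, and an advisory action mailed on April 25, 2023, the sketch returns April 25, 2023; with no reply filed within two months it returns April 10, 2023.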
5.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to Quynh H. Nguyen whose telephone number is (571)272-7489.  The examiner can normally be reached on Monday-Friday 7AM-3PM.  If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ahmad Matar can be reached on 571-272-7488.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Any response to this action should be mailed to:
                        Commissioner of Patents and Trademarks
                        P.O. Box 1450
                        Alexandria, VA  22313-1450

Or faxed to:

                    (571) 273-8300, for formal communications intended for entry and for
                          informal or draft communications (please label “PROPOSED” or “DRAFT”).
                             
 Hand-delivered responses should be brought to: 

                         Customer Service Window 
                         Randolph Building 
                         401 Dulany Street 
                         Alexandria, VA 22314



/QUYNH H NGUYEN/
Primary Examiner, Art Unit 2652