DETAILED ACTION
Claim Status
Claim 12 has been canceled. New claim 21 has been added. Claims 1-11 and 13-21 are pending in the application.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 3-7 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 3 recites the limitation "the audio profile" in line 3. There is insufficient antecedent basis for this limitation in the claim. Claims 4-7 are rejected as being dependent on claim 3.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-2, 10-11, 15-16, 18-19, and 21 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-2, 13-14, and 18 of U.S. Patent No. 11082465 in view of Lenke (US 20200110572 A1).
For example, 
Regarding claim 1 of the instant application, claim 1 of U.S. Patent No. 11082465 teaches a conference server, comprising: (Claim 1: A video conference server)
a network interface to a network; (Claim 1: a network interface to a network)
a storage component comprising a non-transitory storage device; (Claim 1: a storage component comprising a non-transitory storage device)
a processor, comprising at least one microprocessor; and (Claim 1: a processor, comprising at least one microprocessor)
wherein the processor, upon accessing machine-executable instructions, cause the processor to perform: (Claim 1: wherein the processor, upon accessing machine-executable instructions, is caused to)
broadcasting conference content, via the network, to each of a plurality of endpoints and wherein the conference content comprises an audio portion received from a contributing endpoint of the plurality of endpoints; (Claim 1: broadcast conference content, via the network, to each of a plurality of endpoints, wherein the broadcasted conference content comprises an audio portion and a video portion received from each of the plurality of endpoints)
determining whether the audio portion is extraneous to the conference content; and (Claim 1: determine whether a corresponding audio portion is extraneous to the broadcasted conference content)
upon determining that the audio portion is extraneous to the conference content, executing a muting action to exclude the audio portion from the conference content. (Claim 1: )
Claim 1 of U.S. Patent No. 11082465 does not explicitly disclose the audio portion comprises human speech that is extraneous to the conference content.
However, Lenke teaches the audio portion comprises human speech that is extraneous to the conference content. ([0015]: the classifier can be trained to determine whether first, the audio is speech and second, whether the audio is intended for a conference call or a video communication session. [0034]: speech to another person in the room, and thus not intended for the communication session.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify claim 1 of U.S. Patent No. 11082465 to include the above limitations. One would have been motivated to do so because conference attendees often need to mute or unmute to hide background noise and sometimes forget to do so promptly. An improved telephone or communication system with respect to the operation of the mute feature is therefore desirable. In one example, the system determines that the attendee is speaking and automatically unmutes the attendee's device. In another example, the system detects non-speech audio such as the movement of papers, typing on a keyboard, animal noises or children's noises, and so forth, and automatically mutes the microphone, as taught by Lenke, [0002]-[0012].

The same rationale applies to the rejection of independent claims 10 and 15. Claim 18 of U.S. Patent No. 11082465 in view of Lenke (US 20200110572 A1) teaches claim 10 of the instant application.

Claim 21 recites limitations substantially as found in claim 10. Claim 18 of U.S. Patent No. 11082465 in view of Lenke (US 20200110572 A1) teaches claim 21 of the instant application.

Claim 2 of U.S. Patent No. 11082465 teaches claim 2 of the instant application. Claim 18 of U.S. Patent No. 11082465 teaches claim 11 of the instant application. Claim 14 of U.S. Patent No. 11082465 teaches claim 16 of the instant application. Claim 18 of U.S. Patent No. 11082465 teaches claim 18 of the instant application. Claim 18 of U.S. Patent No. 11082465 teaches claim 19 of the instant application.

Claims 3 and 5 are rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 11082465 in view of Lenke (US 20200110572 A1), and further in view of Shaw (US 20050288930 A1).
Regarding claim 3 of the instant application, claim 1 of U.S. Patent No. 11082465 and Lenke teach the conference server of claim 1.
Claim 1 of U.S. Patent No. 11082465 and Lenke do not explicitly disclose accessing an audio profile of a participant, wherein in the audio profile characterizes speech provided by the participant while contributing speech to the conference content.
However, Shaw teaches accessing an audio profile of a participant, wherein in the audio profile characterizes speech provided by the participant while contributing speech to the conference content. ([0035]: the enrollment process may be used to establish a unique user 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify claim 1 of U.S. Patent No. 11082465 to include the above limitations. One would have been motivated to do so because, in prior art systems, users with unique manners of speech, regional accents, dialects, foreign accents, speech impediments or the like have faced difficulty in voice recognition. It is desirable to “train” a voice recognition system to recognize different speech patterns and sounds, as taught by Shaw, [0007].

Similar rationales apply to claim 12 of the instant application. Claim 12 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 18 of U.S. Patent No. 11082465 in view of Lenke (US 20200110572 A1), and further in view of Shaw (US 20050288930 A1).

Regarding claim 5 of the instant application, claim 1 of U.S. Patent No. 11082465, Lenke and Shaw teach the conference server of claim 3.
Claim 1 of U.S. Patent No. 11082465 does not explicitly disclose wherein the processor determines that the audio portion is extraneous to the conference content upon determining that at least one of the speaking volume, pitch, range, tone, or pace of speaking of the audio portion differs from the at least one of speaking volume, pitch, range, tone, or pace of speaking of the audio profile and that the difference is greater than a previously determined threshold.
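However, Lenke teaches wherein the processor determines that the audio portion comprise human speech that is extraneous to the conference content upon determining that at least one of the speaking volume, pitch, range, tone, or pace of speaking of the audio portion differs from the at least one of speaking volume, pitch, range, tone, or pace of speaking of the audio profile and that the difference is greater than a previously determined threshold. ([0015]: Detecting whether the first user is speaking and intending to speak in the conference can be based at least in part on one or more of a voice detection module, facial recognition data, gaze detection data, background noise, motion detection, and audio volume data. The classifier may also be trained on the volume of the speech. [0035]: The speech may be at a certain volume, or particular words might be used that relate to the conference or not, and thus the content of the speech may be used to determine user intent to be part of the conference.)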
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify claim 1 of U.S. Patent No. 11082465 to include the above limitations. One would have been motivated to do so because detecting whether the first user is speaking and intending to speak in the conference can be based at least in part on one or more of a voice detection module, facial recognition data, gaze detection data, background noise, motion detection, and audio volume data, as taught by Lenke, [0015].

Claim 4 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 11082465 in view of Lenke (US 20200110572 A1), and in view of Shaw (US 20050288930 A1), and further in view of Lim (US 20010003173 A1).
Regarding claim 4 of the instant application, claim 1 of U.S. Patent No. 11082465, Lenke and Shaw teach the conference server of claim 3.
Shaw teaches accessing the audio profile of the participant.

However, Lim teaches the audio profile comprising at least one of speaking volume, pitch, range, tone, or pace of speaking; and ([0025]:  the voice recognition processing unit, after repeatedly inputted with a specific voice range, obtains reference voice models of the voice data via the range and feature of the voice data and stores each of the reference voice models (e.g. audio profile) into a memory.)
determining whether the audio portion is extraneous to the conference content, further comprising, determining that at least one of the speaking volume, pitch, range, tone, or pace of speaking of the audio portion differs from the at least one of speaking volume, pitch, range, tone, or pace of speaking of the audio profile. ([0013]: detecting the range and characteristics of the received voice data; comparing the range and characteristics of the detected voice data with the characteristics of the previously obtained reference voice model (e.g. audio profile).)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Claim 1 of U.S. Patent No. 11082465, Lenke and Shaw to include above limitations. One would have been motivated to do so because the foregoing voice recognition system of the prior art discriminates the entered voices by the previously established reference voice model. Therefore, when the reference voice model is erroneously established due to noise, incorrect pronunciation of the user or etc. in establishing the reference 

Similar rationales apply to claim 13 and claim 17 of the instant application. Claim 13 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 18 of U.S. Patent No. 11082465 in view of Lenke (US 20200110572 A1), and in view of Shaw (US 20050288930 A1), and further in view of Lim (US 20010003173 A1). Claim 17 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 13 of U.S. Patent No. 11082465 in view of Lenke (US 20200110572 A1), and in view of Shaw (US 20050288930 A1), and further in view of Lim (US 20010003173 A1).

Claim 6 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 11082465 in view of Lenke (US 20200110572 A1), and in view of Shaw (US 20050288930 A1), and in view of Lim (US 20010003173 A1), and further in view of Weisman (US 20040047461 A1).

Lim teaches wherein the audio profile comprises at least one of the speaking volume, pitch, range, tone, or pace of speaking as sampled from the conference content. ([0013]: detecting the range and characteristics of the received voice data; comparing the range and characteristics of the detected voice data with the characteristics of the previously obtained reference voice model (e.g. audio profile).)
Claim 1 of U.S. Patent No. 11082465, Lenke, Shaw and Lim do not explicitly disclose that follows the participant being addressed by name by another participant associated with a different one of the plurality of endpoints.
However, Weisman teaches that follows the participant being addressed by name by another participant associated with a different one of the plurality of endpoints. ([0049]: to designate a participant as the speaker, and to pass such designation to subsequent participants. [0162]: As the current speaker is voluntarily yielding to another participant he has selected, the outgoing speaker may customarily say, “And with that, I pass to WizKid,” or some similar statement.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify claim 1 of U.S. Patent No. 11082465, Lenke, Shaw and Lim to include the above limitations. One would have been motivated to do so because, in a conference call, there is a need for a way to designate a participant as the speaker, and to pass such designation to subsequent participants, as taught by Weisman, [0049].

Similar rationales apply to claim 14 and claim 20 of the instant application. Claim 14 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 18 of U.S. Patent No. 11082465 in view of Lenke (US 20200110572 A1), and in view of Shaw (US 20050288930 A1), and in view of Lim (US 20010003173 A1), and further in view of Weisman (US 20040047461 A1). Claim 20 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 13 of U.S. Patent No. 11082465 in view of Lenke (US 20200110572 A1), and in view of Shaw (US 20050288930 A1), and in view of Lim (US 20010003173 A1), and further in view of Weisman (US 20040047461 A1).

Claim 7 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 11082465 in view of Lenke (US 20200110572 A1), and in view of Shaw (US 20050288930 A1), and further in view of Weisman (US 20040047461 A1).
Regarding claim 7 of the instant application, claim 1 of U.S. Patent No. 11082465, Lenke and Shaw teach the conference server of claim 3.
Claim 1 of U.S. Patent No. 11082465, Lenke and Shaw do not explicitly disclose wherein the processor determines that the audio profile of the participant upon detecting the conference content comprises a name and, following the name, hearing speech from the participant.
However, Weisman teaches wherein the processor determines that the audio profile of the participant upon detecting the conference content comprises a name and, following the name, hearing speech from the participant. ([0049]: to designate a participant as the speaker, and to pass such designation to subsequent participants. [0162]: As the current speaker is voluntarily yielding to another participant he has selected, the outgoing speaker may customarily say, “And 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify claim 1 of U.S. Patent No. 11082465, Lenke and Shaw to include the above limitations. One would have been motivated to do so because, in a conference call, there is a need for a way to designate a participant as the speaker, and to pass such designation to subsequent participants, as taught by Weisman, [0049].

Claim 8 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 11082465 in view of Lenke (US 20200110572 A1), and further in view of IP.COM (Technique for detection of a person in a conference call, to detect a user action requiring unmute/mute, and to do it automatically).
Regarding claim 8 of the instant application, claim 1 of U.S. Patent No. 11082465 and Lenke teach the conference server of claim 1.
Claim 1 of U.S. Patent No. 11082465 and Lenke do not explicitly disclose accessing an audio profile of a participant, wherein in the audio profile characterizes speech provided by the participant with regard to a sound attribute comprising a first spoken language; and determining whether the audio portion comprises human speech that is extraneous to the conference content, further comprising, determining if the audio portion comprises a second spoken language.
However, IP.COM teaches accessing an audio profile of a participant, wherein in the audio profile characterizes speech provided by the participant with regard to a sound attribute comprising a first spoken language; and determining whether the audio portion comprises human speech that is extraneous to the conference content, further comprising, determining if the audio 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify claim 1 of U.S. Patent No. 11082465 and Lenke to include the above limitations. One would have been motivated to do so because, where the conference is conducted in English and a user is having a side conversation with his/her family (at home) in his/her native language (Hindi or German), the user wants the unmute to happen only when he/she starts to speak in English. This allows the user to multi-task, as taught by IP.COM, Page 2.

Claim 9 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 11082465 in view of Lenke (US 20200110572 A1), and further in view of Shawn T. (Muting Yourself and Participants in Webex).
Regarding claim 9 of the instant application, claim 1 of U.S. Patent No. 11082465 and Lenke teach the conference server of claim 1.
Claim 1 of U.S. Patent No. 11082465 and Lenke do not explicitly disclose wherein the processor further performs, causing each of the plurality of endpoints to present indicia of the muting action associated with the contributing endpoint.
However, Shawn T. teaches wherein the processor further performs, causing each of the plurality of endpoints to present indicia of the muting action associated with the contributing )
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify claim 1 of U.S. Patent No. 11082465 and Lenke to include the above limitations. One would have been motivated to do so because, in order to prevent unwanted noise in the meeting, event, or training session, participants may be muted or unmuted by the host, and it is desirable for the host to know who is muted or unmuted.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1-2, 10-12, 15-16, 18-19, and 21 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Lenke (US 20200110572 A1).
Regarding claim 1, Lenke teaches a conference server, comprising: 
a network interface to a network; a storage component comprising a non-transitory storage device; a processor, comprising at least one microprocessor; and wherein the processor, upon accessing machine-executable instructions, cause the processor to perform:
broadcasting conference content, via the network, to each of a plurality of endpoints and wherein the conference content comprises an audio portion received from a contributing endpoint of the plurality of endpoints; (Fig. 2. Abstract: managing a mute and unmute feature on a device which is used to communicate data in a communication conference.)
determining whether the audio portion comprises human speech that is extraneous to the conference content; and ([0015]: the classifier can be trained to determine whether first, the audio is speech and second, whether the audio is intended for a conference call or a video communication session. [0034]: speech to another person in the room, and thus not intended for the communication session.)
upon determining that the audio portion comprises human speech that is extraneous to the conference content, executing a muting action to exclude the audio portion from the conference content. ([0015]: the classifier can be trained to determine whether first, the audio is speech and second, whether the audio is intended for a conference call or a video communication session. Speaker identification can also be included in the analysis to insure that the speech is from the expected individual who is participating in a conference session. If the speaker identification component determines that the user is not an expected participant in the call, then the mute/unmute decision can take that into account and likely mute the device for that speaker.)

Regarding claim 2, Lenke teaches the conference server of claim 1.


Regarding claim 10, Lenke teaches a conference server, comprising: 
a network interface to a network; a storage component comprising a non-transitory storage device; a processor, comprising at least one microprocessor; and wherein the processor, upon accessing machine-executable instructions, cause the processor to perform: (Fig. 1-2. [0018]: The step of detecting whether the background noise exists in the communication conference at the predetermined threshold can be performed on a network-based server.)
broadcasting conference content, via the network, to each of a plurality of endpoints and wherein the conference content comprises an audio portion received from a contributing endpoint of the plurality of endpoints; (Fig. 2. Abstract: managing a mute and unmute feature on a device which is used to communicate data in a communication conference.)
determining whether the audio portion is muted, wherein the processor receives the audio portion from the contributing endpoint and omits the audio portion from the conference content; ([0003]: attendees often will place their phones on mute to hide the background noise and forget that they are set on mute. In this scenario, an attendee may start speaking for a period of time, assuming that other participants in the conference can hear them, when in fact they cannot be heard because the phone is set to mute.) 

wherein the audio portion comprises encoded sound and wherein the processor determines the contributing endpoint is erroneously muted further comprising the encoded sound comprises human speech; ([0034]: to distinguish not only whether the audio is speech versus background noise, but what is the type of speech with respect to whether the speech is intended or likely to be intended as part of the communication session as opposed to speech to another person in the room.)
when erroneously muted, executing an unmuting action to include the audio portion in the conference content. ([0033]: detecting whether the user is speaking. [0035]: Based on such a determination, and by the system distinguishing between talking to the conference and background noise or side speech, the component can automatically unmute the device, such that the speech provided by the user will be heard by other users in the communication session.)
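
For illustration only, the mute and unmute decisions mapped above can be sketched as a small state function. This sketch is not taken from Lenke or from the instant application: the function name `next_mute_state` and its two boolean inputs are hypothetical and stand in for the output of a classifier such as the one Lenke describes at [0015] and [0034].

```python
def next_mute_state(muted, is_speech, for_conference):
    """Return the new mute state for one received audio portion.

    muted          -- current mute state of the contributing endpoint
    is_speech      -- assumed classifier output: portion contains human speech
    for_conference -- assumed classifier output: speech is meant for the conference
    """
    # Erroneously muted: the participant is speaking to the conference while muted.
    if muted and is_speech and for_conference:
        return False
    # Extraneous speech: the participant is speaking, but not to the conference.
    if not muted and is_speech and not for_conference:
        return True
    # Otherwise the current state is left unchanged.
    return muted
```

Under these assumptions, a muted endpoint carrying speech intended for the conference is unmuted, while an unmuted endpoint carrying side speech is muted.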

Regarding claim 11, Lenke teaches the conference server of claim 10.
Lenke teaches wherein the processor performs executing the unmuting action, further comprising, signaling the contributing endpoint to cause the contributing endpoint to energize an unmuting prompt circuit. ([0035]: Based on such a determination, and by the system distinguishing between talking to the conference and background noise or side speech, the component can automatically unmute the device, such that the speech provided by the user will be heard by other users in the communication session, or mute the device.)

Regarding claim 15, Lenke teaches a method for correcting an erroneous audio setting, comprising: 
broadcasting conference content, via a network, to each of a plurality of endpoints, wherein the conference content comprises audio content provided by one or more of the plurality of endpoints; (Fig. 2. Abstract: managing a mute and unmute feature on a device which is used to communicate data in a communication conference.)
determining whether a first audio portion, of the audio content and comprising human speech that, received from a first endpoint of the plurality of endpoints is extraneous to the conference content; and ([0015]: the classifier can be trained to determine whether first, the audio is speech and second, whether the audio is intended for a conference call or a video communication session. [0034]: speech to another person in the room, and thus not intended for the communication session.)
upon determining that the first audio portion comprises human speech that is extraneous to the conference content, executing a muting action to exclude the first audio portion from the conference content. ([0015]: the classifier can be trained to determine whether first, the audio is speech and second, whether the audio is intended for a conference call or a video communication session. Speaker identification can also be included in the analysis to insure that the speech is from the expected individual who is participating in a conference session. If the speaker identification component determines that the user is not an expected participant in the call, then the mute/unmute decision can take that into account and likely mute the device for that speaker.)

Regarding claim 16, Lenke teaches the method of claim 15.


Regarding claim 18, Lenke teaches the method of claim 15.
Lenke teaches receiving a second audio portion from a second endpoint of the plurality of endpoints that is muted and, when muted, omitted from the conference content; (Fig. 2. Abstract: managing a mute and unmute feature on a device which is used to communicate data in a communication conference.)
determining whether the second endpoint is erroneously muted; and ([0003]: attendees often will place their phones on mute to hide the background noise and forget that they are set on mute. In this scenario, an attendee may start speaking for a period of time, assuming that other participants in the conference can hear them, when in fact they cannot be heard because the phone is set to mute.)
upon determining that the second endpoint is erroneously muted, executing an unmuting action to include the second audio portion in the conference content. ([0033]: assume that the user has forgotten that they are on mute or that the mute feature is turned to “on”. The user might start talking thinking that other users in the communication session will be able to hear them. [0035]: Based on such a determination, and by the system distinguishing between talking to the conference and background noise or side speech, the component can automatically unmute the 

Regarding claim 19, Lenke teaches the method of claim 18.
Lenke teaches wherein the processor performs executing the unmuting action, further comprising, signaling the contributing endpoint to cause the contributing endpoint to energize an unmuting prompt circuit. ([0035]: Based on such a determination, and by the system distinguishing between talking to the conference and background noise or side speech, the component can automatically unmute the device, such that the speech provided by the user will be heard by other users in the communication session, or mute the device.)

Regarding claim 21, Lenke teaches the conference server of claim 1.
Lenke teaches determining whether the audio portion is muted, wherein the processor receives the audio portion from the contributing endpoint and omits the audio portion from the conference content; upon determining that the audio portion is muted, determining whether the contributing endpoint is erroneously muted and wherein the audio portion comprises encoded sound and ([0034]: to distinguish not only whether the audio is speech versus background noise, but what is the type of speech with respect to whether the speech is intended or likely to be intended as part of the communication session as opposed to speech to another person in the room.)
wherein the processor determines the contributing endpoint is erroneously muted further comprising, determining the encoded sound comprises human speech; and when erroneously muted, executing an unmuting action to include the audio portion in the conference content. ([0035]: Based on such a determination, and by the system distinguishing between talking to the conference and background noise or side speech, the component can automatically unmute the device, such that the speech provided by the user will be heard by other users in the communication session.)

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3 and 5 are rejected under 35 U.S.C. 103 as being unpatentable over Lenke (US 20200110572 A1) in view of Shaw (US 20050288930 A1).
Regarding claim 3, Lenke teaches the conference server of claim 1.
Lenke does not explicitly disclose accessing an audio profile of a participant, wherein the audio profile characterizes speech provided by the participant while contributing the speech to the conference content.
However, Shaw teaches accessing an audio profile of a participant, wherein the audio profile characterizes speech provided by the participant while contributing the speech to the conference content. ([0035]: the enrollment process may be used to establish a unique user profile. For example, the user inputs enrollment data which may include reading a predetermined passage into the input of the system to train the system to recognize the user's voice.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Lenke to include above limitations. One would have been motivated to do so because in prior art systems, users with unique manners of speech, regional accents, dialects, foreign accents, speech impediments or the like have faced difficulty in voice recognition. It is desirable to “train” a voice recognition system to recognize different speech patterns and sounds. As taught by Shaw, [0007].

Regarding claim 5, Lenke and Shaw teach the conference server of claim 3.
Lenke teaches wherein the processor determines that the audio portion comprises human speech that is extraneous to the conference content upon determining that at least one of the speaking volume, pitch, range, tone, or pace of speaking of the audio portion differs from the at least one of speaking volume, pitch, range, tone, or pace of speaking of the audio profile and that the difference is greater than a previously determined threshold. ([0015]: Detecting whether the first user is speaking and intending to speak in the conference can be based at least in part on one or more of a voice detection module, facial recognition data, gaze detection data, background noise, motion detection, and audio volume data. The classifier may also be trained on the volume of the speech. [0035]: The speech may be at a certain volume, or particular words might be used that relate to the conference or not, and thus the content of the speech may be used to determine user intent to be part of the conference.)
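For illustration only, the threshold comparison recited in this limitation can be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of Lenke or of the application: the `AudioFeatures` type, the attribute names, the `is_extraneous` helper, the relative-difference measure, and the example threshold of 0.25 are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AudioFeatures:
    """Hypothetical speech attributes; names and units are assumptions."""
    volume_db: float
    pitch_hz: float
    pace_wpm: float


def relative_difference(observed: float, reference: float) -> float:
    """Absolute difference scaled by the reference value (assumed measure)."""
    if reference == 0:
        return abs(observed)
    return abs(observed - reference) / abs(reference)


def is_extraneous(portion: AudioFeatures, profile: AudioFeatures,
                  threshold: float = 0.25) -> bool:
    """Return True when at least one attribute of the audio portion differs
    from the same attribute of the audio profile by more than the
    previously determined threshold."""
    return any(
        relative_difference(getattr(portion, f.name),
                            getattr(profile, f.name)) > threshold
        for f in fields(profile)
    )


# Hypothetical values: a stored profile and two observed audio portions.
profile = AudioFeatures(volume_db=60.0, pitch_hz=120.0, pace_wpm=150.0)
to_conference = AudioFeatures(volume_db=62.0, pitch_hz=125.0, pace_wpm=155.0)
side_speech = AudioFeatures(volume_db=40.0, pitch_hz=118.0, pace_wpm=150.0)
```

In this sketch, `is_extraneous(side_speech, profile)` is true because the volume alone differs by about one third, while `is_extraneous(to_conference, profile)` is false because every attribute stays within the threshold.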

Claims 4, 13 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Lenke (US 20200110572 A1) in view of Shaw (US 20050288930 A1), and further in view of Lim (US 20010003173 A1).
Regarding claim 4, Lenke and Shaw teach the conference server of claim 3.
Shaw teaches accessing the audio profile of the participant. 
Lenke and Shaw do not explicitly disclose the audio profile comprising at least one of speaking volume, pitch, range, tone, or pace of speaking; and determining whether the audio portion is extraneous to the conference content, further comprising, determining that at least one of the speaking volume, pitch, range, tone, or pace of speaking of the audio portion differs from the at least one of speaking volume, pitch, range, tone, or pace of speaking of the audio profile.
However, Lim teaches the audio profile comprising at least one of speaking volume, pitch, range, tone, or pace of speaking; and ([0025]:  the voice recognition processing unit, after repeatedly inputted with a specific voice range, obtains reference voice models of the voice data via the range and feature of the voice data and stores each of the reference voice models (e.g. audio profile) into a memory.)
determining whether the audio portion is extraneous to the conference content, further comprising, determining that at least one of the speaking volume, pitch, range, tone, or pace of speaking of the audio portion differs from the at least one of speaking volume, pitch, range, tone, or pace of speaking of the audio profile. ([0013]: detecting the range and characteristics of the received voice data; comparing the range and characteristics of the detected voice data with the characteristics of the previously obtained reference voice model (e.g. audio profile).)
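It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Lenke and Shaw to include above limitations. One would have been motivated to do so because the foregoing voice recognition system of the prior art discriminates the entered voices by the previously established reference voice model. Therefore, when the reference voice model is erroneously established due to noise, incorrect pronunciation of the user or etc. in establishing the reference model, the voice recognition rate may degrade. Also, repeating the voice training is required for accurate establishment of the reference voice model so that the voices should be repeatedly entered by the user thereby causing the user troublesome. It is therefore an object of the invention, which is proposed to solve the foregoing problems, to provide a method in which voice characteristics are extracted from voice data entered by a user for voice recognition and compared to an established reference voice model, and then, when the voice recognition succeeded, corresponding commands are performed and the voice data are reflected to the previously established reference voice model so that effect of repeating training on the user voices can be expected thereby increasing the voice recognition rate. As taught by Lim, [0011]-[0012].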

Regarding claim 13, Lenke teaches the conference server of claim 10.
Lenke teaches upon determining the encoded sound comprises speech; determining whether the audio portion comprises human speech that is extraneous to the conference content; and when the audio portion comprises human speech that is determined not to be extraneous, performing the unmuting action. ([0033]: detecting whether the user is speaking. Abstract: detecting, when the device is set to mute, whether the user is speaking and whether the speech is meant for the conference. [0035]: Based on such a determination, and by the system distinguishing between talking to the conference and background noise or side speech, the component can automatically unmute the device, such that the speech provided by the user will be heard by other users in the communication session, or mute the device.)
Lenke does not explicitly disclose accessing an audio profile of a participant, wherein the audio profile characterizes speech provided by the participant while contributing speech to the conference content.
However, Shaw teaches accessing an audio profile of a participant, wherein the audio profile characterizes speech provided by the participant while contributing speech to the conference content. ([0035]: the enrollment process may be used to establish a unique user profile. For example, the user inputs enrollment data which may include reading a predetermined passage into the input of the system to train the system to recognize the user's voice.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Lenke to include above limitations. One would have been motivated to do so because in prior art systems, users with unique manners of speech, regional accents, dialects, foreign accents, speech impediments or the like have faced difficulty in voice recognition. It is desirable to “train” a voice recognition system to recognize different speech patterns and sounds. As taught by Shaw, [0007].
Lenke and Shaw do not explicitly disclose the audio profile comprising at least one of speaking volume, pitch, range, tone, or pace of speaking; and determining whether the audio portion is extraneous to the conference content, further comprising, determining that at least one of the speaking volume, pitch, range, tone, or pace of speaking of the audio portion differs from the at least one of speaking volume, pitch, range, tone, or pace of speaking of the audio profile.
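However, Lim teaches the audio profile comprising at least one of speaking volume, pitch, range, tone, or pace of speaking; and ([0025]: the voice recognition processing unit, after repeatedly inputted with a specific voice range, obtains reference voice models of the voice data via the range and feature of the voice data and stores each of the reference voice models (e.g. audio profile) into a memory.)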
determining whether the audio portion is extraneous to the conference content, further comprising, determining that at least one of the speaking volume, pitch, range, tone, or pace of speaking of the audio portion differs from the at least one of speaking volume, pitch, range, tone, or pace of speaking of the audio profile. ([0013]: detecting the range and characteristics of the received voice data; comparing the range and characteristics of the detected voice data with the characteristics of the previously obtained reference voice model (e.g. audio profile).)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Lenke and Shaw to include above limitations. One would have been motivated to do so because the foregoing voice recognition system of the prior art discriminates the entered voices by the previously established reference voice model. Therefore, when the reference voice model is erroneously established due to noise, incorrect pronunciation of the user or etc. in establishing the reference model, the voice recognition rate may degrade. Also, repeating the voice training is required for accurate establishment of the reference voice model so that the voices should be repeatedly entered by the user thereby causing the user troublesome. It is therefore an object of the invention, which is proposed to solve the foregoing problems, to provide a method in which voice characteristics are extracted from voice data entered by a user for voice recognition and compared to an established reference voice model, and then, when the voice recognition succeeded, corresponding commands are performed and the voice data are reflected to the previously established reference voice model so that effect of repeating training on the user voices can be expected thereby increasing the voice recognition rate. As taught by Lim, [0011]-[0012].

Regarding claim 17, Lenke teaches the method of claim 15.
Lenke teaches determining whether the first audio portion comprises human speech that is extraneous to the conference content,  ([0033]: detecting whether the user is speaking. Abstract: detecting, when the device is set to mute, whether the user is speaking and whether the speech is meant for the conference. [0035] : Based on such a determination, and by the system distinguishing between talking to the conference and background noise or side speech, the component can automatically unmute the device, such that the speech provided by the user will be heard by other users in the communication session, or mute the device.)
further comprising, determining that at least one of the speaking volume, pitch, range, tone, or pace of speaking of the first audio portion differs from the at least one of speaking volume, pitch, range, tone, or pace of speaking of the audio profile and that the difference is greater than a previously determined threshold. ([0015]: Detecting whether the first user is speaking and intending to speak in the conference can be based at least in part on one or more of a voice detection module, facial recognition data, gaze detection data, background noise, motion detection, and audio volume data. The classifier may also be trained on the volume of the speech. [0035]: The speech may be at a certain volume, or particular words might be used that relate to the conference or not, and thus the content of the speech may be used to determine user intent to be part of the conference.)

Lenke does not explicitly disclose accessing an audio profile of a participant, wherein the audio profile characterizes speech provided by the participant while contributing speech to the conference content.
However, Shaw teaches accessing an audio profile of a participant, wherein the audio profile characterizes speech provided by the participant while contributing speech to the conference content. ([0035]: the enrollment process may be used to establish a unique user profile. For example, the user inputs enrollment data which may include reading a predetermined passage into the input of the system to train the system to recognize the user's voice.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Lenke to include above limitations. One would have been motivated to do so because in prior art systems, users with unique manners of speech, regional accents, dialects, foreign accents, speech impediments or the like have faced difficulty in voice recognition. It is desirable to “train” a voice recognition system to recognize different speech patterns and sounds. As taught by Shaw, [0007].
Lenke and Shaw do not explicitly disclose the audio profile comprising at least one of speaking volume, pitch, range, tone, or pace of speaking; and determining whether the audio portion is extraneous to the conference content, further comprising, determining that at least one of the speaking volume, pitch, range, tone, or pace of speaking of the audio portion differs from the at least one of speaking volume, pitch, range, tone, or pace of speaking of the audio profile.
However, Lim teaches the audio profile comprising at least one of speaking volume, pitch, range, tone, or pace of speaking; and ([0025]: the voice recognition processing unit, after repeatedly inputted with a specific voice range, obtains reference voice models of the voice data via the range and feature of the voice data and stores each of the reference voice models (e.g. audio profile) into a memory.)
determining whether the audio portion is extraneous to the conference content, further comprising, determining that at least one of the speaking volume, pitch, range, tone, or pace of speaking of the audio portion differs from the at least one of speaking volume, pitch, range, tone, or pace of speaking of the audio profile. ([0013]: detecting the range and characteristics of the received voice data; comparing the range and characteristics of the detected voice data with the characteristics of the previously obtained reference voice model (e.g. audio profile).)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Lenke and Shaw to include above limitations. One would have been motivated to do so because the foregoing voice recognition system of the prior art discriminates the entered voices by the previously established reference voice model. Therefore, when the reference voice model is erroneously established due to noise, incorrect pronunciation of the user or etc. in establishing the reference model, the voice recognition rate may degrade. Also, repeating the voice training is required for accurate establishment of the reference voice model so that the voices should be repeatedly entered by the user thereby causing the user troublesome. It is therefore an object of the invention, which is proposed to solve the foregoing problems, to provide a method in which voice characteristics are extracted from voice data entered by a user for voice recognition and compared to an established reference voice model, and then, when the voice recognition succeeded, corresponding commands are performed and the voice data are reflected to the previously established reference voice model so that effect of repeating training on the user voices can be expected thereby increasing the voice recognition rate. As taught by Lim, [0011]-[0012].

Claims 6 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Lenke (US 20200110572 A1) in view of Shaw (US 20050288930 A1), and in view of Lim (US 20010003173 A1), and further in view of Weisman (US 20040047461 A1).
Regarding claim 6, Lenke, Shaw and Lim teach the conference server of claim 4.
Lim teaches wherein the audio profile comprises at least one of the speaking volume, pitch, range, tone, or pace of speaking as sampled from the conference content. ([0013]: detecting the range and characteristics of the received voice data; comparing the range and characteristics of the detected voice data with the characteristics of the previously obtained reference voice model (e.g. audio profile).)
Lenke, Shaw and Lim do not explicitly disclose that follows the participant being addressed by name by another participant associated with a different one of the plurality of endpoints.
However, Weisman teaches that follows the participant being addressed by name by another participant associated with a different one of the plurality of endpoints. ([0049]: to designate a participant as the speaker, and to pass such designation to subsequent participants. [0162]: As the current speaker is voluntarily yielding to another participant he has selected, the outgoing speaker may customarily say, “And with that, I pass to WizKid,” or some similar statement.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Lenke, Shaw and Lim to include above limitations. One would have been motivated to do so because in a conference call, there is a need for a way to designate a participant as the speaker, and to pass such designation to subsequent participants. As taught by Weisman, [0049].

Regarding claim 14, Lenke, Shaw and Lim teach the conference server of claim 10.
Lenke teaches upon determining the encoded sound comprises speech. ([0033]: detecting whether the user is speaking. Abstract: detecting, when the device is set to mute, whether the user is speaking and whether the speech is meant for the conference.)
Lenke, Shaw and Lim do not explicitly disclose that follows the participant being addressed by name by another participant associated with a different one of the plurality of endpoints.
However, Weisman teaches that follows the participant being addressed by name by another participant associated with a different one of the plurality of endpoints. ([0049]: to designate a participant as the speaker, and to pass such designation to subsequent participants. [0162]: As the current speaker is voluntarily yielding to another participant he has selected, the outgoing speaker may customarily say, “And with that, I pass to WizKid,” or some similar statement.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Lenke, Shaw and Lim to include above limitations. One would have been motivated to do so because in a conference call, there is a need for a way to designate a participant as the speaker, and to pass such designation to subsequent participants. As taught by Weisman, [0049].

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Lenke (US 20200110572 A1) in view of Shaw (US 20050288930 A1), and further in view of Weisman (US 20040047461 A1).
Regarding claim 7, Lenke and Shaw teach the conference server of claim 3.
Lenke and Shaw do not explicitly disclose wherein the processor determines that the audio profile of the participant upon detecting the conference content comprises a name and, following the name, hearing speech from the participant.
However, Weisman teaches wherein the processor determines that the audio profile of the participant upon detecting the conference content comprises a name and, following the name, hearing speech from the participant. ([0049]: to designate a participant as the speaker, and to pass such designation to subsequent participants. [0162]: As the current speaker is voluntarily yielding to another participant he has selected, the outgoing speaker may customarily say, “And with that, I pass to WizKid,” or some similar statement. [0050]: provide a clear status display at least to the current and subsequent speaker.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Lenke and Shaw to include above limitations. One would have been motivated to do so because in a conference call, there is a need for a way to designate a participant as the speaker, and to pass such designation to subsequent participants. As taught by Weisman, [0049].

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Lenke (US 20200110572 A1) in view of IP.COM (Technique for detection of a person in a conference call, to detect a user action requiring unmute/mute, and to do it automatically).

Lenke does not explicitly disclose accessing an audio profile of a participant, wherein the audio profile characterizes speech provided by the participant with regard to a sound attribute comprising a first spoken language; and determining whether the audio portion comprises human speech that is extraneous to the conference content, further comprising, determining if the audio portion comprises a second spoken language.
However, IP.COM teaches accessing an audio profile of a participant, wherein the audio profile characterizes speech provided by the participant with regard to a sound attribute comprising a first spoken language; and determining whether the audio portion comprises human speech that is extraneous to the conference content, further comprising, determining if the audio portion comprises a second spoken language. (Page 1: Technique to detect when the user is speaking in NATIVE (non-preferred) language (e.g. second spoken language) and MUTE the phone, and alternatively when speaking in the "Preferred" language (e.g. first spoken language) of the conference, UNMUTE the phone. Page 2: An example of this is, the conference being in ENGLISH, and user is having a side conversation with his family (at home) in his/her native language (Hindi or German).)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Lenke to include above limitations. One would have been motivated to do so because the conference being conducted in ENGLISH, and user is having a side conversation with his family (at home) in his/her native language (Hindi or German) and only wants UNMUTE to happen when he/she starts to speak in English. This allows the user to multi-task. As taught by IP.COM, Page 2.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Lenke (US 20200110572 A1) in view of Shawn T. (Muting Yourself and Participants in Webex).
Regarding claim 9, Lenke teaches the conference server of claim 1.
Lenke does not explicitly disclose wherein the processor further performs, causing each of the plurality of endpoints to present indicia of the muting action associated with the contributing endpoint.
However, Shawn T. teaches wherein the processor further performs, causing each of the plurality of endpoints to present indicia of the muting action associated with the contributing endpoint. (Page 1 Image 1: Display participants mute/unmute status in the Participants panel of the Webex interface. Page 2 Paragraph 1: Your mute status appears in the meeting controls and the Participants panel.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Lenke to include above limitations. One would have been motivated to do so because in order to prevent unwanted noise in the meeting, event, or training session, Participants may be muted or unmuted by the Host. It is desirable for the host to know who is on mute/unmute.

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Lenke (US 20200110572 A1) in view of Weisman (US 20040047461 A1).
Regarding claim 20, Lenke teaches the method of claim 18.
Lenke teaches upon determining the encoded sound comprises speech. ([0033]: detecting whether the user is speaking. Abstract: detecting, when the device is set to mute, whether the user is speaking and whether the speech is meant for the conference.)

Lenke does not explicitly disclose that follows the participant being addressed by name by another participant associated with a different one of the plurality of endpoints.
However, Weisman teaches that follows the participant being addressed by name by another participant associated with a different one of the plurality of endpoints. ([0049]: to designate a participant as the speaker, and to pass such designation to subsequent participants. [0162]: As the current speaker is voluntarily yielding to another participant he has selected, the outgoing speaker may customarily say, “And with that, I pass to WizKid,” or some similar statement.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Lenke to include above limitations. One would have been motivated to do so because in a conference call, there is a need for a way to designate a participant as the speaker, and to pass such designation to subsequent participants. As taught by Weisman, [0049].

Response to Arguments
Applicant's request, see pages 8-9 (filed 02/16/2022), that the Double Patenting rejection(s) be held in abeyance until allowable subject matter is identified is acknowledged. The double patenting rejection is maintained and updated in view of the claim amendment.
Applicant’s arguments, see pages 9-15, filed 01/18/2022, with respect to the rejection(s) of claim(s) 1-20 under 35 U.S.C. § 102 and 35 U.S.C. § 103 have been fully considered but are moot in view of new ground(s) of rejection. 

In response to applicant’s arguments, it is noted that Lenke teaches in [0015] that “the classifier to determine whether first, the audio is speech and second, whether the audio is intended for a conference call or a video communication session. Speaker identification can also be included in the analysis to insure that the speech is from the expected individual who is participating in a conference session. If the speaker identification component determines that the user is not an expected participant in the call, then mute the device for that speaker” which teaches the limitation of “the audio portion comprises human speech that is extraneous to the conference content”. The new paragraphs are cited in view of the claim amendment.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emmanuel Moise, can be reached at 571-272-3865. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/ZI YE/Primary Examiner, Art Unit 2455