DETAILED ACTION
Introduction
This office action is in response to Applicant’s submission filed on 04/11/2022. Claims 1-5, 7-12, 14-17 and 19-23 are pending in the application and have been examined.
Notice of Pre-AIA or AIA Status
The present application is being examined under the pre-AIA first to invent provisions.
Response to Amendment
The response filed on 04/11/2022 has been accepted and considered in this Office Action. Claims 1-5, 7-12, 14-17 and 19-23 have been examined. Claims 6, 13 and 18 have been cancelled. Applicant’s amendments to claims 1, 10 and 16, which incorporate subject matter from original dependent claims 6, 13 and 18 respectively, overcome the 35 U.S.C. 101 rejections previously set forth in the Non-Final Office Action mailed 02/01/2022. Dependent claims 7-9, 14-15 and 19-20 overcome the 35 U.S.C. 101 rejections set forth in that action by virtue of their dependency on amended claims 1, 10 and 16 respectively. Therefore, the above-referenced rejections under 35 U.S.C. 101 are withdrawn.
Response to Arguments
Applicant's arguments filed 04/11/2022 have been fully considered as follows:
Applicant’s arguments with respect to claim 1 on p. 10 state that
“Applicant respectfully submits that even if par. 0032 of Lee teaches that if “a user pauses during the content, the upcoming time stamp remains valid. The time stamping may also include a duration of the trigger word such that all of the potential triggers may be ignored,” Lee does not suggest detecting a “silence pattern anomaly” using a “trained audio activity model.””

Applicant’s arguments above with respect to claim 1 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Applicant’s further arguments with respect to prior claim 6 on p. 10 state that
“Sun et al. teach the comparison of an instantaneous signal level to a threshold, and does not teach evaluating “a length of an audio activity portion of the audio signal relative to a length of a silence portion of the audio signal to identify the silence pattern anomaly,” as currently claimed.”

The Examiner respectfully disagrees. Sun teaches that if speaker events are detected/determined and are determined to last for at least a certain period of time, then an unmute notification is generated. The notification can be displayed on a corresponding monitor or terminal for the speaker corresponding to the speaker event (see Sun, [0056]). Therefore, Sun teaches evaluating the speaker (silence) event, which happens during a speaking activity (audio activity) for a certain period of time, to determine the length of the silence portion relative to the length of an audio activity portion and thereby determine a silence pattern anomaly.
Applicant’s further arguments with respect to prior claim 1 on p. 11 state that
“Bhattacharjee et. al. does not teach “evaluat(ing) one or more features provided by a communication application relative to one or more thresholds, wherein the communication application is provided by a different provider than a provider of the communication issue detector,” as currently claimed.”

The Examiner respectfully disagrees. Bhattacharjee teaches one or more audio resources monitoring a multimedia conference event serviced by the multimedia conferencing server 130, which is a different provider than the provider of the audio monitor module (see Bhattacharjee, col 10, lines 10-23). Therefore, the rejections of claims 1, 10 and 16 under 35 U.S.C. 103 are sustained and updated accordingly.
With respect to the art rejections of the remaining dependent claims under 35 U.S.C. 103, to the extent those claims are argued for at least the same rationale presented in the Remarks filed 04/11/2022, Examiner respectfully notes as follows. For completeness, should those claims likewise be traversed for reasons similar to those presented for independent claims 1, 10 and 16, Examiner respectfully directs Applicant to the reasons provided above in the response directed to claims 1, 10 and 16. For at least those same reasons, Examiner likewise respectfully disagrees; Applicant's arguments have been fully considered but are not persuasive.
Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-3, 8, 9, 10, 15, 16 and 20-23 are rejected under 35 U.S.C. 103 as being unpatentable over Lee et al. (US Patent Application Publication 2019/0341035) in view of Sun et al. (US Patent Application Publication 2015/0156598), further in view of Bhattacharjee et al. (US Patent 8,713,440).
Regarding claim 1, Lee teaches a method, comprising: applying a representation of an audio signal associated with a communication to at least two of: (i) a trigger word analysis module that determines if one or more trigger words are detected in the audio signal from the audio signal using a trained trigger word detection model that evaluates contextual information, wherein the one or more trigger words are indicative of a communication issue associated with the communication (see Lee, [0029] teaches the method 200 initiates where voice command trigger words are identified. This is illustrated at step 201. In embodiments, a data store can include a table of all trigger words and corresponding actions to be executed for each respective trigger word. These can be stored on local memory of the VCD; command trigger words are identified as trigger words indicative of a communication issue associated with the communication); (ii) an audio activity pattern analysis module that determines if a silence pattern anomaly is detected in the audio signal using a trained audio activity model (see Lee, [0032] teaches a time stamp is generated for each identified trigger word (and/or word that resembles a trigger word). This is illustrated at step 203. The generated time stamp corresponds to a time in the media content that the trigger words is recited. Accordingly, if a user pauses during the content, the upcoming time stamp remains valid. The time stamping may also include a duration of the trigger word such that all of the potential triggers may be ignored; the trigger word model is interpreted to detect the silence pattern anomaly), but fails to teach wherein the audio activity pattern analysis module evaluates a length of an audio activity portion of the audio signal relative to a length of a silence portion of the audio signal to identify the silence pattern anomaly; (iii) a communication application analysis module that evaluates one or more features provided by a communication application relative to one or more thresholds, wherein the communication application is provided by a different provider than a provider of the communication issue detector; combining results of the at least two of the trigger word analysis module, the audio activity pattern analysis module and the communication application analysis module to identify a communication issue for the communication; and implementing one or more remedial actions responsive to the identification of the communication issue, wherein the method is performed by at least one processing device comprising a processor coupled to a memory.
However, Sun teaches (ii) an audio activity pattern analysis module that determines if a silence pattern anomaly is detected in the audio signal using a trained audio activity model, wherein the audio activity pattern analysis module evaluates a length of an audio activity portion of the audio signal relative to a length of a silence portion of the audio signal to identify the silence pattern anomaly (see Sun, [0056] if speaker events are detected/determined and they are detected/determined to last for at least a certain period of time, then an unmute notification is generated. The notification can be displayed on a corresponding monitor or terminal for the speaker corresponding to the speaker event; the speaker event is interpreted as the (silence) event which happens during a speaking activity (interpreted as audio activity) for a certain period of time to determine the length of the silence portion relative to the length of an audio activity portion to determine a silence pattern anomaly).
Lee and Sun are considered to be analogous to the claimed invention because they relate to audio processing in media content. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Lee to use the voice command processing based on trigger words with the silence event teachings of Sun to get notifications of mute/unmute in conferencing systems (see Sun, [0002]).
Lee and Sun fail to teach (iii) a communication application analysis module that evaluates one or more features provided by a communication application relative to one or more thresholds, wherein the communication application is provided by a different provider than a provider of the communication issue detector; combining results of the at least two of the trigger word analysis module, the audio activity pattern analysis module and the communication application analysis module to identify a communication issue for the communication; and implementing one or more remedial actions responsive to the identification of the communication issue, wherein the method is performed by at least one processing device comprising a processor coupled to a memory.

However, Bhattacharjee teaches (iii) a communication application analysis module that evaluates one or more features provided by a communication application relative to one or more thresholds, wherein the communication application is provided by a different provider than a provider of the communication issue detector (see Bhattacharjee, col 10, lines 24-37 teaches the audio monitor module 220 may determine at least one audio quality parameter for an audio connection 202-1-f is lower than a defined threshold value to form an audio quality warning state. The audio monitor module 220 may be arranged to monitor various real time quality indicators using certain heuristics. If the audio quality falls below a predefined threshold then it will surface the various options for the user to improve the audio performance; see Bhattacharjee, col 10, lines 10-23 the audio management component 134 includes an audio monitor module 220. The audio monitor module 220 may be arranged to monitor one or more audio connections 202-1-f for a multimedia conference event serviced by the multimedia conferencing server 130. The audio connections 202-1-f may represent various audio connections established between various meeting consoles 110-1-m participating in the multimedia conference event; one or more audio resources monitoring a multimedia conference event serviced by the multimedia conferencing server 130 is interpreted as different provider than the provider for the audio monitor module); combining results of the at least two of the trigger word analysis module, the audio activity pattern analysis module and the communication application analysis module to identify a communication issue for the communication (see Bhattacharjee, col 12, lines 52-67 teaches the GUI view 300-2 may represent a second GUI view providing a connection troubleshooter to assist an operator through the various options available to improve audio quality in response to the audio quality warning state message); and implementing one or more remedial actions responsive to the identification of the communication issue, wherein the method is performed by at least one processing device comprising a processor coupled to a memory (see Bhattacharjee, col 12, lines 52-67 teaches the GUI view 300-2 may include a "Details" tab 312 and an "Advanced" tab 314. When the Details tab 312 is selected by an operator, a text box 314 may appear with a first message generated by the audio message module 230. In the illustrated embodiment shown in FIG. 3, the text box 314 may include a message describing an audio quality warning state (e.g., "We have detected that voice quality has degraded") and a reason for the audio quality warning state (e.g., "Excessive packet loss on the network"). The Details tab 312 may further display an option button 316 with a label indicating an option to establish and use an alternate audio connection 202-1-f (e.g., "Call me on my cell phone"); method implemented as described in Bhattacharjee, col 15, lines 12-17).
Lee, Sun and Bhattacharjee are considered to be analogous to the claimed invention because they relate to audio processing in media content. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Lee and Sun to use the voice command processing based on trigger words with the communications resources management for the various participants of a meeting teachings of Bhattacharjee to improve the user experience and convenience in a virtual meeting environment (see Bhattacharjee, col 3, lines 22-33). 
Regarding claim 2, Lee, Sun and Bhattacharjee teach the method of claim 1. Lee further teaches wherein the representation of the audio signal comprises one or more spectrograms associated with the communication (see Lee, [0031] teaches in some embodiments, trigger words are identified based on a Fast Fourier Transform (FFT) comparison to known trigger words).
Regarding claim 3, Lee, Sun and Bhattacharjee teach the method of claim 1. Lee further teaches wherein the trained trigger word detection model is trained by evaluating one or more of a nearby trigger word feature, a meeting time feature and a silence pattern feature relative to a user-provided feedback score (see Lee, [0039] teaches the media streaming device 310 then instructs the voice command device 320 to ignore audio content based on the time stamp. This is illustrated at step 316. The voice command device receives the instruction and ignores the audio content based on the time stamp. This is illustrated at step 322. In embodiments, ignoring can be completed based on a direction of the media streaming device 310. In embodiments, ignoring can be completed for any direction based on the time stamp. In some embodiments, only recognized voices are permitted, and any other audio input at the VCD 320 is ignored; interpreted as trigger word feature ignored related to time stamp which is interpreted as based on user feedback).
Regarding claim 8, Lee, Sun and Bhattacharjee teach the method of claim 1. Bhattacharjee further teaches wherein the combining evaluates an accuracy of each of the at least two of the trigger word analysis module, the audio activity pattern analysis module and the communication application analysis module to combine the at least two results (see Bhattacharjee, col 12, lines 52-67 teaches the GUI view 300-2 may represent a second GUI view providing a connection troubleshooter to assist an operator through the various options available to improve audio quality in response to the audio quality warning state message. The GUI view 300-2 may include a "Details" tab 312 and an "Advanced" tab 314. When the Details tab 312 is selected by an operator, a text box 314 may appear with a first message generated by the audio message module 230. In the illustrated embodiment shown in FIG. 3, the text box 314 may include a message describing an audio quality warning state (e.g., "We have detected that voice quality has degraded") and a reason for the audio quality warning state (e.g., "Excessive packet loss on the network"); the processing of the warning message is interpreted as combining at least two of audio activity and communication application analysis).
Regarding claim 9, Lee, Sun and Bhattacharjee teach the method of claim 1. Bhattacharjee further teaches wherein the features provided by the communication application comprise one or more of an audio device not found feature, a number of screen share sessions feature, a number of connection attempts feature and a poor connection events feature (see Bhattacharjee, col 13, lines 42-52 teaches as shown in FIG. 4, the logic flow 400 may monitor multiple audio connections for a multimedia conference event at block 402. For example, the audio monitor module 220 may monitor multiple audio connections 202-1-f for a multimedia conference event. The audio monitor module 220 may measure or generate statistics for various features or characteristics of the audio connections 202-1-f, and compare the measured statistics with corresponding static or dynamic threshold values. The audio monitor module 220 may monitor the audio connections 202-1-f on a continuous, periodic or on-demand basis as desired for a given implementation).
Claim 10 is directed to an apparatus claim corresponding to the method claim presented in claim 1 and is rejected under the same grounds stated above regarding claim 1.
Claim 15 is directed to an apparatus claim corresponding to the method claim presented in claim 8 and is rejected under the same grounds stated above regarding claim 8.
Claim 16 is directed to a non-transitory processor-readable storage medium claim corresponding to the method claim presented in claim 1 and is rejected under the same grounds stated above regarding claim 1.
Claim 20 is directed to a non-transitory processor-readable storage medium claim corresponding to the method claim presented in claim 8 and is rejected under the same grounds stated above regarding claim 8.
Claim 21 is directed to an apparatus claim corresponding to the method claim presented in claim 2 and is rejected under the same grounds stated above regarding claim 2.
Claim 22 is directed to an apparatus claim corresponding to the method claim presented in claim 3 and is rejected under the same grounds stated above regarding claim 3.
Claim 23 is directed to a non-transitory processor-readable storage medium claim corresponding to the method claim presented in claim 3 and is rejected under the same grounds stated above regarding claim 3.
Claims 4, 5, 11, 12 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Lee et al. (US Patent Application Publication 2019/0341035) in view of Sun et al. (US Patent Application Publication 2015/0156598), further in view of Bhattacharjee et al. (US Patent 8,713,440), further in view of Prasad et al. (US Patent 9,697,828).
Regarding claim 4, Lee, Sun and Bhattacharjee teach the method of claim 1, but fail to teach wherein the applying the audio signal to the trigger word analysis module further comprises evaluating a relevance score generated by the trained trigger word detection model. However, Prasad teaches wherein the applying the audio signal to the trigger word analysis module further comprises evaluating a relevance score generated by the trained trigger word detection model (see Prasad, col 5, lines 20-28 teach in order to reduce or minimize false detections, the wake word detector 100 may use information in addition to acoustic features associated with the wake word when computing wake word detection scores. As shown in FIG. 1, the wake word detector 100 may use other acoustic information 102, environmental information 104, lexical and/or semantic information 106, contextual information 108, some combination thereof, etc. Illustratively, a user 150 may make an utterance in the presence of a device that implements the wake word detector 100).
Lee, Sun, Bhattacharjee and Prasad are considered to be analogous to the claimed invention because they relate to audio processing in media content. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Lee and Bhattacharjee to use the voice command processing based on trigger words for communications resources management in a virtual meeting with the confidence scoring teachings of Prasad to improve the accuracy of trigger word detection (see Prasad, col 1, lines 32-37).
Regarding claim 5, Lee, Sun and Bhattacharjee teach the method of claim 1, but fail to teach wherein the trained trigger word detection model is trained using a set of trigger words, a plurality of additional words and a plurality of background samples. However, Prasad teaches wherein the trained trigger word detection model is trained using a set of trigger words, a plurality of additional words and a plurality of background samples (see Prasad, col 11 lines 25-30 teaches FIG. 5 illustrates a process 500 that can be performed to generate customized detection models for individual users, or for groups of users that exhibit similar wake word detection system usage patterns (e.g., similar contextual information, similar environmental information, similar acoustic information, some combination thereof, etc.)).
Claim 11 is directed to an apparatus claim corresponding to the method claim presented in claim 4 and is rejected under the same grounds stated above regarding claim 4.
Claim 12 is directed to an apparatus claim corresponding to the method claim presented in claim 5 and is rejected under the same grounds stated above regarding claim 5.
Claim 17 is directed to a non-transitory processor-readable storage medium claim corresponding to the method claim presented in claim 4 and is rejected under the same grounds stated above regarding claim 4.
Claims 7, 14 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Lee et al. (US Patent Application Publication 2019/0341035) in view of Sun et al. (US Patent Application Publication 2015/0156598), further in view of Bhattacharjee et al. (US Patent 8,713,440), further in view of Carlough et al. (US Patent Application Publication 2018/0096384).
Regarding claim 7, Lee, Sun and Bhattacharjee teach the method of claim 1, but fail to teach wherein the combining employs an ensemble model that combines the at least two results to identify the communication issue for the communication. However, Carlough teaches wherein the combining employs an ensemble model that combines the at least two results to identify the communication issue for the communication (see Carlough, [0047, 0049] The predictive features/variables are used to estimate different kinds of models using different estimation techniques. These models can include, but are not limited to: linear models, dynamic linear models, stochastic process models, hierarchical temporal memory models, gradient boosted trees, recurrent neural networks with long-short term memory, and/or the like. These model types are evaluated according to standard fit and hold-out prediction statistics (e.g., MAPE, AIC, BIC, etc.) to select the highest value models for use in ensemble modeling. The selected models have a further estimation run on top of them, resulting in the application of a weight to each model form. A combined ensemble model is then constructed from the results from the selected initial models and their weights and is packaged as the final predicted metric values 74 used in the comparison. Remediation activity module 96 of system 72, as executed by computer system/server 12, is configured to perform a remediation activity in response to a determination that the response metric value is anomalous. This remediation action can take one or more of many different forms. In an embodiment, remediation activity module 96 may perform data cleansing and validation to fix issues in the communication 84 on the fly and resend the fixed communication 84. In an embodiment, remediation activity module 96 may update the data models with the anomalous data. In an embodiment, remediation activity module 96 may forward an alert that advises user 80 of the anomaly. This alert can include one or more explanations, charts, graphs, etc., and can be provided to user 80 via email, SMS messaging, push service, and/or the like).
Lee, Bhattacharjee and Carlough are considered to be analogous to the claimed invention because they relate to audio processing in media content. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Lee and Bhattacharjee to use the voice command processing based on trigger words for communications resources management in a virtual meeting with the anomalies detection in electronic communications teachings of Carlough to perform remediation activity based on the response metric value (see Carlough, [0004]).
Claim 14 is directed to an apparatus claim corresponding to the method claim presented in claim 7 and is rejected under the same grounds stated above regarding claim 7.
Claim 19 is directed to a non-transitory processor-readable storage medium claim corresponding to the method claim presented in claim 7 and is rejected under the same grounds stated above regarding claim 7.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Heda et al. (US Patent Application Publication 2015/0280970) teaches detection of hardware/software failure in a distributed audio-video conference environment (see Heda, [0022]).
Baran et al. (US Patent Application Publication 2016/0182727) teaches mute detectors to take various different actions when a user is determined to be speaking while mute is on (see Baran, [0020]).
Klemm (US Patent Application Publication 2013/0006881) teaches processing relevant customer feedback on user’s experience with a product or service (see Klemm, [0033]).
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NANDINI SUBRAMANI whose telephone number is (571)272-3916. The examiner can normally be reached Monday - Friday, 2:00 pm - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh M Mehta, can be reached at (571)272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/NANDINI SUBRAMANI/Examiner, Art Unit 2656
/BHAVESH M MEHTA/Supervisory Patent Examiner, Art Unit 2656