DETAILED ACTION
Claims 1-30 are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant’s submission filed on 23 December 2020 has been entered.
Response to Amendment
With regard to the Final Office Action from 16 November 2020, the Applicant has filed a response on 23 December 2020.
EXAMINER’S AMENDMENT
An Examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to Applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this Examiner’s amendment was given in an interview with Matthew Davison on 25 February 2021.
Please amend claims 1, 5, 7, 10, 17, 18, 19, 20, 23, 24, 25 and 28 as follows:
Claim 1 (currently amended)
A device for performing audio analytics, the device comprising:
a memory configured to store one or more category labels associated with one or more categories of a natural language processing library; and
a processor coupled to the memory and configured to:
analyze first input audio data to generate a first text string from a first portion of the first input audio data that represents speech;
perform natural language processing on at least the first text string to generate an output text string comprising an action associated with at least one of a first device, a speaker, a location, or a combination thereof;
in response to determining that a second portion of the first input audio data that represents sound distinct from the speech does not match audio data corresponding to any of the one or more categories:
add a new category to the one or more categories, the new category including a new category label and the second portion of the first input audio data;
associate the new category label with at least part of the output text string; and
generate a notification indicating the new category label;
analyze second input audio data to generate a second text string from a first portion of the second input audio data that represents speech; and
in response to determining that a second portion of the second input audio data corresponds to the new category, perform natural language processing on at least the second text string based on the at least part of the output text string.

Claim 5 (currently amended)
The device of claim 1, wherein the processor is further configured to, in response to the first input audio data matching a category label of the one or more category labels, perform additional natural language processing based on the category label.
Claim 7 (currently amended)
The device of claim 1, wherein the processor is further configured to:
determine an emotion associated with the first input audio data;
associate the emotion with the new category label; and
update the one or more categories based on the new category label.
Claim 10 (currently amended)
The device of claim 7, wherein the processor is further configured to:
determine that third input audio data matches the new category label;
determine an operation based on the emotion; and
initiate performance of the operation.
Claim 17 (currently amended)
The device of claim 1, further comprising a microphone configured to receive the first input audio data.
Claim 18 (currently amended)
The device of claim 1, further comprising a camera configured to capture image data associated with the first input audio data, wherein natural language processing is performed based further on one or more text strings based on the image data.


Claim 19 (currently amended)
The device of claim 1, further comprising a position sensor configured to determine a position associated with the first input audio data, wherein the natural language processing is performed based further on one or more text strings and based on the position.
Claim 20 (currently amended)
A method for performing audio analytics, the method comprising:
analyzing, at a processor, first input audio data to generate a first text string from a first portion of the first input audio data that represents speech;
performing natural language processing on at least the first text string to generate an output text string comprising an action associated with at least one of a device, a speaker, a location, or a combination thereof;
in response to a second portion of the first input audio data that represents sound distinct from the speech not matching audio data corresponding to any of one or more categories of a natural language processing library:
adding a new category to the one or more categories, the new category including a new category label and the second portion of the first input audio data;
associating the new category label with at least part of the output text string; and
generating a notification indicating the new category label;
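analyzing second input audio data to generate a second text string from a first portion of the second input audio data that represents speech; and
in response to determining that a second portion of the second input audio data corresponds to the new category, performing natural language processing on at least the second text string based on the at least part of the output text string.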
Claim 23 (currently amended)
The method of claim 20, further comprising:
analyzing third input audio data;
determining that the third input audio data matches input audio data of a particular category of the one or more categories; and
performing second natural language processing on the third input audio data based on a particular label of the particular category.
Claim 24 (currently amended)
The method of claim 20, further comprising determining a count of times that the first input audio data is identified, wherein the notification is generated based at least in part on the count satisfying a threshold.
Claim 25 (currently amended)
An apparatus comprising:
means for storing one or more category labels associated with one or more categories of a natural language processing library; and
means for processing, the means for processing configured to:
analyze first input audio data to generate a first text string from a first portion of the first input audio data that represents speech;
perform natural language processing on at least the first text string to generate an output text string comprising an action associated with at least one of a first device, a speaker, a location, or a combination thereof;
in response to determining that a second portion of the first input audio data that represents sound distinct from the speech does not match audio data corresponding to any of the one or more categories:
add a new category to the one or more categories, the new category including a new category label and the second portion of the first input audio data;
associate the new category label with at least part of the output text string; and
generate a notification indicating the new category label; 
analyze second input audio data to generate a second text string from a first portion of the second input audio data that represents speech; and
in response to determining that a second portion of the second input audio data corresponds to the new category, perform natural language processing on at least the second text string based on the at least part of the output text string.
Claim 28 (currently amended)
A non-transitory, computer readable medium storing instructions that, when executed by a processor, cause the processor to perform operations comprising:
analyzing first input audio data to generate a first text string from a first portion of the first input audio data that represents speech;
performing natural language processing on at least the first text string to generate an output text string comprising an action associated with at least one of a device, a speaker, a location, or a combination thereof;
in response to determining that a second portion of the first input audio data that represents sound distinct from the speech does not match audio data corresponding to any of one or more categories of a natural language processing library:
adding a new category to the one or more categories, the new category including a new category label and the second portion of the first input audio data;
associating the new category label with at least part of the output text string; and
generating a notification indicating the new category label; 
analyzing second input audio data to generate a second text string from a first portion of the second input audio data that represents speech; and
in response to determining that a second portion of the second input audio data corresponds to the new category, performing natural language processing on at least the second text string based on the at least part of the output text string.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:

(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
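For illustration only, conditions (A) to (C) above can be read as a simple conjunction. The sketch below is a simplified reading, not a statement of the law or of the analysis actually applied to any claim in this application. The function name `invokes_112f` and its four boolean parameters are hypothetical and do not appear in the record.

```python
def invokes_112f(uses_means_or_step, uses_generic_placeholder,
                 has_functional_language, recites_sufficient_structure):
    """Simplified, illustrative reading of conditions (A) to (C).

    All parameter names are hypothetical; each is a yes/no judgment that a
    reader would have to make about a single claim limitation.
    """
    # (A) the limitation uses "means"/"step" or a generic placeholder
    prong_a = uses_means_or_step or uses_generic_placeholder
    # (B) that term is modified by functional language
    prong_b = has_functional_language
    # (C) that term is not modified by sufficient structure, material, or acts
    prong_c = not recites_sufficient_structure
    return prong_a and prong_b and prong_c
```

Under this reading, a limitation such as “means for processing” that recites a function without further structure would correspond to `invokes_112f(True, False, True, False)`, while the same limitation reciting sufficient structure would rebut the presumption.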

The application recites:
“means for storing one or more category labels …” in claim 25;
“means for processing …” in claim 25;
“means for displaying a graphical user interface …” in claim 26;
“means for transmitting the notification …” in claim 27; and
“means for receiving one or more …” in claim 27.
The specification identifies the structure corresponding to these limitations as a memory, a processor, a transmitter and a receiver (Specification: [0022]), a display device such as a touchscreen or a monitor (Specification: [0037]), and a wireless communications device that can incorporate wireless communications protocols (Specification: [0105]-[0106]).
Allowable Subject Matter
Claims 1-30 are allowed.
The following is an Examiner’s statement of reasons for allowance:
With regard to claim 1, the invention states:
A device for performing audio analytics, the device comprising:
a memory configured to store one or more category labels associated with one or more categories of a natural language processing library; and
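a processor coupled to the memory and configured to:
analyze first input audio data to generate a first text string from a first portion of the first input audio data that represents speech;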
perform natural language processing on at least the first text string to generate an output text string comprising an action associated with at least one of a first device, a speaker, a location, or a combination thereof; 
in response to determining that a second portion of the first input audio data that represents sound distinct from the speech does not match audio data corresponding to any of the one or more categories: 
add a new category to the one or more categories, the new category including a new category label and the second portion of the first input audio data;
associate the new category label with at least part of the output text string; and
generate a notification indicating the new category label;
analyze second input audio data to generate a second text string from a first portion of the second input audio data that represents speech; and
in response to determining that a second portion of the second input audio data corresponds to the new category, perform natural language processing on at least the second text string based on the at least part of the output text string.
Closest Prior Art
The reference of Lindahl et al (US 2011/0166856 A1) provides teaching for storing noise profiles of ambient sounds as category labels [0039], a processor [0025], analysing a voice signal to get a voice command [0040], a microphone for receiving both speech and ambient sounds ([0041], FIG. 5), with ambient sound and user speech being received at approximately the same time [0035], creating a new noise profile (sound 
Another reference of Pasupalak et al (US 2015/0066479 A1) provides teaching for a speech recognition engine that can generate a textual representation of a speech query, a natural language processing engine that can derive intent from a user query [0006], NLP processing of commands related to a location [0081], providing services involving the use of a speaker and a first device [0044], converting an audio query to a text string [0060], real-time adding of new domains to update user queries and common phrases [0189], and informing a user that a new category has been added [0191].
A further reference of HENRIQUE BARBOSA POSTAL et al (US 2017/0263266 A1) teaches storing an unknown sound to create a new label that is added to label data [0007].
Another reference of Mixter et al (US 2018/0096690 A1) provides teaching for determining a noise profile of an environment around an electronic device, as well as generating and storing noise profiles ([0094], Figure 2B).
Chrisman (US 2018/0053518 A1) also provides teaching for generating a new audio noise profile ([0039], [0052]).
The prior art of record, taken alone or in combination, however fails to teach, inter alia, a device for performing audio analytics involving, given the determination that a second portion of input audio data corresponds to a newly created category, performing natural language processing on a text string obtained from a first portion of the same input audio data that represents speech based upon part of an output text string, the output text string having been generated based on a speech portion of a separate input audio data that was used to register the newly created category.
Claim 1 is hereby allowed over the prior art of record.
Dependent claims 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 and 19 depend on claim 1 and are also allowed over the prior art of record based on their dependence on an allowable base claim.
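For illustration only, the sequence of operations recited in claim 1 can be sketched as follows. The sketch is a minimal reading under stated assumptions and forms no part of the claims: speech recognition is assumed to have already produced the text string, exact string equality stands in for acoustic matching, and `run_nlp` is a placeholder rather than a real natural language processing engine. All names (`Category`, `Library`, `run_nlp`, `process`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    label: str
    sound: str          # stand-in for the stored non-speech audio portion
    context: str = ""   # part of an earlier output text string


@dataclass
class Library:
    categories: list = field(default_factory=list)
    notifications: list = field(default_factory=list)

    def match(self, sound):
        # Hypothetical matcher: equality stands in for acoustic matching.
        for category in self.categories:
            if category.sound == sound:
                return category
        return None


def run_nlp(text, context=""):
    # Placeholder for natural language processing; returns an output text string.
    return f"{text} [{context}]" if context else text


def process(library, speech_text, sound):
    """speech_text: text from the speech portion; sound: the non-speech portion."""
    category = library.match(sound)
    if category is not None:
        # Sound corresponds to a known category: process the text string
        # based on the output text string associated with that category.
        return run_nlp(speech_text, context=category.context)
    output = run_nlp(speech_text)
    # No match: add a new category, associate its label with the output
    # text string, and generate a notification indicating the new label.
    label = f"category-{len(library.categories) + 1}"
    library.categories.append(Category(label, sound, context=output))
    library.notifications.append(f"new category: {label}")
    return output
```

In this sketch, a first call with an unrecognized sound registers a new category together with the output text string of that call, and a later call carrying the same sound processes its own text string using the stored output text string as context.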
With regard to claim 20, the prior art of record, taken alone or in combination, fails to teach, inter alia, a method for performing audio analytics involving, given the determination that a second portion of input audio data corresponds to a newly created category, performing natural language processing on a text string obtained from a first portion of the same input audio data that represents speech based upon part of an output text string, the output text string having been generated based on a speech portion of a separate input audio data that was used to register the newly created category.
Dependent claims 21, 22, 23 and 24 depend on claim 20 and are also allowed over the prior art of record based on their dependence on an allowable base claim.
With regard to claim 25, the prior art of record, taken alone or in combination, fails to teach, inter alia, an apparatus for performing audio analytics involving, given the determination that a second portion of input audio data corresponds to a newly created category, performing natural language processing on a text string obtained from a first portion of the same input audio data that represents speech based upon part of an output text string, the output text string having been generated based on a speech portion of a separate input audio data that was used to register the newly created category.
Dependent claims 26 and 27 depend on claim 25 and are also allowed over the prior art of record based on their dependence on an allowable base claim.

With regard to claim 28, the prior art of record, taken alone or in combination, fails to teach, inter alia, a non-transitory computer-readable medium storing instructions for performing audio analytics involving, given the determination that a second portion of input audio data corresponds to a newly created category, performing natural language processing on a text string obtained from a first portion of the same input audio data that represents speech based upon part of an output text string, the output text string having been generated based on a speech portion of a separate input audio data that was used to register the newly created category.
Dependent claims 29 and 30 depend on claim 28 and are also allowed over the prior art of record based on their dependence on an allowable base claim.
Conclusion
Any comments considered necessary by Applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Any inquiry concerning this communication or earlier communications from the Examiner should be directed to OLUWADAMILOLA M OGUNBIYI whose telephone number is (571)272-4708. The Examiner can normally be reached on Monday - Thursday (8:00 AM - 5:30 PM Eastern Standard Time).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an 
If attempts to reach the Examiner by telephone are unsuccessful, the Examiner’s Supervisor, DANIEL C WASHBURN, can be reached on (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/OLUWADAMILOLA M OGUNBIYI/Examiner, Art Unit 2657

/DANIEL C WASHBURN/Supervisory Patent Examiner, Art Unit 2657