DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-5, 8-10, and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Mars et al., U.S. Patent No. 10,296,848 (“Mars”) in view of Ray et al., “Robust Spoken Language Understanding via Paraphrasing”, arxiv.org, published 17 September 2018 (accessed from https://arxiv.org/pdf/1809.06444.pdf, “Rey”).
Regarding Claim 1:
Mars teaches:
A system for bootstrapping a machine learning subsystem to generate training data for a deep learning subsystem, the system comprising: 
the machine learning subsystem, the machine learning subsystem comprising: 
a training module comprising a feature engineering and extraction module, said training module operable to: 
receive a predetermined number of labeled training data elements, each training data element comprising a data element and a label, the label corresponding to  a class included in a plurality of classes; and
{ Mars 2:66ff } ... a training data processing engine 190 [training module] ... { Mars 3:16ff } The plurality of external training data sources 180 preferably include several disparate sources of labeled training data that may be used for training machine learning models [receive a predetermined number of labeled training data elements]. For instance, the plurality of external training data sources 180 may include a crowdsourcing data platform, such as Amazon Mechanical Turk or the like [each training data element comprising a data element and a label, the label corresponding to a class included in a plurality of classes], in which labeled data is sourced from a number of data sources or users into the crowdsourcing data platform. The plurality of datastores 185 may function to collect and store machine learning training data [also the label corresponding to a class included in a plurality of classes] from the plurality of external training data sources 180. {~ The examiner summarizes that Amazon Mechanical Turk provides labeled utterances ~}
generate, by the feature engineering and extraction module, based in part on the received labeled training data elements, a plurality of sub-models that correspond to the plurality of classes; and
{ Mars 7:13ff } The observables extractor [feature engineering] 140 functions to use the slot values comprising the one or more program-comprehensible objects generated at slot extraction unit [extraction module] 135 to determine or generate [generate, by the feature engineering and extraction module] one or more handlers [sub-models] or subroutines [also sub-models - that correspond to the plurality of classes] for handling the data [labeled training data elements] of or responding to the user query or user command of user input data.
an execution module, said execution module operable to: 
receive a plurality of live unlabeled data elements, each live unlabeled data element being transmitted by an entity;
{ Mars 5:30ff } The slot identification engine [execution module, the examiner notes that slot identification would be part of the execution module] 130 functions to implement one or more machine learning models to identify slots or meaningful segments [receive a predetermined number of live unlabeled data elements] of user queries or user commands [being transmitted by an entity] and to assign a slot classification label for each identified slot.
label each live unlabeled data element based on the generated sub-models;
{ Mars 5:42ff, cf also 8:51 } Alternatively, the slot identification engine 130 may function to implement an ensemble of deep machine learning algorithms [based on the generated sub-models] in which each deep machine learning algorithm of the ensemble functions to identify distinct slot labels or slot type labels [label each live unlabeled data element] for user input data ... The machine learning status data of the first user interface may include any type and/or suitable data regarding a current and/or historical configuration of machine learning models [based on the generated sub-models] of the machine learning system . The machine learning status data may include operational metrics (e.g. utilization metrics) of the machine learning models in the machine learning system including accuracy metrics regarding a level of accuracy in generating predictions and/or classification labels by the machine learning models in the system.
for each live data element, present the label to the entity that transmitted the data element;
{ Mars 4:13ff, cf also 8:47 } The user interface system 105 receives user input data in the form of a verbal utterance and passes the utterance to the automatic speech recognition unit 115 to convert the utterance into text. The user interface system 105 may include, but are not limited to, mobile computing devices (e.g., mobile phones, tablets, etc.) having a client application of the system 100, desktop computers or laptops implementing a web browser, an automated teller machine, virtual and/or personal assistant devices (e.g., Alexa, Google Home, Cortana, Jarvis, etc.), chatbots or workbots, etc. An intelligent personal assistant device (e.g., Alexa, etc.) may be any type of device capable of touchless interaction with a user to performing one or more tasks or operations [present the label to the entity that transmitted the data element] including providing data or information and/or controlling one or more other devices ... The competency classification engine 120 together with the slot identification engine {~ The examiner notes that slot identification is a specific method for natural language processing and natural language understanding ~} 130 and the slot value extractor 135 preferably function to define a natural language processing (NLP) component of the artificial intelligence platform 110. ... { Mars 4:34ff } it shall be noted that the system 100 may obtain training data [for each live data element] from any suitable external data sources ... Specifically, the first user interface may function to present machine learning status data [present the label to the entity that transmitted the data element] relating to any or all machine learning models implemented or that will be implemented within the machine learning system. The machine learning status data of the first user interface may include any type and/or suitable data regarding a current and/or historical configuration of machine learning models of the machine learning system.
The machine learning status data may include operational metrics (e.g. utilization metrics) of the machine learning models in the machine learning system including accuracy metrics regarding a level of accuracy in generating predictions and/or classification labels by the machine learning models in the system. 
discern, based on a plurality of signals inferred during the presenting, [whether the label was accurate over a predetermined confidence threshold, or inaccurate]; and
{ Mars 3:27ff, cf also 3:39 } The training data processing engine 190 may function to process the raw training data samples collected from the plurality of external training data sources 180 into a refined or finished composition or list of training data samples that may be deployed into an operational or live machine learning model of the system 100. Generally, the system 100 functions to implement the artificial intelligence virtual assistant platform 110 to enable intelligent and conversational responses [discern, based on a plurality of signals inferred during the presenting] by an artificially intelligent virtual assistant to a user query and/or user command input into the system 100. ... At natural language processing components of the system 100 that may include, at least, the competency classification engine.
transmit, to the deep learning subsystem, [each labeled live data element of which the label was determined to be accurate over the predetermined confidence level]; and
{ Mars 10:32ff cf also 10:45ff } S230 ... functions to transmit [transmit to the deep learning subsystem] the machine learning training data request to a  plurality of external machine learning training data sources.  ... S230 may function to identify in advance [labeled live data element] of providing the machine learning training data request.
the deep learning subsystem, the deep learning subsystem comprising: 
an artificial neural network;
{Mars 2:27ff } Using one or more trained (deep) machine learning models, such as long short-term memory (LSTM) neural network [artificial neural network], the embodiments of the present application may function to understand any variety of natural language utterance or textual input provided to the system. The one or more deep machine learning models [deep learning subsystem] post deployment can continue to train using unknown and previously incomprehensible queries or commands from users. As a result, the underlying system that implements the (deep) machine learning models may function to evolve with increasing interactions with users and training rather than being governed by a fixed set of predetermined rules for responding to narrowly-defined queries, as may be accomplished in the current state of the art.
a training module, said training module operable to:
receive the labeled live data elements from the machine learning subsystem; and
{ Mars 11:27ff } Additionally, S240 [a training module, Mars Figure 2:S240/S250/S260] preferably functions to collect the machine learning training data from the external training data sources synchronously (in parallel). That is, S240 may function to collect machine learning training data from each of the plurality of external training data sources at a same time [receive the labeled live data elements from the machine learning subsystem] without waiting for any one external training data source to provide a completed response to the machine learning training data request.
use the received labeled live data elements to train the artificial neural network to identify inputted, unlabeled data elements; and
{ Mars 2:27ff cf also 11:48, 12:36 } Using one or more trained (deep) machine learning models, such as long shortterm [sic, short-term] memory (LSTM) neural network, the embodiments of the present application may function to understand any variety of natural language utterance or textual input provided to the system. The one or more deep machine learning models post deployment can continue to train using unknown and previously incomprehensible queries or commands from users. As a result, the underlying system that implements the (deep) machine learning models may function to evolve with increasing interactions with users and training rather than being governed by a fixed set of predetermined rules for responding to narrowly-defined queries, as may be accomplished in the current state of the art. ... In some embodiments, all machine learning training data [use the received labeled live data] may be mixed together or combined. Alternatively, S240 may function to augment the machine learning training data [to identify inputted, unlabeled data elements] with metadata that identifies from which external machine learning training data source that a label data sample originated from. ... S250, which includes processing the machine learning training data, functions to assess and refine [to train the artificial neural network] (if necessary) the machine learning training data samples collected from the plurality of external training data sources . 
an execution module that receives an unlabeled data element directly at the deep learning subsystem and accurately classifies the heretofore unlabeled data element.
{ Mars 3:33ff } Generally, the system 100 functions to implement [an execution module] the artificial intelligence virtual assistant platform 110 to enable intelligent and conversational responses by an artificially intelligent virtual assistant to a user query and/or user command input [that receives an unlabeled data element directly at the deep learning subsystem and accurately classifies Figure 1:115/120 the heretofore unlabeled data element] into the system.
Mars does not explicitly teach:
[discern, based on a plurality of signals inferred during the presenting,] whether the label was accurate over a predetermined confidence threshold, or inaccurate; [and]
[transmit, to the deep learning subsystem,] each labeled live data element of which the label was determined to be accurate over the predetermined confidence level; [and]
Rey teaches:
[discern, based on a plurality of signals inferred during the presenting,] whether the label was accurate over a predetermined confidence threshold, or inaccurate; [and]
The confidence threshold τ in our paraphrase models were set to 0.8 [whether the label was accurate over a predetermined confidence threshold, or inaccurate, page 6, para. 1 and figure 3 for context]
[Figure omitted: media_image1.png]

[transmit, to the deep learning subsystem,] each labeled live data element of which the label was determined to be accurate over the predetermined confidence level; [and]
{ Rey Page 1 } Voice controlled personal agents (e.g. Alexa, Google Assistant, Bixby) are becoming popular due to their ability to understand a wide variety of user utterances, and perform different actions/tasks as requested by the user [live data element]. { Rey Page 2 } We are provided with a labeled training dataset $T = \{x_i, y_i, I_i\}_{i=1}^{N}$, where $x_i$ are the utterances [each labeled live data element] with words in a vocabulary $V_T$, $y_i$ represent the sequence of slot tags from a slot vocabulary $S$, and $I_i \in I$ represent the intent label of the utterance. { Rey Page 6 } We also use the RASA parser with its default settings. Our neural models were implemented in TensorFlow. The confidence threshold τ in our paraphrase models were set to 0.8 [of which the label was determined to be accurate over the predetermined confidence level]. For RASA we were unable to use RNN based paraphrase model, since it does not return the slot tagging probabilities required for paraphrase template construction. In paraphrase models, we generate two best paraphrases using the paraphrase generators, and perform a simple majority voting to predict the final intent and slot labels.
In view of the teachings of Rey, it would have been obvious for a person of ordinary skill in the art, before the effective filing date, to apply the teachings of Rey to the teachings of Mars in order to increase performance of the machine learning model { cf. Rey, Page 7:5ff “... complex paraphrases are first converted to a more familiar utterance using a paraphrase generator ... are able to greatly improve the performance of the standalone base parser.” }.
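For illustration only, and not as a characterization of how either reference is implemented, the limitation at issue (forwarding to the deep learning subsystem only those labeled live data elements whose label is accurate over a predetermined confidence threshold) can be sketched as a simple filter. The threshold value 0.8 is the value of τ quoted from Rey, page 6; the `LabeledElement` type, the `confidence` field, the function name, and the sample utterances are hypothetical and appear in neither Mars nor Rey.

```python
from dataclasses import dataclass
from typing import List

# Threshold value quoted from Rey, page 6; its use as a gate here is an assumption.
TAU = 0.8


@dataclass
class LabeledElement:
    """Hypothetical record for one live data element after labeling."""
    text: str
    label: str
    confidence: float  # hypothetical score for the assigned label, in [0, 1]


def select_for_deep_learning(elements: List[LabeledElement],
                             tau: float = TAU) -> List[LabeledElement]:
    """Keep only the elements whose label confidence is strictly above tau."""
    return [e for e in elements if e.confidence > tau]


# Made-up sample data for the sketch.
batch = [
    LabeledElement("what is my balance", "check_balance", 0.93),
    LabeledElement("uh transfer maybe", "transfer_funds", 0.41),
]
forwarded = select_for_deep_learning(batch)
```

The strict comparison reflects one reading of "over a predetermined confidence threshold"; an inclusive comparison would be an equally plausible reading.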

Regarding Claim 8:
Mars teaches:
A system for labeling raw real-time human utterances using a deep learning subsystem, wherein the deep learning subsystem is trained by the output of a live machine learning subsystem, the system comprising: 
the machine learning subsystem, the machine learning subsystem comprising: 
a training module comprising a feature engineering and extraction module, said training module operable to: 
receive a predetermined number of labeled training data utterances, each training utterance comprising an utterance and a label, the label corresponding to  an intent included in a plurality of intents; and
{ Mars 2:66ff } ... a training data processing engine 190 [training module] ... { Mars 3:16ff } The plurality of external training data sources 180 preferably include several disparate sources of labeled training data that may be used for training machine learning models [receive a predetermined number of labeled training utterances]. For instance, the plurality of external training data sources 180 may include a crowdsourcing data platform, such as Amazon Mechanical Turk or the like [each labeled training utterance comprising an utterance and an intent, the intent being included in a plurality of intents], in which labeled data is sourced from a number of data sources or users into the crowdsourcing data platform. The plurality of datastores 185 may function to collect and store machine learning training data [also the intent being included in a plurality of intents] from the plurality of external training data sources 180. {~ The examiner summarizes that Amazon Mechanical Turk provides labeled utterances ~}
generate, by the feature engineering and extraction module, based in part on the received labeled training utterance, a plurality of sub-models that correspond to the plurality of intents; and
{ Mars 7:13ff } The observables extractor [feature engineering] 140 functions to use the slot values comprising the one or more program-comprehensible objects generated at slot extraction unit [extraction module] 135 to determine or generate [generate, by the feature engineering and extraction module] one or more handlers [sub-models] or subroutines [also sub-models - that correspond to the plurality of intents] for handling the data [labeled training utterances] of or responding to the user query or user command of user input data.
an execution module, said execution module operable to: 
receive a plurality of live unlabeled utterances, each live utterance being transmitted by an entity;
{ Mars 5:30ff } The slot identification engine [execution module, the examiner notes that slot identification would be part of the execution module] 130 functions to implement one or more machine learning models to identify slots or meaningful segments [receive a predetermined number of live unlabeled utterances] of user queries or user commands [being transmitted by an entity] and to assign a slot classification label for each identified slot.
label each live unlabeled utterance based on the feature engineering and extraction module;
{ Mars 5:42ff, cf also 8:51 } Alternatively, the slot identification engine 130 may function to implement an ensemble of deep machine learning algorithms [based on the generated sub-models] in which each deep machine learning algorithm of the ensemble functions to identify distinct slot labels or slot type labels [label each live unlabeled data element] for user input data ... The machine learning status data of the first user interface may include any type and/or suitable data regarding a current and/or historical configuration of machine learning models [based on the feature engineering and extraction module] of the machine learning system . The machine learning status data may include operational metrics (e.g. utilization metrics) of the machine learning models in the machine learning system including accuracy metrics regarding a level of accuracy in generating predictions and/or classification labels by the machine learning models in the system.
for each live utterance, present the label to the entity that transmitted the utterance;
{ Mars 4:13ff, cf also 8:47 } The user interface system 105 receives user input data in the form of a verbal utterance and passes the utterance to the automatic speech recognition unit 115 to convert the utterance into text. The user interface system 105 may include, but are not limited to, mobile computing devices (e.g., mobile phones, tablets, etc.) having a client application of the system 100, desktop computers or laptops implementing a web browser, an automated teller machine, virtual and/or personal assistant devices (e.g., Alexa, Google Home, Cortana, Jarvis, etc.), chatbots or workbots, etc. An intelligent personal assistant device (e.g., Alexa, etc.) may be any type of device capable of touchless interaction with a user to performing one or more tasks or operations [present the label to the entity that transmitted the utterance] including providing data or information and/or controlling one or more other devices ... The competency classification engine 120 together with the slot identification engine {~ The examiner notes that slot identification is a specific method for natural language processing and natural language understanding ~} 130 and the slot value extractor 135 preferably function to define a natural language processing (NLP) component of the artificial intelligence platform 110. ... { Mars 4:34ff } it shall be noted that the system 100 may obtain training data [for each utterance] from any suitable external data sources ... Specifically, the first user interface may function to present machine learning status data [present the label to the entity that transmitted the utterance] relating to any or all machine learning models implemented or that will be implemented within the machine learning system. The machine learning status data of the first user interface may include any type and/or suitable data regarding a current and/or historical configuration of machine learning models of the machine learning system.
The machine learning status data may include operational metrics (e.g. utilization metrics) of the machine learning models in the machine learning system including accuracy metrics regarding a level of accuracy in generating predictions and/or classification labels by the machine learning models in the system. 
determine, based on a plurality of signals inferred during, and after, the presenting the label to the entity, [whether the label was accurately assigned over a predetermined confidence threshold, or inaccurately assigned]; and
{ Mars 3:27ff, cf also 3:39 } The training data processing engine 190 may function to process the raw training data samples collected from the plurality of external training data sources 180 into a refined or finished composition or list of training data samples that may be deployed into an operational or live machine learning model of the system 100. Generally, the system 100 functions to implement the artificial intelligence virtual assistant platform 110 to enable intelligent and conversational responses [determine, based on a plurality of signals inferred during and after the presenting the label to the entity] by an artificially intelligent virtual assistant to a user query and/or user command input into the system 100. ... At natural language processing components of the system 100 that may include, at least, the competency classification engine.
transmit, to the deep learning subsystem, [each labeled live utterance, of which the label was determined to be accurate over the predetermined confidence level]; and
{ Mars 10:32ff cf also 10:45ff } S230 ... functions to transmit [transmit to the deep learning subsystem] the machine learning training data request to a  plurality of external machine learning training data sources.  ... S230 may function to identify in advance [labeled live utterance] of providing the machine learning training data request.
the deep learning subsystem, the deep learning subsystem comprising: 
an artificial neural network;
{Mars 2:27ff } Using one or more trained (deep) machine learning models, such as long short-term memory (LSTM) neural network [artificial neural network], the embodiments of the present application may function to understand any variety of natural language utterance or textual input provided to the system. The one or more deep machine learning models [deep learning subsystem] post deployment can continue to train using unknown and previously incomprehensible queries or commands from users. As a result, the underlying system that implements the (deep) machine learning models may function to evolve with increasing interactions with users and training rather than being governed by a fixed set of predetermined rules for responding to narrowly-defined queries, as may be accomplished in the current state of the art.
a training module, said training module operable to:
receive the labeled live utterance from the machine learning subsystem; and
{ Mars 11:27ff } Additionally, S240 [a training module, Mars Figure 2:S240/S250/S260] preferably functions to collect the machine learning training data from the external training data sources synchronously (in parallel). That is, S240 may function to collect machine learning training data from each of the plurality of external training data sources at a same time [receive the labeled live utterance from the machine learning subsystem] without waiting for any one external training data source to provide a completed response to the machine learning training data request.
use the received labeled live utterance to train the artificial neural network to identify inputted, unlabeled data elements; and
{ Mars 2:27ff cf also 11:48, 12:36 } Using one or more trained (deep) machine learning models, such as long shortterm [sic, short-term] memory (LSTM) neural network, the embodiments of the present application may function to understand any variety of natural language utterance or textual input provided to the system. The one or more deep machine learning models post deployment can continue to train using unknown and previously incomprehensible queries or commands from users. As a result, the underlying system that implements the (deep) machine learning models may function to evolve with increasing interactions with users and training rather than being governed by a fixed set of predetermined rules for responding to narrowly-defined queries, as may be accomplished in the current state of the art. ... In some embodiments, all machine learning training data [use the received labeled live data] may be mixed together or combined. Alternatively, S240 may function to augment the machine learning training data [to identify inputted, unlabeled data elements] with metadata that identifies from which external machine learning training data source that a label data sample originated from. ... S250, which includes processing the machine learning training data, functions to assess and refine [to train the artificial neural network] (if necessary) the machine learning training data samples collected from the plurality of external training data sources . 
an execution module that receives an unlabeled utterance directly at the deep learning subsystem and accurately determines an intent for the heretofore unlabeled utterance.
{ Mars 3:33ff } Generally, the system 100 functions to implement [an execution module] the artificial intelligence virtual assistant platform 110 to enable intelligent and conversational responses by an artificially intelligent virtual assistant to a user query and/or user command input [that receives an unlabeled utterance directly at the deep learning subsystem and accurately classifies Figure 1:115/120 the heretofore unlabeled utterance] into the system.
Mars does not explicitly teach:
[determine, based on a plurality of signals inferred during, and after, the presenting the label to the entity,] whether the label was accurately assigned over a predetermined confidence threshold, or inaccurately assigned; [and]
[transmit, to the deep learning subsystem,] each labeled live utterance, of which the label was determined to be accurate over the predetermined confidence level; [and]
Rey teaches: 
[determine, based on a plurality of signals inferred during, and after, the presenting the label to the entity,] whether the label was accurately assigned over a predetermined confidence threshold, or inaccurately assigned; [and]
The confidence threshold τ in our paraphrase models were set to 0.8 [whether the label was accurate over a predetermined confidence threshold, or inaccurately assigned, page 6, para. 1 and figure 3 for context].
[Figure omitted: media_image1.png]

[transmit, to the deep learning subsystem,] each labeled live utterance, of which the label was determined to be accurate over the predetermined confidence level; [and]
{ Rey Page 1 } Voice controlled personal agents (e.g. Alexa, Google Assistant, Bixby) are becoming popular due to their ability to understand a wide variety of user utterances, and perform different actions/tasks as requested by the user [live utterance]. { Rey Page 2 } We are provided with a labeled training dataset $T = \{x_i, y_i, I_i\}_{i=1}^{N}$, where $x_i$ are the utterances [each labeled live utterance] with words in a vocabulary $V_T$, $y_i$ represent the sequence of slot tags from a slot vocabulary $S$, and $I_i \in I$ represent the intent label of the utterance. { Rey Page 6 } We also use the RASA parser with its default settings. Our neural models were implemented in TensorFlow. The confidence threshold τ in our paraphrase models were set to 0.8 [of which the label was determined to be accurate over the predetermined confidence level]. For RASA we were unable to use RNN based paraphrase model, since it does not return the slot tagging probabilities required for paraphrase template construction. In paraphrase models, we generate two best paraphrases using the paraphrase generators, and perform a simple majority voting to predict the final intent and slot labels.
In view of the teachings of Rey, it would have been obvious for a person of ordinary skill in the art, before the effective filing date, to apply the teachings of Rey to the teachings of Mars for the same rationale as given for claim 1.
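For illustration only, the "simple majority voting" that the quoted Rey passage applies to predict the final intent label can be sketched as below. The function name, the composition of the vote pool (here, one prediction for the original utterance plus one for each of two paraphrases), the tie-breaking rule, and the intent names are assumptions made for this sketch and are not taken from Rey or Mars.

```python
from collections import Counter
from typing import Sequence


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the most frequent label among the predictions.

    Ties are resolved in favor of the label encountered first, which is an
    assumption of this sketch rather than a rule stated in the reference.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical intent predictions for an utterance and two paraphrases of it.
votes = ["set_alarm", "set_alarm", "play_music"]
final_intent = majority_vote(votes)
```

Under this sketch the two agreeing predictions outvote the third, so `final_intent` is `"set_alarm"`.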

Regarding Claim 16:
Mars teaches:
A method for determining intents associated with human utterances at a voice response system, the method comprising: 
the machine learning subsystem, the machine learning subsystem comprising: 
receive a predetermined number of labeled training data utterances at the training module, each labeled training utterance comprising an utterance and an intent, the intent being included in the plurality of intents; 
{ Mars 2:66ff } ... a training data processing engine 190 [training module] ... { Mars 3:16ff } The plurality of external training data sources 180 preferably include several disparate sources of labeled training data that may be used for training machine learning models [receive a predetermined number of labeled training utterances at the training module]. For instance, the plurality of external training data sources 180 may include a crowdsourcing data platform, such as Amazon Mechanical Turk or the like [each labeled training utterance comprising an utterance and an intent, the intent being included in a plurality of intents], in which labeled data is sourced from a number of data sources or users into the crowdsourcing data platform. The plurality of datastores 185 may function to collect and store machine learning training data [also the intent being included in a plurality of intents] from the plurality of external training data sources 180. {~ The examiner summarizes that Amazon Mechanical Turk provides labeled utterances ~}
generating, by a feature engineering and extraction module, based in part on the received labeled training utterance, at the training module of the machine learning system, a plurality of sub-models that correspond to the plurality of intents; and
{ Mars 7:13ff } The observables extractor [feature engineering] 140 functions to use the slot values comprising the one or more program-comprehensible objects generated at slot extraction unit [extraction module] 135 to determine or generate [generate, by the feature engineering and extraction module] one or more handlers [sub-models] or subroutines [also sub-models - that correspond to the plurality of intents] for handling the data [labeled training utterances] of or responding to the user query or user command of user input data.
receive, at an execution module in the machine learning subsystem, a plurality of live unlabeled utterances, each live utterance being transmitted by an entity;
{ Mars 5:30ff } The slot identification engine [receive at an execution module in the machine learning subsystem, the examiner notes that slot identification would be part of the execution module] 130 functions to implement one or more machine learning models to identify slots or meaningful segments [receive a plurality of live unlabeled utterances] of user queries or user commands [being transmitted by an entity] and to assign a slot classification label for each identified slot.
identifying, at the extraction module, a sub-model included in the plurality of sub-models, that corresponds to each unlabeled utterance;
{ Mars 5:42ff } Alternatively, the slot identification engine 130 may function to implement an ensemble of deep machine learning algorithms [a sub-model included in the plurality of sub-models] in which each deep machine learning algorithm of the ensemble functions to identify distinct slot labels or slot type labels [identifying, at the extraction module] for user input data [that corresponds to each unlabeled utterance]
for each live utterance, present to the entity that transmitted the utterance, a series of steps associated with the intent that corresponds to the identified sub-model;
{Mars 4:13ff, cf also 8:47} The user interface system 105 receives user input data in the form of a verbal utterance and passes the utterance to the automatic speech recognition unit 115 to convert the utterance into text. The user interface system 105 may include, but are not limited to, mobile computing devices (e.g., mobile phones, tablets, etc.) having a client application of the system 100, desktop computers or laptops implementing a web browser, an automated teller machine, virtual and/or personal assistant devices (e.g., Alexa, Google Home, Cortana, Jarvis, etc.), chatbots or workbots, etc. An intelligent personal assistant device (e.g., Alexa, etc.) may be any type of device capable of touchless interaction with a user to performing one or more tasks or operations [present the label to the entity that transmitted the utterance] including providing data or information and/or controlling one or more other devices ... The competency classification engine 120 together with the slot identification engine {~ The examiner notes that slot identification is a specific method for natural language processing and natural language understanding ~} 130 and the slot value extractor 135 preferably function to define a natural language processing (NLP) component of the artificial intelligence platform 110. ... { Mars 4:34ff } it shall be noted that the system 100 may obtain training data [for each utterance] from any suitable external data sources ... Specifically, the first user interface may function to present machine learning status data [present the label to the entity that transmitted the utterance] relating to any or all machine learning models implemented or that will be implemented within the machine learning system [a series of steps associated with the intent].
The machine learning status data of the first user interface may include any type and/or suitable data regarding a current and/or historical configuration of machine learning models of the machine learning system. The machine learning status data may include operational metrics (e.g. utilization metrics) of the machine learning models in the machine learning system including accuracy metrics regarding a level of accuracy in generating predictions and/or classification labels by the machine learning models in the system.
identifying, [over a predetermined confidence threshold] said identifying based on a plurality of signals received during, and after the presenting the series of steps, [whether each identified intent was accurately assigned or inaccurately assigned]; and
{ Mars 3:27ff, cf also 3:39 } The training data processing engine 190 may function to process the raw training data samples collected from the plurality of external training data sources 180 into a refined or finished composition or list of training data samples that may be deployed into an operational or live machine learning model of the system 100. Generally, the system 100 functions to implement the artificial intelligence virtual assistant platform 110 to enable intelligent and conversational responses [identifying ... based on a plurality of signals inferred during and after the presenting the series of steps] by an artificially intelligent virtual assistant to a user query and/or user command input into the system 100. ... At natural language processing components of the system 100 that may include, at least, the competency classification engine [whether each identified intent was accurate]
transmitting to a deep learning subsystem, [each live utterance and the associated intent that was determined to be accurately assigned over the predetermined confidence level];
{ Mars 10:32ff cf also 10:45ff } S230 ... functions to transmit [transmit to the deep learning subsystem] the machine learning training data request to a  plurality of external machine learning training data sources.  ... S230 may function to identify in advance [labeled live utterance] of providing the machine learning training data request.
training an artificial neural network of the deep learning system using the received live utterances and the associated intents;
{Mars 2:27ff } Using one or more trained (deep) machine learning models, such as long short-term memory (LSTM) neural network [training an artificial neural network of the deep learning system], the embodiments of the present application may function to understand any variety of natural language utterance or textual input provided to the system. The one or more deep machine learning models [deep learning subsystem] post deployment can continue to train using unknown and previously incomprehensible queries or commands [utterances and the associated intents] from users. As a result, the underlying system that implements the (deep) machine learning models may function to evolve with increasing interactions with users and training rather than being governed by a fixed set of predetermined rules for responding to narrowly-defined queries, as may be accomplished in the current state of the art.
receive an unlabeled utterance directly at the deep learning system; and
{ Mars 7:53ff } In some embodiments, the user interface system 105 receives user input data in the form of a verbal utterance and passes the utterance to the automatic speech recognition unit 115 to convert the utterance into text [receive an unlabeled utterance directly at the deep learning system].
accurately determining an intent for the unlabeled utterance.
{ Mars 3:33ff } Generally, the system 100 functions to implement the artificial intelligence virtual assistant platform 110 to enable intelligent and conversational responses by an artificially intelligent virtual assistant to a user query and/or user command input [accurately determine an intent for the unlabeled utterance Figure 1:115/120/130/135/150] into the system.
Mars does not explicitly teach:
[identifying,] over a predetermined confidence threshold [said identifying based on a plurality of signals received during, and after the presenting the series of steps,] whether each identified intent was accurately assigned or inaccurately assigned; [and]
[transmitting to a deep learning subsystem,] each live utterance and the associated intent that was determined to be accurately assigned over the predetermined confidence level;
Rey teaches:
[identifying,] over a predetermined confidence threshold [said identifying based on a plurality of signals received during, and after the presenting the series of steps,] whether each identified intent was accurately assigned or inaccurately assigned; [and]
The confidence threshold τ in our paraphrase models were set to 0.8 [over a predetermined confidence threshold ... whether each identified intent was accurate over a predetermined confidence threshold or inaccurately assigned, Page 6 and Figure 3 for context].

[Image: media_image1.png]

[transmitting to a deep learning subsystem,] each live utterance and the associated intent that was determined to be accurately assigned over the predetermined confidence level;
{ Rey Page 1 } Voice controlled personal agents (e.g. Alexa, Google Assistant, Bixby) are becoming popular due to their ability to understand a wide variety of user utterances, and perform different actions/tasks as requested by the user [live utterance]. { Rey Page 2 } We are provided with a labeled training dataset $T = \{x_i, y_i, I_i\}_{i=1}^{N}$, where $x_i$ are the utterances [each labeled live utterance] with words in a vocabulary $V_T$, $y_i$ represent the sequence of slot tags from a slot vocabulary $S$, and $I_i \in I$ represent the intent label [and the associated intent] of the utterance. { Rey Page 6 } We also use the RASA parser with its default settings. Our neural models were implemented in TensorFlow. The confidence threshold τ in our paraphrase models were set to 0.8 [was determined to be accurately assigned over the predetermined confidence level]. For RASA we were unable to use RNN based paraphrase model, since it does not return the slot tagging probabilities required for paraphrase template construction. In paraphrase models, we generate two best paraphrases using the paraphrase generators, and perform a simple majority voting to predict the final intent and slot labels.
In view of the teachings of Rey, it would have been obvious for a person of ordinary skill in the art, before the effective filing date, to apply the teachings of Rey to the teachings of Mars for the same rationale as claim 1.
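For illustration only, the thresholding and majority voting described in the Rey passage cited above can be sketched as follows. This is a minimal sketch and not Rey's implementation: the function names, the sample utterances, the intent names, and the use of a strict "greater than" comparison against τ = 0.8 are all assumptions made for the example.

```python
from collections import Counter

TAU = 0.8  # confidence threshold value reported in the cited Rey passage


def majority_vote(labels):
    """Return the most common label; ties resolve to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def accept_over_threshold(predictions, tau=TAU):
    """Keep only (utterance, intent) pairs whose confidence exceeds tau.

    The strict comparison is an assumption; the passage only gives the value.
    """
    return [(utt, intent) for utt, intent, conf in predictions if conf > tau]


# Hypothetical predictions: (utterance, predicted intent, confidence).
predictions = [
    ("show me flights to chicago", "find_flight", 0.93),
    ("what is my balance", "check_balance", 0.55),
    ("book a table for two", "book_restaurant", 0.81),
]
accepted = accept_over_threshold(predictions)

# Hypothetical intents predicted for one utterance and its two paraphrases.
final_intent = majority_vote(["find_flight", "find_flight", "book_hotel"])
```

Under these assumptions, only the pairs in `accepted` would be candidates for transmission to a downstream learner, and `final_intent` is the label chosen by the vote.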
Regarding Claim 2:
Mars and Rey teach the system of claim 1.
Mars further teaches wherein the data elements are human utterances.
{ Mars 2:41ff } Accordingly, the evolving nature of the artificial intelligence platform described herein therefore enables the artificially intelligent virtual assistant latitude to learn without a need for additional programming and the capabilities to ingest complex (or uncontemplated) utterances [wherein the data elements are human utterances] and text input to provide meaningful and accurate responses.

Regarding Claim 3:
Mars and Rey teach the system of claim 1.
Mars further teaches wherein each class, included in the plurality of classes, corresponds to an intent of human utterance.
{ Mars 2:20ff cf also 2:32ff } The embodiments of the present application, however, provide artificial intelligence virtual assistant platform and natural language processing capabilities that function to process and comprehend structured and/or unstructured natural language input from a user [wherein each class, included in the plurality of classes]. ... The one or more deep machine learning models post deployment can continue to train using unknown and previously incomprehensible queries or commands from users. As a result, the underlying system that implements the (deep) machine learning models may function to evolve with increasing interactions with users and training rather than being governed by a fixed set of predetermined rules for responding to narrowly-defined queries [corresponds to an intent of human utterance], as may be accomplished in the current state of the art.

Regarding Claim 4, analogous claim 9, and 17:
Mars and Rey teach the system of claim 1.
Mars further teaches wherein the artificial neural network is a feed-forward neural network, convolutional neural network or a recurrent neural network.
{ Mars 2:27ff cf also 6:11ff, 6:42ff } Using one or more trained (deep) machine learning models, such as long short-term memory (LSTM) neural network, the embodiments of the present application may function to understand any variety of natural language utterance or textual input provided to the system. The one or more deep machine learning models post deployment can continue to train using unknown and previously incomprehensible queries or commands from users. As a result, the underlying system that implements the (deep) machine learning models may function to evolve with increasing interactions with users and training rather than being governed by a fixed set of predetermined rules for responding to narrowly-defined queries, as may be accomplished in the current state of the art. {~ The examiner notes that an LSTM is both recurrent, and feed-forward, and that both may exist, feed-forward & recurrent, within a convolution network, From Wikipedia: “Unlike standard feedforward neural networks, LSTM has feedback connections. Such a recurrent neural network (RNN) can process not only single data points (such as images), but also entire sequences of data.” In short, LSTMs operate in both territories. ~} ... The machine learning models and/or the ensemble of machine learning models may employ any suitable machine learning including one or more of: ... a deep learning algorithm (e.g., a restricted Boltzmann machine, a deep belief network method, a convolution network method ...

Regarding Claim 5, analogous claim 10, and 18:
Mars and Rey teach the system of claim 1.
Rey further teaches wherein a new label [special blank token, ‹?›] that corresponds to a class [new paraphrase] that is not included in the plurality of classes [generated paraphrase templates] is inputted directly into the deep learning subsystem [Tpara] by inputting a plurality of labeled data elements [utterances x], each of which corresponds to the new label [special blank token, ‹?›].
{ Rey, Page 4 } This model is motivated by the following observation. We use the term context words as words having the slot label “O” (which are non-informational), and slot words as the remaining informational words. For example, in Figure 1, words {“chicago”, “san francisco”, “thursday”} are slot words, and the remaining are context words. We observe that, often when the base parser Pbase [the examiner emphasizes Pbase is a pretrained dataset] fail to identify the correct slot labels, it can still identify the position of the slot words (but not their exact labels) with sufficient confidence. Since, context words play a major role in enabling identification of slot words, we would like to replace context words having low parser confidence with more frequent words, thereby generating a new paraphrase [that is not included in the plurality of classes, new paraphrase]. This enables the slot words to be correctly labeled using this paraphrase. After Pbase identifies slot words [a plurality of labeled data elements, slot words] in utterance x [each of which corresponds, x, to the new label ‹?›]; we assume the remaining words are context words, and find the average slot confidence $\bar{S}_C(x)$ over these context words C: We generate a paraphrase template [that corresponds to a class, paraphrase template] T(x) = (ui...un) as follows; ui = xi, if slot probability P(yi = si) > $\bar{S}_C(x)$ or if xi is a slot word, else we replace ui = ‹?› [a new label, ‹?›], a special blank token. We then run a modified beam search algorithm using the forward language model Lf over the template T(x); such that the beams are constrained to generate ui = xi for non-blank tokens, but are allowed to generate new words to replace the blank tokens ‹?› in the template. However, these hard constraints tend to reduce the normal beam search quality of the RNN. To mitigate this, we also perform a similar reverse beam search using a reversed language model Lb: Finally, all generated beams are scored by both language models, and the one having the highest average score is output as paraphrase xʹ. As an example, a possible template T(x) for utterance 1 in Figure 1 is “‹?› ‹?› a flight from chicago to san francisco on ‹?› thursday”; after beam search this may produce a paraphrase “show me a flight from chicago to san francisco on next thursday”.
In view of the teachings of Rey, it would have been obvious for a person of ordinary skill in the art, before the effective filing date, to apply the teachings of Rey to the teachings of Mars for the same rationale as claim 1.
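For illustration only, the template construction step in the Rey passage cited for claim 5 can be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the token list, and the per-token slot probabilities are hypothetical, the parser output is supplied by hand rather than produced by Pbase, and the beam search that later fills the blank tokens is not implemented.

```python
BLANK = "‹?›"  # special blank token named in the cited passage


def paraphrase_template(words, slot_probs, is_slot_word):
    """Keep slot words and confident context words; blank out the rest.

    words        -- tokens of the utterance x
    slot_probs   -- slot probability for each token (hypothetical values)
    is_slot_word -- True where the token was identified as a slot word
    """
    context = [p for p, slot in zip(slot_probs, is_slot_word) if not slot]
    # Average slot confidence over the context words C.
    avg_conf = sum(context) / len(context) if context else 0.0
    return [
        w if slot or p > avg_conf else BLANK
        for w, p, slot in zip(words, slot_probs, is_slot_word)
    ]


words = ["get", "me", "a", "flight", "from", "chicago", "to", "thursday"]
slot_probs = [0.40, 0.30, 0.90, 0.90, 0.95, 0.50, 0.95, 0.50]
is_slot_word = [False, False, False, False, False, True, False, True]

template = paraphrase_template(words, slot_probs, is_slot_word)
```

With these hypothetical numbers the two low-confidence context words are replaced by the blank token, while the slot words are kept regardless of their probability.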
Claims 6, 7, 11, and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Mars in view of Rey and further in view of Radiuk et al., “Impact of Training Set Batch Size on the Performance of Convolutional Neural Networks for Diverse Datasets”, Information Technology and Management Science Vol. 20, published December 2017 (accessed from https://itms-journals.rtu.lv/article/view/itms-2017-0003, “Radiuk”).
Regarding Claims 6 and 11:
Mars and Rey teach:
The system of claim 1.
Mars further teaches:
wherein a predetermined number of received live labeled data elements used to train the deep learning subsystem is two orders of magnitude more than the predetermined number of labeled data elements used to train the machine learning subsystem.
{ Mars 3:16ff } The plurality of external training data sources 180 preferably include several disparate sources of labeled training data that may be used for training machine learning models [wherein a predetermined number of received live labeled data elements used to train the deep learning subsystem].
Mars and Rey do not teach: 
wherein a predetermined number of received live labeled data elements used to train the deep learning subsystem is two orders of magnitude more than the predetermined number of labeled data elements used to train the machine learning subsystem.
Radiuk teaches:
wherein a predetermined number of received live labeled data elements used to train the deep learning subsystem is two orders of magnitude more than the predetermined number of labeled data elements used to train the machine learning subsystem.
{Radiuk Section 5: Model Architecture & Figures 4 & 5}: According to the related works, two sequences of the values of the batch size [than the predetermined number of labeled data elements used to train the machine learning subsystem] are chosen, namely, number to the power of two and numbers multiples of ten. {~ The examiner notes that figure 4 shows the magnitude of two batch sizes, one at magnitude 16, and the other at 1024.  In the graph, a higher batch size means higher accuracy as well as more consistent accuracies over multiple training iterations ~}
[Image: media_image2.png]

In view of the teachings of Radiuk, it would have been obvious for a person of ordinary skill in the art before the effective filing date to apply the teachings of Radiuk to the system and method of Mars and Rey in order to obtain higher testing accuracy (Radiuk, page 24, para. 2: “... the supposition about the dependence of the recognition accuracy on the batch size value was confirmed: the larger the batch size value, the higher the testing accuracy.”).
Regarding Claims 7 and 12:
Mars and Rey teach:
The system of claim 1.  
Mars further teaches:
wherein a predetermined number of data elements used to train the deep learning subsystem is 10,000 percent more than the predetermined number of data used to train the machine learning subsystem.
{ Mars 3:16ff } The plurality of external training data sources 180 preferably include several disparate sources of labeled training data that may be used for training machine learning models [wherein a predetermined number of labeled utterances used to train the deep learning subsystem].
Mars and Rey do not teach:
wherein a predetermined number of data elements used to train the deep learning subsystem is 10,000 percent more than the predetermined number of data used to train the machine learning subsystem.
Radiuk teaches:
is 10,000 percent more than the predetermined number of data used to train the machine learning subsystem. 
{Radiuk Section 5: Model Architecture & Figures 4 & 5}: According to the related works, two sequences of the values of the batch size [than the predetermined number of data used to train the machine learning subsystem] are chosen, namely, number to the power of two and numbers multiples of ten [10,000 percent more than]. {~ The examiner notes that figure 4 shows the magnitude of two batch sizes, one at magnitude 16, and the other at 1024.  In the graph, a higher batch size means higher accuracy as well as more consistent accuracies over multiple training iterations ~}
[Image: media_image2.png]

In view of the teachings of Radiuk, it would have been obvious for a person of ordinary skill in the art before the effective filing date to apply the teachings of Radiuk to the system and method of Mars and Rey for the same reasoning as in claim 6.
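For reference, the size relationships discussed for claims 6, 7, 11, and 12 can be expressed with simple arithmetic. The sketch below is illustrative only; the function name is hypothetical, and the second pair of counts is an invented example rather than a figure taken from any cited reference.

```python
import math


def size_comparison(small, large):
    """Compare two set (or batch) sizes as a ratio, in orders of
    magnitude (base-10 logarithm of the ratio), and as a percent increase."""
    ratio = large / small
    return {
        "ratio": ratio,
        "orders_of_magnitude": math.log10(ratio),
        "percent_more": (large - small) / small * 100,
    }


# Batch sizes named in the examiner's note on Radiuk Figure 4.
radiuk = size_comparison(16, 1024)

# Hypothetical counts in which the larger set is 100 times the smaller.
hypothetical = size_comparison(100, 10_000)
```

Here `radiuk["ratio"]` is 64 and `hypothetical["ratio"]` is 100; the other two fields restate each ratio in the units used by the claim language.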

Claims 13, 14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Mars in view of Rey and further in view of McDonald-Jenson et al., “Python Queue Count of Unfinished Tasks”, stackoverflow.com, published 16 April 2016 (accessed from https://stackoverflow.com/questions/36658531/python-queue-count-of-unfinished-tasks, “McDonald-Jenson”).
Regarding Claim 13:
Mars, Rey, and Radiuk teach: 
The system of claim 12, 
Mars further teaches:
wherein the plurality of signals comprise whether the entity completed a plurality of tasks with the labeled intent.
{ Mars 3:33ff } Generally, the system 100 functions to implement the artificial intelligence virtual assistant platform 110 to enable intelligent and conversational responses [plurality of signals comprise ... with the labeled intent] by an artificially intelligent virtual assistant to a user query and/or user command input into the system 100.
Mars, Rey, and Radiuk do not teach:
wherein the plurality of signals comprise whether the entity completed a plurality of tasks with the labeled intent.
McDonald-Jenson teaches:
wherein the plurality of signals comprise whether the entity completed a plurality of tasks with the labeled intent.
{McDonald-Jenson, Page 1, para 1} The count of unfinished tasks goes up whenever an item is added to the queue [whether the entity completed a plurality of tasks associated].
In view of the teachings of McDonald-Jenson, it would have been obvious for a person of ordinary skill in the art, before the effective filing date, to apply the teaching of McDonald-Jenson to the system of Mars and Rey in order to improve efficiency by tracking the completion of tasks, steps, or a plurality of steps and whether or not they were abandoned. (McDonald-Jenson, page 1, para 2, “The count of unfinished tasks goes up whenever an item is added to the queue. The count goes down whenever a consumer thread calls task_done() to indicate that the item was retrieved and all work on it is complete. When the count of unfinished tasks drops to zero join() unblocks.”)
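For illustration only, the counting behavior quoted from McDonald-Jenson can be sketched with Python's standard queue module. The step names and the scenario (an entity finishing two of three steps) are hypothetical. Note that `unfinished_tasks` is an undocumented attribute of `queue.Queue` in CPython; it is used here only because it is the counter the cited question concerns.

```python
import queue

steps = queue.Queue()
for step in ("verify identity", "choose account", "confirm transfer"):
    steps.put(step)  # each put() raises the count of unfinished tasks

completed = []
# Hypothetical session in which the entity finishes two of the three steps.
for _ in range(2):
    completed.append(steps.get())
    steps.task_done()  # each task_done() lowers the count

# Undocumented CPython attribute; the count that join() waits on.
remaining = steps.unfinished_tasks
abandoned = remaining > 0
```

In this sketch a nonzero `remaining` is treated as a signal that the series of steps was not completed; that interpretation is an assumption of the example, not a statement from the cited reference.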

Regarding Claim 14: 
Mars and Rey teach:
The system of claim 12, 
Mars further teaches:
wherein the plurality of signals comprise whether the entity to complete an action associated with the labeled intent.
{ Mars 3:33ff } Generally, the system 100 functions to implement the artificial intelligence virtual assistant platform 110 to enable intelligent and conversational responses [plurality of signals comprise ... with the labeled intent] by an artificially intelligent virtual assistant to a user query and/or user command input into the system 100.
Mars and Rey do not teach:
wherein the plurality of signals comprise whether the entity to complete an action associated with the labeled intent.
McDonald-Jenson teaches:
wherein the plurality of signals comprise whether the entity to complete an action associated with the labeled intent.
{McDonald-Jenson, Page 1, para 1} The count of unfinished tasks goes up whenever an item is added to the queue [whether the entity completed a plurality of tasks associated].
In view of the teachings of McDonald-Jenson, it would have been obvious for a person of ordinary skill in the art, before the effective filing date, to apply the teaching of McDonald-Jenson to the system of Mars and Rey for the same rationale as in claim 13.
Regarding Claim 20:
Mars and Rey teach:
The system of claim 16, 
Mars further teaches:
wherein the plurality of signals includes whether the plurality of steps was completed or abandoned.
Mars does not teach:
wherein the plurality of signals includes whether the plurality of steps was completed or abandoned.
McDonald-Jenson teaches:
wherein the plurality of signals includes whether the plurality of steps was completed or abandoned.
{McDonald-Jenson, Page 1, para 1} The count of unfinished tasks goes up whenever an item is added to the queue [includes whether the plurality of steps was completed or abandoned].
In view of the teachings of McDonald-Jenson, it would have been obvious for a person of ordinary skill in the art, before the effective filing date, to apply the teaching of McDonald-Jenson to the system of Mars and Rey for the same rationale as in claim 13.
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Mars (US 10,296,848), Rey, and Radiuk in view of Kub et al., “Sentiment Analysis Walkthrough Part 2.ipynb”, github.com, published 24 January 2019 (accessed from https://github.com/aaronkub/machine-learning-examples/blob/master/imdb-sentiment-analysis/Sentiment%20Analysis%20Walkthrough%20Part%202.ipynb, “Kub”).
Regarding Claim 15:
Mars in view of Rey and Radiuk teaches the system of claim 12.
Mars further teaches wherein the plurality of signals ... received from the entity. 
{ Mars 8:2ff } Thus, an intelligent personal assistant may be used by a user to perform any portions of the methods described herein, including the steps [the plurality of steps] and processes of method 200, described below. Additionally, a chatbot or a workbot [sic, work bot] may include any type of program (e.g., slack bot, etc.) implemented by one or more devices that may be used to interact with a user using any type of input method [wherein the plurality of signals includes ...]  ( e.g., verbally, textually, etc.).
Mars in view of Rey and Radiuk does not explicitly teach comprise a sentiment analysis.
Kub teaches comprise a sentiment analysis
{ Kub: Top Positive and Negative Features }     ...
for best_positive in sorted(
    feature_to_coef.items(),
    key=lambda x: x[1],
    reverse=True)[:30]:
    print(best_positive)  # [comprise a sentiment analysis]
...
for best_negative in sorted(
    feature_to_coef.items(),
    key=lambda x: x[1])[:30]:
    print(best_negative)  # [comprise a sentiment analysis]

In view of the teachings of Kub, it would have been obvious for a person of ordinary skill in the art, before the effective filing date, to apply the teachings of Kub to the teachings of Mars, Rey, and Radiuk in order to combine the teachings of Mars in view of Rey and Radiuk (i.e., a neural network that classifies utterances) with the teachings of Kub (i.e., sentiment analysis), with the combination of these familiar elements functioning together according to known methods and yielding predictable results (i.e., a neural network that classifies utterances into sentiments).
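For illustration only, the feature ranking in the Kub excerpt cited above can be run on its own with a stand-in coefficient map. The words and coefficient values below are invented for the example, and the helper name is hypothetical; in the cited notebook, `feature_to_coef` is derived from a trained classifier, which is not reproduced here.

```python
# Hypothetical word-to-coefficient map standing in for a trained linear
# sentiment classifier; larger values indicate more positive sentiment.
feature_to_coef = {
    "excellent": 0.93,
    "refreshing": 0.61,
    "boring": -0.84,
    "worst": -1.37,
    "the": 0.01,
}


def top_features(coefs, n, positive=True):
    """Return the n (word, coefficient) pairs with the largest coefficients,
    or the smallest when positive is False."""
    return sorted(coefs.items(), key=lambda kv: kv[1], reverse=positive)[:n]


best_positive = top_features(feature_to_coef, 2, positive=True)
best_negative = top_features(feature_to_coef, 2, positive=False)
```

The two calls correspond to the two loops in the excerpt, with the slice length reduced from 30 to 2 to suit the small example.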

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Mars in view of Rey and further in view of Pietzcker, “Calculating Time Difference”, stackoverflow.com, published 6 August 2010 (accessed 20 October 2022 from https://stackoverflow.com/questions/3426870/calculating-time-difference, “Pietzcker”).
Regarding Claim 19:
Mars and Rey teach the method of claim 16.
Mars further teaches: 
wherein the plurality of signals includes ... the plurality of steps. { Mars 8:2ff } Thus, an intelligent personal assistant may be used by a user to perform any portions of the methods described herein, including the steps [the plurality of steps] and processes of method 200, described below. Additionally, a chatbot or a workbot [sic, work bot] may include any type of program (e.g., slack bot, etc.) implemented by one or more devices that may be used to interact with a user using any type of input method [wherein the plurality of signals includes ...]  ( e.g., verbally, textually, etc.).
Mars and Rey do not teach an amount of time associated with completing.
Pietzcker teaches an amount of time associated with completing. { Pietzcker page 1} 
The datetime module will do all the work for you:
>>> import datetime
>>> a = datetime.datetime.now()
>>> # ...wait a while...
>>> b = datetime.datetime.now()
>>> print(b-a)
0:03:43.984000 [an amount of time associated with completing]
In view of the teachings of Pietzcker, it would have been obvious for a person of ordinary skill in the art, before the effective filing date, to apply the teachings of Pietzcker into the system and method of Mars and Rey in order to track the time associated with steps. In doing so the method is improved since an observer of user behavior may now “log how long it took for a user to progress through some menus.” (Pietzcker page 1 ff).
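For illustration only, the datetime subtraction shown in the Pietzcker excerpt can be applied to a series of steps as sketched below. The function name and the step names are hypothetical, and the steps are placeholder callables; real steps would wait on user input.

```python
import datetime


def time_steps(step_functions):
    """Run each named step and record how long it took as a timedelta."""
    durations = {}
    for name, func in step_functions:
        start = datetime.datetime.now()
        func()
        durations[name] = datetime.datetime.now() - start
    return durations


# Hypothetical menu steps standing in for user interactions.
durations = time_steps([
    ("open menu", lambda: None),
    ("select option", lambda: sum(range(1000))),
])

# Summing timedeltas needs an explicit timedelta start value.
total = sum(durations.values(), datetime.timedelta())
```

Each value in `durations` is a `datetime.timedelta`, the same type printed as `0:03:43.984000` in the excerpt.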
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RICHARD CARL STANLEY whose telephone number is (571)272-2002. The examiner can normally be reached Monday-Friday 8:30am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael Huntley can be reached on michael.huntley@uspto.gov. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/R.C.S./Examiner, Art Unit 2129
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129