DETAILED ACTION
This office action is in response to Applicant’s submission filed on 11/6/2020. Claims 1-20 are pending in the application. As such, claims 1-20 have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, or 365 is acknowledged. The prior-filed application (Provisional application No. 62/932949 Filed on 11/8/2019) is acknowledged.

Claim Objections
Claims 4 and 12, and therefore claims 9-10 and 16 which respectively depend therefrom, are objected to because of the following informalities:
Claims 4 and 12 recite a “RAILS” model architecture. An acronym that is not a term of art known to one of ordinary skill in the art should be recited in its expanded form.
Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are:
“a first filter arranged to receive,” “a first machine learning system and a second machine learning system arranged to analyze,” and “a second filter arranged to receive,” as claimed in claim 11.
“a phonetic encoding component for determining,” as claimed in claim 16.

Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 2, 4, 7, 8, 11, 12, 15, 17, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Mairesse et al. (US9558740B1) (hereinafter "Mairesse"), and Steelberg et al. (US20200286485A1) (hereinafter "Steelberg").

Regarding claims 1 and 17, Mairesse teaches [a computer-implemented method - claim 1] and [a non-transitory computer readable storage medium containing computer program instructions for detecting and resolving mis-transcriptions in a transcript generated by an automatic speech recognition system when transcribing spoken words, the computer program instructions, when executed by a processor, causing the processor to perform an operation comprising: - claim 17] (Mairesse, Col. 22, lines 24 - 31: “Aspects of the disclosed system may be implemented as a computer method or as an article of manufacture such as a memory device or non-transitory computer readable storage medium. The computer readable storage medium may be readable by a computer and may comprise instructions for causing a computer or other device to perform processes described in the present disclosure.”, and Col. 20, lines 34 - 39: “A device's computer instructions may be stored in a non-transitory manner in non-volatile memory [706/806], storage [708/808], or an external device[s]. Alternatively, some or all of the executable instructions may be embedded in hardware or firmware on the respective device in addition to or instead of software.”).
receiving a machine language generated transcript of a speech signal by at least one of a first machine learning system and a second machine learning system; (Mairesse, Col. 3, lines 57 - 63: “A user 10 may speak an utterance including a command. The user's utterance is captured by a microphone of device 110. The system may then determine [152] audio data corresponding to the utterance, for example as a result of the microphone converting the sound to an audio data signal. The system may then perform [154] ASR processing [transcript generated] on the audio data, for example using techniques described below.”).

analyzing, by the at least one of the first machine learning system and the second machine learning system, the machine language generated transcript to find a region of low confidence indicative of a mis-transcription; (Mairesse, Col. 3, lines 36 – 38: “Each processing point may use a model configured using machine learning techniques.”, and Col. 3, line 64 - Col. 4, line 8: “The system may then process [156] the ASR results [transcription] with a first model [machine learning system] to determine if disambiguation [low confidence region] of ASR hypotheses is desired. The first model [machine learning system] may be trained to determine, using confidence scores corresponding to a plurality of ASR hypotheses, whether to select a single ASR hypothesis or whether to perform further selection from among the plurality of ASR hypotheses. If disambiguation [low confidence region] is desired, the system may process [158] ASR results [mis-transcription] with a second model [machine learning system] to determine what hypotheses should be selected for disambiguation. The second model [machine learning system] may be trained to determine, also using confidence scores, which of the plurality of ASR hypothesis to select for disambiguation.”).
analyzing, by the at least one of the first machine learning system and the second machine learning system, the region of low confidence and predicting an improvement to the region of low confidence indicative of the mis-transcription; (Mairesse, Col. 3, line 64 - Col. 4, line 8: “The system may then process [156] the ASR results [transcription] with a first model [machine learning system] to determine if disambiguation [low confidence region] of ASR hypotheses is desired. The first model [machine learning system] may be trained to determine, using confidence scores corresponding to a plurality of ASR hypotheses, whether to select a single ASR hypothesis or whether to perform further selection from among the plurality of ASR hypotheses. If disambiguation [low confidence region] is desired, the system may process [158] ASR results [mis-transcription] with a second model [machine learning system] to determine what hypotheses should be selected for disambiguation. The second model [machine learning system] may be trained to determine, also using confidence scores, which of the plurality of ASR hypothesis to select for disambiguation.”).
Mairesse fails to explicitly disclose, however, Steelberg teaches selecting, by a word selector, a replacement word for the mis-transcription based on the predicted improvement to the region of low confidence; and replacing, by the word selector, the mis-transcription by the replacement word. (Steelberg, Par. 0127: “Truth engine 1140 includes algorithms and instructions that, when executed by a processor, cause the processor to identify transcription errors [mis-transcription] in one or more parts of a transcribed portion, for example, by identifying words with confidence score below a predetermined threshold. The truth engine 1140 may then correct [select] the identified errors, for example, by replacing the words with low confidence score with correct words. In some embodiments, the truth engine may utilize machine learning model to find the correct replacement words. The truth engine 1140 may also label [or tag] the corrected words.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Mairesse in view of Steelberg to select, by a word selector, a replacement word for the mis-transcription based on the predicted improvement to the region of low confidence, and to replace, by the word selector, the mis-transcription by the replacement word, in order to generate a revised transcription based on the received reward function and reach a desired accuracy threshold for transcription, as evidenced by Steelberg (See Par. 0032).
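For illustration only, the threshold-based correction described in the quoted Steelberg passage (flag words whose confidence score falls below a predetermined threshold, then substitute a replacement word) might be sketched as follows. This is a minimal sketch, not the implementation of either reference or of the application: the Word type, the 0.5 threshold, and the suggestions mapping are all hypothetical, and the mapping stands in for whatever model predicts the replacement.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    confidence: float  # ASR confidence score, assumed to lie in [0, 1]


def find_low_confidence(words, threshold=0.5):
    """Return the indices of words whose confidence is below the threshold."""
    return [i for i, w in enumerate(words) if w.confidence < threshold]


def replace_low_confidence(words, suggestions, threshold=0.5):
    """Replace each flagged word that has a suggested replacement.

    `suggestions` is a hypothetical mapping from word index to replacement
    text; flagged words without a suggestion are left unchanged.
    """
    flagged = set(find_low_confidence(words, threshold))
    return [
        suggestions.get(i, w.text) if i in flagged else w.text
        for i, w in enumerate(words)
    ]


transcript = [
    Word("book", 0.96),
    Word("a", 0.91),
    Word("fright", 0.32),
    Word("to", 0.88),
    Word("boston", 0.97),
]
corrected = replace_low_confidence(transcript, {2: "flight"})
print(" ".join(corrected))
```

Words at or above the threshold are never altered in this sketch, even if a suggestion exists for them.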

Regarding claims 2 and 18, Mairesse teaches wherein the first machine learning system and the second machine learning system are connected in tandem. (Mairesse, Col. 3, lines 36 – 45: “Each processing point may use a model configured using machine learning techniques. A first model may be trained to determine whether speech processing results should be disambiguated before passing results to be executed. A second model may be trained to determine what potential speech processing results should be displayed for user selection [if any], following selection of disambiguation by the first model. A system for operating this improvement is illustrated in FIG. 1A.”). Note: models are being processed serially which is an indication of tandem connection.
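As a rough illustration of the serial arrangement noted above, the following sketch runs a first stage that decides whether disambiguation is needed and, only if so, a second stage that selects the candidate hypotheses. The simple margin and floor rules are assumptions standing in for the trained models described in Mairesse, and all function names and numeric values are hypothetical.

```python
def needs_disambiguation(scores, margin=0.2):
    """Stage 1 (stand-in for a trained model): True when the two best
    confidence scores are closer together than `margin`."""
    ranked = sorted(scores, reverse=True)
    return len(ranked) > 1 and (ranked[0] - ranked[1]) < margin


def select_candidates(hypotheses, floor=0.3):
    """Stage 2 (stand-in for a trained model): keep hypotheses scoring at
    least `floor`, best first."""
    kept = [h for h in hypotheses if h[1] >= floor]
    return [text for text, _ in sorted(kept, key=lambda h: h[1], reverse=True)]


def tandem(hypotheses):
    """Run stage 1, then stage 2 only if stage 1 asks for disambiguation."""
    scores = [score for _, score in hypotheses]
    if not needs_disambiguation(scores):
        # A single hypothesis is clearly best; no further selection needed.
        return [max(hypotheses, key=lambda h: h[1])[0]]
    return select_candidates(hypotheses)


clear = [("play adele", 0.90), ("play a dell", 0.20)]
ambiguous = [("call mom", 0.45), ("call tom", 0.40), ("cold mom", 0.05)]
print(tandem(clear))
print(tandem(ambiguous))
```

The second stage consumes the decision of the first, which is the sense of "tandem" relied on in the note above.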

Regarding claims 4 and 12, Mairesse does not explicitly teach, but Steelberg further teaches wherein the first machine learning system comprises a RAILS model architecture. (Given that RAILS is not a term of art known to one of ordinary skill in the art, the broadest reasonable interpretation is determined in view of the Specification, where Applicant has acted as their own lexicographer [MPEP 2111.01[IV]] in defining the term “RAILS” in Par. 99 of the originally filed specification, understood to be a real-time model receiving low word confidences, and outputting higher probability replacements, and in view of this definition, Steelberg teaches Par. 0127: “Truth engine 1140 includes algorithms and instructions that, when executed by a processor, cause the processor to identify transcription errors [low word confidence] in one or more parts of a transcribed portion, for example, by identifying words with confidence score below a predetermined threshold. The truth engine 1140 may then correct [higher probability replacements] the identified errors, for example, by replacing the words with low confidence score with correct words. In some embodiments, the truth engine may utilize machine learning model to find the correct replacement words.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Mairesse in view of Steelberg such that the first machine learning system comprises a RAILS model architecture, in order to generate a revised transcription based on the received reward function, as evidenced by Steelberg (See Par. 0032).

Regarding claims 7 and 15, Mairesse does not explicitly teach, but Steelberg further teaches wherein the word selector comprises a trained decision trees model. (Steelberg, Par. 0054: “In contrast, in some embodiments, modeling module 200-2 may train one or more transcription models using both existing media files and the most recent data [transcribed data] available for the input media file. In some embodiments, the training modules 200-1 and 200-2 may include machine learning algorithms such as, but not limited to, deep learning neural networks; gradient boosting, random forests, support vector machine learning, decision trees, variational auto-encoders [VAE], generative adversarial networks, recurrent neural networks, and convolutional neural networks [CNN], faster R-CNNs, mask R-CNNs, and SSD neural networks.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Mairesse in view of Steelberg such that the word selector comprises a trained decision trees model, in order to generate a revised transcription based on the received reward function, as evidenced by Steelberg (See Par. 0032).

Regarding claim 8, Mairesse does not explicitly teach, but Steelberg further teaches wherein the word selector comprises a Random Forests model. (Steelberg, Par. 0038: “At 110, an initial transcription neural network model can be used to select an initial transcription engine for transcribing the input media file [or a portion of the input media file]. The initial a transcription neural network model [“transcription model”] that can be previously trained. Based the features profile of the input media file, the transcription model may then use one or more machine learning algorithms to generate a list of one or more transcription engines [candidate engines] with the highest predicted transcription accuracy. The one or more machine learning algorithms may include, but not limited to: a deep learning neural network; a gradient boosting algorithm (which may also be referred to as gradient boosted trees), and a random forest algorithm. In some embodiments, all three of the mentioned machine learning algorithms may be used—using model stacking—to create a multi-model.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Mairesse in view of Steelberg such that the word selector comprises a Random Forests model, in order to generate a revised transcription based on the received reward function, as evidenced by Steelberg (See Par. 0032).

Regarding claim 11, Mairesse teaches A system for detecting and resolving mis-transcriptions in a transcript generated by an automatic speech recognition system when transcribing spoken words, the system comprising: a first filter arranged to receive a machine language generated transcript of a speech signal, the first filter including a first machine learning system and a second machine learning system (Mairesse, Col. 3, lines 57 - 63: “A user 10 may speak an utterance including a command. The user's utterance is captured by a microphone of device 110. The system may then determine [152] audio data corresponding to the utterance, for example as a result of the microphone converting the sound to an audio data signal. The system may then perform [154] ASR processing [transcript generated] on the audio data, for example using techniques described below.”).
arranged to analyze the machine language generated transcript in tandem or in parallel and find a region of low confidence indicative of a mis-transcription, and; (Mairesse, Col. 3, lines 36 – 38: “Each processing point may use a model configured using machine learning techniques.”, and Col. 3, line 64 - Col. 4, line 8: “The system may then process [156] the ASR results [transcription] with a first model [machine learning system] to determine if disambiguation [low confidence region] of ASR hypotheses is desired. The first model [machine learning system] may be trained to determine, using confidence scores corresponding to a plurality of ASR hypotheses, whether to select a single ASR hypothesis or whether to perform further selection from among the plurality of ASR hypotheses. If disambiguation [low confidence region] is desired, the system may process [158] ASR results [mis-transcription] with a second model [machine learning system] to determine what hypotheses should be selected for disambiguation. The second model [machine learning system] may be trained to determine, also using confidence scores, which of the plurality of ASR hypothesis to select for disambiguation.”).
the first machine learning system and the second machine learning system further arranged to analyze the region of low confidence and predict an improvement to the region of low confidence; and a second filter arranged to receive the machine generated transcript and the predicted improvement to the region of low confidence from the first filter, and, (Mairesse, Col. 3, line 64 - Col. 4, line 8: “The system may then process [156] the ASR results [transcription] with a first model [machine learning system] to determine if disambiguation [low confidence region] of ASR hypotheses is desired. The first model [machine learning system] may be trained to determine, using confidence scores corresponding to a plurality of ASR hypotheses, whether to select a single ASR hypothesis or whether to perform further selection from among the plurality of ASR hypotheses. If disambiguation [low confidence region] is desired, the system may process [158] ASR results [mis-transcription] with a second model [machine learning system] to determine what hypotheses should be selected for disambiguation. The second model [machine learning system] may be trained to determine, also using confidence scores, which of the plurality of ASR hypothesis to select for disambiguation.”).
Mairesse fails to explicitly disclose, however, Steelberg teaches based on the predicted improvement to the region of low confidence, select a replacement word for the mis-transcription, and replace the mis-transcription by the replacement word. (Steelberg, Par. 0127: “Truth engine 1140 includes algorithms and instructions that, when executed by a processor, cause the processor to identify transcription errors [mis-transcription] in one or more parts of a transcribed portion, for example, by identifying words with confidence score below a predetermined threshold. The truth engine 1140 may then correct [select] the identified errors, for example, by replacing the words with low confidence score with correct words. In some embodiments, the truth engine may utilize machine learning model to find the correct replacement words. The truth engine 1140 may also label [or tag] the corrected words.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Mairesse in view of Steelberg to, based on the predicted improvement to the region of low confidence, select a replacement word for the mis-transcription, and replace the mis-transcription by the replacement word, in order to generate a revised transcription based on the received reward function, as evidenced by Steelberg (See Par. 0032).

Claims 3 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Mairesse (US9558740B1), and Steelberg (US20200286485A1), and in further view of Scott Fischthal (US5822741)(hereinafter “Fischthal”).

Regarding claims 3 and 19, Mairesse and Steelberg fail to explicitly disclose, however Fischthal teaches wherein the first machine learning system and the second machine learning system are connected in parallel (Fischthal, Col. 2, lines 6-8: “The neural network or artificial neural system is defined by a plurality of these simple, densely interconnected processing units which operate in parallel.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Mairesse and Steelberg in view of Fischthal such that the first machine learning system and the second machine learning system are connected in parallel, in order to employ genetic algorithms, as evidenced by Fischthal (See Col. 6, lines 35-36).
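By way of contrast with a serial arrangement, a parallel connection of two analyzers over the same transcript might be sketched as below. This is only an assumed illustration: the two flagging rules, the vocabulary set, and the union used to combine results are hypothetical and are not drawn from Fischthal, whose cited passage concerns processing units within a neural network.

```python
from concurrent.futures import ThreadPoolExecutor


def flag_low_confidence(words, threshold=0.5):
    """First analyzer: indices of words with confidence below the threshold."""
    return {i for i, (_, conf) in enumerate(words) if conf < threshold}


def flag_out_of_vocabulary(words, vocabulary):
    """Second analyzer: indices of words absent from a known vocabulary."""
    return {i for i, (text, _) in enumerate(words) if text not in vocabulary}


def analyze_in_parallel(words, vocabulary):
    """Submit both analyzers on the same input and merge their findings."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(flag_low_confidence, words)
        second = pool.submit(flag_out_of_vocabulary, words, vocabulary)
        return sorted(first.result() | second.result())


words = [("book", 0.9), ("a", 0.8), ("fright", 0.3), ("too", 0.7), ("boston", 0.95)]
vocabulary = {"book", "a", "flight", "to", "boston"}
print(analyze_in_parallel(words, vocabulary))
```

Neither analyzer depends on the other's output, so they can run at the same time and their results are combined afterwards.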

Claims 9 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Mairesse (US9558740B1), and Steelberg (US20200286485A1), and in further view of Dua et al. (US20200019863A1)(hereinafter “Dua”).

Regarding claims 9 and 16, Mairesse and Steelberg fail to explicitly disclose, however Dua teaches a dataset containing all unigrams, bigrams, trigrams and quadgrams present in a corpus of transcripts and their respective probabilities. (Dua, Par. 0124: “Each row of the concatenated matrix is processed by a neural network to generate a bag-of-ngrams [BoN] vector data structure representing the probability distribution over the ngrams of the vocabulary [step 618].”, and Par. 0022: “The bag-of-ngrams is encoded as a probability distribution over a full vocabulary V. Ngrams that do not belong to the bag have a probability of zero, while ngrams in the bag have probability larger than zero.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Mairesse and Steelberg in view of Dua to employ a dataset containing all unigrams, bigrams, trigrams and quadgrams present in a corpus of transcripts and their respective probabilities, in order to improve knowledge and learn with each iteration and interaction through machine learning processes, as evidenced by Dua (see Par. 0073).
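To make the recited dataset concrete, the sketch below builds a table of all unigrams through quadgrams in a small corpus together with a probability for each. It assumes, purely for illustration, that the probability of an n-gram is its relative frequency among n-grams of the same order; neither Dua nor the claim language specifies this estimate, and the function name and whitespace tokenization are hypothetical.

```python
from collections import Counter


def ngram_probabilities(transcripts, max_n=4):
    """Map each order n (1..max_n) to {ngram tuple: relative frequency}."""
    table = {}
    for n in range(1, max_n + 1):
        counts = Counter()
        for text in transcripts:
            tokens = text.lower().split()
            # Slide a window of length n across the token sequence.
            counts.update(
                tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
            )
        total = sum(counts.values())
        table[n] = {gram: c / total for gram, c in counts.items()} if total else {}
    return table


corpus = ["the cat sat", "the cat ran"]
table = ngram_probabilities(corpus)
print(table[2][("the", "cat")])  # 0.5
```

In this toy corpus no transcript has four tokens, so the quadgram table is empty.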

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Mairesse (US9558740B1), and Steelberg (US20200286485A1), and in further view of Kibre et al. (US20150243278A1)(hereinafter “Kibre”).

Regarding claim 10, Mairesse and Steelberg fail to explicitly disclose, however Kibre teaches a phonetic encoding component for determining phonetic similarity between lexical items. (Kibre, Par. 0059: “Phonetic features are also indicative of the similarity between the inputs of successive input pairs. Inputs that do not appear similar in written form could be indeed very close in the spoken form. For example, the phonetic similarity of the successive input pair "western palo alto" → "west elm palo alto" is even greater than the lexical similarity.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Mairesse and Steelberg in view of Kibre to employ a phonetic encoding component for determining phonetic similarity between lexical items, in order to improve recognition accuracy, as evidenced by Kibre (See Par. 0002).
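As one example of what a phonetic encoding component could look like, the sketch below uses American Soundex, a well-known phonetic code, and treats two lexical items as phonetically similar when their codes match. Soundex is swapped in here only for illustration; neither Kibre nor the application is stated to use it, and the helper name phonetically_similar is hypothetical.

```python
# Soundex digit classes; vowels and h, w, y have no code.
CODES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def soundex(word):
    """Return the four-character American Soundex code for a word."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    result = word[0].upper()
    prev = CODES.get(word[0], "")
    for ch in word[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        # h and w do not separate letters that share a code; vowels do.
        if ch not in "hw":
            prev = code
    return (result + "000")[:4]


def phonetically_similar(a, b):
    """Treat two words as similar when their Soundex codes are equal."""
    return soundex(a) == soundex(b)


print(soundex("Robert"), soundex("Rupert"))  # R163 R163
print(phonetically_similar("their", "there"))  # True
```

A code of this kind compares single words; comparing multi-word inputs such as those in the Kibre example would need an additional step that is not shown here.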

Claims 5, 13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Mairesse (US9558740B1), and Steelberg (US20200286485A1), and in further view of Hilleli et al. (US20210099317A1)(hereinafter “Hilleli”).


Regarding claims 5 and 13, Mairesse and Steelberg fail to explicitly disclose, however Hilleli teaches wherein the second machine learning system comprises a BERT model architecture. (Hilleli, Par. 0086: “In an example illustration of a model that may be used to define beginnings and/or ends of action items, BERT models or other similar models can be used. BERT generates a language model by using an encoder to read content all at once or in parallel [i.e., it is bidirectional], as opposed to reading text from left to right, for example.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Mairesse and Steelberg in view of Hilleli such that the second machine learning system comprises a BERT model architecture, in order to improve these virtual assistants because they can clarify action items using contextual data, as evidenced by Hilleli (See Par. 0033).

Regarding claim 20, Mairesse does not explicitly teach, but Steelberg teaches wherein the first machine learning system comprises a RAILS model architecture, [[the second machine learning system comprises a BERT model architecture,]] and the word selector comprises a trained decision trees model. (Given that RAILS is not a term of art known to one of ordinary skill in the art, the broadest reasonable interpretation is determined in view of the Specification, where Applicant has acted as their own lexicographer [MPEP 2111.01[IV]] in defining the term “RAILS” in Par. 99 of the originally filed specification, understood to be a real-time model receiving low word confidences, and outputting higher probability replacements, and in view of this definition, Steelberg teaches Par. 0127: “Truth engine 1140 includes algorithms and instructions that, when executed by a processor, cause the processor to identify transcription errors [low word confidence] in one or more parts of a transcribed portion, for example, by identifying words with confidence score below a predetermined threshold. The truth engine 1140 may then correct [higher probability replacements] the identified errors, for example, by replacing the words with low confidence score with correct words. In some embodiments, the truth engine may utilize machine learning model to find the correct replacement words.”)
and the word selector comprises a trained decision trees model. (Steelberg, Par. 0054: “In contrast, in some embodiments, modeling module 200-2 may train one or more transcription models using both existing media files and the most recent data [transcribed data] available for the input media file. In some embodiments, the training modules 200-1 and 200-2 may include machine learning algorithms such as, but not limited to, deep learning neural networks; gradient boosting, random forests, support vector machine learning, decision trees, variational auto-encoders [VAE], generative adversarial networks, recurrent neural networks, and convolutional neural networks [CNN], faster R-CNNs, mask R-CNNs, and SSD neural networks.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Mairesse in view of Steelberg such that the first machine learning system comprises a RAILS model architecture, and the word selector comprises a trained decision trees model, in order to generate a revised transcription based on the received reward function, as evidenced by Steelberg (See Par. 0032).
Mairesse and Steelberg fail to explicitly disclose, however Hilleli teaches [[wherein the first machine learning system comprises a RAILS model architecture,]] the second machine learning system comprises a BERT model architecture, [[and the word selector comprises a trained decision trees model.]] (Hilleli, Par 0086:” In an example illustration of a model that may be used to define beginnings and/or ends of action items, BERT models or other similar models can be used. BERT generates a language model by using an encoder to read content all at once or in parallel [i.e., it is bidirectional], as opposed to reading text from left to right, for example.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Mairesse and Steelberg in view of Hilleli such that the second machine learning system comprises a BERT model architecture, in order to improve virtual assistants, because they can clarify action items using contextual data, as evidenced by Hilleli (see Par. 0033).

Claims 6 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Mairesse (US9558740B1) and Steelberg (US20200286485A1), and further in view of Henry Mao, “GPT3 and SEO: why AI will revolutionize your content forever”, Feb 2019, Jenni.ai Blog, accessible at: https://jenni.ai/blog/gpt3-seo-content-marketing (hereinafter “Mao”).

Regarding claims 6 and 14, Mairesse and Steelberg fail to explicitly disclose, but Mao teaches wherein the second machine learning system comprises a GPT-3 model architecture. (Mao, Page 1: “OpenAI has released a new version of Generative Pre-trained Transformer version 3 [in short, GPT-3 or GPT 3] with beta API access GPT 3, much like its predecessor GPT 2, is a large deep neural network that can automatically generate text ... It is an advanced AI that learns how to imitate human writing from the web.”).
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Mairesse and Steelberg in view of Mao such that the second machine learning system comprises a GPT-3 model architecture, in order to learn without human-labeled data, as evidenced by Mao (see page 4).


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. Abdulkader et al. (US-20200286487A1) teaches at Par. 0042: “For each inputted set of transcriptions 226-228 and/or associated features, machine learning model 208 may generate a score (e.g., scores 230) reflecting the accuracy or correctness of the transcription from the contributor ASR, based on the corresponding transcriptions 228 and/or distribution of transcriptions 228 produced by selector ASRs 224. For example, machine learning model 208 may produce a score that represents an estimate of the overall or cumulative error rate between the transcription from the contributor ASR and the corresponding collection of transcriptions 228 produced by selector ASRs 224. During calculation of the score, machine learning model 208 may apply different weights to certain transcriptions 228 and/or portions of one or more transcriptions 226-228 (e.g., words of different lengths, words at the beginning or end of each transcription, etc.). As a result, machine learning model 208 may use transcriptions 228 from selector ASRs 224 as “votes” regarding the correctness or accuracy of a transcription from a given contributor ASR.”
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DARIOUSH AGAHI whose telephone number is (408)918-7689. The examiner can normally be reached Monday - Thursday and alternate Fridays, 7:30-4:30 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DARIOUSH AGAHI/             Examiner, Art Unit 2656

/MICHELLE M KOETH/             Primary Examiner, Art Unit 2656