DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 1/31/2020 was filed after the mailing date of 1/31/2020.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 4, 16 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al (US20200112575) in view of Liu et al (US20170293725).
Regarding claim 1, Lin teaches a method for implementing a prototype sequence machine learning network, comprising: 
mapping one or more labeled sequence datasets (S510 and S520 in fig. 5, para. [0032], the raw domain name 611 is, for example, “google.com”. Correspondingly, the processor 404 can divide the raw domain name 611 into “google” (i.e., the specific part 612) and the “.com” part in step S510, and extract the specific part 612 in step S520) using a sequence encoder (S530 in fig. 5, para. [0033], in step S530, the processor 404 can encode the characters 612a-612f into encoded data 613a, 613b, 613c, 613d, 613e, and 613f) to generate an embedded vector having a fixed length (S540 and S550 in fig. 5, fig. 7, para. [0034], [0036], in step S540, the processor 404 may pad the encoded data 613a-613f to a specific length; in step S550, the processor 404 can project the encoded data 614 being padded as a plurality of embedded vectors);
generating a vector based on the embedded vector (S560 in fig. 5, para. [0039], result vector); and
classifying one or more prediction values using a fully-connected layer that applies a function against the generated vector (S570 in fig. 5, para. [0044], in step S570, the processor 404 can convert the result vector into a prediction probability via the fully-connected layer and the specific function).
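
For illustration only, the extraction, encoding, and padding steps cited from Lin (S510 through S540) can be sketched as below. This is a minimal sketch and not Lin's implementation: the character alphabet, the index values, the use of 0 as the padding value, padding at the end rather than the front, and the function names are all assumptions introduced here.

```python
from typing import List

# Hypothetical character alphabet; Lin does not specify one in the cited paragraphs.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-"
PAD = 0                                   # assumed padding value
VOCAB = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
UNKNOWN = len(ALPHABET) + 1               # assumed code for characters outside the alphabet


def extract_specific_part(domain: str) -> str:
    """Drop the trailing part after the last dot, e.g. 'google.com' -> 'google'."""
    return domain.rsplit(".", 1)[0]


def encode_and_pad(text: str, length: int) -> List[int]:
    """Encode each character as an integer, then truncate or pad to a fixed length."""
    codes = [VOCAB.get(ch, UNKNOWN) for ch in text.lower()]
    codes = codes[:length]
    return codes + [PAD] * (length - len(codes))


if __name__ == "__main__":
    part = extract_specific_part("google.com")
    print(part, encode_and_pad(part, 10))
```

The fixed-length integer sequence is what an embedding layer would then project into embedded vectors (S550); that projection is omitted here because it depends on trained parameters.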

Lin fails to teach determining a score between the embedded vector and one or more prototype vectors to generate one or more similarity vectors; and 
applying a weight matrix against the one or more similarity vectors.

However Liu teaches determining a score between an embedded vector and one or more prototype vectors (para. [0023], claim 9, a similarity matching process 125 is applied to the sentence embedding vectors 110 and the question embedding vector 120 to generate a similarity score for each of the sentence embedding vectors 110) to generate one or more similarity vectors (claim 9, selecting a predetermined number of the highest ranking sentence embedding vectors as the subset of the sentence embedding vectors); and 
classifying one or more prediction values using a fully-connected layer that applies a weight matrix against the one or more similarity vectors (para. [0009], The subset of the sentence embedding vectors most similar to the question embedding vector is identified and divided into a first sequence of words. The subset of the question embedding vector is divided into a second sequence of words. The first sequence of words and the second sequence of words are passed through a plurality of LSTM cells sequentially to yield a plurality of outputs corresponding to different states. The outputs are combined into a single input vector and a softmax function is applied to the single input vector to generate a predicted answer).
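
For illustration only, the similarity scoring and selection cited from Liu (para. [0023], claim 9) can be sketched as below. This is a sketch under stated assumptions: cosine similarity is used as the scoring function, which the cited passages do not specify, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: Sequence[float], candidates: Sequence[Sequence[float]], k: int
) -> List[Tuple[int, float]]:
    """Score every candidate against the query and keep the k highest-ranking ones."""
    scored = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    question = [1.0, 0.0]
    sentences = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(top_k_similar(question, sentences, k=2))
```

Here `query` stands in for the question embedding vector 120 and `candidates` for the sentence embedding vectors 110; the returned subset corresponds to the predetermined number of highest-ranking vectors recited in claim 9 of Liu.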

Therefore taking the combined teachings of Lin and Liu as a whole, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to incorporate the steps of Liu into the method of Lin. The motivation to combine Liu and Lin would be to automatically understand the results presented in image analytics reports through a question answering system (para. [0005] of Liu).


Regarding claim 3, the modified invention of Lin fails to explicitly teach a method wherein the score is assigned a value of zero when a sequence embedding of the embedding vector is not substantially equal to the one or more prototype vectors. However the value of zero is interpreted to be an arbitrarily selected value. One of ordinary skill in the art would have found it obvious to assign any desired value for the score (claim 9 of Liu).


Regarding claim 4, the modified invention of Lin fails to explicitly teach a method wherein the score is assigned a value of one when a sequence embedding of the embedding vector is substantially equal to the one or more prototype vectors. However the value of one is interpreted to be an arbitrarily selected value. One of ordinary skill in the art would have found it obvious to assign any desired value for the score (claim 9 of Liu).


Regarding claim 16, the modified invention of Lin teaches a method wherein the sequence encoder is a long short-term memory (LSTM) network (para. [0036] of Lin, Specifically, the LSTM model generally includes an embedded layer, an LSTM layer, and a fully-connected layer, and the step S550 is to establish the above embedded layer).


Regarding claim 19, the claim recites similar limitations as those claimed in claim 1 and is therefore rejected for the same reasons as stated above.


Claims 2 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al (US20200112575) and Liu et al (US20170293725) in view of Karthik et al (US20190361767).
Regarding claim 2, the modified invention of Lin teaches a method further comprising: 
computing a predicted probability for the one or more labeled sequence datasets using a softmax layer (para. [0009] of Liu, The subset of the sentence embedding vectors most similar to the question embedding vector is identified and divided into a first sequence of words. The subset of the question embedding vector is divided into a second sequence of words. The first sequence of words and the second sequence of words are passed through a plurality of LSTM cells sequentially to yield a plurality of outputs corresponding to different states. The outputs are combined into a single input vector and a softmax function is applied to the single input vector to generate a predicted answer).

Lin fails to teach wherein the softmax layer divides an exponential of the one or more prediction values by a sum of the one or more prediction values. However Karthik teaches a softmax layer which divides an exponential of one or more prediction values by a sum of the one or more prediction values (para. [0053], The above softmax function (normalized exponential) computes the exponential of every softmax score, then normalizes the exponentials by dividing by the sum of all of the exponentials).
Therefore taking the combined teachings of Lin and Liu with Karthik as a whole, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to incorporate the steps of Karthik into the method of Lin and Liu. The motivation to combine Karthik, Liu and Lin would be to automatically react to data ingest exceptions in a data pipeline system based on determined probable cause of the exception (para. [0007] of Karthik).
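
For illustration only, the normalized exponential described in para. [0053] of Karthik (computing the exponential of every score, then dividing by the sum of all of the exponentials) can be sketched as below. Subtracting the maximum score before exponentiating is an assumption added here for numerical stability; it does not change the result and is not taken from the reference.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Exponentiate each score and divide by the sum of all exponentials."""
    shift = max(scores)                       # stability shift (assumption, not from Karthik)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(softmax([1.0, 2.0, 3.0]))
```

The outputs are non-negative and sum to one, so they can be read as predicted probabilities over the classes.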


Regarding claim 20, the claim recites similar limitations as those claimed in claims 1-2 and is therefore rejected for the same reasons as stated above.


Claims 7 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al (US20200112575) and Liu et al (US20170293725) in view of Branavan et al (US10210244).
Regarding claim 7, the modified invention of Lin fails to teach a method wherein a clustering regularization function is applied to the one or more labeled sequence datasets and the one or more prototype vectors, wherein the clustering regularization function ensures a clustering structure in a latent space.
However Branavan teaches wherein a clustering regularization function is applied to the one or more labeled sequence datasets and the one or more prototype vectors (abstract, col. 13 lines 14-36, Messages corresponding to an intent may be clustered into clusters of similar messages, and a prototype message may be obtained for each cluster to provide a human understandable description of the cluster). Ensuring a clustering structure in a latent space is interpreted to be an intended use.
Therefore taking the combined teachings of Lin and Liu with Branavan as a whole, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to incorporate the steps of Branavan into the method of Lin and Liu. The motivation to combine Branavan, Liu and Lin would be to improve the natural language interface, such as by creating a new intent with a cluster or moving a cluster to a different intent (abstract of Branavan).


Regarding claim 11, the modified invention of Lin fails to teach a method further comprising: 
projecting the one or more prototype vectors to a subsequence of events within a training dataset.

However Branavan teaches projecting one or more prototype vectors to a subsequence of events within a training dataset (col. 13 lines 14-36, To describe the cluster, a message may be generated or selected from the cluster and referred to as a prototype message. Any appropriate techniques may be used to obtain a prototype message for a cluster. In some implementations, the prototype message may be a most frequent message, a message with a largest count, a message closest to the center of the cluster (e.g., geometrically or by center of mass), a message having a selected language type (e.g., simplest language, most commonly used words, or the inclusion or exclusion of jargon or selected words), or a randomly selected message).
Therefore taking the combined teachings of Lin and Liu with Branavan as a whole, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to incorporate the steps of Branavan into the method of Lin and Liu. The motivation to combine Branavan, Liu and Lin would be to improve the natural language interface, such as by creating a new intent with a cluster or moving a cluster to a different intent (abstract of Branavan).
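
For illustration only, one of the options listed in col. 13 lines 14-36 of Branavan, selecting as the prototype the message closest to the center of the cluster, can be sketched as below. This is a sketch under stated assumptions: messages are assumed to be already represented as numeric vectors, the center is taken to be the arithmetic mean, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of the cluster's vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def closest_to_center(vectors: Sequence[Sequence[float]]) -> int:
    """Index of the vector with the smallest Euclidean distance to the centroid."""
    center = centroid(vectors)
    return min(range(len(vectors)), key=lambda i: math.dist(vectors[i], center))


if __name__ == "__main__":
    cluster = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.1]]
    print(closest_to_center(cluster))
```

Because the selected prototype is an actual member of the cluster, it corresponds to a concrete message from the data rather than a synthetic point.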


Claims 8-10 and 13-15 are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al (US20200112575) and Liu et al (US20170293725) in view of Wohlwend (US10747957).
Regarding claim 8, the modified invention of Lin fails to teach a method wherein an evidence regularization function is applied to ensure the one or more prototype vectors are approximately equal to the one or more labeled sequence datasets.
However Wohlwend teaches ensuring the one or more prototype vectors are approximately equal to the one or more labeled sequence datasets (col. 6 lines 7-17, a Euclidean distance may be computed, and an intent may be selected as corresponding to the prototype vector that is closest to the message embedding).
Therefore taking the combined teachings of Lin and Liu with Wohlwend as a whole, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to incorporate the steps of Wohlwend into the method of Lin and Liu. The motivation to combine Wohlwend, Liu and Lin would be to efficiently determine the intents of messages received from customers and to efficiently update the automated communications system when it is necessary or desirable to add additional intents or modify existing intents (col. 2 lines 54-59 of Wohlwend).
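
For illustration only, the selection described in col. 6 lines 7-17 of Wohlwend (computing a Euclidean distance and selecting the intent whose prototype vector is closest to the message embedding) can be sketched as below. The intent names, the vector values, and the function name are hypothetical and are not taken from the reference.

```python
import math
from typing import Dict, Sequence


def nearest_intent(
    embedding: Sequence[float], prototypes: Dict[str, Sequence[float]]
) -> str:
    """Return the intent whose prototype vector has the smallest Euclidean distance."""
    return min(prototypes, key=lambda name: math.dist(embedding, prototypes[name]))


if __name__ == "__main__":
    # Hypothetical intents and prototype vectors.
    prototypes = {"billing": [0.0, 0.0], "support": [5.0, 5.0]}
    print(nearest_intent([1.0, 1.0], prototypes))
```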


Regarding claim 9, the modified invention of Lin fails to teach a method further comprising: 
assigning the one or more prototype vectors with a sequence embedding vector provided from a training dataset, wherein the sequence embedding vector is approximately equal to the one or more prototype vectors.

However Wohlwend teaches assigning one or more prototype vectors with a sequence embedding vector provided from a training dataset (col. 7 lines 5-11, the initial model m may be used to create an initial set of prototype vectors from the training data). The term “approximately equal” is relative and therefore any two vectors may be interpreted to be approximately equal. 
Therefore taking the combined teachings of Lin and Liu with Wohlwend as a whole, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to incorporate the steps of Wohlwend into the method of Lin and Liu. The motivation to combine Wohlwend, Liu and Lin would be to efficiently determine the intents of messages received from customers and to efficiently update the automated communications system when it is necessary or desirable to add additional intents or modify existing intents (col. 2 lines 54-59 of Wohlwend).


Regarding claim 10, it would have been obvious for the assigning of the one or more prototype vectors to occur at a predetermined epoch, such as after receiving the training data (col. 7 lines 5-11 of Wohlwend).


Regarding claim 13, the modified invention of Lin fails to teach a method further comprising: 
deleting at least one of the one or more prototype vectors during a training process.

However Wohlwend teaches deleting at least one of the one or more prototype vectors during a training process (col. 7 lines 37-45, The parameters of model m may be updated, for example, by iterating over the training data and minimizing the negative log-probability of the function f. After updating the parameters of model m, the process may be repeated. The updated model m may be used to compute updated prototype vectors, and the updated prototype vectors may be used to again update the parameters of model m. The process may be repeated until a desired convergence criterion has been met).
Therefore taking the combined teachings of Lin and Liu with Wohlwend as a whole, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to incorporate the steps of Wohlwend into the method of Lin and Liu. The motivation to combine Wohlwend, Liu and Lin would be to efficiently determine the intents of messages received from customers and to efficiently update the automated communications system when it is necessary or desirable to add additional intents or modify existing intents (col. 2 lines 54-59 of Wohlwend).


Regarding claim 14, the modified invention of Lin fails to teach a method further comprising: 
revising at least one of the one or more prototype vectors during a training process.

However Wohlwend teaches revising at least one of the one or more prototype vectors during a training process (col. 7 lines 37-45, The parameters of model m may be updated, for example, by iterating over the training data and minimizing the negative log-probability of the function f. After updating the parameters of model m, the process may be repeated. The updated model m may be used to compute updated prototype vectors, and the updated prototype vectors may be used to again update the parameters of model m. The process may be repeated until a desired convergence criterion has been met).
Therefore taking the combined teachings of Lin and Liu with Wohlwend as a whole, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to incorporate the steps of Wohlwend into the method of Lin and Liu. The motivation to combine Wohlwend, Liu and Lin would be to efficiently determine the intents of messages received from customers and to efficiently update the automated communications system when it is necessary or desirable to add additional intents or modify existing intents (col. 2 lines 54-59 of Wohlwend).


Regarding claim 15, the modified invention of Lin fails to teach a method further comprising: 
modifying at least one of the one or more prototype vectors during a training process.

However Wohlwend teaches modifying at least one of the one or more prototype vectors during a training process (col. 7 lines 37-45, The parameters of model m may be updated, for example, by iterating over the training data and minimizing the negative log-probability of the function f. After updating the parameters of model m, the process may be repeated. The updated model m may be used to compute updated prototype vectors, and the updated prototype vectors may be used to again update the parameters of model m. The process may be repeated until a desired convergence criterion has been met).
Therefore taking the combined teachings of Lin and Liu with Wohlwend as a whole, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to incorporate the steps of Wohlwend into the method of Lin and Liu. The motivation to combine Wohlwend, Liu and Lin would be to efficiently determine the intents of messages received from customers and to efficiently update the automated communications system when it is necessary or desirable to add additional intents or modify existing intents (col. 2 lines 54-59 of Wohlwend).


Claims 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Lin et al (US20200112575) and Liu et al (US20170293725) in view of Elkind et al (US20190273509).
Regarding claim 17, the modified invention of Lin fails to teach a method wherein the sequence encoder is a bi-directional LSTM network.
However Elkind teaches wherein a sequence encoder (940 in fig. 9, para. [0111], The output of the convolutional filter 930 (or input at block 920 in the absence of a convolutional filter 930) is input into one or more encoding recurrent neural network layers 940. As discussed previously, encoding recurrent neural network layers 940 receive samples as input, generate output, and update their state values) is a bi-directional LSTM network (para. [0080], Other types of RNNs may also be used, including Bidirectional RNNs, Deep (Bidirectional) RNNs, among others).
Therefore taking the combined teachings of Lin and Liu with Elkind as a whole, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to incorporate the steps of Elkind into the method of Lin and Liu. The motivation to combine Elkind, Liu and Lin would be to reduce or remove noise and improve classification (para. [0025] of Elkind).


Regarding claim 18, the modified invention of Lin fails to teach a method wherein the sequence encoder is a gated recurrent unit (GRU) network.
However Elkind teaches wherein a sequence encoder (725 in fig. 7, 940 in fig. 9, para. [0111], The output of the convolutional filter 930 (or input at block 920 in the absence of a convolutional filter 930) is input into one or more encoding recurrent neural network layers 940. As discussed previously, encoding recurrent neural network layers 940 receive samples as input, generate output, and update their state values) is a gated recurrent unit (GRU) network (para. [0128], Encoder RNN 725 may include any number of layers of one or more types of RNN cells, each layer including an LSTM, GRU, or other RNN cell type).
Therefore taking the combined teachings of Lin and Liu with Elkind as a whole, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to incorporate the steps of Elkind into the method of Lin and Liu. The motivation to combine Elkind, Liu and Lin would be to reduce or remove noise and improve classification (para. [0025] of Elkind).


Allowable Subject Matter
Claims 5, 6, and 12 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEON VIET Q NGUYEN whose telephone number is (571)270-1185. The examiner can normally be reached Mon-Fri 11AM-7PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Claire Wang, can be reached at 571-270-1051. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/LEON VIET Q NGUYEN/           Primary Examiner, Art Unit 2663