DETAILED ACTION
Response to Arguments
	Applicant argues that the prior art of Luo fails to teach at least the aforementioned features. See pgs. 10-11 of Applicant’s Remarks submitted on 09/13/2022 (stating that “Luo merely describes using REs to guide attention of a neural network performing language processing and that Luo describes using the evaluation outcomes of REs as features which are fed to NN models and is silent as to the above-recited claimed features”).
Examiner respectfully disagrees. Applicant’s Remarks have not pointed out the specific claim limitations that Applicant believes Luo does not teach. Instead, Applicant cites the entire independent claim, describes what Applicant thinks Luo teaches, and then concludes that Luo does not teach all of the features of independent claim 1. See pg. 11 of Applicant’s Remarks submitted on 09/13/2022 (stating that “Luo merely describes using REs to guide attention of a neural network performing language processing and that Luo describes using the evaluation outcomes of REs as features which are fed to NN models. Accordingly, Luo is silent as to the above-recited claimed features”).
Applicant has not pointed to disagreements with the previous Office Action of 05/13/2022, nor has Applicant discussed the reference of Luo applied against the claims by explaining how the claims avoid the prior art of Luo or are distinguished from it. A blanket statement that the prior art of Luo does not teach independent claim 1 is no substitute for explaining how the claim limitations of claim 1 (and in this case the amended claim limitations) overcome the teachings of Luo as detailed in the previous Office Action of 05/13/2022.
With that being said, the 102 rejection has not been withdrawn. However, in light of Applicant’s amendments and/or arguments, the objections to the drawings, the rejection under 112(b), the rejection under 112(d), and the rejection under 101 have been withdrawn.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-2, 4-12, and 14-19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Luo, Bingfeng, et al. "Marrying up regular expressions with neural networks: A case study for spoken language understanding." arXiv preprint arXiv:1805.05588 (2018) (“Luo”).
Regarding claim 1, Luo teaches a computer-implemented method comprising: receiving a prediction input (Luo, pg. 2087, left-column, see also fig. 1, “We use the ATIS dataset…to evaluate our approach. This dataset is widely used in SLU research. It includes queries of flights, meal, etc… [w]e also split words like Miami’s into Miami ’s during the tokenization phase to reduce the number of words that do not have a pre-trained word embedding. This strategy is useful for fewshot learning.”); generating a plurality of rule-based prediction scores by executing one or more prediction rules on the prediction input, wherein (a) each prediction rule of the one or more prediction rules is associated with a rule condition and one or more predictive weights (Luo, pg. 2085, right-column, “Taking the sentence in Fig. 1 for example, the RE: /^flights? from/ that leads to intent flight means that flights from are the key words to decide the intent flight… $\mathrm{logit}_k = w_k s_k + b_k$” Luo teaches: RE: /^flights? from/ (i.e. generating a plurality of rule-based prediction scores by executing one or more prediction rules on the prediction input, wherein (a) each prediction rule of the one or more prediction rules is associated with a rule condition) $\mathrm{logit}_k = w_k s_k + b_k$
(i.e. and one or more predictive weights)), (b) each predictive weight of the one or more predictive weights is associated with a related prediction category of a plurality of prediction categories (Luo, pg. 2086, left-column, “Second, apart from indicating a sentence for intent k (positive REs), a RE can also indicate that a sentence does not express intent k (negative REs)… We denote the logits computed by positive attentions as $\mathrm{logit}_{pk}$, and those by negative attentions as $\mathrm{logit}_{nk}$, the final logit for intent k can then be calculated as: $\mathrm{logit}_k = \mathrm{logit}_{pk} - \mathrm{logit}_{nk}$” Luo teaches: $\mathrm{logit}_k = \mathrm{logit}_{pk} - \mathrm{logit}_{nk}$
(i.e. each predictive weight of the one or more predictive weights) (positive REs), a RE can also indicate that a sentence does not express intent k (negative REs) (i.e. is associated with a related prediction category of a plurality of prediction categories)), and (c) each prediction category of the plurality of prediction categories is associated with a rule-based prediction score (Luo, pg. 2085, right-column, “The probability $p_k$ that the input sentence expresses intent k is computed by $p_k = \frac{\exp(\mathrm{logit}_k)}{\sum_k \exp(\mathrm{logit}_k)}$.” Luo teaches: $p_k = \frac{\exp(\mathrm{logit}_k)}{\sum_k \exp(\mathrm{logit}_k)}$
(i.e. each prediction category of the plurality of prediction categories is associated with a rule-based prediction score)); determining a rule-based prediction output based at least in part on the plurality of rule-based prediction scores (Luo, pg. 2085, right-column, “The probability $p_k$ that the input sentence expresses intent k is computed by $p_k = \frac{\exp(\mathrm{logit}_k)}{\sum_k \exp(\mathrm{logit}_k)}$.”); and providing the plurality of rule-based prediction scores and the rule-based prediction output to a machine learning engine, wherein the machine learning engine is configured to generate a machine-learning based prediction output based at least in part on the plurality of rule-based prediction scores and the rule-based prediction output (Luo, pg. 2085, see also fig. 2(a), “At the network module level, we explore ways to utilize the clue words in the surface form of a RE (bold blue arrows and words in 2 of Fig. 2) to guide the attention module in NNs.” & see also Luo, pg. 2086, right column, see also fig. 2(a), “At the output level, REs are used to amend the output of NNs. At this level, we take the same approach used for intent detection and slot filling (see 3 in Fig. 2).”),
wherein: the machine learning engine comprises a neural network having one or more input layers, one or more hidden layers, and one or more output layers, wherein a first hidden layer of the one or more hidden layers comprises one or more first hidden nodes, and wherein providing the plurality of rule-based prediction scores to the machine learning engine comprises providing the plurality of rule-based prediction scores to the one or more first hidden nodes (Luo, pg. 2085, Figure 2(a) details the model used for intent detection, in which the hidden layer is comprised of a BLSTM layer of hidden nodes and hidden nodes of $h_1, h_2, h_3, h_4, h_5$, and $s$, the output layer is comprised of the nodes after the SoftMax classifier that outputs the intent, which in the example is flight, and the rule-based prediction scores illustrated by RE tag are injected into the hidden nodes at 2 of figure 2(a) outlined in section 3.3 of Luo, pgs. 2085-2086).
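For illustration only, the formulas quoted from Luo in the claim 1 analysis above (the per-intent logit, the difference of positive and negative logits, and the SoftMax probability) can be sketched as follows. This is a minimal sketch and not Luo's implementation: the function names, the intent labels, and all numeric values are hypothetical and chosen only to show how the quantities relate.

```python
import math


def linear_logit(w_k, s_k, b_k):
    """logit_k = w_k . s_k + b_k for one intent k (vectors given as plain lists)."""
    return sum(w * s for w, s in zip(w_k, s_k)) + b_k


def combine_logits(pos_logits, neg_logits):
    """logit_k = logit_pk - logit_nk, computed element-wise over intents."""
    return [p - n for p, n in zip(pos_logits, neg_logits)]


def softmax(logits):
    """p_k = exp(logit_k) / sum_k exp(logit_k), shifted by the max for stability."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical intent labels and logit values, for illustration only.
intents = ["flight", "airfare", "ground_service"]
pos_logits = [2.0, 0.5, 0.1]   # assumed logits from positive attentions
neg_logits = [0.2, 0.4, 0.1]   # assumed logits from negative attentions

example_logit = linear_logit([0.5, -1.0], [2.0, 1.0], 0.25)
final_logits = combine_logits(pos_logits, neg_logits)
probs = softmax(final_logits)
predicted = intents[probs.index(max(probs))]
```

Under these assumed values, the intent with the largest final logit also receives the largest probability, which is the sense in which each category is associated with a score and an output is determined from the scores.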
Regarding claim 2, Luo teaches the computer-implemented method of claim 1, wherein a prediction system comprises the machine learning engine (Luo, pg. 2088, left-column, see also fig. 2, “Our hyper-parameters for the BLSTM are similar…[s]pecifically, we use batch size 16, dropout probability 0.5, and BLSTM cell size 100. The attention loss weight is 16 (both positive and negative) for full few-shot learning settings and 1 for other settings. We use the 100d GloVe word vectors… pre-trained on Wikipedia and Gigaword…and the Adam optimizer…with learning rate 0.001.”).
Regarding claim 4, Luo teaches the computer-implemented method of claim 1, wherein: the machine learning engine comprises a neural network having one or more input layers, one or more hidden layers, and one or more output layers, the one or more output layers comprises one or more output nodes, and providing the rule-based prediction output to the machine learning engine comprises providing the rule-based prediction output to at least one output node of the one or more output nodes (Luo, pg. 2085, Figure 2(a) details the model used for intent detection, in which the hidden layer is comprised of a BLSTM layer of hidden nodes and hidden nodes of $h_1, h_2, h_3, h_4, h_5$, and $s$, the output layer is comprised of the nodes after the SoftMax classifier that outputs the intent, which in the example is flight, and the rule-based prediction scores illustrated by RE tag are injected into the output nodes at 3 of figure 2(a) outlined in section 3.4 of Luo, pgs. 2086-2087).
Regarding claim 5, Luo teaches the computer-implemented method of claim 1, wherein generating the plurality of rule-based prediction scores further comprises: determining, based at least in part on one or more satisfied predictive weights, a plurality of adjusted rule-based prediction scores; and normalizing the plurality of adjusted rule-based prediction scores to generate a plurality of normalized prediction scores (Luo, pg. 2085, right-column, “The probability $p_k$
                     that the input sentence expresses intent k is computed by                         
                            
                                
                                    p
                                
                                
                                    k
                                
                            
                            =
                            
                                
                                    
                                        
                                            exp
                                        
                                        ⁡
                                        
                                            
                                                
                                                    l
                                                    o
                                                    g
                                                    i
                                                    
                                                        
                                                            t
                                                        
                                                        
                                                            k
                                                        
                                                    
                                                
                                            
                                        
                                    
                                
                                
                                    
                                        
                                            ∑
                                            
                                                k
                                            
                                        
                                        
                                            
                                                
                                                    exp
                                                
                                                ⁡
                                                
                                                    
                                                        
                                                            l
                                                            o
                                                            g
                                                            i
                                                            
                                                                
                                                                    t
                                                                
                                                                
                                                                    k
                                                                
                                                            
                                                        
                                                    
                                                
                                            
                                        
                                    
                                
                            
                            .
                        
                    ” Luo teaches:                        
                             
                            l
                            o
                            g
                            i
                            
                                
                                    t
                                
                                
                                    k
                                
                            
                        
                     (i.e. on one or more satisfied predictive weights  )                        
                             
                            
                                
                                    p
                                
                                
                                    k
                                
                            
                             
                        
                     (i.e. a plurality of adjusted rule-based prediction score)                           
                            T
                            h
                            e
                             
                            p
                            r
                            o
                            b
                            a
                            b
                            i
                            l
                            i
                            t
                            y
                             
                            
                                
                                    p
                                
                                
                                    k
                                
                            
                             
                            
                                
                                    
                                        
                                            exp
                                        
                                        ⁡
                                        
                                            
                                                
                                                    l
                                                    o
                                                    g
                                                    i
                                                    
                                                        
                                                            t
                                                        
                                                        
                                                            k
                                                        
                                                    
                                                
                                            
                                        
                                    
                                
                                
                                    
                                        
                                            ∑
                                            
                                                k
                                            
                                        
                                        
                                            
                                                
                                                    exp
                                                
                                                ⁡
                                                
                                                    
                                                        
                                                            l
                                                            o
                                                            g
                                                            i
                                                            
                                                                
                                                                    t
                                                                
                                                                
                                                                    k
                                                                
                                                            
                                                        
                                                    
                                                
                                            
                                        
                                    
                                
                            
                        
                      (i.e. and normalizing the plurality of adjusted rule-based prediction scores to generate a plurality of normalized prediction scores)).  
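For illustration only, the normalization quoted from Luo above can be sketched as follows. This is a minimal sketch, not Luo's implementation: the function name `softmax`, the intent labels other than flight, and the logit values are hypothetical and chosen solely to show how per-intent logits are normalized into probabilities $p_k$.

```python
import math

def softmax(logits):
    """Normalize per-intent logits into probabilities p_k.

    p_k = exp(logit_k) / sum over all intents of exp(logit)
    """
    # Subtracting the maximum logit leaves the result unchanged
    # mathematically and avoids overflow in exp().
    m = max(logits.values())
    exps = {k: math.exp(v - m) for k, v in logits.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}

# Hypothetical logits for three intents (values are assumptions).
logits = {"flight": 2.0, "airfare": 0.5, "ground_service": -1.0}
probs = softmax(logits)
```

Under these assumed values the probabilities sum to one and the highest probability falls on the intent with the largest logit.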
Regarding claim 6, Luo teaches the computer-implemented method of claim 1 further comprising (a) determining one or more rule-based features for the prediction input (Luo, pg. 2084, left-column, see also fig. 1, “In this work, a RE defines a mapping from a text pattern to several REtags which are the same as or related to the target labels (i.e., intent and slot labels). A search function takes in a RE, applies it to all sentences, and returns any texts that match the pattern. We then assign the REtag(s) (that are associated with the matching RE) to the matched sentence (for intent detection)…[s]pecifically, our REtags for intent detection are the same as the intent labels. For example, in Fig. 1, we get a REtag of flight that is the same as the intent label flight.”), and (b) determining one or more machine learning features for the prediction input (Luo, pg. 2085, left-column, see also fig. 2(a), “We use the Bi-directional LSTM (BLSTM) as our base NN model… [a]s shown in Fig. 2, the BLSTM takes as input the word embeddings [$x_1, \ldots, x_n$] of a n-word sentence, and produces a vector $h_i$ for each word i. A self-attention layer then takes in the vectors produced by the BLSTM to compute the sentence embedding s…[as detailed by equation (1)] where $\alpha_i$ is the attention for word i, c is a randomly initialized trainable vector used to select informative words for classification, and W is a weight matrix. Finally, s is fed to a softmax classifier for intent classification.”).1
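For illustration only, the mapping from a RE to a REtag described in the cited passage of Luo can be sketched as follows. This is a minimal sketch under stated assumptions: the rule table `RULES` and the function `re_tags` are hypothetical names, the second pattern is invented for the example, and Python's `re` syntax is used here whereas Luo states that the REs use Perl syntax.

```python
import re

# Hypothetical rule table: each RE is paired with the REtag (intent label)
# assigned when the RE matches. Only the first pattern appears in Luo's
# Figure 2(a); the second is an assumption added for illustration.
RULES = [
    (re.compile(r"^flights? from"), "flight"),
    (re.compile(r"\bfares?\b"), "airfare"),
]

def re_tags(sentence):
    """Return the REtags of every rule whose pattern matches the sentence."""
    return [tag for pattern, tag in RULES if pattern.search(sentence)]

tags = re_tags("flights from Boston to Miami")
```

Here the sentence satisfies only the first rule, so the only REtag assigned is flight, which matches the intent label in Luo's example.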
Regarding claim 7, Luo teaches the computer-implemented method of claim 6, wherein the one or more rule-based features comprise at least one feature different than the one or more machine learning features (Luo, pg. 2086, Figure 2(a) details the model used for intent detection, in which the rule-based feature is defined by the RE instance of the following rule: /^flights? from/, while the features inputted to the machine learning model are a series of words from a given sentence, flights from Boston to Miami, which map to the inputs [$x_1$, $x_2$, $x_3$, $x_4$, and $x_5$]).
Regarding claim 8, Luo teaches the computer-implemented method of claim 1, wherein generating the plurality of rule-based prediction scores further comprises: determining, by the rules engine, the plurality of rule-based prediction scores based at least in part on one or more satisfied predictive weights, wherein each satisfied predictive weight of the one or more satisfied predictive weights is associated with a satisfied prediction rule of the one or more prediction rules, and wherein each satisfied prediction rule is a prediction rule of the one or more prediction rules whose respective rule condition is satisfied by the prediction input (Luo, pg. 2085, right-column, “The probability $p_k$ that the input sentence expresses intent k is computed by $p_k = \frac{\exp(\mathrm{logit}_k)}{\sum_k \exp(\mathrm{logit}_k)}$.”).
Regarding claim 9, Luo teaches the computer-implemented method of claim 8, wherein determining the plurality of rule-based prediction scores based at least in part on one or more satisfied predictive weights comprises: for each prediction category of a plurality of prediction categories, determine a rule-based prediction score of the plurality of rule-based prediction scores for the prediction category based at least in part on an aggregate of each satisfied predictive weight of the one or more satisfied predictive weights that is associated with the prediction category (Luo, pg. 2086, left-column, “Second, apart from indicating a sentence for intent k (positive REs), a RE can also indicate that a sentence does not express intent k (negative REs)… We denote the logits computed by positive attentions as $\mathrm{logit}_{pk}$, and those by negative attentions as $\mathrm{logit}_{nk}$, the final logit for intent k can then be calculated as: $\mathrm{logit}_k = \mathrm{logit}_{pk} - \mathrm{logit}_{nk}$”).
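For illustration only, the combination of positive and negative logits quoted from Luo above can be sketched as follows. This is a minimal sketch, not Luo's implementation: the function name `final_logits`, the intent labels other than flight, the numeric values, and the choice to treat a missing logit as 0.0 are all assumptions made for the example.

```python
def final_logits(positive, negative):
    """Combine per-intent logits as logit_k = logit_pk - logit_nk.

    An intent missing from either mapping is treated as contributing 0.0,
    which is an assumption of this sketch.
    """
    intents = set(positive) | set(negative)
    return {k: positive.get(k, 0.0) - negative.get(k, 0.0) for k in intents}

# Hypothetical positive and negative logits (values are assumptions).
combined = final_logits(
    {"flight": 1.5, "airfare": 0.25},
    {"flight": 0.5},
)
```

With these assumed values, the final logit for flight is 1.5 - 0.5 = 1.0, and the final logit for airfare remains 0.25 because no negative logit is supplied for it.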
Regarding claim 10, Luo teaches the computer-implemented method of claim 1, wherein at least a first prediction rule of the one or more prediction rules is determined based on data provided by a subject-matter-expert user (Luo, pg. 2087, right-column, “We use the syntax of REs in Perl in this work. Our REs are written by a paid annotator who is familiar with the domain.”).
Referring to independent claims 11 and 18, they are rejected on the same basis as independent claim 1 since they are analogous claims.
Referring to dependent claims 12 and 14-17, they are rejected on the same basis as dependent claims 2 and 4-7 since they are analogous claims.
Referring to dependent claim 19, it is rejected on the same basis as dependent claim 2 since they are analogous claims.
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Adam Clark Standke whose telephone number is (571)270-1806. The examiner can normally be reached 10AM-7PM M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J Huntley, can be reached on (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
Adam Clark Standke
Assistant Examiner
Art Unit 2129



/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129