DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Examiner’s Remarks
	Examiner has found dependent claims 3-4, 13-14, and 20 to be eligible subject matter under 35 U.S.C. 101 since these are not directed to a judicial exception. 
Claim 3 states in relevant part the following: “the machine learning engine comprises a neural network having one or more input layers, one or more hidden layers, and one or more output layers, a first hidden layer of the one or more hidden layers comprises one or more first hidden nodes, and providing the plurality of rule-based prediction scores to the machine learning engine comprises providing the plurality of rule-based prediction scores to a first plurality of nodes of the one or more first hidden nodes.”(emphasis added). 
Claim 4 states in relevant part the following: “the machine learning engine comprises a neural network having one or more input layers, one or more hidden layers, and one or more output layers, the one or more output layers comprises one or more output nodes, and providing the rule-based prediction output to the machine learning engine comprises providing the rule-based prediction output to at least one output machine learning node of the one or more output machine learning nodes. providing the rule-based prediction output to the machine learning engine comprises providing the rule-based prediction output to at least one output node of the one or more output nodes.”(emphasis added). 
Neither claim quoted above recites an abstract idea, since the claims, especially the emphasized portions, present functional and palpable applications to the field of computer technology. See Research Corp. Technologies v. Microsoft Corp., 627 F.3d 859, 97 USPQ2d 1274 (Fed. Cir. 2010) (finding that the patent claims were subject matter eligible under 35 U.S.C. 101 since the invention had specific application or improvement to technology in the marketplace). In this instance, the claims recite not only a neural network having a plurality of layers and/or nodes, but also a neural network configuration that combines rule-based predictions with the different layers and/or nodes of the network. As the art cited in the current Office Action details in its related work section, while combining neural networks with rules and/or regular expressions is not new, combining a neural network with regular expressions at different levels of the neural network leads to better performance for few-shot learning tasks.1 
Because claims 13-14 and 20 are analogous to claims 3-4, the above reasoning applies to these claims as well with regard to eligibility under 35 U.S.C. 101. 
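For illustration only, the kind of configuration quoted from claims 3 and 4 can be sketched as a small forward pass in which rule-based scores are supplied to hidden nodes and the rule-based output is supplied to output nodes. Every name, the concatenation-based wiring, the one-hot encoding of the rule-based output, and the hand-picked parameters are assumptions made for this sketch; it is not a construction of the claims and is not taken from the specification or the cited art.

```python
import math

def affine(weights, inputs, biases):
    """Return weights @ inputs + biases using plain Python lists."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    """Convert logits to probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(features, rule_scores, rule_output, params):
    # Assumption: hidden nodes receive the input features AND the rule-based scores.
    pre = affine(params["w_hidden"], features + rule_scores, params["b_hidden"])
    hidden = [max(0.0, v) for v in pre]  # ReLU activation
    # Assumption: output nodes receive the hidden activations AND a one-hot
    # encoding of the rule-based output (the category index chosen by the rules).
    one_hot = [1.0 if k == rule_output else 0.0
               for k in range(len(params["b_out"]))]
    logits = affine(params["w_out"], hidden + one_hot, params["b_out"])
    return softmax(logits)

# Toy, hand-picked parameters: 2 features, 2 categories, 2 hidden nodes.
params = {
    "w_hidden": [[0.5, -0.2, 1.0, 0.0], [0.1, 0.3, 0.0, 1.0]],
    "b_hidden": [0.0, 0.0],
    "w_out": [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]],
    "b_out": [0.0, 0.0],
}
probs = forward([1.0, 2.0], [0.9, 0.1], 0, params)
```

In this toy setting the rule-based signals favor category 0 at both injection points, so the resulting probability for category 0 exceeds that for category 1.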

Drawings
The subject matter of this application admits of illustration by a drawing to facilitate understanding of the invention.  Applicant is required to furnish a drawing under 37 CFR 1.81(c).  No new matter may be introduced in the required drawing.  Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d).
The drawings are objected to under 37 CFR 1.83(a) because the Regular Expression element of 901 of figure 9 does not match the text element of 801 of figure 8. Any structural detail that is essential for a proper understanding of the disclosed invention should be shown in the drawing. MPEP § 608.02(d). Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Furthermore, the drawings are objected to under 37 CFR 1.83(a) because they fail to show the rules engine of figure 7 as described in the specification as element 115 in paragraph 0095.  Any structural detail that is essential for a proper understanding of the disclosed invention should be shown in the drawing. MPEP § 608.02(d). Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1, 4-5, 8-11, and 14-16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
A broad range or limitation together with a narrow range or limitation that falls within the broad range or limitation (in the same claim) may be considered indefinite if the resulting claim does not clearly set forth the metes and bounds of the patent protection desired. See MPEP § 2173.05(c). In the present instance, claims 4 and 14 recite the broad recitation ‘providing the rule-based prediction output to at least one output machine learning node of the one or more output machine learning nodes,’ and the claims also recite ‘providing the rule-based prediction output to at least one output node of the one or more output nodes,’ which is the narrower statement of the range/limitation. The claims are considered indefinite because there is a question or doubt as to whether the feature introduced by such narrower language is (a) merely exemplary of the remainder of the claim, and therefore not required, or (b) a required feature of the claims.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA  35 U.S.C. 112, fourth paragraph:
Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA  35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

Claims 6-7 are rejected under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends. Claim 6 refers to itself and thus does not further limit the subject matter, and claim 7 inherits claim 6’s deficiency. Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-2, 5-12, and 15-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea and does not integrate the judicial exception into a practical application or amount to significantly more than the judicial exception.
Regarding independent claim 1, it recites: receiving a prediction input; generating a plurality of rule-based prediction scores by executing one or more prediction rules on the prediction input, wherein (a) each prediction rule of the one or more prediction rules is associated with a rule condition and one or more predictive weights, (b) each predictive weight of the one or more predictive weights is associated with a related prediction category of a plurality of prediction categories, and (c) each prediction category of the plurality of prediction categories is associated with a rule-based prediction score; determining a rule-based prediction output based at least in part on the plurality of rule-based prediction scores; and providing the plurality of rule-based prediction scores and the rule-based prediction output… is configured to generate…prediction output based at least in part on the plurality of rule-based prediction scores and the rule-based prediction output. All of these limitations can be performed in the human mind through the use of observations, evaluations, judgments and opinions, and thus, claim 1 recites a mental concept and is an abstract idea.
This judicial exception is not integrated into a practical application because the claim recites only the following additional element: a machine learning engine. Although machine learning relates to a technological field, the claim as recited does not amount to an improvement in machine learning technology. See MPEP 2106.05(a).
Furthermore, claim 1 does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional element of a machine learning engine amounts to no more than a recitation of the words "apply it" (or an equivalent) and/or is no more than mere instructions to implement an abstract idea on a computer. See MPEP 2106.05(f).
Regarding dependent claim 2, it recites…wherein a prediction system comprises the machine learning engine. This limitation is also an abstract idea since it recites a mental concept that does not integrate the judicial exception into a practical application, because the additional element of a machine learning engine as recited does not amount to an improvement in machine learning technology. See MPEP 2106.05(a). Furthermore, the additional element of a machine learning engine does not amount to significantly more than the judicial exception because the additional elements of a machine learning engine amount to no more than a recitation of the words "apply it" (or an equivalent) and/or are no more than mere instructions to implement an abstract idea on a computer. See MPEP 2106.05(f). Thus, the dependent claim is ineligible.
Regarding dependent claim 5, it recites… wherein generating the plurality of rule-based prediction scores further comprises: determining, based at least in part on one or more satisfied predictive weights, a plurality of adjusted rule-based prediction scores; and normalizing the plurality of adjusted rule-based prediction scores to generate a plurality of normalized prediction scores. This limitation is also an abstract idea since it recites a mental concept that does not integrate the judicial exception into a practical application or amount to significantly more than the judicial exception as recited in claim 1. Thus, the dependent claim is ineligible.
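For illustration only, the adjusting and normalizing steps quoted from claim 5 can be sketched as follows. The function names, the additive adjustment, the example weights, and the choice of softmax as the normalization are assumptions made for this sketch; the quoted claim language does not specify them, and the sketch is not a construction of the claim.

```python
import math

def adjust_scores(base_scores, satisfied_weights):
    """Add each satisfied predictive weight to the score of its category."""
    adjusted = dict(base_scores)
    for category, weight in satisfied_weights:
        adjusted[category] = adjusted.get(category, 0.0) + weight
    return adjusted

def normalize_scores(scores):
    """Softmax normalization (an assumed choice): outputs are positive and sum to 1."""
    peak = max(scores.values())
    exps = {c: math.exp(s - peak) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}

# Hypothetical categories and weights, chosen only to exercise the functions.
adjusted = adjust_scores(
    {"flight": 0.0, "airfare": 0.0},
    [("flight", 2.0), ("airfare", 0.5), ("flight", 1.0)],
)
normalized = normalize_scores(adjusted)
```

Other normalizations (for example, dividing by the sum of non-negative scores) would fit the same two-step shape; softmax is used here only because it tolerates negative adjusted scores.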
Regarding dependent claim 6, it recites…(a) determining one or more rule-based features for the prediction input, and (b) determining one or more machine learning features for the prediction input. This limitation is also an abstract idea since it recites a mental concept that does not integrate the judicial exception into a practical application or amount to significantly more than the judicial exception because the additional element of one or more machine learning features as recited does not amount to an improvement in machine learning technology. See MPEP 2106.05(a). Furthermore, the additional element of one or more machine learning features does not amount to significantly more than the judicial exception because the additional elements amount to no more than a recitation of the words "apply it" (or an equivalent) and/or are no more than mere instructions to implement an abstract idea on a computer. See MPEP 2106.05(f). Thus, the dependent claim is ineligible. 
Regarding dependent claim 7, it recites… wherein the one more rule-based features comprise at least one feature different than the one or more machine learning features. This limitation is also an abstract idea since it recites a mental concept that does not integrate the judicial exception into a practical application or amount to significantly more than the judicial exception because the additional element of one or more machine learning features as recited does not amount to an improvement in machine learning technology. See MPEP 2106.05(a). Furthermore, the additional element of one or more machine learning features does not amount to significantly more than the judicial exception because the additional elements amount to no more than a recitation of the words "apply it" (or an equivalent) and/or are no more than mere instructions to implement an abstract idea on a computer. See MPEP 2106.05(f). Thus, the dependent claim is ineligible. 
Regarding dependent claim 8, it recites… wherein generating the plurality of rule-based prediction scores further comprises: determining, by the rules engine, the plurality of rule-based prediction scores based at least in part on one or more satisfied predictive weights, wherein each satisfied predictive weight of the one or more satisfied predictive weights is associated with a satisfied prediction rule of the one or more prediction rules, and wherein each satisfied prediction rule is a prediction rule of the one or more prediction rules whose respective rule condition is satisfied by the prediction input. This limitation is also an abstract idea since it recites a mental concept that does not integrate the judicial exception into a practical application or amount to significantly more than the judicial exception as recited in claim 1. Thus, the dependent claim is ineligible.
Regarding dependent claim 9, it recites… wherein determining the plurality of rule-based prediction scores based at least in part on one or more satisfied predictive weights comprises: for each prediction category of a plurality of prediction categories, determine a rule-based prediction score of the plurality of rule-based prediction scores for the prediction category based at least in part an aggregate of each satisfied predictive weight of the one or more satisfied predictive weights that is associated with the prediction category. This limitation is also an abstract idea since it recites a mental concept that does not integrate the judicial exception into a practical application or amount to significantly more than the judicial exception as recited in claim 8. Thus, the dependent claim is ineligible.
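For illustration only, the steps quoted from claims 8 and 9 (identifying the prediction rules whose conditions are satisfied, then aggregating the satisfied predictive weights per prediction category) can be sketched as follows. The rule table, the category names, the weight values, the use of summation as the aggregate, and the use of regular expressions as rule conditions are all assumptions made for this sketch; only the pattern `^flights? from` is borrowed from the example quoted from Luo in this Office action.

```python
import re

# Hypothetical rules: (rule condition, {prediction category: predictive weight}).
RULES = [
    (re.compile(r"^flights? from"), {"flight": 1.0}),
    (re.compile(r"\bfare\b"), {"airfare": 1.5, "flight": -0.5}),
]
CATEGORIES = ["flight", "airfare"]

def rule_based_scores(text, rules=RULES, categories=CATEGORIES):
    """Sum, per category, the weights of every rule whose condition matches."""
    scores = {category: 0.0 for category in categories}
    for condition, weights in rules:
        if condition.search(text):  # the rule condition is satisfied
            for category, weight in weights.items():
                scores[category] += weight
    return scores

def rule_based_output(scores):
    """Pick the category with the highest aggregated score."""
    return max(scores, key=scores.get)

scores_a = rule_based_scores("flights from boston to miami")
scores_b = rule_based_scores("what is the fare from boston")
```

Here `rule_based_output(scores_a)` selects `"flight"` and `rule_based_output(scores_b)` selects `"airfare"`, since only one rule fires for each input.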
Regarding dependent claim 10, it recites…wherein at least a first prediction rule of the one or more prediction rules is determined based on data provided by a subject- matter-expert user. This limitation is also an abstract idea since it recites a mental concept that does not integrate the judicial exception into a practical application or amount to significantly more than the judicial exception as recited in claim 1. Thus, the dependent claim is ineligible.
Regarding independent claim 11, it recites: a prediction system comprising…the program code configured to…cause the prediction system to at least: receive a prediction input; generate a plurality of rule-based prediction scores by executing one or more prediction rules on the prediction input, wherein (a) each prediction rule of the one or more prediction rules is associated with a rule condition and one or more predictive weights, (b) each predictive weight of the one or more predictive weights is associated with a related prediction category of a plurality of prediction categories, and (c) each prediction category of the plurality of prediction categories is associated with a rule-based prediction score; determine a rule-based prediction output based at least in part on the plurality of rule-based prediction scores; and provide the plurality of rule-based prediction scores and the rule-based prediction output…is configured to generate…prediction output based at least in part on the plurality of rule-based prediction scores and the rule-based prediction output. All of these limitations can be performed in the human mind through the use of observations, evaluations, judgments and opinions, and thus, claim 11 recites a mental concept and is an abstract idea.
This judicial exception is not integrated into a practical application because the claim recites only the following additional elements: a processor, memory, and a machine learning engine. The processor and memory are recited at a high level of generality (i.e., as generic hardware performing generic computer functions) such that they amount to no more than mere instructions to apply the exception using generic computer components. See MPEP 2106.05(f). Although machine learning relates to a technological field, the claim as recited does not amount to an improvement in machine learning technology. See MPEP 2106.05(a).
Furthermore, claim 11 does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements of processor, memory, and a machine learning engine amount to no more than a recitation of the words "apply it" (or an equivalent) and/or are no more than mere instructions to implement an abstract idea on a computer. See MPEP 2106.05(f).
Regarding dependent claim 12, it recites…wherein a prediction system comprises the machine learning engine. This limitation is also an abstract idea since it recites a mental concept that does not integrate the judicial exception into a practical application, because the additional element of a machine learning engine as recited does not amount to an improvement in machine learning technology. See MPEP 2106.05(a). Furthermore, the additional element of a machine learning engine does not amount to significantly more than the judicial exception because the additional elements of a machine learning engine amount to no more than a recitation of the words "apply it" (or an equivalent) and/or are no more than mere instructions to implement an abstract idea on a computer. See MPEP 2106.05(f). Thus, the dependent claim is ineligible.
Regarding dependent claim 15, it recites…wherein generating the plurality of rule-based prediction scores further comprises: determining, based at least in part on one or more satisfied predictive weights, a plurality of adjusted rule-based prediction scores; and normalizing the plurality of adjusted rule-based prediction scores to generate a plurality of normalized prediction scores. This limitation is also an abstract idea since it recites a mental concept that does not integrate the judicial exception into a practical application or amount to significantly more than the judicial exception as recited in claim 11. Thus, the dependent claim is ineligible.
Regarding dependent claim 16, it recites…(a) determining one or more rule-based features for the prediction input, and (b) determining one or more machine learning features for the prediction input. This limitation is also an abstract idea since it recites a mental concept that does not integrate the judicial exception into a practical application or amount to significantly more than the judicial exception because the additional element of one or more machine learning features as recited does not amount to an improvement in machine learning technology. See MPEP 2106.05(a). Furthermore, the additional element of one or more machine learning features does not amount to significantly more than the judicial exception because the additional elements amount to no more than a recitation of the words "apply it" (or an equivalent) and/or are no more than mere instructions to implement an abstract idea on a computer. See MPEP 2106.05(f). Thus, the dependent claim is ineligible. 
Regarding dependent claim 17, it recites… wherein the one more rule-based features comprise at least one feature different than the one or more machine learning features. This limitation is also an abstract idea since it recites a mental concept that does not integrate the judicial exception into a practical application or amount to significantly more than the judicial exception because the additional element of one or more machine learning features as recited does not amount to an improvement in machine learning technology. See MPEP 2106.05(a). Furthermore, the additional element of one or more machine learning features does not amount to significantly more than the judicial exception because the additional elements amount to no more than a recitation of the words "apply it" (or an equivalent) and/or are no more than mere instructions to implement an abstract idea on a computer. See MPEP 2106.05(f). Thus, the dependent claim is ineligible. 
Regarding independent claim 18, it recites: a computer program product comprising…cause a prediction system to: receive a prediction input; generate a plurality of rule-based prediction scores by executing one or more prediction rules on the prediction input, wherein (a) each prediction rule of the one or more prediction rules is associated with a rule condition and one or more predictive weights, (b) each predictive weight of the one or more predictive weights is associated with a related prediction category of a plurality of prediction categories, and (c) each prediction category of the plurality of prediction categories is associated with a rule-based prediction score; determine a rule-based prediction output based at least in part on the plurality of rule-based prediction scores; and provide the plurality of rule-based prediction scores and the rule-based prediction output…configured to…generate…prediction output based at least in part on the plurality of rule-based prediction scores and the rule-based prediction output. All of these limitations can be performed in the human mind through the use of observations, evaluations, judgments and opinions, and thus, claim 18 recites a mental concept and is an abstract idea.
This judicial exception is not integrated into a practical application because the claim recites only the following additional elements: a non-transitory computer-readable storage medium and a machine learning engine. The non-transitory computer-readable storage medium is recited at a high level of generality (i.e., as generic hardware performing generic computer functions) such that it amounts to no more than mere instructions to apply the exception using generic computer components. See MPEP 2106.05(f). Although machine learning relates to a technological field, the claim as recited does not amount to an improvement in machine learning technology. See MPEP 2106.05(a).
Furthermore, claim 18 does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements of non-transitory computer-readable storage medium and a machine learning engine amount to no more than a recitation of the words "apply it" (or an equivalent) and/or are no more than mere instructions to implement an abstract idea on a computer. See MPEP 2106.05(f).
Regarding dependent claim 19, it recites…wherein a prediction system comprises the machine learning engine. This limitation is also an abstract idea since it recites a mental concept that does not integrate the judicial exception into a practical application, because the additional element of a machine learning engine as recited does not amount to an improvement in machine learning technology. See MPEP 2106.05(a). Furthermore, the additional element of a machine learning engine does not amount to significantly more than the judicial exception because the additional elements of a machine learning engine amount to no more than a recitation of the words "apply it" (or an equivalent) and/or are no more than mere instructions to implement an abstract idea on a computer. See MPEP 2106.05(f). Thus, the dependent claim is ineligible.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Luo, Bingfeng, et al. "Marrying up regular expressions with neural networks: A case study for spoken language understanding." arXiv preprint arXiv:1805.05588 (2018)(“Luo”). 
Regarding claim 1, Luo teaches a computer-implemented method comprising: receiving a prediction input(Luo, pg. 2087, left-column, see also fig. 1, “We use the ATIS dataset…to evaluate our approach. This dataset is widely used in SLU research. It includes queries of flights, meal, etc… [w]e also split words like Miami’s into Miami ’s during the tokenization phase to reduce the number of words that do not have a pre-trained word embedding. This strategy is useful for fewshot learning.”); generating a plurality of rule-based prediction scores by executing one or more prediction rules on the prediction input, wherein (a) each prediction rule of the one or more prediction rules is associated with a rule condition and one or more predictive weights(Luo, pg. 2085, right-column, “Taking the sentence in Fig. 1 for example, the RE: /^flights? from/ that leads to intent flight means that flights from are the key words to decide the intent flight… $\mathrm{logit}_k = w_k s_k + b_k$” Luo teaches: RE: /^flights? from/(i.e. generating a plurality of rule-based prediction scores by executing one or more prediction rules on the prediction input, wherein (a) each prediction rule of the one or more prediction rules is associated with a rule condition) $\mathrm{logit}_k = w_k s_k + b_k$ (i.e. and one or more predictive weights) ), (b) each predictive weight of the one or more predictive weights is associated with a related prediction category of a plurality of prediction categories(Luo, pg. 2086, left-column, “Second, apart from indicating a sentence for intent k (positive REs), a RE can also indicate that a sentence does not express intent k (negative REs)… We denote the logits computed by positive attentions as $\mathrm{logit}_{pk}$, and those by negative attentions as $\mathrm{logit}_{nk}$, the final logit for intent k can then be calculated as:
                            l
                            o
                            g
                            i
                            
                                
                                    t
                                
                                
                                    k
                                
                            
                            =
                            l
                            o
                            g
                            i
                            
                                
                                    t
                                
                                
                                    p
                                    k
                                
                            
                            -
                            l
                            o
                            g
                            i
                            
                                
                                    t
                                
                                
                                    n
                                    k
                                
                            
                        
” Luo teaches: $\mathrm{logit}_k = \mathrm{logit}_{pk} - \mathrm{logit}_{nk}$
(i.e. each predictive weight of the one or more predictive weights) (positive REs), a RE can also indicate that a sentence does not express intent k (negative REs) (i.e. is associated with a related prediction category of a plurality of prediction categories)), and (c) each prediction category of the plurality of prediction categories is associated with a rule-based prediction score (Luo, pg. 2085, right-column, “The probability $p_k$ that the input sentence expresses intent k is computed by $p_k = \frac{\exp(\mathrm{logit}_k)}{\sum_k \exp(\mathrm{logit}_k)}$.
” Luo teaches: $p_k = \frac{\exp(\mathrm{logit}_k)}{\sum_k \exp(\mathrm{logit}_k)}$
(i.e. each prediction category of the plurality of prediction categories is associated with a rule-based prediction score)); determining a rule-based prediction output based at least in part on the plurality of rule-based prediction scores (Luo, pg. 2085, right-column, “The probability $p_k$ that the input sentence expresses intent k is computed by $p_k = \frac{\exp(\mathrm{logit}_k)}{\sum_k \exp(\mathrm{logit}_k)}$.
”); and providing the plurality of rule-based prediction scores and the rule-based prediction output to a machine learning engine, wherein the machine learning engine is configured to generate a machine-learning based prediction output based at least in part on the plurality of rule-based prediction scores and the rule-based prediction output (Luo, pg. 2085, see also fig. 2(a), “At the network module level, we explore ways to utilize the clue words in the surface form of a RE (bold blue arrows and words in 2 of Fig. 2) to guide the attention module in NNs.” & see also Luo, pg. 2086, right column, see also fig. 2(a), “At the output level, REs are used to amend the output of NNs. At this level, we take the same approach used for intent detection and slot filling (see 3 in Fig. 2).”).
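For illustration only, the mapping above from a rule condition to per-category scores can be sketched as follows. This is a minimal sketch under stated assumptions, not Luo's implementation: only the pattern `^flights? from` and the relation logit_k = logit_pk - logit_nk come from the cited passages; the second rule, the helper names `matched_tags` and `final_logits`, and all numeric values are hypothetical. In Luo the positive and negative logits are produced by attention over BLSTM hidden states rather than supplied by hand.

```python
import re

# Hypothetical rule table: each regular expression (rule condition) maps to an
# intent label (REtag). Only the first pattern appears in the cited passage.
RULES = [
    (re.compile(r"^flights? from"), "flight"),
    (re.compile(r"\bhow much\b"), "airfare"),  # made-up second rule
]


def matched_tags(sentence):
    """Return the set of REtags whose pattern matches the sentence."""
    return {tag for pattern, tag in RULES if pattern.search(sentence)}


def final_logits(positive, negative):
    """Combine per-intent logits as logit_k = logit_pk - logit_nk."""
    return {k: positive[k] - negative[k] for k in positive}


tags = matched_tags("flights from boston to miami")

# Illustrative numbers only; they stand in for attention-derived logits.
logits = final_logits(
    {"flight": 2.0, "airfare": 0.5},   # logit_pk (positive REs)
    {"flight": 0.25, "airfare": 1.0},  # logit_nk (negative REs)
)
```

Here `tags` holds the categories whose rule condition is satisfied, and `logits` holds one score per prediction category.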
Regarding claim 2, Luo teaches the computer-implemented method of claim 1, wherein a prediction system comprises the machine learning engine (Luo, pg. 2088, left-column, see also fig. 2, “Our hyper-parameters for the BLSTM are similar…[s]pecifically, we use batch size 16, dropout probability 0.5, and BLSTM cell size 100. The attention loss weight is 16 (both positive and negative) for full few-shot learning settings and 1 for other settings. We use the 100d GloVe word vectors… pre-trained on Wikipedia and Gigaword…and the Adam optimizer…with learning rate 0.001.”).
Regarding claim 3, Luo teaches the computer-implemented method of claim 1, wherein: the machine learning engine comprises a neural network having one or more input layers, one or more hidden layers, and one or more output layers, a first hidden layer of the one or more hidden layers comprises one or more first hidden nodes, and providing the plurality of rule-based prediction scores to the machine learning engine comprises providing the plurality of rule-based prediction scores to a first plurality of nodes of the one or more first hidden nodes (Luo, pg. 2085, Figure 2(a) details the model used for intent detection, in which the hidden layer is comprised of a BLSTM layer of hidden nodes and hidden nodes of $h_1, h_2, h_3, h_4, h_5$, and $s$, the output layer is comprised of the nodes after the SoftMax classifier that outputs the intent, which in the example is flight, and the rule-based prediction scores illustrated by the RE tag are injected into the hidden nodes at 2 of figure 2(a), as outlined in section 3.3 of Luo, pgs. 2085-2086).
Regarding claim 4, Luo teaches the computer-implemented method of claim 1, wherein: the machine learning engine comprises a neural network having one or more input layers, one or more hidden layers, and one or more output layers, the one or more output layers comprises one or more output nodes, and providing the rule-based prediction output to the machine learning engine comprises providing the rule-based prediction output to at least one output machine learning node of the one or more output machine learning nodes. providing the rule-based prediction output to the machine learning engine comprises providing the rule-based prediction output to at least one output node of the one or more output nodes (Luo, pg. 2085, Figure 2(a) details the model used for intent detection, in which the hidden layer is comprised of a BLSTM layer of hidden nodes and hidden nodes of $h_1, h_2, h_3, h_4, h_5$, and $s$, the output layer is comprised of the nodes after the SoftMax classifier that outputs the intent, which in the example is flight, and the rule-based prediction scores illustrated by the RE tag are injected into the output nodes at 3 of figure 2(a), as outlined in section 3.4 of Luo, pgs. 2086-2087).
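As an illustrative aid, injection of a rule-based output at the output nodes can be sketched as below. The cited passage states only that, at the output level, REs are used to amend the output of NNs; the additive form logit'_k = logit_k + w_k * z_k used here, the function name `amend_logits`, and every numeric value are assumptions made for this sketch and are not taken from the quoted text.

```python
def amend_logits(nn_logits, re_matched, weights):
    """Amend each output-node logit with a rule-based indicator.

    Assumed form: logit'_k = logit_k + w_k * z_k, where z_k is 1.0 when some
    rule for category k matched the input and 0.0 otherwise.
    """
    amended = {}
    for k, logit in nn_logits.items():
        z_k = 1.0 if k in re_matched else 0.0
        amended[k] = logit + weights[k] * z_k
    return amended


# Hypothetical values: the network alone slightly prefers "airfare",
# while a matched rule for "flight" shifts the amended output.
amended = amend_logits(
    nn_logits={"flight": 0.5, "airfare": 1.0},
    re_matched={"flight"},
    weights={"flight": 1.0, "airfare": 1.0},
)
predicted = max(amended, key=amended.get)
```

With these made-up inputs the matched rule raises the "flight" logit above the "airfare" logit, so the amended prediction differs from the unamended one.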
Regarding claim 5, Luo teaches the computer-implemented method of claim 1, wherein generating the plurality of rule-based prediction scores further comprises: determining, based at least in part on one or more satisfied predictive weights, a plurality of adjusted rule-based prediction scores; and normalizing the plurality of adjusted rule-based prediction scores to generate a plurality of normalized prediction scores (Luo, pg. 2085, right-column, “The probability $p_k$ that the input sentence expresses intent k is computed by $p_k = \frac{\exp(\mathrm{logit}_k)}{\sum_k \exp(\mathrm{logit}_k)}$.
” Luo teaches: $\mathrm{logit}_k$ (i.e. on one or more satisfied predictive weights) $p_k$ (i.e. a plurality of adjusted rule-based prediction scores) The probability $p_k = \frac{\exp(\mathrm{logit}_k)}{\sum_k \exp(\mathrm{logit}_k)}$ (i.e. and normalizing the plurality of adjusted rule-based prediction scores to generate a plurality of normalized prediction scores)).
Regarding claim 6, Luo teaches the computer-implemented method of claim 6 further comprising (a) determining one or more rule-based features for the prediction input (Luo, pg. 2084, left-column, see also fig. 1, “In this work, a RE defines a mapping from a text pattern to several REtags which are the same as or related to the target labels (i.e., intent and slot labels). A search function takes in a RE, applies it to all sentences, and returns any texts that match the pattern. We then assign the REtag(s) (that are associated with the matching RE) to the matched sentence (for intent detection)…[s]pecifically, our REtags for intent detection are the same as the intent labels. For example, in Fig. 1, we get a REtag of flight that is the same as the intent label flight.”), and (b) determining one or more machine learning features for the prediction input (Luo, pg. 2085, left-column, see also fig. 2(a), “We use the Bi-directional LSTM (BLSTM) as our base NN model… [a]s shown in Fig. 2, the BLSTM takes as input the word embeddings [$x_1, \ldots, x_n$] of a n-word sentence, and produces a vector $h_i$ for each word i. A self-attention layer then takes in the vectors produced by the BLSTM to compute the sentence embedding s…[as detailed by equation (1)] where $\alpha_i$ is the attention for word i, c is a randomly initialized trainable vector used to select informative words for classification, and W is a weight matrix. Finally, s is fed to a softmax classifier for intent classification.”).2 
Regarding claim 7, Luo teaches the computer-implemented method of claim 6, wherein the one or more rule-based features comprise at least one feature different than the one or more machine learning features (Luo, pg. 2086, Figure 2(a) details the model used for intent detection, in which the rule-based feature is defined by the RE instance of the following rule: /^flights? from/, while the features input to the machine learning model are a series of words from a given sentence, “flights from Boston to Miami”, which map to the inputs [$x_1, x_2, x_3, x_4$, and $x_5$]).  
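For illustration only, the distinction drawn above between the rule-based feature and the machine learning features can be sketched in Python. This is a minimal sketch, not Luo's implementation: the names `rule_based_feature` and `ml_inputs` are hypothetical, the REtag is assumed to equal the intent label flight, and plain whitespace tokenization stands in for the word inputs $x_1$ through $x_5$ (Luo feeds word embeddings, which are omitted here).

```python
import re

# The RE quoted from Luo, Fig. 2(a); Luo writes it in Perl syntax as /^flights? from/.
RULE = re.compile(r"^flights? from")
# Assumption: the REtag for this rule is the intent label "flight".
RULE_TAG = "flight"


def rule_based_feature(sentence):
    """Return the REtag if the rule matches the sentence, otherwise None."""
    return RULE_TAG if RULE.search(sentence) else None


def ml_inputs(sentence):
    """Split the sentence into words; a stand-in for the inputs x_1..x_n."""
    return sentence.split()


sentence = "flights from Boston to Miami"
tag = rule_based_feature(sentence)   # rule-based feature: a single REtag
tokens = ml_inputs(sentence)         # machine learning features: five words
```

Under these assumptions the rule-based feature is one tag derived from a pattern match, while the machine learning features are the individual words, so the two feature sets differ.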
Regarding claim 8, Luo teaches the computer-implemented method of claim 1, wherein generating the plurality of rule-based prediction scores further comprises: determining, by the rules engine, the plurality of rule-based prediction scores based at least in part on one or more satisfied predictive weights, wherein each satisfied predictive weight of the one or more satisfied predictive weights is associated with a satisfied prediction rule of the one or more prediction rules, and wherein each satisfied prediction rule is a prediction rule of the one or more prediction rules whose respective rule condition is satisfied by the prediction input (Luo, pg. 2085, right-column, “The probability $p_k$ that the input sentence expresses intent k is computed by $p_k = \frac{\exp(\mathrm{logit}_k)}{\sum_k \exp(\mathrm{logit}_k)}$.”).  
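For illustration only, the probability computation quoted from Luo is a softmax over the per-intent logits, which can be sketched in Python as follows. The function name `softmax` and the example logit values are hypothetical and are not taken from Luo or from the application; subtracting the maximum logit is a common numerical-stability step added here and leaves the result mathematically unchanged.

```python
import math


def softmax(logits):
    """Map per-intent logits to probabilities p_k = exp(logit_k) / sum_k exp(logit_k)."""
    # Shift by the maximum so exp() cannot overflow; the ratio is unaffected.
    shift = max(logits)
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three intents.
probs = softmax([2.0, 1.0, 0.1])
```

The resulting values are non-negative and sum to one, and a larger logit always yields a larger probability.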
Regarding claim 9, Luo teaches the computer-implemented method of claim 8, wherein determining the plurality of rule-based prediction scores based at least in part on one or more satisfied predictive weights comprises: for each prediction category of a plurality of prediction categories, determine a rule-based prediction score of the plurality of rule-based prediction scores for the prediction category based at least in part on an aggregate of each satisfied predictive weight of the one or more satisfied predictive weights that is associated with the prediction category (Luo, pg. 2086, left-column, “Second, apart from indicating a sentence for intent k (positive REs), a RE can also indicate that a sentence does not express intent k (negative REs)… We denote the logits computed by positive attentions as $\mathrm{logit}_{pk}$, and those by negative attentions as $\mathrm{logit}_{nk}$, the final logit for intent k can then be calculated as: $\mathrm{logit}_k = \mathrm{logit}_{pk} - \mathrm{logit}_{nk}$”). 
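For illustration only, the combination of positive and negative logits quoted from Luo can be sketched in Python. The function name `final_logits` and the example values are hypothetical; the sketch assumes the positive and negative logits are supplied as equal-length lists indexed by intent k.

```python
def final_logits(positive_logits, negative_logits):
    """Compute logit_k = logit_pk - logit_nk for every intent k."""
    if len(positive_logits) != len(negative_logits):
        raise ValueError("positive and negative logits must cover the same intents")
    return [p - n for p, n in zip(positive_logits, negative_logits)]


# Hypothetical logits for two intents.
combined = final_logits([2.0, 0.5], [0.5, 1.5])
```

An intent supported by positive REs and contradicted by negative REs therefore ends up with a reduced final logit.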
Regarding claim 10, Luo teaches the computer-implemented method of claim 1, wherein at least a first prediction rule of the one or more prediction rules is determined based on data provided by a subject-matter-expert user (Luo, pg. 2087, right-column, “We use the syntax of REs in Perl in this work. Our REs are written by a paid annotator who is familiar with the domain.”).
Referring to independent claims 11 and 18, they are rejected on the same basis as independent claim 1 since they are analogous claims. 
Referring to dependent claims 12-17, they are rejected on the same basis as dependent claims 2-7 since they are analogous claims.
Referring to dependent claims 19-20, they are rejected on the same basis as dependent claims 2-3 since they are analogous claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
a. US 9,734,447 B1 (details incorporating a neural network and rule-based system through the use of an Alternating Decision Tree model)
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Adam Clark Standke whose telephone number is (571)270-1806. The examiner can normally be reached 10AM-7PM M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J Huntley, can be reached at (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
Adam Clark Standke
Assistant Examiner
Art Unit 2129



/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129

        1 As Luo Bingfeng states in section seven of Marrying up regular expressions with neural networks: A case study for spoken language understanding, “[i]n this paper, we investigate different ways to combine NNs [neural networks] and REs [regular expressions] for solving typical SLU [spoken language understanding] tasks. Our experiments demonstrate that the combination clearly improves the NN performance in both the few-shot learning and the full dataset settings. We show that by exploiting the implicit knowledge encoded within REs, one can significantly improve the learning performance.”
        2 For purposes of this Office Action claim 6 is being interpreted as referring back to claim 1.