DETAILED ACTION
Status of the Claims
This action is in response to the amendment filed on 8/9/2022 for application 16/209,249, filed on 12/4/2018. Claims 1 and 3 – 20 are pending and have been examined.

Claims 1, 4 – 7, 9, 12 – 17 and 19 – 20 are amended.

Claim 2 is canceled. 

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 9/3/2019 and 10/13/2021 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Response to Arguments
Applicant's remarks filed on 8/9/2022 have been fully considered but are not persuasive.

Regarding the claim rejection under 35 U.S.C. 101, applicant states that the steps of normalizing an input file, generating a jitter set of audio files, generating spectrogram frames, obtaining predicted character probabilities from a trained neural network, and decoding a transcription of input audio are not steps that can practically be performed mentally. Examiner respectfully disagrees. It is noted that the features (jitter set of audio files, spectrogram frames, transcription of input audio) upon which applicant relies are not recited in the rejected claims. Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). The steps of generating a group of low-order feature values, determining the group of low-order feature values, identifying a group of input feature values, performing pairwise transformations, determining interactive feature values, selecting interactive feature values, processing the feature vector, and merging, as recited in independent Claim 1, under the broadest reasonable interpretation (BRI), recite evaluation and judgment that can be performed in the human mind with or without a physical aid.

Regarding the claim rejection under 35 U.S.C. 103, applicant further states that the cited art does not disclose the limitations listed below:
identifying a group of input feature values for the iteration;
identifying a group of input feature value pairs, wherein each input feature value pair from the group of input feature value pairs comprises a first feature value from the group of input feature values and a second feature value from the group of input feature values;
for each input feature value pair from the group of input feature value pairs, performing a pairwise transformation of the first feature value in the input feature value pair and the second feature value in the input feature value pair to generate a corresponding interactive feature value for the input feature value pair;
for each interactive feature value associated with an input feature value pair from the group of input feature value pairs, determining a scored interactive feature value associated with the input feature value pair based at least in part on the interactive feature value and an interactive scoring parameter for the interactive feature value; and
from each scored interactive feature value associated with an input feature value pair from the group of input feature value pairs, selecting a third number of scored interactive feature values as a group of output feature values;
generating, based at least in part on the original feature vector, a group of high-order feature values;
merging the group of low-order feature values and the group of high-order feature values to generate a processed feature vector corresponding to the original feature vector; and
providing the processed feature vector as an input to a machine-learning based prediction unit, wherein the machine-learning based prediction unit generates one or more predictions based at least in part on processed feature vector.
However, applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references. In the instant case, applicant's statement provides no discussion or analysis comparing the cited prior art references with the claimed invention; it merely alleges that none of the prior art references shows the recited claim limitations. For the detailed mapping of the prior art to the claimed invention, refer to the Claim Rejections under 35 U.S.C. 103 section.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1 and 3 – 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.

Regarding Claim 1, 
Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis
Claim 1 is directed to a computer-implemented method, which is one of the statutory categories. 
Step 2A Prong One Analysis: 
Claim 1 recites the abstract ideas in the following limitations:
Generating based at least in part on the original feature vector, a group of low-order feature values by (a) performing a first number of iterations of a feature engineering transformation to generate a group of engineered feature values and (b) determining the group of low-ordered feature values based at least in part on a second number of feature values from the group of engineered feature values 
identifying a group of input feature values for the iteration
identifying a group of input feature value pairs, wherein each input feature value pair from the group of input feature value pairs comprises a first feature value from the group of input feature values and a second feature value from the group of input feature values
performing a pairwise transformation of the first feature value in the input feature value pair and the second feature value in the input feature value pair to generate a corresponding interactive feature value for the input feature value pair
determining a scored interactive feature value associated with the input feature value pair based at least in part on the interactive feature value and an interactive scoring parameter for the interactive feature value
selecting a third number of scored interactive feature values as a group of output feature values
generating, based at least in part on the original feature vector, a group of high-order feature values
merging the group of low-order feature values and the group of high-order feature values to generate a processed feature vector corresponding to the original feature vector
The steps of processing a value to generate another value, performing a pairwise transformation, and merging values recite mathematical calculations, which fall within the mathematical concepts grouping of abstract ideas. The steps of identifying, determining, and selecting recite evaluation and judgment, which fall within the mental processes grouping of abstract ideas. Thus, the claim recites a judicial exception and requires further analysis under Step 2A Prong Two.
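For illustration only, one iteration of the feature engineering transformation recited in the limitations above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or the cited art: the function names, the use of multiplication as the pairwise transformation, and the use of a product to combine each interactive feature value with its scoring parameter are all hypothetical choices.

```python
from itertools import combinations


def feature_engineering_iteration(values, scoring_params, k):
    """One illustrative iteration: pair, transform, score, select.

    scoring_params is assumed to hold one parameter per feature value pair.
    """
    # (ii) every unordered pair of input feature values
    pairs = list(combinations(values, 2))
    # (iii) pairwise transformation; multiplication is an assumed choice
    interactive = [a * b for a, b in pairs]
    # (iv) scored value = interactive value times its scoring parameter (assumed)
    scored = [v * p for v, p in zip(interactive, scoring_params)]
    # (v) keep the k largest scored values (the "third number")
    return sorted(scored, reverse=True)[:k]


def low_order_features(original, params_per_iteration, k):
    """Repeat the iteration once per parameter list (the "first number")."""
    values = list(original)
    for params in params_per_iteration:
        values = feature_engineering_iteration(values, params, k)
    return values
```

Each step in the sketch is a short arithmetic or comparison operation on a small list of numbers, which is the level at which the limitations are analyzed above.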
Step 2A Prong Two Analysis:
This judicial exception is not integrated into a practical application. 
Claim 1 recites the following additional elements along with the abstract ideas:
obtaining an original feature vector 
providing the processed feature vector as an input to a machine learning based prediction unit, wherein the machine learning based prediction unit generates one or more predictions based at least in part on processed feature vector 
The steps of obtaining and providing are recited at a high level of generality and add insignificant extra-solution activity to the judicial exception. The additional element of the machine learning based prediction unit is recited at a high level of generality and generally links the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)).
Claim 1 does not integrate the abstract idea into a practical application.
Step 2B Analysis:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Under Step 2B of the Subject Matter Eligibility Test, the recited additional steps of obtaining and providing are well-understood, routine, conventional activity recognized in MPEP 2106.05(d)(II)(i) - receiving or transmitting data over a network. The additional element of the prediction unit is recited at a high level of generality and generally links the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)).
Claim 1 does not contribute an inventive concept.


Regarding Claim 3, 
Claim 3 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis
Claim 3 is directed to a computer-implemented method, which is one of the statutory categories. 
Step 2A Prong One Analysis: 
Claim 3 does not recite additional limitations that are directed to an abstract idea.
Step 2A Prong Two Analysis:
This judicial exception is not integrated into a practical application.
Claim 3 recites the following additional element along with the abstract ideas:
obtaining the first number
The step of obtaining is recited at a high level of generality and adds insignificant extra-solution activity to the judicial exception.
Claim 3 does not integrate the abstract idea into a practical application.
Step 2B Analysis:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Under Step 2B of the Subject Matter Eligibility Test, the recited additional step of obtaining is well-understood, routine, conventional activity recognized in MPEP 2106.05(d)(II)(i) - receiving or transmitting data over a network.
Claim 3 does not contribute an inventive concept.


Regarding Claim 4, 
Claim 4 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis
Claim 4 is directed to a computer-implemented method, which is one of the statutory categories. 
Step 2A Prong One Analysis: 
Claim 4 recites the abstract idea in the following limitation:
determining the first number based at least in part on one or more first trained parameters
The step of determining recites evaluation and judgment, which fall within the mental processes grouping of abstract ideas. Thus, the claim recites a judicial exception and requires further analysis under Step 2A Prong Two.
Step 2A Prong Two Analysis:
This judicial exception is not integrated into a practical application.
Claim 4 does not recite any additional element.
Claim 4 does not integrate the abstract idea into a practical application.
Step 2B Analysis:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Claim 4 does not recite any additional element.
Claim 4 does not contribute an inventive concept.


Regarding Claim 5, 
Claim 5 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis
Claim 5 is directed to a computer-implemented method, which is one of the statutory categories. 
Step 2A Prong One Analysis: 
Claim 5 recites the abstract idea in the following limitation:
identifying the group of input feature values comprises determining the group of input feature values based at least in part on the original feature vector
The steps of identifying and determining recite evaluation and judgment, which fall within the mental processes grouping of abstract ideas. Thus, the claim recites a judicial exception and requires further analysis under Step 2A Prong Two.
Step 2A Prong Two Analysis:
This judicial exception is not integrated into a practical application.
Claim 5 does not recite any additional element.
Claim 5 does not integrate the abstract idea into a practical application.
Step 2B Analysis:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Claim 5 does not recite any additional element.
Claim 5 does not contribute an inventive concept.


Regarding Claim 6, 
Claim 6 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis
Claim 6 is directed to a computer-implemented method, which is one of the statutory categories. 
Step 2A Prong One Analysis: 
Claim 6 recites the abstract idea in the following limitation:
identifying the group of input feature values comprises determining the group of input feature values based at least in part on the output feature value selected in a previous iteration of the feature engineering transformation
The steps of identifying and determining recite evaluation and judgment, which fall within the mental processes grouping of abstract ideas. Thus, the claim recites a judicial exception and requires further analysis under Step 2A Prong Two.
Step 2A Prong Two Analysis:
This judicial exception is not integrated into a practical application.
Claim 6 does not recite any additional element.
Claim 6 does not integrate the abstract idea into a practical application.
Step 2B Analysis:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Claim 6 does not recite any additional element.
Claim 6 does not contribute an inventive concept.


Regarding Claim 7, 
Claim 7 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis
Claim 7 is directed to a computer-implemented method, which is one of the statutory categories. 
Step 2A Prong One Analysis: 
Claim 7 recites the abstract idea in the following limitation:
determining the group of engineered feature values based at least in part on the group of output feature values selected in a final iteration of the feature engineering transformation as the group of engineered feature value
The step of determining recites evaluation and judgment, which fall within the mental processes grouping of abstract ideas. Thus, the claim recites a judicial exception and requires further analysis under Step 2A Prong Two.
Step 2A Prong Two Analysis:
This judicial exception is not integrated into a practical application.
Claim 7 does not recite any additional element.
Claim 7 does not integrate the abstract idea into a practical application.
Step 2B Analysis:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Claim 7 does not recite any additional element.
Claim 7 does not contribute an inventive concept.


Regarding Claim 8, 
Claim 8 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis
Claim 8 is directed to a computer-implemented method, which is one of the statutory categories. 
Step 2A Prong One Analysis: 
Claim 8 recites the abstract idea in the following limitation:
generating the interactive scoring parameter for each interactive feature value generated during the iteration
The step of generating a number recites evaluation, which falls within the mental processes grouping of abstract ideas. Thus, the claim recites a judicial exception and requires further analysis under Step 2A Prong Two.
Step 2A Prong Two Analysis:
This judicial exception is not integrated into a practical application.
Claim 8 does not recite any additional element.
Claim 8 does not integrate the abstract idea into a practical application.
Step 2B Analysis:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Claim 8 does not recite any additional element.
Claim 8 does not contribute an inventive concept.


Regarding Claim 9, 
Claim 9 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis
Claim 9 is directed to a computer-implemented method, which is one of the statutory categories. 
Step 2A Prong One Analysis: 
Claim 9 recites the abstract ideas in the following limitations:
applying a scoring function to the respective interactive feature value to generate an attention-based parameter for the respective interactive feature value
applying a normalization function to the attention-based parameter to generate a normalized attention-based parameter for the respective interactive feature value
determining the interactive scoring parameter for the interactive feature value based at least in part on the normalized attention-based parameter for the respective interactive feature value
The steps of applying a scoring function and applying a normalization function recite mathematical calculations, which fall within the mathematical concepts grouping of abstract ideas. The step of determining recites evaluation and judgment, which fall within the mental processes grouping of abstract ideas. Thus, the claim recites a judicial exception and requires further analysis under Step 2A Prong Two.
Step 2A Prong Two Analysis:
This judicial exception is not integrated into a practical application.
Claim 9 does not recite any additional element.
Claim 9 does not integrate the abstract idea into a practical application.
Step 2B Analysis:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Claim 9 does not recite any additional element.
Claim 9 does not contribute an inventive concept.
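For illustration only, the scoring recited in Claim 9 might be sketched as below, using the hyperbolic tangent named in Claim 10 and the softmax named in Claim 11. This is a sketch under stated assumptions, not the applicant's implementation: the function name is hypothetical, and the interactive scoring parameter is assumed to be equal to the normalized attention-based parameter, although the claim only requires that it be based at least in part on that parameter.

```python
import math


def interactive_scoring_parameters(interactive_values):
    """Return one assumed scoring parameter per interactive feature value."""
    # Scoring function: hyperbolic tangent gives the attention-based parameter.
    attention = [math.tanh(v) for v in interactive_values]
    # Normalization function: softmax, shifted by the maximum for stability.
    shift = max(attention)
    exps = [math.exp(a - shift) for a in attention]
    total = sum(exps)
    # Assumption: the scoring parameter is the normalized parameter itself.
    return [e / total for e in exps]
```

The returned parameters are positive and sum to one, and a larger interactive feature value receives a larger parameter because both functions are increasing.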


Regarding Claim 10, 
Claim 10 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis
Claim 10 is directed to a computer-implemented method, which is one of the statutory categories. 
Step 2A Prong One Analysis: 
Claim 10 does not recite additional abstract ideas.
Step 2A Prong Two Analysis:
This judicial exception is not integrated into a practical application.
Claim 10 recites the following additional element along with the abstract ideas:
the scoring function is a hyperbolic tangent function.
The additional element of the hyperbolic tangent function is recited at a high level of generality and generally links the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)).
Claim 10 does not integrate the abstract idea into a practical application.
Step 2B Analysis:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of the hyperbolic tangent function is recited at a high level of generality and generally links the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)).
Claim 10 does not contribute an inventive concept.


Regarding Claim 11, 
Claim 11 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis
Claim 11 is directed to a computer-implemented method, which is one of the statutory categories. 
Step 2A Prong One Analysis: 
Claim 11 does not recite additional abstract ideas.
Step 2A Prong Two Analysis:
This judicial exception is not integrated into a practical application.
Claim 11 recites the following additional element along with the abstract ideas:
the normalization function is a softmax normalization function.
The additional element of the softmax normalization function is recited at a high level of generality and generally links the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)).
Claim 11 does not integrate the abstract idea into a practical application.
Step 2B Analysis:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of the softmax normalization function is recited at a high level of generality and generally links the use of the judicial exception to a particular technological environment or field of use (MPEP 2106.05(h)).
Claim 11 does not contribute an inventive concept.


Regarding Claim 12, 
Claim 12 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis
Claim 12 is directed to a computer-implemented method, which is one of the statutory categories. 
Step 2A Prong One Analysis: 
Claim 12 does not recite additional abstract ideas.
Step 2A Prong Two Analysis:
This judicial exception is not integrated into a practical application.
Claim 12 recites the following additional element along with the abstract ideas:
the group of high-order feature values is generated at least in part on a multilayer perceptron neural network
The additional element of the multilayer perceptron neural network is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (MPEP 2106.05(f)).
Claim 12 does not integrate the abstract idea into a practical application.
Step 2B Analysis:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of the multilayer perceptron neural network is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (MPEP 2106.05(f)).
Claim 12 does not contribute an inventive concept.


Regarding Claim 13, 
Claim 13 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis
Claim 13 is directed to a computer-implemented method, which is one of the statutory categories. 
Step 2A Prong One Analysis: 
Claim 13 does not recite additional abstract ideas.
Step 2A Prong Two Analysis:
This judicial exception is not integrated into a practical application.
Claim 13 recites the following additional element along with the abstract ideas:
the group of high-order features is generated based at least in part on a convolutional neural network
The additional element of the convolutional neural network is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (MPEP 2106.05(f)).
Claim 13 does not integrate the abstract idea into a practical application.
Step 2B Analysis:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of the convolutional neural network is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), or no more than mere instructions to implement an abstract idea or other exception on a computer (MPEP 2106.05(f)).
Claim 13 does not contribute an inventive concept.


Regarding Claims 14 – 19,
Claims 14 – 19 are apparatus claims corresponding to Claims 1 and 5 – 9. Claims 14 – 19 recite additional elements of a processor, memory, and program code. However, these elements are recited at a high level of generality and generally link the use of the judicial exception to a particular technological environment or field of use. Thus, Claims 14 – 19 do not integrate the abstract idea into a practical application or contribute an inventive concept. Claims 14 – 19 are rejected for the same reasons as Claims 1 and 5 – 9.

Regarding Claim 20, 
Claim 20 is a non-transitory computer-readable storage medium claim corresponding to Claim 1. Claim 20 recites additional elements of a computer-readable storage medium and program code instructions. However, these elements are recited at a high level of generality and generally link the use of the judicial exception to a particular technological environment or field of use. Thus, Claim 20 does not integrate the abstract idea into a practical application or contribute an inventive concept. Claim 20 is rejected for the same reasons as Claim 1.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 3 – 4, 6 – 8, 12, 14, 16 – 18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Guo, "DeepFM: A Factorization-Machine based Neural Network for CTR Prediction," arXiv, 2017, in view of Ferrucci, "A Framework for Genetic Algorithms Based on Hadoop," arXiv, 2013, and further in view of Tsang, "Detecting Statistical Interactions from Neural Network Weights," arXiv, Feb. 2018.

Regarding Claim 1, Guo discloses: A computer-implemented method for reducing feature sparsity or cardinality by generating a processed feature vector from an original feature vector (Guo, fig. 1, where the outputs from the FM layer and the output from the hidden layer together [processed feature vector] are generated and reduced from the dense embeddings [original feature vector]), the computer-implemented method comprising: obtaining an original feature vector; generating, based at least in part on the original feature vector, a group of low-order feature values by (a) performing … a feature engineering transformation to generate a group of engineered feature values and (b) determining the group of low-ordered feature values based, at least in part, on a second number of feature values from the group of engineered feature values (Guo, fig. 1, sec. 2.2, para. 3 – 4, where the dense embeddings [original feature vector] are processed by the FM layer to generate low-order feature interactions [engineered feature values], and the number of inner product nodes [group of low-order feature values] is based on the number of embeddings [second number of feature values]),
wherein performing … the feature engineering transformation comprises: … (ii) identifying a group of input feature value pairs, wherein each input feature value pair from the group of input feature value pairs comprises a first feature value from the group of input feature values and a second feature value from the group of input feature values (Guo, fig. 2, where in the FM layer, each inner product node takes a pair of inputs [feature value pair including a first feature value and a second feature value] from the embeddings);
(iii) for each input feature value pair from the group of input feature value pairs, performing a pairwise transformation of the first feature value in the input feature value pair and the second feature value in the input feature value pair to generate a corresponding interactive feature value for the input feature value pair (Guo, fig. 2, where in the FM layer, the inner product [pairwise transformation] is performed on the values in each pair [first feature value and second feature value], and the resulting value represents the feature interaction [interactive feature value]);
generating, based at least in part on the original feature vector, a group of high-order feature values (Guo, fig. 1, where the hidden layer processes the embeddings [original feature vector] and generates an output [high-order feature values] to the output layer);
merging the group of low-order feature values and the group of high-order feature values to generate a processed feature vector corresponding to the original feature vector and providing the processed feature vector as an input to a machine-learning based prediction unit, wherein the machine-learning based prediction unit generates one or more predictions based at least in part on processed feature vector (Guo, fig. 1, where the outputs from FM layer [group of low order feature values] and output from hidden layer [high order feature values] together are the input of the output units [machine-learning based prediction unit] which produce CTR prediction);
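For illustration only, the mapping to Guo discussed above can be pictured with the simplified sketch below. It is not Guo's code and makes several assumptions: the function names, the single ReLU hidden layer, the hypothetical weights and biases, the omission of the first-order term, and the summing of the merged values before the sigmoid are all simplifications chosen for brevity.

```python
import math
from itertools import combinations


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def fm_second_order(embeddings):
    """Low-order values: one inner product per pair of field embeddings."""
    return [dot(u, v) for u, v in combinations(embeddings, 2)]


def hidden_layer(embeddings, weights, biases):
    """High-order values: one ReLU layer over the concatenated embeddings."""
    flat = [x for e in embeddings for x in e]
    return [max(0.0, dot(w, flat) + b) for w, b in zip(weights, biases)]


def predict(embeddings, weights, biases):
    """Merge both groups and map their sum to a probability with a sigmoid."""
    merged = fm_second_order(embeddings) + hidden_layer(embeddings, weights, biases)
    return 1.0 / (1.0 + math.exp(-sum(merged)))
```

In this sketch, `fm_second_order` plays the role of the FM layer, `hidden_layer` the role of the hidden layer, and `predict` the role of the output unit that consumes both outputs.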
Guo does not explicitly disclose:
a first number of iterations of a feature engineering transformation
(i) identifying a group of input feature values for the iteration;
(iv) for each interactive feature value associated with an input feature value pair from the group of input feature value pairs, determining a scored interactive feature value associated with the input feature value pair based on the interactive feature value and an interactive scoring parameter for the interactive feature value
(v) from each scored interactive feature value associated with an input feature value pair from the group of input feature value pairs, selecting a third number of scored interactive feature values as a group of output feature values;
Ferrucci explicitly discloses: 
a first number of iterations of a feature engineering transformation (Ferrucci, fig. 7 & sec. IV A, where m [first number of iterations] generations of a Genetic Algorithm are used to perform feature subset selection)
(i) identifying a group of input feature values for the iteration (Ferrucci, sec. IV, para. 1, ln. 9 – 10, where an optimal dataset [input feature values] is found in each iteration using the Genetic Algorithm);
Guo and Ferrucci both disclose feature engineering for machine learning models and are analogous art. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Guo's disclosure of feature engineering using order-1 and order-2 feature interactions with Ferrucci's disclosure of feature selection using a Genetic Algorithm to arrive at the claimed invention. One of ordinary skill in the art would have been motivated to make this modification because the order-2 interaction of Guo multiplies the number of intermediate features, which in turn increases the cost of computation, while the feature subset selection of Ferrucci finds the optimal subset of features, which reduces the number of features and thus the cost of computation (Ferrucci, sec. IV A, para. 3).
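For illustration only, feature subset selection over a fixed number of generations, as relied upon from Ferrucci, can be pictured with the generic sketch below. It is not Ferrucci's Hadoop framework: the function name, the population size, the single-bit mutation, the keep-the-best-half selection, and the caller-supplied fitness function are all assumptions made to keep the example short.

```python
import random


def select_feature_subset(n_features, fitness, generations, pop_size=8, seed=0):
    """Evolve boolean masks (True = feature kept) for a set number of generations."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    population = [
        [rng.random() < 0.5 for _ in range(n_features)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Selection: keep the fitter half of the population unchanged.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Mutation: each parent yields one child with a single flipped bit.
        children = []
        for parent in parents:
            child = list(parent)
            i = rng.randrange(n_features)
            child[i] = not child[i]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

Because the fitter half is always carried over, the best mask never gets worse from one generation to the next, and a smaller selected subset means fewer feature pairs for any later pairwise step.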
Guo in view of Ferrucci does not explicitly disclose:
(iv) for each interactive feature value associated with an input feature value pair from the group of input feature value pairs, determining a scored interactive feature value associated with the input feature value pair based on the interactive feature value and an interactive scoring parameter for the interactive feature value
(v) from each scored interactive feature value associated with an input feature value pair from the group of input feature value pairs, selecting a third number of scored interactive feature values as a group of output feature values
Tsang explicitly discloses: 
for each interactive feature value associated with an input feature value pair from the group of input feature value pairs, determining a scored interactive feature value associated with the input feature value pair based on the interactive feature value and an interactive scoring parameter for the interactive feature value (Tsang, eq. 1 & sec. 3.3 where wi is the interaction strength [interactive score parameter]; sec. 4.4 para. 2, where rank all pairs of features {i,j} according to their interaction strengths w{i,j}; sec. 2.3, para. 3, where h(l) [scored interactive feature value] is given by $h^{(l)} = \phi\left(W^{(l)} h^{(l-1)} + b^{(l)}\right)$; i.e., under the pairwise setup, the first layer in fig. 2 with h(1) is the interaction between feature value pair, the second layer h(2) [scored interactive feature value] is calculated by the interactive feature value h(1) with a parameter W(2) which indicate the strength of the pairwise interaction);
and from each scored interactive feature value associated with an input feature value pair from the group of input feature value pairs, selecting a third number of scored interactive feature values as a group of output feature values (Tsang, sec. 4, ln. 4, where a ranking of interaction candidates is obtained and a cutoff is determined for the top-K [third number] interactions; i.e., only K interactions are selected as effective and will be used).
Guo (in view of Ferrucci) and Tsang both disclose feature engineering for machine learning models and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Guo (in view of Ferrucci)’s disclosure of feature engineering by iterative feature selection using a Genetic Algorithm and inference using pairwise interaction with Tsang’s disclosure of detecting the effective interactions and a cutoff between input features to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification to create a simpler and more interpretable model (Tsang, sec. 1, para. 2, ln. 8 – 9).
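
For illustration only, the scoring and top-K selection steps mapped to Tsang above can be sketched as follows. This is a minimal sketch under stated assumptions, not a reproduction of Tsang's method: the ReLU activation, the weight matrix W2, the bias b2, the interaction strengths, and the function names are all hypothetical values chosen for the example.

```python
def relu(z):
    # Assumed activation; the layer equation only requires some activation phi.
    return max(0.0, z)

def hidden_layer(weights, prev, bias, phi=relu):
    """Compute h^(l) = phi(W^(l) h^(l-1) + b^(l)) for one layer."""
    return [phi(sum(w * h for w, h in zip(row, prev)) + b)
            for row, b in zip(weights, bias)]

def top_k_interactions(strengths, k):
    """Rank feature pairs by interaction strength and keep the top k pairs."""
    ranked = sorted(strengths.items(), key=lambda kv: kv[1], reverse=True)
    return [pair for pair, _ in ranked[:k]]

# Hypothetical interactive feature values h^(1), one per input feature pair.
h1 = [0.5, -0.2, 0.1]
# Hypothetical scoring parameters W^(2) and bias b^(2).
W2 = [[1.0, 0.0, 0.0],
      [0.0, 2.0, 0.0],
      [0.0, 0.0, 0.5]]
b2 = [0.0, 0.0, 0.0]
h2 = hidden_layer(W2, h1, b2)  # scored interactive feature values h^(2)

# Hypothetical interaction strengths for pairs {i, j}; keep the top K = 2.
strengths = {(0, 1): 0.9, (0, 2): 0.1, (1, 2): 0.4}
selected = top_k_interactions(strengths, k=2)
```

In this sketch the cutoff K plays the role of the "third number": only the K highest-ranked pairs survive as output.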

Regarding Claim 3, depending on Claim 1, Guo in view of Ferrucci and Tsang discloses the method of Claim 1. Guo in view of Ferrucci and Tsang further discloses: comprising obtaining the first number (Ferrucci, page 6, col. 2, para. 3, where the algorithm checks whether the stopping conditions have occurred, e.g., the count of the maximum number of generations [first number] has been reached; i.e., the maximum number of generations is a parameter of the training).

Regarding Claim 4, depending on Claim 1, Guo in view of Ferrucci and Tsang discloses the method of Claim 1. Guo in view of Ferrucci and Tsang further discloses: determining the first number based at least in part on one or more first trained parameters (Ferrucci, page 6, col. 2, para. 3, where the algorithm checks whether the stopping conditions have occurred … at least one individual has been marked as satisfying the termination criterion; i.e., the objective function is evaluated during each iteration based on trained parameters).

Regarding Claim 6, depending on Claim 1, Guo in view of Ferrucci and Tsang discloses the method of Claim 1. Guo in view of Ferrucci and Tsang further discloses: during each current iteration of the feature engineering transformation after an initial iteration of the feature engineering transformation: identifying the group of feature values comprises determining the group of input feature values based at least in part on the group of output feature values selected in a previous iteration of the feature engineering transformation; and the previous iteration of the feature engineering transformation is an iteration of the feature engineering transformation immediately preceding the current iteration of the feature engineering transformation (Ferrucci, fig. 8 & page 8, col. 2, para. 2, where the population [group of input feature values] of the current iteration is determined based on the fitness of the prior iteration immediately preceding the current iteration. Fitness is calculated based on the output accuracy of each of the subsets of the model, and the best accuracy is returned [selected]).

Regarding Claim 7, depending on Claim 1, Guo in view of Ferrucci and Tsang discloses the method of Claim 1. Guo in view of Ferrucci and Tsang further disclose: 
wherein generating the group of engineered feature values comprises determining the group of engineered feature values based at least in part on the group of output feature values selected in a final iteration of the feature engineering transformation as the group of engineered feature values (Ferrucci, page 6, sec. III. C, para. 2, where the stopping condition has occurred … at least one individual has been marked as satisfying the termination criterion during the most recent generation; i.e., based on the objective calculated from the output, it is determined that the feature set is the final feature set).
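
For illustration only, the iterative behavior relied upon in the rejections of Claims 3, 4, 6 and 7 (a maximum generation count, a termination criterion, and each generation's input being derived from the preceding generation's output) can be sketched as a toy genetic algorithm over binary feature masks. This is a sketch under stated assumptions and not Ferrucci's Hadoop-based framework: the function name run_generations, the fitness function, the IDEAL mask, the population size, and the crossover and mutation scheme are all hypothetical.

```python
import random

def run_generations(num_features, fitness, max_generations,
                    target=None, pop_size=8, seed=0):
    """Toy genetic algorithm over binary feature masks (illustrative only)."""
    rng = random.Random(seed)
    # Initial iteration: random masks over the original feature positions.
    population = [[rng.randint(0, 1) for _ in range(num_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    generations_run = 0
    for generation in range(max_generations):  # the "first number" of iterations
        # Stopping condition: an individual satisfies the termination criterion.
        if target is not None and fitness(best) >= target:
            break
        # Output of this generation: the fitter half of the population.
        parents = sorted(population, key=fitness, reverse=True)[:pop_size // 2]
        # Input of the next generation is derived from that output.
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, num_features)
            child = a[:cut] + b[cut:]          # one-point crossover
            flip = rng.randrange(num_features)
            child[flip] = 1 - child[flip]      # single-bit mutation
            children.append(child)
        population = parents + children
        best = max(population, key=fitness)
        generations_run = generation + 1
    return best, generations_run

# Hypothetical fitness: agreement with an arbitrary "ideal" mask.
IDEAL = [1, 0, 1, 1, 0, 0]

def toy_fitness(mask):
    return sum(m == t for m, t in zip(mask, IDEAL))

best_mask, generations = run_generations(6, toy_fitness,
                                         max_generations=20, target=6)
```

The mask returned after the last executed generation corresponds to the output of the final iteration; because the fitter half is carried forward unchanged, the best fitness never decreases from one generation to the next.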

Regarding Claim 8, depending on Claim 1, Guo in view of Ferrucci and Tsang discloses the method of Claim 1. Guo in view of Ferrucci and Tsang further disclose: performing each iteration of the feature engineering transformation further comprises: generating the interactive scoring parameter for each interactive feature value generated during the iteration (the interaction strength [interactive score parameter] wi of Tsang is trained with the set of features selected in each iteration of Genetic Algorithm of Ferrucci to determine the effective feature interactions).

Regarding Claim 12, depending on Claim 1, Guo in view of Ferrucci and Tsang discloses the method of Claim 1. Guo in view of Ferrucci and Tsang further disclose: wherein the group of high-order feature values is generated based at least in part on a multilayer perceptron neural network (Guo fig. 1, where the high order feature is processed by the hidden layer which is a neural network having multilayer perceptron).

Regarding Claim 14, Claim 14 is an apparatus claim corresponding to Claim 1. Guo in view of Ferrucci and Tsang discloses the method of Claim 1. Ferrucci further discloses: An apparatus … comprising at least one processor and at least one memory including program code, the at least one memory and the program code configured to, with the processor, cause the apparatus to (Ferrucci, intro, ln. 6, where the Apache Hadoop platform involves a computing device with processors and memory that stores program code of instructions for the processors). Claim 14 is rejected for the same reasons as Claim 1.
Guo and Ferrucci both disclose machine learning models and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Guo’s disclosure of feature engineering using high- and low-order interactions with Ferrucci’s disclosure of performing machine learning in a computing environment to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification because the combination yields predictable results.

Regarding Claims 16 – 18, Claims 16 – 18 are the apparatus claims corresponding to Claims 6 – 8. Claims 16 – 18 are rejected for the same reasons as Claims 6 – 8.

Regarding Claim 20, Claim 20 is a non-transitory computer-readable storage medium claim corresponding to Claim 1. Guo in view of Ferrucci and Tsang discloses the method of Claim 1. Ferrucci further discloses: program code instructions that when executed cause a computing device to (Ferrucci, intro, ln. 6, where the Apache Hadoop platform involves a computing device with processors and memory that stores program code of instructions for the processors). Claim 20 is rejected for the same reasons as Claim 1.
Guo and Ferrucci both disclose machine learning models and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Guo’s disclosure of feature engineering using high- and low-order interactions with Ferrucci’s disclosure of performing machine learning in a computing environment to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification because the combination yields predictable results.

Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Guo, DeepFM: A Factorization-Machine based Neural Network for CTR Prediction, arXiv, 2016, in view of Tsang, Detecting Statistical Interactions from Neural Network Weights, arXiv, Feb. 2018, and Ferrucci, A Framework for Genetic Algorithms Based on Hadoop, arXiv, 2013, and further in view of Diaz-Gomez, Initial Population for Genetic Algorithms: A Metric Approach, GEM 2007, p. 43 – 49.

Regarding Claim 5, depending on Claim 1, Guo in view of Ferrucci and Tsang discloses the method of Claim 1. Guo in view of Ferrucci and Tsang do not explicitly disclose: during an initial iteration of the feature engineering transformation, identifying the group of input feature values comprises determining the group of input feature values based on the original feature vector.
Diaz-Gomez explicitly discloses: during an initial iteration of the feature engineering transformation, identifying the group of input feature values comprises determining the group of input feature values based at least in part on the original feature vector (sec. 4.5, para. 4, eq. 10 – 11, where we are suggesting to use this metric in uniformly randomly generated genes in a population; i.e. the initial population is generated based on the distribution of original input/feature values).
Guo (in view of Ferrucci and Tsang) and Diaz-Gomez both disclose Genetic Algorithms in machine learning models and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Guo (in view of Ferrucci and Tsang)’s disclosure of iterative feature engineering using a Genetic Algorithm with Diaz-Gomez’s disclosure of an optimization method for the initial population of a Genetic Algorithm to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification to avoid bias towards a specific region of the search space (Diaz-Gomez, sec. 5, para. 3, ln. 13 – 18).

Regarding Claim 15, Claim 15 is the apparatus claim corresponding to Claim 5. Claim 15 is rejected for the same reasons as Claim 5.

Claims 9 – 11 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Guo, DeepFM: A Factorization-Machine based Neural Network for CTR Prediction, arXiv, 2016, in view of Tsang, Detecting Statistical Interactions from Neural Network Weights, arXiv, Feb. 2018, and Ferrucci, A Framework for Genetic Algorithms Based on Hadoop, arXiv, 2013, and further in view of Ribalta, Band Selection from Hyperspectral Images Using Attention-based Convolutional Neural Networks, arXiv, Oct. 2018.

Regarding Claim 9, depending on Claim 8, Guo in view of Ferrucci and Tsang discloses the method of Claim 8. Guo in view of Ferrucci and Tsang do not explicitly disclose: generating the interactive scoring parameter for a respective interactive feature value comprises: applying a scoring function to the respective interactive feature value to generate an attention-based parameter for the respective interactive feature value
applying a normalization function to the attention-based parameter to generate a normalized attention-based parameter for the respective interactive feature value; 
and determining the interactive scoring parameter for the interactive feature value based at least in part on the normalized attention-based parameter for the respective interactive feature value.
Ribalta explicitly discloses:
generating the interactive scoring parameter for a respective interactive feature value comprises: applying a scoring function to the respective interactive feature value to generate an attention-based parameter for the respective interactive feature value (Ribalta, sec. II B, para. 1, & eq. 4, where the confidence score c [attention-based parameter for the respective feature value] is calculated by applying tanh() [scoring function] to the Hl feature value);
applying a normalization function to the attention-based parameter to generate a normalized attention-based parameter for the respective interactive feature value, and determining the interactive scoring parameter for the interactive feature value based on the normalized attention-based parameter for the respective interactive feature value (Ribalta, sec. II B, & eq. 5, where softmax normalization function is used on the confidence score [attention-based parameter for respective feature value] and the overall normalization effect on each of the output [scoring parameter] ol is based on the softmax normalization calculation).
Guo (in view of Ferrucci and Tsang) and Ribalta both disclose feature selection in machine learning models and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Guo (in view of Ferrucci and Tsang)’s disclosure of iterative feature selection with Ribalta’s disclosure of an attention-based approach for feature selection to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification in order to shorten the training time and compress the data without sacrificing the amount of information (Ribalta, sec. I. B, para. 3, ln. 13 – 19).

Regarding Claim 10, depending on Claim 9, Guo in view of Ferrucci, Tsang and Ribalta discloses the method of Claim 9. Guo in view of Ferrucci, Tsang and Ribalta further disclose: wherein the scoring function is a hyperbolic tangent function (Ribalta, eq. 4, where score function is tanh [hyperbolic tangent]).

Regarding Claim 11, depending on Claim 9, Guo in view of Ferrucci, Tsang and Ribalta discloses the method of Claim 9. Guo in view of Ferrucci, Tsang and Ribalta further disclose: wherein the normalization function is a softmax normalization function (Ribalta, eq. 5, where softmax is applied to normalize each band).
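
For illustration only, the scoring and normalization steps mapped in the rejections of Claims 9 – 11 (a hyperbolic tangent scoring function followed by softmax normalization) can be sketched as follows. This is a minimal sketch under stated assumptions: the scalar weight and bias stand in for trained parameters, the input values are hypothetical, and the exact form of Ribalta's eq. 4 and eq. 5 is not reproduced here.

```python
import math

def attention_scores(values, weight=1.0, bias=0.0):
    """Scoring function: tanh of an affine map of each feature value.

    weight and bias are hypothetical stand-ins for trained parameters.
    """
    return [math.tanh(weight * v + bias) for v in values]

def softmax(scores):
    """Normalization function: softmax, shifted by the max for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical interactive feature values.
values = [0.2, 1.5, -0.7]
scores = attention_scores(values)   # attention-based parameters
weights = softmax(scores)           # normalized attention-based parameters
# One possible use of the normalized parameters: reweight each feature value.
scored_values = [w * v for w, v in zip(weights, values)]
```

The tanh output is bounded to the open interval (-1, 1), and the softmax output sums to one, so the normalized parameters can be read as relative weights across the feature values.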

Regarding Claim 19, Claim 19 is the apparatus claim corresponding to Claim 9. Claim 19 is rejected for the same reasons as Claim 9.

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Guo, DeepFM: A Factorization-Machine based Neural Network for CTR Prediction, arXiv, 2016, in view of Tsang, Detecting Statistical Interactions from Neural Network Weights, arXiv, Feb. 2018, and Ferrucci, A Framework for Genetic Algorithms Based on Hadoop, arXiv, 2013, and further in view of Zheng, Wide and Deep Convolutional Neural Networks for Electricity-Theft Detection to Secure Smart Grids, IEEE Transactions on Industrial Informatics, Vol. 14, No. 4, April 2018.

Regarding Claim 13, depending on Claim 1, Guo in view of Ferrucci and Tsang discloses the method of Claim 1. Guo in view of Ferrucci and Tsang do not explicitly disclose wherein processing the group of original features to generate the group of high-order features comprises processing the group of original features using a convolutional neural network.
Zheng explicitly discloses: wherein the group of high-order features is generated based at least in part on a convolutional neural network (Zheng, fig. 5, where in the wide and deep architecture, the deep model to process high order feature is a CNN model).
Guo (in view of Ferrucci and Tsang) and Zheng both disclose wide and deep machine learning models and are analogous. It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Guo (in view of Ferrucci and Tsang)’s disclosure of having a Factorization Machine as the wide model with Zheng’s disclosure of a deep CNN model to achieve the claimed teaching. One of ordinary skill in the art would have been motivated to make this modification in order to identify the patterns in the 2-D data (Zheng, abs., ln. 19 – 22).
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHIEN MING CHOU whose telephone number is (571) 272-9354. The examiner can normally be reached Monday – Friday, 9 am – 5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CHAKI KAKALI can be reached on (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/S.C./Examiner, Art Unit 2122
/BRIAN M SMITH/Primary Examiner, Art Unit 2122