Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Status of Claims
This office action is in response to the communication filed on 05/23/2022, amending claims 1-19. This application was filed on 04/24/2020 and claims priority to provisional application 62/465,919, filed 03/02/2017.
Claims 1, 5, and 15 have been amended by Examiner’s Amendment; claims 2-4, 8-9, 12-14, and 18-19 have been cancelled by Examiner’s Amendment.

USC § 101 Analysis
Independent claims 1 and 11, and dependent claims 5-7, 15-17, and 20, are directed to a technical solution to a technical problem associated with initialization training on a pre-existing, stored metagenomic data cluster associated with labeled training data. A supervised learning algorithm, a Naïve Bayes classifier, is applied to that cluster, and the classifier is then used to classify newly received samples in conjunction with updates provided in the Expectation step of an unsupervised learning algorithm, Expectation-Maximization. The updates to the classifier are incremental updates made when newly received samples cannot be confidently classified into any existing training data; the classifier’s algorithm is updated by updating a probabilistic decision boundary as new samples arrive. In the Maximization step of Expectation-Maximization, the classifier is updated pursuant to an updated parameter set based on the Expectation step update.
Thus, based on the aforementioned analysis, claim(s) 1, 5-7, 10-11, 15-17 and 20 are patent eligible.
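As a rough illustration of the workflow described in the analysis above, the following sketch trains a toy multinomial Naive Bayes model on labeled data, classifies incoming samples, and sets aside any sample whose best posterior falls below a threshold. The class name `IncrementalNB`, the `process` helper, the 0.9 threshold, and the toy k-mer data are all assumptions made for illustration; this action does not specify how confidence is measured or how the application implements these steps.

```python
import math
from collections import Counter, defaultdict


class IncrementalNB:
    """Toy multinomial Naive Bayes whose counts can grow as samples arrive.

    Hypothetical helper for illustration; not taken from the application.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing factor
        self.word_counts = defaultdict(Counter)  # class -> word frequencies
        self.class_counts = Counter()            # class -> number of samples
        self.vocab = set()

    def update(self, words, label):
        """Fold one labeled (or confidently classified) sample into the counts."""
        self.word_counts[label].update(words)
        self.class_counts[label] += 1
        self.vocab.update(words)

    def posterior(self, words):
        """Return P(class | words) for every known class."""
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n / total)
            for w in words:
                score += math.log((counts[w] + self.alpha) / denom)
            log_scores[label] = score
        top = max(log_scores.values())
        unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
        z = sum(unnorm.values())
        return {c: v / z for c, v in unnorm.items()}


def process(model, words, undetermined, threshold=0.9):
    """Classify a new sample, or set it aside if no class is confident.

    The threshold value is an assumption, not a figure from the record.
    """
    post = model.posterior(words)
    label = max(post, key=post.get)
    if post[label] >= threshold:
        model.update(words, label)   # incremental update with the new sample
        return label
    undetermined.append(words)       # held back as undetermined data
    return None


# Toy labeled cluster (illustrative data only).
model = IncrementalNB()
model.update(["AAA", "AAC", "AAA"], "genome_a")
model.update(["GGG", "GGC", "GGG"], "genome_b")

undetermined = []
confident = process(model, ["AAA", "AAA", "AAC"], undetermined)
unsure = process(model, ["TTT"], undetermined)
```

In this toy run the first new sample is assigned to `genome_a` and folded into the counts, while the second, which contains only an unseen token, is routed to the undetermined list.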



35 USC § 103 
The closest prior art references of record, Layer (US 2016/0132640), Non-Patent Literature (NPL) Kraken, and NPL Dalvi, are withdrawn from consideration pursuant to Allowable Subject Matter.

Examiner’s Amendment
Authorization for this examiner’s amendment was given in an Examiner-Initiated Interview with Applicant’s Representative, Stephen (Steve) Schott, on 01 September 2022.
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Claim Objections
Claim objections have been withdrawn in view of Applicant’s amendments to the claims.

35 USC § 101 Software per se
Software per se rejections have been withdrawn in view of Applicant’s amendments to the claims.

--- Claims 1, 5, and 15 have been amended by Examiner’s Amendment;
Claims 2-4, 8-9, 12-14, and 18-19 have been cancelled by Examiner’s Amendment, as follows ---



AMENDMENT TO CLAIMS
1. (Currently Amended) A system for classifying samples comprising:
a processor, a memory accessible by the processor, and computer instructions stored in the memory and executable by the processor to perform: 
store a pre-existing training data cluster; 
apply a Naive Bayes Classifier (NBC) to data within the training data cluster; receive new data; and 
apply the NBC to the new data based on the NBC applied to training data;  
wherein the NBC includes a classification algorithm used to classify the data with the NBC; 
wherein the classification algorithm uses classification criterion to determine NBC, and
wherein the classification algorithm is updated based on the training data and receipt of the new data,
wherein updating of the classification algorithm is done using an Expectation-Maximization (EM) algorithm that estimates the NBC being accurately applied to the data, wherein the EM algorithm includes an Expectation Step applying a formula: 
$Q(X) = p(X \mid C, \Theta_{t-1})$,
wherein the EM algorithm includes a Maximization Step applying a formula:
$\Theta_t = \arg\max_{\Theta} \int_{x}^{\infty} Q(x) \log p(x, c, \Theta)\, dx$;
	wherein updating of the NBC includes using a framework based on High Precision rules to classify the data, wherein the High Precision rules generate pseudo-labeled data and high precision clusters from existing datasets comprising at least a simulated testing dataset and based on the pseudo-labeled data, the NBC is trained to create a probabilistic model representation for each data cluster, 
wherein X = (x1, x2, x3, ...xn) is an observation with n features, and there are m target classes (C1, C2, C3, ... Cm);
wherein $Q(x)$ represents an expected value of a likelihood function estimated by current parameter set, $\Theta_{t-1}$, C, and observation X, and where $\Theta_t$ is the updated parameter set based on expectation.

2-4. (Canceled)

5. (Previously presented) The system of claim [[2]] 1, wherein updating of the NBC

[formula image: media_image1.png]

where [image: media_image2.png] is a frequency of word xi in class [image: media_image3.png] is a total number of words in class Ck, d is a total number of unique words in the class Ck and α is a smoothing factor that users can tune.

6. (Previously presented) The system of claim 1, further identify some of the new data as not able to be classified, and add the new data to an undetermined data set.

7. (Previously presented) The system of claim 1, wherein the updating of the NBC includes using an incremental k-mer based metagenome fragment classifier (iKMF) to classify the data.

8-9. (Canceled)

10. (Previously presented) The system for claim 1 performed by a client.

11. (Currently Amended) A method for classifying samples, wherein the method comprises:
storing a pre-existing training data cluster;
applying a Naive Bayes Classifier (NBC) to data within the training data cluster;
receiving new data; and
applying the NBC to the new data based on the NBC applied to training data;
wherein the NBC includes a classification algorithm used to classify the data with the NBC;
wherein the classification algorithm uses classification criterion to determine the NBC, and
wherein the classification algorithm is updated based on the training data and receipt of the new data,
wherein updating of the classification algorithm is done using an Expectation-Maximization (EM) algorithm that estimates the NBC being accurately applied to the data, wherein the EM algorithm includes an Expectation Step applying a formula: 
$Q(X) = p(X \mid C, \Theta_{t-1})$,
wherein the EM algorithm includes a Maximization Step applying a formula:
$\Theta_t = \arg\max_{\Theta} \int_{x}^{\infty} Q(x) \log p(x, c, \Theta)\, dx$;
 wherein updating of the NBC includes using a framework based on High Precision rules to classify the data, wherein the High Precision rules generate pseudo-labeled data and high precision clusters from existing datasets comprising at least a simulated testing dataset and based on the pseudo-labeled data, the NBC is trained to create a probabilistic model representation for each data cluster, 
wherein:
X = (x1, x2, x3, ...xn) is an observation with n features, and there are m target classes (C1, C2, C3, ... Cm);
where $Q(x)$ represents an expected value of a likelihood function estimated by current parameter set, $\Theta_{t-1}$, C, and observation X, and where $\Theta_t$ is the updated parameter set based on expectation.

12-14. (Canceled)

15. (Currently amended) The method of claim [[12]] 11, wherein updating of the NBCa formula:

[formula image: media_image1.png]

where [image: media_image2.png] is a frequency of word xi in class [image: media_image3.png] is a total number of words in class Ck, d is a total number of unique words in the class Ck and α is a smoothing factor that users can tune.

16. (Previously presented) The method of claim 11, wherein the method identify some of the new data as not able to be classified, and add the new data to an undetermined data set.

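17. (Previously presented) The method of claim 11, wherein the updating of the NBC includes using an incremental k-mer based metagenome fragment classifier (iKMF) to classify the data.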

18-19. (Canceled) 

20. (Original) The method for claim 11, wherein the method is performed by a client.
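The formula recited in claims 5 and 15 is reproduced in this action only as an image, so it cannot be quoted here. As a rough illustration only, the sketch below assumes the standard additive (Laplace) smoothing estimate, which is consistent with the symbols the claims define: a per-class word frequency, a per-class word total, d unique words in the class, and a tunable smoothing factor α. Treating overlapping k-mers as the "words" follows the k-mer based classifier of claim 7 in spirit, but the function names `kmers` and `smoothed_likelihood` and the toy reads are hypothetical and are not taken from the application.

```python
from collections import Counter


def kmers(sequence, k):
    """Split a sequence into overlapping k-mers, treated here as 'words'."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def smoothed_likelihood(word, class_counts, alpha=1.0):
    """Additive-smoothing estimate of P(word | class).

    This is the textbook form, assumed here because the claimed formula
    is not legible in this copy of the record.
    """
    n_word = class_counts[word]            # frequency of the word in the class
    n_total = sum(class_counts.values())   # total number of words in the class
    d = len(class_counts)                  # number of unique words in the class
    return (n_word + alpha) / (n_total + alpha * d)


# Toy class built from two short reads (illustrative data only).
counts = Counter(kmers("ACGTAC", 3) + kmers("ACGA", 3))
p_seen = smoothed_likelihood("ACG", counts, alpha=1.0)
p_unseen = smoothed_likelihood("TTT", counts, alpha=1.0)
```

With α greater than zero, a k-mer never observed in the class still receives a small nonzero probability, which keeps a single unseen k-mer from forcing the product of likelihoods to zero.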

Allowable Subject Matter
Claims 1, 5-7, 10-11, 15-17 and 20 are allowed.

The following is an examiner’s statement of reasons for allowance:
While the closest prior art of record, Layer (US 2016/0132640), Non-Patent Literature (NPL) Kraken, and NPL Dalvi, discloses the employment of Expectation-Maximization in creating new classes to aid a Naïve Bayes classifier, it does not teach:
“apply a Naive Bayes Classifier (NBC) to data within the training data cluster; receive new data; and 
apply the NBC to the new data based on the NBC applied to training data;  
wherein the NBC includes a classification algorithm used to classify the data with the NBC; 
wherein the classification algorithm uses classification criterion to determine NBC, and
wherein the classification algorithm is updated based on the training data and receipt of the new data,
wherein updating of the classification algorithm is done using an Expectation-Maximization (EM) algorithm that estimates the NBC being accurately applied to the data, wherein the EM algorithm includes an Expectation Step applying a formula: 
$Q(X) = p(X \mid C, \Theta_{t-1})$,
wherein the EM algorithm includes a Maximization Step applying a formula:
$\Theta_t = \arg\max_{\Theta} \int_{x}^{\infty} Q(x) \log p(x, c, \Theta)\, dx$;
 	wherein updating of the NBC includes using a framework based on High Precision rules to classify the data, wherein the High Precision rules generate pseudo-labeled data and high precision clusters from existing datasets comprising at least a simulated testing dataset and based on the pseudo-labeled data, the NBC is trained to create a probabilistic model representation for each data cluster, 
wherein:
X = (x1, x2, x3, ...xn) is an observation with n features, and there are m target classes (C1, C2, C3, ... Cm);
where $Q(x)$ represents an expected value of a likelihood function estimated by current parameter set, $\Theta_{t-1}$, C, and observation X, and where $\Theta_t$ is the updated parameter set based on expectation”, in the context of the claim when considered as a whole.

These uniquely distinct features render claim 1 allowable.
Therefore, independent claims 1 and 11 are allowable, and dependent claims 5-7, 15-17, and 20 are allowable based on the same rationale as the claims from which they depend.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
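The Expectation and Maximization steps quoted in the statement of reasons can be illustrated, under stated assumptions, with a small discrete example. The sketch below is not the claimed method: it uses the common textbook formulation in which the E-step computes class posteriors for unlabeled samples and the M-step re-estimates parameters from posterior-weighted counts, with the integral replaced by a sum over discrete tokens. Because additive smoothing (α = 1) is applied, the M-step is a smoothed estimate rather than a plain argmax. The function names, the toy tokens, and the choice to hold labeled samples at fixed one-hot weights are all hypothetical.

```python
import math
from collections import Counter


def e_step(samples, priors, word_probs):
    """E-step: P(class | sample) for each sample under the current parameters."""
    resp = []
    for words in samples:
        logs = {}
        for c in priors:
            logs[c] = math.log(priors[c]) + sum(
                math.log(word_probs[c][w]) for w in words
            )
        top = max(logs.values())
        unnorm = {c: math.exp(v - top) for c, v in logs.items()}
        z = sum(unnorm.values())
        resp.append({c: v / z for c, v in unnorm.items()})
    return resp


def m_step(samples, resp, classes, vocab, alpha=1.0):
    """M-step: smoothed estimates from responsibility-weighted counts."""
    priors, word_probs = {}, {}
    for c in classes:
        priors[c] = sum(r[c] for r in resp) / len(samples)
        counts = Counter()
        for words, r in zip(samples, resp):
            for w in words:
                counts[w] += r[c]
        denom = sum(counts.values()) + alpha * len(vocab)
        word_probs[c] = {w: (counts[w] + alpha) / denom for w in vocab}
    return priors, word_probs


# Toy data (illustrative only): two labeled samples, two unlabeled samples.
classes, vocab = ["a", "b"], ["AAA", "GGG"]
labeled = [(["AAA"] * 4 + ["GGG"], "a"), (["GGG"] * 4 + ["AAA"], "b")]
unlabeled = [["AAA"] * 3, ["GGG"] * 3]

# Labeled samples keep fixed one-hot responsibilities throughout (assumption).
fixed = [{c: float(c == y) for c in classes} for _, y in labeled]
labeled_words = [words for words, _ in labeled]

# Initialise from the labeled data alone, then alternate E and M steps.
priors, word_probs = m_step(labeled_words, fixed, classes, vocab)
for _ in range(5):
    soft = e_step(unlabeled, priors, word_probs)
    priors, word_probs = m_step(
        labeled_words + unlabeled, fixed + soft, classes, vocab
    )
```

In this toy setting the unlabeled samples are drawn toward the class whose labeled data they resemble, and each pass updates the parameter set using the responsibilities computed from the previous one.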

Conclusion
The prior art made of record and NOT relied upon is considered pertinent to applicant's disclosure:

[image: media_image4.png]


Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL EZEWOKO whose telephone number is (571)272-7850.  The examiner can normally be reached on Monday - Thursday.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Waseem Ashraf can be reached on (571) 270-3948.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MICHAEL I EZEWOKO/Examiner, Art Unit 3682

/WASEEM ASHRAF/Supervisory Patent Examiner, Art Unit 3682