DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 06/06/2022 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Objections
Claim 30 is objected to because of the following informalities: the claim recites "at least one of deleting a conditional probability of a node based on a threshold;" however, no other alternative follows the "at least one of" recitation.  Appropriate correction is required.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 21-40 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1, 2, 4, 6-15 and 17-20 of U.S. Patent No. 11,360,996. Although the claims at issue are not identical, they are not patentably distinct from each other because they are substantially similar in scope.  The claims of the instant application and the claims of Patent No. 11,360,996 are therefore obvious variants, as shown in the table below.

Instant Application Claims | Patent No. 11,360,996 Claims
Claim 21. A system for formatting data, the system comprising: at least one memory storing instructions; and one or more processors configured to execute the instructions to perform operations comprising: generating a direct probabilistic-graph, the direct probabilistic-graph including a set of nodes corresponding to positions in data value sequences, by iteratively: determining conditional counts and total counts of data values at a subsequent node in the set of nodes based on data values at one or more preceding nodes in the set of nodes; and determining conditional probabilities based on the conditional counts and total counts; determining a similarity metric of a modeled probabilistic-graph and the direct probabilistic-graph, the modeled probabilistic-graph being generated by a machine learning model; and training the machine learning model to output conditional probabilities based on the similarity metric. | Claim 1. A system for formatting data, the system comprising: at least one memory storing instructions; and one or more processors configured to execute the instructions to perform operations comprising: receiving data comprising data value sequences; generating a direct probabilistic-graph, the direct probabilistic-graph including a set of nodes corresponding to positions in the data value sequences, by iteratively: determining conditional counts of data values at a subsequent node in the set of nodes based on data values at one or more preceding nodes in the set of nodes; and determining conditional probabilities based on the conditional counts; determining a similarity metric of a modeled probabilistic-graph and the direct probabilistic-graph; and training the modeled probabilistic-graph based on the similarity metric. Claim 6. The system of claim 1, wherein training the modeled probabilistic-graph comprises training the modeled probabilistic-graph to output conditional probabilities for subsequent data values of the received data value sequences based on preceding data values of the received data value sequences. Claim 15. The system of claim 1, wherein the operations further include determining a total count of a data value of the data-value sequences.
Claim 22 | Claim 17
Claim 23 | Claim 18
Claim 24 | Claim 4
Claim 26 | Claim 13
Claim 28 | Claim 14
Claim 29 | Claim 7
Claim 30 | Claim 8
Claim 31 | Claim 2
Claim 33 | Claim 9
Claim 34 | Claim 10
Claim 35 | Claim 11
Claim 36 | Claim 12
Claim 39. A method for formatting data, comprising: generating a direct probabilistic-graph, the direct probabilistic-graph including a set of nodes corresponding to positions in data value sequences, by iteratively: determining conditional counts and total counts of data values at a subsequent node in the set of nodes based on data values at one or more preceding nodes in the set of nodes; and determining conditional probabilities based on the conditional counts and total counts; determining a similarity metric of a modeled probabilistic-graph and the direct probabilistic-graph, the modeled probabilistic-graph being generated by a machine learning model; and training the machine learning model to output conditional probabilities based on the similarity metric. | Claim 19. A method for formatting data, comprising: receiving data comprising data value sequences; generating a direct probabilistic-graph, the direct probabilistic-graph including a set of nodes corresponding to positions in the data value sequences, by iteratively: determining conditional counts of data values at a subsequent node in the set of nodes based on data values at one or more preceding nodes in the set of nodes; and determining conditional probabilities based on the conditional counts; determining a similarity metric of a modeled probabilistic-graph and the direct probabilistic-graph; and training the modeled probabilistic-graph based on the similarity metric.
Claim 40. A system for formatting data, the system comprising: at least one memory storing instructions; and one or more processors configured to execute the instructions to perform operations comprising: generating a direct probabilistic-graph, the direct probabilistic-graph including a set of nodes corresponding to positions in data value sequences, by iteratively: determining conditional counts and total counts of data values at a subsequent node in the set of nodes based on data values at one or more preceding nodes in the set of nodes; determining conditional probabilities based on the conditional counts and total counts; and generating embedded data based on the data value sequences by implementing at least one of a one-hot encoding method or a glove method; determining a similarity metric of a modeled probabilistic-graph and the direct probabilistic-graph, the similarity metric including at least one of a percent overlap, an average relative difference between nodes, or a measure of a statistical distribution of differences between nodes, the modeled probabilistic-graph being generated by a machine learning model; and training the machine learning model to output conditional probabilities based on the similarity metric and the embedded data. | Claim 20. A system for formatting data, the system comprising: at least one memory storing instructions; and one or more processors configured to execute the instructions to perform operations comprising: receiving data comprising data value sequences; generating a direct probabilistic-graph, the direct probabilistic-graph including a set of nodes corresponding to positions in the data value sequences, by iteratively: determining total counts of data values in the set of nodes; determining conditional counts of data values at a subsequent node in the set of nodes based on data values at one or more preceding nodes in the set of nodes; and determining conditional probabilities based on the conditional counts and the total counts; determining a similarity metric of a modeled probabilistic-graph and the direct probabilistic-graph, the similarity metric including at least one of a percent overlap reflecting a percent of nodes that include same conditional probabilities in both the modeled probabilistic-graph and the direct probabilistic-graph or a measure of a statistical distribution of differences between nodes; training the modeled probabilistic-graph based on the similarity metric; generating reformatted data based on the trained modeled probabilistic graph and the received data value sequences; and training a synthetic data model to generate synthetic data using the reformatted data.
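
For illustration only, the counting steps common to the compared independent claims can be sketched as follows. The sketch reflects one possible reading of the claim language and is not taken from the specification of either the instant application or Patent No. 11,360,996. The function names build_direct_graph and percent_overlap, the first-order (single preceding node) conditioning, the tolerance tol, and the sample sequences are all assumptions introduced here; the overlap measure follows the wording of claim 20 of the patent (percent of nodes having the same conditional probabilities in both graphs).

```python
from collections import Counter, defaultdict


def build_direct_graph(sequences):
    """Return {position: {(prev_value, next_value): conditional probability}}.

    Assumption: each node is a position in a sequence, and the value at a
    node is conditioned only on the value at the immediately preceding node.
    """
    cond_counts = defaultdict(Counter)   # position -> (prev, next) -> count
    total_counts = defaultdict(Counter)  # position -> prev -> count
    for seq in sequences:
        for pos in range(1, len(seq)):
            prev_value, next_value = seq[pos - 1], seq[pos]
            cond_counts[pos][(prev_value, next_value)] += 1
            total_counts[pos][prev_value] += 1

    graph = {}
    for pos, pairs in cond_counts.items():
        # Conditional probability = conditional count / total count.
        graph[pos] = {
            (prev, nxt): count / total_counts[pos][prev]
            for (prev, nxt), count in pairs.items()
        }
    return graph


def percent_overlap(graph_a, graph_b, tol=1e-9):
    """Percent of nodes whose conditional probabilities match in both graphs."""
    nodes = set(graph_a) | set(graph_b)
    if not nodes:
        return 100.0
    matching = 0
    for pos in nodes:
        a, b = graph_a.get(pos, {}), graph_b.get(pos, {})
        if a.keys() == b.keys() and all(abs(a[k] - b[k]) <= tol for k in a):
            matching += 1
    return 100.0 * matching / len(nodes)


# Hypothetical three-character data value sequences.
sequences = ["ab1", "ab2", "ac1", "ab1"]
direct = build_direct_graph(sequences)

# Hypothetical "modeled" graph that agrees at node 1 but not at node 2.
modeled = {
    1: dict(direct[1]),
    2: {("b", "1"): 0.5, ("b", "2"): 0.5, ("c", "1"): 1.0},
}
overlap = percent_overlap(direct, modeled)
```

In this sketch the overlap value stands in for the recited similarity metric; how a machine learning model would be trained against it is not shown.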



Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 21-40 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 21, 39 and 40 disclose “a direct probabilistic-graph” and “a modeled probabilistic-graph” and it is unclear what these graphs entail.  Therefore, the claims are rejected for failing to clearly and distinctly define what each of these graphs entails. Claims 22-38 are also rejected for failing to cure the deficiencies of claim 21.
Claims 21, 39 and 40 disclose “conditional counts” and it is unclear what this entails.  Therefore, the claims are rejected for failing to clearly and distinctly define what conditional counts entail.  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). Claims 22-38 are also rejected for failing to cure the deficiencies of claim 21.
Claim 26 discloses “the similarity metric comprises a percent overlap” and it is unclear what this overlap pertains to (e.g., an overlap of data, an overlap of values, etc.). Therefore, the claim is rejected for failing to clearly and distinctly define what this percent overlap entails.  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
Claim 30 discloses “the pruning comprising at least one of deleting a conditional probability of a node based on a threshold” and it is unclear what this threshold pertains to in order to trigger the deletion (e.g., the deletion occurring once a threshold number of items has/has not been received, once a value meets/exceeds/falls below a threshold, etc.). Therefore, the claim is rejected for failing to clearly and distinctly define what this threshold entails.  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
Claim 32 discloses “training the machine learning model is based on a threshold” and it is unclear what this threshold pertains to in order to trigger the training (e.g., the training occurring once a threshold number of items has been received, once a value meets/exceeds a threshold, etc.).  Therefore, the claim is rejected for failing to clearly and distinctly define what this threshold entails.  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
Claim 40 discloses “at least one of a percent overlap” and it is unclear what this overlap pertains to (e.g., an overlap of data, an overlap of values, etc.). Therefore, the claim is rejected for failing to clearly and distinctly define what this percent overlap entails.  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).

Support for Amendments and Newly Added Claims
In the event of an amendment to the claims or submission of new claims, Applicants are respectfully requested to map such claims and their limitations directly to the portions of the specification that provide support for the subject matter.  This will assist in expediting compact prosecution and reducing potential 35 USC § 112(a) or 35 USC § 112, 1st paragraph issues that can arise when claims are amended.  MPEP 714.02 recites: “Applicant should also specifically point out the support for any amendments made to the disclosure. See MPEP § 2163.06. An amendment which does not comply with the provisions of 37 CFR 1.121(b), (c), (d), and (h) may be held not fully responsive. See MPEP § 714.”  Amendments not pointing to specific support in the disclosure may be deemed as not complying with the provisions of 37 CFR 1.121(b), (c), (d), and (h) and therefore held not fully responsive.  Generic statements such as “Applicants believe no new matter has been introduced” may be deemed insufficient.  The examiner thanks the Applicant in advance for providing support for any amendments or newly added claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Cormier (US 2018/0082208): computing systems employing cognitive modeling; 
Eskin (US 2004/0205474): data mining techniques to detect intrusions in monitoring system calls; 
Ghafourifar (US 2021/0089375): improved natural language processing (NLP) intent determination, e.g., for use with intelligent personal assistant software agents that are configured to interact with people, services, and devices across multiple communications formats and protocols; 
Kaye (US 2016/0342737): graph-based reference genome for generating and using the reference genome, that allow for the rapid assembly of sequence reads and determination of DNA sequence differences; 
Shim (WO-2019124724-A1): learning a sequence data association based on a probability graph; 
Yadav (US 2021/0019309): mapping natural language to queries using a query grammar.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DIEDRA M MCQUITERY whose telephone number is (571)272-9607. The examiner can normally be reached Monday - Thursday, 8 am - 6 pm (C.S.T.).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mark Featherstone, can be reached at (571)270-3750. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Diedra McQuitery/Primary Examiner, Art Unit 2166