US20200111545A1
DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims

This action is in reply to the application filed on 07 May 2020.
Claims 1-21 are currently pending and have been examined.

Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.



Claim 1 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite in that it fails to point out what is included or excluded by the claim, and thus fails to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention. Claim 1 recites the limitation " ". There is insufficient antecedent basis for this limitation in the claim. Dependent claims 2-7 incorporate the deficiencies of claim 1 and are rejected for the same reason.
Claim 2 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite in that it fails to point out what is included or excluded by the claim, and thus fails to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant, regards as the invention. Claim 2 recites the limitation "the analysis model". There is insufficient antecedent basis for this limitation in the claim. Dependent claims 3 and 4 incorporate the deficiencies of claim 2 and are rejected for the same reason.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.  
The claim(s) recite(s) subject matter within a statutory category as a process (claims 1-7), article of manufacture (claims 8-14), and machine (claims 15-21) which recite steps of clustering one or more EMR documents to produce one or more clusters to produce cluster membership data for each of the EMR documents; training the EMR document analysis model with membership in the clusters of the EMR documents represented by the membership data as one or more machine learning features of the EMR document analysis model to produce a trained EMR document analysis model; applying the trained EMR document analysis model to the EMR documents to produce adjusted cluster membership data for each of the EMR documents; identifying sample ones of the EMR documents by determining which, of a number of the EMR documents, have adjusted cluster membership data nearest a membership boundary of one or more of the clusters; assigning EMR analysis labels to each of the sample EMR documents; and training the trained EMR document analysis model with the sample EMR documents and the EMR analysis labels assigned to the sample EMR documents to produce an improved EMR document analysis model.
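The identifying step recited above (selecting the EMR documents whose adjusted cluster membership data lies nearest a membership boundary) can be illustrated with a minimal sketch. The sketch assumes, purely for illustration, that the membership data for a cluster is a single score in [0, 1] and that the boundary sits at 0.5; neither assumption is drawn from the claims or the specification, and the function name and sample identifiers are hypothetical.

```python
from typing import Dict, List


def select_boundary_samples(
    membership: Dict[str, float], k: int, boundary: float = 0.5
) -> List[str]:
    """Return the k document ids whose membership score is nearest the boundary.

    Assumption: each score is a cluster-membership value in [0, 1] and the
    boundary is 0.5. Ties are broken by document id so the result is stable.
    """
    ranked = sorted(
        membership,
        key=lambda doc_id: (abs(membership[doc_id] - boundary), doc_id),
    )
    return ranked[:k]


# Hypothetical adjusted membership scores for four EMR documents.
scores = {"emr_a": 0.97, "emr_b": 0.52, "emr_c": 0.08, "emr_d": 0.45}
samples = select_boundary_samples(scores, k=2)
print(samples)
```

Under these assumptions the two documents with scores 0.52 and 0.45 are selected, since they are the least certain members and therefore the ones for which a label would be most informative.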

Step 2A Prong 1
These steps to create a model for health record classification, as drafted, under the broadest reasonable interpretation, include performance of the limitations in the mind but for recitation of generic computer components. That is, other than reciting steps as performed by the generic computer components, nothing in the claim elements precludes the steps from practically being performed in the mind to cluster and classify health records. This could be analogized to a human manually determining a treatment for the patient based on various contextual parameters, but for the recitation of generic computer components. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components, then it falls within the “Mental Process” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Dependent claims 2-7, 9-14, and 16-21 recite additional subject matter which further narrows or defines the abstract idea embodied in the claims. These claims recite particular aspects of creating a model for health record classification, such as modifying the analysis model, performing cluster analysis on the EMR documents, identifying frequencies of one or more terms, identifying topics of one or more EMRs, comparing frequencies of terms to inverse document frequencies, using an LDA model, determining a measure of correlation, presenting a sample EMR document to one or more human analysts to label the EMR document, and associating positive or negative labels based on a predetermined number of analysts, each of which covers mental processes but for the recitation of generic computer components.

Step 2A Prong 2
This judicial exception is not integrated into a practical application. In particular, the additional elements do not integrate the abstract idea into a practical application, other than the abstract idea per se, because the additional elements amount to no more than limitations which:
amount to mere instructions to apply an exception (such as training the EMR document analysis model with membership in the clusters of the EMR documents represented by the membership data as one or more machine learning features of the EMR document analysis model to produce a trained EMR document analysis model; applying the trained EMR document analysis model to the EMR documents to produce adjusted cluster membership data for each of the EMR documents; and training the trained EMR document analysis model with the sample EMR documents and the EMR analysis labels assigned to the sample EMR documents to produce an improved EMR document analysis model, which amount to invoking computers as a tool to perform the abstract idea; see applicant’s specification [0021] to [0065]; see MPEP 2106.05(f)); and
add insignificant extra-solution activity to the abstract idea (such as assigning EMR analysis labels to each of the sample EMR documents, which amounts to insignificant application; see MPEP 2106.05(g)).
Dependent claims 2-7, 9-14, and 16-21 recite additional subject matter which amounts to limitations consistent with the additional elements in the independent claims (such as additional limitations which amount to invoking computers as a tool to perform the abstract idea). Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually. There is no indication that the combination of elements improves the functioning of a computer or improves any other technology. Their collective functions merely provide conventional computer implementation and do not impose a meaningful limit to integrate the abstract idea into a practical application.

Step 2B
The claim(s) do not include additional elements that are sufficient to amount to significantly more than the judicial exception.  As discussed above with respect to discussion of integration of the abstract idea into a practical application, the additional elements amount to no more than mere instructions to apply an exception and add insignificant extra-solution activity to the abstract idea.  Additionally, the additional limitations, other than the abstract idea per se, amount to no more than limitations which:
amount to elements that have been recognized as well-understood, routine, and conventional activity in particular fields, such as: identifying sample ones of the EMR documents by determining which, of a number of the EMR documents, have adjusted cluster membership data nearest a membership boundary of one or more of the clusters (e.g., performing repetitive calculations, Flook, MPEP 2106.05(d)(II)(ii)); and assigning EMR analysis labels to each of the sample EMR documents (see Qiang et al. (CN107833603A) [pg. 6] “In this exemplary embodiment, document classification model is entered by the file characteristics of the electronic medical record document with label,” MPEP 2106.05(d)).
Dependent claims recite additional subject matter which, as discussed above with respect to integration of the abstract idea into a practical application, amount to invoking computers as a tool to perform the abstract idea.  Dependent claims recite additional subject matter which amount to limitations consistent with the additional elements in the independent claims. Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually. There is no indication that the combination of elements improves the functioning of a computer or improves any other technology. Their collective functions merely provide conventional computer implementation.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-4, 6, 8-11, 13, 15-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Lakshminarayan et al. (US20150025908A1) in view of Qiang et al. (CN107833603A).
Regarding claim 1, Lakshminarayan discloses clustering one or more EMR documents to produce one or more clusters to produce cluster membership data for each of the EMR documents ([0020] “The complex care analytics unit 106 may cluster the PRs 102 form the various ICUs into groups or clusters of closely related diseases, such as metabolic diseases or nervous system disorders, for example.”)
identifying sample ones of the EMR documents by determining which, of a number of the EMR documents, have adjusted cluster membership data nearest a membership boundary of one or more of the clusters ([0030] “Regardless of what ICU the patients were seen in, the OPTICS algorithm may cluster the EMRs into clusters of closely related diseases and within a path length of four from one another.”)


Lakshminarayan does not explicitly disclose, however Qiang teaches, training the EMR document analysis model with membership in the clusters of the EMR documents represented by the membership data as one or more machine learning features of the EMR document analysis model to produce a trained EMR document analysis model ([pg. 3] “Feature extraction unit, for using multiple electronic medical record documents as training sample set, extracting the training sample set in each electronic medical record document file characteristics. Model training unit, for the type according to each electronic medical record document and the file characteristics to document point Class model is trained”)
applying the trained EMR document analysis model to the EMR documents to produce adjusted cluster membership data for each of the EMR documents ([pg. 3 to 4] “On the other hand, pass through Document classification model after training is classified to electronic medical record document to be sorted, can be automatic by way of machine learning Electronic medical record document is classified….”)
assigning EMR analysis labels to each of the sample EMR documents ([pg. 6] “In this exemplary embodiment, document classification model is entered by the file characteristics of the electronic medical record document with label.”)
training the trained EMR document analysis model with the sample EMR documents and the EMR analysis labels assigned to the sample EMR documents to produce an improved EMR document analysis model ([pg. 6] “After row training, it is possible to electronic medical record document to be sorted is classified by the document classification model after training. Citing and Speech, file characteristics such as " Document Title keyword feature ", " the document content chapter of electronic medical record document to be sorted can be extracted. The file characteristics of extraction are mapped as sparse vector by section feature ", " document content keyword feature ". It is it is then possible to this is sparse Vector is input to the document classification model after training, using the output result of the document disaggregated model as electronic health record to be sorted. The type of document.”)

Note: after row training, the EMRs are then classified based on correlated features using the classification model.

Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to expand Lakshminarayan’s techniques for clustering medical records with Qiang’s techniques for sorting medical records. The motivation for the combination of Lakshminarayan and Qiang is to significantly reduce human cost by automatically sorting documents (See Qiang, Background).
Regarding claim 2, Lakshminarayan discloses performing cluster analysis of the EMR documents using a vector derived from the topics of each of the EMR documents to produce the clusters, each including one or more cluster member ones of the EMR documents ([0030] “Regardless of what ICU the patients were seen in, the OPTICS algorithm may cluster the EMRs into clusters of closely related diseases and within a path length of four from one another. One cluster, to illustrate, may center on a cardiovascular condition associated with a specific SNOMED-CT code. The cluster may also contain patients with similar cardiovascular conditions within four SNOMED-CT codes of the center condition. Due to the hierarchical construction of the SNOMED-CT system, the cluster may contain conditions that are four path lengths above and four path lengths below the center code/disease.”)


Lakshminarayan does not explicitly disclose, however Qiang teaches, modifying the analysis model into a standardized format ([pg. 3] “Decision-tree model is lifted to the gradient according to the type of each electronic medical record document and the file characteristics it is trained.”)
exporting the analysis model in the standardized format to a host production system that hosts models ([pg. 7] “In addition, in an embodiment of the present invention, additionally provide a kind of electronic medical record document sorter. Shown in reference picture 5, The electronic medical record document sorter 400 can include: Feature extraction unit 410, model training unit 420 and document classification Unit 430…… Model training unit 420 is used for according to each electronic medical record document Type and the file characteristics are trained to document classification model.”)
applying a topic model to the EMR documents to identify one or more topics of the content of each of the EMR documents in the host production system ([pg. 6] “After row training, it is possible to electronic medical record document to be sorted is classified by the document classification model after training. Citing and Speech, file characteristics such as " Document Title keyword feature ", " the document content chapter of electronic medical record document to be sorted can be extracted.”)

Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to expand Lakshminarayan’s techniques for clustering medical records with Qiang’s techniques for sorting medical records. The motivation for the combination of Lakshminarayan and Qiang is to significantly reduce human cost by automatically sorting documents (See Qiang, Background).
Regarding claim 3, Lakshminarayan discloses identifying respective frequencies of one or more terms of the content of each of the EMR documents ([0034] “At step 408, the method 400 may also include quantizing unstructured data associated with each of the plurality of EMRs in the cluster based on text mining techniques. One text mining technique that may be implemented is the term frequency-inverse document frequency (TF-IDF) technique. TF-IDF is a numerical statistic which reflects how important a word is to a document in a collection.”)
and identifying the topics of the content of each of the EMR documents according to the frequencies of the terms ([0034] “The TF-IDF value increases proportionally to the number of times a word appears in the document, but may be offset by the frequency of the word in the collection, which may help to control for the fact that some words are generally more common than others. [0035] “Once the diagnostically significant words are extracted from the unstructured data, the method may then continue at step 404 with concatenating the quantized unstructured data associated with each of the plurality of PRs in the cluster with the structured data associated with the same PR.” [0032] “The EMRs 102, as discussed above, may have a set of structured data that relates to the numbers of tests performed, medications administered, and LOS.”)
Regarding claim 4, Lakshminarayan discloses comparing the frequencies of the terms to inverse document frequencies of the terms ([0034] “One text mining technique that may be implemented is the term frequency-inverse document frequency (TF-IDF) technique.”)
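The TF-IDF weighting referred to in the rejections of claims 3 and 4 can be illustrated with a minimal sketch. The sketch uses one common textbook variant (term frequency normalized by document length, multiplied by the natural log of the number of documents over the document frequency); Lakshminarayan does not state which variant it uses, so the formula, the function name, and the toy token lists are assumptions made only for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Compute TF-IDF weights for each tokenized document.

    Assumed variant: tf = count / document length,
    idf = ln(number of documents / number of documents containing the term).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq: Counter = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append(
            {
                term: (count / total) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return weights


# Hypothetical tokenized notes standing in for EMR text.
docs = [["chest", "pain", "chest"], ["pain", "fever"], ["fever", "cough"]]
weights = tf_idf(docs)
print(weights[0])
```

In the first toy document, "chest" receives a higher weight than "pain" because it occurs more often in that document and in fewer documents overall, which is the comparison of term frequency to inverse document frequency described in the cited passage.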
Regarding claim 6, Lakshminarayan discloses for each of the clusters, determining a measure of correlation between the cluster and the cluster member EMR documents of the cluster ([0029] “As such, a path length threshold of, for example, four may be selected to generate closely related clusters of statistically significant size so that further analytical analysis may produce useful information.”)
training a model to distinguish, for each of the clusters, cluster member EMR documents with high correlations with the cluster from cluster member EMR documents with low correlations with the cluster to produce the EMR analysis model ([0028] “In some implementations, the cluster module 310 may apply an Ordering Points to Identify the Clustering Structure (OPTICS) algorithm to a plurality of PRs, such as the PRs 102 of FIG. 1.” [0033] “Further, the cluster analysis engine 206 may also determine groups within each cluster, or sub-clusters, based on ranges of resource usage. The sub-clusters may designate high, moderate and low resource usage patients. Prior to forming the sub-clusters, the cluster of PRs may be sorted by resource usage from high to low, or vice versa. The sorted data may show definite differences between high and low resource usage with the moderated usage falling in between.”)
Regarding claim 8, Lakshminarayan discloses a non-transitory computer readable medium useful in association with a computer which includes one or more processors and a memory, the computer readable medium including computer instructions which are configured to cause the computer, by execution of the computer instructions in the one or more processors from the memory ([0023] “FIG. 3, for example, shows one suitable example in which a processor 302 is coupled to a non-transitory, computer-readable storage device 300. The non-transitory, computer-readable storage device 300 may be implemented as volatile storage (e.g., random access memory), non-volatile storage (e.g., hard disk drive, optical storage, solid-state storage, etc.) or combinations of various types of volatile and/or non-volatile storage.”)

The remaining limitations are rejected for the same reasons as stated above for claim 1.
Regarding claim 9, the limitations are rejected for the same reasons as stated above for claim 2.
Regarding claim 10, the limitations are rejected for the same reasons as stated above for claim 3.
Regarding claim 11, the limitations are rejected for the same reasons as stated above for claim 4.
Regarding claim 13, the limitations are rejected for the same reasons as stated above for claim 6.
Regarding claim 15, Lakshminarayan discloses a processor ([0024] “Each engine of FIGS. 2A and 2B may be implemented as the processor 302 executing the corresponding software module of FIG. 3.”)
a computer readable medium operatively coupled to the processor ([0023] “FIG. 3, for example, shows one suitable example in which a processor 302 is coupled to a non-transitory, computer-readable storage device 300. The non-transitory, computer-readable storage device 300 may be implemented as volatile storage (e.g., random access memory), non-volatile storage (e.g., hard disk drive, optical storage, solid-state storage, etc.) or combinations of various types of volatile and/or non-volatile storage.”)
and EMR document analysis logic (i) that executes in the processor from the computer readable medium and (ii) that, when executed by the processor, causes the computer ([0023] “In some examples of the complex care analytics unit 106, each engine 202-212 may be implemented as a processor executing software.”)

The remaining limitations are rejected for the same reasons as stated above for claim 1.
Regarding claim 16, the limitations are rejected for the same reasons as stated above for claim 2.
Regarding claim 17, the limitations are rejected for the same reasons as stated above for claim 3.
Regarding claim 18, the limitations are rejected for the same reasons as stated above for claim 4.
Regarding claim 20, the limitations are rejected for the same reasons as stated above for claim 6.
Claims 5, 12, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Lakshminarayan et al. (US20150025908A1) in view of Qiang et al. (CN107833603A) and further in view of Nutakki et al. (Distributed LDA based Topic Modeling and Topic Agglomeration in a Latent Space).
Regarding claim 5, Lakshminarayan in view of Qiang does not explicitly disclose, however Nutakki teaches, wherein the topic model is a latent Dirichlet allocation model ([pg. 1 to 2] “To extract topics from the tweets crawled in each time slot, we use a Latent Dirichlet Allocation (LDA) based technique.”)

Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to expand Lakshminarayan’s techniques for clustering medical records and Qiang’s techniques for sorting medical records with Nutakki’s techniques to use a latent Dirichlet allocation model for topic modeling. The motivation for the combination of Lakshminarayan, Qiang, and Nutakki is to automatically extract topics and organize data (See Nutakki, Abstract).
Regarding claim 12, the limitations are rejected for the same reasons as stated above for claim 5.
Regarding claim 19, the limitations are rejected for the same reasons as stated above for claim 5.
Claims 7, 14, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Lakshminarayan et al. (US20150025908A1) in view of Qiang et al. (CN107833603A) and further in view of Kinney et al. (US20140108047A1).
Regarding claim 7, Lakshminarayan in view of Qiang does not explicitly disclose, however Kinney teaches, for each of the sample EMR documents:
presenting the sample EMR document to one or more human analysts ([0023] “The system may allow a fraction of the EMRs and their assigned codes to be reviewed by an analyst, such as a human analyst.”)
prompting each of the analysts to associate either a positive label of the EMR analysis labels or a negative label of the EMR analysis labels to the sample EMR document, wherein the negative label is a logical inverse of the positive label ([0072] “In block 1619, scores associated with positive codes are retrieved. The positive codes are the positive codes identified by an analyst, such as a human analyst or another method, as described above with reference to FIG. 15C. In block 1620, scores associated negative codes are retrieved. The negative codes are identified by an analyst and may include final scores that are less than the threshold, as described above with reference to FIG. 15C.”)
receiving label association data from each of the analysts, wherein the label association data indicates whether the positive label or the negative label is associated with the sample EMR document and is generated in response to physical manipulation of one or more user input devices by the analysts ([0056] “FIG. 12 illustrates a collection of scores generated by N different agents. As discusses above, the system 204 uses N different agents to generate codes and associated scores based on a context 206 for the input EMR 202.” [0062] “FIG. 15B illustrates a set of final scores 1508 that are greater than a threshold Tth and associated medical codes 1510 that are a subset of the codes 1504. The same EMR 1506 has been analyzed by human analysts, or by some other analytical method, who have assigned six different individual medical codes 1512 to the EMR 1506, which are considered to the set of final correct medical codes to be used in annotating the EMR 1506.”)
determining whether the label association data represents that a predetermined minimum number of analysts associated the positive label with the sample EMR document or the predetermined minimum number of analysts associated the negative label with the sample EMR document ([0029] “The stream-valuation method produces a real-valued score in the range [0,1], in this implementation. The larger the magnitude of the score, the greater the probability that the individual medical code is related to, or applicable to, the particular EMR with respect to which the individual medical code is evaluated in the stream-comparison operation.” [0052] “The same EMR has been analyzed by human analysts, who have assigned nine different individual medical codes 1008 to the EMR which are together considered to comprise the set “true” 1010.”)
and associating the positive label with the sample EMR document upon a condition in which the predetermined minimum number of analysts associated the positive label with the sample EMR document or associating the negative label with the sample EMR document upon a condition in which the predetermined minimum number of analysts associated the negative label with the sample EMR document ([0070] “Positive codes 1520 are the codes 1512 identified by the analyst, and negative codes 1522 are the incorrectly identified codes “f” and “g” 1518 and the codes “i” and “j” that were generated by the automated system with associated scores below the threshold Tth.”)
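The determining and associating limitations recited above (a label is associated with a sample EMR document only when a predetermined minimum number of analysts chose it) can be illustrated with a minimal sketch. The function name, the use of booleans for the positive and negative labels, and the choice to check the positive count first when both counts meet the minimum are assumptions made for illustration; the claim language does not specify a tie-breaking rule.

```python
from typing import List, Optional


def consensus_label(votes: List[bool], min_analysts: int) -> Optional[bool]:
    """Resolve analyst votes into a single label.

    True stands for the positive label and False for the negative label
    (its logical inverse). Returns None when neither label was chosen by
    at least min_analysts analysts.
    Assumption: if both counts meet the minimum, the positive label wins.
    """
    positives = sum(votes)
    negatives = len(votes) - positives
    if positives >= min_analysts:
        return True
    if negatives >= min_analysts:
        return False
    return None


# Hypothetical votes from three analysts with a minimum of two in agreement.
print(consensus_label([True, True, False], min_analysts=2))
print(consensus_label([True, False], min_analysts=2))
```

A document for which the function returns None has no label associated with it under this sketch; how such a document would be handled is not addressed here.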

Therefore, it would have been obvious to one of ordinary skill in the art prior to the effective filing date of the claimed invention to expand Lakshminarayan’s techniques for clustering medical records and Qiang’s techniques for sorting medical records with Kinney’s techniques to use human analysts for labeling. The motivation for the combination of Lakshminarayan, Qiang, and Kinney is to identify and label important EMRs (See Kinney, Background).
Regarding claim 14, the limitations are rejected for the same reasons as stated above for claim 7.
Regarding claim 21, the limitations are rejected for the same reasons as stated above for claim 7.



Prior Art Cited but Not Relied Upon
The following document was found relevant to the disclosure but not applied:
US20200111545A1.
This reference is relevant since it discloses parsing and performing clustering on an EMR.


Conclusion




Any inquiry concerning this communication or earlier communications from the examiner should be directed to WINSTON FURTADO whose telephone number is (571)272-5349. The examiner can normally be reached Monday-Friday 8:00 AM to 4:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jason B. Dunham, can be reached at (571)272-8109. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/W.F./Examiner, Art Unit 3626
/JOSHUA B BLANCHETTE/Primary Examiner, Art Unit 3626