Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
This action is in reply to the claims filed on 02 December 2019. Claims 1-20 are currently pending and have been examined.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 3-11, and 13-20 are rejected under 35 U.S.C. 103 as being unpatentable over Po-Hao Chen et al., "Integrating Natural Language Processing and Machine Learning Algorithms to Categorize Oncologic Response in Radiology Reports," J. Digit. Imaging 31:178-184 (27 October 2017), hereinafter “Chen,” in view of Sohrab Towfighi et al., "Labelling chest x-ray reports using an open-source NLP and ML tool for text data binary classification," medRxiv.org, posted November 22, 2019 (https://www.medrxiv.org/content/10.1101/19012518v1) (Year: 2019), hereinafter “Towfighi.”
Regarding claim 1, Chen discloses a method for developing a classification model, the method comprising: 
selecting, from a corpus of reports, a subset of the reports from which to form a training set and a testing set (See Chen page 180 column 2; system divided training set into training and testing datasets.); 
assigning labels … to the reports in both the training set and the testing set (See Chen last paragraph of page 179 (continued onto page 180); the data sets were labeled according to the oncology categories.); 
extracting a sparse representation matrix for each of the training set and the testing set based on features in the training set (See Chen page 180 column 2; system used TF-IDF for feature vectorization. See also the second full paragraph of page 179; the unstructured text is represented by vectors.); 
learning, with one or more electronic processors, a correlation between the features of the training set and the corresponding labels using a machine learning classifier, thereby building a classification model (See Chen Abstract and page 180, last paragraph before “Results;” they used machine learning to classify reports into categories for predicting the label of those reports.); 
testing the classification model on the reports in the testing set for accuracy using the sparse representation matrix of the testing set (See Chen page 180 column 2; system divided training set into training and testing datasets. It is understood that the testing data set is used to test the classification model.); and 
predicting, with the classification model, labels … for remaining reports in the corpus not included in the subset (See Chen Abstract; the system combines machine learning and NLP to predict report labels.).
Chen does not disclose: 
the labels are of a modality and an anatomical focus.
Towfighi teaches:
the labels are of a modality and an anatomical focus (See Towfighi abstract 3rd page under “Labeling X-ray Reports,” this system classifies the reports as being chest x-rays (CXR) or not chest x-rays.).
The system of Towfighi is applicable to the disclosure of Chen as they both share characteristics and capabilities, namely, they are directed to using NLP and machine learning to classify unstructured text related to radiological reports. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen to classify for anatomy and modality as taught by Towfighi. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Chen because it would be useful for research (See Towfighi second paragraph of “Background” section).
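For illustration, the sequence of steps mapped above (select a subset, label it, extract sparse features from the training set, learn a classifier, test it, and predict labels for the remaining reports) can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Chen, Towfighi, or the application: the report texts, the labels, the index split, and all function names are hypothetical, term counts stand in for the feature extraction, and a plain gradient-descent logistic regression stands in for the machine learning classifier.

```python
import math
from collections import Counter

# Hypothetical corpus; texts and labels are invented for illustration only.
corpus = [
    "chest radiograph lungs clear",
    "knee mri meniscal tear",
    "chest radiograph small effusion",
    "brain mri no acute infarct",
    "chest radiograph cardiomegaly",
    "knee mri ligament intact",
    "chest radiograph lungs hyperinflated",
    "brain mri chronic changes",
]

# Step 1: select a subset of the corpus and split it into training and testing sets.
subset_idx = [0, 1, 2, 3, 4, 5]
train_idx, test_idx = subset_idx[:4], subset_idx[4:]

# Step 2: assign labels to the subset (1 = chest X-ray report, 0 = other report).
labels = {0: 1, 1: 0, 2: 1, 3: 0, 4: 1, 5: 0}

# Step 3: build the vocabulary from the training set only, then encode reports
# as sparse rows of the form {column_index: term_count}.
train_terms = sorted({t for k in train_idx for t in corpus[k].split()})
vocab = {t: j for j, t in enumerate(train_terms)}

def to_sparse(text):
    counts = Counter(t for t in text.split() if t in vocab)
    return {vocab[t]: float(c) for t, c in counts.items()}

def predict_proba(weights, bias, row):
    z = bias + sum(weights[j] * v for j, v in row.items())
    return 1.0 / (1.0 + math.exp(-z))

# Step 4: learn the feature/label correlation (assumed: logistic regression
# trained by per-sample gradient descent).
def train(rows, ys, epochs=200, lr=0.5):
    weights, bias = [0.0] * len(vocab), 0.0
    for _ in range(epochs):
        for row, y in zip(rows, ys):
            err = predict_proba(weights, bias, row) - y
            bias -= lr * err
            for j, v in row.items():
                weights[j] -= lr * err * v
    return weights, bias

train_rows = [to_sparse(corpus[k]) for k in train_idx]
weights, bias = train(train_rows, [labels[k] for k in train_idx])

# Step 5: test the model on the held-out testing set.
test_rows = [to_sparse(corpus[k]) for k in test_idx]
test_pred = [int(predict_proba(weights, bias, r) >= 0.5) for r in test_rows]
accuracy = sum(p == labels[k] for p, k in zip(test_pred, test_idx)) / len(test_idx)

# Step 6: predict labels for the remaining reports not included in the subset.
remaining_idx = [k for k in range(len(corpus)) if k not in subset_idx]
remaining_pred = {
    k: int(predict_proba(weights, bias, to_sparse(corpus[k])) >= 0.5)
    for k in remaining_idx
}
```

In this sketch the vocabulary is fixed by the training reports, so terms that appear only in the testing set or in the remaining reports contribute no features.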

Regarding claim 3, Chen in view of Towfighi discloses the method of claim 1 as discussed above. Chen further discloses a method, wherein extracting the sparse representation matrix for each of the training set and the testing set based on features in the training set includes:
extracting the sparse representation matrix for each of the training set and the test set based on term frequency-inverse document frequency (TFIDF) features in the training set (See Chen page 180 column 2; system used TF-IDF for feature vectorization.).
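As a hedged illustration of this limitation, the sketch below builds a sparse TF-IDF matrix for a training set and a testing set using a vocabulary and IDF weights learned from the training set alone. It is not taken from Chen: the report texts and function names are hypothetical, and the smoothed IDF formula is one common choice assumed here for the example.

```python
import math
from collections import Counter

def fit_idf(train_docs):
    """Learn the vocabulary and smoothed IDF weights from the training reports only."""
    n = len(train_docs)
    doc_freq = Counter()
    for doc in train_docs:
        doc_freq.update(set(doc.lower().split()))
    vocab = {term: j for j, term in enumerate(sorted(doc_freq))}
    # Assumed smoothing: idf = ln((1 + n) / (1 + df)) + 1
    idf = {term: math.log((1 + n) / (1 + doc_freq[term])) + 1.0 for term in vocab}
    return vocab, idf

def transform(docs, vocab, idf):
    """Return one sparse row per report as {column_index: tf-idf weight}."""
    rows = []
    for doc in docs:
        counts = Counter(t for t in doc.lower().split() if t in vocab)
        rows.append({vocab[t]: c * idf[t] for t, c in counts.items()})
    return rows

# Hypothetical report snippets, invented for illustration.
train_reports = ["chest radiograph clear lungs", "knee mri meniscal tear"]
test_reports = ["chest radiograph with pneumothorax"]

vocab, idf = fit_idf(train_reports)
train_matrix = transform(train_reports, vocab, idf)
test_matrix = transform(test_reports, vocab, idf)
```

Because the testing set is encoded with the training vocabulary, the terms "with" and "pneumothorax" in the hypothetical test report produce no columns.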

Regarding claim 4, Chen in view of Towfighi discloses the method of claim 1 as discussed above. Chen further discloses a method, wherein extracting the sparse representation matrix for each of the training set and the testing set based on features in the training set includes:
extracting the sparse representation matrix for each of the training set and the test set based on term frequency features in the training set (See Chen page 180 column 2; system used term frequency for feature vectorization.).

Regarding claim 5, Chen in view of Towfighi discloses the method of claim 1 as discussed above. Chen further discloses a method, wherein learning the correlation between the features and their corresponding labels using the machine learning classifier includes:
learning the correlation using a logistic regression classifier (See Chen page 180 column 2; system used logistic regression.).

Regarding claim 6, Chen in view of Towfighi discloses the method of claim 1 as discussed above. Chen does not further disclose a method, wherein learning the correlation between the features and their corresponding labels using the machine learning classifier includes:
learning the correlation using a binary classifier.
Towfighi teaches:
learning the correlation using a binary classifier (See Towfighi abstract 3rd page under “Labeling X-ray Reports,” this system classifies the reports as being chest x-rays (CXR) or not chest x-rays.).
The system of Towfighi is applicable to the disclosure of Chen as they both share characteristics and capabilities, namely, they are directed to using NLP and machine learning to classify unstructured text related to radiological reports. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen to use a binary classifier as taught by Towfighi. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Chen because it would be useful for research (See Towfighi second paragraph of “Background” section).

Regarding claim 7, Chen in view of Towfighi discloses the method of claim 1 as discussed above. Chen further discloses a method, wherein learning the correlation between the features and their corresponding labels using the machine learning classifier includes:
learning the correlation using a multiclass classifier (See Chen abstract; the system is classifying the reports into 4 groups.).

Regarding claim 8, Chen in view of Towfighi discloses the method of claim 1 as discussed above. Chen further discloses a method, wherein learning the correlation between the features and their corresponding labels using the machine learning classifier includes:
learning the correlation using one selected from a group consisting of a logistic regression classifier, a decision tree classifier, and a support vector machine classifier (See Chen page 180 column 2; system used logistic regression and an SVM classifier.).

Regarding claim 9, Chen in view of Towfighi discloses the method of claim 1 as discussed above. Chen does not further disclose a method, wherein predicting, with the classification model, the labels of the anatomical focus and the modality for the remaining reports includes:
classifying each of the remaining reports as one selected from a group consisting of mammography, chest X-ray, obstetric ultrasound, spine magnetic resonance imaging (MRI), spine X-ray, bone densitometry analysis (DEXA), chest, abdomen and pelvis computed tomography (cap CT), abdomen ultrasound, leg venous Doppler ultrasound, feet and ankle X-ray, positron-emission tomography PET/CT tumor imaging, chest computed tomography (CT), breast ultrasound, knee MRI, hip X-ray, knee X-ray, brain MRI, breast MRI, and thyroid ultrasound.
Towfighi teaches:
classifying each of the remaining reports as one selected from a group consisting of mammography, chest X-ray, obstetric ultrasound, spine magnetic resonance imaging (MRI), spine X-ray, bone densitometry analysis (DEXA), chest, abdomen and pelvis computed tomography (cap CT), abdomen ultrasound, leg venous Doppler ultrasound, feet and ankle X-ray, positron-emission tomography PET/CT tumor imaging, chest computed tomography (CT), breast ultrasound, knee MRI, hip X-ray, knee X-ray, brain MRI, breast MRI, and thyroid ultrasound (See Towfighi abstract 3rd page under “Labeling X-ray Reports,” this system classifies the reports as being chest x-rays (CXR) or not chest x-rays.).
The system of Towfighi is applicable to the disclosure of Chen as they both share characteristics and capabilities, namely, they are directed to using NLP and machine learning to classify unstructured text related to radiological reports. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen to classify for anatomy and modality as taught by Towfighi. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Chen because it would be useful for research (See Towfighi second paragraph of “Background” section).

Regarding claim 10, Chen in view of Towfighi discloses the method of claim 1 as discussed above. Chen does not further disclose a method, wherein predicting, with the classification model, the labels of the anatomical focus and the modality for the remaining reports includes:
classifying each of the remaining reports as one selected from a group consisting of a chest X-ray report and a non-chest X-ray report.
Towfighi teaches:
classifying each of the remaining reports as one selected from a group consisting of a chest X-ray report and a non-chest X-ray report (See Towfighi abstract 3rd page under “Labeling X-ray Reports,” this system classifies the reports as being chest x-rays (CXR) or not chest x-rays.).
The system of Towfighi is applicable to the disclosure of Chen as they both share characteristics and capabilities, namely, they are directed to using NLP and machine learning to classify unstructured text related to radiological reports. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen to classify for anatomy and modality as taught by Towfighi. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Chen because it would be useful for research (See Towfighi second paragraph of “Background” section).

Regarding claim 11, Chen discloses a system for developing a classification model, the system comprising one or more electronic processors configured to: 
select, from a corpus of reports, a subset of the reports from which to form a training set and a testing set (See Chen page 180 column 2; system divided training set into training and testing datasets.); 
assign labels … to the reports in both the training set and the testing set (See Chen last paragraph of page 179 (continued onto page 180); the data sets were labeled according to the oncology categories.); 
extract a sparse representation matrix for each of the training set and the testing set based on features in the training set (See Chen page 180 column 2; system used TF-IDF for feature vectorization. See also the second full paragraph of page 179; the unstructured text is represented by vectors.); 
learn a correlation between the features of the training set and the corresponding labels using a machine learning classifier, thereby building a classification model (See Chen Abstract and page 180, last paragraph before “Results;” they used machine learning to classify reports into categories for predicting the label of those reports.); 
test the classification model on the reports in the testing set for accuracy using the sparse representation matrix of the testing set (See Chen page 180 column 2; system divided training set into training and testing datasets. It is understood that the testing data set is used to test the classification model.); and 
predict, with the classification model, labels … for remaining reports in the corpus not included in the subset (See Chen Abstract; the system combines machine learning and NLP to predict report labels.).
Chen does not disclose: 
the labels are of a modality and an anatomical focus.
Towfighi teaches:
the labels are of a modality and an anatomical focus (See Towfighi abstract 3rd page under “Labeling X-ray Reports,” this system classifies the reports as being chest x-rays (CXR) or not chest x-rays.).
The system of Towfighi is applicable to the disclosure of Chen as they both share characteristics and capabilities, namely, they are directed to using NLP and machine learning to classify unstructured text related to radiological reports. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen to classify for anatomy and modality as taught by Towfighi. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Chen because it would be useful for research (See Towfighi second paragraph of “Background” section).

Regarding claim 13, Chen in view of Towfighi discloses the system of claim 11 as discussed above. Chen further discloses a system, wherein:
the features include term frequency-inverse document frequency (TFIDF) features (See Chen page 180 column 2; system used TF-IDF for feature vectorization.).

Regarding claim 14, Chen in view of Towfighi discloses the system of claim 11 as discussed above. Chen further discloses a system, wherein:
the features include term frequency features (See Chen page 180 column 2; system used term frequency for feature vectorization.).

Regarding claim 15, Chen in view of Towfighi discloses the system of claim 11 as discussed above. Chen does not further disclose a system, wherein:
the machine learning classifier includes a binary classifier.
Towfighi teaches:
the machine learning classifier includes a binary classifier (See Towfighi abstract 3rd page under “Labeling X-ray Reports,” this system classifies the reports as being chest x-rays (CXR) or not chest x-rays.).
The system of Towfighi is applicable to the disclosure of Chen as they both share characteristics and capabilities, namely, they are directed to using NLP and machine learning to classify unstructured text related to radiological reports. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen to use a binary classifier as taught by Towfighi. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Chen because it would be useful for research (See Towfighi second paragraph of “Background” section).

Regarding claim 16, Chen in view of Towfighi discloses the system of claim 11 as discussed above. Chen further discloses a system, wherein:
the machine learning classifier includes a multiclass classifier (See Chen abstract; the system is classifying the reports into 4 groups.).

Regarding claim 17, Chen in view of Towfighi discloses the system of claim 11 as discussed above. Chen further discloses a system, wherein:
the machine learning classifier includes one selected from a group consisting of a logistic regression classifier, a decision tree classifier, and a support vector machine classifier (See Chen page 180 column 2; system used logistic regression and an SVM classifier.).

Regarding claim 18, Chen in view of Towfighi discloses the system of claim 11 as discussed above. Chen does not further disclose a system, wherein:
the predicted labels of the anatomical focus and the modality for the remaining reports includes one selected from a group consisting of mammography, chest X-ray, obstetric ultrasound, spine magnetic resonance imaging (MRI), spine X-ray, bone densitometry analysis (DEXA), chest, abdomen and pelvis computed tomography (cap CT), abdomen ultrasound, leg venous Doppler ultrasound, feet and ankle X-ray, positron-emission tomography PET/CT tumor imaging, chest computed tomography (chest CT), breast ultrasound, knee MRI, hip X-ray, knee X-ray, brain MRI, breast MRI, and thyroid ultrasound.
Towfighi teaches:
the predicted labels of the anatomical focus and the modality for the remaining reports includes one selected from a group consisting of mammography, chest X-ray, obstetric ultrasound, spine magnetic resonance imaging (MRI), spine X-ray, bone densitometry analysis (DEXA), chest, abdomen and pelvis computed tomography (cap CT), abdomen ultrasound, leg venous Doppler ultrasound, feet and ankle X-ray, positron-emission tomography PET/CT tumor imaging, chest computed tomography (chest CT), breast ultrasound, knee MRI, hip X-ray, knee X-ray, brain MRI, breast MRI, and thyroid ultrasound (See Towfighi abstract 3rd page under “Labeling X-ray Reports,” this system classifies the reports as being chest x-rays (CXR) or not chest x-rays.).
The system of Towfighi is applicable to the disclosure of Chen as they both share characteristics and capabilities, namely, they are directed to using NLP and machine learning to classify unstructured text related to radiological reports. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen to classify for anatomy and modality as taught by Towfighi. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Chen because it would be useful for research (See Towfighi second paragraph of “Background” section).

Regarding claim 19, Chen in view of Towfighi discloses the system of claim 11 as discussed above. Chen does not further disclose a system, wherein:
the predicted labels of the anatomical focus and the modality for the remaining reports includes one selected from a group consisting of a chest X-ray report and a non-chest X-ray report.
Towfighi teaches:
the predicted labels of the anatomical focus and the modality for the remaining reports includes one selected from a group consisting of a chest X-ray report and a non-chest X-ray report (See Towfighi abstract 3rd page under “Labeling X-ray Reports,” this system classifies the reports as being chest x-rays (CXR) or not chest x-rays.).
The system of Towfighi is applicable to the disclosure of Chen as they both share characteristics and capabilities, namely, they are directed to using NLP and machine learning to classify unstructured text related to radiological reports. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen to classify for anatomy and modality as taught by Towfighi. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Chen because it would be useful for research (See Towfighi second paragraph of “Background” section).

Regarding claim 20, Chen discloses a non-transitory, computer-readable medium storing instructions that, when executed by one or more electronic processors, perform a set of functions, the set of functions comprising: 
selecting, from a corpus of reports, a subset of the reports from which to form a training set and a testing set (See Chen page 180 column 2; system divided training set into training and testing datasets.); 
assigning labels … to the reports in both the training set and the testing set (See Chen last paragraph of page 179 (continued onto page 180); the data sets were labeled according to the oncology categories.); 
extracting a sparse representation matrix for each of the training set and the testing set based on features in the training set (See Chen page 180 column 2; system used TF-IDF for feature vectorization. See also the second full paragraph of page 179; the unstructured text is represented by vectors.); 
learning a correlation between the features of the training set and the corresponding labels using a machine learning classifier, thereby building a classification model (See Chen Abstract and page 180, last paragraph before “Results;” they used machine learning to classify reports into categories for predicting the label of those reports.); 
testing the classification model on the reports in the testing set for accuracy using the sparse representation matrix of the testing set (See Chen page 180 column 2; system divided training set into training and testing datasets. It is understood that the testing data set is used to test the classification model.); and 
predicting, with the classification model, labels … for remaining reports in the corpus not included in the subset (See Chen Abstract; the system combines machine learning and NLP to predict report labels.).
Chen does not disclose: 
the labels are of a modality and an anatomical focus.
Towfighi teaches:
the labels are of a modality and an anatomical focus (See Towfighi abstract 3rd page under “Labeling X-ray Reports,” this system classifies the reports as being chest x-rays (CXR) or not chest x-rays.).
The system of Towfighi is applicable to the disclosure of Chen as they both share characteristics and capabilities, namely, they are directed to using NLP and machine learning to classify unstructured text related to radiological reports. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen to classify for anatomy and modality as taught by Towfighi. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Chen because it would be useful for research (See Towfighi second paragraph of “Background” section).

Claims 2 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Po-Hao Chen et al., "Integrating Natural Language Processing and Machine Learning Algorithms to Categorize Oncologic Response in Radiology Reports," J. Digit. Imaging 31:178-184 (27 October 2017), hereinafter “Chen,” in view of Sohrab Towfighi et al., "Labelling chest x-ray reports using an open-source NLP and ML tool for text data binary classification," medRxiv.org, posted November 22, 2019 (https://www.medrxiv.org/content/10.1101/19012518v1) (Year: 2019), hereinafter “Towfighi,” and further in view of John Zech et al., "Natural Language–based Machine Learning Models for the Annotation of Clinical Radiology Reports," Radiology 287:2 570-580, May 2018, hereinafter “Zech.”

Regarding claim 2, Chen in view of Towfighi discloses the method of claim 1 as discussed above. Chen does not further disclose a method, comprising:
verifying the correctness of the predicted modality labels by:
(i) performing unsupervised feature clustering on the remaining reports, thereby forming corpus clusters; (ii) measuring compactness of the corpus clusters formed; (iii) performing unsupervised feature clustering on the reports in the training set, thereby forming training clusters; and (iv) measuring overlap of the corpus clusters with the training clusters.
Zech teaches:
verifying the correctness of the predicted modality labels (See Zech page 573, section titled featurization.) by:
(i) performing unsupervised feature clustering on the remaining reports, thereby forming corpus clusters; (ii) measuring compactness of the corpus clusters formed; (iii) performing unsupervised feature clustering on the reports in the training set, thereby forming training clusters; and (iv) measuring overlap of the corpus clusters with the training clusters (See Zech Figure 3 and page 574 to 575.).
The system of Zech is applicable to the disclosure of Chen as they both share characteristics and capabilities, namely, they are directed to using NLP and machine learning to classify unstructured text related to radiological reports. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen to include clustering analysis as taught by Zech. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Chen in order to aid in automatically identifying labels in the data set (see Zech Abstract).
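For illustration, the four recited verification steps can be sketched as below. This is a minimal sketch under stated assumptions and does not reproduce Zech or the application: the two-dimensional feature vectors are hypothetical, plain k-means stands in for the unsupervised feature clustering, compactness is assumed to be the mean distance of each report to its cluster centroid, and overlap is assumed to be the mean distance from each corpus centroid to its nearest training centroid (smaller values indicating closer overlap).

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, k, iters=20):
    """Plain k-means; the first k points seed the centroids so the run is deterministic."""
    centroids = [tuple(p) for p in points[:k]]
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            groups[nearest].append(p)
        centroids = [mean(g) if g else centroids[c] for c, g in enumerate(groups)]
    return centroids, groups

def compactness(centroids, groups):
    """Assumed measure: mean distance from each point to its own cluster centroid."""
    dists = [dist(p, centroids[c]) for c, g in enumerate(groups) for p in g]
    return sum(dists) / len(dists)

def centroid_overlap(corpus_centroids, training_centroids):
    """Assumed measure: mean distance from each corpus centroid to the nearest training centroid."""
    nearest = [min(dist(c, t) for t in training_centroids) for c in corpus_centroids]
    return sum(nearest) / len(nearest)

# Hypothetical 2-D feature vectors standing in for report features.
training_features = [(0.0, 0.0), (4.0, 4.0), (0.0, 1.0), (4.0, 5.0)]
remaining_features = [(0.5, 0.5), (4.5, 4.5), (0.5, 0.0), (4.0, 4.5)]

# (i) cluster the remaining reports; (ii) measure compactness of the corpus clusters.
corpus_centroids, corpus_groups = kmeans(remaining_features, k=2)
corpus_compactness = compactness(corpus_centroids, corpus_groups)

# (iii) cluster the training reports; (iv) measure overlap of the two sets of clusters.
training_centroids, _ = kmeans(training_features, k=2)
overlap = centroid_overlap(corpus_centroids, training_centroids)
```

Under these assumptions, tight corpus clusters that sit close to the training clusters would be read as support for the predicted labels; other compactness or overlap measures could be substituted.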
	
Regarding claim 12, Chen in view of Towfighi discloses the system of claim 11 as discussed above. Chen does not further disclose a system, wherein:
the one or more electronic processors are further configured to verify a correctness of the predicted modality labels by:
(i) performing unsupervised feature clustering on the remaining reports, thereby forming corpus clusters; (ii) measuring compactness of the corpus clusters formed; (iii) performing unsupervised feature clustering on the reports in the training set, thereby forming training clusters; and (iv) measuring overlap of the corpus clusters with the training clusters.
Zech teaches:
the one or more electronic processors are further configured to verify a correctness of the predicted modality labels (See Zech page 573, section titled featurization.) by:
(i) performing unsupervised feature clustering on the remaining reports, thereby forming corpus clusters; (ii) measuring compactness of the corpus clusters formed; (iii) performing unsupervised feature clustering on the reports in the training set, thereby forming training clusters; and (iv) measuring overlap of the corpus clusters with the training clusters (See Zech Figure 3 and page 574 to 575.).
The system of Zech is applicable to the disclosure of Chen as they both share characteristics and capabilities, namely, they are directed to using NLP and machine learning to classify unstructured text related to radiological reports. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen to include clustering analysis as taught by Zech. One of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to modify Chen in order to aid in automatically identifying labels in the data set (see Zech Abstract).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Kangarloo (U.S. 20090228299) discloses a system and method for classifying reports by anatomy and imaging modality. Cai (Tianrun Cai et al., "Natural Language Processing Technologies in Radiology Research and Clinical Applications," RadioGraphics 36:176-191, 2016) teaches an overview of using NLP principles in conjunction with machine learning to label radiological reports.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BENJAMIN L HANKS whose telephone number is (571)270-5080. The examiner can normally be reached Monday-Friday 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Victoria Augustine, can be reached at (313) 446-4858. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/B.L.H./Examiner, Art Unit 3626
/JONATHAN DURANT/Primary Examiner, Art Unit 3619