DETAILED ACTION
Status of Claims
This action is in reply to the application filed on 4 December, 2020.
Claims 8 - 10 have been amended.
Claims 1 – 10 are currently pending and have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claims 1 – 10 are objected to because of the following informalities: Claims 1, 9 and 10, in the last limitation, recite different names for the same element, i.e. “the q-th report sample”; “the report sample”; and “the current report sample”. The dependent claims share some of these terms. Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


Claim 4 is rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention. Claim 4 recites “parsing the statement database, to establish a named entity recognition dictionary and a pattern rules database, and removing duplicate texts from the named entity recognition dictionary and the pattern rules database.” Examiner cannot determine the metes and bounds of the claim. In particular, it is unclear whether the named entity recognition dictionary and the pattern rules database recited in Claim 4 are the same as those recited in Claim 1. For example, the dictionary and database of Claim 1 are based on nouns in the text, while those of Claim 4 appear to be based on short sentences in the text. Appropriate correction is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

The following rejection is formatted in accordance with the 2019 Revised Patent Subject Matter Eligibility Guidance (7 January, 2019) and the October 2019 Update: Subject Matter Eligibility (17 October, 2019).
In Alice, the Supreme Court reiterated long-held exclusions to patent eligibility under 35 U.S.C. 101 including: laws of nature, natural phenomena and abstract ideas. The Supreme Court and the Federal Circuit Court have also set forth precedential decisions that contain specific concepts that fall into the abstract idea category. The 2019 Revised Patent Subject Matter Eligibility Guidance issued on 7 January, 2019 by the USPTO provides groupings of subject matter that are considered abstract ideas including: “mathematical concepts” – (i.e. mathematical relationships, mathematical formulas or equations and mathematical calculations); “certain methods of organizing human activity” – (i.e. fundamental economic principles and practices, commercial or legal interactions, managing personal behavior or relationships or interactions between people); and “mental processes” – (i.e. concepts performed in the human mind).
Claims 1 - 10 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e. a law of nature, a natural phenomenon, or an abstract idea), and does not include additional elements that either: 1) integrate the abstract idea into a practical application, or 2) provide an inventive concept – i.e. elements that amount to significantly more than the abstract idea. The Claims are directed to an abstract idea because, when considered as a whole, the plain focus of the claims is on an abstract idea.
Claim 1 is representative. Claim 1 recites:
A method for labelling capsule endoscopy report, comprising: 
collecting p report samples to establish an initial corpus database, any of the p report samples comprising an original text and labeled information, and the labeled information being a naming category corresponding to each noun in the original text; 
parsing the report samples in the initial corpus database, to establish a named entity recognition dictionary and a pattern rules database, and 
removing duplicate texts from the named entity recognition dictionary and the pattern rules database; 
wherein the named entity recognition dictionary comprises named categories in the report samples and nouns corresponding to each named category, and 
the pattern rules database comprises unrecognized texts in the report samples and rules, laws, and characteristics corresponding to the unrecognized texts; 
since the q-th report sample is collected, q=p+1, querying the named entity recognition dictionary and the pattern rules database with texts appearing in the report sample, to automatically label the current report sample. 
Claim 10 recites a medium with instructions executed by a processor, and Claim 9 recites an apparatus that executes the steps of the method recited in Claim 1.
STEP 1
The claims are directed to an apparatus, a method and a non-transitory computer-readable medium, each of which falls within the statutory categories of invention.
STEP 2A PRONG ONE
The claims, as illustrated by Claim 1, recite limitations that encompass an abstract idea including:  
parsing the report samples in the initial corpus database, to establish a named entity recognition dictionary and a pattern rules database, and 
removing duplicate texts from the named entity recognition dictionary and the pattern rules database; 
wherein the named entity recognition dictionary comprises named categories in the report samples and nouns corresponding to each named category, and 
the pattern rules database comprises unrecognized texts in the report samples and rules, laws, and characteristics corresponding to the unrecognized texts; 
since the q-th report sample is collected, q=p+1, querying the named entity recognition dictionary and the pattern rules database with texts appearing in the report sample, to label the current report sample. 
The claims, as illustrated by Claim 1, recite limitations that encompass an abstract idea within the “mental processes” grouping – concepts performed in the human mind including observation, evaluation, judgment and opinion. The claims are directed to assigning a label to a current (or q-th) capsule endoscopy report by matching text (i.e. nouns) found in the current report with nouns found in a small number of historical reports that have been previously labeled with a naming category by a human. Text that does not match may be compared to a database of misspelled words, or descriptions associated with correctly spelled words. Further, the specification defines the labels or naming categories for capsule endoscopy – for example: “organ identification, disease type, etc.”; defines the nouns corresponding to the organ identification – i.e. digestive tract and anatomical structures – for example: “esophagus, stomach, antrum, etc.”; and defines the nouns corresponding to the disease type – for example: “cancer, tumor, polyp, ulcer, etc.”. The noun “stomach” found in a hypothetical report would be labeled “organ identification”; and the noun “ulcer” found in the report would be labeled “disease type”. The specification discloses that manual labelling to organize examination reports, and to facilitate subsequent review and analysis, is known in the prior art.
Parsing text to recognize keywords and their respective, previously applied labels is a process that, under the broadest reasonable interpretation, can be performed in the human mind. Only a small number of reports need to be sampled to identify keyword nouns and corresponding naming category labels. The sampled data is used to build a complete list of nouns for each naming category, while eliminating duplicates. The nouns found in a current report may be compared to the nouns in the list to determine the corresponding naming category for each noun. Recognizing words, including nouns in text and their corresponding naming category labels, and comparing words to those found in a list are processes that can be performed in the human mind. As such, the claims recite an abstract idea within the mental process grouping.
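For illustration only, the list-building and look-up steps characterized above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the specification or the claims: the function names build_ner_dictionary and label_report, the whitespace tokenization, and the sample data are hypothetical, and the example nouns and categories are those quoted from the specification above.

```python
from collections import defaultdict


def build_ner_dictionary(labeled_samples):
    """Map each naming category to the set of nouns labeled with it.

    labeled_samples is an iterable of (noun, category) pairs taken from
    previously labeled reports (a hypothetical input format). Storing the
    nouns in a set keeps only one copy of any duplicate.
    """
    dictionary = defaultdict(set)
    for noun, category in labeled_samples:
        dictionary[category].add(noun.lower())
    return dictionary


def label_report(text, dictionary):
    """Return (noun, category) pairs for dictionary nouns found in text."""
    # Assumption: words are separated by whitespace and may carry
    # trailing punctuation, which is stripped before the look-up.
    words = [w.strip(".,;:").lower() for w in text.split()]
    labels = []
    for word in words:
        for category, nouns in dictionary.items():
            if word in nouns:
                labels.append((word, category))
    return labels


samples = [
    ("stomach", "organ identification"),
    ("ulcer", "disease type"),
    ("stomach", "organ identification"),  # duplicate, stored only once
    ("polyp", "disease type"),
]
ner = build_ner_dictionary(samples)
result = label_report("An ulcer was observed in the stomach.", ner)
```

Under these assumptions, the noun “ulcer” is paired with “disease type” and the noun “stomach” with “organ identification”, consistent with the hypothetical report discussed above.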
The claims, as illustrated by Claim 1, recite limitations that encompass an abstract idea within the “certain methods of organizing human activity” grouping – managing personal behavior or relationships or interactions between people including social activities, teaching, and following rules or instructions.
Organizing information by assigning naming category labels to keywords in a report, where the label is determined by matching the keywords to a list of keywords associated with a naming category label, is a process that merely organizes this human activity. This type of activity, i.e. labelling reports, includes conduct that would normally occur when organizing examination reports. For example, it is routine in medicine to label capsule endoscopy reports to form structured data to facilitate subsequent review. As such, the claims recite an abstract idea within the certain methods of organizing human activity grouping.
STEP 2A PRONG TWO
The claims recite limitations that include additional elements beyond those that encompass the abstract idea above including:
collecting p report samples to establish an initial corpus database, any of the p report samples comprising an original text and labeled information, and the labeled information being a naming category corresponding to each noun in the original text.
However, these additional elements do not integrate the abstract idea into a practical application of that idea in accordance with considerations laid out by the Supreme Court or the Federal Circuit (see MPEP 2106.05(a)-(c) and (e)). The additional elements integrate the abstract idea when they: encompass an improvement to the functioning of a computer or an improvement to another technology or technical field; use the abstract idea with a particular machine or manufacture that is integral to the claim; transform an article to a different state or thing; or recite meaningful limitations beyond linking the abstract idea to a particular technological environment. The additional limitations do not integrate the abstract idea when they merely serve to link the use of the abstract idea to a particular technological environment or field of use – i.e. merely use the computer as a tool to perform the abstract idea; or recite insignificant extra-solution activity (see MPEP 2106.05(f)-(h)).
The claims require that the current report be labelled “automatically”. Examiner construes this to mean that the step is performed by a programmed computer. The specification discloses such computers at a high level of generality such that it amounts to no more than instructions to apply the abstract idea using a generic computer component. These elements merely add instructions to implement the abstract idea on a computer, and generally link the abstract idea to a particular technological environment. Collecting report samples is an insignificant extra-solution activity – i.e. a data gathering step. Nothing in the claim recites specific limitations directed to an improved computer system, processor, memory, network, database or Internet. Similarly, the specification is silent with respect to these kinds of improvements. A general purpose computer that applies a judicial exception by use of conventional computer functions, as is the case here, does not qualify as a particular machine, nor does the recitation of a generic computer impose meaningful limits in the claimed process. (see Ultramercial, Inc. v. Hulu, LLC, 772 F.3d 709, 716-17 (Fed. Cir. 2014)). As such, the additional elements recited in the claim do not integrate the abstract labelling process into a practical application of that process.
STEP 2B
The additional elements identified above do not amount to significantly more than the abstract labelling process. Collecting information for analysis, such as the recited report samples, is a well-understood, routine and conventional computer function – i.e. receiving or transmitting data over a network as in Symantec, TLI, OIP and buySAFE. As such, the additional elements recited in the claim do not provide an inventive concept that amounts to significantly more than the abstract labelling process.
The additional structural elements or combination of elements in the claims, other than the abstract idea per se, amount to no more than a recitation of generic computer structure (i.e. an electronic apparatus comprising a memory and a processor that executes computer programs, computer-readable medium). Each of the above components is disclosed in the specification as being purely conventional and/or known in the industry. Because the specification describes these additional elements in general terms, without describing particulars, Examiner concludes that the claim limitations may be broadly, but reasonably, construed as reciting well-understood, routine and conventional computer components and techniques. The specification describes the elements in a manner that indicates that they are sufficiently well-known that the specification does not need to describe the particulars in order to satisfy 35 U.S.C. 112. Considered as an ordered combination, the limitations recited in the claims add nothing that is not already present when the steps are considered individually.
The dependent claims add additional features including those that merely serve to further narrow the abstract idea above; those that recite additional abstract ideas including: reviewing automatically labeled reports, correcting errors and updating lists (2); segmenting into sentences (3); parsing sentences and removing duplicates (4); creating and matching against a prefix list to label a report (5, 6); those that recite well-understood, routine and conventional activity or computer functions including: storing sentences (3); greedy matching (7); sequential searching (8); those that recite insignificant extra-solution activities; or those that are an ancillary part of the abstract idea. Examiner takes Official Notice that greedy matching is old and well-known and purely conventional. The limitations recited in the dependent claims, in combination with those recited in the independent claims, add nothing that integrates the abstract idea into a practical application, or that amounts to significantly more. These elements merely narrow the abstract idea, recite additional abstract ideas, or append conventional activity to the abstract process. As such, the additional elements do not integrate the abstract idea into a practical application, or provide an inventive concept that transforms the claims into a patent eligible invention.
The apparatus claims are no different from the method claims in substance. “The equivalence of the method, system and media claims is readily apparent.” “The only difference between the claims is the form in which they were drafted.” (Bancorp). The method claims recite the abstract idea implemented on a generic computer, while the apparatus claims recite generic computer components configured to implement the same idea. Specifically, Claims 9 and 10 merely add the generic hardware noted above that nearly every computer will include. The apparatus claim’s requirement that the same method be performed with a programmed computer does not alter the method’s patentability under 35 U.S.C. 101 (In re Grams). Therefore, the claims are rejected under 35 U.S.C. 101 as being directed to non-statutory subject matter.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1 – 4, 9, 10 are rejected under 35 U.S.C. 103 as being unpatentable over Yegnanarayanan: (US PGPUB 2013/0035961 A1).
CLAIMS 1, 9 and 10
Yegnanarayanan discloses a method and apparatus for extracting medical facts from text documents that includes the following limitations:
A method for labelling a report, comprising: 
collecting p report samples to establish an initial corpus database, any of the p report samples comprising an original text and labeled information, and the labeled information being a naming category corresponding to each noun in the original text; (Yegnanarayanan 0096, 0106, 0112, 0113);
parsing the report samples in the initial corpus database, to establish a named entity recognition dictionary and a pattern rules database, (Yegnanarayanan 0096, 0112, 0113); and 
removing duplicate texts from the named entity recognition dictionary and the pattern rules database; (Yegnanarayanan 0096);
wherein the named entity recognition dictionary comprises named categories in the report samples and nouns corresponding to each named category, (Yegnanarayanan 0105); and 
the pattern rules database comprises unrecognized texts in the report samples and rules, laws, and characteristics corresponding to the unrecognized texts; (Yegnanarayanan 0112);
since the q-th report sample is collected, q=p+1, querying the named entity recognition dictionary and the pattern rules database with texts appearing in the report sample, to automatically label the current report sample; (Yegnanarayanan 0114, 0117). 
Yegnanarayanan discloses a method and apparatus for extracting medical facts from text documents such as medical reports. A model is constructed, based on parsing a set of historical reports that have been labeled by a human labeler, that forms associations between labels assigned by a human, and text in the document. The associations are based on the “prevalence” of labels corresponding to similar text. This fairly teaches removing duplicate texts, as these would merely be counted for prevalence. Embodiments include lists of fact types or entity types (i.e. a naming category label) and nouns associated with the label (i.e. “pneumothorax” is labeled as a “complication”) as well as dictionaries for misspelled words, etc. (i.e. unrecognized text). The constructed model may be applied to new text to automatically label the new text allowing medical records to be considered in the aggregate for best practices, auditing or quality assurance.
Yegnanarayanan discloses extracting and labeling text from medical reports in general, but does not expressly disclose capsule endoscopy reports. Nonetheless, Applicant admits that such reports are old and well-known. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the text extraction and labeling system of Yegnanarayanan so as to have included capsule endoscopy reports, in accordance with the Applicant’s admission, in order to allow for analysis for best practices, auditing or quality assurance purposes.
CLAIMS 2 and 3
Yegnanarayanan discloses the limitations above relative to Claim 1. Additionally, Yegnanarayanan discloses the following limitations:
reviewing the automatically labeled report sample, revising errors when there are errors in the automatically labeled report sample, transferring the revised report sample to the original corpus database, and re-iterating and updating the named entity recognition dictionary and pattern rules database; identifying that the labelling of the current report sample completes when there are no errors in the automatically labeled report sample; (Yegnanarayanan 0066, 0072, 0155) – disclosing correcting automatic labeling and updating the model until complete.
segmenting each report sample into a plurality of short sentences by punctuation and storing the first obtained short sentences to form a statement database; (Yegnanarayanan 0117) – teaching segmenting at the phrase or sentence level.

CLAIM 4
Yegnanarayanan discloses the limitations above relative to Claim 1. With respect to the following limitations:
parsing each obtained short sentence, and determining whether the current short sentence already exists in the statement database; 
omitting to process the current short sentence when the current short sentence already exists in the statement database, adding the current short sentence to the statement database when the current short sentence does not exist in the statement database; 
parsing the statement database, to establish a named entity recognition dictionary and a pattern rules database, and 
removing duplicate texts from the named entity recognition dictionary and the pattern rules database. 
Claim 4 recites the same process of parsing, comparing and only adding new short sentences when they don’t already exist. Yegnanarayanan (0117) discloses parsing and processing sentences.
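For illustration only, the segmenting and duplicate-checking limitations quoted above may be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation drawn from the application or from Yegnanarayanan: the function names, the choice of punctuation marks, and the use of an in-memory set as the statement database are all hypothetical.

```python
import re


def split_short_sentences(report_text):
    """Split a report into short sentences at punctuation marks.

    Assumption: the set of delimiters below is illustrative only; the
    claims do not enumerate which punctuation marks are used.
    """
    parts = re.split(r"[,.;:!?]", report_text)
    return [p.strip() for p in parts if p.strip()]


def add_to_statement_database(statement_db, report_text):
    """Add only short sentences not already stored; return those added."""
    added = []
    for sentence in split_short_sentences(report_text):
        if sentence in statement_db:
            continue  # already in the database, so processing is omitted
        statement_db.add(sentence)
        added.append(sentence)
    return added


statement_db = set()  # hypothetical stand-in for the statement database
first = add_to_statement_database(
    statement_db, "Ulcer in the stomach. No bleeding seen."
)
second = add_to_statement_database(
    statement_db, "No bleeding seen. Polyp in the antrum."
)
```

In this sketch the second report contributes only the short sentence that was not already stored.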
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Yegnanarayanan: (US PGPUB 2013/0035961 A1) in view of Syeda-Mahmood et al.: (US 8,793,199 B2).
CLAIM 5
Yegnanarayanan discloses the limitations above relative to Claim 1. With respect to the following limitations:
creating a prefix dictionary according to the named entity recognition dictionary, the prefix dictionary storing noun groups corresponding to each noun in the named entity recognition dictionary; 
when the named entity recognition dictionary is composed of {d_1, ..., d_i, ..., d_n}, any noun group in the prefix dictionary is expressed as: {d_i1, ..., d_ij, ..., d_iLi}; wherein n denotes the total number of nouns in the named entity recognition dictionary, d_i denotes the i-th noun in the named entity recognition dictionary, i ∈ 1, 2, ..., n, the i-th noun comprises Li characters arranged in sequence, d_ij denotes the word consisting of the characters from the 1st one to the j-th one arranged in sequence, j ∈ 1, 2, ..., Li; 
traversing the prefix dictionary and keeping only one of the same words; 
the step "automatically label the current report sample" specifically comprises: since the q-th report sample is collected, querying the named entity recognition dictionary, prefix dictionary and pattern rules database with the texts appearing in the report sample, to automatically label the current report sample.
Yegnanarayanan discloses extracting various parts of speech including nouns, verbs, adjectives, prepositions, prefixes, etc. (0113), but does not expressly disclose a prefix dictionary. Syeda-Mahmood discloses a system and method for extraction of information from reports that includes a prefix dictionary. Matching using longest common subfix is disclosed (Syeda-Mahmood col. 4 line 46 to col. 5 line 49). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the text extraction and labeling system of Yegnanarayanan so as to have included a prefix dictionary, in accordance with the teachings of Syeda-Mahmood, in order to allow for matching prefix terms in text.
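For illustration only, the prefix dictionary recited in Claim 5 may be sketched as follows: for each noun of Li characters, the words formed from the 1st character through the j-th character, for j from 1 to Li, are stored, and only one copy of any repeated word is kept. This is a minimal sketch under stated assumptions; the function name build_prefix_dictionary and the sample nouns are hypothetical, and the sketch does not represent the implementation of the application or of Syeda-Mahmood.

```python
def build_prefix_dictionary(nouns):
    """Collect every leading substring of every noun.

    A set is used so that a prefix shared by several nouns is kept
    only once, as in the "keeping only one of the same words" step.
    """
    prefixes = set()
    for noun in nouns:
        for j in range(1, len(noun) + 1):
            prefixes.add(noun[:j])  # characters 1..j of the noun
    return prefixes


# Hypothetical nouns sharing the prefixes "u" and "ul".
prefix_dictionary = build_prefix_dictionary(["ulcer", "ulna"])
```

Here the two nouns contribute nine prefixes in total, of which seven are distinct, because “u” and “ul” are shared.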
Claims 6 - 8 are rejected under 35 U.S.C. 103 as being unpatentable over Yegnanarayanan: (US PGPUB 2013/0035961 A1) in view of Syeda-Mahmood et al.: (US 8,793,199 B2) in view of Ruehle: (US PGPUB 2014/0101176 A1).
CLAIM 6
Yegnanarayanan discloses the limitations above relative to Claim 1. With respect to the following limitations:
segmenting each report sample into a plurality of short sentences by punctuation when the q-th report sample is collected; 
querying the prefix dictionary with word xt_k formed from the t-th character to the k-th character in each short sentence, the value of t is [1,XN], the value of k is [t,XN], wherein XN is the total number of characters in current short sentence;
determining whether xt_k exists in the prefix dictionary, taking t=1 for the first time of determination, 
taking k=k+1 when xt_k exists in the prefix dictionary, continuing to determine whether xt_k+1 exists in the prefix dictionary, till the xt_k+1 is not in the prefix dictionary, then querying the named entity recognition dictionary using xt_k as the keyword, and 
when a noun corresponding to the keyword is found, labeling the current noun with the naming category of the found noun.
Yegnanarayanan discloses extracting various parts of speech including nouns, verbs, adjectives, prepositions, etc., but does not expressly disclose a prefix dictionary. Syeda-Mahmood discloses a system and method for extraction of information from reports that includes a prefix dictionary. Matching using longest common subfix is disclosed (Syeda-Mahmood col. 4 line 46 to col. 5 line 49). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the text extraction and labeling system of Yegnanarayanan so as to have included a prefix dictionary, in accordance with the teachings of Syeda-Mahmood, in order to allow for matching prefix terms in text. With respect to the following:
when the noun corresponding to the keyword is not found, doing greedy matching for current word xt_k and labelling according the matching result; when the noun corresponding to the current word xt_k is still not found by greedy matching, giving up labeling with querying the named entity recognition dictionary as the standard; (Ruehle 0024 - 0028, 0035).
Ruehle discloses a system and method for matching using greedy matching. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the text extraction and labeling system of Yegnanarayanan so as to have included greedy matching, in accordance with the teachings of Ruehle, in order to allow for matching overlapping terms.
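For illustration only, the prefix-dictionary query recited in Claim 6 may be sketched as follows: starting at the t-th character, k is advanced while the word xt_k remains in the prefix dictionary, and the longest such word is then used as the keyword for the named entity recognition dictionary. This is a minimal sketch under stated assumptions and is not the implementation of the application or of any cited reference. The function name, the representation of the dictionary as a noun-to-category mapping, and the rule for advancing t after each attempt are hypothetical, and the greedy matching fallback is omitted.

```python
def label_short_sentence(sentence, ner_dictionary):
    """Label nouns in one short sentence by longest prefix extension.

    ner_dictionary maps each noun to its naming category (an assumed
    representation). Indices are 0-based here, whereas the claim
    counts characters from 1.
    """
    # Prefix dictionary: every leading substring of every noun.
    prefixes = {
        noun[:j] for noun in ner_dictionary for j in range(1, len(noun) + 1)
    }
    labels = []
    t = 0
    n = len(sentence)
    while t < n:
        k = t
        # Extend k while the word from t through k is a known prefix.
        while k < n and sentence[t:k + 1] in prefixes:
            k += 1
        keyword = sentence[t:k]
        if keyword in ner_dictionary:
            labels.append((keyword, ner_dictionary[keyword]))
            t = k  # assumption: resume after the labeled noun
        else:
            t += 1  # assumption: otherwise move on by one character
    return labels


ner_dictionary = {"ulcer": "disease type", "stomach": "organ identification"}
labels = label_short_sentence("ulcer in stomach", ner_dictionary)
```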
CLAIM 8
Yegnanarayanan discloses the limitations above relative to Claim 1. With respect to the following limitations:
first querying the named entity recognition dictionary with the texts appearing in the report sample, and continuing to query the pattern rules database with the texts appearing in the report sample when no corresponding text is found in the named entity recognition dictionary; (Ruehle 0023).
Ruehle discloses a system and method for sequential matching. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the text extraction and labeling system of Yegnanarayanan so as to have included sequential matching, in accordance with the teachings of Ruehle, in order to allow for matching across different dictionaries.
CLAIM 7
The combination of Yegnanarayanan/Syeda-Mahmood/Ruehle discloses the limitations above relative to Claims 1 and 6. With respect to the following limitations:
doing a forward greedy matching for the current word xt_k; in the process of forward greedy matching, keeping k=k-1, and each time k is re-assigned, querying the named entity recognition dictionary using xt_k-1 as keyword, and when the corresponding noun is found, labelling the current noun with the naming category of the found noun, and 
when the corresponding noun is still not found when k=t, performing backward greedy matching for the word xt_k; in the process of backward greedy matching, keeping t=t+1, and each time t is re-assigned, querying the named entity recognition dictionary using xt+1_k as keyword, and when the corresponding noun is found, labelling the current noun with the naming category of the found noun, and when the corresponding noun is still not found when t=k, determining that the combination in any sequence of characters from the t-th one to the k-th one in the current word is not successfully queried in the named entity recognition dictionary; (Ruehle 0024 - 0028, 0035).
Ruehle discloses a system and method for matching using greedy matching. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the text extraction and labeling system of Yegnanarayanan so as to have included greedy matching, in accordance with the teachings of Ruehle, in order to allow for matching overlapping terms.
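For illustration only, the forward and backward greedy matching recited in Claim 7 may be sketched as follows. In the forward pass the end index k is decremented and the shortened word is queried; if nothing is found, the backward pass increments the start index t and queries again. This is a minimal sketch under stated assumptions and is not the implementation of the application or of Ruehle: the function name greedy_match, the noun-to-category mapping, and the sample words are hypothetical.

```python
def greedy_match(word, ner_dictionary):
    """Return (noun, category) for the first shortened form found, else None.

    ner_dictionary maps each noun to its naming category (an assumed
    representation). The full word itself is not re-queried, on the
    assumption that it has already failed the direct look-up.
    """
    # Forward greedy matching: drop trailing characters one at a time.
    for k in range(len(word) - 1, 0, -1):
        candidate = word[:k]
        if candidate in ner_dictionary:
            return candidate, ner_dictionary[candidate]
    # Backward greedy matching: drop leading characters one at a time.
    for t in range(1, len(word)):
        candidate = word[t:]
        if candidate in ner_dictionary:
            return candidate, ner_dictionary[candidate]
    return None  # no shortened form was found in the dictionary


ner_dictionary = {"ulcer": "disease type"}
forward_hit = greedy_match("ulcers", ner_dictionary)
backward_hit = greedy_match("microulcer", ner_dictionary)
no_hit = greedy_match("xyz", ner_dictionary)
```

In this sketch “ulcers” is resolved by the forward pass, “microulcer” by the backward pass, and “xyz” by neither.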
CONCLUSION
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US PGPUB 2018/0293227 A1 to Guo discloses a system and method for extracting phrases from text documents and that includes a prefix dictionary.
Any inquiry of a general nature or relating to the status of this application or concerning this communication or earlier communications from the Examiner should be directed to John A. Pauls whose telephone number is (571) 270-5557.  The Examiner can normally be reached on Mon. - Fri. 8:00 - 5:00 Eastern.  If attempts to reach the examiner by telephone are unsuccessful, the Examiner’s supervisor, Robert Morgan can be reached at (571) 272-6773.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://portal.uspto.gov/external/portal/pair.  Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866.217.9197.
Official replies to this Office action may now be submitted electronically by registered users of the EFS-Web system.  Information on EFS-Web tools is available on the Internet at: http://www.uspto.gov/patents/process/file/efs/guidance/index.jsp.  An EFS-Web Quick-Start Guide is available at:  http://www.uspto.gov/ebc/portal/efs/quick-start.pdf.
Alternatively, official replies to this Office action may still be submitted by any one of fax, mail, or hand delivery.  Faxed replies should be directed to the central fax at (571) 273-8300.  Mailed replies should be addressed to “Commissioner for Patents, PO Box 1450, Alexandria, VA  22313-1450.”  Hand delivered replies should be delivered to the “Customer Service Window, Randolph Building, 401 Dulany Street, Alexandria, VA  22314.”

/JOHN A PAULS/ Primary Examiner, Art Unit 3626
Date: 27 October, 2022