DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Applicant(s) Response to Official Action
The response filed on 04/29/2022 has been entered and made of record.

Claim Rejections - 35 USC § 102 and 35 USC § 103
Presented arguments have been fully considered, but are rendered moot in view of the new ground(s) of rejection necessitated by amendment(s) initiated by the applicant(s).



Notice re prior art available under both pre-AIA and AIA
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.


Claims 1-6, 8-13, and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Lucas et al. (US20200126663A1) (hereinafter Lucas) in view of Bekas et al. (US20180204360A1) (hereinafter Bekas).

Regarding Claim 1, Lucas meets the limitations as follows: 
A system for information extraction and analysis, the system [i.e. Fig. 29, and associated text, Para 0288-0289] comprising:
at least one memory device with computer-readable program code stored thereon; [i.e. a system 1100 may include a server 1102, and one or more mobile devices, such as smartphones 1108A, 1108B, tablet devices 1108C, 1108D, and laptop or other computing devices 1108E; Fig. 29, and associated text, Para 0288-0289, system processors may be programmed to continually and automatically perform; Para 0262, The system may execute subroutines; Para 0097, 0114, opening or starting-up the application; Para 0045, a mobile application is displayed on the mobile device 12; Para 0046; Therefore, server and computer devices processing and executing the applications and subroutines, indicates the program code stored on a memory device.]
at least one communication device; [i.e. server 1102 may be connected directly to one or more of the devices, such as via an Ethernet or other suitable connection. Alternatively, the server 1102 may be connected wirelessly to one or more of the devices, such as via WiFi or another wireless connection, as would be appreciated by those of ordinary skill in the relevant art; Fig. 29, and associated text, Para 0289]
at least one processing device operatively coupled to the at least one memory device and the at least one communication device, wherein executing the computer-readable code is configured to cause the at least one processing device [i.e. one or more analytical actions described herein may be performed by the mobile devices or alternatively, one or more actions may be performed by the server; Fig. 29, and associated text, Para 0288-0289] to:
receive a graphical representation for analysis; [i.e. receiving an electronic representation of a medical document; Para 0010, capture images of documents; Fig.2, and associated text, Para 0011, 0049]
process the graphical representation to convert the graphical representation to a standard file type and remove unnecessary information, 
detect features within the graphical representation using a convolutional neural network (i.e. MLA or DLNN) analysis by identifying boundary thresholds and contours within the graphical representation; [i.e. MLA may identify distinguishing characteristics, pixels, or colors from the logo and the thickness of the border; Para 0057, a uniform white space (i.e. threshold) of the width above and below, that is also present in the larger section on the border; Para 0087]
generate a feature map (i.e. feature list/list of features) of the graphical representation comprising detected features in the graphical representation; [i.e. Features of a document may include headers, columns, tables, graphs, and other standard forms which appear in the document.; Para 0055, Predefined models may store a list of features that are derived from the document based on MLA processing.; Para 0056, A region (such as a header, table, graphic, or chart) may be identified by utilizing a stored feature list for the document; Para 0061]
access a chart repository (i.e. models/templates) containing classification attributes and proportional information for multiple chart types; [i.e. matching the document to a template model; extracting features from the template model; Para 0010, a document classifier may process features of the document stored in a predefined model for each document, where features of a document may include headers, columns, tables, graphs, and other standard forms which appear in the document.; Para 0055, Predefined models may store a list of features that are derived from the document based on MLA processing.; Para 0056, The MLA may identify distinguishing characteristics, pixels, or colors from the logo and the thickness of the border of the header a seemingly random placement as a unique document identifier which is consistent between reports.; Para 0057, A region (such as a header, table, graphic, or chart) may be identified by utilizing a stored feature list for the document; Para 0061, The MLA may access a stored library of templates; Para 0079, a border may actually be identified using the negative space (such as the white space) around a text by observing that the white space is of at least a uniform distance all around a segment of text and creating a natural boundary. For example, white space that also borders an edge of a paper may be several times as thick as the white space above and below the segment of text, but there will be at least a uniform white space of the width above and below, that is also present in the larger section on the border. … a table may be identified by observing two or more intersecting lines, a table may be solid, dashed, or even extrapolated from the negative space between the words.; Para 0087, indicating a repository containing classification attributes and proportional information for multiple chart types]
classify the graphical representation according to one of the multiple chart types based on the classification attributes from the chart repository; [i.e. a document classifier may process features of the document stored in a predefined model for each document, where features of a document may include headers, columns, tables, graphs, and other standard forms which appear in the document.; Para 0055, Predefined models may store a list of features that are derived from the document based on MLA processing.; Para 0056, A region (such as a header, table, graphic, or chart) may be identified by utilizing a stored feature list for the document; Para 0061, a MLA or a deep learning neural network (DLNN) may be trained with a training dataset that comprises annotations for types of classification that may be performed. It should be understood that the terms MLA and DLNN are interchangeable throughout this disclosure. Thus, a resulting ruleset or neural network may identify or recognize a plurality of features across a standardized report or other template signifying that a classification may be extracted from a specific section of a particular report based at least in part on that extraction ruleset.; Para 0065, use the identifier to look up the classification model optimized for the document, and classify the document; Para 0186]
analyze the detected features using proportional information for the classification of the graphical representation; [i.e. matching the document to a template model; extracting features from the template model; Para 0010, a document classifier may process features of the document stored in a predefined model for each document, where features of a document may include headers, columns, tables, graphs, and other standard forms which appear in the document.; Para 0055, Predefined models may store a list of features that are derived from the document based on MLA processing.; Para 0056, The MLA may identify distinguishing characteristics, pixels, or colors from the logo and the thickness of the border of the header a seemingly random placement as a unique document identifier which is consistent between reports.; Para 0057, A region (such as a header, table, graphic, or chart) may be identified by utilizing a stored feature list for the document; Para 0061, The MLA may access a stored library of templates; Para 0079, a border may actually be identified using the negative space (such as the white space) around a text by observing that the white space is of at least a uniform distance all around a segment of text and creating a natural boundary. For example, white space that also borders an edge of a paper may be several times as thick as the white space above and below the segment of text, but there will be at least a uniform white space of the width above and below, that is also present in the larger section on the border. … a table may be identified by observing two or more intersecting lines, a table may be solid, dashed, or even extrapolated from the negative space between the words.; Para 0087, indicating analysis of the detected features using proportional information for the classification of the graphical representation.]
extract data from the detected features using optical character recognition and proportional analysis; [i.e. document may be submitted for optical character recognition (OCR) on the document to convert the text into a machine-readable format; Para 0053, extract each section in turn, and then provide the section to an OCR algorithm, such as an OCR post-processing optimized to extracting information from the respective section type.; Para 0077]  and
store the extracted data in an accessible format, wherein the extracted data from the detected features includes contour data and numerical data series. [i.e. once the mobile device captures the NGS report, system may extract some or all of the NGS medical information (such as patient information, genes, mutations, variants, or expression data) contained in the document. Using various OCR, MLA, and DLNN techniques and extracted information then may be stored in a database or other data repository, preferably in a structured format; Para 0066, where, a region (such as a header, table, graphic, or chart) may be identified by utilizing a stored feature list for the document; Para 0061, Each field of the extracted region may have a plurality of enumerated values; Para 0062, linking the results of Section 3 to the enumerated content of Section 2.; Fig. 3, and associated text, Para 0078, identifying a region of interest may be performed by identifying a border, a table may be identified by observing two or more intersecting lines. Similarly, lines segmenting the columns and rows of a table may be solid, dashed, or even extrapolated from the negative space between the words; Para 0087, and information being extracted, such as sequencing information; Para 0088, indicating that the detected features include contour data and numerical data series.]
Lucas does not explicitly disclose the following claim limitations:
... wherein removing unnecessary information further comprises removing axis and grid information;
However, in the same field of endeavor Bekas discloses the deficient claim limitations, as follows:
... wherein removing unnecessary information further comprises removing axis and grid information; [i.e. identify and locate the structural primitives that are not representing quantitative data values such as grids, axes and more and are removed from the data region allowing for extracting quantitative data from the correct region of the image, e.g. from the data region framed by the axes, and to remove elements comprised by the data region that are no data such as a legend box and its content or a grid.; Para 0024, 0034, 0063]   
Lucas discloses identifying borders, tables, etc., extracting the text data from the graphical image, and removing the borders, table lines, etc. Bekas discloses identifying and removing grids and axes and extracting quantitative data from the data region framed by the axes or a grid, which is pertinent to the problem with which the applicant was concerned. Therefore, combining the teachings of Lucas and Bekas would provide an expected result, thereby resulting in the claimed invention.
Therefore, it would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system disclosed by Lucas to add the teachings of Bekas as above, in order to automatically extract data from a digital image. [Bekas: Para 0008]
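
For illustration only, the axis and grid removal limitation discussed above can be sketched in a few lines of Python. The sketch treats any pixel row or column that is almost entirely dark as an axis or grid line and blanks it, leaving isolated data marks in place. The function name remove_grid_lines, its parameters, and the coverage heuristic are assumptions made for this example; they are not taken from Lucas, Bekas, or the application.

```python
from typing import List

def remove_grid_lines(pixels: List[List[int]],
                      dark_level: int = 50,
                      coverage: float = 0.9,
                      white: int = 255) -> List[List[int]]:
    """Blank out rows/columns that look like axis or grid lines.

    Hypothetical helper: a row or column counts as a line when at least
    `coverage` of its pixels are at or below `dark_level` (grayscale 0-255).
    """
    height, width = len(pixels), len(pixels[0])
    # Rows that are dark across (nearly) the full width.
    line_rows = {y for y in range(height)
                 if sum(v <= dark_level for v in pixels[y]) >= coverage * width}
    # Columns that are dark across (nearly) the full height.
    line_cols = {x for x in range(width)
                 if sum(pixels[y][x] <= dark_level for y in range(height)) >= coverage * height}
    # Return a copy with the detected lines painted white.
    return [[white if (y in line_rows or x in line_cols) else pixels[y][x]
             for x in range(width)]
            for y in range(height)]

if __name__ == "__main__":
    D, W = 0, 255
    image = [[D, W, W, W],
             [D, W, D, W],   # one data mark at row 1, column 2
             [D, W, W, W],
             [D, D, D, D]]   # bottom row and left column act as axes
    for row in remove_grid_lines(image):
        print(row)
```

A full-length line test of this kind is a deliberate simplification; dashed grids or tick marks would need a different detector.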

Regarding Claim 2, Note the Rejection for claim 1, wherein Lucas further discloses
The system of claim 1, wherein the detection of features within the graphical representation further comprises repeatedly analyzing the graphical representation to identify regions of interest within the graphical representation. [i.e. error checking may involve repeating the process in low confidence predictions; Para 0240]
 
Regarding Claim 3, Note the Rejection for claim 1, wherein Lucas further discloses
The system of claim 1, wherein the proportional information comprises thresholds for identifying boundaries and contours based on differences identified in pixel data for the graphical representation. [i.e. Mask may identify the bounds of Section, for example, by identifying a size (such as number of pixels, width, length, diameter, etc.) (i.e. threshold); Para 0078, a border may actually be identified using the negative space (such as the white space) around a text by observing that the white space is of at least a uniform distance (i.e. threshold) all around a segment of text and creating a natural boundary. For example, white space that also borders an edge of a paper may be several times (i.e. difference) as thick as the white space above and below the segment of text, but there will be at least a uniform white space of the width above and below, that is also present in the larger section on the border.; Para 0087]
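
By way of a hedged illustration of the white-space threshold reading applied above, the following Python sketch splits a grayscale pixel grid into content segments wherever a run of all-white rows is at least a minimum thickness. The function name find_row_segments and the parameters white_level and min_gap are hypothetical and chosen for this example only; the sketch is not the method of Lucas or of the application.

```python
from typing import List, Optional, Tuple

def find_row_segments(pixels: List[List[int]],
                      white_level: int = 250,
                      min_gap: int = 2) -> List[Tuple[int, int]]:
    """Return (first_row, last_row) for each content segment.

    Hypothetical helper: a segment ends only when it is followed by at
    least `min_gap` consecutive rows whose pixels are all >= `white_level`.
    """
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None   # first row of the open segment
    last_content = 0              # most recent non-white row
    gap = 0                       # white rows seen since last_content
    for y, row in enumerate(pixels):
        is_white = all(v >= white_level for v in row)
        if not is_white:
            if start is None:
                start = y
            last_content = y
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:    # white band is thick enough: close segment
                segments.append((start, last_content))
                start = None
    if start is not None:         # close a segment that runs to the last rows
        segments.append((start, last_content))
    return segments

if __name__ == "__main__":
    W = [255, 255, 255, 255]
    D = [255, 0, 0, 255]
    page = [W, D, D, W, D, W, W, D, W]
    print(find_row_segments(page))
```

With min_gap set to 2, the single white row between the second and third dark rows is treated as line spacing inside one segment, while the two-row white band is treated as a boundary.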

Regarding Claim 4, Note the Rejection for claim 1, wherein Lucas further discloses
The system of claim 1, wherein the feature map comprises overlaying feature masks and annotated information (i.e. indicator) on the graphical representation. [i.e. Medical data, or key health information, may include prognostic indicators; Para 0058, different variants separated by some kind of indicator, such as a comma, semicolon, colon, slash, backslash, new line, etc.; Para 0072, generate a mask which “outlines” the section; Para 0077, second layer MLA may identify, for each region of interest, which type of feature the region of interest may contain (such as a table, header, graph, etc.). An output of the MLA from the second layer may be a series of masked images for each of the regions of interest and an indicator for the type of feature that exists in the region of interest.; Para 0082]

Regarding Claim 5, Note the Rejection for claim 1, wherein Lucas further discloses
The system of claim 1, wherein extracting the data from the detected features includes parsing the detected features and creating separate files for each detected feature. [i.e. a plurality of features 40, such as a patient's name, date of birth, and diagnosis; the institution's name and location; and the date of report, of data collection, etc. The MLA may also identify extraction techniques to apply, such as use sentence splitting algorithms to parse a plurality of sentences; Para 0069, stage 66 for parsing may include NLP algorithms for sentence splitting and candidate extraction, and modular nature of the pipeline stages, each stage may pass data directly to the next stage based on processing availability or may store data in a corresponding portion of a storage component or database. a sentence splitting algorithm may be stored in a cloud based server or on a local/remote server 76 and may be incorporated into the parser at stage 66; Para 0113, a database of multiple documents, or another form of patient record, the request may pass through a pre-processing subroutine 82, a parsing subroutine 84; Para 0114, raw OCR information may be pulled from the database. The processing intake pipeline stage for pre-processing and OCR occurs in these servers/processes. The system also may check the database to determine whether improved NLP models have been provided and retrieve any new or updated models; Para 0232, a server 1102 including or in communication with a first database 1102 and a second database 1104, where one or both of those databases individually may actually be a plurality of different databases, such as to assist in data segregation or improved processing by optimizing database calls for the requested different types of data.; Para 0288, indicating parsing the detected features and creating separate files for each detected feature.]

Regarding Claim 6, Note the Rejection for claim 1, wherein Lucas further discloses
The system of claim 1, wherein storing the extracted data in an accessible format further comprises storing the detected features, contour data, and numerical data series in an extensible markup language (i.e. XML) file. [i.e. extracted information then may be stored in a database or other data repository, preferably in a structured format.; Para 0066, Exemplary predefined models may be a JSON file, HTML, XML, or other structured data; Para 0056, electronic document captures may include a structured data form (such as JSON, XML, HTML, etc.); Para 0049, indicating that an XML file is a structured data file. Therefore, the extracted information is stored in a structured XML file.]
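
As a minimal sketch of what storing detected features, contour data, and a numerical data series in an XML file could look like, the following Python example uses the standard library xml.etree.ElementTree module. The element and attribute names (chart, feature, point, series, value) and the function name chart_to_xml are assumptions made for this illustration; neither Lucas nor the application specifies a schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

def chart_to_xml(chart_type: str,
                 features: List[Dict],
                 series: List[float]) -> str:
    """Serialize extracted chart data to an XML string.

    Hypothetical schema: each feature carries a name and a contour given
    as a list of (x, y) pixel coordinates; series holds the numeric values.
    """
    root = ET.Element("chart", {"type": chart_type})
    features_el = ET.SubElement(root, "features")
    for feature in features:
        feature_el = ET.SubElement(features_el, "feature", {"name": feature["name"]})
        for x, y in feature["contour"]:
            ET.SubElement(feature_el, "point", {"x": str(x), "y": str(y)})
    series_el = ET.SubElement(root, "series")
    for value in series:
        ET.SubElement(series_el, "value").text = str(value)
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    xml_text = chart_to_xml(
        "bar",
        [{"name": "bar_1", "contour": [(10, 40), (20, 40), (20, 90), (10, 90)]}],
        [1.5, 2.0],
    )
    print(xml_text)
```

Writing the returned string to disk with an ordinary file write would produce the XML file; a database or JSON store would serve equally well as a structured format.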

Regarding claims 8-13, the claims recite limitations analogous to those of claims 1-6 above, respectively, and are therefore rejected on the same premise. Therefore, regarding claims 8-13, Lucas and Bekas meet the claim limitations as set forth for claims 1-6, respectively. [Please refer to the mapping, explanation, and pertinence of the prior art references given for claims 1-6, respectively. Note: the current claims may contain different terminology or additional claim terms compared to the referenced claims. However, the explanation and pertinence of the prior art references provided for the referenced claims would address any differing claim limitations. Therefore, the explanation and pertinence of the prior art references are not duplicated, and applicant is requested to refer to the explanation and pertinence of the prior art references provided for the referenced claims.]

Regarding claims 15-20, the claims recite limitations analogous to those of claims 1-6 above, respectively, and are therefore rejected on the same premise. Therefore, regarding claims 15-20, Lucas and Bekas meet the claim limitations as set forth for claims 1-6, respectively. [Please refer to the mapping, explanation, and pertinence of the prior art references given for claims 1-6, respectively. Note: the current claims may contain different terminology or additional claim terms compared to the referenced claims. However, the explanation and pertinence of the prior art references provided for the referenced claims would address any differing claim limitations. Therefore, the explanation and pertinence of the prior art references are not duplicated, and applicant is requested to refer to the explanation and pertinence of the prior art references provided for the referenced claims.]



Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Lucas et al. (US20200126663A1) (hereinafter Lucas) in view of Bekas et al. (US20180204360A1) (hereinafter Bekas) and further in view of Messina et al. (US20180336405A1) (hereinafter Messina).

Regarding Claim 7, Note the Rejection for claim 1, wherein Lucas and Bekas do not explicitly disclose the following claim limitations:
The system of claim 1, wherein the standard file type comprises a tag image file format.
However, in the same field of endeavor Messina discloses the deficient claim limitations, as follows:
The system of claim 1, wherein the standard file type comprises a tag image file format. [i.e. one of the most common image formats (png, jpeg, bmp, gif, tiff and others).; Para 0038]  
Lucas discloses taking documents in a variety of formats, including, for example, XML, HTML, rich text, PDF, PNG, or JPG, and converting them to a format that a respective OCR service accepts, such as JPG or PNG, indicating that JPG, PNG, or a similar file format can be used. Messina discloses that the most common image formats include png, jpeg, bmp, gif, tiff and others, indicating that PNG, JPG, and TIFF are common image file formats and indicating to the person of ordinary skill in the art before the effective filing date of the claimed invention that the input file may be converted to a TIFF format.
Therefore, it would have been obvious to the person of ordinary skill in the art before the effective filing date of the claimed invention to modify the system disclosed by Lucas and Bekas to add the teachings of Messina as above, in order to use TIFF as an image file format.

Regarding claim 14, the claim recites limitations analogous to those of claim 7 above and is therefore rejected on the same premise. Therefore, regarding claim 14, Lucas, Bekas and Messina meet the claim limitations as set forth for claim 7. [Please refer to the mapping, explanation, and pertinence of the prior art references given for claim 7. Note: the current claim may contain different terminology or additional claim terms compared to the referenced claim. However, the explanation and pertinence of the prior art references provided for the referenced claim would address any differing claim limitations. Therefore, the explanation and pertinence of the prior art references are not duplicated, and applicant is requested to refer to the explanation and pertinence of the prior art references provided for the referenced claim.]


Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 


Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EXAMINER, DAKSHESH PARIKH, whose telephone number is (571) 272-2777.  The examiner can normally be reached on EXAMINER SCHEDULE.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, SPE, SATH V. PERUNGAVOOR, can be reached on (571) 272-7455.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
 
/DAKSHESH D PARIKH/Primary Examiner, Art Unit 2488