DETAILED ACTION
This Action is responsive to Applicant’s Amendments filed December 7, 2021.
Please note, claims 1-24 remain pending.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
On page 2, Applicant argues that Woytowitz does not anticipate amended claim 1 because the cited art does not disclose “receiving input of a knowledgebase schema of data items available for extraction from the unstructured documents” and “parsing the unstructured documents into a machine-readable representation”. Specifically, Applicant argues that the electronic database of disambiguated entity mentions created from a corpus of electronic unstructured documents in Woytowitz is written in XML format, and is therefore not unstructured as recited in Applicant’s claims.
As to the above, Examiner respectfully submits that paragraph 0046 of Woytowitz teaches harvesting data from the electronic document corpus. Examiner further respectfully submits that paragraph 0047 of Woytowitz teaches that the electronic unstructured documents in the electronic document corpus may include unstructured electronic documents. Examiner introduces a new reference to further teach the knowledgebase schema.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to 
Claims 1-3, 9-11, and 17-19 are rejected under 35 U.S.C. 103 as being unpatentable over Woytowitz et al. (US Pub. No. 2012/0197862) in view of Minton et al. (US Pub. No. 2009/0319515).

Regarding claim 1, Woytowitz teaches a computer-implemented method of extracting data from unstructured documents, the method comprising: 
‘using one or more processors (¶0042) to perform the steps of: 
receiving input of a number of unstructured documents’ as reading new electronic unstructured documents (¶0075)
‘receiving input of a schema of data items available for extraction from the unstructured documents’ as utilizing a document schema from a collection of document schemas to parse and tag extraction content (¶0048)
‘parsing the unstructured documents into a machine-readable representation’ as parsing the unstructured documents to store and report multiple classes of entity mentions and mention relations contained in electronic unstructured documents 
‘identifying data items in the machine-readable representation according to the schema’ as identifying entity mentions and mention relationships (¶0044)
‘propagating interpretations of data items within the unstructured documents to disambiguate identified data items’ as clustering objects from the mentions and mention relationships to disambiguate (¶0078-80, 52)
‘matching identified data items with other data items in the unstructured documents according to the schema’ (¶0078-80, 52)
‘extracting only identified data items that include a minimal set of interpretations specified by the schema’ as processing mentions and extracting only new mentions that match over a threshold score (¶0077-78, 52)
Woytowitz fails to explicitly teach:
‘knowledgebase schema’
Minton teaches:
‘knowledgebase schema’ (¶0051)
It would have been obvious to one of ordinary skill in the art at the time that the present invention was effectively filed to modify the teachings of the cited references because Minton’s teachings would have allowed Woytowitz’s system to improve organization of data (¶0004).

Regarding claim 2, Woytowitz teaches ‘wherein the knowledgebase schema comprises known attributes of data items, known aliases of data items, and specified relationships between data items’ as schema comprising mention objects and relation objects and variations of items (¶0048)

Regarding claim 3, Woytowitz teaches ‘wherein identified data items are matched with other data according to the specified relationships in the knowledgebase schema’ as using comparison algorithms for matching objects (¶0108)

Regarding claim 9, Woytowitz teaches a system for extracting data from unstructured documents, the system comprising: 
‘a storage device configured to store program instructions’ (¶0042)
‘and one or more processors operably connected to the storage device (¶0042) and configured to execute the program instructions to cause the system to: 
receive input of a number of unstructured documents’ as reading new electronic unstructured documents (¶0075)
‘receive input of a schema of data items available for extraction from the unstructured documents’ as utilizing a document schema from a collection of document schemas to parse and tag extraction content (¶0048)
‘parse the unstructured documents into a machine-readable representation’ as parsing the unstructured documents to store and report multiple classes of entity mentions and mention relations contained in electronic unstructured documents
‘identify data items in the machine-readable representation according to the schema’ as identifying entity mentions and mention relationships (¶0044)
‘propagate interpretations of data items within the unstructured documents to disambiguate identified data items’ as clustering objects from the mentions and mention relationships to disambiguate (¶0078-80, 52)
‘match identified data items with other data items in the unstructured documents according to the schema’ (¶0078-80, 52)
‘and extract only identified data items that include a minimal set of interpretations specified by the schema’ as processing mentions and extracting only new mentions that match over a threshold score (¶0077-78, 52)
Woytowitz fails to explicitly teach:
‘knowledgebase schema’
Minton teaches:
‘knowledgebase schema’ (¶0051)
It would have been obvious to one of ordinary skill in the art at the time that the present invention was effectively filed to modify the teachings of the cited references because Minton’s teachings would have allowed Woytowitz’s system to improve organization of data (¶0004).

Regarding claim 17, Woytowitz teaches a computer program product for extracting data from unstructured documents, the computer program product comprising:
‘a non-volatile computer readable storage medium having program instructions stored thereon (¶0042) to perform the steps of:
receiving input of a number of unstructured documents’ as reading new electronic unstructured documents (¶0075)
‘receiving input of a schema of data items available for extraction from the unstructured documents’ as utilizing a document schema from a collection of document schemas to parse and tag extraction content (¶0048)
‘parsing the unstructured documents into a machine-readable representation’ as parsing the unstructured documents to store and report multiple classes of entity mentions and mention relations contained in electronic unstructured documents 
‘identifying data items in the machine-readable representation according to the schema’ as identifying entity mentions and mention relationships (¶0044)
‘propagating interpretations of data items within the unstructured documents to disambiguate identified data items’ as clustering objects from the mentions and mention relationships to disambiguate (¶0078-80, 52)
‘matching identified data items with other data items in the unstructured documents according to the schema’ (¶0078-80, 52)
‘extracting only identified data items that include a minimal set of interpretations specified by the schema’ as processing mentions and extracting only new mentions that match over a threshold score (¶0077-78, 52)
Woytowitz fails to explicitly teach:
‘knowledgebase schema’
Minton teaches:
‘knowledgebase schema’ (¶0051)
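It would have been obvious to one of ordinary skill in the art at the time that the present invention was effectively filed to modify the teachings of the cited references because Minton’s teachings would have allowed Woytowitz’s system to improve organization of data (¶0004).

Regarding claim 18, Woytowitz teaches ‘wherein the knowledgebase schema comprises known attributes of data items, known aliases of data items, and specified relationships between data items’ as schema comprising mention objects and relation objects and variations of items (¶0048)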

Regarding claim 19, Woytowitz teaches ‘wherein identified data items are matched with other data according to the specified relationships in the knowledgebase schema’ as using comparison algorithms for matching objects (¶0108)

Claims 4-6, 12-14, and 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Woytowitz et al. (US Pub. No. 2012/0197862) and Minton et al. (US Pub. No. 2009/0319515), further in view of Viola et al. (US Pub. No. 2007/0003147).

Regarding claim 4, Woytowitz teaches ‘wherein parsing the unstructured documents comprises: text parsing’ (¶0009)
Woytowitz fails to explicitly teach ‘line parsing; and image parsing.’
Viola teaches ‘line parsing; and image parsing’ (¶0023-24, 35)
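It would have been obvious to one of ordinary skill in the art at the time that the present invention was effectively filed to modify the teachings of the cited references because Viola’s teachings would have allowed Woytowitz’s system to provide efficient document recognition with improved accuracy (¶0005).

Regarding claim 5, Woytowitz teaches ‘further comprising detecting and reconstructing tables from text, lines, and images.’ (¶0029, 53)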

Regarding claim 6, Woytowitz teaches ‘further comprising detecting and reconstructing charts from text, lines, and images’ (¶0053)

Regarding claim 12, Woytowitz teaches ‘wherein parsing the unstructured documents comprises: text parsing’ (¶0009)
Woytowitz fails to explicitly teach ‘line parsing; and image parsing.’
Viola teaches ‘line parsing; and image parsing’ (¶0023-24, 35)
It would have been obvious to one of ordinary skill in the art at the time that the present invention was effectively filed to modify the teachings of the cited references because Viola’s teachings would have allowed Woytowitz’s system to provide efficient document recognition with improved accuracy (¶0005).

Regarding claim 13, Woytowitz teaches ‘further comprising detecting and reconstructing tables from text, lines, and images.’ (¶0029, 53)

Regarding claim 14, Woytowitz teaches ‘further comprising detecting and reconstructing charts from text, lines, and images’ (¶0053)

Regarding claim 20, Woytowitz teaches ‘wherein parsing the unstructured documents comprises: text parsing’ (¶0009)
Woytowitz fails to explicitly teach ‘line parsing; and image parsing.’
Viola teaches ‘line parsing; and image parsing’ (¶0023-24, 35)
It would have been obvious to one of ordinary skill in the art at the time that the present invention was effectively filed to modify the teachings of the cited references because Viola’s teachings would have allowed Woytowitz’s system to provide efficient document recognition with improved accuracy (¶0005).

Regarding claim 21, Woytowitz teaches ‘further comprising detecting and reconstructing tables from text, lines, and images.’ (¶0029, 53)

Regarding claim 22, Woytowitz teaches ‘further comprising detecting and reconstructing charts from text, lines, and images’ (¶0053)

Claims 7, 15, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Woytowitz et al. (US Pub. No. 2012/0197862) and Minton et al. (US Pub. No. 2009/0319515), further in view of Griffith et al. (US Pub. No. 2018/0314705).

Regarding claim 7, Woytowitz fails to explicitly teach ‘wherein, for tabular data, interpretations of data items are propagated across rows and down columns.’
Griffith teaches ‘wherein, for tabular data, interpretations of data items are propagated across rows and down columns.’ (¶0174-176)
It would have been obvious to one of ordinary skill in the art at the time that the present invention was effectively filed to modify the teachings of the cited references because Griffith’s teachings would have allowed Woytowitz’s system to optimize linking of data (¶0008).

Regarding claim 15, Woytowitz fails to explicitly teach ‘wherein, for tabular data, interpretations of data items are propagated across rows and down columns.’
Griffith teaches ‘wherein, for tabular data, interpretations of data items are propagated across rows and down columns.’ (¶0174-176)
It would have been obvious to one of ordinary skill in the art at the time that the present invention was effectively filed to modify the teachings of the cited references because Griffith’s teachings would have allowed Woytowitz’s system to optimize linking of data (¶0008).

Regarding claim 23, Woytowitz fails to explicitly teach ‘wherein, for tabular data, interpretations of data items are propagated across rows and down columns.’
Griffith teaches ‘wherein, for tabular data, interpretations of data items are propagated across rows and down columns.’ (¶0174-176)
It would have been obvious to one of ordinary skill in the art at the time that the present invention was effectively filed to modify the teachings of the cited references because Griffith’s teachings would have allowed Woytowitz’s system to optimize linking of data (¶0008).

Claims 8, 16, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Woytowitz et al. (US Pub. No. 2012/0197862) and Minton et al. (US Pub. No. 2009/0319515), further in view of Goldstein et al. (US Pub. No. 2015/0154164).

Regarding claim 8, Woytowitz fails to explicitly teach ‘wherein, for floating text, interpretation of a matched data item to a corresponding value is propagated according to geometric and semantic information’
Goldstein teaches ‘wherein, for floating text, interpretation of a matched data item to a corresponding value is propagated according to geometric and semantic information’ (¶0284)
It would have been obvious to one of ordinary skill in the art at the time that the present invention was effectively filed to modify the teachings of the cited references because Goldstein’s teachings would have allowed Woytowitz’s system to more accurately compare extracted entities (¶0025).

Regarding claim 16, Woytowitz fails to explicitly teach ‘wherein, for floating text, interpretation of a matched data item to a corresponding value is propagated according to geometric and semantic information’
Goldstein teaches ‘wherein, for floating text, interpretation of a matched data item to a corresponding value is propagated according to geometric and semantic information’ (¶0284)
It would have been obvious to one of ordinary skill in the art at the time that the present invention was effectively filed to modify the teachings of the cited references because Goldstein’s teachings would have allowed Woytowitz’s system to more accurately compare extracted entities (¶0025).

Regarding claim 24, Woytowitz fails to explicitly teach ‘wherein, for floating text, interpretation of a matched data item to a corresponding value is propagated according to geometric and semantic information’
Goldstein teaches ‘wherein, for floating text, interpretation of a matched data item to a corresponding value is propagated according to geometric and semantic information’ (¶0284)
It would have been obvious to one of ordinary skill in the art at the time that the present invention was effectively filed to modify the teachings of the cited references because Goldstein’s teachings would have allowed Woytowitz’s system to more accurately compare extracted entities (¶0025).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VAN OBERLY whose telephone number is (571)272-7025. The examiner can normally be reached Monday - Friday, 7:30am-4pm MT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mark Featherstone, can be reached at (571)270-3750. The fax phone 
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/VAN H OBERLY/Primary Examiner, Art Unit 2166