DETAILED ACTION
Claims 1-20 rejected under nonstatutory double patenting.
Claim 14 objected to for minor informalities.
Claims 1-20 rejected under 35 USC § 103.


Notice of Pre-AIA or AIA Status
The present application is being examined under the pre-AIA first to invent provisions.


Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-24 of U.S. Patent No. 9,430,453 B1; claims 1-21 of U.S. Patent No. 9,317,484; claims 1-20 of U.S. Patent No. 10,248,858 B2; and claims 1-20 of U.S. Patent No. 10,860,848 B2. Although the claims at issue are not identical, they are not patentably distinct from each other because the present application recites limitations patently broader in scope than those in the claims of the '453, '858, and '848 patents.


Claim Objections
Claim 14 objected to because of the following informalities:
Claim 14 recites "wherein the interface is configured to repetitively iterate through each of the plurality of fields until either the user enters data that clears the validation error associated with the field." The use of the word "either" implies that the repetitive iteration occurs until one of multiple things happens. The claim recites only one action. Clarification is requested.
Appropriate correction is required.


Claim Rejections - 35 USC § 103
The following is a quotation of pre-AIA 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.

Claims 1-2 and 17-20 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Tuganbaev et al., U.S. PG-Publication No. 2010/0060947 A1, in view of Tillberg et al., U.S. PG-Publication No. 2010/0246999 A1.

Claim 1
	Tuganbaev discloses a method of capturing document data. Specifically, Tuganbaev discloses a method "for enabling a data capture system to capture data from a document image corresponding to a document." Tuganbaev, ¶ 9.
	Tuganbaev discloses obtaining a multi-page document; and extracting data from multiple pages of the multi-page document. The method comprises steps of "processing the scanned images into documents; for documents comprising multiple pages maintaining a page-based coordinate system to specify a location of structures within a page and joining the pages to form a multi-page sheet having a sheet-based coordinate system to specify a location of structures within the multi page sheet," and "performing a data extraction operation to extract data from each document." Id. at ¶ 10.
	Tuganbaev discloses identifying two or more values located on at least two different pages of the multi-page document. The data extraction operation comprises "a document mode wherein structures are detected within the entire document using the sheet-based coordinate system." Id. Using the multi-page sheet "makes it possible to solve tasks as complex as capturing data from documents with multi-page tables," such as document 400 (FIGS. 4A-4B) comprising multiple values located on multiple pages. Id. at ¶ 37.
	Tuganbaev does not expressly disclose wherein at least a first one of the two or more values is dependent on at least a second one of the two or more values; and validating the at least the first one of the two or more values and the at least the second one of the two or more values according to one or more validation rules.
	Tillberg discloses wherein at least a first one of the two or more values is dependent on at least a second one of the two or more values; and validating the at least the first one of the two or more values and the at least the second one of the two or more values according to one or more validation rules. Tillberg discloses a "method employing data extraction technology to provide high accuracy data transfer and editing from paper document and scanned images into electronic machine text." Tillberg, ¶ 8. A system implementing the method comprises a "validation subsystem" performing "a consistency check utility that identifies errors by comparing the extracted data to … business rules." Extracted strings "can be matched or grouped together algorithmically to validate separate outputs via regular expression and logical relationships," for example a zip code should correlate to a town, which should further correlate to a street; and an age output should correlate with a birth date output. Id. at ¶ 29. The extracted data "is validated 350 using rules" and "entered 370 into the database." Id. at ¶ 55. Tillberg discloses that "in cases where there is internal consistency among fields within a form, those fields can be checked automatically for tuples of entries that fit to lexical or regular expression rules," such as "Towns/Cities match with States and Countries, whether Addresses have appropriate Zip codes and area codes, Gender may be checked against a lexicon of first names, Related data may be cross-checked, and Related Names may be checked." Id. at ¶ 69.
	It would have been obvious to a person having ordinary skill in the art before the claimed invention was made, having both Tuganbaev and Tillberg before them, to modify the method of extracting data from a multi-page document of Tuganbaev to incorporate the validation rules describing logical consistency between multiple fields as taught by Tillberg. One of ordinary skill in the art would be motivated to integrate the validation rules describing logical consistency between fields into Tuganbaev, with a reasonable expectation of success, in order to increase the accuracy of extracted data to a level "suitable for integration into databases." Tillberg, ¶ 8.
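	By way of illustration only, a cross-field consistency check of the kind discussed above (an age value checked against a birth date value, where the two values were extracted from different pages) might be sketched as follows. The names ExtractedValue and age_matches_birth_date, the ISO date format, the page numbers, and the sample values are assumptions made for this sketch; they do not appear in Tuganbaev, Tillberg, or the claims.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ExtractedValue:
    """Hypothetical record for one value captured from a multi-page document."""
    name: str
    page: int   # page of the document on which the value was found
    text: str   # recognized text for the value


def age_matches_birth_date(age: ExtractedValue,
                           birth: ExtractedValue,
                           as_of: date) -> bool:
    """Return True when the stated age agrees with the birth date on `as_of`.

    Assumes the birth date was recognized in ISO format (YYYY-MM-DD).
    """
    born = date.fromisoformat(birth.text)
    # Subtract one year if the birthday has not yet occurred in the as_of year.
    had_birthday = (as_of.month, as_of.day) >= (born.month, born.day)
    years = as_of.year - born.year - (0 if had_birthday else 1)
    return int(age.text) == years


# The dependent value (age) and the value it depends on (birth date)
# are located on different pages of the same document.
birth = ExtractedValue("birth_date", page=1, text="1980-06-15")
age = ExtractedValue("age", page=3, text="41")
consistent = age_matches_birth_date(age, birth, as_of=date(2022, 3, 24))
```

	In this sketch the rule operates on the pair of values rather than on either value alone, which is the sense in which one value is treated as dependent on the other.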

Claim 2
	Tillberg discloses wherein the validating comprises: accessing a set of validation rules in a library of validation rules. Tillberg discloses that an "automated processing module takes the output of recognized machine print and validates the output against rules and lexicon." Tillberg, ¶¶ 55, 60. Figure 12 illustrates a "consistency checking module" comprising rules 1220 (i.e. library of validation rules) used to validate the extracted output. Id. at ¶ 65.
	Tillberg discloses sequentially applying the set of validation rules to the at least the first one of the two or more values and the at least the second one of the two or more values. Tillberg uses an "edit path for recognized data," wherein "data is passed through various levels of quality assurance and editing steps until it is deposited in the database." The specific edit path chosen "is determine[d] by the level of accuracy required" and "the ability of the system to automatically validate and edit that data at any step." Id. at ¶¶ 50-51. Tillberg discloses validating data at steps of an edit path; accordingly, Tillberg discloses an ordering of validating data.
	Tillberg discloses marking the at least the first one of the two or more values and the at least the second one of the two or more values as having an error if one of the validation rules fails. Tillberg discloses that "forms are … presented 925 with some means of highlighting the element currently being validated or edited." Id. at ¶ 62.
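	For illustration only, the three recited steps (accessing a rule library, applying the rules in sequence, and marking the values as having an error if a rule fails) might be sketched as follows. RULE_LIBRARY, ZIP_PREFIX_TO_STATE, validate, and the field names are hypothetical and are not drawn from Tillberg; the two-entry prefix table is a toy stand-in for a real lexicon.

```python
from typing import Callable, Dict, List, Tuple

# Toy lexicon (assumption): three-digit ZIP prefix -> state abbreviation.
ZIP_PREFIX_TO_STATE: Dict[str, str] = {"100": "NY", "941": "CA"}

Values = Dict[str, str]
Rule = Tuple[str, Callable[[Values], bool]]

# Hypothetical "library" of validation rules, held in application order.
RULE_LIBRARY: List[Rule] = [
    ("zip_is_five_digits",
     lambda v: v["zip"].isdigit() and len(v["zip"]) == 5),
    ("state_is_two_letters",
     lambda v: len(v["state"]) == 2 and v["state"].isalpha()),
    ("zip_matches_state",
     lambda v: ZIP_PREFIX_TO_STATE.get(v["zip"][:3]) == v["state"]),
]


def validate(values: Values,
             rules: List[Rule] = RULE_LIBRARY) -> Tuple[List[str], Dict[str, bool]]:
    """Apply each rule in list order; mark every value if any rule fails."""
    failed = [name for name, check in rules if not check(values)]
    # Both dependent values are marked together when a rule fails.
    error_marks = {field: bool(failed) for field in values}
    return failed, error_marks


failed, marks = validate({"zip": "94105", "state": "NY"})
```

	Here a single failing rule marks both the zip value and the state value, since the sketch cannot tell which of the two dependent values is the incorrect one.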

Claim 17
	Tuganbaev discloses wherein the first and second ones of the two or more values are contained in a table or array of the multi-page document. Tuganbaev discloses that "[v]ery often on each page of a document there is a running title at the top and/or the bottom, with a table flowing over from one page to the next. In this case the running title interrupts the data contained in the table. Describing the running titles as a repeating group which occurs once on each page enables the system to detect it and remove from the table search area. The information about the number, make-up, and order of columns in the table is used by the system when going from one page to the next." Tuganbaev, ¶¶ 36-38.

Claim 18
	Tuganbaev discloses determining that a sequence of pages comprise the multi-page document, wherein determining that a sequence of pages in a stream of document page images comprise the single multi-page document includes processing each page individually to determine a corresponding page type; and processing the stream of page types to identify a sequence associated with a multi-page document type. Tuganbaev uses a "flexible structure description comprising descriptions of structures in [a] document and detection information to facilitate detection of said structures in the document images" (Tuganbaev, ¶ 9). The flexible structure description specifies "the likely number of pages for a document type" and describes "the structure of the first (Header) page and the last (Footer) page of the document." The flexible structure description "includes descriptions of all data fields to be detected and of all anchor elements and their relationships within the structure of the documents of a given type." Tuganbaev discloses that different types of multi-page documents have their own flexible structure description that "are used by [a] data capture system to identify documents in a batch of incoming page images, to detect the relevant data fields, and extract the data contained in the data fields" (id. at ¶¶ 19-26).
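	For illustration only, the two recited steps (typing each page individually, then scanning the resulting stream of page types for a sequence associated with a document type) might be sketched as follows. The keyword-based classify_page function, the type codes, and the header/body/footer pattern are assumptions chosen for this sketch; Tuganbaev's flexible structure descriptions are not limited to, and are not represented as operating in, this manner.

```python
import re
from typing import List, Tuple

# Hypothetical one-letter codes for page types.
TYPE_CODES = {"header": "H", "body": "B", "footer": "F", "other": "O"}

# Assumed document type: one header page, any number of body pages, one footer page.
DOCUMENT_PATTERN = re.compile(r"HB*F")


def classify_page(page_text: str) -> str:
    """Toy per-page classifier keyed on the first word of the recognized text."""
    text = page_text.strip().lower()
    if text.startswith("invoice"):
        return "header"
    if text.startswith("total"):
        return "footer"
    if text.startswith("item"):
        return "body"
    return "other"


def find_documents(pages: List[str]) -> List[Tuple[int, int]]:
    """Return (first_page, last_page) index pairs for each matched sequence."""
    # Step 1: process each page individually to determine its type.
    codes = "".join(TYPE_CODES[classify_page(p)] for p in pages)
    # Step 2: process the stream of page types to find document sequences.
    return [(m.start(), m.end() - 1) for m in DOCUMENT_PATTERN.finditer(codes)]


stream = ["Cover letter", "Invoice 17", "Item A 10", "Item B 5",
          "Total 15", "Invoice 18", "Total 0"]
documents = find_documents(stream)
```

	Encoding the page types as a string lets an ordinary regular expression stand in for the sequence matcher; a production system would likely use a richer grammar per document type.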

Claim 19


Claim 20
	Claim 20 recites a medium storing instructions for performing the steps of the method recited in claim 1. Accordingly, claim 20 is rejected as indicated in the rejection of claim 1.

Claims 3-6 and 9 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Tuganbaev, in view of Tillberg, further in view of Graf et al., U.S. PG-Publication No. 2006/0242180 A1.

Claim 3
	Graf discloses receiving a document type definition corresponding to the multi-page document, wherein the document type definition identifies the set of validation rules to be applied to the at least the first one of the two or more values and the at least the second one of the two or more values. Graf discloses a "workflow for extracting and warehousing data from semi-structured documents in any language." Graf, ¶ 3. For validation of documents, "the user may need to tell the system that the document type follows specific rules for data/time representations, numbers, character sets, character encodings, etc." Id. at ¶ 85. Further, the method "may include additional controls specific to the document type … to be extracted." For example, "user-specific … validation rules may be created, such as rules for financial statements that require that revenue be greater than net income line, that depreciation be less than total assets, etc." These rules may be "run upon completion of [an] auto-extractions process." Id. at ¶ 116.
	It would have been obvious to a person having ordinary skill in the art before the claimed invention was made, having both Tuganbaev-Tillberg and Graf before them, to modify the method of extracting data from a multi-page document of Tuganbaev-Tillberg to incorporate document type specific validation rules as taught by Graf. One of ordinary skill in the art would be motivated to integrate document type specific validation rules into Tuganbaev-Tillberg, with a reasonable expectation of success, in order to model each document "for increased [extraction] accuracy, when encountering future documents of the same type." Graf, ¶ 59.

Claim 4
	Graf discloses wherein the document type definition includes a mapping to document type fields to be used to apply each rule. Graf discloses that in one embodiment, "once the document is tagged with data associated to each desired term, the document's data point value-to-term name mapping is checked for accuracy." Id. at ¶ 72. Figure 6 illustrates a user interface "used to map or tag data point values to those terms creates using [a] Document Structure application." Id. at ¶ 79.

Claim 5
	Tuganbaev discloses identifying a document type of the multi-page document, and identifying the document type definition based on the identified document type of the multi-page document. Tuganbaev uses a "flexible structure description comprising descriptions of structures in [a] document and detection information to facilitate detection of said structures in the document images." Tuganbaev, ¶ 9; see also id. at ¶¶ 19-26.

Claim 6
Tuganbaev discloses wherein the document type contains one or more scalar fields and one or more tables of array fields. Tuganbaev discloses detecting data fields in a scanned document and extracting data from the fields using OCR. Tuganbaev, ¶ 35. Thus, Tuganbaev discloses extracting data from a single-dimensional field, i.e. scalar field. Tuganbaev also discloses extracting data in a two-dimensional table, comprised of repeating groups of fields. Id. at ¶ 24. Thus, Tuganbaev discloses extracting data from a multi-dimensional field, i.e. array field.

Claim 9
	Tuganbaev discloses combining data extracted from the respective pages into a form associated with the document type, wherein combining data extracted from the respective pages into a form associated with the document type includes forming an array that spans multiple pages concatenating a first set of rows of values extracted from a first page with a second set of rows of values extracted from a second page to create a combined set of rows to be included in the document type. Tuganbaev discloses that "[v]ery often on each page of a document there is a running title at the top and/or the bottom, with a table flowing over from one page to the next. In this case the running title interrupts the data contained in the table. Describing the running titles as a repeating group which occurs once on each page enables the system to detect it and remove from the table search area. The information about the number, make-up, and order of columns in the table is used by the system when going from one page to the next." Tuganbaev, ¶¶ 36-38.
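	For illustration only, concatenating the rows of a table that flows across pages, while dropping a running title that repeats once per page, might be sketched as follows. The function combine_table_rows, the row representation, and the sample data are assumptions made for this sketch and are not taken from Tuganbaev.

```python
from typing import List

Row = List[str]


def combine_table_rows(pages: List[List[Row]], running_title: Row) -> List[Row]:
    """Concatenate per-page rows in page order, skipping the repeated running title."""
    combined: List[Row] = []
    for rows in pages:
        # The running title interrupts the table once per page; remove it
        # so that the rows from consecutive pages join into one array.
        combined.extend(row for row in rows if row != running_title)
    return combined


title = ["Item", "Qty", "Price"]
page_1 = [title, ["Widget", "2", "10.00"], ["Gadget", "1", "4.50"]]
page_2 = [title, ["Sprocket", "5", "1.25"]]
table = combine_table_rows([page_1, page_2], running_title=title)
```

	The column order is assumed to be the same on every page, which corresponds to carrying the column make-up forward from one page to the next.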


Claims 7-8 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Tuganbaev, in view of Tillberg, further in view of Graf, further in view of Pandian et al., U.S. PG-Publication No. 2005/0289182 A1.

Claim 7
Pandian discloses extracting values from each page into per-page scalar and array fields by name. Pandian discloses a document management system comprising modules for image capture, optical character recognition, data extraction, and quality assurance. Pandian, Abstract. Pandian discloses capturing data from scanned documents using an image capture module 30. An image identification module 34 compares a scanned document image with a template library to determine the type of document. An optical character recognition module 36 and data extraction module 37 work to extract data from the scanned document image using zone information and save the extracted data in an XML format for verification. Id. at ¶¶ 78-83.
Id. at ¶¶ 600-632; figs.31-32.
	It would have been obvious to a person having ordinary skill in the art before the claimed invention was made, having both Tuganbaev-Tillberg-Graf and Pandian before them to modify the method of extracting data from a multi-page document of Tuganbaev-Tillberg-Graf to incorporate the user interface for verifying extracted data as taught by Pandian. One of ordinary skill in the art would be motivated to integrate the user interface for verifying extracted data into Tuganbaev-Tillberg-Graf, with a reasonable expectation of success, in order to extract data from documents in an efficient and labor saving manner that "significantly enhances document management quality by reducing errors and providing the ability to process unstructured forms." Pandian, ¶ 16.

Claim 8
	Pandian discloses wherein for each extracted value, a corresponding location on the page from which the value was extracted is saved. Pandian illustrates a 'verify data window' user interface in FIGS. 31-32. The user interface comprises a 'data pane' on the left side that "displays the data extracted from the document image." The data pane is a data entry form comprising (1) a 'dictionary entry field' showing a value name, e.g. BankName or AccountNumber, (2) an 'extracted value field' showing a snippet of the image extracted corresponding to the value in the dictionary entry field, (3) a 'found result field' showing the ASCII text derived from the snippet image corresponding to the value in the dictionary entry field, (4) 'navigation buttons' that enable the user "to navigate through the Dictionary Entry Fields in the current document," and (5) an approve button to "confirm all the dictionary entries in the last document image." Pandian saves the location of the document image corresponding to the extracted data and displays it in the 'extracted value field' as a snippet of the document image. That location is also highlighted in a full document image displayed in the right side of the verify data window. Pandian, ¶¶ 600-632; figs.31-32.
	

Claims 10-12, 14, and 16 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Tuganbaev, in view of Tillberg, further in view of Pandian et al., U.S. PG-Publication No. 2005/0289182 A1.

Claim 10
	Pandian discloses wherein the validating comprises providing one or more of the at least the first one of the two or more values and the at least the second one of the two or more values to a user for manual validation. Pandian discloses a quality assurance/verifier module 42 that "allows the user to verify and correct, for example, the extracted XML output from the OCR module 36. It shows the converted text and the source image side-by-side on the desktop display screen… It also permits the user to look at part of the image or the entire page in order to check for problems." Pandian, ¶ 120. Pandian illustrates a 'verify data window' user interface in FIGS. 31-32. The user interface comprises a 'data pane' on the left side that "displays the data extracted from the document image." The data pane is a data entry form comprising (1) a 'dictionary entry field' showing a value name, e.g. BankName or AccountNumber, (2) an 'extracted value field' showing a snippet of the image extracted corresponding to the value in the dictionary entry field, (3) a 'found result field' showing the ASCII text derived from the snippet image corresponding to the value in the dictionary entry field, (4) 'navigation buttons' that enable the user "to navigate through the Dictionary Entry Fields in the current document," and (5) an approve button to "confirm all the dictionary entries in the last document image." The user interface also comprises an 'image pane' on the right that displays the original document image with the current partial snippet location highlighted. The 'verify data window' enables a user to validate extracted data by visually comparing the found result field text with an adjacent extracted value snippet image, iterate through different extracted fields (i.e. dictionary entries) requiring validation, wherein the iteration does not require any other action for navigation, because as the user goes "from field to field, the red outline moves corresponding in the Image pane, and the image and values are updated in the Data pane." Id. at ¶¶ 600-632; figs.31-32.
	It would have been obvious to a person having ordinary skill in the art before the claimed invention was made, having both Tuganbaev-Tillberg and Pandian before them to modify the method of extracting data from a multi-page document of Tuganbaev-Tillberg to incorporate the 

Claim 11
	Pandian discloses presenting an interface to the user, wherein the interface displays to the user a plurality of fields which are marked as having errors and enables the user to iterate through the plurality of fields, wherein the plurality of fields displayed to the user include only dependent fields that require validation. Pandian discloses a quality assurance/verifier module 42 that "allows the user to verify and correct, for example, the extracted XML output from the OCR module 36. It shows the converted text and the source image side-by-side on the desktop display screen… It also permits the user to look at part of the image or the entire page in order to check for problems." Pandian, ¶ 120. Pandian illustrates a 'verify data window' user interface in FIGS. 31-32. The user interface comprises a 'data pane' on the left side that "displays the data extracted from the document image." The data pane is a data entry form comprising (1) a 'dictionary entry field' showing a value name, e.g. BankName or AccountNumber, (2) an 'extracted value field' showing a snippet of the image extracted corresponding to the value in the dictionary entry field, (3) a 'found result field' showing the ASCII text derived from the snippet image corresponding to the value in the dictionary entry field, (4) 'navigation buttons' that enable the user "to navigate through the Dictionary Entry Fields in the current document," and (5) an approve button to "confirm all the dictionary entries in the last document image." The user interface also comprises an 'image pane' on the right that displays the original document image with the current partial snippet location highlighted. Id. at ¶¶ 600-632; figs.31-32.

Claim 12
	Tuganbaev discloses identifying a document type of the multi-page document, and creating an instance of a selected one of a plurality of type-specific data entry forms in a forms library based at least in part on the identified document type of the multi-page document, wherein the plurality of fields displayed to the user are fields contained in the created instance. Tuganbaev uses a "flexible structure description comprising descriptions of structures in [a] document and detection information to facilitate detection of said structures in the document images." Tuganbaev, ¶ 9. The flexible structure description specifies "the likely number of pages for a document type" and describes "the structure of the first (Header) page and the last (Footer) page of the document." The flexible structure description "includes descriptions of all data fields to be detected and of all anchor elements and their relationships within the structure of the documents of a given type." Tuganbaev discloses that different types of multi-page documents have their own flexible structure description that "are used by [a] data capture system to identify documents in a batch of incoming page images, to detect the relevant data fields, and extract the data contained in the data fields." Id. at ¶¶ 19-26. Thus, a collection of the disclosed flexible structure descriptions is analogous to the claimed “data entry forms library.”

Claim 14
	Pandian discloses wherein the interface is configured to repetitively iterate through each of the plurality of fields until either the user enters data that clears the validation error associated with the field. The 'verify data window' enables a user to validate extracted data by visually comparing the found result field text with an adjacent extracted value snippet image, iterate through different extracted fields (i.e. dictionary entries) requiring validation, wherein the iteration does not require any other action for navigation, because as the user goes "from field to field, the red outline moves corresponding in the Image pane, and the image and values are updated in the Data pane." Pandian, ¶¶ 600-632; figs.31-32. Specifically, Pandian discloses using the Verify Data Window to "display an error message," and when the user is satisfied with the result, proceeding to click "Next Field" (i.e. iterate) to display the next dictionary entry's extracted value. Further, this verification procedure is repeated for each field (i.e. repeatedly iterating). Id. at ¶¶ 614-623.

Claim 16
	Pandian discloses wherein as each form field is displayed, a corresponding snippet or other partial image from a page from which a current data value associated with the form field was extracted is displayed adjacent to the field. Pandian discloses a quality assurance/verifier module 42 that "allows the user to verify and correct, for example, the extracted XML output from the OCR module 36. It shows the converted text and the source image side-by-side on the desktop display screen…" Pandian, ¶ 120; see also id. at ¶¶ 600-632; figs.31-32.


Claim 13 is rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Tuganbaev, in view of Tillberg, further in view of Pandian, further in view of Singh et al., U.S. PG-Publication No. 2011/0258182 A1.

Claim 13
	Singh discloses wherein the two or more values presented to the user for manual validation are identified by determining whether the first one of the two or more values matches the second one of the two or more values in the multi-page document. Singh discloses a document analysis method used to extract data from an electronic document page that includes multiple copies of a form. Singh, Abstract. Specifically, the method extracts the data from each copy of the form and compares the extracted data to see if each data instance is identical: "if all extracted data instances are identical, then the extracted data is considered to be correct with high confidence. Conversely, if extracted data instances are different then the extracted data is flagged." Id. at ¶¶ 260-279.
	Accordingly, it would have been obvious to one having ordinary skill in the art at the time the invention was made to modify the method disclosed in Tuganbaev-Tillberg-Pandian to include wherein the identifying the one or more form fields for which validation of the corresponding data by the user is required comprises: determining whether the dependent data value matches another data value included in the multi-page document, for the purpose of "improv[ing] the accuracy of data extraction by utilizing each copy of data on an image" in a multi-copy form. Singh, ¶ 260.
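	For illustration only, the comparison described above (treating a value as correct when every extracted instance is identical, and flagging it for manual validation otherwise) might be sketched as follows. The function fields_needing_review, the dictionary representation of each copy, and the sample values are assumptions made for this sketch and are not taken from Singh.

```python
from typing import Dict, List


def fields_needing_review(copies: List[Dict[str, str]]) -> List[str]:
    """Return the field names whose extracted instances disagree across copies."""
    if not copies:
        return []
    flagged: List[str] = []
    for field in copies[0]:
        # Collect the distinct values recognized for this field in every copy.
        instances = {copy.get(field) for copy in copies}
        if len(instances) > 1:
            flagged.append(field)  # mismatch: present to the user for validation
    return flagged


copy_a = {"account": "12345", "amount": "100.00"}
copy_b = {"account": "12345", "amount": "1OO.00"}  # letter O misread for zero
to_review = fields_needing_review([copy_a, copy_b])
```

	Only the mismatched field is returned, so a reviewer would be shown the amount but not the account number in this example.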


Claim 15 is rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Tuganbaev, in view of Tillberg, further in view of Pandian, further in view of Chou et al., U.S. PG-Publication No. 2012/0203676 A1.

Claim 15
	Chou discloses populating, based at least in part on the data extracted from the multi-page document, a plurality of fields of the instance of the selected data entry form including the plurality of fields displayed to the user. Chou discloses a method of extracting data from an image of a paper financial document and populating a financial datasheet with the extracted data. Chou, Abstract. Chou illustrates an interface wherein the document image data is shown in a viewing pane and the extracted financial data is displayed in a financial datasheet in a preview pane that "allows the user to preview the extracted information from the paper financial document." Chou expressly discloses that the image data may comprise multiple pages: "For the image data consisting of multiple pages, the user is able to select the page to be viewed by means of the page selector." Id. at ¶¶ 26-27. FIG. 6 illustrates a document image on the left wherein extracted data values are used to populate a single electronic data form on the right.
	Accordingly, it would have been obvious to one having ordinary skill in the art at the time the invention was made to modify the method disclosed in Tuganbaev-Tillberg-Pandian to include wherein the extracting data from two or more different pages included in the sequence comprises extracting the data values from pages comprising the multi-page document, and wherein the data values extracted from the pages comprising the multi-page document are used to populate a single electronic data entry form, for the purpose of enabling the user to display an image and data extracted from that image "side by side to provide easy comparison." Chou, ¶¶ 26-27.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FRANK D MILLS whose telephone number is (571)270-3172. The examiner can normally be reached M-F 10-6 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KAVITA PADMANABHAN, can be reached on (571)272-8352. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/FRANK D MILLS/
Primary Examiner, Art Unit 2176
March 24, 2022