DETAILED ACTION
This correspondence is responsive to the Continuation Application filed on November 22, 2021. Claims 1-14 are pending in the case, with claims 1, 5 and 11 in independent form. 

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Applicant’s claim for the benefit of a prior-filed application 16/192,028 under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, 365(c), or 386(c) is acknowledged. 

Terminal Disclaimer
The terminal disclaimer filed on August 31, 2022, disclaiming the terminal portion of any patent granted on this application which would extend beyond the expiration date of US Patent 11,120,209 (Application 16/925,815), US Patent 10,755,039 (Application 16/192,028), US Patent 11,188,713 (Application 16/192,028), and US Patent 10,769,425 (Application 16/101,763), has been reviewed and is accepted. The terminal disclaimer has been recorded.


EXAMINER'S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

Authorization for this examiner’s amendment was given in an interview with Jon Gibbons on August 25, 2022.

The application has been amended as follows: 

1.  (Currently Amended) A system for extracting information from an image of a document with textual content, the system comprising:
a computer memory capable of storing machine instructions; and
a hardware processor in communication with the computer memory, the hardware processor configured to access the computer memory, the hardware processor performing
accessing an image of at least one page of a filled form document, wherein the filled form document includes textual content with textual questions and textual answers and at least one graphical line separating a portion of the textual content;
extracting textual content in the image into a set of text lines and extracting a structural layout of the textual content, wherein the structure layout includes a grouping textual content;
creating a compositional hierarchy of textual content and the structural layout; and
	based on the compositional hierarchy being a known form type, performing vertical merging of two or more lines in the set of text lines based on a relative position of the textual content and an absence of the at least one graphical line separating the two or more lines.

2. (Currently Amended) The system of claim 1, further comprising:
assigning [[a]] the form type to the image based on the compositional hierarchy of the textual content and structural layout; 
	comparing the textual content with the form type to identify a set of textual questions; and
matching a textual answer to a textual question in the set of textual questions by a relative position of the textual question to a textual answer.

3. (Currently Amended) The system of claim 2, further comprising:
creating a logical view of each textual answer including an identifier to the textual question that has been matched; and
displaying each textual answer that has been matched to each textual question on the form type with a visual appearance distinct from a visual appearance of the set of textual questions on the form type.

4. (Original) The system of claim 3, wherein the displaying is updated based on a user selecting at least one of a confidence threshold value for the extracting the textual content, a confidence threshold for determining a textual answer to each textual question in the set of textual questions or a combination thereof.

5. (Currently Amended) A system for extracting information from an image of a document with textual content, the system comprising:
a computer memory capable of storing machine instructions; and
a hardware processor in communication with the computer memory, the hardware processor configured to access the computer memory, the hardware processor performing
accessing an image of at least one page of a document with textual content according to a structural layout separating portions of the textual content and at least one graphical line separating a portion of the textual content;
extracting the textual content from the image into a set of text lines and extracting a structural layout of the textual content from the image;
creating a compositional hierarchy of the textual content and the structural layout; and
in response to the compositional hierarchy being a known form type, performing a vertical merging to two or more text lines in the set of text lines based on a relative position of the text lines and an absence of the at least one graphical line separating the two or more text lines.

6. (Currently Amended) The system of claim 5, further comprising:
assigning [[a]] the form type to the image based on the compositional hierarchy of the textual content and structural layout; 
matching text in the form type to a subset of text lines to identify a known set of text that includes a set of known textual questions and a set of textual answer candidates, wherein the matching text 
determining a textual question to an answer candidate from the set of known textual questions by a relative position of the textual question to the set of textual answer candidates; and
using the form type to create a logical data structure of each textual answer.
 
7. (Currently Amended) The system of claim 6, wherein the matching text in the form type to the subset of text lines to identify a known set of text that includes a set of known textual questions which are similar in two different coordinate locations in the image for the form type.

8. (Currently Amended)  The system of claim [[5]] 6, wherein the determining the textual question to the answer candidate from the set of known textual questions by a relative position of the textual question to the set of textual answer candidates includes a null textual question to answer candidate match.

9. (Currently Amended) The system of claim 6, wherein the matching text in the form type to the subset of text lines to identify a known set of text that includes a set of known textual questions and a set of textual answer candidates further includes a set of known textual content for titles, sections, sub-sections, instructions or a combination thereof.

10. (Currently Amended) The system of claim 6, wherein the matching text in the form type to the subset of text lines to identify a known set of text that includes a set of known textual questions includes using a Levenshtein distance between text in the form type and the set of text lines.

11. (Currently Amended) A computer-based method of extracting information from an image of a document with textual content, the computer-based method comprising:
accessing an image of at least one page of a filled form document, wherein the filled form document includes textual content with textual questions and textual answers and at least one graphical line separating a portion of the textual content;
extracting textual content in the image into a set of text lines and extracting a structural layout of the textual content, wherein the structure layout includes a grouping textual content;
	creating a compositional hierarchy of textual content and the structural layout; and
based on the compositional hierarchy being a known form type, performing vertical merging of two or more lines in the set of text lines based on a relative position of the textual content and an absence of the at least one graphical line separating the two or more lines.

12. (Currently Amended) The computer-based method of claim 11, further comprising:
assigning [[a]] the form type to the image based on the compositional hierarchy of the textual content and structural layout; 
	comparing the textual content with the form type to identify a set of textual questions; and
matching a textual answer to a textual question in the set of textual questions by a relative position of the textual question to a textual answer.

13. (Currently Amended) The computer-based method of claim 12, further comprising:
creating a logical view of each textual answer including an identifier to the textual question that has been matched; and
displaying each textual answer that has been matched to each textual question on the form type with a visual appearance distinct from a visual appearance of the set of textual questions on the form type.

14. (Original) The computer-based method of claim 13, wherein the displaying is updated based on a user selecting at least one of a confidence threshold value for the extracting the textual content, a confidence threshold for determining a textual answer to each textual question in the set of textual questions or a combination thereof.
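
For illustration only, the vertical merging recited in the final limitation of claims 1, 5 and 11 can be sketched as follows. This is a minimal sketch and not the applicant's implementation: the names TextLine, RuleLine, vertical_merge and the max_gap parameter are assumptions introduced here, coordinates are assumed to increase downward, and only horizontal graphical lines in a single column are considered.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextLine:
    """Hypothetical text line with a bounding box (y grows downward)."""
    text: str
    x0: float
    y0: float  # top edge
    x1: float
    y1: float  # bottom edge


@dataclass
class RuleLine:
    """Hypothetical horizontal graphical line detected in the image."""
    y: float
    x0: float
    x1: float


def overlaps(a0: float, a1: float, b0: float, b1: float) -> bool:
    """True when the intervals [a0, a1] and [b0, b1] share some extent."""
    return min(a1, b1) > max(a0, b0)


def separated_by_rule(upper: TextLine, lower: TextLine,
                      rules: List[RuleLine]) -> bool:
    """True when a graphical line lies between the two text lines."""
    for rule in rules:
        between = upper.y1 <= rule.y <= lower.y0
        spans_both = (overlaps(rule.x0, rule.x1, upper.x0, upper.x1)
                      and overlaps(rule.x0, rule.x1, lower.x0, lower.x1))
        if between and spans_both:
            return True
    return False


def vertical_merge(lines: List[TextLine], rules: List[RuleLine],
                   max_gap: float = 6.0) -> List[TextLine]:
    """Merge vertically adjacent lines that no graphical line separates."""
    merged: List[TextLine] = []
    for line in sorted(lines, key=lambda l: (l.y0, l.x0)):
        if merged:
            prev = merged[-1]
            # Relative position: small vertical gap and shared horizontal extent.
            close = (line.y0 - prev.y1) <= max_gap
            aligned = overlaps(prev.x0, prev.x1, line.x0, line.x1)
            if close and aligned and not separated_by_rule(prev, line, rules):
                merged[-1] = TextLine(prev.text + " " + line.text,
                                      min(prev.x0, line.x0), prev.y0,
                                      max(prev.x1, line.x1), line.y1)
                continue
        merged.append(line)
    return merged


lines = [
    TextLine("Street", 0, 0, 50, 10),
    TextLine("Address", 0, 12, 60, 22),
    TextLine("City", 0, 30, 40, 40),
]
rules = [RuleLine(y=26, x0=0, x1=100)]
result = [l.text for l in vertical_merge(lines, rules, max_gap=10)]
print(result)
```

In this sketch the first two lines are merged because they are close and no graphical line lies between them, while the third line stays separate because of the graphical line at y=26.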

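For illustration only, the use of a Levenshtein distance between text in the form type and the set of text lines, as recited in claim 10, can be sketched as follows. This is a minimal sketch under stated assumptions: the function match_known_text and the max_ratio threshold are hypothetical and do not appear in the application, and the distance itself is the standard dynamic-programming edit distance.

```python
from typing import List, Optional


def levenshtein(a: str, b: str) -> int:
    """Standard edit distance (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def match_known_text(text_line: str, known_texts: List[str],
                     max_ratio: float = 0.25) -> Optional[str]:
    """Return the closest known form text, or None when nothing is close.

    max_ratio is a hypothetical tolerance: the allowed distance as a
    fraction of the longer string's length.
    """
    normalized = text_line.strip().lower()
    best, best_distance = None, None
    for known in known_texts:
        distance = levenshtein(normalized, known.strip().lower())
        if best_distance is None or distance < best_distance:
            best, best_distance = known, distance
    if best is None:
        return None
    limit = max_ratio * max(len(normalized), len(best.strip()))
    return best if best_distance <= limit else None


known_questions = ["Date of Birth", "Last Name"]
print(match_known_text("Date of Birtb", known_questions))  # OCR-style typo
print(match_known_text("John Smith", known_questions))     # likely an answer
```

A text line that matches no known text within the tolerance is left unmatched here, which is one way a line could be treated as an answer candidate rather than a question.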

Reasons for Allowance
The following is an examiner’s statement of reasons for allowance:

Upon entry of the above Examiner’s Amendment, and responsive to the approved Terminal Disclaimers filed August 31, 2022, claims 1-14 are allowed.

The closest known prior art of record consists of a newly cited Non-Patent Literature reference, "An Optimization Methodology for Document Structure Extraction on Latin Character Documents" (Liang et al., IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 23, No. 7, July 2001, pages 719-734), hereinafter Liang; Tillberg et al. (Pub. No. US 2007/0168382 A1), hereinafter Tillberg; Chakraborty (Pub. No. US 2004/0194035 A1); and Wshah (Patent No. US 9,374,501 B2).

Liang teaches extracting information from an image of a document with textual content; accessing an image of at least one page of a document, wherein the document includes textual content; extracting textual content in the image into a set of text lines and extracting a structural layout of the textual content, wherein the structure layout includes a grouping textual content; and creating a compositional hierarchy of textual content and the structural layout, in that Liang discloses construction of a document hierarchy, given an input document, and segmenting the document image into textual zones, text blocks and text lines. Liang, Section 3 Document Structure Representation, pages 721-723, 719-720. Liang teaches, based on the compositional hierarchy being known, performing vertical merging of two or more lines in the set of text lines based on a relative position of the textual content and an absence of the at least one graphical line separating the two or more lines, in that Liang discloses implementing a text-line extraction algorithm that groups text lines into zones based on the relative position of the text lines and examines whether the text lines are consistent or need to be split or merged. For example, two adjacent text lines may be merged in a vertical direction within the same zone if the merge results in a better fit with the text zone. Liang, Section 7. An Implementation: Text-Line Extraction Algorithm, pages 726-727, 728-731.
Tillberg teaches a method of extracting information from an image of a filled form document and accessing an image of at least one page of a filled form document, wherein the filled form document includes textual content with textual questions and textual answers and at least one graphical line separating a portion of the textual content, in that Tillberg discloses a method for accessing a scanned image of a filled form and extracting data from identified fields. The scanned filled form document is identified by comparing specific characteristics, including graphical line location and length or text content, against the templates. Tillberg, Abstract, Figs. 2, 3, 6, 10, 19, paragraphs 28-32, 63, 69, 71-76, 81, 151, 153, 155-156. Related mark fields within a form, such as two mark fields representing the "Yes" and "No" textual answers for the same textual question, may be clustered into groups. Mark field groups may be further clustered, if also related. Tillberg, Abstract, Figs. 2, 3, 9, 10, 19, paragraphs 73, 71-74, 155-156. Tillberg discloses extracting “lines in the image into a set of lines” and extracting a structural layout of the content, wherein the structure layout includes a grouping of content. Tillberg discloses line identification and fingerprinting processes that identify and catalog each line in a form to be used as a template. The lines and line clusters make up the scaffold structure for the template. Line identification is also used on input scans, and the line structure and clusters are compared to the templates to find the best match. Textual content is extracted from identified fields in the image using OCR. Tillberg, Abstract, Figs. 2, 3, 9, 10, 19, paragraphs 114-120, 81-82, 71-74, 28-33, 153-155.
Tillberg does not disclose extracting textual content in the image into a set of text lines, extracting a structural layout of the textual content that includes a grouping textual content, creating a compositional hierarchy of textual content and the structural layout and based on the form type being assigned to a known form type, performing vertical merging of two or more lines in the set of text lines based on a relative position of the textual content.
Chakraborty teaches creating a compositional hierarchy of textual content and the structural layout in that Chakraborty discloses performing document understanding and extracting form information from text portions and/or non-text portions (e.g., figures) that are located within the scanned form PDF document. The extracted form information is then saved in a structured manner that follows a predefined syntax and grammar. Preferably, the extracted form information is stored as an XML (Extensible Markup Language) file that follows a predefined DTD (document type definition). These XML files are referred to as Anchorable Information Unit (AIU) files and contain all relevant information regarding the structure, format, content, etc., of the corresponding electronic documents. Chakraborty, paragraphs 8-11, 17, 20, 22, 24. Chakraborty discloses that an AIU file is defined in a hierarchical manner. At the root, there is an AIUDoc definition which encompasses the header, footer and the extracted information, including the fields, sections and segments. With the above exemplary AIU specification, the root represents the entire document, which is then divided into "sub-documents" or sections. Chakraborty, paragraphs 57, 55-64, 33, 46, 47.
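
For illustration only, the hierarchical arrangement described above, with a root that encompasses a header, a footer and sections containing fields, can be sketched as follows. This is a minimal sketch and not Chakraborty's actual AIU specification: apart from the root name AIUDoc, every element and attribute name (Header, Footer, Section, Field, name) and the function build_aiu_doc are assumptions introduced here, and no DTD is applied.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def build_aiu_doc(header: str, footer: str,
                  sections: Dict[str, Dict[str, str]]) -> ET.Element:
    """Build a hypothetical hierarchical document tree.

    The root stands for the entire document; each section is a
    "sub-document" holding the fields extracted from it.
    """
    root = ET.Element("AIUDoc")
    ET.SubElement(root, "Header").text = header
    for section_name, fields in sections.items():
        section = ET.SubElement(root, "Section", {"name": section_name})
        for field_name, value in fields.items():
            field = ET.SubElement(section, "Field", {"name": field_name})
            field.text = value
    ET.SubElement(root, "Footer").text = footer
    return root


doc = build_aiu_doc(
    header="Sample Form",
    footer="Page 1",
    sections={"Applicant": {"Name": "Jane Doe", "City": "Denver"}},
)
print(ET.tostring(doc, encoding="unicode"))
```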
Wshah discloses a method that includes extraction of the filled-out content from the form. The extracted filled-out content is correlated to the geometrical features of the master form to create a new geometrical representation of the form. The master form includes one or more anchor fields that correspond to one or more anchor zones and one or more anchor segments that have global adjustment parameters and geometrical features. The method further includes adjusting the filled-out content in the form based on the global adjustment parameters having the highest global score for the anchor segments in the form. Wshah, col. 2, lines 3-24. Wshah does not disclose that based on the form type being assigned to a known form type, performing vertical merging of two or more lines in the set of extracted text lines based on a relative position of the textual content.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BARBARA M LEVEL whose telephone number is (303)297-4748. The examiner can normally be reached Monday through Friday 8:00 AM - 5:00 PM MT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Scott T Baderman can be reached on (571) 272-3644. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BARBARA M LEVEL/Examiner, Art Unit 2144

/SCOTT T BADERMAN/Supervisory Patent Examiner, Art Unit 2144