Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


EXAMINER'S AMENDMENT

An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

Authorization for this examiner’s amendment was given in an interview with Sam Yip on 3/3/2022.


1. (Currently Amended)     A method for recognizing and extracting information from data presented in a structured layout in a target source document, comprising:
providing a character classifier;
providing a graph neural network (GNN) having a pretrained feature embedding layer and a two-stage GNN mode;
extracting text characters in the structured layout in the target source document by the character classifier;
merging the text characters with two-dimensional positions thereof into n-gram characters by the character classifier;
extracting semantic features from the target source document by the pretrained feature embedding layer of the GNN, wherein the semantic features comprise word meanings;
manually defining spatial features of the target source document, wherein the spatial features comprise geometric features of text bounding boxes such as coordinates, heights, widths, and aspect ratios in the document;
using a convolution neural network (CNN) layer to obtain CNN image features of the target source document, wherein the CNN image features represent features of mid-point of a text box of the document and comprises one or more of font sizes and font types of the text characters, and explicit separators in the text of the document;
merging the n-gram characters into words and text lines by the GNN;
wherein the two-stage GNN mode having a first GNN stage and a second GNN stage;
wherein the first GNN stage comprises:
generating graph embedding spatial features from the spatial features;
wherein the second GNN stage comprises:
generating graph embedding semantic features and graph embedding CNN image features from the semantic features and the CNN image features, respectively;
merging the text lines into cells by the GNN;
grouping the cells into rows, columns, and key-value pairs by the GNN, wherein results of the grouping being represented by one or more adjacency matrices, and a row relationship among the cells, a column relationship among the cells, and a key-value relationship among the cells.
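For illustration only, the geometric features and the adjacency-matrix representation recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the names `Box`, `spatial_features`, and `groups_from_adjacency` are hypothetical, the GNN itself is omitted, and the adjacency matrix is assumed to be a symmetric 0/1 matrix in which entry (i, j) marks cells i and j as sharing a row (or a column).

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical text bounding box given by two corner points."""
    x0: float
    y0: float
    x1: float
    y1: float


def spatial_features(box):
    """Return coordinates, width, height, and aspect ratio of a box."""
    width = box.x1 - box.x0
    height = box.y1 - box.y0
    aspect = width / height if height else 0.0
    return [box.x0, box.y0, box.x1, box.y1, width, height, aspect]


def groups_from_adjacency(adj):
    """Read groups of cells off a symmetric 0/1 adjacency matrix.

    Each group is one connected component, found by depth-first search.
    """
    n = len(adj)
    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack = [start]
        component = []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if adj[i][j] and not seen[j]:
                    seen[j] = True
                    stack.append(j)
        groups.append(sorted(component))
    return groups


# Assumed example: four cells laid out as a 2 x 2 table.
row_adj = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
col_adj = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1]]
rows = groups_from_adjacency(row_adj)
cols = groups_from_adjacency(col_adj)
```

In this assumed example, cells 0 and 1 form one row and cells 2 and 3 form another, while cells 0 and 2 share a column, as do cells 1 and 3. A key-value adjacency matrix could be read in the same way.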

2. (Currently Amended)     The method of claim 1, further comprising:
generating content of a table in a form of editable electronic data according to the row relationship among the cells, the column relationship among the cells, and the key-value relationship among the cells.


11. (Currently Amended)   An apparatus for recognizing and extracting information from data presented in a structured layout, comprising:
a character classifier implemented by one or more processors configured to:
extract text characters in the structured layout in a target source document;
merge the text characters with two-dimensional positions thereof into n-gram characters;
a convolution neural network (CNN) layer implemented by the one or more processors further configured to obtain CNN image features of the target source document, wherein the 
a graph neural network (GNN) implemented by the one or more processors further; wherein the GNN having a two-stage GNN mode;
wherein the two-stage GNN mode having a pretrained feature embedding layer and a first GNN stage and a second GNN stage;
wherein the pretrained feature embedding layer is configured to extract semantic features from the target source document, wherein the semantic features comprise word meanings;
wherein the first GNN stage comprises:
generating graph embedding spatial features from spatial features of the target source document, the spatial features being manually defined and comprising geometric feature of text bounding boxes such as coordinates, heights, widths, and aspect ratios in the target source document;
wherein the second GNN stage comprises:
generating graph embedding semantic features and graph embedding CNN image features from the semantic features and the CNN image features, respectively;
wherein the GNN is configured to:
merge the n-gram characters into words and text lines;
merge the text lines into cells by the GNN;
group the cells into rows, columns, and key-value pairs, wherein results of the grouping being represented by one or more adjacency matrices, and a row relationship among the cells, a column relationship among the cells, and a key-value relationship among the cells.

12. (Currently Amended)   The apparatus of claim 11, wherein the GNN is further configured to generate content of a table in a form of editable electronic data according to the adjacency matrices.


one or more processors are further configured to store the content of the table into extensible markup language (XML).
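For illustration only, storing table content as XML from row and column groupings might look like the following. This is a minimal sketch under stated assumptions: the function `table_to_xml`, the element names `table`, `row`, and `cell`, and the attribute names are hypothetical and are not taken from the application. Only the Python standard library module `xml.etree.ElementTree` is used.

```python
import xml.etree.ElementTree as ET


def table_to_xml(texts, row_groups, col_groups):
    """Serialize cell texts to XML using row and column groupings.

    texts:      list of cell strings, indexed by cell number
    row_groups: list of lists of cell numbers, one list per row
    col_groups: list of lists of cell numbers, one list per column
    """
    # Map each cell number to the index of the column it belongs to.
    col_of = {cell: c for c, group in enumerate(col_groups) for cell in group}
    table = ET.Element("table")
    for r, group in enumerate(row_groups):
        row = ET.SubElement(table, "row", index=str(r))
        # Emit the cells of a row in column order.
        for cell in sorted(group, key=lambda i: col_of[i]):
            element = ET.SubElement(row, "cell", column=str(col_of[cell]))
            element.text = texts[cell]
    return ET.tostring(table, encoding="unicode")


# Assumed example data: a 2 x 2 table with a header row.
xml_text = table_to_xml(
    ["Item", "Price", "Pen", "2.00"],
    row_groups=[[0, 1], [2, 3]],
    col_groups=[[0, 2], [1, 3]],
)
```

The resulting string can be written to a file or parsed back with `ET.fromstring` for further editing.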


Reasons for Allowance
The following is an examiner’s statement of reasons for allowance: Qasim et al., “Rethinking table recognition using graph neural networks,” discloses:

A method for recognizing and extracting information from data presented in a structured layout in a target source document, comprising: providing a character classifier (see figure 2 caption OCR engine); 
providing a graph neural network (GNN) (see abstract GNN) having a pretrained feature embedding layer and a two-stage GNN mode;
extracting text characters in the structured layout in the target source document by the character classifier (see figure 2 caption words are extracted by OCR engine);
merging the text characters with two-dimensional positions thereof into n-gram characters by the character classifier (see figure 2 caption words and positions are extracted by OCR engine);
manually defining spatial features of the target source document, wherein the spatial features comprise geometric features of text bounding boxes (see section 4 Positional features);
using a convolution neural network (CNN) layer to obtain CNN image features of the target source document (see figure 2 CNN);
wherein results of the grouping being represented by one or more adjacency matrices, and a row relationship among the cells, a column relationship among the cells, and a key-value relationship among the cells.

Riba, “Table detection in invoice documents by graph neural networks,” discloses:
extracting semantic features from the target source document by the pretrained feature embedding layer of the GNN, wherein the semantic features comprise word meanings (see section b.).


The prior art of record does not disclose the combination of: wherein the CNN image features represent features of mid-point of a text box of the document and comprises one or more of font sizes and font types of the text characters, and explicit separators in the text of the document; merging the n-gram characters into words and text lines by the GNN; wherein the two-stage GNN mode having a first GNN stage and a second GNN stage; wherein the first GNN stage comprises: generating graph embedding spatial features from the spatial features; wherein the second GNN stage comprises: generating graph embedding semantic features and graph embedding CNN image features from the semantic features and the CNN image features, respectively; merging the text lines into cells by the GNN; grouping the cells into rows, columns, and key-value pairs by the GNN.



Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the 



Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SEAN T MOTSINGER whose telephone number is (571)270-1237. The examiner can normally be reached 9AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park, can be reached at (571)272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.