DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments with respect to the claim(s) have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Prior art references Xu et al., US 2020/0174870 A1 (Xu) and Murakawa, US 2018/0081535 A1 (Murakawa) have been newly added to assist in teaching the newly amended claim language. Prior art reference Chen et al., US 2017/0351913 A1 (Chen) (which was previously used to reject claims 13 and 26) has also been used to assist in rejecting the newly added amendments to claims 5 and 18.
The 35 USC 112(b) rejection made to claim 8 has been withdrawn due to Applicant’s amendments.
Claims 1, 5, 7-14, 18, and 20-35 are pending; claims 2-4, 6, 15-17, and 19 are cancelled; claims 1, 5, 7, 8, 14, 18, 20, 21, and 27 have been amended; and claims 28-35 have been newly added.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 25 and 26 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 25 recites the limitation "wherein the structured paper document comprises an item selected from a group consisting of an invoice and a receipt". There is insufficient antecedent basis for this limitation in the claim. This is because claim 25 depends from cancelled claim 15. The Examiner believes that Applicant meant for claim 25 to depend from independent claim 14 (just as corresponding method claim 12 depends from independent claim 1). Appropriate correction is required.
Claim 26 recites the limitation "wherein the structured paper document is crumpled". There is insufficient antecedent basis for this limitation in the claim. This is because claim 26 depends from cancelled claim 15. The Examiner believes that Applicant meant for claim 26 to depend from independent claim 14 (just as corresponding method claim 13 depends from independent claim 1). Appropriate correction is required.

Claim Rejections - 35 USC § 103
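In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.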
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 12, 14, 25, 27, and 32-35 are rejected under 35 U.S.C. 103 as being unpatentable over Zuev et al., US 2019/0385054 A1 (Zuev), Lau et al., US 2020/0143349 A1 (Lau), and further in view of Xu et al., US 2020/0174870 A1 (Xu).
Regarding claim 1, Zuev teaches a method comprising employing at least one hardware processor of a computer system (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) to:
receive a text token (such as a word) ([0032]) extracted from a document image (recognize text in the electronic document 140) (Fig. 1; [0032]),
the text token comprising a sequence of characters (wherein the word comprises a sequence of characters) ([0032]),
the document image comprising an encoding of an image (receiving a digital copy of the electronic document 140 by scanning a document or photographing the document) (Fig. 1; [0024]) of a structured paper document (wherein the electronic document 140 may be any suitable type, such as an invoice) (Fig. 1; [0023]),
the structured paper document (wherein the electronic document 140 may be any suitable type, such as an invoice) (Fig. 1; [0023]) partitioned into a plurality of fields (wherein the document is partitioned into a plurality of text fields in the electronic document) ([0017]) and having a plurality of text tokens distributed among the plurality of fields (wherein there are a plurality of words in the text fields) ([0017]),
each field of the plurality of fields having a distinct field type (wherein each text field is associated with a field type) ([0017]) characterizing a distinct category of information represented by text tokens located within the each field (wherein the field type may refer to a type of content included in a text field; such as “name,” “company name,” “telephone,” etc.)([0017]);
receive a token box indicator (a pseudo-image may be an artificially created image of a certain size) ([0042-0044]) comprising an indicator of a polygon enclosing a region of the document image (such as a rectangle enclosing a region such as a word) ([0042-0044]), the region containing an image of the text token (the region containing an image of the word) ([0042-0044]);
determine a text feature vector characterizing the text token as a whole (wherein the first plurality of layers 210 of the neural network 200 can be trained to produce vector representations of words; also referred to as “word vectors”) (Fig. 2; [0038]), the text feature vector determined according to the character sequence (the text feature vector determined using the sequence of characters, i.e. the next character may correspond to a previous character in one of the words) ([0039]); and
(using the vector representations of words from layers 210 of the neural network 200 to later output in output layer 250 one or more field type identifiers, each of the field type identifiers may identify a field type associated with one of the words)(Fig. 2; [0038] and [0049]) and the image feature (using the pseudo-image to determine information of field types of text fields in the electronic document)([0046]).
Zuev teaches generating an image of the text token using a rectangular area ([0042-0044]) and using that image to determine the field type ([0017]); however, Zuev does not explicitly teach determining an image feature vector characterizing the image of the text token as a whole, wherein the image feature vector is determined according to “a pixel content of the region of the document image”.
Lau teaches a system and method for auto-populating an electronic transaction process (Abstract), including determining an image feature vector characterizing the image of the text token as a whole (vectorization for the bounding box of an image portion) ([0156]), the image feature vector determined (the vector based on the image of the text, in this case the logo and the bounding box) ([0156-0170]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use not only a text feature vector but also an image feature vector within the bounding box since doing so increases the accuracy of identifying the text (Lau; [0049]).
However, neither Zuev nor Lau explicitly teaches wherein the image feature vector is determined according to “a pixel content of the region of the document image”.
Xu teaches text and content-based image document classification systems ([0016]); wherein the image feature vector (feature vector based on an image) ([0016]) determined according to a pixel content of the region (the feature vector being based on content such as image pixels representing the content in the document) ([0016]) of the document image (of the document in the classification system) ([0016]).
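It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior arts to include generating a region of the document into an image feature vector based on pixel content of the region since it assists in classifying the region (relevant features) and thus the document (Xu; [0016]).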
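For illustration only, the combination addressed above (a text feature vector determined from the character sequence, together with an image feature vector determined from the pixel content of the region enclosed by the token box) may be pictured with the following minimal Python sketch. It is not taken from Zuev, Lau, or Xu; the function names, the axis-aligned box format, the grayscale image layout, and the particular features are all assumptions chosen for brevity.

```python
from typing import List, Tuple

# Hypothetical token box: (left, top, right, bottom), right/bottom exclusive.
Box = Tuple[int, int, int, int]


def text_feature_vector(token: str) -> List[float]:
    """Assumed text features computed from the token's character sequence."""
    n = len(token) or 1  # avoid division by zero for an empty token
    digits = sum(ch.isdigit() for ch in token)
    letters = sum(ch.isalpha() for ch in token)
    other = len(token) - digits - letters
    return [float(len(token)), digits / n, letters / n, other / n]


def image_feature_vector(image: List[List[int]], box: Box) -> List[float]:
    """Assumed image features computed from the pixel content inside the box.

    `image` is a row-major grid of grayscale values in the range 0..255.
    """
    left, top, right, bottom = box
    pixels = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
    if not pixels:
        return [0.0, 0.0]
    mean_intensity = sum(pixels) / len(pixels) / 255.0
    dark_fraction = sum(p < 128 for p in pixels) / len(pixels)
    return [mean_intensity, dark_fraction]


def token_features(token: str, image: List[List[int]], box: Box) -> List[float]:
    """Concatenate both vectors, as a classifier input might be formed."""
    return text_feature_vector(token) + image_feature_vector(image, box)
```

In this sketch a downstream classifier (not shown) would receive the concatenated vector; how the cited references actually compute or combine such vectors is governed by the paragraphs cited above, not by this example.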

Regarding claim 12, Zuev teaches wherein the structured paper document (wherein the electronic document 140 is from an original document that has been scanned or photographed) ([0024]) comprises an item selected from a group consisting of an invoice (wherein the document type can be an invoice) ([0023]) and a receipt (wherein the electronic document may be of any suitable type such as a receipt) ([0016] and [0023]).

Regarding claim 14, see the rejection made to claim 1, as well as Zuev for a computing system comprising at least one hardware processor (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) configured to execute a text feature extractor (the processing device able to extract text)([0005]), an image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]), and a token classifier (classifier for the text fields also in system 800)(Fig. 8; [0019]) connected to the text feature extractor (the processing device able to extract text)([0005]) and the image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]) (wherein all the processing is connected together in system 800)(Fig. 8; [0071]), for they teach all the limitations within this claim.

Regarding claim 25, see the rejection made to claim 12, as well as Zuev for a computing system comprising at least one hardware processor (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) configured to execute a text feature extractor (the processing device able to extract text)([0005]), an image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]), and a token classifier (classifier for the text fields also in system 800)(Fig. 8; [0019]) connected to the text feature extractor (the processing device able to extract text)([0005]) and the image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]) (wherein all the processing is connected together in system 800)(Fig. 8; [0071]), for they teach all the limitations within this claim.

Regarding claim 27, see the rejection made to claim 1, as well as Zuev for a non-transitory computer-readable medium storing instructions (non-transitory machine readable storage medium)(Fig. 8; [0006] and [0071]); and a computing system comprising at least one hardware processor (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) configured to execute a text feature extractor (the processing device able to extract text)([0005]), an image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]), and a token classifier (classifier for the text fields also in system 800)(Fig. 8; [0019]) connected to the text feature extractor (the processing device able to extract text)([0005]) and the image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]) (wherein all the processing is connected together in system 800)(Fig. 8; [0071]), for they teach all the limitations within this claim.

Regarding claim 32, Zuev teaches a polygon enclosing a region of the document image (such as a rectangle enclosing a region such as a word) ([0042-0044]). Lau teaches a system and method for auto-populating an electronic transaction process (Abstract); wherein determine an image feature vector characterizing the image of the text token as a whole (vectorization for the bounding box of an image portion) ([0156]), the image feature vector determined (the vector based on the image of the text, in this case the logo and the bounding box) ([0156-0170]).
However, neither explicitly teaches determining the image feature vector further according to a content of a subset of pixels of the document image, the subset of pixels surrounding the polygon.
Xu teaches text and content-based image document classification systems ([0016]); wherein determining the image feature vector (feature vector based on an image) ([0016]) further according to a content of a subset of pixels (the feature vector being based on content such as image pixels representing the content in the document) ([0016]) of the document image (of the document in the classification system) ([0016]), the subset of pixels surrounding the polygon (wherein the pixels can be extracted from any relevant features of the content-based image classification system) ([0016]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior arts to include generating a region of the document into an image feature vector based on pixel content of the region since it assists in classifying the region (relevant features) and thus the document (Xu; [0016]).

Regarding claim 33, see the rejection made to claim 32, as well as Zuev for a computing system comprising at least one hardware processor (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) configured to execute a text feature extractor (the processing device able to extract text)([0005]), an image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]), and a token classifier (classifier for the text fields also in system 800)(Fig. 8; [0019]) connected to the text feature extractor (the processing device able to extract text)([0005]) and the image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]) (wherein all the processing is connected together in system 800)(Fig. 8; [0071]), for they teach all the limitations within this claim.

Regarding claim 34, Lau teaches wherein at least one element of the image feature vector is determined according to all pixels of the image of the text token (wherein the image feature vector can be generated by the pixels within the bounding box) ([0156-0170]). Xu teaches wherein at least one element of the image feature vector is determined according to all pixels of the image of the text token (the image feature vector being based on content such as image pixels representing the content) ([0016]) (wherein the content can be that of the image of the text token since the image feature vector is based on pixels that can be extracted from any relevant features) ([0016]).

Regarding claim 35, see the rejection made to claim 33, as well as Zuev for a computing system comprising at least one hardware processor (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) configured to execute a text feature extractor (the processing device able to extract text)([0005]), an image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]), and a token classifier (classifier for the text fields also in system 800)(Fig. 8; [0019]) connected to the text feature extractor (the processing device able to extract text)([0005]) and the image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]) (wherein all the processing is connected together in system 800)(Fig. 8; [0071]), for they teach all the limitations within this claim.

Claims 5, 13, 18, 26, and 28-31 are rejected under 35 U.S.C. 103 as being unpatentable over Zuev et al., US 2019/0385054 A1 (Zuev), Lau et al., US 2020/0143349 A1 (Lau), Xu et al., US 2020/0174870 A1 (Xu), and further in view of Chen et al., US 2017/0351913 A1 (Chen).
Regarding claim 5, Zuev teaches further comprising employing at least one processor of the computer system (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) to: construct text lines manually or automatically ([0025]); and determine the field type further according to an order of the token sequence (based on the word locations relative to each other determining the field types of the text fields in the electronic document)([0046]). Lau teaches that a text block may comprise zero or more line objects ([0052]); wherein each line object may comprise zero or more element objects, which represent words ([0052]); wherein the text block can span ([0052]); and to construct a plurality of text lines (generating text lines) ([0070]), each text line formed by a subset of the plurality of text tokens (each text line generated by text blocks that have been tokenized into lines) ([0070]). Xu teaches text and content-based image document classification systems ([0016]).
However, none of them explicitly teaches to construct a plurality of text lines, each text line formed by a subset of the plurality of text tokens selected “according to a location of each text token within the structured paper document”; and arrange the plurality of text tokens into an ordered token sequence according to the plurality of text lines.
Chen teaches a system and method for invoice field detection and parsing (Abstract); to construct a plurality of text lines (text line generation from characters) (Fig. 13; [0102]), each text line formed by a subset of the plurality of text tokens (each text line formed based on characters within that line; wherein those characters include words; e.g. tokens) ([0102] and [0111-0113]) selected according to a location of each text token within the structured paper document (selected based on position of the text, such as horizontal and vertical positioning of the characters with regard to the line being created) (Fig. 13; [0102-0112]); arrange the plurality of text tokens into an ordered token sequence according to the plurality of text lines (arranging/sorting the characters and words according to the plurality of text lines; ordering them based on which text line the characters/words belong to) (Fig. 13; [0102-0112]); and determine the field type further according to an order of the token sequence (based on the text lines determining the field type) (Figs. 16 and 18; [0139-0142] and [0149-0150]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior arts to include using text lines to determine a field type since it provides an accurate and robust parsing system (Chen; [0007]).
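For illustration only, the limitation discussed above (constructing text lines from tokens according to their location, then arranging the tokens into an ordered sequence according to those lines) may be pictured with the following minimal Python sketch. It is not Chen's procedure; the token format, the axis-aligned boxes, and the rule that a token joins a line when its vertical center falls within the line's first box are assumptions made for brevity.

```python
from typing import List, Tuple

# Hypothetical token: (text, (left, top, right, bottom)) in image coordinates.
Token = Tuple[str, Tuple[int, int, int, int]]


def build_text_lines(tokens: List[Token]) -> List[List[Token]]:
    """Group tokens into lines by vertical position (assumed rule)."""
    lines: List[List[Token]] = []
    # Visit tokens from top to bottom so lines are created in reading order.
    for token in sorted(tokens, key=lambda t: (t[1][1], t[1][0])):
        _, (_, top, _, bottom) = token
        center_y = (top + bottom) / 2
        for line in lines:
            _, (_, line_top, _, line_bottom) = line[0]
            if line_top <= center_y <= line_bottom:
                line.append(token)
                break
        else:
            # No existing line accepted the token, so it starts a new one.
            lines.append([token])
    for line in lines:
        line.sort(key=lambda t: t[1][0])  # left to right within each line
    return lines


def ordered_token_sequence(tokens: List[Token]) -> List[str]:
    """Concatenate the lines, top to bottom, into one token sequence."""
    return [text for line in build_text_lines(tokens) for text, _ in line]
```

A skewed or wrinkled page would need a more tolerant membership rule than the one assumed here.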

Regarding claim 13, Zuev teaches the structured paper document (wherein the electronic document 140 may be any suitable type, such as an invoice) (Fig. 1; [0023]). Lau teaches that a scanner, camera, or OCR engine can be used to obtain an image file of a transaction invoice ([0043]). Xu teaches text and content-based image document classification ([0016]). However, none of them teaches wherein the structured paper document is crumpled.
Chen teaches a system and method for invoice field detection and parsing including the steps of extracting character bounding blocks (Abstract); wherein the structured paper document is crumpled (wherein the original invoice may be crumpled or wrinkled)([0004] and [0052]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior arts to include being able to detect invoice fields even when the invoice is crumpled or wrinkled since it allows for increased modality of the system to work on documents that include noise while still providing accurate and robust invoice parsing (Chen; [0007]), by improving the image quality through image enhancement (Chen; [0044]).

Regarding claim 18, see the rejection made to claim 5, as well as Zuev for a computing system comprising at least one hardware processor (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) configured to execute a text feature extractor (the processing device able to extract text)([0005]), an image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]), and a token classifier (classifier for the text fields also in system 800)(Fig. 8; [0019]) connected to the text feature extractor (the processing device able to extract text)([0005]) and the image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]) (wherein all the processing is connected together in system 800)(Fig. 8; [0071]) and a line segmentation engine (text lines extracted using the system 800)(Fig. 8; [0025] and [0071]), for they teach all the limitations within this claim.

Regarding claim 26, see the rejection made to claim 13, as well as Zuev for a computing system comprising at least one hardware processor (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) configured to execute a text feature extractor (the processing device able to extract text)([0005]), an image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]), and a token classifier (classifier for the text fields also in system 800)(Fig. 8; [0019]) connected to the text feature extractor (the processing device able to extract text)([0005]) and the image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]) (wherein all the processing is connected together in system 800)(Fig. 8; [0071]), for they teach all the limitations within this claim.

Regarding claim 28, Lau teaches wherein the token sequence spans multiple text lines of the plurality of text lines (wherein the bounding text blocks are blocks of text that are grouped together but are not bounded by physical locality; instead a text block may comprise one or more line objects) ([0052] and [0070]).

Regarding claim 29, see the rejection made to claim 28, as well as Zuev for a computing system comprising at least one hardware processor (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) configured to execute a text feature extractor (the processing device able to extract text)([0005]), an image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]), and a token classifier (classifier for the text fields also in system 800)(Fig. 8; [0019]) connected to the text feature extractor (the processing device able to extract text)([0005]) and the image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]) (wherein all the processing is connected together in system 800)(Fig. 8; [0071]), for they teach all the limitations within this claim.

Regarding claim 30, Lau teaches comprising arranging the plurality of text tokens into the ordered token sequence according to a result of concatenating the plurality of text lines (wherein the bounding text blocks are blocks of text that are grouped together but are not bounded by physical locality; instead a text block may comprise one or more line objects; and each line object may comprise zero or more element objects, which represent words) ([0052] and [0070]).

Regarding claim 31, see the rejection made to claim 30, as well as Zuev for a computing system comprising at least one hardware processor (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) configured to execute a text feature extractor (the processing device able to extract text)([0005]), an image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]), and a token classifier (classifier for the text fields also in system 800)(Fig. 8; [0019]) connected to the text feature extractor (the processing device able to extract text)([0005]) and the image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]) (wherein all the processing is connected together in system 800)(Fig. 8; [0071]), for they teach all the limitations within this claim.

Claims 7, 8, 20, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Zuev et al., US 2019/0385054 A1 (Zuev), Lau et al., US 2020/0143349 A1 (Lau), Xu et al., US 2020/0174870 A1 (Xu), Chen et al., US 2017/0351913 A1 (Chen), and further in view of Murakawa, US 2018/0081535 A1 (Murakawa).
Regarding claim 7, Zuev teaches wherein constructing the plurality of text lines (generating text lines either manually or automatically) comprises: determining a text line including the text token (wherein the text lines include characters and words) ([0025]) according to a set of vertices of the polygon (word neighborhood information can be specified using a plurality of rectangles of words whose vertices are connected)([0043]); and determining whether a second text token of the plurality of text tokens belongs to the text line according to a distance between a second polygon and the first polygon (determining if the second rectangle of the second token/word has connected vertices to the first rectangle of the first token/word or is within a distance)([0043]), the second polygon enclosing a second region of the document image (the second rectangle enclosing a second word)([0043]), the second region containing an image of the second text token (the second region containing a second word)([0043]). Lau teaches to construct a plurality of text lines (generating text lines) ([0070]), each text line formed by a subset of the plurality of text tokens (each text line generated by text blocks that have been tokenized into lines) ([0070]). Xu teaches text and content-based image document classification systems ([0016]). Chen teaches a system and method for invoice field detection and parsing (Abstract); to construct a plurality of text lines (text line generation from characters) (Fig. 13; [0102]), each text line formed by a subset of the plurality of text tokens (each text line formed based on characters within that line; wherein those characters include words; e.g. tokens) ([0102] and [0111-0113]); and wherein determining whether a second text token of the plurality of text tokens belongs to the text line (determining whether the second character or word belongs to a certain text line) ([0102-0113]) according to a distance between a second polygon and the text line (according to the distance between the second character/word and the text line, such that the current character should have sufficient overlapping with the text line it belongs to, and that the current character should be close enough to the text line it belongs to) (Fig. 13; [0102-0113]).
However, none of them explicitly teaches determining a line guide of a text line.
Murakawa teaches a document viewing apparatus (Abstract); wherein a guide line drawing unit 24 generates a guide line to be added to the reference positions set by the reference position determination unit 23, and by outputting the guide line to the display control unit 21, draws and displays the guide line 52 on the display unit 14 with respect to the character string (Figs. 3 and 7; [0047] and [0050]); wherein the character string has to be within a predetermined distance to the guide line for the guide line to exist for that text line ([0073] and [0078]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior arts to include a line guide since it assists in detecting which words/character strings belong to the same text line (Murakawa; [0073]).
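For illustration only, the notion of a line guide determined from the vertices of one token polygon, with membership of a second token decided by its distance to that guide, may be pictured with the following minimal Python sketch. It is not taken from Murakawa or Chen; the vertex ordering, the choice of a guide through the midpoints of the polygon's left and right edges, the use of the second polygon's centroid, and the threshold are all assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def line_guide(polygon: List[Point]) -> Tuple[Point, Point]:
    """Return two points defining the guide for a four-vertex token polygon.

    Assumes vertices are ordered top-left, top-right, bottom-right,
    bottom-left; the guide passes through the midpoints of the left
    and right edges.
    """
    tl, tr, br, bl = polygon
    left_mid = ((tl[0] + bl[0]) / 2, (tl[1] + bl[1]) / 2)
    right_mid = ((tr[0] + br[0]) / 2, (tr[1] + br[1]) / 2)
    return left_mid, right_mid


def distance_to_guide(polygon: List[Point], guide: Tuple[Point, Point]) -> float:
    """Perpendicular distance from the polygon's centroid to the guide line."""
    (x1, y1), (x2, y2) = guide
    length = math.hypot(x2 - x1, y2 - y1)
    if length == 0:
        raise ValueError("line guide endpoints coincide")
    cx = sum(x for x, _ in polygon) / len(polygon)
    cy = sum(y for _, y in polygon) / len(polygon)
    return abs((x2 - x1) * (y1 - cy) - (x1 - cx) * (y2 - y1)) / length


def belongs_to_line(polygon: List[Point], guide: Tuple[Point, Point],
                    max_distance: float) -> bool:
    """Assumed membership test: the centroid lies within max_distance."""
    return distance_to_guide(polygon, guide) <= max_distance
```

Because the guide follows the first polygon's own edges, the same test applies to a rotated token box without change.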

Regarding claim 8, Zuev teaches comprising determining further according to a set of vertices of a third polygon enclosing a third region of the document image (word neighborhood information can be specified using a plurality of rectangles (i.e. a third rectangle can be used for a third word) of words whose vertices are connected)([0043]), the third region containing an image of a third text token of the plurality of text tokens (the third rectangle including a third word)([0043]), the third text token preceding the text token within the token sequence (wherein the third word can be in any location based on distance or connected vertices; i.e. before the “first” word)([0043]). Lau teaches to construct a plurality of text lines (generating text lines) ([0070]), each text line formed by a subset of the plurality of text tokens (each text line generated by text blocks that have been tokenized into lines) ([0070]). Xu teaches text and content-based image document classification systems ([0016]). Chen teaches a system and method for invoice field detection and parsing (Abstract); to construct a plurality of text lines (text line generation from characters) (Fig. 13; [0102]), each text line formed by a subset of the plurality of text tokens (each text line formed based on characters within that line; wherein those characters include words; e.g. tokens) ([0102] and [0111-0113]); and wherein determining whether a second text token of the plurality of text tokens belongs to the text line (determining whether the second character or word belongs to a certain text line) ([0102-0113]) according to a distance between a second polygon and the text line (according to the distance between the second character/word and the text line, such that the current character should have sufficient overlapping with the text line it belongs to, and that the current character should be close enough to the text line it belongs to) (Fig. 13; [0102-0113]).
However, none of them explicitly teaches determining a line guide of a text line.
Murakawa teaches a document viewing apparatus (Abstract); wherein a guide line drawing unit 24 generates a guide line to be added to the reference positions set by the reference position determination unit 23, and by outputting the guide line to the display control unit 21, draws and displays the guide line 52 on the display unit 14 with respect to the character string (Figs. 3 and 7; [0047] and [0050]); wherein the character string has to be within a predetermined distance to the guide line for the guide line to exist for that text line ([0073] and [0078]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior arts to include a line guide since it assists in detecting which words/character strings belong to the same text line (Murakawa; [0073]).
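For illustration only, the line-membership test characterized above (sufficient overlap with the text line, and closeness to it) can be sketched as follows. This is a minimal sketch and is not taken from Chen, Murakawa, or the application: the `Box` type, the function names, and the threshold values `min_overlap` and `max_gap` are all hypothetical, and axis-aligned rectangles stand in for the claimed polygons.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box (hypothetical type); y grows downward."""
    left: float
    top: float
    right: float
    bottom: float


def vertical_overlap_ratio(token: Box, line: Box) -> float:
    """Fraction of the token's height that overlaps the line's vertical extent."""
    height = token.bottom - token.top
    if height <= 0:
        return 0.0
    overlap = min(token.bottom, line.bottom) - max(token.top, line.top)
    return max(0.0, overlap) / height


def horizontal_gap(token: Box, line: Box) -> float:
    """Horizontal distance between the token and the line; 0 if they overlap."""
    if token.left > line.right:
        return token.left - line.right
    if line.left > token.right:
        return line.left - token.right
    return 0.0


def belongs_to_line(token: Box, line: Box,
                    min_overlap: float = 0.5,   # assumed threshold
                    max_gap: float = 20.0) -> bool:  # assumed threshold
    """A token joins the line if it overlaps it enough and is close enough."""
    return (vertical_overlap_ratio(token, line) >= min_overlap
            and horizontal_gap(token, line) <= max_gap)
```

Under these assumed thresholds, a token whose box sits at the same height as the line and 5 units to its right would be assigned to the line, while a token one row lower or 100 units away would not.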

Regarding claim 20, see the rejection made to claim 7, as well as Zuev for a computing system comprising at least one hardware processor (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) configured to execute a text feature extractor (the processing device able to extract text)([0005]), an image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]), and a token classifier (classifier for the text fields also in system 800)(Fig. 8; [0019]) connected to the text feature extractor (the processing device able to extract text)([0005]) and the image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]) (wherein all the processing is connected together in system 800)(Fig. 8; [0071]) and a line segmentation engine (text lines extracted using the system 800)(Fig. 8; [0025] and [0071]), for they teach all the limitations within this claim.

Regarding claim 21, see the rejection made to claim 8, as well as Zuev for a computing system comprising at least one hardware processor (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) configured to execute a text feature extractor (the processing device able to extract text)([0005]), an image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]), and a token classifier (classifier for the text fields also in system 800)(Fig. 8; [0019]) connected to the text feature extractor (the processing device able to extract text)([0005]) and the image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]) (wherein all the processing is connected together in system 800)(Fig. 8; [0071]) and a line segmentation engine (text lines extracted using the system 800)(Fig. 8; [0025] and [0071]), for they teach all the limitations within this claim.

Claims 9 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Zuev et al., US 2019/0385054 A1 (Zuev), Lau et al., US 2020/0143349 A1 (Lau), Xu et al., US 2020/0174870 A1 (Xu), Chen et al., US 2017/0351913 A1 (Chen), and further in view of Bui et al., US 2018/0373952 A1 (Bui).
Regarding claim 9, Zuev teaches wherein arranging the plurality of text tokens into the token sequence (arranging the words according to proximity of the words in the electronic document; wherein the electronic document is based on a structured document such as an invoice) ([0023] and [0042-0043]). Lau teaches that the embodiments of the system relate to natural language processing ([0037]). Xu teaches text and content-based image document classification systems ([0016]). Chen teaches a system and method for invoice field detection and parsing (Abstract). However, none of them explicitly teaches determining an ordering of the token sequence according to a natural language that the structured paper document is formulated in.
Bui teaches automated workflows for the identification of a reading order from text segments extracted from a document (Abstract); and wherein determining an ordering of the token sequence according to a natural language that the structured paper document is formulated in (determining an order for the text segments in the document based on trained natural language models)([0004]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior arts to include using natural language since it accurately detects the correct reading order (Bui; [0004]).

Regarding claim 22, see the rejection made to claim 9, as well as Zuev for a computing system comprising at least one hardware processor (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) configured to execute a text feature extractor (the processing device able to extract text)([0005]), an image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]), and a token classifier (classifier for the text fields also in system 800)(Fig. 8; [0019]) connected to the text feature extractor (the processing device able to extract text)([0005]) and the image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]) (wherein all the processing is connected together in system 800)(Fig. 8; [0071]) and a line segmentation engine (text lines extracted using the system 800)(Fig. 8; [0025] and [0071]), for they teach all the limitations within this claim.

Claims 10, 11, 23, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Zuev et al., US 2019/0385054 A1 (Zuev), Lau et al., US 2020/0143349 A1 (Lau), Xu et al., US 2020/0174870 A1 (Xu), Chen et al., US 2017/0351913 A1 (Chen), and further in view of Saitoh, 5,774,580 (Saitoh).
Regarding claim 10, Zuev teaches wherein arranging the plurality of text tokens into the token sequence (arranging the words according to proximity of the words in the electronic document; wherein the electronic document is based on a structured document such as an invoice)([0023] and [0042-0043]); and wherein the proximity of the words may be represented by a word neighborhood graph that is constructed based on data about the portions of the electronic document including the words (e.g., the projections of rectangular areas including words, a distance between the rectangular areas, etc.)([0043]). Lau teaches wherein the text can be determined to be associated with each other based on when they are horizontally aligned and within a distance from each other (Fig. 15; [0197]). Xu teaches text and content-based image document classification systems ([0016]). Chen teaches a system and method for invoice field detection and parsing (Abstract).
However, none of them explicitly teaches determining whether a first text token of the plurality of text tokens is located to the left of a second text token within the structured paper document.
Saitoh teaches extracting text regions from an input document image and classifying the text regions into in-order reading regions to be successively read in the predetermined order (Abstract); wherein determining whether a first text token of the plurality of text tokens is located to the left of a second text token within the structured paper document (determining if a text region rectangle is to the left of a second text region rectangle)(col. 20, lines 1-20); and in response, when yes, determining an ordering of the token sequence wherein the first text token precedes the second text token within the token sequence (wherein the fact that the text rectangle is to the left is used to determine the sorting order; i.e. the text to the left precedes the text to the right)(col. 20, lines 1-20).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior arts to include ordering based on the text being to the left since it allows for accurate construction of text regions in the precise reading order (Saitoh; col. 2, lines 28-33).
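For illustration only, the left-of ordering rule discussed above can be sketched as follows. This is a minimal sketch and is not Saitoh's procedure or the applicant's implementation: the `Token` type, its fields, and both function names are hypothetical, and each token is reduced to the left edge of its bounding rectangle.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """Hypothetical text token with the left edge of its bounding rectangle."""
    text: str
    left: float


def is_left_of(first: Token, second: Token) -> bool:
    """True when the first token's rectangle starts to the left of the second's."""
    return first.left < second.left


def order_left_to_right(tokens: List[Token]) -> List[Token]:
    """Order tokens so that any token located to the left precedes those to its right."""
    return sorted(tokens, key=lambda t: t.left)
```

With tokens `Token("42.00", 300)` and `Token("Total:", 50)`, the second is left of the first, so it precedes it in the resulting sequence.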

Regarding claim 11, Zuev teaches wherein arranging the plurality of text tokens into the token sequence (arranging the words according to proximity of the words in the electronic document; wherein the electronic document is based on a structured document such as an invoice)([0023] and [0042-0043]); and wherein the proximity of the words may be represented by a word neighborhood graph that is constructed based on data about the portions of the electronic document including the words (e.g., the projections of rectangular areas including words, a distance between the rectangular areas, etc.)([0043]). Lau teaches wherein the text can be determined to be associated with each other based on when they are vertically aligned and within a distance from each other (Fig. 15; [0197]). Xu teaches text and content-based image document classification systems ([0016]). Chen teaches a system and method for invoice field detection and parsing (Abstract).
However, none of them explicitly teaches determining whether a first text token of the plurality of text tokens is located closer to the top of the structured paper document than a second text token.
Saitoh teaches extracting text regions from an input document image and classifying the text regions into in-order reading regions to be successively read in the predetermined order (Abstract); wherein determining whether a first text token of the plurality of text tokens is located closer to the top of the structured paper document than a second text token (determining the text rectangle is located closer to the top than another text rectangle)(col. 20, lines 1-20); and in response, when yes, determining an ordering of the token sequence wherein the first text token precedes the second text token within the token sequence (giving priority to the text rectangle closer to the top and thus setting the order so that it precedes the second text rectangle)(col. 20, lines 1-20).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior arts to include ordering based on the text being above since it allows for accurate construction of text regions in the precise reading order (Saitoh; col. 2, lines 28-33).
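For illustration only, the top-first ordering rule discussed above can be sketched as follows. This is a minimal sketch and is not Saitoh's procedure or the applicant's implementation: the `PlacedToken` type and the function names are hypothetical, the y coordinate is assumed to grow downward from the top of the page, and breaking ties by the left edge is an added assumption.

```python
from typing import List, NamedTuple


class PlacedToken(NamedTuple):
    """Hypothetical text token with the top-left corner of its bounding rectangle."""
    text: str
    left: float
    top: float  # assumed: distance from the top edge of the page


def is_closer_to_top(first: PlacedToken, second: PlacedToken) -> bool:
    """True when the first token's rectangle is nearer the top of the page."""
    return first.top < second.top


def order_top_to_bottom(tokens: List[PlacedToken]) -> List[PlacedToken]:
    """Tokens nearer the top come first; ties are broken left to right (assumption)."""
    return sorted(tokens, key=lambda t: (t.top, t.left))
```

Under these assumptions, a header token at `top=10` precedes a total-amount token at `top=400`, and two tokens at the same height are kept in left-to-right order.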

Regarding claim 23, see the rejection made to claim 10, as well as Zuev for a computing system comprising at least one hardware processor (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) configured to execute a text feature extractor (the processing device able to extract text)([0005]), an image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]), and a token classifier (classifier for the text fields also in system 800)(Fig. 8; [0019]) connected to the text feature extractor (the processing device able to extract text)([0005]) and the image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]) (wherein all the processing is connected together in system 800)(Fig. 8; [0071]) and a line segmentation engine (text lines extracted using the system 800)(Fig. 8; [0025] and [0071]), for they teach all the limitations within this claim.

Regarding claim 24, see the rejection made to claim 11, as well as Zuev for a computing system comprising at least one hardware processor (computer system 100 which includes computing device 110, wherein the computing device 110 can include one or more computing devices 800, which includes a processing device 802)(Figs. 1 and 8; [0021-0022] and [0072]) configured to execute a text feature extractor (the processing device able to extract text)([0005]), an image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]), and a token classifier (classifier for the text fields also in system 800)(Fig. 8; [0019]) connected to the text feature extractor (the processing device able to extract text)([0005]) and the image feature extractor (pseudo-image extraction using system 800)(Fig. 8; [0045]) (wherein all the processing is connected together in system 800)(Fig. 8; [0071]) and a line segmentation engine (text lines extracted using the system 800)(Fig. 8; [0025] and [0071]), for they teach all the limitations within this claim.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Contact
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL J VANCHY JR whose telephone number is (571)270-1193. The examiner can normally be reached Monday - Friday 9am - 5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emily Terrell can be reached on (571) 270-3717. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MICHAEL J VANCHY JR/
Primary Examiner, Art Unit 2666
Michael.Vanchy@uspto.gov