DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 5-6, 8-9, 12-13, 15-16, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Klatt (PGPUB: 20200226162) in view of Chen (PGPUB: 20170351913).

Regarding claims 1, 8, and 15, Klatt teaches a system, comprising: 
a computing device comprising a processor and a memory; and machine-readable instructions stored in the memory that, when executed by the processor (see Fig. 2, paragraph 24, memory 214 of each of computing devices 210, 220, and 130 can store information accessible by the one or more processors 212, including instructions 216 that can be executed by the one or more processors 212. Memory can also include data 218 that can be stored, manipulated, or retrieved by the processor), cause the computing device to at least: 
receive a financial statement (see Fig. 4 and 5 item 401, paragraph 45, a pdf document, such as the financial statement 500, as shown in FIG. 5 may be received); 
identify unstructured text in the invoice (see Fig. 4, paragraph 45, to convert the document from a first file format, such as pdf, to a text format, such as a plain text format); 
convert the unstructured text in the invoice to structured text (see Fig. 4 and 5, paragraph 45, upon receiving the financial statement 500, the processing server 210 may convert the financial statement to a text format); and 
generate a structure preserved layout of the invoice that comprises the structured text (see Fig. 5 and 7, paragraph 58, the template building module may display an interface 700 showing the converted text document 701 created from financial statement 500. The user may select a letter, groups of letters, words, groups of words, numbers, groups of numbers, or any other element of the text). 
However, Klatt does not expressly teach to receive an invoice.
Chen teaches that the system can accept invoices imported from different devices, like a photo copy scanned by a scanner (1000), a digital copy directed imported from a computer (1001), and a picture taken by mobile devices such as tablet (1002) or smart phone (1003). The invoices are then imported to the parsing system (1004). Once the system receives the imported invoices, the invoice parsing program is executed (see Fig. 1, paragraph 40).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Klatt in view of Chen so that the invoices are imported to the parsing system (1004). Once the system receives the imported

Regarding claims 2, 9, and 16, the combination teaches wherein the invoice is received in a first format; and 
the machine-readable instructions, when executed by the processor, further cause the computing device to at least: 
convert the invoice to a second format in the form of an image file (see Chen, Fig. 3, paragraph 42, to convert the input PDF into an image that will be used in the pure image-based layout analysis), and 
wherein the machine-readable instructions that cause the computing device to identify the unstructured text in the invoice further cause the computing device to perform optical character recognition on the image file to identify the unstructured text in the invoice (see Chen, Fig. 5, paragraph 54, given the processed image after image enhancement (5001), the OCR workflow starts with checking if the original input is a searchable PDF (5002). If it is a searchable PDF, the text and character blocks have been extracted in Step 3000).  

Regarding claims 5, 12, and 19, the combination teaches wherein the invoice is a first invoice and the machine-readable instructions, when executed by the processor, further cause the computing device to at least
(see Klatt, Fig. 4, paragraph 49, upon assigning an identification key to a received document, the processing server may compare the assigned identification key to other stored documents, which were previously received and assigned an identification key. Documents with the same identification key may be considered duplicates).
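For illustration only, the duplicate check described in the cited passage (assign an identification key, then compare it against keys of previously stored documents) may be sketched as follows. This is a minimal sketch and not a characterization of Klatt's implementation: the cited paragraph does not say how the key is derived, so the SHA-256 hash of whitespace-normalized text, the function name identification_key, and the class DocumentStore are all assumptions introduced here.

```python
from __future__ import annotations

import hashlib


def identification_key(text: str) -> str:
    """Derive a key from document text (assumed scheme: SHA-256 of normalized text)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DocumentStore:
    """Hypothetical store that remembers the first document seen for each key."""

    def __init__(self) -> None:
        self._by_key: dict[str, str] = {}

    def add(self, doc_id: str, text: str) -> str | None:
        """Store the document; return the id of an earlier duplicate, if any."""
        key = identification_key(text)
        existing = self._by_key.get(key)
        if existing is None:
            self._by_key[key] = doc_id
        return existing


store = DocumentStore()
first = store.add("doc-1", "Invoice 1001   Total 250.00")
second = store.add("doc-2", "invoice 1001 total 250.00")
```

Under these assumptions, the second call returns the identifier of the first document because both texts normalize to the same key.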

Regarding claims 6, 13, and 20, the combination teaches wherein the machine-readable instructions, when executed by the processor, further cause the computing device to at least 
extract key-value pairs from the structure preserved layout (see Klatt, Fig. 5, paragraph 50, the metadata extracted from document 500 (i.e., "XYZ Capital--July Monthly Statement" and belonging to client 330) may be analyzed by processing server 210. Based on the analysis of processing server 210, a low level algorithm associated with statements issued by XYZ Capital to client 330 may be determined and retrieved from storage, such as algorithm database 251).



Claims 3, 10, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Klatt (PGPUB: 20200226162) in view of Chen (PGPUB: 20170351913), and further in view of Bliwas (PGPUB: 20190043146).

Regarding claims 3, 10, and 17, the combination teaches wherein the machine-readable instructions, when executed by the processor, further cause the computing device to at least: 
segment non-space separated words in the intermediate text based at least in part on a reference dictionary to generate a clean text (see Chen, Fig. 7, paragraph 68, generates text lines based upon the character nodes, and outputs a vector of text line nodes (6190). Based upon the text lines, the step 6200 segments each text line into words and outputs a vector of word nodes (6210). The step 6220 groups the text lines into paragraph zones and outputs a vector of paragraph zone nodes (6230)).
However, the combination does not expressly teach remove all spaces from the structured text to generate an intermediate text.
Bliwas teaches that an embodiment performs OCR to extract text from the document. This text may be placed into a temporary location or memory space and processed, e.g., to remove whitespaces, to identify word boundaries, etc. Having processed the extracted text, an embodiment initially searches for a key word or words, e.g., known life insurance company names (see Fig. 6, paragraph 35).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination in view of Bliwas so that the text is placed into a temporary location or memory space and processed, e.g., to remove whitespaces and to identify word boundaries, thereby removing all spaces from the structured text to generate an intermediate text. Therefore, combining the elements .
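For illustration only, the two claimed steps discussed above (removing all spaces to form an intermediate text, then segmenting the non-space separated words against a reference dictionary) may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Chen, Bliwas, or the application: the dynamic-programming segmentation, the function names remove_spaces and segment, and the sample dictionary are hypothetical.

```python
from __future__ import annotations


def remove_spaces(text: str) -> str:
    """Drop every whitespace character to form the intermediate text."""
    return "".join(text.split())


def segment(text: str, dictionary: set[str]) -> list[str] | None:
    """Split text into dictionary words; return None if no full split exists."""
    n = len(text)
    max_len = max((len(w) for w in dictionary), default=0)
    # best[i] holds one segmentation of text[:i], or None when none was found.
    best: list[list[str] | None] = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            prev = best[start]
            word = text[start:end]
            if prev is not None and word in dictionary:
                best[end] = prev + [word]
                break
    return best[n]


# Hypothetical reference dictionary and OCR-style input with stray spaces.
REFERENCE_DICTIONARY = {"invoice", "number", "total", "due"}
structured_text = "in voice num ber total due"
intermediate_text = remove_spaces(structured_text.lower())
words = segment(intermediate_text, REFERENCE_DICTIONARY)
clean_text = " ".join(words) if words is not None else intermediate_text
```

In this sketch, text that cannot be fully covered by dictionary words is left unsegmented rather than guessed at.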

Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Klatt (PGPUB: 20200226162) in view of Chen (PGPUB: 20170351913), and further in view of Knudson (US-PAT-NO: 10650086).

Regarding claims 7 and 14, the combination teaches wherein the machine-readable instructions, when executed by the processor, further cause the computing device to at least save the structure preserved layout as a comma separated value (CSV) file. 
Knudson teaches a database: any data structure (and/or combinations of multiple data structures) for storing and/or organizing data, including, but not limited to, relational databases (e.g., Oracle databases, MySQL databases, etc.), non-relational databases (e.g., NoSQL databases, etc.), in-memory databases, spreadsheets, as comma separated values (CSV) files, eXtendible markup language (XML) files, TeXT (TXT) files, flat files, spreadsheet files, and/or any other widely used or proprietary format for data storage. Databases are typically stored in one or more data stores (see Fig. 1, Col. 6 Na 7, lines 1-2 and 1-8).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination in view of Knudson to provide any data structure (and/or combinations of multiple data structures) for storing and/or organizing data, including, but not limited to, relational databases (e.g., Oracle databases, MySQL databases, etc.), in-memory databases, spreadsheets, comma separated values (CSV) files, and/or any other widely used or proprietary format for data storage, so as to save the structure preserved layout as a comma separated value (CSV) file. Therefore, the combination of the teaching, suggestion, or motivation in the prior art would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention.
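For illustration only, saving a structure preserved layout as a CSV file may be sketched as follows. This is a minimal sketch and not a characterization of Knudson or the application: the representation of the layout as rows of text cells, the function name layout_to_csv, and the sample values are assumptions. Only the Python standard library csv and io modules are used.

```python
import csv
import io


def layout_to_csv(rows: list[list[str]]) -> str:
    """Serialize layout rows (one list of cell strings per row) as CSV text."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical layout: each row keeps the left-to-right order of the cells.
layout = [
    ["Invoice No.", "INV-001"],
    ["Total", "1,250.00"],
]
csv_text = layout_to_csv(layout)
```

A cell that itself contains a comma, such as the amount above, is quoted by the writer so that the row structure survives a round trip.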


Allowable Subject Matter
Claims 4, 11, and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to XIN JIA whose telephone number is (571)270-5536. The examiner can normally be reached 9:00 am-7:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matthew Bella, can be reached at (571)272-7778. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/XIN JIA/Primary Examiner, Art Unit 2667