DETAILED ACTION
Claims 1-20 are pending in the Instant Application. 
Claims 1-20 are rejected (Non-Final Rejection). 

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
The Instant Application filed 01/31/2020 is a national stage entry of PCT/US2018/045001, International Filing Date: 08/02/2018, claiming priority from Provisional Application 62540279, filed 08/02/2017.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 31 January 2020 was considered by the examiner.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-3, 7-10 and 14-17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Takeya, United States Patent Application Publication No. 2009/0019010.

As per claim 1, Takeya teaches a computer implemented method for extracting data from a document, the method comprising: 
receiving, with one or more processors, the document ([0060] wherein the document is the image, i.e., the non-document data defined in [0013]); 
converting, with the one or more processors, the document to a text format ([0061] wherein the image data is supplied to an OCR process); 
performing, with the one or more processors, data extraction from the converted document ([0061] wherein data is extracted from the document); and 
generating, with the one or more processors, a result set including at least some of the extracted data ([0070] wherein the extracted data is returned as a result).  

As per claim 2, Takeya discloses the method of claim 1, wherein performing the data extraction includes: 
receiving, with the one or more processors, a selection of text from the converted document, wherein the selection of text includes one or more portions of text ([0065] wherein selections of text are received from the converted image after the document has been converted to text); and 
assigning, with the one or more processors, a respective tag to each of the one or more portions of text ([0065]-[0066] wherein the tag assigned to the data is the field name, which is associated with a value, the portion of text).

As per claim 3, Takeya discloses the method of claim 2, wherein the selection of text from the converted document is based on predefined criteria associated with a low level algorithm ([0064]-[0065] wherein the algorithm is the calculation of similarity, which leads to the selection of text from the converted document).  

As per claim 7, Takeya discloses the method of claim 1, wherein the document includes one or more of tables, fields, Unicode characters, and numbers ([Fig. 13] wherein numbers “1000” and “2” are described as being included in the document).  

As per claim 8, Takeya discloses a system for extracting data from a document, the system comprising: one or more processors ([0031] wherein a data processing unit is a processor) configured to perform the method of claim 1. As a result, claim 8 is rejected for the same rationale and reasoning as claim 1. 

As per claim 9, claim 9 is a system that performs the method of claim 2 and is rejected for the same rationale and reasoning. 

As per claim 10, claim 10 is a system that performs the method of claim 3 and is rejected for the same rationale and reasoning. 

As per claim 14, claim 14 is a system that performs the method of claim 7 and is rejected for the same rationale and reasoning. 



As per claim 15, claim 15 is the product that performs the method of claim 1 and is rejected for the same rationale and reasoning. 

As per claim 16, claim 16 is the product that performs the method of claim 2 and is rejected for the same rationale and reasoning. 
 
As per claim 17, claim 17 is the product that performs the method of claim 3 and is rejected for the same rationale and reasoning. 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 4-6, 11-13 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Takeya in view of Ho, United States Patent No. 10,769,357. 

As per claim 4, Takeya discloses the method of claim 3, but does not disclose validating the extracted data.  However, Ho teaches validating the extracted data ([Col 3, line 38-Col 4, line 13] wherein extracted data is validated automatically to identify fields where a human should validate the data). 
Both Takeya and Ho describe using extracted data from the OCR process. One could include the validation process from Ho with the OCR of documents described in Takeya to teach the claimed invention. It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the method of extracting text from a document using the OCR process in Takeya with the validation of extracted data in Ho in order to better verify the extracted data by correcting errors in the data. 

As per claim 5, note the rejection of claim 4 where Takeya and Ho are combined. The combination teaches the method of claim 4. Ho further teaches wherein, in the event the validation of the extracted data fails: receiving, from a user, a selection of text from the converted document, wherein the selection of text includes one or more portions of text ([Col 3, line 38-Col 4, line 13] wherein a subset of text portions is provided to the human user for review and correction); and assigning, with the one or more processors, a respective tag to each of the one or more portions of text ([Col 3, line 38-Col 4, line 13] wherein the tag is the field assigned to the extracted information).

As per claim 6, Takeya discloses the method of claim 1, but does not disclose wherein prior to performing the data extraction, validating that the conversion was successful.  However, Ho teaches wherein prior to performing the data extraction, validating that the conversion was successful ([Col 3, line 38-Col 4, line 13] wherein the entire converted document is validated, before further validation occurs). Both 

As per claim 11, claim 11 is a system that performs the method of claim 4 and is rejected for the same rationale and reasoning. 

As per claim 12, claim 12 is a system that performs the method of claim 5 and is rejected for the same rationale and reasoning. 

As per claim 13, claim 13 is a system that performs the method of claim 6 and is rejected for the same rationale and reasoning. 

As per claim 18, claim 18 is the product that performs the method of claim 4 and is rejected for the same rationale and reasoning. 

As per claim 19, claim 19 is the product that performs the method of claim 5 and is rejected for the same rationale and reasoning. 

As per claim 20, claim 20 is the product that performs the method of claim 6 and is rejected for the same rationale and reasoning. 
 


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KANNAN SHANMUGASUNDARAM whose telephone number is (571) 270-7763. The examiner can normally be reached M-F, 9:00 AM - 6:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fred Ehichioya, can be reached at (571) 272-4034. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197.





/KANNAN SHANMUGASUNDARAM/Primary Examiner, Art Unit 2168