DETAILED ACTION
Response to Arguments
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 1/19/2021 has been entered.

The Applicant has canceled claim(s) 21-22.
The application has pending claim(s) 12 and 14-20.

In response to the Request for Continued Examination filed on 1/19/2021:
The amendments addressing the “Objections to the claims” have been entered and therefore the Examiner withdraws the objections to the claims.
The amendments addressing the “Claim rejections under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph” have been entered and therefore the Examiner withdraws the rejections under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph.

Applicant's arguments with respect to claim(s) 12 and 14-20 have been considered but are moot in view of the new ground(s) of rejection because of the Request for Continued Examination (RCE). 
Applicant’s arguments, see page 6, filed 1/04/2021, with respect to the rejection(s) of claim(s) 12 and 14-20 under 35 U.S.C. 102 have been fully considered and are persuasive.  Therefore, the 35 U.S.C. 102 rejections have been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in further view of the newly found prior art reference Yellapragada et al (US 2018/0025222 A1).  Further discussions are addressed in the prior art rejection section below.  Therefore claims 12 and 14-20 are not in condition for allowance because they are not patentably distinguishable over the prior art references.

Claim Objections
Claims 12 and 19 are objected to because of the following informalities:  
Claim 12 at line 12; and claim 19 at line 14 respectively: “the accuracy” should be -- an accuracy --.

Appropriate correction is required.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 12 and 14-20 are rejected under 35 U.S.C. 103 as being unpatentable over Yellapragada et al (US 2018/0032804 A1, as applied in previous Office Action) in view of Yellapragada et al (US 2018/0025222 A1 – hereinafter referred to as Yellapragada ‘222).
Re Claim 12: Yellapragada discloses a method for extracting data from a document (see Yellapragada, Fig. 2, [0016], [0026]-[0028], [0037]-[0038], [0044]-[0045], [0055], recognize text on each of the extracted sections of the document), the method comprising: receiving, at a document input module, a document in a text format, the document containing a set of data that a user desires to have extracted (see Yellapragada, Fig. 2, [0016]-[0017], [0025]-[0028], [0037]-[0038], [0044]-[0045], [0048], capture an image of a document [e.g. tax form such as a W2], [0015], e.g. electronic version of the document such as portable document format PDF, etc., also binarization, [0055], computer processor implemented); assigning, by a locating module, a first signature to the document (see Yellapragada, Fig. 2, [0016], [0026]-[0028], [0037]-[0038], [0044]-[0045], generate a hash based on the image of the document, [0055], computer processor implemented); matching, by a training module, the first signature to a second signature of a template, wherein the template includes a location for every word or number included in the set of data (see Yellapragada, Fig. 2, [0016], [0026]-[0028], [0037]-[0038], [0044]-[0045], compare the generated hash to the hashes stored in the database of templates to select a match, the template has spatial information [e.g. size and location] of the elements and the elements include words or numbers [e.g. SSN, name, address, etc. of such a W2 tax form], [0055], computer processor implemented); locating, by the locating module, every word or number included in the 
	Although Yellapragada further discloses performing optical character recognition (OCR) to extract, e.g., the value “000-00-000” as corresponding to an “SSN” in the extracted SSN section during the extracting step, after locating every word or number included in the set of data in the document without utilizing OCR (see Yellapragada, [0016], [0026]-[0028], [0037]-[0038], [0044]-[0045], and more specifically [0038] and [0045]), Yellapragada fails to explicitly disclose generating, at a computing device, a validation score representing the accuracy of the extracting step.
	Yellapragada ‘222 discloses generating, at a computing device, a validation score representing the accuracy of the extracting step (see Yellapragada ‘222, [0041]-[0044], [0050], the validator and confidence level determiner receives the OCR data from the OCR operations to determine whether the OCR data is accurate based on the 
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yellapragada’s method, using Yellapragada ‘222’s teachings, by adding the validator / confidence level determiner process to Yellapragada’s OCR post-processing in order to improve the OCR output by validation or correction of errors in the OCR data (see Yellapragada ‘222, [0041]-[0044], [0050]).
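
For illustration only, the notion of a validation score representing the accuracy of an extracting step can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from either reference: the field names, the patterns in FIELD_PATTERNS, and the function validation_score are hypothetical, and the score is assumed to be the fraction of extracted fields whose text matches an expected format.

```python
import re

# Hypothetical expected formats per field (an assumption for illustration;
# neither reference is relied upon for these specific patterns).
FIELD_PATTERNS = {
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "zip": re.compile(r"\d{5}(-\d{4})?"),
}


def validation_score(extracted):
    """Return the fraction of checked fields whose text fully matches its pattern.

    `extracted` maps a field name to the text recognized for that field.
    Fields without a known pattern are ignored; if nothing can be checked,
    the score is 0.0.
    """
    checked = [name for name in extracted if name in FIELD_PATTERNS]
    if not checked:
        return 0.0
    passed = sum(
        1 for name in checked if FIELD_PATTERNS[name].fullmatch(extracted[name])
    )
    return passed / len(checked)
```

Under these assumptions, a score near 1.0 would indicate that the recognized text is well formed, and a low score would flag the extraction for correction or review.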

Re Claim 14: Yellapragada further discloses wherein the first signature includes a location for every word or number within the document (see Yellapragada, Fig. 2, [0016], [0026]-[0028], [0037]-[0038], [0044]-[0045], [0055], exact match will have same elements [e.g. SSN, name, address, etc. of such a W2 tax form] with same spatial sizes and locations respectively).

Re Claim 15: Yellapragada further discloses wherein every word or number included in the first signature is existing in the second signature (see Yellapragada, Fig. 2, [0016], [0026]-[0028], [0037]-[0038], [0044]-[0045], [0055], exact match will have same elements [e.g. SSN, name, address, etc. of such a W2 tax form] with same spatial sizes and locations respectively).

Re Claim 16: Yellapragada further discloses wherein the first signature has less words or numbers than the second signature (see Yellapragada, Fig. 2, [0016], [0026]-[0028], 

Re Claim 17: Yellapragada further discloses wherein the locating module determines a location of the every word or number that comprise the first signature and the set of data by absolute or relative positioning (see Yellapragada, Fig. 2, [0016], [0026]-[0028], [0037]-[0038], [0044]-[0045], [0055], exact match will have same elements [e.g. SSN, name, address, etc. of such a W2 tax form] with same spatial sizes and locations respectively).

Re Claim 18: Yellapragada further discloses wherein OCR is not utilized to determine the location of the words or numbers that comprise the first signature and the set of data (see Yellapragada, Fig. 2, [0016], [0026]-[0028], [0037]-[0038], [0044]-[0045], determine the spatial information [e.g. size and location] of the elements [e.g. SSN, name, address, etc. of such a W2 tax form] in the document based on the template, extract the sections of the document based on the spatial information and then [afterwards] perform OCR text recognition on the extracted sections of the document [OCR text recognition isn’t performed until the end: after the sections have been located and extracted], [0055], computer processor implemented).
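
For illustration only, the ordering discussed above (match a hash to a template, locate the sections from the template's spatial information, and only then recognize text) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of the cited reference: the page is modeled as rows of characters standing in for image pixels, the hash is computed over the first row only as a stand-in for whatever image features a real system would hash, and the names page_hash, crop, extract_fields, and recognize are hypothetical.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, height, width); an assumed convention
Page = List[str]                 # rows of characters standing in for image pixels


def page_hash(page: Page) -> str:
    """Hash the header row only (assumption: the header identifies the form layout)."""
    header = page[0] if page else ""
    return hashlib.sha256(header.encode("utf-8")).hexdigest()


def crop(page: Page, box: Box) -> Page:
    """Cut a rectangular section out of the page using template coordinates."""
    top, left, height, width = box
    return [row[left:left + width] for row in page[top:top + height]]


def extract_fields(
    page: Page,
    templates: Dict[str, Dict[str, Box]],
    recognize: Callable[[Page], str],
) -> Dict[str, str]:
    """Locate each field from the matched template, then recognize its text.

    `recognize` stands in for an OCR engine. It is called only on sections
    that were already located and cropped, so it plays no part in locating.
    """
    template = templates.get(page_hash(page))
    if template is None:
        return {}  # no template matched; nothing is located or recognized
    return {name: recognize(crop(page, box)) for name, box in template.items()}
```

In this sketch the recognizer never sees the whole page, which mirrors the point that text recognition occurs only after the sections have been located and extracted.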

Re Claim 19: Yellapragada discloses a non-transitory computer-readable medium having stored thereon computer-executable instructions that when executed by at least 
Although Yellapragada further discloses performing optical character recognition (OCR) to extract, e.g., the value “000-00-000” as corresponding to an “SSN” in the extracted SSN section during the extracting step, after locating every word or number included in the set of data in the document without utilizing OCR (see Yellapragada, [0016], [0026]-[0028], [0037]-[0038], [0044]-[0045], and more specifically [0038] and [0045]), Yellapragada fails to explicitly disclose generating a validation score representing the accuracy of the extracting step.
	Yellapragada ‘222 discloses generating a validation score representing the accuracy of the extracting step (see Yellapragada ‘222, [0041]-[0044], [0050], the validator and confidence level determiner receives the OCR data from the OCR operations to determine whether the OCR data is accurate based on the regular 
Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yellapragada’s non-transitory computer-readable medium, using Yellapragada ‘222’s teachings, by adding the validator / confidence level determiner process to Yellapragada’s OCR post-processing in order to improve the OCR output by validation or correction of errors in the OCR data (see Yellapragada ‘222, [0041]-[0044], [0050]).

Re Claim 20: Yellapragada further discloses wherein every word or number included in the set of data includes only the words and numbers the user desires to have extracted (see Yellapragada, Fig. 2, [0016], [0026]-[0028], [0037]-[0038], [0044]-[0045], [0055], the elements include words or numbers [e.g. SSN, name, address, etc. of such a W2 tax form], e.g. semi-automatically or automatically).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BERNARD KRASNIC whose telephone number is (571)270-1357.  The examiner can normally be reached on Mon. - Thur. and every other Friday from 8am - 4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an to.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph, can be reached at (571) 272-8243.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/Bernard Krasnic/
Primary Examiner, Art Unit 2661
March 15, 2021