DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office Action is in response to the patent application filed on October 17, 2019, for application number 16/655,426. Claims 1-23 have been considered. Claims 1, 10, 13, and 22 are independent claims.
This action is made Non-Final.

Information Disclosure Statement


The information disclosure statement (IDS) submitted on 10/17/2019 was filed prior to the mailing date of the First Office Action.  The submission is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Claim Objections
Claims 1-23 are objected to because of the following informalities: Regarding claim 1, line 2 recites “enterprise documents”, whereas line 6 recites the similar term “enterprise-documents”. Throughout the claim set, “enterprise document” and “enterprise documents” are not consistently written with or without a hyphen; therefore, all of claims 1-23 are objected to. Claims 5, 6, 17, and 18 recite “meta-data”, which appears to be a typographical variant of “metadata” as used in the parent claims. Regarding claim 8, line 17 recites “atleast”, which should read “at least”. Appropriate correction is required.

Prior Art
Listed herein below are the prior art references relied upon in this Office Action:
Cali et al. (US Patent Application Publication US 20190220660 A1), referred to as Cali herein.
Roy et al. (US Patent Application Publication US 20200057801 A1), referred to as Roy herein.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 10 and 22 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Cali.
Regarding claim 22, Cali discloses “A system for generating training data for developing a tool for automated data capture from incoming enterprise documents, the system comprising: 
a memory storing program instructions; a processor configured to execute program instructions stored in the memory; and a tool development engine in communication with the processor (Cali, at ¶ [0133], describes the processor operates under stored program control and executes software modules stored in memory such as persistent memory.) and configured to: 
extract one or more document records corresponding to respective plurality of historical enterprise-documents using an index matching technique (Cali, at ¶¶ [0056] and [0076], describes that certain text, such as a keyword, is positioned in set locations within the official document that characterize official documents of the first type, which is equivalent to the index matching technique described at para. [0035] of the original specification.); 
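generate a metadata for each of the plurality of historical enterprise-documents based on data point representation list associated with corresponding one or more document records, wherein the data point representation list includes multiple representations of data values associated with respective data points in the document records corresponding to historical enterprise documents (Cali, at ¶ [0015], describes generating a list of template data by a process comprising acquiring at least one document template image; extracting data from each document template image of the at least one document template image, the extracted data comprising an extracted keyword and a position of the extracted keyword; and combining the extracted data associated with each document template image. The list of template data is formed for a determined document type as described at ¶ [0014].); and 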
generate a representation template for respective historical enterprise documents based on the corresponding metadata (id. at ¶¶ [0015] and [0017], standard data of the determined document type is formed by extracting data for each document template image, and combining the extracted data associated with each document template image to form the standard data of the determined document type.), wherein the one or more data point identification models are generated using the plurality of historical enterprise documents of a category and the corresponding representation templates (Examiner notes that it is not clear how the category of each historical document is determined in the current independent claims. Therefore, examiner interprets the category of historical documents as corresponding to any type or classification of the document. id. at ¶ [0017], semantic labelling, equivalent to generating the data point identification models, is performed by a process that comprises retrieving standard data, which is comparable to the representation templates, of the determined document type or category, the standard data comprising keywords, which are equivalent to the data points of the instant application, keyword positions and expected data patterns.), wherein the data point identification models are implementable by the tool for automated data capture (id. at ¶ [0017], automated methods of labelling, such 
Independent claim 10 is directed towards a method equivalent to a system found in claim 22, and is therefore similarly rejected.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-9, 11-21, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Cali in view of Roy.
Regarding independent claim 1, Cali discloses “A method for developing a tool for automated data capture from incoming enterprise documents (Cali, at ¶ [0004], Automated data extraction from official documents.), wherein the method is implemented by at least one processor executing program instructions stored in a memory, the method comprising: 
generating, by the processor, a metadata for each of the plurality of historical enterprise-documents based on corresponding data point representation list, wherein the data point representation list includes multiple representations of data values associated with data points in the document records corresponding to historical enterprise documents (Cali, at ¶ [0015], describes generating a list of template data by a process comprising acquiring at least one document template image; extracting data from each document template image of the at least one document template image, the extracted data comprising an extracted keyword and a position of the extracted keyword; and combining the extracted data associated with each document template image. The list of template data is formed for a determined document type as described at ¶ [0014].); 
generating, by the processor, a representation template for each of the respective historical enterprise document based on the corresponding metadata  (id. at ¶¶ [0015] and [0017], standard data of the determined document type is formed by extracting data for each document template image, and combining the extracted data associated with each document template image to form the standard data of the determined document type.); 
generating, by the processor, one or more data point identification models for each category of historical documents using the plurality of historical enterprise documents of respective category and the corresponding representation templates (Examiner notes that it is not clear how the category of each historical document is determined in the current independent claims. Therefore, examiner interprets the category of historical documents as corresponding to any type or classification of the document. id. at ¶ [0017], semantic labelling, equivalent to generating the data point identification models, is performed by a process that comprises retrieving standard data, which is comparable to the representation templates, of the determined document type or category, the standard data comprising keywords, which are equivalent to the data points of the instant application, keyword positions and expected data patterns.); 
generating, by the processor, one or more data capture rules within each of the data point identification models, wherein the one or more data capture rules cause the corresponding data point identification model to capture a data value associated with data points in each incoming enterprise-document and transform the data values (id. at ¶ [0022], generating data capture rules corresponds to the textual classification attempted in Cali, which is performed by at least one of keyword classification, CNN classification, and visual classification to obtain classification data, which is equivalent to the data value associated with the keyword or label of the incoming document. Examiner notes that the term “transform” may refer to conditioning the data into the format or type in which it should be represented in the database, as described at para. [0028] of the original specification. Cali teaches a similar conditioning process at ¶ [0033].); and 
developing, by the processor, the tool for automated data capture from the generated one or more data point identification models and the generated one or more rules (Examiner notes that this step describes an intended result of performing the previous steps. Cali, at ¶ [0054], teaches application software 412 of Fig. 2, executed to extract data from the document, which is equivalent to the tool for automated data capture.).” Cali further discloses that the decoding includes contacting an external database to obtain classification data (id. at ¶ [0065]). However, Cali does not explicitly teach “for storage into another database”.
id. at ¶ [0040]).
Accordingly, it would have been obvious to one of ordinary skill in the art at the filing date of this application to combine Cali’s method with storing the output into a database after validating and changing format, as taught by Roy, because it would be helpful to build a corpus or collection of documents for document type identification (Roy, at ¶ [0056]).
Independent claim 13 is directed towards a system equivalent to a computer-implemented method found in claim 1, and is therefore similarly rejected.
Regarding claim 2, Cali in view of Roy teaches all the limitations of independent claim 1. Cali further teaches “wherein the plurality of historical enterprise-documents are classified into one or more categories based on a document type using one or more classification techniques (Cali, at ¶ [0018], teaches classifying the document image as a determined document type using a deep convolutional neural network classification of the document image.).”
Regarding claim 3, Cali in view of Roy teaches all the limitations of independent claim 1. Cali further teaches “wherein the classification technique includes categorizing the historical enterprise documents based on at least one of: appearance, frequency of occurrence of one or more terms in the historical enterprise documents, text layout and size of the document (Cali, at ¶ [0056], 
Regarding claim 4, Cali in view of Roy teaches all the limitations of independent claim 1. Cali further teaches “wherein one or more document records associated with respective historical enterprise- document are extracted using an index matching technique (Cali, at ¶¶ [0056] and [0076], describes that certain text, such as a keyword, is positioned in set locations within the official document that characterize official documents of the first type, which is equivalent to the index matching technique described at para. [0035] of the original specification.).”
Regarding claim 5, Cali in view of Roy teaches all the limitations of independent claim 1. Cali further teaches “wherein generating the metadata corresponding to each historical enterprise-document comprises: generating the data point representation list corresponding to the data values associated with respective data points in each of the document records using a reverse transformation technique (Cali, at ¶ [0084], describes a form of reverse or inverse transformation technique that maps the recognized data from the target image to the standard data of the official document, so that the keywords read by the blind OCR system and the keywords from the example image, which are equivalent to the data point representation list, are obtained by the mapping.); 
performing a search to identify each data point associated with each document record in the corresponding historical enterprise-documents based on the corresponding data point representation list (id. at ¶ [0084], describes searching the recognized data for each keyword from the standard data.); 
marking a position of each identified data point with a special annotation on the corresponding historical enterprise- documents (Examiner notes that “a special annotation” is a position of each identified data point, as described at para. [0026] of the original specification, which corresponds to the location of the keyword in Cali. Cali, at ¶ [0085], describes that a homography is used to match other textual data in the recognized data from the blind OCR to the location in the example image.); and 
generating the meta-data associated with respective historical enterprise-document based on corresponding special annotation and the data point representation list (id. at ¶ [0076], describes extracting data associated with the document template, the extracted data comprising an extracted keyword and a position of the extracted keyword; the extracted data for each document template image are then combined into a list of template data.).”
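For illustration only, the search-and-mark steps recited in claim 5 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Cali, Roy, or the instant specification: the function name `mark_data_points`, the sample text, and the representation list are hypothetical, and a plain substring search stands in for whatever search technique an actual implementation would use.

```python
from typing import Dict, List, Tuple


def mark_data_points(text: str, representations: Dict[str, List[str]]) -> Dict[str, Tuple[int, int]]:
    """Map each data point name to the (start, end) offsets of the first representation found."""
    positions: Dict[str, Tuple[int, int]] = {}
    for name, variants in representations.items():
        for variant in variants:
            start = text.find(variant)  # plain substring search (assumption)
            if start != -1:
                # Record the position; this plays the role of the "annotation".
                positions[name] = (start, start + len(variant))
                break
    return positions


# Hypothetical document text and representation list (several forms per data value).
document_text = "Invoice date: 17 Oct 2019  Total due: 1,250.00"
representation_list = {
    "invoice_date": ["2019-10-17", "10/17/2019", "17 Oct 2019"],
    "total": ["1250.00", "1,250.00"],
}
marked = mark_data_points(document_text, representation_list)
```

Here `marked` holds one character span per data point that was found; a data point with no matching representation is simply absent from the result.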
Regarding claim 6, Cali in view of Roy teaches all the limitations of independent claim 1. Cali further teaches “wherein the meta-data comprises information associated with one or more data points of an historical enterprise-document, position of each of the one or more data points in the historical enterprise-document, information associated with document type and document structure (Cali, at ¶¶ [0075]-[0076], a keyword is one or more words that are part of the standard format of official documents of a particular type, for example, “Passport Number”; the list of template data is then formed by extracting data from the image of an official document, the extracted data comprising an extracted keyword and a position of the extracted keyword.).”
Regarding claim 7, Cali in view of Roy teaches all the limitations of independent claim 1. Cali further teaches “wherein each representation template represents multiple data points and meta-data associated with the corresponding historical enterprise documents (Cali, at ¶ [0076], describes that the extracted data comprises a plurality of extracted keywords and their positions.).”
Regarding claim 8, Cali in view of Roy teaches all the limitations of independent claim 1. Cali further teaches “wherein generating one or more data capture rules within respective data point identification models comprises: 
performing a search for identifying each data point associated with respective document records in the corresponding historical enterprise-documents using the corresponding data point representation list and analyzing a pattern of appearance of the data value associated with each data point in the respective categories of enterprise documents (Cali, at ¶ [0017], describes semantic labelling performed by a process that comprises retrieving standard data of the determined document type, the standard data comprising keywords, keyword positions, and expected data patterns.); 
identifying a data transformation mechanism for the data values of the identified data points associated with respective enterprise-documents based on a relationship determined between the data value associated with each data point in the respective categories of enterprise documents and a data value in the corresponding historical enterprise-documents (id. at ¶ [0017], the process further comprises forming a homography (equivalent to the data transformation mechanism) that maps the standard data to the recognized data.); 
performing a check to determine availability of one or more keywords atleast before or after the data value associated with corresponding identified data points in respective historical enterprise-documents for each category (id. at ¶ [0029], Cali describes obtaining segmentation data by a process that comprises searching the extracted image data to find at least one text field and associating a label with each of the at least one text field.); 
performing a check to determine the availability of one or more static texts in respective historical enterprise-documents, if no keywords exists before or after the data value corresponding to the identified data points and building a relationship between the static text and the data values using one or more techniques selected from coordinate geometry and pattern matching technique (id. at ¶¶ [0084]-[0085], describes a semantic labelling process in which recognized text data is matched with the standard data by searching the recognized data for each keyword from the standard data, which is equivalent to determining availability of one or more keywords; the semantic labelling process then comprises calculating a homography between the example image of the official document and the image data, the homography is used to match other textual data in the recognized data from the blind OCR to the location in the example image, whereby additional classification data is acquired, and pattern based matching, preferably using regular expressions, is used to validate the text being matched to the positions in the official document.); and 
generating the one or more data capture rules for each category of historical documents using the identified data transformation mechanism and at least one of: the identified keywords and the static text associated with corresponding historical enterprise-documents (id. at ¶ [0017], Cali describes semantic labelling performed by a process that comprises retrieving standard data of the determined document type, wherein the standard data comprises keywords, and forming a homography that maps the standard data to the recognized data, which is equivalent to the data transformation mechanism.).”
Regarding claim 9, Cali in view of Roy teaches all the limitations of independent claim 1. Cali further teaches “wherein each static text is representative of the text that appears in multiple enterprise- documents of a category (Cali, at ¶ [0056], describes representative text, for example that a position 8 mm in from the left edge and 20 mm down from the top edge may have the term “DRIVING LICENSE” printed in 10 point size of a special font.).”
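For illustration only, a keyword-anchored data capture rule with pattern validation, of the general kind discussed in the claim 8 mapping above, can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Cali, Roy, or the instant specification: the function name `capture_after_keyword`, the sample text, and the regular expressions are hypothetical, and the homography-based matching described in Cali is not modeled.

```python
import re
from typing import Optional


def capture_after_keyword(text: str, keyword: str, value_pattern: str) -> Optional[str]:
    """Return the first value matching value_pattern that directly follows keyword, else None."""
    # Build a rule: the literal keyword, an optional separator, then the expected data pattern.
    rule = re.compile(
        re.escape(keyword) + r"\s*[:\-]?\s*(" + value_pattern + r")",
        re.IGNORECASE,
    )
    match = rule.search(text)
    return match.group(1) if match else None


# Hypothetical recognized text from a document image.
sample = "Surname: DOE\nPassport Number: X1234567\nDate of issue 01/02/2019"
passport_number = capture_after_keyword(sample, "Passport Number", r"[A-Z]\d{7}")
issue_date = capture_after_keyword(sample, "Date of issue", r"\d{2}/\d{2}/\d{4}")
```

The regular expression serves both to locate the value relative to the keyword and to validate that the captured text has the expected form; a keyword that is absent, or a value that does not fit the pattern, yields `None`.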
Regarding claims 11 and 12, the limitations are substantially similar to the steps included in the independent claim 1, and are therefore similarly rejected.
Claims 14-21 are directed towards a system equivalent to a method found in claims 2-9 respectively, and are therefore similarly rejected.
Regarding claim 23, the limitation is substantially similar to a step included in the independent claim 1, therefore, claim 23 is similarly rejected.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SEUNG W JUNG whose telephone number is (571)270-5249.  The examiner can normally be reached on Monday-Friday, 9:00am - 5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Scott Baderman, can be reached at (571)272-3644. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


SEUNG W. JUNG
Examiner
Art Unit 2144



/SCOTT T BADERMAN/
Supervisory Patent Examiner, Art Unit 2144