Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Applicant’s Response
	In Applicant’s Response dated 7/29/21, the Applicant amended and argued previously rejected claims 1, 5, 8, 12, 15 and 19 of the Non-Final Rejection dated 4/29/21. 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Stadermann et al., United States Patent Publication 2013/0198123 A1 (hereinafter “Stadermann”), in view of Singh et al., United States Patent Publication 2011/0258182 A1 (hereinafter “Singh”), in further view of Xu et al., United States Patent Publication 2017/0330319 (hereinafter “Xu”).
Claim 1:
	Stadermann discloses:
generating a schema extraction that identifies (i) a set of input field elements of the input electronic form and (ii) an input category for the set of input field elements (see paragraph [0044]). Stadermann teaches a schema extraction that identifies input field elements and a section for input field elements;
accessing a hierarchal entity-data model having (i) an entity category and (ii) entity-data elements within the entity category, wherein the entity category and the entity-data elements are generated in the hierarchical entity-data model by, at least, aggregating a first document structure that is extracted from metadata structure of a completed document (see paragraphs [0040], [0043] and [0044]). Stadermann teaches accessing a hierarchical entity-data model having sections and elements. An extraction model may be specified, such as in metadata, and each extractor may utilize a library that includes a fixed or dynamic set of entities and regular expressions for a particular type of document. Stadermann also teaches that the extraction model may utilize the layout for the document to predictively determine the sections that should be included in the document, potentially the hierarchical arrangement of the sections within the document, and/or individual entity types that should be present within a section;
identifying an association between an element of the entity category of the hierarchal entity-data model and the input category of the schema extraction, wherein the association is identified based on one or more of (i) matching text in an entity category label to an input category label and (ii) matching a number of entity-data elements within the entity category to a number of input field elements of the set of input field elements within the input category (see paragraph [0056]);
verifying, based on identifying the association, that the entity category of the hierarchal entity-data model corresponds to the input category of the schema extraction by applying a natural language processing engine to the input field elements and the entity-data elements (see paragraphs [0002] and [0018]). Stadermann teaches verifying that the entity sections correspond to the inputs by applying processing to correctly recognize the extracted text; and
autocompleting one or more of the set of input field elements of the input electronic form with entity data from one or more of the entity-data elements of the hierarchal entity-data model (see paragraph [0067]). Stadermann teaches autocompleting one or more of the input field elements with entity data elements.
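
For illustration only, the association step recited above (matching category label text, or matching the number of elements in each category) may be sketched as follows. The sketch is not taken from Stadermann or from the application; the function name, the dictionary layout, and the case-insensitive label comparison are assumptions made for this example.

```python
def identify_association(entity_category, input_category):
    """Hypothetical helper: report an association when either test passes.

    (i)  the category labels match as text (case-insensitive, an assumption);
    (ii) the number of entity-data elements equals the number of input fields.
    """
    labels_match = (
        entity_category["label"].strip().lower()
        == input_category["label"].strip().lower()
    )
    counts_match = len(entity_category["elements"]) == len(input_category["fields"])
    return labels_match or counts_match


# Hypothetical data: one entity category and two candidate input categories.
employer_entity = {"label": "Employer", "elements": {"name": "Acme", "address": "1 Main St"}}
employer_input = {"label": "employer ", "fields": ["name", "address", "phone"]}
bank_input = {"label": "Bank", "fields": ["routing number"]}

print(identify_association(employer_entity, employer_input))
print(identify_association(employer_entity, bank_input))
```

The first call reports an association on the label test alone, even though the element counts differ; the second fails both tests.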

Stadermann fails to expressly disclose generating a schema extraction from an electronic form lacking data values for one or more fields. 

Singh discloses:
generating, from an input electronic form lacking data values for one or more fields, a schema extraction that identifies (i) input field elements of the input electronic form and (ii) an input category for the input field elements (see paragraphs [0314]-[0316]). Singh teaches generating extraction rules for further extracting data values that are lacking for a field, including identifying input fields and categories for the input fields.

Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to include a schema extraction for lacking data values for one or more fields of an electronic form for the purpose of efficiently distinguishing and correcting fields during the extraction process (see paragraphs [0314]-[0316]), as taught by Singh.

	Xu discloses:
wherein the entity category and the entity-data elements are generated in the hierarchical entity-data model by a second document structure that is detected with a trained neural network applied to a document image (see paragraphs [0003]-[0005]). Xu teaches that the model is generated by detecting a structure using a trained neural network. The neural network is used to extract features and create the model.

Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to include using a trained neural network to create a model to extract features from an image for the purpose of efficiently minimizing false positives when detecting structures and creating hierarchical relationships (see paragraphs [0003]-[0005]), as taught by Xu.

Claim 2:
	Stadermann discloses:
the operations further comprising generating the hierarchical entity-data model by:
applying a document structure detection network that extracts a set of features from a snapshot image (see paragraph [0020]). Stadermann teaches applying a document structure detection that extracts a set of features from the snapshot image;
determining a document structure based on the set of features extracted from the snapshot image, wherein the document structure comprises a hierarchical of document categories and document fields (see paragraphs [0021] and [0044]). Stadermann teaches determining the document layout and features from the snapshot image and determining the hierarchy of document sections and fields;
extracting, from metadata of an electronic document, a metadata structure comprising a hierarchical of metadata categories and metadata fields (see paragraphs [0044] and [0045]). Stadermann teaches extracting a metadata structure comprising a hierarchy of the categories and fields; and
aggregating the metadata structure and the document structure to obtain the entity category and one or more entity-data elements of the hierarchical entity-data model (see paragraphs [0043]-[0046]). Stadermann teaches aggregating the metadata structure and document structure to obtain the category and elements of the model.

Claim 3:
	Stadermann fails to expressly disclose identifying missing data elements and deriving a value from existing data elements in the model.

	Singh discloses:
wherein generating the hierarchical entity-data model further comprises: identifying at least one entity-data element of the hierarchical entity-data model that lacks a data value, and wherein the data value is lacking from the metadata structure and the document structure (see paragraphs [0310]-[0312]). Singh teaches identifying an element in the model and metadata that lacks a value;
deriving an entity-data value from existing entity-data elements in the hierarchical entity-data model (see paragraph [0311]). Singh teaches deriving a value from existing data elements in the hierarchical data model; and
wherein deriving the entity-data value comprises computing the entity-data value as a function of the existing entity-data elements, wherein the derived entity-data value is different from values of the existing entity-data elements (see paragraph [0312]). Singh teaches providing a name to a database of people who issued forms, comparing the form against that database to determine whether some elements match, and extracting information that did not previously exist. Singh also teaches known-value databases holding information about employers, banks and financial institutions, such as addresses and identification information, that was not already stored as entity-data elements.

Claim 4:
	Stadermann fails to expressly disclose using similarity score and relationship between the input element and data element.

	Singh discloses:
wherein verifying that the entity category corresponds to the input category by applying natural language processing comprises:
translating the input field elements and the entity-data elements into an input word vector and an entity-data word vector (see paragraphs [0145] and [0146]). Singh teaches translating the input to an image, applying transformations, and comparing the result to the stored trained image data element; and
determining a similarity score between the input word vector and the entity-data word vector, wherein the similarity score is determined based on: computing a distance between the input word vector and the entity-data word vector; or determining a relationship between the input word vector and the entity-data word vector based on semantic analysis (see paragraph [0148]). Singh teaches that a similarity score between the two images is determined by comparison.

Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to include a schema extraction for lacking data values for one or more fields of an electronic form for the purpose of efficiently distinguishing and correcting fields during the extraction process (see paragraphs [0314]-[0316]), as taught by Singh.
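
For illustration only, the claimed alternative of “computing a distance between the input word vector and the entity-data word vector” may be sketched with cosine similarity. This is one common vector comparison and is an assumption of this example; it is not asserted to be the measure used by Singh or by the application. The vectors and function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Distance form of the same comparison: smaller means more similar."""
    return 1.0 - cosine_similarity(u, v)


# Hypothetical word vectors for an input field label and an entity-data label.
input_vector = [1.0, 0.0]
entity_vector = [1.0, 0.0]
unrelated_vector = [0.0, 1.0]

print(cosine_similarity(input_vector, entity_vector))
print(cosine_distance(input_vector, unrelated_vector))
```

A similarity score near 1 (distance near 0) would indicate closely related labels; orthogonal vectors give a similarity of 0.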

Claim 5:
	Stadermann fails to expressly disclose verifying that the input category corresponds to the entity category by computing a confidence score that increases as the quantity of matches increases.

	Singh discloses:
wherein verifying that the entity category corresponds to the input category further comprises computing a confidence score based on a quantity of matches between input field elements from the input category and entity-data elements from the entity category or based on the quantity of matches between a plurality of input categories and a plurality of entity categories, wherein the confidence score increases as the quantity of matches increase (see paragraphs [0245]-[0246]). Singh teaches using the extracted candidate images in an iterative grouping process that, at each iteration step, attempts to merge existing groupings using a merging confidence. The merging confidence is determined from matching and mismatching: the more matches, the more merges and the higher the confidence.

Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to include a schema extraction for lacking data values for one or more fields of an electronic form for the purpose of efficiently distinguishing and correcting fields during the extraction process (see paragraphs [0314]-[0316]), as taught by Singh.
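
For illustration only, a confidence score that “increases as the quantity of matches increase” may be sketched as the fraction of input field labels that also appear among the entity-data labels. The scoring formula, the label normalization, and all names below are assumptions of this example; neither Singh's merging confidence nor the application's score is asserted to be computed this way.

```python
def confidence_score(input_labels, entity_labels):
    """Fraction of input field labels that match an entity-data label.

    For a fixed set of input fields, the score rises with each extra match.
    Labels are compared case-insensitively (an assumption of this sketch).
    """
    inputs = {label.strip().lower() for label in input_labels}
    entities = {label.strip().lower() for label in entity_labels}
    if not inputs:
        return 0.0
    return len(inputs & entities) / len(inputs)


# Hypothetical form fields compared against two entity categories.
form_fields = ["Name", "SSN", "Wages"]
one_match = confidence_score(form_fields, ["name"])
two_matches = confidence_score(form_fields, ["name", "ssn"])

print(one_match, two_matches)
```

With the input category held fixed, each additional matching label raises the score, which is the monotonic behavior recited in the claim.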

Claim 6:
	Stadermann discloses:
wherein autocompleting one or more of the input field elements with entity data comprises inserting, based on matching input field element labels and entity-data element labels, the entity data from the one or more entity-data elements into to the one or more input field elements (see paragraph [0067]). Stadermann teaches autocompleting one or more input field elements with data comprises inserting based on the matching labels.

Claim 7:
	Stadermann discloses:
wherein the entity category of the hierarchical entity-data model comprises an additional entity category, wherein the additional entity category comprises (i) a supplemental entity category, or (ii) entity data elements (see paragraphs [0057], [0058] and [0065]). Stadermann teaches the hierarchical model comprises additional entity category data which can include subsections. 

Claims 8-14:
	Although claims 8-14 are system claims, they are interpreted and rejected for the same reasons as the methods of Claims 1-7, respectively. 

Claims 15-20:
	Although claims 15-20 are computer readable medium claims, they are interpreted and rejected for the same reasons as the methods of Claims 1-6, respectively.

Response to Arguments
Applicant's arguments filed 7/29/21 have been fully considered but they are not persuasive. 
Claims 1, 8, 15:
	Applicant argues Stadermann does not teach or suggest “generating, from an input electronic form lacking data values for one or more fields, a schema extraction that identifies (i) a set of input fields elements of the input electronic form and (ii) an input category for the set of input field elements,” as recited in the independent claims (emphasis added).
	The Examiner disagrees. 


Applicant argues Singh does not teach or suggest “generating ... a schema extraction that identifies ... an input category for [a] set of input field elements,” as recited in the independent claims.
	The Examiner disagrees. 
Singh teaches generating extraction rules for further extracting lacking data values for a field. It includes identifying input fields and categories for the input fields (see paragraph [0314]-[0316]). The invention utilizes known constraints between the semantics of extracted data elements to correct potentially incorrectly extracted data. For example, the label “Social security wages” is to the left of the label “Social security 

Applicant argues Stadermann necessarily does not teach or suggest “identifying an association between an element of the entity category of the hierarchical entity-data model and the input category of the schema extraction based on one or more of (i) matching text in an entity category label to an input category label and (ii) matching a number of entity-data elements within the entity category to a number of input field elements of the set of input field elements within the input category,” as recited in the independent claims.
The Examiner disagrees.
Stadermann teaches identifying an association by matching the text in the section and the number of fields within the entity to be filled (see paragraph [0056]).
An expert may apply a business rule that determines a threshold definition relative to each party. The business rule is applied to the section (i.e., matching text with the section, because the rule is applied only to elements of that matching section) using a set that includes three slots A, B, and C. Slot A of “Threshold” matches with the extracted entity of “Threshold.” Slot B is descriptive of the defining term “Means.” It is noteworthy that each slot may include one or more properties that determine how the slot is to be filled. For example, the “Threshold” slot A includes the properties of “DISTANCE=40,” “RESET_OTHER,” and “ORDER=1.” The “DISTANCE=40” property (i.e., the number of properties to be filled has to match the number of entities) will fill the slot with the extracted entity data if the extracted entity data is within a given distance of “40” to extracted entity data from already filled slots of the set (i.e., identifying an association). It will be understood that the distance may be measured in characters. If the extracted entity data is not within the specified distance property, the slot is cleared. The “RESET_OTHER” property specifies that if the current slot is filled, all other slots will be cleared. Thus, Stadermann discloses this limitation and the rejection is maintained.
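
For illustration only, the slot behavior summarized above (a “DISTANCE” property measured in characters and a “RESET_OTHER” property) may be sketched as follows. This is not Stadermann's implementation. The class and function names, the use of character offsets, and the choice to require proximity to every already-filled slot are assumptions of this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slot:
    name: str
    distance: Optional[int] = None   # max character distance to filled slots
    reset_other: bool = False        # clear all other slots when this one fills
    value: Optional[str] = None
    position: Optional[int] = None   # character offset of the extracted entity


def fill_slot(slots: List[Slot], name: str, value: str, position: int) -> bool:
    """Try to fill the named slot; return True if it was filled."""
    target = next(s for s in slots if s.name == name)
    filled = [s for s in slots if s is not target and s.value is not None]
    # DISTANCE: the entity must lie within `distance` characters of the
    # entities in already-filled slots; otherwise the slot is cleared.
    if target.distance is not None and filled:
        if any(abs(position - s.position) > target.distance for s in filled):
            target.value, target.position = None, None
            return False
    target.value, target.position = value, position
    # RESET_OTHER: filling this slot clears every other slot in the set.
    if target.reset_other:
        for s in slots:
            if s is not target:
                s.value, s.position = None, None
    return True


# Hypothetical slot set loosely modeled on slots A, B, and C.
slots = [
    Slot("Threshold", distance=40, reset_other=True),
    Slot("Means"),
    Slot("Amount", distance=40),
]
first = fill_slot(slots, "Threshold", "Threshold", 100)   # nothing filled yet
near = fill_slot(slots, "Amount", "USD 10,000", 120)      # 20 characters away
far = fill_slot(slots, "Amount", "USD 99", 300)           # 200 characters away
print(first, near, far, slots[2].value)
```

In this sketch the second “Amount” candidate is rejected and the slot is cleared because it lies more than 40 characters from the filled “Threshold” slot.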

Applicant argues the combination of the cited documents does not teach or suggest verifying that the entity category of the hierarchal entity-data model corresponds to the input category of the schema extraction by applying a natural language processing engine.
	The Examiner disagrees.
	Stadermann teaches verifying that the entity sections correspond to the inputs by applying processing to correctly recognize the extracted text (see paragraphs [0002] and [0018]). When all, or a predetermined number, of the slots have been filled, the expert may verify or validate the entity data. An exemplary application of a business rule to assemble extracted entity data is described (see paragraph [0050]). In some instances, an expert 

Claims 5, 12, 19:
	Applicant argues the combination of the cited documents also does not teach or suggest “wherein verifying that the entity category corresponds to the input category comprises computing a confidence score based on a quantity of matches between input field elements from the input category and entity-data elements from the entity category or based on the quantity of matches between a plurality of input categories and a plurality of entity categories, wherein the confidence score increases as the quantity of matches increase,” as recited in dependent claims 5, 12, and 19.
	The Examiner disagrees. 
	Singh teaches using the extracted candidate images in an iterative grouping process that, at each iteration step, attempts to merge existing groupings using a merging confidence. The merging confidence is determined from matching and mismatching: the more matches, the more merges and the higher the confidence.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TIONNA M BURKE whose telephone number is (571)270-7259. The examiner can normally be reached M-F 8a-4p.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kavita Stanley, can be reached on (571)272-8352. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/KAVITA STANLEY/
Supervisory Patent Examiner, Art Unit 2176