DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to communications: Application filed on 1/22/2019.
Claims 1-20 are pending. Claims 1, 13, and 18 are independent.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Yellapragada et al. (US2018/0032842) in view of Bettersworth et al. (US2017/0031894) and Han et al. (US2020/0223061).

In regards to claim 1, Yellapragada et al. substantially discloses a machine learning (ML) based data transformation system comprising: 
at least one processor (Yellapragada et al. para[0047]); 
a non-transitory processor readable medium (Yellapragada et al. para[0047]) storing machine-readable instructions that cause the at least one processor to: 
receive an input package including a plurality of documents and related metadata for mapping and evaluation (Yellapragada et al. para[0035], obtains set of input documents);  
extract entities and relationships between the entities included in the plurality of documents (Yellapragada et al. para[0036], extracts entities and relationships);  
determine name-value pairs associated with the entities (Yellapragada et al. fig. 5 para[0043], determines label-value pairs associated with entities); 
automatically produce mappings of the name-value pairs associated with the entities to output fields based on the metadata using a machine learning (ML) based relationship model and an ontology including the output fields (Yellapragada et al. para[0038] and [0043], uses metadata to produce mapping to output fields). 
Yellapragada et al. does not explicitly disclose categorize the plurality of documents into at least one domain based on similarity of the plurality of documents and a domain meta document;  
identify uniquely, each of the plurality of documents, by employing trained classifiers, the trained classifier uniquely identifying each of the plurality of documents based on the domain, document structure and document content.
However Bettersworth et al. substantially discloses categorize the plurality of documents into at least one domain based on similarity of the plurality of documents and a domain meta document (Bettersworth et al. para[0056]-[0057], categorizes documents into domains based on similarity with training data);  
identify uniquely, each of the plurality of documents, by employing trained classifiers, the trained classifier uniquely identifying each of the plurality of documents based on the domain, document structure and document content (Bettersworth et al. para[0027] and [0049]-[0051], trained classifier (inference engine) identifies each document based on document domain, document structure, and document content).
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).
Yellapragada et al. does not explicitly disclose enable execution of an automated process via transmitting the name-value pairs mapped to the output fields to an external robotic process automation (RPA) system.
However Han et al. substantially discloses enable execution of an automated process via transmitting the name-value pairs mapped to the output fields to an external robotic process automation (RPA) system (Han et al. Fig.4 para[0078]-[0080], maps name-value pairs to generate code for RPA system).
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the automation method of Han et al. in order to automate repetitive tasks (Han et al. para[0002]).

In regards to claim 2, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the data transformation system of claim 1, wherein the non-transitory processor readable medium stores further machine-readable instructions that cause the processor to: enable display of mappings of the name-value pairs with the output fields on a user interface associated with the data transformation system (Bettersworth et al. para[0032]).  
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).

In regards to claim 3, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the data transformation system of claim 1, wherein the instructions for categorizing the plurality of documents comprise further machine-readable instructions that cause the processor to:  
provide an initial categorization of the plurality of documents into two categories based on whether or not a document is processor-readable (Yellapragada et al. para[0042]); and 
convert documents which are not processor-readable into processor-readable documents using optical character recognition (OCR) prior to domain categorization of the plurality of documents (Yellapragada et al. para[0043]).  

In regards to claim 4, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the data transformation system of claim 1, wherein the instructions for categorizing the plurality of documents comprise further machine-readable instructions that cause the processor to: identifying structures included in each of the plurality of documents, the structures comprising headers and sub-headers (Yellapragada et al. para[0032]).  

In regards to claim 5, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the data transformation system of claim 4, wherein the instructions for categorizing the plurality of documents comprise further machine-readable instructions that cause the processor to: determine positions of the structures in each of the plurality of documents (Yellapragada et al. fig. 4 para[0036]).  

Bettersworth et al. para[0050] and [0067]).  
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).

In regards to claim 7, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the data transformation system of claim 6, wherein the instructions for obtaining the entities and the entity relationships include further machine-readable instructions that cause the processor to: generate tokens via tokenizing the strings in the list of strings; and tag the tokens with parts of speech (Bettersworth et al. para[0067]).  
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).

In regards to claim 8, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the data transformation system of claim 7, wherein the instructions for obtaining the entities and the entity relationships include further machine-readable instructions that cause the processor to: employ relationship models for extracting the relationships between the entities (Bettersworth et al. para[0074]).  
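It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).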

In regards to claim 9, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the data transformation system of claim 1, wherein the instructions for automatically producing mappings of the name-value pairs to output fields further machine-readable instructions that cause the processor to: 
collect training data pertaining to mapping the name-value pairs in the plurality of documents to output fields (Yellapragada et al. para[0035]); and 
training the ML based relationship model on the training data for producing the mappings (Yellapragada et al. para[0039]).  

In regards to claim 10, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the data transformation system of claim 9, wherein the ML based relationship model pertains to Long short-term Memory (LSTM) network (Bettersworth et al. para[0065]).  
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).

In regards to claim 11, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the data transformation system of claim 1, wherein the metadata is received in Java Script Notation Object (JSON) format (Bettersworth et al. para[0047]).  
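It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).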

In regards to claim 12, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the data transformation system of claim 1, wherein the plurality of documents pertain to financial statements and the metadata includes a spreading type for evaluating a risk rating for an entity associated with the financial statements (Yellapragada et al. para[0021]).  


In regards to claim 13, Yellapragada et al. substantially discloses a method of transforming data for enabling robotic process automation (RPA) comprising: 
receiving an input package including a plurality of documents and related metadata for mapping and evaluation  (Yellapragada et al. para[0035], obtains set of input documents);   
identifying one or more documents within the plurality of documents that are not in processor-readable formats (Yellapragada et al. para[0042]); 
converting the documents that are not in the processor-readable formats into processor-readable format using optical character recognition (OCR) (Yellapragada et al. para[0043]); 
extracting entities and relationships between the entities included in the plurality of documents (Yellapragada et al. para[0036], extracts entities and relationships);  
obtaining name-value pairs associated with the entities from the plurality of documents (Yellapragada et al. fig. 5 para[0043], determines label-value pairs associated with entities); 
automatically producing mappings of the name-value pairs associated with the entities to output fields based on the metadata using a machine learning (ML) based relationship model (Yellapragada et al. para[0038] and [0043], uses metadata to produce mapping to output fields). 
Yellapragada et al. does not explicitly disclose identifying uniquely, each of the plurality of documents via employing trained classifiers, the trained classifier uniquely identifying each of the plurality of documents based on document structure and document content;
However Bettersworth et al. substantially discloses identifying uniquely, each of the plurality of documents via employing trained classifiers, the trained classifier uniquely identifying each of the plurality of documents based on document structure and document content (Bettersworth et al. para[0027] and [0049]-[0051], trained classifier (inference engine) identifies each document based on document domain, document structure, and document content).
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).
Yellapragada et al. does not explicitly disclose executing an automated process using the name-value pairs mapped to the output fields.
However Han et al. substantially discloses executing an automated process using the name-value pairs mapped to the output fields (Han et al. Fig.4 para[0078]-[0080], maps name-value pairs to generate code for RPA system).
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the automation method of Han et al. in order to automate repetitive tasks (Han et al. para[0002]).

In regards to claim 14, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the method of claim 13, further comprising: categorizing the plurality Bettersworth et al. para[0056]-[0057], categorizes documents into domains based on similarity with training data).  
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).

In regards to claim 15, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the method of claim 13, wherein categorizing the plurality of documents further comprising: for each domain, 
calculating respective term weights for terms in each of the plurality of documents (Bettersworth et al. para[0059]), 
calculating average of the term weights for the plurality of documents (Bettersworth et al. para[0073]),
 identifying terms with the respective term weights greater than the average of the term weights (Bettersworth para[0066]).  
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).

In regards to claim 16, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the method of claim 15, wherein categorizing the plurality of documents further comprises: adding the identified terms to a corresponding domain meta document (Bettersworth et al. para[0024]).  
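It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).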

In regards to claim 17, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the method of claim 13, further comprising:  training the machine learning (ML) based relationship model using explicitly labelled data (Bettersworth et al. para[0020]).  
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).

In regards to claim 18, Yellapragada et al. substantially discloses a non-transitory processor-readable storage medium comprising machine-readable instructions that cause a processor to:  
receive an input package including a plurality of documents and related metadata for mapping and evaluation (Yellapragada et al. para[0035], obtains set of input documents); 
obtain entities and relationships between the entities included in the plurality of documents (Yellapragada et al. para[0036], extracts entities and relationships); 
determine name-value pairs associated with the entities from the plurality of documents (Yellapragada et al. fig. 5 para[0043], determines label-value pairs associated with entities); 
automatically produce mappings of the name-value pairs associated with the entities to output fields based on the metadata using a machine learning (ML) based relationship model (Yellapragada et al. para[0038] and [0043], uses metadata to produce mapping to output fields).  
Yellapragada et al. does not explicitly disclose categorize the plurality of documents into at least one domain based on similarity between the plurality of documents and a corresponding domain meta document; 
identify uniquely each of the plurality of documents via employing trained classifiers, the trained classifiers uniquely identifying each of the plurality of documents based on the domain, document structure and document content.
However Bettersworth et al. substantially discloses categorize the plurality of documents into at least one domain based on similarity between the plurality of documents and a corresponding domain meta document (Bettersworth et al. para[0056]-[0057], categorizes documents into domains based on similarity with training data); 
identify uniquely each of the plurality of documents via employing trained classifiers, the trained classifiers uniquely identifying each of the plurality of documents based on the domain, document structure and document content (Bettersworth et al. para[0027] and [0049]-[0051], trained classifier (inference engine) identifies each document based on document domain, document structure, and document content).
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).
Yellapragada et al. does not explicitly disclose enable execution of an automated process via transmitting the name-value pairs mapped to the output fields to a robotic process automation (RPA) system.  
However Han et al. substantially discloses enable execution of an automated process via transmitting the name-value pairs mapped to the output fields to a robotic process automation (RPA) system (Han et al. Fig.4 para[0078]-[0080], maps name-value pairs to generate code for RPA system).  
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the automation method of Han et al. in order to automate repetitive tasks (Han et al. para[0002]).


In regards to claim 19, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the non-transitory processor-readable storage medium of claim 18, wherein the instructions for categorizing the plurality of documents into at least one domain further comprising instructions that cause the processor to:  for each domain, 
calculate respective term weights for terms in each of the plurality of documents (Bettersworth et al. para[0059]), 
calculate average of the term weights for the plurality of documents (Bettersworth et al. para[0073]), 
identify terms with the respective term weights greater than the average of the term weights (Bettersworth para[0066]).  
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).

In regards to claim 20, Yellapragada et al. as modified by Bettersworth et al. and Han et al. substantially discloses the non-transitory processor-readable storage medium of claim 19, wherein the instructions for categorizing the plurality of documents into at least one domain Bettersworth et al. para[0024]).
It would have been obvious to one of ordinary skill in the art before the filing date of the invention to have combined the data extraction method of Yellapragada et al. with the categorization method of Bettersworth et al. in order to identify topic based metadata (Bettersworth para[0006]).


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Langseth et al. (US2011/0161333) teaches transforming unstructured data into structured data.
Ghatage et al. (US2017/0372231) teaches using machine learning and natural language processing to categorize documents.
Belgodere et al. (US2016/0080422) teaches filtering documents into domain-specific categories.
Sonobe et al. (US2020/0293553) teaches extracting and classifying data from documents.
Smutko et al. (US2020/0219033) teaches system for identifying opportunities to implement robotic process automation.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICHOLAS HASTY whose telephone number is (571)270-7775.  The examiner can normally be reached on Monday-Friday 8:30am-5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Stephen Hong, can be reached on (571)272-4124.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/N.H/Examiner, Art Unit 2178                                                                                                                                                                                                        
/STEPHEN S HONG/Supervisory Patent Examiner, Art Unit 2178