Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 6 October 2022 has been entered.
 
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1, 11, and 16 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
The claims contain the language “building a custom dictionary unique to a user.” Examiner cannot find support in the specification for the idea that a custom dictionary is unique to a user and requests that Applicant provide a citation supporting this claimed feature.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The independent claims recite building a dictionary by identifying data structures of a database corresponding to business terms associated with the user, receiving a set of business terms, generating an analysis to identify or recognize attribute values, and analyzing and automatically extracting values of an unstructured document in view of the business terms. The subject matter of the claim is largely directed towards analyzing existing data, building a module, and using the module to analyze additional data. This appears to be a mental process because the idea could be performed by a human being with pen and paper or a generic machine.
This judicial exception is not integrated into a practical application. The claimed subject matter of the independent claims does not appear to improve the processing of a computer. The hardware elements of claims 11 (computer readable storage media) and 16 (computer processors and computer readable storage media) appear to be generic computing elements. Thus, no particular machinery is used to implement the claimed steps. There also does not appear to be any transformation or reduction of a particular article to a different state or thing. The claims simply analyze data, build a module, and use the module to analyze and extract additional data. There is no claimed use for the extracted data. Thus, the claims do not appear to be integrated into a practical application. 
The claims also do not include additional elements that are sufficient to amount to significantly more than the judicial exception because, as noted above, there is no improvement to the functioning of a computer or use of a particular machine. Additionally, there is no transformation or reduction of an article to a different state or thing. 
The dependent claims 2-10, 12-15, and 17-20 also appear to be directed to a mental process for similar reasons. The additional limitations of the dependent claims appear to be merely additional analysis steps or data requirements. These additional limitations neither integrate the claimed invention into a practical application nor provide elements that amount to significantly more than the abstract idea.
As such, claims 1-20 are rejected under 35 USC 101. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5 and 7-20 are rejected under 35 U.S.C. 103 as being unpatentable over Murthy (“Exploiting Evidence from Unstructured Data to Enhance Master Data Management,” cited in Applicant’s IDS provided on 14 October 2020) in view of Agrawal et al. (US Pre-Grant Publication 2013/0312107), in view of Bowen et al. (US Patent 6,094,649). 

As to claim 1, Murthy teaches a computer-implemented method comprising:
extracting structured information from an unstructured document (see page 1863, left column, and discussion of Figures 2 and 3. Specifically, based on existing data information in the structured database, new records can be extracted from unstructured data and linked to the existing records)
wherein extracting structured information comprises:
… identifying [data structures] of a database that correspond to business terms of a business glossary … (see page 1865, section 2.2.2. The EUTC (Extension for Unstructured Text Correlation) extracts the data model for all members of interest from the master data management system. For each attribute, it extracts a dictionary with all values from the attribute. This dictionary is a “business glossary.” The members of interest correspond to the attributes of the business glossary); 
generating an analysis module based on the identified [data structures] that enables to identify attribute values of attributes of the tables and columns (see page 1865 and sections 2.2.2 and 2.2.4. By building and using the existing dictionaries, and other types of analyses, to extract data, an “analysis module” is generated based on the identified information from the data model); and 
automatically extracting values for attributes from the unstructured document based on a relevance of the extracted values for attributes of the unstructured document to business terms of the business glossary listed in the built custom dictionary that comprise the identified [data structures] using the generated analysis module (see pages 1865-1867, sections 2.2.4 and 2.3. As shown in the example analysis, an unstructured document is received. Terms from the unstructured document are extracted based on fuzzy matching to entries in the dictionary. Fuzzy matching is used to establish relevance. Section 3.1 explicitly discusses how the relevance of terms extracted from a document is identified).
Murthy does not explicitly show: 
building a custom dictionary unique to a user by identifying tables and columns of a database that correspond to business terms of a business glossary associated with the user; and
generating an analysis module based on the identified tables and columns that enables to identify attribute values of attributes of the tables and columns.
Agrawal teaches: 
identifying tables and columns of a database that correspond to business terms of a business glossary (see paragraphs [0032]-[0034]. The dictionaries are generated based on identified tables and columns, [0034]. The dictionaries are business glossaries because they contain potentially sensitive words, see paragraph [0007]. It is noted that there is no claimed definition for “business” terms or “business” glossary. Thus, the designation of the terms as “business terms” is non-functional descriptive material and receives no patentable weight. Additionally, even if the terms were claimed as specific variables as described in the arguments of 6 April 2022 (such as “product names,” “custom names,” “product number,” etc.), the claims do not perform differently in response to particular types of words. Without a claimed functional difference, the business terms of the claims are obvious in view of the terms of Agrawal);
generating an analysis module based on the identified tables and columns that enables to identify or recognize attribute values of attributes of the tables and columns (see paragraphs [0032]-[0034]. The document is analyzed based on the identified tables and columns);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Murthy by the teachings of Agrawal because both references are directed to identifying and extracting data from unstructured data sources based on known information from a database. Agrawal provides Murthy the benefit of being able to rely on structured data tables, a common data format, to identify and build a business dictionary for data recognition and extraction.
Bowen teaches: 
Building a custom dictionary unique to a user by identifying tables and columns of a database that correspond to business terms of a business glossary associated with the user (see 8:66-9:14. A user may submit a request to determine the data the user wishes to analyze. From this, a custom dictionary may be built by analyzing an existing schema for a database to identify the database keys associated with the user request. As noted in 9:24-37, location identifiers may include table and column names). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Murthy by the teachings of Bowen because both references are directed to identifying and extracting data from unstructured data sources based on known information from a database. Bowen provides Murthy the benefit of allowing users to submit specific data definitions, which allows a user to customize any search and extraction.

As to claim 2, Murthy as modified by Agrawal teaches the computer-implemented method of claim 1, wherein identifying of the tables and columns comprises: 
for each term of a plurality of business terms, determining an identification logic based on a format and content of a respective business term (see Murthy section 2.2.2. The system extracts a data model, then, for each atomic attribute, extracts a dictionary with all distinct values for that attribute); and 
running the identification logics on the database for identifying the tables and columns (see Agrawal paragraphs [0032]-[0034]).  

As to claim 3, Murthy as modified by Agrawal teaches the computer-implemented method of claim 1, wherein generating the analysis module comprises: 
building a dictionary of the plurality of business terms using attribute values of the identified tables and columns, wherein using the analysis module to extract the structured information comprises comparing content of the unstructured document with the dictionary (see Agrawal paragraphs [0032]-[0034]).  

As to claim 4, Murthy as modified teaches the computer-implemented method of claim 1, wherein generating the analysis module comprises: 
building a logic based on the content and format of the attribute values of the identified tables and columns such that the logic can recognize values similar to the attribute values (see page 1865 and sections 2.2.2 and 2.2.4. The system of Murthy is able to recognize values similar to the dictionary entries using fuzzy matching).  

As to claim 5, Murthy as modified teaches the computer-implemented method of claim 1, further comprising: 
updating the analysis module based on one or more changes in the database and the business glossary (see Murthy page 1865 and section 2.2.2. After the system is set up, dictionaries may be automatically updated whenever there is new content introduced into the MDM), and 
continually updating the analysis module for extraction of structured information from the unstructured document and/or from another unstructured document (see Murthy page 1865 and section 2.2.2. By updating the dictionary, the analysis module that uses the dictionary will have updated information to use, and thus be updated).  

As to claim 7, Murthy as modified teaches the computer-implemented method of claim 1, wherein the extraction of structured information comprises: 
identifying the values of the attributes in the unstructured documents that correspond to attribute values of the identified tables and columns (see Murthy page 1865 and section 2.2.4. Also see pages 1866-1867, section 2.3, which show an example of identifying values in unstructured documents); and 
forming from the attribute values, records associated with respective entities in accordance with the entities of identified records (see pages 1866-1867 and section 2.3. The analysis module inserts two new records into MDS that have a relationship with other records).  

As to claim 8, Murthy as modified teaches the computer-implemented method of claim 7, further comprising: 
repeating the computer-implemented method for a further unstructured document wherein the identification of the tables and columns is performed in the database and in the formed records (see Murthy page 1864, right column. Multiple documents may be analyzed, including news reports, emails, or other reports).  

As to claim 9, Murthy as modified teaches the computer-implemented method of claim 1, wherein the analysis module is a plugin (see Murthy page 1866 and section 2.2.4. Specialized annotators may be plugged into the analysis configuration. Alternatively, the EUTC analysis itself is built on top of an existing framework).  

As to claim 10, Murthy as modified teaches the computer-implemented method of claim 1, wherein the database is a master data management (MDM) database (see Murthy Abstract and Introduction. The database of Murthy is explicitly listed as a master data management system. Also see page 1864, section 2.2.1).

As to claims 11 and 16, see the rejection of claim 1. 
As to claims 12 and 17, see the rejection of claim 2.
As to claims 13 and 18, see the rejection of claim 3. 
As to claims 14 and 19, see the rejection of claim 4. 
As to claims 15 and 20, see the rejection of claim 5. 

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Murthy (“Exploiting Evidence from Unstructured Data to Enhance Master Data Management”) in view of Agrawal et al. (US Pre-Grant Publication 2013/0312107), in view of Bowen et al. (US Patent 6,094,649), further in view of Haskell et al. (US Pre-Grant Publication 2003/0233251). 

As to claim 6, Murthy as modified teaches the computer-implemented method of claim 5.
Murthy as modified does not clearly teach wherein the update is performed if a number of changes is higher than a threshold.
Haskell teaches wherein the update is performed if a number of changes is higher than a threshold (see paragraph [0035]. The dictionary is updated with terms when the number of changes since a previous update passes a threshold). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Murthy by the teachings of Haskell because Haskell provides Murthy the benefit of ensuring that a dictionary remains updated when new changes occur. This will ensure that Murthy continues to extract accurate and relevant data. 

Response to Arguments
Applicant's arguments filed 7 September 2022 have been fully considered but they are not persuasive. 

Applicant argues that “in the present instance, it is urged that independent claim 1 when viewed overall, clearly recites an improvement in the technology of a user’s computer system.” Applicant then cites paragraph [0007] from the specification, and states that “embodiments of the present invention extracts structured information from unstructured documents which reduces resources used to evaluated unstructured documents by building a unique dictionary, stored in databases, that is automatically updated to be kept current. In other words, Applicant’s claimed invention provides an unconventional improvement in the technology of computer.” 
In response to this argument, Examiner can see no mention in the specification of a unique dictionary. Additionally, there is no claimed step of automatically updating the dictionary to be kept current, nor are there details of how such automatic updating would work or a definition of “current.” Because Applicant argues that these features are where the improvement can be found and because these features are either unclaimed or not present in the specification, Applicant’s argument is unpersuasive. Applicant is reminded that unclaimed subject matter from the specification, such as updating a dictionary, receives no patentable weight until claimed.  
Additionally, in view of paragraph [0007], it is noted that all benefits are described as conditional and only “may” occur. Any benefits that might potentially improve the processing of a computer are not claimed. In particular, there is no subsequent claimed step of using the extracted data in a meaningful way that benefits from the data being preprocessed. The data is simply extracted, at which point the method ends. Additionally, there is no claimed step describing how the “number of unstructured documents to be analyzed is constantly increasing.” Thus, any benefits from using the claimed invention are not realized within the scope of the claim language. 

	Applicant argues that “"the human mind is not equipped" to crawl online sources for electronic documents, to crawl databases, or maintain records at the speed and efficiency of a computing system purpose built for managing records. The human mind cannot, by itself, access the Internet (to access, a web page, a document embedded in a web page and that may be rendered in the web page, spreadsheet, email, book, picture, and presentation that have an associated user agent such as a document reader, editor or media player); therefore, the claims which extract unstructured information cannot possibly be performed by a human with pen and paper.”
In response to this argument, as noted in MPEP 2106.04(a)(2) III C, “claims can recite a mental process even if they are claimed as being performed on a computer. 

The Supreme Court recognized this in Benson, determining that a mathematical algorithm for converting binary coded decimal to pure binary within a computer’s shift register was an abstract idea. The Court concluded that the algorithm could be performed purely mentally even though the claimed procedures "can be carried out in existing computers long in use, no new machinery being necessary." 409 U.S. at 67, 175 USPQ at 675. See also Mortgage Grader, 811 F.3d at 1324, 117 USPQ2d at 1699 (concluding that concept of "anonymous loan shopping" recited in a computer system claim is an abstract idea because it could be "performed by humans without a computer").”

MPEP 2106.04(a)(2) III C 1-3 further elaborates on the idea that a claim may still be directed towards an abstract idea despite the use of a generic machine. Thus, though the claims may not be performed wholly in the human mind, the building, generating, and extracting steps may be performed by a human with a generic computer. 
It is noted that the claimed limitations are directed towards building a dictionary by identifying tables and columns of a database. A human, equipped with a generic computer, is capable of this data analysis and building. 
The claims then generate an analysis module based on the identified tables and columns. A human, equipped with a generic computer, is capable of performing this data generation step. 
The claims finish by extracting values for attributes from unstructured documents. A human, equipped with a generic computer and following the steps of the analysis module, is capable of extracting data from unstructured documents. 
Because all of the claimed steps are directed towards data analysis and generation and may be performed by a human being equipped with a generic computer, the claims are directed towards a mental process. 
Examiner additionally notes that the details Applicant recited (such as crawling online sources, crawling databases, a computing system purpose built for managing records, and accessing the Internet, including a web page, a document embedded in a web page that may be rendered in the web page, spreadsheet, email, book, picture, and presentation using an associated user agent such as a document reader, editor, or media player) all remain unclaimed. Examiner reminds Applicant that unclaimed details and features of the specification have no patentable weight until claimed. 

	Applicant argues that the claims “clearly [include] additional elements that amount to significantly more than the judicial exception. The claims considered as a whole integrate any recited judicial into a practical application of that exception that is directed to an improvement of existing technology, transforming an article of a particular state to a different state or thing, and applying, relying on, or using any said judicial exception in a manner that imposes a meaningful limit on the judicial exception, such that the claims are more than a drafting effort designed to monopolize said judicial exception.” 
	In response to this assertion, Examiner notes that all of the claimed features appear to be directed towards data generation and analysis steps followed by data extraction steps. Examiner can see no additional elements (and Applicant has identified no additional elements) that, individually or as a whole, amount to significantly more than the abstract idea. As noted above, there does not appear to be an improvement to existing technology at least because Applicant’s argued improvement relies upon unclaimed features from the specification. Therefore, the claims remain rejected under 35 USC 101 for being directed towards a mental process of data analysis, generation, and extraction. 


	Examiner finds Applicant’s arguments directed to Murthy not teaching “building a custom dictionary unique to a user” persuasive and has added the reference Bowen to teach this subject matter. 
However, Examiner finds Applicant’s arguments directed towards the “automatically extracting…” step of the independent claims unpersuasive. 

Applicant argues that “Applicant asserts that Murphy's extraction differs in terms of its methodology and function. Specifically, Murphy extracts missing evidence to merge two records from unstructured data by comparing corresponding attribute values and if the matching score is above a certain threshold, MDM automatically merges the two records into a single entity. If the evidence that a record is same is not high enough a manual merge is required. In contrast, Applicant's claimed invention focuses not on merging two records (to prevent duplication of entries as disclosed in Murphy) but extracting attribute information (e.g., entities like product names, custom names, product number, etc. to identify entities relevant for GDPR or other scenarios) from a second record (e.g., an unstructured document) that is relevant to attributes of the first record (e.g., a built custom dictionary that includes business terms of a business glossary).”
In response to this argument, while Murthy might contain additional teachings that the claimed invention does not, the validity of the rejection is based on whether Murthy teaches the claimed subject matter as written. Examiner notes that the features Applicant discusses (“extracting attribute information (e.g., entities like product names, custom names, product number, etc. to identify entities relevant for GDPR or other scenarios)”) simply are not claimed in such detail. Applicant is reminded that unclaimed features of the invention have no patentable weight until claimed. 
Thus, Murthy, in view of Agrawal and Bowen, teaches the claims as written for the reasons provided in the rejection above. 

Applicant argues that “Combining Agrawal with Murphy would yield different results. As argued above, Murphy discloses attribute identification for document merging, that is, merging two records to make a complete record for a single entity. Combining Agrawal would provide the feature to then control (i.e., limit) access to portions of the merged document/record (e.g., provided by Agrawal) based on classifications of text matching "sensitive words" (see e.g., Agrawal paragraphs [0003-0009]). In contrast, Applicant's claimed invention requires "building a custom dictionary unique to a user by identifying tables and columns of a database that correspond to business terms of a business glossary associated with the user"; and "automatically extracting values for attributes from the unstructured document based on a relevance of the extracted values for attributes of the unstructured document to business terms of the business glossary in the built custom dictionary using the generated analysis module".”
In response to this argument, it is noted that while the combination of Murthy and Agrawal may teach additional subject matter beyond the currently claimed invention, the combination still teaches the claimed invention as written for the reasons provided in the rejection above. 

Applicant argues that “Furthermore, the extraction reference in Agrawal differs. In paragraphs [0032-0034], Agrawal details "identifying" specific terms that match to structured data represented in attribute columns of a data entity table (e.g., whether a term matches a term contained in a database, paragraph [0033]. In contrast, Applicant requires automatically extracting values for attributes from the unstructured document based on a relevance of the extracted values for attributes of the unstructured document to business terms. While Applicant can acknowledge that Agrawal can account for typographical errors, Applicant contends that is not functionally equivalent to determining relevance of an attribute identified from an unstructured document to terms of a built custom dictionary.” Applicant then cites paragraph [0039] of Applicant’s specification for context. 
In response to this argument, Applicant’s specification indicates that the claimed “relevancy” is nothing more than identifying documents that match the dictionary (see paragraphs [0007] and [0038]). Examiner can find no detailed discussion of how relevancy is measured or determined, let alone any additional claimed details. 
Paragraph [0039] of Applicant’s specification cites several unclaimed examples, such as where an identified attribute “employing company” is used to determine whether a value is a value of the attribute “employing company.” These examples consider whether the attribute “employing company” is a column and consider the values that exist in that column. None of these details are claimed. Applicant is reminded that unclaimed limitations from the specification, such as those described in paragraph [0039], receive no patentable weight until claimed. 
Regardless, Murthy is relied upon to teach the claimed extraction and does so for the reasons provided in the rejection above. 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHARLES D ADAMS whose telephone number is (571)272-3938. The examiner can normally be reached M-F, 9-5:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Neveen Abel-Jalil, can be reached at 571-270-0474. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/CHARLES D ADAMS/           Primary Examiner, Art Unit 2152