EXAMINER'S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.

The application has been amended as follows: 

1. (Currently Amended) A method comprising:
storing, by one or more input data sources, one or more reference data sets;
extracting, by an ingest engine of a cloud computing infrastructure system, a first data set from the one or more reference data sets stored in the one or more input data sources;
receiving, by an input unit, a second data set from a user;
calculating, by a similarity metric module, a similarity metric value between a second subset of data of the second data set and a first subset of data of the first data set, wherein calculating the similarity metric value comprises:
determining a ratio of a size of an intersection between the second data set and one or more categories of the first data set and a size of a union of data in the second data set and the one or more categories of the first data set;
in response to the similarity metric value being a predetermined ratio, determining a category name that identifies a category type of the one or more categories of the first data set extracted from the one or more reference data sets stored in the one or more input data sources, wherein the category type comprises a knowledge domain for the first subset of data extracted from the one or more reference data sets;

recommending, by the recommendation engine, the determined category name to the user on a user interface for the second subset of data of the second data set.

Claims 2 – 7 (As submitted on December 1, 2021)

8. (Currently Amended) A data enrichment system comprising:
one or more input data sources configured to store one or more reference data sets; and
a cloud computing infrastructure system comprising:
an ingest engine that extracts a first data set from the one or more reference data sets stored in the one or more input data sources;
an input unit configured to receive a second data set from a user;
a similarity metric module, comprising a processor and a memory, configured to:
calculate a similarity metric value between a second subset of data of the second data set and a first subset of data of the first data set, wherein calculating the similarity metric value comprises:
determining a ratio of a size of an intersection between the second data set and one or more categories of the first data set and a size of a union of data in the second data set and the one or more categories of the first data set; and
in response to the similarity metric value being a predetermined ratio, determining a category name that identifies a category type of the first subset of data of the first data set extracted from the one or more reference data sets stored in the one or more input data sources, wherein the category type comprises a knowledge 
a recommendation engine configured to obtain, from the similarity metric module, the determined category name of the first subset of data of the first data set and recommending the determined category name to the user on a user interface for the second subset of data of the second data set.

Claims 9 – 14 (As submitted on December 1, 2021)

15. (Currently Amended) A non-transitory computer-readable medium comprising instructions which, when executed by one or more processors, causes the one or more processors to:
store, by one or more input data sources, one or more reference data sets;
extract, by an ingest engine of a cloud computing infrastructure system, a first data set from the one or more reference data sets stored in the one or more input data sources;
receive, by an input unit, a second data set from a user;
calculate, by a similarity metric module, a similarity metric value between a second subset of data of the second data set and a first subset of data of the first data set, wherein calculating the similarity metric value comprises:
determining a ratio of a size of an intersection between the second data set and one or more categories of the first data set and a size of a union of data in the second data set and the one or more categories of the first data set;
in response to the similarity metric value being a predetermined ratio, determine a category name that identifies a category type of the first subset of data of the first data set extracted from the one or more reference data sets stored in the one or more input 
obtain, by a recommendation engine from the similarity metric module, the determined category name of the first subset of data of the first data set; and
recommend, by the recommendation engine, the determined category name to the user on a user interface for the second subset of data of the second data set.

Claims 16 – 21 (As submitted on December 1, 2021)


Reasons for Allowance
The following is an examiner’s statement of reasons for allowance:

Examiner acknowledges applicants’ reply dated December 1, 2021, including arguments and amendments. The amendments include the cancellation of claim 20 and addition of new claim 21.

Claims 1 – 20 were previously rejected under 35 USC 103 over the combination of Hudis, Liu, and Neal. The most recent amendments add limitations to the independent claims that successfully overcome the prior art of record. Specifically, the combination of references does not adequately anticipate or render obvious the features of determining a ratio of a size of an intersection between the second data set and the one or more categories of the first data set and a size of a union of data in the second data set and the one or more categories of the first data set (see footnote 1); or wherein the category type comprises a knowledge domain for the first subset of data extracted from the one or more reference data sets. The 103 rejection is hereby withdrawn.
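For illustration only, the ratio recited above (size of the intersection over size of the union) can be sketched as follows. This is a minimal sketch and not the applicant's implementation: the function names, the sample data, and the 0.5 threshold are hypothetical, and the sketch assumes that the similarity metric value "being a predetermined ratio" can be read as the value meeting or exceeding a predetermined threshold.

```python
def jaccard_ratio(user_values, category_values):
    """Size of the intersection divided by the size of the union."""
    a, b = set(user_values), set(category_values)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets are treated as dissimilar
    return len(a & b) / len(union)


def recommend_category(user_values, reference_categories, threshold=0.5):
    """Return the best-scoring category name that meets the threshold, else None.

    `threshold` stands in for the "predetermined ratio"; its value here is
    an assumption made only for this example.
    """
    best_name, best_score = None, 0.0
    for name, values in reference_categories.items():
        score = jaccard_ratio(user_values, values)
        if score >= threshold and score > best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical reference data (first data set), grouped by category name.
reference_categories = {
    "city": {"Paris", "London", "Tokyo", "Berlin"},
    "color": {"red", "green", "blue"},
}

# Hypothetical user-supplied column (second data set).
column = ["Paris", "Tokyo", "Berlin", "Madrid"]

print(recommend_category(column, reference_categories))
```

In this example the user column shares three of five distinct values with the "city" category, giving a ratio of 3/5, so "city" would be the recommended category name under the assumed threshold.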



The most relevant reference, Charikar (U.S. Pat. No. 7,158,961) at col. 1, lines 35 – 47, discloses the feature of determining the similarity of two sets by calculating the ratio between the intersection and the union of the sets; however, Charikar is silent regarding the use of knowledge domains in informing category types.

For these reasons, claims 1 – 19 and 21 are allowed.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NIRAV K KHAKHAR whose telephone number is (571)270-1004. The examiner can normally be reached Monday through Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Robert W Beausoliel, Jr. can be reached on 571-272-3645. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.






/NIRAV K KHAKHAR/ Examiner, Art Unit 2167

/ROBERT W BEAUSOLIEL JR/ Supervisory Patent Examiner, Art Unit 2167

1 See Examiner’s amendment.