DETAILED ACTION
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
2.	This action is in response to the amendment filed on 7/5/2022, in which claims 1, 10, and 19 were amended and claims 1 – 20 were presented for further examination.
3.	Claims 1 – 20 are now pending in the application.

Continued Examination
4.	A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 7/5/2022 has been entered.
 
Response to Arguments
5.	Applicant’s arguments with respect to claims 1 - 20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.




Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

6.	Claims 1 - 20 are rejected under 35 U.S.C. 103 as being unpatentable over Takayama et al (US 2010/0030801 A1), in view of Liensberger et al (US 2013/0091138 A1), in view of Zhang et al (US 20200311077 A1), and further in view of Roy et al (US 2022/0019741 A1).
As per claim 1, Takayama et al (US 2010/0030801 A1) discloses,
A method comprising: retrieving data from a data set (para.[0052]; “list generating unit 10 imports database catalog which will be inputted. The list generating unit 10 inputs the database catalog” where importing database catalog is “retrieving data from a data set” as claimed).
wherein the data is organized in a plurality of columns and for each column in the plurality of columns (para.[0052]; “generating unit 10 inputs the database catalog and a column correspondence table”).
generating one or more candidate semantic categories for that column (para.[0053]; “semantic classifying unit 20 classifies tables by the semantic classification” and para.[0055]; “semantic classifying unit 20 generates the table semantic classification table 21 in which the tables are classified into a plurality of groups based on whether the columns located near the top are similar or not”).
	Takayama does not specifically disclose wherein each of the one or more candidate semantic categories has a corresponding probability.
	However, Liensberger et al (US 2013/0091138 A1) in an analogous art discloses,
wherein each of the one or more candidate semantic categories has a corresponding probability (para.[0079]; “assigns at least one semantic categorization 210 in the form of a likelihood 240; "likelihood" and "semantic categorization" ……… a likelihood 240 is a set of one or more semantic categorizations 210 associated with a set of one or more data values, including both absolute categorizations and probability categorizations”).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the semantic categorization of data through assignment of probability values, as taught by Liensberger, into the database table classification of Takayama, for proper integration and for finding relationships among data values, which improves the productive use of the dataset.
	Neither Takayama nor Liensberger specifically discloses creating a feature vector for that column from the one or more candidate semantic categories and the corresponding probabilities and selecting a column semantic category from the one or more candidate semantic categories using the feature vector that is input to a trained machine learning model.
	However, Zhang et al (US 20200311077 A1) in an analogous art discloses,
creating a feature vector for that column from the one or more candidate semantic categories and the corresponding probabilities (para.[0012]; “semantic content may be represented in the form of semantic vectors in a semantic vector space” and para.[0014]; “generate semantic vectors representing images. …… the neural network generates semantic vectors by supplying input images to the neural network (e.g., in the form of vectors representing rows or columns of images”)
and selecting a column semantic category from the one or more candidate semantic categories using the feature vector that is input to a trained machine learning model and a threshold for the column (para.[0003]; “cluster is selected from a plurality of different candidate clusters” and para.[0014]; “semantic vectors may be generated by any suitable artificial intelligence (AI) and/or machine learning (ML) mode …….. semantic vectors may be generated by operating one or more trained models with regard to input semantic values”).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the identification of relationships between data in tables, as taught by Zhang, into the combined teachings of Takayama and Liensberger, for proper identification of table context data, thereby improving the quality of the semantic category.
	Neither Takayama, Liensberger, nor Zhang specifically discloses selecting a column semantic category from the one or more candidate semantic categories using the feature vector that is an input to a trained machine learning model and a threshold for the column, wherein the selection includes selecting the semantic category with a semantic category that has an associated probability above a corresponding threshold, the trained machine learning model outputs a label representing the column semantic category for this column, and the thresholds for the plurality of columns are encoded in the machine learning model.
	However, Roy et al (US 2022/0019741 A1) in an analogous art discloses,
and selecting a column semantic category from the one or more candidate semantic categories using the feature vector that is an input to a trained machine learning model and a threshold for the column (para.[0003]; “processing the input vector-based representation using a trained supervised machine learning model to generate the categorization based at least in part on the input vector-based representation”).
wherein the selection includes selecting the semantic category with a semantic category that has an associated probability above a corresponding threshold (para.[0088]; “select d candidate text categorizations having the highest normalized probability values as the text categorization for the input document”).
the trained machine learning model outputs a label representing the column semantic category for this column (para.[0082]; “semantic-categorization-determining supervised machine learning models may have other architectures and semantic-categorization-determining unsupervised machine learning models” and para.[0087]; “machine learning model used to generate text categorizations of input documents”).
and the thresholds for the plurality of columns are encoded in the machine learning model (para.[0089]; “categorization recommendation user interface 1100 displays a list of top recommended text categorizations for each sentence of a document, along with a probability score for each recommendation”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the assignment of semantic labels, as taught by Roy, into the combined teachings of Takayama, Liensberger, and Zhang, for performing text categorization based on semantic data so that the semantic data is represented in a form that accurately provides relevant information for identifying the data.
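
For illustration only, the selection step recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions: the category names, probabilities, and threshold values are hypothetical and appear in neither the application nor the cited references, and a simple highest-probability comparison stands in for the trained machine learning model.

```python
from typing import Dict, List, Optional

# Hypothetical category order; it fixes the layout of the feature vector.
CATEGORIES: List[str] = ["email", "phone", "name"]


def feature_vector(candidates: Dict[str, float]) -> List[float]:
    """Place each candidate's probability in a fixed slot (0.0 if absent)."""
    return [candidates.get(category, 0.0) for category in CATEGORIES]


def select_category(vector: List[float], threshold: float) -> Optional[str]:
    """Return the most probable category if it exceeds the column threshold.

    Assumption: this comparison is a stand-in for the trained model, which
    per the claim would encode the thresholds itself.
    """
    best_index = max(range(len(vector)), key=vector.__getitem__)
    if vector[best_index] > threshold:
        return CATEGORIES[best_index]
    return None


# One column with two candidate categories and their probabilities.
label = select_category(feature_vector({"email": 0.8, "name": 0.2}), 0.6)
```

In this sketch a column whose best candidate does not exceed the threshold receives no label, which is one possible reading of the threshold limitation.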

As per claim 2, the rejection of claim 1 is incorporated and further Takayama et al (US 2010/0030801 A1) discloses,
wherein there are a plurality of candidate semantic categories for at least one column (para.[0089]; “semantic classifying unit 20 classifies tables including the top some (a prescribed number of) columns of the same column type, size, and scale into the same group”).  

As per claim 3, the rejection of claim 1 is incorporated and further Takayama et al (US 2010/0030801 A1) discloses,
further comprising: determining a semantic category type for each of the plurality of columns based on at least the column semantic category for that column (para.[0089]; “semantic classifying unit 20 classifies tables including the top some (a prescribed number of) columns of the same column type, size, and scale into the same group”).  
  
As per claim 4, the rejection of claim 3 is incorporated and further Takayama et al (US 2010/0030801 A1) discloses,
wherein a semantic category type is selected from the group consisting of an identifier, quasi-identifier, and sensitive (para.[0069]; “column node 112 includes the node identification number 113, the node name 114, the node type 115, a column type  ……..The node type 115 stores an identifier to identify a column”).  

As per claim 5, the rejection of claim 3 is incorporated and further Zhang et al (US 20200311077 A1) discloses,
further comprising: anonymizing the data using the semantic category types for each of the plurality of columns (fig.2 #204; “FIND SELECTED CLUSTER FROM PLURALITY OF DIFFERENT CANDIDATE CLUSTERS, EACH CANDIDATE CLUSTER INCLUDING PLURALITY OF COMPRESSED ANSWER VECTORS”). 
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the identification of relationships between data in tables, as taught by Zhang, into the combined teachings of Takayama, Liensberger, and Roy, for protecting the column data that contains sensitive information while providing access to the rest of the data.
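
For illustration only, anonymizing data according to a per-column semantic category type (claims 3 through 5) can be sketched as follows. The column names, the masking rule, and the generalization rule are hypothetical assumptions chosen for the example; none of them is taken from the application or the cited references.

```python
from typing import Dict, List


def anonymize(rows: List[Dict[str, str]],
              column_types: Dict[str, str]) -> List[Dict[str, str]]:
    """Mask identifiers, coarsen quasi-identifiers, keep other columns."""
    result = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            kind = column_types.get(column, "other")
            if kind == "identifier":
                # Hypothetical rule: replace the whole value.
                new_row[column] = "***"
            elif kind == "quasi-identifier":
                # Hypothetical rule: keep a three-character prefix only.
                new_row[column] = value[:3] + "**"
            else:
                new_row[column] = value
        result.append(new_row)
    return result


records = [{"ssn": "123-45-6789", "zip": "94107", "diagnosis": "flu"}]
types = {"ssn": "identifier", "zip": "quasi-identifier", "diagnosis": "sensitive"}
masked = anonymize(records, types)
```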
 
As per claim 6, the rejection of claim 1 is incorporated and further Liensberger et al (US 2013/0091138 A1) discloses,
wherein generation of one or more candidate semantic categories comprises: generating the probability for each of the plurality of one or more candidate semantic categories (para.[0079]; “assigns at least one semantic categorization 210 in the form of a likelihood 240; "likelihood" and "semantic categorization" ……… a likelihood 240 is a set of one or more semantic categorizations 210 associated with a set of one or more data values, including both absolute categorizations and probability categorizations”).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the semantic categorization of data through assignment of probability values, as taught by Liensberger, into the combined teachings of Takayama, Zhang, and Roy, for proper integration and for finding relationships among data values, which improves the productive use of the dataset.

As per claim 7, the rejection of claim 6 is incorporated and further Liensberger et al (US 2013/0091138 A1) discloses,
wherein the generation of the probabilities comprises: selecting a column from the plurality of columns; applying a bloom filter with a potential semantic category to the data of that column; and computing a probability based on a set of results from the application of the bloom filter to the column data (para.[0079]; “a column of data from a spreadsheet may be assigned 314 an 80% probability of being individual-name categorization data and 20% probability of being business-name categorization data, ….. Zero probability may be assigned as a mechanism for ruling out a particular categorization 210, and "unknown" or "unassigned" may be used as a placeholder categorization”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the semantic categorization of data through assignment of probability values, as taught by Liensberger, into the combined teachings of Takayama, Zhang, and Roy, for proper integration and for finding relationships among data values, which improves the productive use of the dataset.
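
For illustration only, the claim 7 steps of applying a bloom filter for a potential semantic category to a column and computing a probability from the results can be sketched as follows. The filter size, the number of hash functions, the use of salted SHA-256, and the hit-ratio formula are all assumptions made for the example; neither the application nor the cited references is the source of these details.

```python
import hashlib
from typing import Iterable, List


class BloomFilter:
    """Small Bloom filter holding known values of one semantic category."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, item: str) -> List[int]:
        # Derive one bit position per hash by salting SHA-256 with an index.
        positions = []
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.size)
        return positions

    def add(self, item: str) -> None:
        for position in self._positions(item):
            self.bits[position] = True

    def __contains__(self, item: str) -> bool:
        return all(self.bits[position] for position in self._positions(item))


def category_probability(values: Iterable[str], bloom: BloomFilter) -> float:
    """Fraction of column values that the filter reports as members."""
    values = list(values)
    if not values:
        return 0.0
    hits = sum(1 for value in values if value in bloom)
    return hits / len(values)


first_names = BloomFilter()
for name in ("alice", "bob"):
    first_names.add(name)
probability = category_probability(["alice", "bob", "zzz", "qqq"], first_names)
```

Because a Bloom filter can report false positives but never false negatives, the computed ratio is an upper-bound estimate of the true match rate.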

As per claim 8, the rejection of claim 6 is incorporated and further Zhang et al (US 20200311077 A1) discloses,
wherein the bloom filter is one of a whitelist bloom filter and a blacklist bloom filter (para.[0070]; “Machines may be implemented using any suitable combination of state-of-the-art and/or future machine learning (ML), artificial intelligence (AI) ……. associative memories (e.g., lookup tables, hash tables, Bloom Filters, Neural Turing Machine and/or Neural Random Access Memory) …… Markov random fields, (hidden) conditional random fields”).  
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the identification of relationships between data in tables, as taught by Zhang, into the combined teachings of Takayama, Liensberger, and Roy, to reduce the large computational burden of searching a plurality of vectors.

As per claim 9, the rejection of claim 1 is incorporated and further Zhang et al (US 20200311077 A1) discloses,
wherein the trained machine learning model is a random forest trained machine learning model (para.[0070]; “Machines may be implemented using any suitable combination of state-of-the-art and/or future machine learning (ML), artificial intelligence (AI) ……. associative memories (e.g., lookup tables, hash tables, Bloom Filters, Neural Turing Machine and/or Neural Random Access Memory) …… Markov random fields, (hidden) conditional random fields”).
	Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the identification of relationships between data in tables, as taught by Zhang, into the combined teachings of Takayama, Liensberger, and Roy, to reduce the large computational burden of searching a plurality of vectors.

Claims 10 – 18 are system claims corresponding to method claims 1 – 9 respectively, and are rejected for the same reasons set forth in the rejection of claims 1 – 9 respectively above.

Claims 19 – 20 are non-transitory machine-readable medium claims corresponding to method claims 1 and 3 respectively, and are rejected for the same reasons set forth in the rejection of claims 1 and 3 respectively above.

Conclusion
7.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Dumais et al., EP 1090365 B1, "Methods and apparatus for classifying text and for building a text classifier."
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AUGUSTINE K. OBISESAN whose telephone number is (571)272-2020. The examiner can normally be reached Monday - Friday 8:30am - 5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle, can be reached at 571-272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/AUGUSTINE K. OBISESAN/
Primary Examiner
Art Unit 2156



7/29/2022