DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 2-5, 9-12, and 16-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. Claims 2, 9, and 16 recite “wherein the user dataset comprises a first user term and a second user term, the method further comprising: creating a first one of the plurality of user dataset entries that assigns the first term as a parent to the second term; creating a second one of the plurality of user dataset entries that assigns the second term as a parent to the first term; and applying a set of lexical relations feature learning algorithms to the first user dataset entry and second user data set entry to generate a first set of user dataset feature learning results.” These limitations require that the first term be both parent and child to the second term, and vice versa.
Because these claims depend on Claims 1, 8, and 15, they include all of the limitations of the claims upon which they depend. Claims 2, 9, and 16 therefore also require “generating a second set of unidirectional associations between a plurality of user dataset entries included in a user dataset” and “building a hierarchical relationship of the user dataset based on the second set of unidirectional associations.” Imposing a unidirectional restriction on the term associations, as recited in the claims, requires that a child term cannot also be a parent of its own parent term; otherwise a circular association is formed and the location of each term within the required hierarchical relationship cannot be determined. It is unclear how the required unidirectional associations and hierarchical relationship can be determined when the first term is simultaneously parent and child to the second term. The claims are therefore indefinite.
Claims 3-5, 10-12, and 17-19 are rejected for the same reasons by virtue of their dependency on Claims 2, 9, and 16.
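The conflict identified in this rejection can be illustrated with a minimal sketch. It assumes, purely for illustration, that each user dataset entry is stored as a (parent, child) pair. The function name find_mutual_parents and the example terms are hypothetical and are not taken from the application or from the cited art.

```python
def find_mutual_parents(entries):
    """Return pairs of terms that are each recorded as the parent of the other.

    `entries` is an iterable of (parent, child) tuples (an assumed layout).
    """
    seen = set(entries)
    conflicts = set()
    for parent, child in seen:
        # A reversed entry means the two terms are parent and child of each other.
        if parent != child and (child, parent) in seen:
            conflicts.add(tuple(sorted((parent, child))))
    return sorted(conflicts)


# Hypothetical entries: the first two mirror the two entries recited in Claim 2.
entries = [("vehicle", "car"), ("car", "vehicle"), ("car", "sedan")]
print(find_mutual_parents(entries))  # [('car', 'vehicle')]
```

Any pair reported by such a check has no determinable level in a hierarchy, since neither term can be placed above the other.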

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-2, 4, 7-9, 11, 14-16, and 18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Chavez et al. (US 2017/0228438).
Regarding Claim 1, Chavez teaches a method (Fig. 5) implemented by an information handling system that includes a memory and a processor ([0015-0016]), the method comprising: 
training a machine learning model using a reference dataset comprising a plurality of reference entries, wherein the machine learning model learns a first set of unidirectional associations between the plurality of reference entries ([0042-0043], Fig. 5, at 510 custom taxonomy classifier 300 is built to describe a way in which terms are indicative of each other using word embeddings, knowledge graphs, parent/child relationships, large scale word co-occurrence, etc., to arrive at an indicative measurement of terms, such as terse definitions of where “a” indicates “b”, custom taxonomy classifier 300 is able to take two arbitrary terms and relate them to each other because it learned the definition between terms by learning the rules whereby they are related (~parent-child relationship is unidirectional)); 
generating a second set of unidirectional associations between a plurality of user dataset entries included in a user dataset in response to inputting the user dataset into the trained machine learning model ([0043], Fig. 5, at 520 the process receives a user's custom taxonomy (user-defined categories) and stores the custom taxonomy); 
building a hierarchical relationship ([0002], taxonomy may include relationship schemes (e.g., parent-child relationships), an organization of items into groups, hierarchically organized terms, etc.) of the user dataset based on the second set of unidirectional associations ([0043], Fig. 5, at step 530, the process receives the user's mappings that map the user's custom categories to pre-learned terms; steps 520 and 530 may be combined into a single step whereby the user provides a complete taxonomy definition file that includes the custom taxonomy and mappings); and
managing the user dataset based on the hierarchical relationship ([0044], Fig. 6, showing steps taken to categorize input data based upon a user's custom taxonomy).
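As a rough illustration of the "building a hierarchical relationship" step recited above, the following sketch derives a tree-like view from a set of unidirectional (parent, child) associations. The data layout, the function names build_hierarchy and render, and the example terms are assumptions made for illustration only; neither the claims nor Chavez prescribes this structure. The sketch also assumes the associations contain no cycles.

```python
from collections import defaultdict


def build_hierarchy(associations):
    """Group children under parents and find terms that have no parent (roots)."""
    children = defaultdict(list)
    has_parent = set()
    terms = set()
    for parent, child in associations:
        children[parent].append(child)
        has_parent.add(child)
        terms.update((parent, child))
    roots = sorted(terms - has_parent)
    return roots, {parent: sorted(kids) for parent, kids in children.items()}


def render(terms, children, indent=0):
    """Return indented lines, one per term, depth-first (assumes no cycles)."""
    lines = []
    for term in terms:
        lines.append("  " * indent + term)
        lines.extend(render(children.get(term, []), children, indent + 1))
    return lines


# Hypothetical unidirectional associations (parent, child).
associations = [("animal", "dog"), ("animal", "cat"), ("dog", "beagle")]
roots, children = build_hierarchy(associations)
print("\n".join(render(roots, children)))
```

With a pair of mutually parented terms and no other entries, the set of roots would be empty and no such view could be produced.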
Regarding Claim 2, Chavez teaches all aspects of the claimed invention as disclosed in Claim 1 above. Chavez further teaches wherein the user dataset comprises a first user term and a second user term, the method further comprising: creating a first one of the plurality of user dataset entries that assigns the first term as a parent to the second term; creating a second one of the plurality of user dataset entries that assigns the second term as a parent to the first term ([0043], Fig. 5, at 520 the process receives a user's custom taxonomy (user-defined categories) and stores the custom taxonomy; at step 530, the process receives the user's mappings that map the user's custom categories to pre-learned terms; [0002], taxonomy may include relationship schemes (e.g., parent-child relationships), an organization of items into groups, hierarchically organized terms, etc. (~user may define any number of mappings indicating relationships between terms)); and applying a set of lexical relations feature learning algorithms to the first user dataset entry and second user data set entry to generate a first set of user dataset feature learning results ([0045-0046], the process uses pre-defined relationships between the pre-learned terms to compute measurements of how well the text matches each taxonomy definition; the primary metrics to determine if terms are related are how often terms co-occur in a large web corpus, how similar terms are based on a pre-existing neural network trained to learn word similarity, and whether the terms are related in a pre-existing knowledge graph that stores millions of “parent⇄child” relationships; at 640 the process categorizes the text according to the user-defined categories).
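The three relatedness metrics cited from Chavez [0045-0046] (co-occurrence, learned word similarity, and knowledge-graph membership) can be pictured with the sketch below. The equal-weight average, the lookup tables, and the function name relatedness are assumptions introduced for illustration; the cited paragraphs, as summarized here, do not state how the metrics are combined.

```python
def relatedness(term_a, term_b, cooccurrence, similarity, knowledge_graph):
    """Average three signals in [0, 1] for a pair of terms (assumed weighting)."""
    pair = frozenset((term_a, term_b))
    in_graph = (term_a, term_b) in knowledge_graph or (term_b, term_a) in knowledge_graph
    signals = [
        cooccurrence.get(pair, 0.0),   # normalized co-occurrence in a corpus
        similarity.get(pair, 0.0),     # similarity from a pre-trained model
        1.0 if in_graph else 0.0,      # related in a parent/child knowledge graph
    ]
    return sum(signals) / len(signals)


# Hypothetical lookup tables standing in for the pre-learned resources.
cooccurrence = {frozenset(("dog", "puppy")): 0.6}
similarity = {frozenset(("dog", "puppy")): 0.9}
knowledge_graph = {("dog", "puppy")}

print(relatedness("dog", "puppy", cooccurrence, similarity, knowledge_graph))
print(relatedness("dog", "invoice", cooccurrence, similarity, knowledge_graph))
```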
Regarding Claim 4, Chavez teaches all aspects of the claimed invention as disclosed in Claim 2 above. Chavez further teaches wherein at least one of the set of lexical relations feature learning algorithms is selected from a group consisting of a hypernym feature learning algorithm, a hyponym feature learning algorithm, a holonym feature learning algorithm, and a meronym feature learning algorithm ([0002], [0042-0043], [0045-0046], parent-child relationships disclose hypernym/hyponym relationship applied to learning algorithms).
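For reference, the four lexical relations named in this claim form two inverse pairs: a hypernym is the broader term for its hyponym, and a holonym is the whole of which a meronym is a part. The sketch below shows that pairing; the function name relation_pair and the example terms are hypothetical and do not come from the application or from Chavez.

```python
# Each lexical relation and its inverse.
INVERSE = {
    "hypernym": "hyponym",
    "hyponym": "hypernym",
    "holonym": "meronym",
    "meronym": "holonym",
}


def relation_pair(term_a, relation, term_b):
    """Return the stated relation together with its inverse relation."""
    return [(term_a, relation, term_b), (term_b, INVERSE[relation], term_a)]


# "animal" is a hypernym (parent) of "dog"; "car" is a holonym (whole) of "wheel".
print(relation_pair("animal", "hypernym", "dog"))
print(relation_pair("car", "holonym", "wheel"))
```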
Regarding Claim 7, Chavez teaches all aspects of the claimed invention as disclosed in Claim 1 above. Chavez further teaches wherein the user dataset is devoid of classification information and is also devoid of data association information prior to the generating of the second set of unidirectional associations ([0043], Fig. 5, at 520 the process receives a user's custom taxonomy (user-defined categories) and stores the custom taxonomy; at step 530, the process receives the user's mappings that map the user's custom categories to pre-learned terms (~not until the mappings are received in step 530 is the association information created; step 520 is simply user-defined categories)).
Regarding Claim 8, Chavez teaches an information handling system comprising: one or more processors; a memory coupled to at least one of the processors; a set of computer program instructions stored in the memory and executed by at least one of the processors ([0015-0016]) in order to perform actions (Fig. 5) of: 
training a machine learning model using a reference dataset comprising a plurality of reference entries, wherein the machine learning model learns a first set of unidirectional associations between the plurality of reference entries ([0042-0043], Fig. 5, at 510 custom taxonomy classifier 300 is built to describe a way in which terms are indicative of each other using word embeddings, knowledge graphs, parent/child relationships, large scale word co-occurrence, etc., to arrive at an indicative measurement of terms, such as terse definitions of where “a” indicates “b”, custom taxonomy classifier 300 is able to take two arbitrary terms and relate them to each other because it learned the definition between terms by learning the rules whereby they are related (~parent-child relationship is unidirectional)); 
generating a second set of unidirectional associations between a plurality of user dataset entries included in a user dataset in response to inputting the user dataset into the trained machine learning model ([0043], Fig. 5, at 520 the process receives a user's custom taxonomy (user-defined categories) and stores the custom taxonomy); 
building a hierarchical relationship ([0002], taxonomy may include relationship schemes (e.g., parent-child relationships), an organization of items into groups, hierarchically organized terms, etc.) of the user dataset based on the second set of unidirectional associations ([0043], Fig. 5, at step 530, the process receives the user's mappings that map the user's custom categories to pre-learned terms; steps 520 and 530 may be combined into a single step whereby the user provides a complete taxonomy definition file that includes the custom taxonomy and mappings); and
managing the user dataset based on the hierarchical relationship ([0044], Fig. 6, showing steps taken to categorize input data based upon a user's custom taxonomy).
Regarding Claim 9, Chavez teaches all aspects of the claimed invention as disclosed in Claim 8 above. Chavez further teaches wherein the user dataset comprises a first user term and a second user term, and wherein the processors perform additional actions comprising: creating a first one of the plurality of user dataset entries that assigns the first term as a parent to the second term; creating a second one of the plurality of user dataset entries that assigns the second term as a parent to the first term ([0043], Fig. 5, at 520 the process receives a user's custom taxonomy (user-defined categories) and stores the custom taxonomy; at step 530, the process receives the user's mappings that map the user's custom categories to pre-learned terms; [0002], taxonomy may include relationship schemes (e.g., parent-child relationships), an organization of items into groups, hierarchically organized terms, etc. (~user may define any number of mappings indicating relationships between terms)); and applying a set of lexical relations feature learning algorithms to the first user dataset entry and the second user dataset entry to generate a first set of user dataset feature learning results ([0045-0046], the process uses pre-defined relationships between the pre-learned terms to compute measurements of how well the text matches each taxonomy definition; the primary metrics to determine if terms are related are how often terms co-occur in a large web corpus, how similar terms are based on a pre-existing neural network trained to learn word similarity, and whether the terms are related in a pre-existing knowledge graph that stores millions of “parent⇄child” relationships; at 640 the process categorizes the text according to the user-defined categories).
Regarding Claim 11, Chavez teaches all aspects of the claimed invention as disclosed in Claim 9 above. Chavez further teaches wherein at least one of the set of lexical relations feature learning algorithms is selected from a group consisting of a hypernym feature learning algorithm, a hyponym feature learning algorithm, a holonym feature learning algorithm, and a meronym feature learning algorithm ([0002], [0042-0043], [0045-0046], parent-child relationships disclose hypernym/hyponym relationship applied to learning algorithms).
Regarding Claim 14, Chavez teaches all aspects of the claimed invention as disclosed in Claim 8 above. Chavez further teaches wherein the processors perform additional actions comprising wherein the user dataset is devoid of classification information and is also devoid of data association information prior to the generating of the second set of unidirectional associations ([0043], Fig. 5, at 520 the process receives a user's custom taxonomy (user-defined categories) and stores the custom taxonomy; at step 530, the process receives the user's mappings that map the user's custom categories to pre-learned terms (~not until the mappings are received in step 530 is the association information created; step 520 is simply user-defined categories)).
Regarding Claim 15, Chavez teaches a computer program product stored in a computer readable storage medium, comprising computer program code that, when executed by an information handling system ([0015-0016]), causes the information handling system to perform actions (Fig. 5) comprising: 
training a machine learning model using a reference dataset comprising a plurality of reference entries, wherein the machine learning model learns a first set of unidirectional associations between the plurality of reference entries ([0042-0043], Fig. 5, at 510 custom taxonomy classifier 300 is built to describe a way in which terms are indicative of each other using word embeddings, knowledge graphs, parent/child relationships, large scale word co-occurrence, etc., to arrive at an indicative measurement of terms, such as terse definitions of where “a” indicates “b”, custom taxonomy classifier 300 is able to take two arbitrary terms and relate them to each other because it learned the definition between terms by learning the rules whereby they are related (~parent-child relationship is unidirectional)); 
generating a second set of unidirectional associations between a plurality of user dataset entries included in a user dataset in response to inputting the user dataset into the trained machine learning model ([0043], Fig. 5, at 520 the process receives a user's custom taxonomy (user-defined categories) and stores the custom taxonomy); 
building a hierarchical relationship ([0002], taxonomy may include relationship schemes (e.g., parent-child relationships), an organization of items into groups, hierarchically organized terms, etc.) of the user dataset based on the second set of unidirectional associations ([0043], Fig. 5, at step 530, the process receives the user's mappings that map the user's custom categories to pre-learned terms; steps 520 and 530 may be combined into a single step whereby the user provides a complete taxonomy definition file that includes the custom taxonomy and mappings); and
managing the user dataset based on the hierarchical relationship ([0044], Fig. 6, showing steps taken to categorize input data based upon a user's custom taxonomy).
Regarding Claim 16, Chavez teaches all aspects of the claimed invention as disclosed in Claim 15 above. Chavez further teaches wherein the user dataset comprises a first user term and a second user term, and wherein the information handling system performs further actions comprising: creating a first one of the plurality of user dataset entries that assigns the first term as a parent to the second term; creating a second one of the plurality of user dataset entries that assigns the second term as a parent to the first term ([0043], Fig. 5, at 520 the process receives a user's custom taxonomy (user-defined categories) and stores the custom taxonomy; at step 530, the process receives the user's mappings that map the user's custom categories to pre-learned terms; [0002], taxonomy may include relationship schemes (e.g., parent-child relationships), an organization of items into groups, hierarchically organized terms, etc. (~user may define any number of mappings indicating relationships between terms)); and applying a set of lexical relations feature learning algorithms to the first user dataset entry and the second user dataset entry to generate a first set of user dataset feature learning results ([0045-0046], the process uses pre-defined relationships between the pre-learned terms to compute measurements of how well the text matches each taxonomy definition; the primary metrics to determine if terms are related are how often terms co-occur in a large web corpus, how similar terms are based on a pre-existing neural network trained to learn word similarity, and whether the terms are related in a pre-existing knowledge graph that stores millions of “parent⇄child” relationships; at 640 the process categorizes the text according to the user-defined categories).
Regarding Claim 18, Chavez teaches all aspects of the claimed invention as disclosed in Claim 16 above. Chavez further teaches wherein at least one of the set of lexical relations feature learning algorithms is selected from a group consisting of a hypernym feature learning algorithm, a hyponym feature learning algorithm, a holonym feature learning algorithm, and a meronym feature learning algorithm ([0002], [0042-0043], [0045-0046], parent-child relationships disclose hypernym/hyponym relationship applied to learning algorithms).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3, 10, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Chavez et al. (US 2017/0228438), in view of Kajinaga et al. (US 2017/0255603).
Regarding Claims 3, 10, and 17, Chavez teaches all aspects of the claimed invention as disclosed in Claims 2, 9, and 16 above. Chavez fails to teach inputting the first set of user dataset feature learning results into the trained machine learning model to generate a portion of the second set of unidirectional associations.
In the same field of endeavor, Kajinaga teaches inputting the first set of user dataset feature learning results into the trained machine learning model to generate a portion of the second set of unidirectional associations ([0041], output of the apparatus 100 can take the form of, for example, the facet tree, results of calculating a degree of similarity or a degree of correlation, a custom annotator generated based on the facet tree, an annotated input document, and/or statistical information about one or more input documents, [0051-0052], calculating section 120 calculates a degree of correlation between a co-occurrence of two or more existing facet tree entries in a document and an occurrence of the candidate word in the document, the facet tree updating section 130 may update the facet tree based on the one or more degrees of correlation calculated by the calculating section 120, facet tree storage 140 stores the facet tree that is updated by the facet tree updating section 130, facet tree stored in the facet tree storage 140 may be created from scratch by the facet tree updating section 130 or may be initially provided by a user of the apparatus 100 to thereafter be updated by the facet tree updating section 130). 
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify training and application of a machine learning model using reference data entries and user customizations to determine unidirectional and hierarchical associations between terms to be later used in categorization of input data, as taught in Chavez, to further include generating further associations between data sets based on results produced by the machine learning models, as taught in Kajinaga, in order to provide automated and more efficient updating of hierarchical term associations and ease the demand on the user. (See Kajinaga [0014-0015])
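The "degree of correlation" cited from Kajinaga [0051-0052] relates the co-occurrence of existing facet tree entries in a document to the occurrence of a candidate word. One way to picture such a measure is the sketch below, which computes a Pearson correlation over per-document indicator values. The choice of Pearson correlation, the representation of documents as sets of words, and the function name correlation are assumptions for illustration; the cited paragraphs, as summarized here, do not specify the formula.

```python
from math import sqrt


def correlation(documents, entries, candidate):
    """Pearson correlation between 'all entries co-occur' and 'candidate occurs'."""
    xs = [1.0 if all(e in doc for e in entries) else 0.0 for doc in documents]
    ys = [1.0 if candidate in doc else 0.0 for doc in documents]
    n = len(documents)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined when either indicator is constant
    return cov / sqrt(var_x * var_y)


# Hypothetical documents, each reduced to its set of words.
documents = [
    {"engine", "brake", "rotor"},
    {"engine", "brake", "rotor"},
    {"engine"},
    {"brake"},
]
print(correlation(documents, ("engine", "brake"), "rotor"))
```

A candidate word whose score is high could then be proposed as a new entry under the co-occurring facet entries, which is the kind of automated update the rejection relies on.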

Claims 6, 13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Chavez et al. (US 2017/0228438), in view of Williamson et al. (US 2020/0410116).
Regarding Claims 6, 13, and 20, Chavez teaches all aspects of the claimed invention as disclosed in Claims 1, 8, and 15 above. Chavez fails to teach wherein the reference dataset comprises a subset of related entries and a subset of unrelated entries, the method further comprising: removing the subset of unrelated entries from the reference dataset to create a prepared reference dataset; applying a set of lexical relations feature learning algorithms to the prepared reference dataset to generate a set of reference dataset feature learning results; and performing the training of the machine learning model using the set of reference dataset feature learning results.
In the same field of endeavor, Williamson teaches wherein the reference dataset comprises a subset of related entries and a subset of unrelated entries, the method further comprising: removing the subset of unrelated entries from the reference dataset to create a prepared reference dataset; applying a set of lexical relations feature learning algorithms to the prepared reference dataset to generate a set of reference dataset feature learning results; and performing the training of the machine learning model using the set of reference dataset feature learning results ([0080], machine learning trainer 306 trains the deep learning classifier 212. The machine learning trainer 306 may receive a set of training data, the training data includes multiple data portions which are labeled as sensitive data or not sensitive data, the sensitive data type may also be labeled, this training data may be received from the user and may be custom to the user's organization or may be a generic set of training data, a set of features may first be extracted from the training data, multiple machine learning models may be trained on the training data and the one that produces the best accuracy for the selected training data may be selected, the trained machine learning model is tested against a validation data set, and if the accuracy exceeds a certain percentage, then the machine learning model is selected by the machine learning trainer 306 for use with the deep learning classifier 212).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify training and application of a machine learning model using reference data entries and user customizations to determine unidirectional and hierarchical associations between terms to be later used in categorization of input data, as taught in Chavez, to further include refining the training of the machine learning models based on performance tested against a subset of the reference dataset, as taught in Williamson, in order to validate and enhance the accuracy of the training process providing more confidence in the performance of the machine learning models. (See Williamson [0080])
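The selection procedure cited from Williamson [0080] (train several models, keep the one with the best accuracy on the training data, then accept it only if its accuracy on a validation set exceeds a set percentage) can be sketched as follows. The candidate models are stand-in callables that are assumed to be already trained; the function names, the example data, and the 0.9 threshold are hypothetical.

```python
def accuracy(model, examples):
    """Fraction of (features, label) examples the model labels correctly."""
    correct = sum(1 for features, label in examples if model(features) == label)
    return correct / len(examples)


def select_model(candidates, training_data, validation_data, threshold):
    """Pick the best candidate on training data; accept only above the threshold."""
    best = max(candidates, key=lambda name: accuracy(candidates[name], training_data))
    validated = accuracy(candidates[best], validation_data)
    return (best, validated) if validated > threshold else (None, validated)


# Hypothetical stand-ins for trained classifiers of sensitive vs. non-sensitive text.
candidates = {
    "always_sensitive": lambda text: True,
    "keyword": lambda text: "ssn" in text,
}
training_data = [("ssn 123", True), ("hello", False), ("ssn 999", True)]
validation_data = [("my ssn", True), ("weather", False)]

print(select_model(candidates, training_data, validation_data, threshold=0.9))
```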

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: Jin et al. (US 2019/0384895) discloses hierarchical tag-based systems that create parent and child relationships using a directed acyclic graph, in which the edges are one-directional and no cycles can be formed between nodes, i.e., the same node cannot be revisited by following a path of logically consistent directional edges. In such a graph a given tag may have multiple parents, rather than being restricted to a single parent as in hierarchical tree structures, which makes it possible to create multiple virtual paths to the same directory. The result of imposing the directed and acyclic restrictions is that, for a given tag, one of its child tags cannot also be a parent of the tag, which is essential in order to filter documents ([0096]).
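The directed acyclic graph restriction described for Jin can be pictured with the sketch below, which allows a tag to have multiple parents but refuses any edge that would let a child reach its own ancestor. The dictionary layout, the function names reachable and add_edge, and the tag names are assumptions for illustration and are not taken from Jin.

```python
def reachable(children, start, target):
    """Return True if `target` can be reached from `start` along parent-to-child edges."""
    stack, visited = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in visited:
            continue
        visited.add(node)
        stack.extend(children.get(node, ()))
    return False


def add_edge(children, parent, child):
    """Add parent -> child unless the child can already reach the parent (a cycle)."""
    if reachable(children, child, parent):
        return False
    children.setdefault(parent, set()).add(child)
    return True


children = {}
print(add_edge(children, "docs", "reports"))     # True
print(add_edge(children, "finance", "reports"))  # True: a second parent is allowed
print(add_edge(children, "reports", "docs"))     # False: would form a cycle
```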
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARGARET G MASTRODONATO whose telephone number is (571) 270-7803. The examiner can normally be reached M-F, 9:00 AM-6:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Charles Appiah, can be reached at (571) 272-7904. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MARGARET G MASTRODONATO/Primary Examiner, Art Unit 2641