DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference characters not mentioned in the description: 200 in Figure 2 and 300 in Figure 3.
Corrected drawing sheets in compliance with 37 CFR 1.121(d), or amendment to the specification to add the reference character(s) in the description in compliance with 37 CFR 1.121(b) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Specification
The disclosure is objected to because of the following informalities:
In paragraph 0029, line 9, “a word with many meaning” should read “a word with many meanings”.
In paragraph 0048, line 3, the acronym “ASCI” is used without being defined.  Also, if the intended meaning is “American Standard Code for Information Interchange”, then “ASCI” should read “ASCII”.
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 5 – 6, 9, 15 – 16 and 19 – 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 5 recites the limitation "the one or more corresponding definitions" in line 4.  There is insufficient antecedent basis for this limitation in the claim.  Claim 5 depends from claim 1, and claim 1 recites “definitions associated with the specific domain and corresponding to the plurality of terms”, but it is not clear how "the one or more corresponding definitions" relates to the definitions of claim 1.  Modifying "the one or more corresponding definitions" to "one or more of the corresponding definitions" would resolve the indefiniteness.  For examination purposes, "the one or more corresponding definitions" will be interpreted as one or more of the definitions corresponding to the plurality of terms.
Claim 6 recites the limitation “wherein the machine learning model includes only a single stack of encoders”.  This limitation is indefinite because it is not clear if an encoder of the machine learning model includes only a single stack of encoding elements, or if the machine learning model includes no other elements except for a single stack of encoders. The specification recites, in paragraph 0041, lines 11-13, “In some embodiments, one or more of the encoder or decoder each includes only a single stack of RNN units or its variants.”.  For examination purposes, the limitation “wherein the machine learning model includes only a single stack of encoders” will be interpreted to mean that an encoder of the machine learning model includes only a single stack of encoding elements.  Also, the limitation “wherein the machine learning model includes only a single stack of encoders” is indefinite because it is not clear if “a single stack of encoders” means one layer of encoding elements or one stack of encoding layers, with each layer made up of multiple encoding elements.  For examination purposes, the limitation “a single stack of encoders” will be interpreted to mean one layer of encoding elements.
Claim 9 recites the limitation “wherein each pair of term and corresponding definition in the training data and ground truth, respectively, is independent of each other”.  This limitation is indefinite because it is not clear if each term/definition pair is independent of all the other term/definition pairs, or if the terms are independent of the definitions.  The specification recites, in paragraph 0036, lines 1-4, “In some embodiments, each pair of term and corresponding definition in the training data and ground truth, respectively, is independent of each other. In other words, in contrast to training data generally used for natural language translation, the meaning of each term is not dependent on the meaning of any other term.”.  For examination purposes, the limitation “wherein each pair of term and corresponding definition in the training data and ground truth, respectively, is independent of each other” will be interpreted to mean that each term/definition pair is independent of all the other term/definition pairs.
Claim 15 recites the limitation "the one or more corresponding definitions" in line 4.  There is insufficient antecedent basis for this limitation in the claim.  Claim 15 depends from claim 12, and claim 12 recites “definitions associated with the specific domain and corresponding to the plurality of terms”, but it is not clear how "the one or more corresponding definitions" relates to the definitions of claim 12.  Modifying "the one or more corresponding definitions" to "one or more of the corresponding definitions" would resolve the indefiniteness.  For examination purposes, "the one or more corresponding definitions" will be interpreted as one or more of the definitions corresponding to the plurality of terms.
Claim 16 recites the limitation “wherein the machine learning model includes only a single stack of encoders”.  This limitation is indefinite because it is not clear if an encoder of the machine learning model includes only a single stack of encoding elements, or if the machine learning model includes no other elements except for a single stack of encoders. The specification recites, in paragraph 0041, lines 11-13, “In some embodiments, one or more of the encoder or decoder each includes only a single stack of RNN units or its variants.”.  For examination purposes, the limitation “wherein the machine learning model includes only a single stack of encoders” will be interpreted to mean that an encoder of the machine learning model includes only a single stack of encoding elements.  Also, the limitation “wherein the machine learning model includes only a single stack of encoders” is indefinite because it is not clear if “a single stack of encoders” means one layer of encoding elements or one stack of encoding layers, with each layer made up of multiple encoding elements.  For examination purposes, the limitation “a single stack of encoders” will be interpreted to mean one layer of encoding elements.
Claim 19 recites the limitation “wherein each pair of term and corresponding definition in the training data and ground truth, respectively, is independent of each other”.  This limitation is indefinite because it is not clear if each term/definition pair is independent of all the other term/definition pairs, or if the terms are independent of the definitions.  The specification recites, in paragraph 0036, lines 1-4, “In some embodiments, each pair of term and corresponding definition in the training data and ground truth, respectively, is independent of each other. In other words, in contrast to training data generally used for natural language translation, the meaning of each term is not dependent on the meaning of any other term.”.  For examination purposes, the limitation “wherein each pair of term and corresponding definition in the training data and ground truth, respectively, is independent of each other” will be interpreted to mean that each term/definition pair is independent of all the other term/definition pairs.
Claim 20 recites the limitation "the one or more corresponding definitions" in lines 9-10.  There is insufficient antecedent basis for this limitation in the claim.  Claim 20 previously recites “definitions associated with the specific domain and corresponding to the plurality of terms”, but it is not clear how "the one or more corresponding definitions" relates to “definitions associated with the specific domain and corresponding to the plurality of terms”.  Modifying "the one or more corresponding definitions" to "one or more of the corresponding definitions" would resolve the indefiniteness.  For examination purposes, "the one or more corresponding definitions" will be interpreted as one or more of the definitions corresponding to the plurality of terms.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1 and 4 – 5 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Veyseh et al. (US Patent Application Publication No. 2022/0050967), hereinafter Veyseh.
Regarding claim 1, Veyseh discloses a method of determining a definition for a term associated with a specific domain, the method comprising:
receiving, via a processor, an electronic document that is associated with a specific domain, the electronic document including at least one term (Paragraph 0127, lines 1-7, "The components of the definition extraction system 102 can include software, hardware, or both. For example, the components of the definition extraction system 102 can include one or more instructions stored on a computer-readable storage medium and executable by processors of one or more computing devices (e.g., the computing device(s) 500)."; Paragraph 0014, lines 1-4, "In one or more embodiments, the definition extraction system analyzes a source document to determine whether the source document includes a definition of a term included in the source document."; Paragraph 0093, lines 13-17, "Alternatively, the documents can include text related to a specific domain for training the neural network 301 to extract definitions from sources related to the specific domain (e.g., legal contracts, scientific papers).");
determining a definition of the at least one term via a machine learning model that is trained, based on (i) a plurality of terms associated with the specific domain as training data and (ii) definitions associated with the specific domain and corresponding to the plurality of terms as ground truth, to generate an output definition associated with the specific domain in response to an input term (Paragraph 0047, lines 3-7, "For instance, the definition extraction system 102 can utilize data associated with extracted term definitions from a set of documents (e.g., a set of training documents) to learn parameters of one or more layers of the machine-learning model 114."; Paragraph 0052, lines 1-9, "As mentioned above, the definition extraction system 102 can accurately and flexibly extract term definitions from documents utilizing machine learning. FIGS. 2A-2B illustrate examples of the machine-learning model 114 of the definition extraction system 102 analyzing a document 200a to extract a term definition 202a. In particular, FIGS. 2A-2B illustrate that the machine-learning model 114 receives the document 200a as an input and outputs the term definition 202a."; Paragraph 0093, lines 1-17, "FIG. 4 illustrates that the neural network 301 receives documents from a document repository 400 as an input. In one or more embodiments, the documents are part of a corpus of documents used for training, verifying, and testing the neural network 301. To illustrate, the documents can include a training dataset, a verification dataset, and a testing dataset. The documents include text related to one or more domains of knowledge. The documents can also include labeled data indicating ground-truth information for use in training the neural network 301. According to some embodiments, the documents can include text related to a plurality of domains for training the neural network 301 to extract definitions from a variety of sources. Alternatively, the documents can include text related to a specific domain for training the neural network 301 to extract definitions from sources related to the specific domain (e.g., legal contracts, scientific papers)."; The text related to a specific domain for training the neural network reads on a plurality of terms associated with the specific domain as training data, and the documents including labeled data indicating ground-truth information for use in training the neural network reads on definitions associated with the specific domain and corresponding to the plurality of terms as ground truth.);
and transmitting a response to receiving the electronic document that includes the determined definition of the at least one term (Paragraph 0048, lines 15-24, "In one or more embodiments, the definition extraction system 102 can analyze documents obtained from the client device 106 or associated with digital content items from the client device 106 to extract term definitions. The definition extraction system 102 can provide extracted term definitions to the client device 106 for assisting the user of the client device 106 for interacting with digital content items (e.g., in instructions for performing operations associated with interacting with digital content items)."; Providing extracted term definitions to the client device reads on transmitting a response that includes the determined definition of the at least one term.).
Regarding claim 4, Veyseh discloses the method as claimed in claim 1, wherein the at least one term is a single-word term (Paragraph 0030, lines 1-3, "Furthermore, a term includes a word or phrase that describes a thing or concept. For example, a term can include one word.").
Regarding claim 5, as best understood based on the 35 U.S.C. 112(b) issues identified above, Veyseh discloses the method as claimed in claim 1, wherein the training of the machine learning model is configured to cause the machine learning model to learn associations between (iii) at least a portion of one or more of the plurality of terms in the training data and (iv) at least a portion of the one or more corresponding definitions (Paragraph 0102, lines 1-14, "As further shown in FIG. 4, after determining the predicted-sequence-latent labels 422 and the predicted-term-definition-latent labels 424, the definition extraction system 102 can utilize a semantic consistency loss function 426 to determine a semantic consistency loss. Specifically, the definition extraction system 102 can determine the semantic consistency loss by determining differences between the predicted-sequence-latent labels 422 and the predicted-term-definition-latent labels 424. For example, if the information encoded in a sequence representation vector is semantically consistent with information encoded in the corresponding term-definition vector, the corresponding predicted-sequence-latent label should be the same as the corresponding predicted-term-definition-latent label."; Determining the differences between the predicted-sequence-latent labels and the predicted-term-definition-latent labels reads on learning associations between at least a portion of one or more of the plurality of terms in the training data and at least a portion of the one or more corresponding definitions.).
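For illustration only, the comparison described in the quoted passage of Veyseh (determining differences between two sets of predicted labels, which should be the same when the encoded information is semantically consistent) can be sketched as follows. The function name `label_disagreement` and the use of a simple mismatch rate are assumptions made for this sketch; Veyseh's actual semantic consistency loss function 426 is not reproduced here and may be computed differently.

```python
from typing import Sequence


def label_disagreement(
    sequence_labels: Sequence[int],
    term_definition_labels: Sequence[int],
) -> float:
    """Return the fraction of positions where the two predicted labels differ.

    Hypothetical stand-in for a consistency measure: 0.0 means every
    predicted-sequence label equals its predicted-term-definition label.
    """
    if len(sequence_labels) != len(term_definition_labels):
        raise ValueError("label sequences must have the same length")
    if not sequence_labels:
        return 0.0
    mismatches = sum(
        1 for s, t in zip(sequence_labels, term_definition_labels) if s != t
    )
    return mismatches / len(sequence_labels)
```

Under this sketch, a training procedure that drives the returned value toward zero would be pushing the two sets of predicted labels to agree.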
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Veyseh in view of Joyce et al. (US Patent Application Publication No. 2021/0263900), hereinafter Joyce.
Regarding claim 2, Veyseh discloses the method as claimed in claim 1, but does not specifically disclose: wherein transmitting the response includes adding an annotation to the electronic document that includes the determined definition.
Joyce teaches:
wherein transmitting the response includes adding an annotation to the electronic document that includes the determined definition (Paragraph 0185, lines 3-6, "For example, the generated labels of datasets can be used for data quality enforcement, personal data anonymization, data masking, (PII) reports, test data management, dataset annotation, and so forth."; Paragraph 0198, lines 13-17, "In addition, a computer can interact with a user by sending documents to and receiving documents from a device that is used by the user; for example, by sending web pages to a web browser on a user's client device in response to requests received from the web browser."; Paragraph 0039, lines 5-10, "The semantic discovery system 602 is configured to determine a meaning (e.g., a semantic meaning) of values for one or more fields of the data records. The semantic discovery system 602 can label each of the fields with a semantic label 118 that is selected from a data dictionary database 614."; Dataset annotation reads on adding an annotation to the electronic document, and determining a meaning of values for one or more fields of the data records reads on the determined definition.).
Joyce teaches determining the meaning of values of data records and annotating the dataset in order to determine the data processing operations to apply to the data values to accomplish a specified goal of an application (Paragraph 0031, lines 1-13, "Aspects can include one or more advantages. For instance, the techniques described herein enable a data processing system to automatically generate one or more rules for processing the data of data fields of a dataset. Once the semantic meaning of the data are known, the data processing system determines which data processing operations to apply to the data values to accomplish a specified goal of an application. The data processing system can thus automatically determine how to process different data fields of the dataset to accomplish a goal for the entire dataset, such as masking data values, enforcing data quality rules, identifying a schema of the dataset, and/or selecting test data for testing another application.").
Veyseh and Joyce are considered to be analogous to the claimed invention because they are in the same field of document processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh to incorporate the teachings of Joyce to determine the meaning of values of data records and annotate the dataset.  Doing so would allow for determining the data processing operations to apply to the data values to accomplish a specified goal of an application.
Regarding claim 10, Veyseh discloses the method as claimed in claim 1, but does not specifically disclose: wherein the machine learning model is further trained to determine the definition of the at least one term from the electronic document independently of a remainder of the electronic document.
Joyce teaches:
wherein the machine learning model is further trained to determine the definition of the at least one term from the electronic document independently of a remainder of the electronic document (Paragraph 0158, lines 1-9, "The pattern matching analysis of the testing module 606 uses the data content of the fields (in addition to or instead of the field names). The types of pattern matching that are used for the pattern matching can be determined by the testing module 606 based on the results of the classification data. For example, the classification data may identify a data type of a field, such as that the data are numerical. In this example, the profile data also indicates that each entry in the data field is 13-18 characters long. This may indicate to the testing module 606 that the data field may be a credit card number data field."; Paragraph 0166, lines 1-9, "The testing module 606 is configured to execute machine learning logic in which classifications of prior datasets (e.g., from a particular source) or of prior iterations of the same dataset are remembered and influence which tests are selected for subsequent iterations and how the probability values of those subsequent iterations are determined. The machine learning logic is trained on the dataset and can apply the weights that are developed using the training data to classify new data of the dataset."; Using the data content of the fields to classify the data, as in the example of the data content being numerical and 13-18 characters long used to determine that the data field may be a credit card number, reads on determining the definition of the at least one term from the electronic document independently of a remainder of the electronic document.).
Joyce teaches using the data content of the fields to classify the data in order to determine the data processing operations to apply to the data values to accomplish a specified goal of an application (Paragraph 0031, lines 1-13, "Aspects can include one or more advantages. For instance, the techniques described herein enable a data processing system to automatically generate one or more rules for processing the data of data fields of a dataset. Once the semantic meaning of the data are known, the data processing system determines which data processing operations to apply to the data values to accomplish a specified goal of an application. The data processing system can thus automatically determine how to process different data fields of the dataset to accomplish a goal for the entire dataset, such as masking data values, enforcing data quality rules, identifying a schema of the dataset, and/or selecting test data for testing another application.").
Veyseh and Joyce are considered to be analogous to the claimed invention because they are in the same field of document processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh to incorporate the teachings of Joyce to use the data content of the fields to classify the data.  Doing so would allow for determining the data processing operations to apply to the data values to accomplish a specified goal of an application.
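For illustration only, the content-based classification in the quoted Joyce example (numerical data with each entry 13-18 characters long indicating a possible credit card number field) can be sketched as follows. The function name `guess_field_label` and the label strings are hypothetical and do not appear in Joyce; the sketch is not Joyce's implementation.

```python
from typing import Iterable


def guess_field_label(values: Iterable[str]) -> str:
    """Label a field from its own values only.

    Hypothetical sketch: the field name and all other fields of the
    record are ignored, so the label depends solely on data content.
    """
    entries = [v.strip() for v in values]
    if not entries:
        return "unknown"
    # Data type check: every entry consists only of digits.
    all_numeric = all(e.isdigit() for e in entries)
    # Profile check: every entry is 13 to 18 characters long.
    length_ok = all(13 <= len(e) <= 18 for e in entries)
    if all_numeric and length_ok:
        return "possible_credit_card_number"
    return "unknown"
```

Because the function receives only the values of a single field, its result does not depend on any other content of the dataset.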
Claims 3, 12 – 15 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Veyseh in view of Mascio et al. (“Comparative Analysis of Text Classification Approaches in Electronic Health Records”), hereinafter Mascio.
Regarding claim 3, Veyseh discloses the method as claimed in claim 1, but does not specifically disclose the method further comprising: prior to determining the definition of the at least one term, performing a pre-processing on the at least one term, wherein the pre-processing is predetermined based on the specific domain.
Mascio teaches:
prior to determining the definition of the at least one term, performing a pre-processing on the at least one term, wherein the pre-processing is predetermined based on the specific domain (Section 2.4, lines 9-14, "For the word level tokenizer we chose SciSpaCy as it is specifically aimed at biomedical and scientific text processing. We further tested additional text pre-processing: lowercasing, punctuation removal, stopwords removal, stemming and lemmatization."; Using a tokenizer specific to biomedical and scientific text processing reads on pre-processing predetermined based on the specific domain.).
Mascio teaches using a tokenizer specific to biomedical and scientific text processing in order to improve the performance of text classification tasks (Abstract, lines 15-24, "In this work, we analyse the impact of various word representations, text pre-processing and classification algorithms on the performance of four different text classification tasks. The results show that traditional approaches, when tailored to the specific language and structure of the text inherent to the classification task, can achieve or exceed the performance of more recent ones based on contextual embeddings such as BERT.").
Veyseh and Mascio are considered to be analogous to the claimed invention because they are in the same field of document processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh to incorporate the teachings of Mascio to use a tokenizer specific to biomedical and scientific text processing.  Doing so would allow for improving the performance of text classification tasks.
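For illustration only, three of the pre-processing steps named in the quoted passage of Mascio (lowercasing, punctuation removal, and stopword removal) can be sketched as follows. The function name `preprocess_term` and the stopword set are hypothetical; the sketch splits on whitespace rather than using the SciSpaCy tokenizer, and it omits stemming and lemmatization.

```python
import string
from typing import List

# Hypothetical, deliberately small stopword set for this sketch only.
STOPWORDS = {"a", "an", "the", "of", "and", "or", "in"}


def preprocess_term(text: str) -> List[str]:
    """Lowercase, strip ASCII punctuation, and drop stopwords."""
    lowered = text.lower()
    # Delete every ASCII punctuation character.
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    # Whitespace tokenization stands in for a domain-specific tokenizer.
    return [tok for tok in no_punct.split() if tok not in STOPWORDS]
```

A domain-specific variant would swap in a tokenizer and stopword list chosen for the domain in question, which is the sense in which the pre-processing is predetermined based on the specific domain.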
Regarding claim 12, Veyseh discloses a method of training a machine learning model to output a definition associated with a specific domain in response to an input term, the method comprising:
receiving a plurality of terms and definitions associated with a specific domain and corresponding to the plurality of terms (Paragraph 0014, lines 1-4, "In one or more embodiments, the definition extraction system analyzes a source document to determine whether the source document includes a definition of a term included in the source document."; Paragraph 0093, lines 13-17, "Alternatively, the documents can include text related to a specific domain for training the neural network 301 to extract definitions from sources related to the specific domain (e.g., legal contracts, scientific papers).");
and training a machine learning model, based on the pre-processed plurality of terms as training data and the corresponding pre-processed definitions as ground truth, to generate an output definition associated with the specific domain in response to an input term (Paragraph 0047, lines 3-7, "For instance, the definition extraction system 102 can utilize data associated with extracted term definitions from a set of documents (e.g., a set of training documents) to learn parameters of one or more layers of the machine-learning model 114."; Paragraph 0052, lines 1-9, "As mentioned above, the definition extraction system 102 can accurately and flexibly extract term definitions from documents utilizing machine learning. FIGS. 2A-2B illustrate examples of the machine-learning model 114 of the definition extraction system 102 analyzing a document 200a to extract a term definition 202a. In particular, FIGS. 2A-2B illustrate that the machine-learning model 114 receives the document 200a as an input and outputs the term definition 202a."; Paragraph 0093, lines 1-17, "FIG. 4 illustrates that the neural network 301 receives documents from a document repository 400 as an input. In one or more embodiments, the documents are part of a corpus of documents used for training, verifying, and testing the neural network 301. To illustrate, the documents can include a training dataset, a verification dataset, and a testing dataset. The documents include text related to one or more domains of knowledge. The documents can also include labeled data indicating ground-truth information for use in training the neural network 301. According to some embodiments, the documents can include text related to a plurality of domains for training the neural network 301 to extract definitions from a variety of sources. Alternatively, the documents can include text related to a specific domain for training the neural network 301 to extract definitions from sources related to the specific domain (e.g., legal contracts, scientific papers)."; The text related to a specific domain for training the neural network reads on a plurality of terms as training data, and the documents including labeled data indicating ground-truth information for use in training the neural network reads on the corresponding definitions as ground truth.).
Veyseh does not specifically disclose: performing a pre-processing on each of the plurality of terms and on each of the corresponding definitions, wherein the pre-processing is predetermined based on the specific domain.
Mascio teaches:
performing a pre-processing on each of the plurality of terms and on each of the corresponding definitions, wherein the pre-processing is predetermined based on the specific domain (Section 2.4, lines 9-14, "For the word level tokenizer we chose SciSpaCy as it is specifically aimed at biomedical and scientific text processing. We further tested additional text pre-processing: lowercasing, punctuation removal, stopwords removal, stemming and lemmatization."; Using a tokenizer specific to biomedical and scientific text processing reads on pre-processing predetermined based on the specific domain.).
Mascio teaches using a tokenizer specific to biomedical and scientific text processing in order to improve the performance of text classification tasks (Abstract, lines 15-24, "In this work, we analyse the impact of various word representations, text pre-processing and classification algorithms on the performance of four different text classification tasks. The results show that traditional approaches, when tailored to the specific language and structure of the text inherent to the classification task, can achieve or exceed the performance of more recent ones based on contextual embeddings such as BERT.").
Veyseh and Mascio are considered to be analogous to the claimed invention because they are in the same field of document processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh to incorporate the teachings of Mascio to use a tokenizer specific to biomedical and scientific text processing.  Doing so would allow for improving the performance of text classification tasks.
Regarding claim 13, Veyseh in view of Mascio discloses the method as claimed in claim 12.
Mascio further teaches:
wherein the machine learning model is configured to perform the pre-processing on the input term prior to generating the output definition (Section 2.4, lines 9-14, "For the word level tokenizer we chose SciSpaCy as it is specifically aimed at biomedical and scientific text processing. We further tested additional text pre-processing: lowercasing, punctuation removal, stopwords removal, stemming and lemmatization."; Section 3.2, lines 1-3, "In addition to exploring various embeddings, we tested the impact of text pre-processing on classification task performance."; Performing text pre-processing prior to performing classification reads on performing the pre-processing on the input term prior to generating the output definition.).
Mascio teaches performing text pre-processing prior to performing classification in order to improve the performance of text classification tasks (Abstract, lines 15-24, "In this work, we analyse the impact of various word representations, text pre-processing and classification algorithms on the performance of four different text classification tasks. The results show that traditional approaches, when tailored to the specific language and structure of the text inherent to the classification task, can achieve or exceed the performance of more recent ones based on contextual embeddings such as BERT.").
Veyseh and Mascio are considered to be analogous to the claimed invention because they are in the same field of document processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh in view of Mascio to further incorporate the teachings of Mascio to perform text pre-processing prior to performing classification.  Doing so would allow for improving the performance of text classification tasks.
Regarding claim 14, Veyseh in view of Mascio discloses the method as claimed in claim 12. 
Veyseh further discloses:
wherein each of the plurality of terms is a single-word term (Paragraph 0030, lines 1-3, "Furthermore, a term includes a word or phrase that describes a thing or concept. For example, a term can include one word.").
Regarding claim 15, as best understood based on the 35 U.S.C. 112(b) issues identified above, Veyseh in view of Mascio discloses the method as claimed in claim 12.
Veyseh further discloses:
wherein the training of the machine learning model is configured to cause the machine learning model to learn associations between (iii) at least a portion of one or more of the plurality of terms in the training data and (iv) at least a portion of the one or more corresponding definitions (Paragraph 0102, lines 1-14, "As further shown in FIG. 4, after determining the predicted-sequence-latent labels 422 and the predicted-term-definition-latent labels 424, the definition extraction system 102 can utilize a semantic consistency loss function 426 to determine a semantic consistency loss. Specifically, the definition extraction system 102 can determine the semantic consistency loss by determining differences between the predicted-sequence-latent labels 422 and the predicted-term-definition-latent labels 424. For example, if the information encoded in a sequence representation vector is semantically consistent with information encoded in the corresponding term-definition vector, the corresponding predicted-sequence-latent label should be the same as the corresponding predicted-term-definition-latent label."; Determining the differences between the predicted-sequence-latent labels and the predicted-term-definition-latent labels reads on learning associations between at least a portion of one or more of the plurality of terms in the training data and at least a portion of the one or more corresponding definitions.).
Regarding claim 20, as best understood based on the 35 U.S.C. 112(b) issues identified above, Veyseh discloses a system for determining a definition association with a specific domain of a term in an electronic document, the system comprising:
a processor; and a memory that is operatively connected to the processor (Paragraph 0127, lines 1-7, "The components of the definition extraction system 102 can include software, hardware, or both. For example, the components of the definition extraction system 102 can include one or more instructions stored on a computer-readable storage medium and executable by processors of one or more computing devices (e.g., the computing device(s) 500)."),
and that stores: a machine learning model that is trained, based on (i) a plurality of terms associated with a specific domain as training data and (ii) definitions associated with the specific domain and corresponding to the plurality of terms as ground truth (Paragraph 0047, lines 3-7, "For instance, the definition extraction system 102 can utilize data associated with extracted term definitions from a set of documents (e.g., a set of training documents) to learn parameters of one or more layers of the machine-learning model 114."; Paragraph 0052, lines 1-9, "As mentioned above, the definition extraction system 102 can accurately and flexibly extract term definitions from documents utilizing machine learning. FIGS. 2A-2B illustrate examples of the machine-learning model 114 of the definition extraction system 102 analyzing a document 200a to extract a term definition 202a. In particular, FIGS. 2A-2B illustrate that the machine-learning model 114 receives the document 200a as an input and outputs the term definition 202a."; Paragraph 0093, lines 1-17, "FIG. 4 illustrates that the neural network 301 receives documents from a document repository 400 as an input. In one or more embodiments, the documents are part of a corpus of documents used for training, verifying, and testing the neural network 301. To illustrate, the documents can include a training dataset, a verification dataset, and a testing dataset. The documents include text related to one or more domains of knowledge. The documents can also include labeled data indicating ground-truth information for use in training the neural network 301. According to some embodiments, the documents can include text related to a plurality of domains for training the neural network 301 to extract definitions from a variety of sources. Alternatively, the documents can include text related to a specific domain for training the neural network 301 to extract definitions from sources related to the specific domain (e.g., legal contracts, scientific papers)."; The text related to a specific domain for training the neural network reads on a plurality of terms associated with the specific domain as training data, and the documents including labeled data indicating ground-truth information for use in training the neural network reads on definitions associated with the specific domain and corresponding to the plurality of terms as ground truth.),
to: learn associations between (iii) at least a portion of one or more of the plurality of terms in the training data and (iv) at least a portion of the one or more corresponding definitions (Paragraph 0102, lines 1-14, "As further shown in FIG. 4, after determining the predicted-sequence-latent labels 422 and the predicted-term-definition-latent labels 424, the definition extraction system 102 can utilize a semantic consistency loss function 426 to determine a semantic consistency loss. Specifically, the definition extraction system 102 can determine the semantic consistency loss by determining differences between the predicted-sequence-latent labels 422 and the predicted-term-definition-latent labels 424. For example, if the information encoded in a sequence representation vector is semantically consistent with information encoded in the corresponding term-definition vector, the corresponding predicted-sequence-latent label should be the same as the corresponding predicted-term-definition-latent label."; Determining the differences between the predicted-sequence-latent labels and the predicted-term-definition-latent labels reads on learning associations between at least a portion of one or more of the plurality of terms in the training data and at least a portion of the one or more corresponding definitions.);
and generate an output definition associated with the specific domain in response to an input term (Paragraph 0052, lines 1-9, "As mentioned above, the definition extraction system 102 can accurately and flexibly extract term definitions from documents utilizing machine learning. FIGS. 2A-2B illustrate examples of the machine-learning model 114 of the definition extraction system 102 analyzing a document 200a to extract a term definition 202a. In particular, FIGS. 2A-2B illustrate that the machine-learning model 114 receives the document 200a as an input and outputs the term definition 202a.");
and instructions that are executable by the processor to cause the processor to perform operations, including: receiving an electronic document that is associated with the specific domain, the electronic document including at least one term (Paragraph 0127, lines 1-7, "The components of the definition extraction system 102 can include software, hardware, or both. For example, the components of the definition extraction system 102 can include one or more instructions stored on a computer-readable storage medium and executable by processors of one or more computing devices (e.g., the computing device(s) 500)."; Paragraph 0014, lines 1-4, "In one or more embodiments, the definition extraction system analyzes a source document to determine whether the source document includes a definition of a term included in the source document."; Paragraph 0093, lines 13-17, "Alternatively, the documents can include text related to a specific domain for training the neural network 301 to extract definitions from sources related to the specific domain (e.g., legal contracts, scientific papers).");
determining a definition of the at least one term via the machine learning model (Paragraph 0052, lines 1-9, "As mentioned above, the definition extraction system 102 can accurately and flexibly extract term definitions from documents utilizing machine learning. FIGS. 2A-2B illustrate examples of the machine-learning model 114 of the definition extraction system 102 analyzing a document 200a to extract a term definition 202a. In particular, FIGS. 2A-2B illustrate that the machine-learning model 114 receives the document 200a as an input and outputs the term definition 202a.");
and transmitting a response to receiving the electronic document that includes the determined definition of the at least one term (Paragraph 0048, lines 15-24, "In one or more embodiments, the definition extraction system 102 can analyze documents obtained from the client device 106 or associated with digital content items from the client device 106 to extract term definitions. The definition extraction system 102 can provide extracted term definitions to the client device 106 for assisting the user of the client device 106 for interacting with digital content items (e.g., in instructions for performing operations associated with interacting with digital content items)."; Providing extracted term definitions to the client device reads on transmitting a response that includes the determined definition of the at least one term.).
Veyseh does not specifically disclose: performing a pre-processing on the at least one term, wherein the pre-processing is predetermined based on the specific domain.
Mascio teaches:
performing a pre-processing on the at least one term, wherein the pre-processing is predetermined based on the specific domain (Section 2.4, lines 9-14, "For the word level tokenizer we chose SciSpaCy as it is specifically aimed at biomedical and scientific text processing. We further tested additional text pre-processing: lowercasing, punctuation removal, stopwords removal, stemming and lemmatization."; Using a tokenizer specific to biomedical and scientific text processing reads on pre-processing predetermined based on the specific domain.).
Mascio teaches using a tokenizer specific to biomedical and scientific text processing in order to improve the performance of text classification tasks (Abstract, lines 15-24, "In this work, we analyse the impact of various word representations, text pre-processing and classification algorithms on the performance of four different text classification tasks. The results show that traditional approaches, when tailored to the specific language and structure of the text inherent to the classification task, can achieve or exceed the performance of more recent ones based on contextual embeddings such as BERT.").
Veyseh and Mascio are considered to be analogous to the claimed invention because they are in the same field of document processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh to incorporate the teachings of Mascio to use a tokenizer specific to biomedical and scientific text processing.  Doing so would allow for improving the performance of text classification tasks.
Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Veyseh in view of Meng et al. (US Patent Application Publication No. 2022/0284028), hereinafter Meng.
Regarding claim 6, as best understood based on the 35 U.S.C. 112(b) issues identified above, Veyseh discloses the method as claimed in claim 1, but does not specifically disclose: wherein the machine learning model includes only a single stack of encoders.
Meng teaches:
wherein the machine learning model includes only a single stack of encoders (Paragraph 0021, lines 6-10, "For example, with some embodiments, each Transformer used for encoding each text input type is configured to operate with a single layer having a fixed number of attention heads (e.g., ten) and feed forward mechanisms."; A Transformer used for encoding with a single layer having a fixed number of attention heads reads on a single stack of encoders.).
Meng teaches using a Transformer for encoding with a single layer having a fixed number of attention heads in order to reduce computational complexity and latency (Paragraph 0021, lines 1-6, "In addition to reducing the computational complexity, and thus latency, of the Transformers by establishing a maximum text input length for each type of text input to be encoded by a Transformer, various hyperparameters of the Transformer are selected to ensure optimal performance of the overall ranking system.").
Veyseh and Meng are considered to be analogous to the claimed invention because they are in the same field of document processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh to incorporate the teachings of Meng to use a Transformer for encoding with a single layer having a fixed number of attention heads.  Doing so would allow for reducing computational complexity and latency.
Claims 7 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Veyseh in view of Lin et al. (US Patent No. 11,176,330), hereinafter Lin.
Regarding claim 7, Veyseh discloses the method as claimed in claim 1, but does not specifically disclose: wherein the machine learning model is an attention-based sequence-to-sequence model.
Lin teaches:
wherein the machine learning model is an attention-based sequence-to-sequence model (Column 8, lines 60-64, "Then, the generation model can be trained based on the determined training sample. In the implementation of the present specification, a sequence-to-sequence (Seq2seq) model can be selected as the generation model, namely, an encoder-decoder network (Encoder-Decoder)."; Column 10, lines 12-17, "In the model training process, the output of the decoder neural network at each moment can be determined by the input. That is, for each input, a word can be selected based on probability distribution of words in a word list. The word probability distribution can be implemented, for example, by using an attention mechanism.").
Lin teaches using a sequence-to-sequence model with attention in order to improve the efficiency and effectiveness of generating information (Column 6, lines 5-15, "The generation model can be a neural network based on natural language processing, for example, an encoder-decoder network (Encoder-Decoder). The encoder-decoder network can be used to handle sequence-to-sequence problems. The recommendation information can be considered as a sequence of a plurality of words. A predetermined quantity of pieces of recommendation information can be generated based on input keywords by using the generation model, so that diversified recommendation information is generated, and therefore efficiency and effectiveness of generating the recommendation information is improved.").
Veyseh and Lin are considered to be analogous to the claimed invention because they are in the same field of document processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh to incorporate the teachings of Lin to use a sequence-to-sequence model with attention.  Doing so would allow for improving the efficiency and effectiveness of generating information.
Regarding claim 8, Veyseh in view of Lin discloses the method as claimed in claim 7.
Lin further teaches:
wherein the machine learning model includes a gated recurrent unit based encoder-decoder recurrent neural network (Column 3, lines 22-31, "In an implementation, the encoder-decoder network includes an encoder neural network and a decoder neural network, the encoder neural network is configured to convert an input word sequence into a semantic vector, the decoder neural network is configured to predict a predetermined quantity of character sequences based on the semantic vector, and the encoder neural network or the decoder neural network is one of a recurrent neural network, a bidirectional recurrent neural network, a gated recurrent unit, and a long short-term memory model."; The encoder neural network or the decoder neural network being a gated recurrent unit reads on a gated recurrent unit based encoder-decoder recurrent neural network, as a gated recurrent unit neural network is a type of recurrent neural network.).
Lin teaches using a gated recurrent unit encoder-decoder neural network in order to improve the efficiency and effectiveness of generating information (Column 6, lines 5-15, "The generation model can be a neural network based on natural language processing, for example, an encoder-decoder network (Encoder-Decoder). The encoder-decoder network can be used to handle sequence-to-sequence problems. The recommendation information can be considered as a sequence of a plurality of words. A predetermined quantity of pieces of recommendation information can be generated based on input keywords by using the generation model, so that diversified recommendation information is generated, and therefore efficiency and effectiveness of generating the recommendation information is improved.").
Veyseh and Lin are considered to be analogous to the claimed invention because they are in the same field of document processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh in view of Lin to further incorporate the teachings of Lin to use a gated recurrent unit encoder-decoder neural network.  Doing so would allow for improving the efficiency and effectiveness of generating information.
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Veyseh in view of Sherstinsky ("Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network").
Regarding claim 9, as best understood based on the 35 U.S.C. 112(b) issues identified above, Veyseh discloses the method as claimed in claim 1, but does not specifically disclose: wherein each pair of term and corresponding definition in the training data and ground truth, respectively, is independent of each other.
Sherstinsky teaches:
wherein each pair of term and corresponding definition in the training data and ground truth, respectively, is independent of each other (Page 13, footnote 17, lines 8-12, "Moreover, if the input data is prepared ahead of time and is made available to the system in its entirety, then the causality restriction can be relaxed altogether. This can be feasible in applications, where the entire training data set or a collection of independent training data segments is gathered before processing commences."; Independent training data segments read on each pair of term and corresponding definition in the training data and ground truth being independent of each other.).
Sherstinsky teaches using independent training data segments in order to detect the presence of context among data samples for analyzing audio, speech, and text (Page 13, footnote 17, lines 12-17, "Non-causal processing (i.e., a technique characterized by taking advantage of the input data “from the future”) can be advantageous in detecting the presence of “context” among data samples. Utilizing the information at the “future” steps as part of context for making decisions at the “current” step is often beneficial for analyzing audio, speech, and text.").
Veyseh and Sherstinsky are considered to be analogous to the claimed invention because they are in the same field of machine learning systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh to incorporate the teachings of Sherstinsky to use independent training data segments.  Doing so would allow for detecting the presence of context among data samples for analyzing audio, speech, and text.
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Veyseh in view of Soeder et al. (US Patent No. 11,218,500), hereinafter Soeder.
Regarding claim 11, Veyseh discloses the method as claimed in claim 1, but does not specifically disclose: wherein the electronic document includes one or more of event or system log data.
Soeder teaches:
wherein the electronic document includes one or more of event or system log data (Column 4, lines 5-11, "FIG. 1 illustrates a schematic diagram of a system 10 for automated parsing and identification of textual data according to the present disclosure. The system 10 generally employs a multi-stage approach, with a parsing system 15, including a parser 20 and a classifier 30 for receiving and parsing raw incoming textural data, and a labeling system 40."; Column 5, lines 37-43, "The system 10 generally is designed to process system-generated textual outputs, such as that commonly found in log files (e.g. web server logs, intrusion detection system logs, operating system logs, etc.), though the system 10 can process other textual data or other information, without departing from the scope of the present disclosure.").
Soeder teaches performing identification of textual data from system logs in order to increase the efficiency and effectiveness of detecting potential data security threats (Column 1, lines 15-24, "Increasing the efficiency and effectiveness of system log data analysis has been the focus of substantial research and development, particularly among security analysts attempting to inspect and analyze security log messages for evidence of security incidents, threats, and/or other fault conditions/issues, as well as to diagnose system performance problems and other types of analyses. The faster and more accurately potential data security threats can be detected, the faster remedial actions can be enacted to stop, remediate and/or prevent such threats.").
Veyseh and Soeder are considered to be analogous to the claimed invention because they are in the same field of document processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh to incorporate the teachings of Soeder to perform identification of textual data from system logs.  Doing so would allow for increasing the efficiency and effectiveness of detecting potential data security threats.
Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Veyseh in view of Mascio, and further in view of Meng.
Regarding claim 16, as best understood based on the 35 U.S.C. 112(b) issues identified above, Veyseh in view of Mascio discloses the method as claimed in claim 12, but does not specifically disclose: wherein the machine learning model includes only a single stack of encoders.
Meng teaches:
wherein the machine learning model includes only a single stack of encoders (Paragraph 0021, lines 6-10, "For example, with some embodiments, each Transformer used for encoding each text input type is configured to operate with a single layer having a fixed number of attention heads (e.g., ten) and feed forward mechanisms."; A Transformer used for encoding with a single layer having a fixed number of attention heads reads on a single stack of encoders.).
Meng teaches using a Transformer for encoding with a single layer having a fixed number of attention heads in order to reduce computational complexity and latency (Paragraph 0021, lines 1-6, "In addition to reducing the computational complexity, and thus latency, of the Transformers by establishing a maximum text input length for each type of text input to be encoded by a Transformer, various hyperparameters of the Transformer are selected to ensure optimal performance of the overall ranking system.").
Veyseh, Mascio, and Meng are considered to be analogous to the claimed invention because they are in the same field of document processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh in view of Mascio to incorporate the teachings of Meng to use a Transformer for encoding with a single layer having a fixed number of attention heads.  Doing so would allow for reducing computational complexity and latency.
Claims 17 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Veyseh in view of Mascio, and further in view of Lin.
Regarding claim 17, Veyseh in view of Mascio discloses the method as claimed in claim 12, but does not specifically disclose: wherein the machine learning model is an attention-based sequence-to-sequence model.
Lin teaches:
wherein the machine learning model is an attention-based sequence-to-sequence model (Column 8, lines 60-64, "Then, the generation model can be trained based on the determined training sample. In the implementation of the present specification, a sequence-to-sequence (Seq2seq) model can be selected as the generation model, namely, an encoder-decoder network (Encoder-Decoder)."; Column 10, lines 12-17, "In the model training process, the output of the decoder neural network at each moment can be determined by the input. That is, for each input, a word can be selected based on probability distribution of words in a word list. The word probability distribution can be implemented, for example, by using an attention mechanism.").
Lin teaches using a sequence-to-sequence model with attention in order to improve the efficiency and effectiveness of generating information (Column 6, lines 5-15, "The generation model can be a neural network based on natural language processing, for example, an encoder-decoder network (Encoder-Decoder). The encoder-decoder network can be used to handle sequence-to-sequence problems. The recommendation information can be considered as a sequence of a plurality of words. A predetermined quantity of pieces of recommendation information can be generated based on input keywords by using the generation model, so that diversified recommendation information is generated, and therefore efficiency and effectiveness of generating the recommendation information is improved.").
Veyseh, Mascio, and Lin are considered to be analogous to the claimed invention because they are in the same field of document processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh in view of Mascio to incorporate the teachings of Lin to use a sequence-to-sequence model with attention.  Doing so would allow for improving the efficiency and effectiveness of generating information.
Regarding claim 18, Veyseh in view of Mascio and further in view of Lin discloses the method as claimed in claim 17.
Lin further teaches:
wherein the machine learning model includes a gated recurrent unit based encoder-decoder recurrent neural network (Column 3, lines 22-31, "In an implementation, the encoder-decoder network includes an encoder neural network and a decoder neural network, the encoder neural network is configured to convert an input word sequence into a semantic vector, the decoder neural network is configured to predict a predetermined quantity of character sequences based on the semantic vector, and the encoder neural network or the decoder neural network is one of a recurrent neural network, a bidirectional recurrent neural network, a gated recurrent unit, and a long short-term memory model."; The encoder neural network or the decoder neural network being a gated recurrent unit reads on a gated recurrent unit based encoder-decoder recurrent neural network, as a gated recurrent unit neural network is a type of recurrent neural network.).
Lin teaches using a gated recurrent unit encoder-decoder neural network in order to improve the efficiency and effectiveness of generating information (Column 6, lines 5-15, "The generation model can be a neural network based on natural language processing, for example, an encoder-decoder network (Encoder-Decoder). The encoder-decoder network can be used to handle sequence-to-sequence problems. The recommendation information can be considered as a sequence of a plurality of words. A predetermined quantity of pieces of recommendation information can be generated based on input keywords by using the generation model, so that diversified recommendation information is generated, and therefore efficiency and effectiveness of generating the recommendation information is improved.").
Veyseh, Mascio, and Lin are considered to be analogous to the claimed invention because they are in the same field of document processing systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh in view of Mascio and further in view of Lin to further incorporate the teachings of Lin to use a gated recurrent unit encoder-decoder neural network.  Doing so would allow for improving the efficiency and effectiveness of generating information.
Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Veyseh in view of Mascio, and further in view of Sherstinsky and Joyce.
Regarding claim 19, as best understood based on the 35 U.S.C. 112(b) issues identified above, Veyseh in view of Mascio discloses the method as claimed in claim 12, but does not specifically disclose wherein: each pair of term and corresponding definition in the training data and ground truth, respectively, is independent of each other.
Sherstinsky teaches:
each pair of term and corresponding definition in the training data and ground truth, respectively, is independent of each other (Page 13, footnote 17, lines 8-12, "Moreover, if the input data is prepared ahead of time and is made available to the system in its entirety, then the causality restriction can be relaxed altogether. This can be feasible in applications, where the entire training data set or a collection of independent training data segments is gathered before processing commences."; Independent training data segments read on each pair of term and corresponding definition in the training data and ground truth being independent of each other.).
Sherstinsky teaches using independent training data segments in order to detect the presence of context among data samples for analyzing audio, speech, and text (Page 13, footnote 17, lines 12-17, "Non-causal processing (i.e., a technique characterized by taking advantage of the input data “from the future”) can be advantageous in detecting the presence of “context” among data samples. Utilizing the information at the “future” steps as part of context for making decisions at the “current” step is often beneficial for analyzing audio, speech, and text.").
Veyseh, Mascio, and Sherstinsky are considered to be analogous to the claimed invention because they are in the same field of machine learning systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh in view of Mascio to incorporate the teachings of Sherstinsky to use independent training data segments.  Doing so would allow for detecting the presence of context among data samples for analyzing audio, speech, and text.
Veyseh in view of Mascio and further in view of Sherstinsky does not specifically disclose: the machine learning model is further trained to determine the definition of the input term independently of other data associated with the input term.
Joyce teaches:
the machine learning model is further trained to determine the definition of the input term independently of other data associated with the input term (Paragraph 0158, lines 1-9, "The pattern matching analysis of the testing module 606 uses the data content of the fields (in addition to or instead of the field names). The types of pattern matching that are used for the pattern matching can be determined by the testing module 606 based on the results of the classification data. For example, the classification data may identify a data type of a field, such as that the data are numerical. In this example, the profile data also indicates that each entry in the data field is 13-18 characters long. This may indicate to the testing module 606 that the data field may be a credit card number data field."; Paragraph 0166, lines 1-9, "The testing module 606 is configured to execute machine learning logic in which classifications of prior datasets (e.g., from a particular source) or of prior iterations of the same dataset are remembered and influence which tests are selected for subsequent iterations and how the probability values of those subsequent iterations are determined. The machine learning logic is trained on the dataset and can apply the weights that are developed using the training data to classify new data of the dataset."; Using the data content of the fields to classify the data, as in the example of the data content being numerical and 13-18 characters long used to determine that the data field may be a credit card number, reads on determining the definition of the input term independently of other data associated with the input term.).
Joyce teaches using the data content of the fields to classify the data in order to determine the data processing operations to apply to the data values to accomplish a specified goal of an application (Paragraph 0031, lines 1-13, "Aspects can include one or more advantages. For instance, the techniques described herein enable a data processing system to automatically generate one or more rules for processing the data of data fields of a dataset. Once the semantic meaning of the data are known, the data processing system determines which data processing operations to apply to the data values to accomplish a specified goal of an application. The data processing system can thus automatically determine how to process different data fields of the dataset to accomplish a goal for the entire dataset, such as masking data values, enforcing data quality rules, identifying a schema of the dataset, and/or selecting test data for testing another application.").
Veyseh, Mascio, Sherstinsky, and Joyce are considered to be analogous to the claimed invention because they are in the same field of machine learning systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Veyseh in view of Mascio and further in view of Sherstinsky to incorporate the teachings of Joyce to use the data content of the fields to classify the data.  Doing so would allow for determining the data processing operations to apply to the data values to accomplish a specified goal of an application.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to James Boggs whose telephone number is (571)272-2968. The examiner can normally be reached M-F 8:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached on (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JAMES BOGGS/Examiner, Art Unit 2657

/DANIEL C WASHBURN/Supervisory Patent Examiner, Art Unit 2657