DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
Claims 1-2, 4-5, 7-10, 12-13, and 15-19 were amended. Claims 3, 6, 11, 14, and 20 were previously canceled. Claims 21-23 were added. Claims 1-2, 4-5, 7-10, 12-13, 15-19 and 21-23 are pending and are examined herein.
Claims 1-2, 4-5, 7-10, 12-13, 15-19, and 21-23 are objected to for minor informalities.
Claims 1-2, 4-5, 7-10, 12-13, 15-19, and 21-23 are rejected under 35 USC 112(b). 
Claims 17-19 and 21-23 are rejected under 35 USC 101 because the claimed invention is directed to non-statutory subject matter.  
Claims 1-2, 4-5, 7-10, 12-13, 15-19, and 21-23 are rejected under 35 USC 101 as being directed to an abstract idea without significantly more. See response to arguments.
Applicant’s amendment overcomes the previous grounds of rejection of claims 1-2, 4-5, 7-10, 12-13, and 15-19 under 35 USC 103. New grounds of rejection under 35 USC 103 of claims 1-2, 4-5, 7-10, 12-13, 15-19, and 21-23 necessitated by amendment are presented herein.

Response to Arguments
	Applicant’s arguments filed 08/29/2022 regarding the rejection under 35 USC 103 have been fully considered, and are persuasive. Note newly cited Ittycheriah in the current grounds of rejection with respect to the client-specific text-formatting issue.

	Applicant’s arguments filed 08/29/2022 regarding the rejection under 35 USC 101 have been fully considered, but are not persuasive. Applicant argues on pages 9-12 that the claims represent an improvement to technology and more specifically an improvement to performing natural language understanding by using machine learning models. Examiner respectfully disagrees. Performing natural language understanding is fundamentally a mental process. As was made clear in SAP America Inc. v. InvestPic LLC, an improvement in the realm of an abstract idea does not make a claim eligible. The “natural language understanding (NLU) algorithm” and the training of the “natural language understanding (NLU) machine learning model” are recited at a very high level of generality with few technical implementation features. Consequently, the recitation of these elements in the claim amounts to a mere instruction to apply the abstract idea using a computer. Providing further details of the machine learning model training would likely help to overcome the rejection under 35 USC 101 although further consideration of any claim language would be required to reach a determination.

Specification
The specification is objected to as failing to provide proper antecedent basis for the claimed subject matter.  See 37 CFR 1.75(d)(1) and MPEP § 608.01(o).  Correction of the following is required: The specification does not provide antecedent basis for “computer-readable tangible storage media” as recited in claim 9 and claims dependent thereon. Note also rejection under 35 USC 112(b).

Claim Objections
Claims 1-2, 4-5, 7-10, 12-13, 15-19, and 21-23 are objected to because of the following informalities:  
Independent claims 1, 9, and 17 recite “grouping the quality heuristics in the set of quality heuristics”; however, “the quality heuristics” lacks proper antecedent basis. 37 CFR 1.71(a) requires the use of “full, clear, concise, and exact terms”. This requirement would be better met by amending to “
Independent claims 1, 9, and 17 recite “applying the quality heuristics in a respective cluster to the data”; however, “the quality heuristics in a respective cluster” lacks proper antecedent basis. 37 CFR 1.71(a) requires the use of “full, clear, concise, and exact terms”. This requirement would be better met by amending to “[[the]] quality heuristics in a respective cluster”. Claims dependent on claims 1, 9, and 17 are objected to with the same rationale.
Claims 1, 9, and 17 recite “the data corresponding to a particular cluster”; however, this limitation lacks proper antecedent basis. 37 CFR 1.71(a) requires the use of “full, clear, concise, and exact terms”. This requirement would be better met by amending to “[[the]] data corresponding to a particular cluster”. Claims dependent on claims 1, 9, and 17 are objected to with the same rationale.
Claims 1, 9, and 17 recite “the quality heuristics in the particular cluster”; however, this limitation lacks proper antecedent basis. 37 CFR 1.71(a) requires the use of “full, clear, concise, and exact terms”. This requirement would be better met by amending to “[[the]] quality heuristics in the particular cluster”. Claims dependent on claims 1, 9, and 17 are objected to with the same rationale.
Claims 1, 9, and 17 recite “the quality score for the particular cluster”; however, this limitation lacks proper antecedent basis. 37 CFR 1.71(a) requires the use of “full, clear, concise, and exact terms”. This requirement would be better met by amending to “[[the]] a quality score for the particular cluster”. Claims dependent on claims 1, 9, and 17 are objected to with the same rationale.
Claims 5, 13, and 19 recite “the data corresponding to each cluster”; however, this limitation lacks proper antecedent basis. 37 CFR 1.71(a) requires the use of “full, clear, concise, and exact terms”. This requirement would be better met by amending to “[[the]] data corresponding to each cluster”.
Claims 7, 15, and 22 recite “the set”; however, this limitation lacks proper antecedent basis. For the purposes of examination, this limitation is being interpreted as “the set of quality heuristics”.
Claims 8, 16, and 23 recite “the patterns”; however, this limitation lacks proper antecedent basis the first time that it appears. 37 CFR 1.71(a) requires the use of “full, clear, concise, and exact terms”. This requirement would be better met by amending to “[[the]] patterns” the first time this limitation appears.
Claim 9 recites “A computer system enhancing a natural language understanding (NLU) algorithm”. The phrasing makes it appear that this is a positively recited step, but the claim is directed to a system. As the limitations are directed to capabilities of the system and not specific functions or actions performed by a user, this particular limitation is not being treated as indefinite (see MPEP 2173.05(p), section II, especially discussion of Mastermine Software, Inc. v. Microsoft Corp). However, 37 CFR 1.71(a) requires the use of “full, clear, concise, and exact terms”. This requirement would be better met by amending to “A computer system configured to enhance .
Appropriate correction is required.

Claim Interpretation – Contingent Limitations
	 The claims variously recite limitations of the form “do A when B”. These are being interpreted as contingent limitations. In particular, the form “do A when B” is not being interpreted as requiring that condition B occur. For example, claim 1 recites “identifying a pattern in the data corresponding to a particular cluster based on the set of quality heuristics in the particular cluster when the quality score for the particular cluster is below a threshold”; however, the claim does not appear to require that the quality score for the particular cluster is below a threshold. The use of “when” in claims 5, 13, and 19 has the same issue. The interpretation of contingent limitations may be found at MPEP 2111.04, section II. In particular, “The broadest reasonable interpretation of a method (or process) claim having contingent limitations requires only those steps that must be performed and does not include steps that are not required to be performed because the condition(s) precedent are not met” and “The broadest reasonable interpretation of a system (or apparatus or product) claim having structure that performs a function, which only needs to occur if a condition precedent is met, requires structure for performing the function should the condition occur.” 

	Alternate language that Applicant may consider which would resolve the issue of having a contingent limitation whose precedent condition is not necessarily met: “identifying a pattern in the data corresponding to a particular cluster based on the set of quality heuristics in the particular cluster 

	Applicant may consider similar amendments to claims 5, 13, and 19. 
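	For illustration only, and not as part of the claims or the record, the contingent form “do A when B” discussed above behaves like a guarded step: the method runs to completion whether or not condition B is ever met. The function name, cluster names, scores, and threshold in the sketch below are hypothetical.

```python
def identify_patterns(cluster_scores, threshold):
    """Perform the contingent step only for clusters scoring below the threshold."""
    flagged = []
    for cluster, score in cluster_scores.items():
        if score < threshold:        # condition B
            flagged.append(cluster)  # step A, performed only when B holds
    return flagged


# Condition B is never met: the method still completes, but step A never runs.
print(identify_patterns({"c1": 0.9, "c2": 0.8}, threshold=0.5))  # []

# Condition B is met for one cluster: step A runs for that cluster only.
print(identify_patterns({"c1": 0.9, "c2": 0.3}, threshold=0.5))  # ['c2']
```

	Under the broadest reasonable interpretation of a method claim, the first call corresponds to a performance of the method in which the contingent step is not required.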

Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-2, 4-5, 7-10, 12-13, 15-19 and 21-23 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

	Independent claims 1, 9, and 17 recite “the container level” and at several points “each container level”. However, it is unclear what relationship exists between these limitations and the preceding “one or more container level”. For the purposes of examination, these limitations are being interpreted as “the one or more container level” and “each of the one or more container level”. Note in particular that the claim is not being interpreted as requiring more than a single container level.
	Dependent claims 2, 4-5, 7-8, 10, 12-13, 15-16, 18-19 and 21-23 do not resolve the issue and are rejected with the same rationale.

	Claims 1, 9 and 17 recite “wherein the quality score for each cluster is calculated”; however, “the quality score for each cluster” lacks proper antecedent basis. The nearest prior limitation is “a quality score”. The use of “the” in “the quality score for each cluster” could imply that “the quality score for each cluster” is the same as the “quality score” used in the clustering. However, it is unclear what relationship (if any) exists between “a quality score” and “the quality score for each cluster”. For the purposes of examination, this limitation is being interpreted as “wherein [[the]] a quality score for each cluster is calculated”. 
	Dependent claims 2, 4-5, 7-8, 10, 12-13, 15-16, 18-19 and 21-23 do not resolve the issue and are rejected with the same rationale.

	Claims 1, 9, and 17 recite the claim term “quality heuristics”. As per MPEP 2173.05(a), the meaning of every term should be apparent. In this case, it is unclear what the scope of the claim term “quality heuristics” is. While Applicant does provide examples of quality heuristics starting at published [0057-0082], the specification makes it clear that quality heuristics are not limited to these examples. For the purposes of examination, a quality heuristic is being interpreted as any feature computed or determined based on the document. Using the specific examples of “quality heuristics” at [0057-0082] would not be rejected as indefinite.
	Dependent claims 2, 4-5, 7-8, 10, 12-13, 15-16, 18-19 and 21-23 do not resolve the issue and are rejected with the same rationale.

	Claim 9 recites “computer-readable tangible storage media”. This term is not defined by the specification. The specification includes a disclaimer of signals per se for the term “computer readable storage medium”. It is unclear whether the disclaimer in the specification applies to the claimed terminology. For the purposes of examination, this limitation is being interpreted as not necessarily invoking the disclaimer (i.e., computer-readable tangible storage media may encompass signals per se). 
	Dependent claims 10, 12-13, and 15-16 do not resolve the issue and are rejected with the same rationale.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.



Claims 17-19 and 21-23 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  The claims do not fall within at least one of the four categories of patent eligible subject matter because the broadest reasonable interpretation of “computer-readable storage device” encompasses both statutory and non-statutory subject matter such as signals per se. Note that the previous claim language “computer readable storage medium” was not given this rejection in view of the disclaimer at [0159] of the published specification, which does not clearly and unmistakably apply to the limitation “computer-readable storage device”, but would clearly and unmistakably apply to the limitation “computer readable storage medium”. See MPEP 2111.01(IV).



Claims 1-2, 4-5, 7-10, 12-13, 15-19 and 21-23 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 

	When considering subject matter eligibility under 35 U.S.C. 101, it must be determined whether the claim is directed to one of the four statutory categories of invention, i.e., process, machine, manufacture, or composition of matter (Step 1). If the claim does fall within one of the statutory categories, the second step in the analysis is to determine whether the claim is directed to a judicial exception (Step 2A). The Step 2A analysis is broken into two prongs. In the first prong (Step 2A, Prong 1), it is determined whether or not the claims recite a judicial exception (e.g., mathematical concepts, mental processes, certain methods of organizing human activity). If it is determined in Step 2A, Prong 1 that the claims recite a judicial exception, the analysis proceeds to the second prong (Step 2A, Prong 2), where it is determined whether or not the claims integrate the judicial exception into a practical application. If it is determined at Step 2A, Prong 2 that the claims do not integrate the judicial exception into a practical application, the analysis proceeds to determining whether the claim is a patent-eligible application of the exception (Step 2B). If an abstract idea is present in the claim, any element or combination of elements in the claim must be sufficient to ensure that the claim integrates the judicial exception into a practical application, or else amounts to significantly more than the abstract idea itself. Applicant is advised to consult the 2019 PEG for more details of the analysis.
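	The order of the inquiries described above can be summarized, in simplified form, as a short decision procedure. The sketch below is an illustration only: the function name, parameter names, and outcome strings are hypothetical, each input stands for the conclusion reached at the corresponding step, and details of the guidance beyond the steps named above are omitted.

```python
def eligibility_outcome(statutory_category, recites_exception,
                        integrates_practical_application, significantly_more):
    """Walk the Step 1 / Step 2A / Step 2B inquiries in order.

    Each argument is a bool holding the conclusion reached at that step.
    """
    if not statutory_category:             # Step 1
        return "ineligible at Step 1"
    if not recites_exception:              # Step 2A, Prong 1
        return "eligible at Step 2A, Prong 1"
    if integrates_practical_application:   # Step 2A, Prong 2
        return "eligible at Step 2A, Prong 2"
    if significantly_more:                 # Step 2B
        return "eligible at Step 2B"
    return "ineligible at Step 2B"


# A claim outside the four statutory categories fails at Step 1.
print(eligibility_outcome(False, True, False, False))  # ineligible at Step 1

# A claim reciting an abstract idea with no integration and nothing
# significantly more fails at Step 2B.
print(eligibility_outcome(True, True, False, False))   # ineligible at Step 2B
```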
	
	Step 1 Analysis (All Claims)
According to the first part of the analysis, in the instant case claims 1-2, 4-5, and 7-8 are directed to a method, and claims 9-10, 12-13, and 15-16 are directed to a system comprising at least one or more computer-readable memories. These claims are therefore directed to one of the four statutory categories of invention. However, claims 17-19 and 21-23 are not directed to one of the four statutory categories as indicated above.

	Combined Step 2A Prong 1, Step 2A Prong 2, and Step 2B Analysis	
	
	Claim 1 includes the following recitation of an abstract idea:
	A computer-implemented method for enhancing a natural language understanding (NLU) algorithm, the method comprising: (The intended use “for enhancing a natural language understanding (NLU) algorithm” is practical to perform in the human mind. For example, a computer programmer would typically develop and “enhance” NLU algorithms.) 
	generating a set of quality heuristics for the natural language understanding (NLU) algorithm, at each container level; (Generating a set of quality heuristics is practical to perform in the human mind. Note the examples of the quality heuristics starting at published [0057]. This is a recitation of a mental process.)
	grouping the quality heuristics in the set of quality heuristics at each container level into one or more clusters based on a quality score, (Grouping quality heuristics into one or more clusters based on a quality score is practical to perform in the human mind. For example, the heuristics could be grouped by simply taking examples where the quality score is less than 0.5 to be a first group and the examples with quality score greater than 0.5 to be a second group.)
	wherein the quality score for each cluster is calculated by applying the quality heuristics in a respective cluster to the data; (Calculating a quality score for each cluster is a recitation of a mathematical concept (i.e., a mathematical calculation). At the level of generality that the quality score is recited, it is also practical to implement in the human mind. For example, calculating a number of sentences which cannot be parsed would be practical to perform in the human mind.)
	identifying a pattern in the data corresponding to a particular cluster based on the quality heuristics in the particular cluster when the quality score for the particular cluster is below a threshold, wherein the pattern includes a source; (Identifying a pattern in data based on the set of quality heuristics when the quality score is below a threshold is practical to perform in the human mind. This is a recitation of a mental process.)
	determining that the source of the pattern is a client-specific text-formatting issue; and (Determining that the source of a pattern is a client-specific text-formatting issue is practical to perform in the human mind. This is a recitation of a mental process.)
	Claim 1 recites the following additional elements which, considered individually and as an ordered combination, do not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea:
	A computer-implemented method (The recitation of a computer is a high level recitation of generic computer equipment configured to perform the abstract idea. This does not integrate the abstract idea into a practical application. See MPEP 2106.05(f).)
	collecting data, wherein the data is textual and includes one or more container level, wherein the container level is selected from a group consisting of: document, section, paragraph and sentence; (Collecting or gathering data of a particular type or source (i.e., textual data which includes one or more container levels that comprise document, section, paragraph or sentence) is an attempt to limit the abstract idea to a particular field of use or technological environment. This does not integrate the abstract idea into a practical application. See MPEP 2106.05(h).)
	…training a natural language understanding (NLU) machine learning model by applying the quality heuristics in the particular cluster to the data. (The recitation of training a natural language understanding model based on a result of the abstract idea is a mere instruction to apply the judicial exception. The claim does not recite any details as to how the model is to be trained and merely uses the computer as a tool for carrying out an existing process. This does not integrate the abstract idea into a practical application. See MPEP 2106.05(f).)
Claim 1 does not reflect an improvement to computer technology or any other technology.
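The grouping example given above for claim 1 (one group for scores below 0.5, a second group for scores above 0.5) can be written out as follows. This is a sketch of that example only, not of Applicant's disclosed implementation; the function name, item names, and scores are hypothetical.

```python
def split_by_quality_score(scores, cutoff=0.5):
    """Split items into a below-cutoff group and an above-cutoff group.

    An item scoring exactly at the cutoff lands in neither group, because
    the example in the text only addresses scores below or above 0.5.
    """
    below = [name for name, score in scores.items() if score < cutoff]
    above = [name for name, score in scores.items() if score > cutoff]
    return below, above


scores = {"doc_a": 0.2, "doc_b": 0.7, "doc_c": 0.45}
below, above = split_by_quality_score(scores)
print(below)  # ['doc_a', 'doc_c']
print(above)  # ['doc_b']
```

The comparison involves nothing beyond inspecting each score against a fixed value, which is why the grouping is characterized above as practical to perform in the human mind.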

Claim 2 recites at least the abstract idea identified above in the claim upon which it depends. 
Claim 2 recites the following additional elements which, considered individually and as an ordered combination with the additional elements from the claim upon which it depends, do not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea:
wherein the data is comprised of documents or text. (Collecting or gathering data of a particular type or source (i.e., documents or text) is an attempt to limit the abstract idea to a particular field of use or technological environment. This does not integrate the abstract idea into a practical application. See MPEP 2106.05(h).)
Claim 2 does not reflect an improvement to computer technology or any other technology.

Claim 4 recites at least the abstract idea identified above in the claim upon which it depends. 
Claim 4 recites the following additional elements which, considered individually and as an ordered combination with the additional elements from the claim upon which it depends, do not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea:
wherein the grouping the set of quality heuristics into the one or more clusters comprises using unsupervised machine learning models. (The recitation of using unsupervised machine learning models does not provide any details as to how the grouping is achieved and uses the computer as a tool to perform an existing process. This is a mere instruction to apply the judicial exception, which does not integrate the abstract idea into a practical application. See MPEP 2106.05(f).)
Claim 4 does not reflect an improvement to computer technology or any other technology.

Claim 5 recites at least the abstract idea identified above in the claim upon which it depends. Claim 5 further recites
	determining a second quality score for the data; (As discussed above regarding claim 1, determining the quality score is both a recitation of a mental process and a mathematical concept.)
confirming the quality score for the particular cluster when the second quality score is below the threshold. (A person could practically confirm a quality score when a score is below a threshold. This is a recitation of a mental process.)	
Claim 5 recites the following additional elements which, considered individually and as an ordered combination with the additional elements from the claim upon which it depends, do not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea:
retrieving the data corresponding to each cluster; (Collecting data of a particular type or source (i.e., collecting data corresponding to the clusters) is an attempt to limit the abstract idea to a particular field of use or technological environment. This does not integrate the abstract idea into a practical application. See MPEP 2106.05(h).)
Claim 5 does not reflect an improvement to computer technology or any other technology.

Claim 7 recites at least the abstract idea identified above in the claim upon which it depends. Claim 7 further recites:
wherein quality heuristics in the set are used to analyze additional data. (A person could practically analyze data using the quality heuristics. For example, the number of sentences which cannot be parsed in a new document could be compared to the numbers in the set.)
Claim 7 does not recite further additional elements which might integrate the abstract idea into a practical application or amount to significantly more than the abstract idea.
Claim 7 does not reflect an improvement to computer technology or any other technology.

Claim 8 recites at least the abstract idea identified above in the claim upon which it depends. Claim 8 further recites:
further comprising generating a report describing: the quality heuristics in the set of quality heuristics, the patterns in the data, wherein the patterns are determined to be new, unexpected or a problem; the one or more clusters and the quality heuristics in the particular cluster; the source of the pattern; and a trained natural language understanding (NLU) machine learning model.  (A person could practically generate a report including the claimed elements in the human mind, perhaps assisted by pen and paper. In particular, the data need only be described. This could be accomplished for the machine learning model, for example, by saying “a decision tree” or “a neural network with three hidden layers”, etc. Furthermore, a person could practically determine whether a pattern is new, unexpected or a problem in the human mind.)
Claim 8 does not recite further additional elements which might integrate the abstract idea into a practical application or amount to significantly more than the abstract idea.
Claim 8 does not reflect an improvement to computer technology or any other technology.

Claim 9 recites substantially similar subject matter to claim 1 including substantially the same abstract idea.  
enhancing a natural language understanding (NLU) algorithm (The intended use “for enhancing a natural language understanding (NLU) algorithm” is practical to perform in the human mind. For example, a computer programmer would typically develop and “enhance” NLU algorithms.)
Claim 9 recites the following additional elements which, considered individually and as an ordered combination with the additional elements addressed above with respect to claim 1, do not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea:
A computer system…, comprising: one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage media, and program instructions stored on at least one of the one or more tangible storage media for execution by at least one of the one or more processors via at least one of the one or more memories, wherein the computer system is capable of performing a method comprising: (This is a high level recitation of generic computer equipment programmed to perform the abstract idea. This does not integrate the abstract idea into a practical application. See MPEP 2106.05(f).)
Claim 9 does not reflect an improvement to computer technology or any other technology.

Claims 10, 12-13, and 15-16 recite substantially similar subject matter to claims 2, 4-5 and 7-8, respectively, and are rejected with the same rationale in view of the rejection of independent claim 9.

Claim 17 recites substantially similar subject matter to claim 1 including substantially the same abstract idea.  
for enhancing a natural language (NLU) algorithm (The intended use “for enhancing a natural language understanding (NLU) algorithm” is practical to perform in the human mind. For example, a computer programmer would typically develop and “enhance” NLU algorithms.)
Claim 17 recites the following additional elements which, considered individually and as an ordered combination with the additional elements addressed above with respect to claim 1, do not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea:
A computer program product…, the computer program product comprising a computer-readable storage device having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform a method comprising: (This is a high level recitation of generic computer equipment programmed to perform the abstract idea. This does not integrate the abstract idea into a practical application. See MPEP 2106.05(f).)
Claim 17 does not reflect an improvement to computer technology or any other technology.

Claims 18-19 and 21-23 recite substantially similar subject matter to claims 4-5, 2, and 7-8, respectively, and are rejected with the same rationale in view of the rejection of claim 17.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

	Claims 1-2, 4, 7-10, 12, 15-18 and 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over “Ramachandran” (US 10,884,842 B1) in view of “Hsu” (US 2014/0207716 A1) and further in view of “Ittycheriah” (US 2021/0342517 A1).

	Regarding claim 1, Ramachandran teaches
	A computer-implemented method for enhancing a natural language understanding (NLU) algorithm, the method comprising: (Abstract describes classifying problems. Column 9, lines 1-20 describe producing a model for each specific problem. Column 7, lines 24-46 and column 8, lines 45-62 indicate that the algorithm performs natural language understanding. Training the models used in this algorithm (see below) is consequently an enhancement of the natural language understanding algorithm. Figure 1 and column 1, line 60 through column 2, line 9 indicate that this may be computer implemented.)
	collecting data, wherein the data is textual and includes one or more container level, wherein the container level is selected from a group consisting of: document, section, paragraph and sentence; (Column 7, lines 58-64 describe receiving logs. A log is being interpreted as a document. Note that the claim requires only a single container level (e.g., document). Column 6, lines 4-27 indicate that the logs may comprise text.)
	generating a set of quality heuristics for the natural language understanding (NLU) algorithm, at each container level; (Column 7, line 65 through column 8 line 26 describes vectorizing the logs/documents using term frequency inverse document frequency. The TFIDF values are being interpreted as quality heuristics since the terms used by a document/log may be indicative of quality. Note that the specification does not define the term “quality heuristics”. Note also that the claim does not require a plurality of container levels.)
	grouping the quality heuristics in the set of quality heuristics at each container level into one or more clusters (Column 8, lines 27-44 describes clustering the vectors.)
	…identifying a pattern in the data corresponding to a particular cluster based on the quality heuristics in the particular cluster (Column 8, lines 27-44 describe generating classified buckets (i.e., determining patterns in the data corresponding to the clusters). This is based on the vector representation (i.e., the quality heuristics as described above).)
	…determining that the source of the pattern…(Column 8, lines 45-62 describes labeling/classifying the buckets. Column 9, lines 1-19 clarify that the label may be a label indicating a specific problem. Column 5, lines 9-24 indicate that the problem may be an issue arising from a client operating an application. That is, the problem with which a cluster is labeled may be a client-specific problem/issue. See also column 9, lines 21-32.)
	training a natural language understanding (NLU) machine learning model by applying the quality heuristics in the particular cluster to the data. (Column 9, lines 1-20 describe producing a model for each specific problem. As described above, the clusters correspond to problems. Since this may occur for each problem, it also occurs in response to the client-specific issues/problems. Figure 3B, steps 312-316 indicate that the vectorized data (i.e., the quality heuristics) are used for the model training steps.)
	Ramachandran does not appear to explicitly teach 
	grouping the quality heuristics in the set of quality heuristics at each container level into one or more clusters based on a quality score, wherein the quality score for each cluster is calculated by applying the quality heuristics in a respective cluster to the data;
	identifying a pattern in the data corresponding to a particular cluster based on the quality heuristics in the particular cluster when the quality score for the particular cluster is below a threshold
	determining that the source of the pattern is a client-specific text formatting issue
	However, Hsu—directed to analogous art—teaches
	grouping the quality heuristics in the set of quality heuristics at each container level into one or more clusters based on a quality score, wherein the quality score for each cluster is calculated by applying the quality heuristics in a respective cluster to the data; (Abstract describes a statistical classification system which clusters queries. Figure 5 provides an overview of the clustering. In particular, [0055-0056] indicates that clusters are evaluated based on a confidence measure (i.e., a “quality score”) for each of the clusters.)
	identifying a pattern in the data corresponding to a particular cluster based on the quality heuristics in the particular cluster when the quality score for the particular cluster is below a threshold (Note claim interpretation. In particular, the precedent condition does not appear to actually be required by the claim. It is indicated how the prior art would teach this limitation for the purposes of compact prosecution. [0055-0056] indicates that when a cluster confidence (i.e., “quality score”) falls below a threshold, it may be sent to a specialist to determine the pattern. See also [0060-0061]. While Hsu indicates that it may be sent for manual review, in the combination with Ramachandran, the models taught by Ramachandran may be used in place of manual review.)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify Ramachandran to use a quality score for the clustering as taught by Hsu because this allows for a determination as to when clusters are ambiguous, allowing further analysis as described by Hsu at [0056].
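	For illustration only, thresholding a per-cluster confidence value of the kind discussed above can be sketched as follows. This is a minimal sketch and does not reproduce Hsu's method: the function name `flag_low_confidence`, the choice to score a cluster by the mean of its members' confidence values, the threshold of 0.6, and the sample values are all assumptions.

```python
from statistics import mean


def flag_low_confidence(cluster_confidences, threshold):
    """Score each cluster and list the clusters that fall below the threshold.

    Hypothetical helper: a cluster's score is assumed to be the mean of
    the confidence values of its members.
    """
    scores = {cid: mean(vals) for cid, vals in cluster_confidences.items()}
    flagged = sorted(cid for cid, score in scores.items() if score < threshold)
    return scores, flagged


# Made-up per-member confidence values for three clusters.
clusters = {
    "c0": [0.9, 0.8, 1.0],
    "c1": [0.25, 0.5, 0.75],
    "c2": [0.5, 0.25],
}
scores, flagged = flag_low_confidence(clusters, threshold=0.6)
```

	In this sketch, the flagged clusters are the ones that would be passed on for further analysis, whether by a specialist or by a trained model.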
	The combination of Ramachandran and Hsu does not appear to explicitly teach 
	determining that the source of the pattern is a client-specific text formatting issue
	However, Ittycheriah—directed to analogous art—teaches
	determining that the source of the pattern is a client-specific text formatting issue (Ittycheriah, Title and Abstract describe a user-specific format suggestion. This is described in more detail at [0090]: “In some implementation, stored text records 121 are user-specific and associated only with a particular user. For example, the stored text records 121 may be populated with only text that the particular user contributed to the one or more electronic documents.” [0090-0093] describe providing user-specific text-formatting suggestions. A determination that a suggestion is to be prevented is being interpreted as an indication that there is a text formatting issue.)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the combination of Ramachandran and Hsu to determine that the source of the pattern is a client-specific text formatting issue as taught by Ittycheriah because this allows for a reduction in the time it takes for users to format documents as described by Ittycheriah at [0028].

	Regarding claim 2, the rejection of claim 1 is incorporated herein. Furthermore, Ramachandran teaches
	wherein the data is comprised of documents or text. (Column 7, lines 58-64 describe receiving logs (i.e., documents). A log is being interpreted as a document. Note that the claim requires only a single container level (e.g., documents). Column 6, lines 4-27 indicate that the logs may comprise text.)

	Regarding claim 4, the rejection of claim 1 is incorporated herein. Furthermore, Ramachandran teaches
	wherein grouping the set of quality heuristics into the one or more clusters comprises using unsupervised machine learning models. (Column 6, lines 44-56 describe the algorithms which may be used including K-means clustering, mean-shift clustering, density-based clustering, EM clustering, and/or agglomerative hierarchical clustering. These are all unsupervised models.)
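	For illustration only, the first of the listed algorithms, K-means clustering, can be sketched as follows to show why it is unsupervised: the grouping is computed from the vectors alone, with no labels supplied. This is a minimal sketch of the standard Lloyd iteration and is not taken from Ramachandran; the function name `kmeans`, the choice of the first k points as initial centroids, the fixed iteration count, and the sample points are all assumptions.

```python
def kmeans(points, k, iterations=10):
    """Cluster points with plain Lloyd iterations; no labels are used.

    Hypothetical helper: the first k points seed the centroids so that
    the result is deterministic for the example.
    """
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


# Made-up two-dimensional vectors forming two obvious groups.
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
assignments, centroids = kmeans(points, k=2)
```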

	Regarding claim 7, the rejection of claim 1 is incorporated herein. Furthermore, Ramachandran teaches	
	wherein quality heuristics in the set are used to analyze additional data. (Figure 4B shows an external service requesting analysis of a new failure data log. This is described at column 9, lines 1-20. Column 9, lines 33-55 indicate that the same vectorization process (i.e., the same set of quality heuristics) may be used for new data.)

	Regarding claim 8, the rejection of claim 1 is incorporated herein. Furthermore, Ramachandran teaches
	further comprising generating a report describing: the quality heuristics in the set of quality heuristics, including the patterns in the data, wherein the patterns are determined to be new, unexpected or a problem; the one or more clusters and the quality heuristics in the particular cluster; the source of the pattern; and (Column 9, lines 33-55 describes presenting to a user an output label for the problem represented by the failure log data along with the failure log data. The label “describes” the quality heuristics in the set (since it is a label for these). Moreover, it “describes” the patterns since it outputs both the log itself and the label. The label “describes” the cluster since it applies to the full cluster. It also describes the source (i.e., the particular problem) associated with the patterns. Moreover, Column 10, lines 63-67 describe outputting the cluster data to the client.)
	a trained natural language understanding (NLU) machine learning model. (Column 11, lines 37-56 further describe delivering the trained model data to the client. All of the data delivered to the client, considered together, is taken to correspond to the “report”. The claim does not recite any particular form that the report needs to take or any way in which the report is generated.) 

Regarding claim 9, Ramachandran teaches
A computer system enhancing a natural language understanding (NLU) algorithm, comprising: one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage media, and program instructions stored on at least one of the one or more tangible storage media for execution by at least one of the one or more processors via at least one of the one or more memories, wherein the computer system is capable of performing a method comprising: (Abstract describes classifying problems. Column 9, lines 1-20 describe producing a model for each specific problem. Column 7, lines 24-46 and column 8, lines 45-62 indicate that the algorithm performs natural language understanding. Training the models used in this algorithm (see rejection of claim 1) is consequently an enhancement of the natural language understanding algorithm. Figure 1 and column 1, line 60 through column 2, line 9 indicate that this may be computer implemented. Column 3, line 65 through column 4, line 11 indicates that the system may include one or more processors and one or more memories/media storing instructions for implementing the methods of Ramachandran.)
The remainder of claim 9 is substantially similar to claim 1; claim 9 is rejected with the same rationale, mutatis mutandis.

Claims 10, 12, and 15-16 are substantially similar to claims 2, 4, and 7-8, respectively, and are rejected with the same rationale as claims 2, 4, 7 and 8, respectively, in view of the rejection of claim 9, mutatis mutandis.

Regarding claim 17, Ramachandran teaches
	A computer program product for enhancing a natural language (NLU) algorithm, the computer program product comprising a computer-readable storage device having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform a method comprising: (Column 3, line 65 through column 4, line 34 describes an embodiment as a computer program stored on a computer readable storage medium which may cause a processor to perform operations.)
	The remainder of claim 17 is substantially similar to claim 1; claim 17 is rejected with the same rationale, mutatis mutandis.

	Claims 18 and 21-23 are substantially similar to claims 4, 2, 7, and 8, respectively, and are rejected with the same rationale as claims 4, 2, 7, and 8, respectively, in view of the rejection of claim 17, mutatis mutandis.

	Claims 5, 13, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over “Ramachandran” (US 10,884,842 B1) in view of “Hsu” (US 2014/0207716 A1), further in view of “Ittycheriah” (US 2021/0342517 A1), and further in view of “He” (US 2017/0082555 A1).
	
	Regarding claim 5, the rejection of claim 1 is incorporated herein. Furthermore, Ramachandran teaches
	retrieving the data corresponding to each cluster; (Column 9, lines 1-20 describe producing a model for each specific problem. As described above, the clusters correspond to problems. In particular, the cluster-specific model is trained on the cluster data, so this data is retrieved.)
	The combination of Ramachandran, Hsu, and Ittycheriah does not appear to explicitly teach 
	determining a second quality score for the data;
	confirming the quality score for the particular cluster when the second quality score is below the threshold.
	However, He—directed to analogous art—teaches
	determining a second quality score for the data; ([0092] describes detecting clusters of novel defects/issues. [0100] indicates that a first confidence is computed using a first model. [0101-0102] indicates that this data is then used to train an additional model.  [0103] describes computing a second confidence value based on the second classifier.)
	confirming the quality score for the particular cluster when the second quality score is below the threshold. (Note claim interpretation. In particular, the antecedent condition does not appear to actually be required by the claim. It is indicated how the prior art would teach this limitation for the purposes of compact prosecution. [0103] describes placing the objects in either a novel or non-novel bin based on whether or not the second confidence exceeds a threshold. The decision to place the data in a bin based on the subsequent failures to exceed a confidence threshold is being interpreted as a confirmation of the confidence/quality score of the cluster since it reflects a decision made when the models are unanimous in their confidence assessment.)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the combination of Ramachandran, Hsu, and Ittycheriah to determine a second quality score and confirm the quality score of the cluster based on the second quality score as taught by He because this allows for the identification of new issues and subtypes as described by He at [0100].
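	For illustration only, the two-score confirmation interpreted above can be sketched as follows. This is a minimal sketch and does not reproduce He's classifiers: the function names `confirm_low_score` and `bin_items`, the bin labels, the threshold of 0.5, and the sample scores are all assumptions, as is the simplification that an item is placed in the novel bin only when both confidence values fall below the same threshold.

```python
def confirm_low_score(first_score, second_score, threshold):
    """Return True only when both scores fall below the threshold.

    Hypothetical helper: models the reading that the second score
    confirms the first when the two models agree.
    """
    return first_score < threshold and second_score < threshold


def bin_items(items, threshold):
    """Place each (name, first_score, second_score) item into a bin."""
    bins = {"novel": [], "non_novel": []}
    for name, first, second in items:
        confirmed = confirm_low_score(first, second, threshold)
        bins["novel" if confirmed else "non_novel"].append(name)
    return bins


# Made-up confidence pairs from a first and a second model.
items = [("a", 0.2, 0.3), ("b", 0.2, 0.9), ("c", 0.8, 0.7)]
bins = bin_items(items, threshold=0.5)
```

	In this sketch, only item "a" is binned as novel, because it is the only item for which the second score agrees with the first in falling below the threshold.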

	Claims 13 and 19 recite substantially similar subject matter to claim 5 and are rejected with the same rationale as claim 5 in view of the rejections of claims 9 and 17, mutatis mutandis.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Markus A Vasquez whose telephone number is (303)297-4432. The examiner can normally be reached Monday to Friday 9AM to 4PM PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li Zhen, can be reached at (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MARKUS A. VASQUEZ/Examiner, Art Unit 2121