DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 05/04/2021 has been entered.

Status of Claims
The present application is being examined under the claims filed 05/04/2021.
Claims 1-31 are amended.
Claims 1-31 are rejected.
Claims 1-31 are pending.

Drawings
The Drawings filed on 02/05/2016 are acceptable for examination purposes.

Specification
The Specification filed on 10/29/2019 is acceptable for examination purposes.

Response to Arguments
In reference to Rejections under 35 USC § 112(a)
Applicant’s arguments, filed 05/04/2021, with respect to Rejections under 35 USC § 112(a) have been fully considered and are persuasive. The Rejections under 35 USC § 112(a) have been withdrawn in view of amendments.

In reference to Rejections under 35 USC § 112(b)
Applicant’s arguments, filed 05/04/2021, with respect to Rejections under 35 USC § 112(b) have been fully considered and are persuasive. The Rejections under 35 USC § 112(b) have been withdrawn in view of amendments.

In reference to Claim Interpretation under 35 USC § 112(f)
Applicant’s arguments, filed 05/04/2021, with respect to Claim Interpretation under 35 USC § 112(f) have been fully considered and are persuasive. The Claim Interpretation under 35 USC § 112(f) has been withdrawn in view of amendments.

In reference to Rejections under 35 USC § 101
Applicant's arguments have been fully considered but they are not persuasive.
Applicant asserts (see pg. 12 to pg. 13) that the Revised Claim Overcomes the Basis of Rejection.
Examiner respectfully disagrees. The claims are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite steps which describe a mental process; see below for full analysis. This judicial exception is not integrated into a practical application because the claims are directed to an abstract idea with additional generic computer elements. The generically recited computer elements do not add a meaningful limitation to the abstract idea because they amount to simply implementing the abstract idea on a computer.

Applicant asserts (see pgs. 13-16) that the claims are not abstract per the January 2019 USPTO Guidelines.
Examiner respectfully disagrees. The claims are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite steps which describe a mental process; see below for full analysis. This judicial exception is not integrated into a practical application because the claims are directed to an abstract idea with additional generic computer elements. The generically recited computer elements do not add a meaningful limitation to the abstract idea because they amount to simply implementing the abstract idea on a computer. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements, when considered separately and in combination, do not add significantly more (also known as an “inventive concept”) to the exception.

Applicant asserts (see pgs. 16-18) that the claims are not abstract per DDR Holdings.
Examiner respectfully disagrees. In DDR, the claims address a business challenge particular to the internet. In DDR, the claimed solution is necessarily rooted in computer technology in order to overcome a problem specifically arising in the realm of computer networks. Examiner notes that the present claims are not necessarily rooted in computer technology to solve a problem specifically arising in the realm of computer networks.

Applicant asserts (see pg. 18 to pg. 19) that the claims contain an "inventive concept" and thus are not rendered ineligible.
Examiner respectfully disagrees. Although the claims do contain an inventive concept, that inventive concept is directed to an abstract idea. The claims recite steps which describe a mental process; see below for full analysis. This judicial exception is not integrated into a practical application because the claims are directed to an abstract idea with additional generic computer elements. The generically recited computer elements do not add a meaningful limitation to the abstract idea because they amount to simply implementing the abstract idea on a computer. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements, when considered separately and in combination, do not add significantly more (also known as an “inventive concept”) to the exception.

In reference to Rejections under 35 USC § 103
Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references.
Applicant's arguments filed 05/04/2021 have been fully considered but they are not persuasive.

Claim Rejections - 35 USC § 112(a)
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 7, 9, 10, 17, 19, 20, 27, 29, and 30 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
In reference to claims 7, 17, and 27. The claim recites “tokenization of corresponding documents using a token to ID map”. Examiner notes that the Instant Specification does not disclose “using a token to ID map”.

In reference to claims 9, 19, and 29. The claim recites “a recommendation is not to label when there is no majority of the one or more neighbor documents having a certain label value”. Examiner notes that the Instant Specification does not disclose “not to label”.

In reference to claims 10, 20, and 30. The claim recites “the recommended label is a label that was previously accepted by a user”. Examiner notes that the Instant Specification does not disclose “the recommended label is a label that was previously accepted by a user”.

Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-31 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
In reference to claims 1, 12, and 22. The term “periodically” is a relative term which renders the claim indefinite. The term “periodically” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. For examination purposes the limitation of “periodically triggering” will be interpreted as “triggering”.
In reference to dependent claims 2-11, 13-21, and 23-31. Claims 2-11, 13-21, and 23-31 do not cure the deficiencies noted in the rejection of independent claims 1, 12, and 22. Therefore, these claims are rejected under the same rationale as claims 1, 12, and 22.

In reference to claims 7, 17, and 27. The claim recites “tokenization of corresponding documents using a token to ID map”. Examiner notes that the Instant Specification does not disclose “using a token to ID map”. Because there is no disclosure, the limitation is confusing and the examiner will interpret the claim under the broadest reasonable interpretation. The broadest reasonable interpretation for the limitation is that the “feature vectors are generated based on a tokenization of corresponding documents”.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-31 are rejected under 35 U.S.C. 101.
In reference to Claim 1
Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process.

Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
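Limitation “generating an event for each document of the plurality of documents, the event for each document of the plurality of documents being stored in a queue” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).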
Limitation “processing a corresponding document for the respective event to generate a respective feature vector” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “periodically triggering the processing of the log file having a plurality of feature vectors corresponding to each document of the plurality of documents by performing at least one of (a) grouping data in the log file; (b) separating metadata and vector data; (c) appending periodic data to a dataset; or (d) maintaining aggregated data representations in a datastore directory” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “classifying respective documents of the plurality of documents by at least analyzing similarities between documents to perform label propagation” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “identifying one or more neighbor documents based at least in part on previously generated feature vector for the respective document and feature vectors for previously classified documents” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “identifying one or more label values for the one or more neighbor documents” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “providing a recommendation of a label value for the respective document” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “assigning the label value to the respective document” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
The claim recites the additional limitation of “cloud-based storage system”, which is recited at a high level of generality and merely uses a computer as a tool to perform the processes.

Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No, the claim does not recite additional elements that integrate the judicial exception into a practical application. MPEP 2106.04(d).

Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
Limitation “receiving a plurality of documents for storage in a cloud-based storage system” is directed to receiving or transmitting data over a network, or storing and retrieving information in memory. Receiving or transmitting data over a network, or storing and retrieving information in memory is well-understood, routine, conventional activity as per MPEP 2106.05(d).
Limitation “dispatching respective events from the queue for processing at a feature extraction engine” is directed to transmitting data over a network. Transmitting data over a network is well-understood, routine, conventional activity as per MPEP 2106.05(d).
Limitation “storing the respective feature vector in a log file” is directed to storing and retrieving information in memory. Storing and retrieving information in memory is well-understood, routine, conventional activity as per MPEP 2106.05(d).

In reference to Claim 2. The claim recites the additional limitation of “wherein identifying the one or more neighbor documents is performed by at least determining a distance between the previously generated feature vector for the respective document and feature vectors for previously classified documents”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 3. The claim recites the additional limitation of “wherein feature vectors comprise term frequency vectors”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
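
For illustration only, the computation recited in claims 2 and 3 (term frequency vectors and a distance between feature vectors) may be sketched as follows. This sketch is not taken from the claims or the Instant Specification; the whitespace tokenizer, the Euclidean distance metric, and all function names are assumptions chosen for the example.

```python
import math
from collections import Counter


def term_frequency_vector(text):
    """Count occurrences of each lowercase, whitespace-delimited token (assumed tokenizer)."""
    return Counter(text.lower().split())


def euclidean_distance(vec_a, vec_b):
    """Distance between two sparse term frequency vectors (assumed metric)."""
    terms = set(vec_a) | set(vec_b)
    # Counter returns 0 for a term that is absent from a vector.
    return math.sqrt(sum((vec_a[t] - vec_b[t]) ** 2 for t in terms))


def nearest_neighbors(target, classified, k):
    """Return the k (distance, doc_id) pairs closest to the target vector.

    `classified` is a hypothetical mapping of document IDs to the feature
    vectors of previously classified documents.
    """
    scored = [(euclidean_distance(target, vec), doc_id)
              for doc_id, vec in classified.items()]
    return sorted(scored)[:k]
```

With small document sets, each step reduces to counting words and comparing the counts.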

In reference to Claim 4. The claim recites the additional limitation of “the log preprocessing aggregates log entries of the metadata and vector data that were written into log files by one or more feature extraction worker processes […]”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a). The claim also recites the additional limitation of “cloud-based storage system”, which is recited at a high level of generality and merely uses a computer as a tool to perform the processes.

In reference to Claim 5. The claim recites the additional limitation of “clustering is employed to perform the classification of the documents”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 6. The claim recites the additional limitation of “a similarity matrix is used to classify the documents”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 7. The claim recites the additional limitation of “feature vectors are generated based on a tokenization of corresponding documents using a token to ID map”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 8. The claim recites the additional limitation of “the label value recommended is determined based at least in part on a majority of the one or more neighbor documents having the label value”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 9. The claim recites the additional limitation of “a recommendation is not to label when there is no majority of the one or more neighbor documents having a certain label value”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 10. The claim recites the additional limitation of “the recommended label is a label that was previously accepted by a user”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 11. The claim recites the additional limitation of “a conflict resolution process selects a label value associated with a neighbor document that has a closest distance to the document”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
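
For illustration only, the label recommendation recited in claims 8, 9, and 11 (a majority of neighbor labels, a recommendation not to label when there is no majority, and selection of the label of the closest neighbor) may be sketched as follows. This sketch is not taken from the claims or the Instant Specification. Reading “majority” as strictly more than half of the neighbors, the `on_no_majority` switch between the behavior of claim 9 and that of claim 11, and all names are assumptions chosen for the example.

```python
from collections import Counter


def recommend_label(neighbors, on_no_majority="abstain"):
    """Recommend a label from (distance, label) pairs of neighbor documents.

    Returns the majority label if one exists. Otherwise returns None
    (a recommendation not to label) or, when on_no_majority == "closest",
    the label of the neighbor with the smallest distance.
    """
    if not neighbors:
        return None
    counts = Counter(label for _, label in neighbors)
    label, votes = counts.most_common(1)[0]
    # Assumed reading: a majority means strictly more than half of the neighbors.
    if votes > len(neighbors) / 2:
        return label
    if on_no_majority == "closest":
        # Hypothetical conflict resolution: take the closest neighbor's label.
        return min(neighbors, key=lambda pair: pair[0])[1]
    return None
```

For a handful of neighbor documents, the same tally can be made by hand: count the labels, pick the one held by more than half, and otherwise either decline to label or defer to the nearest document.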

In reference to Claim 12
Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a machine.

Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Limitation “generating an event for each document of the plurality of documents, the event for each document of the plurality of documents being stored in a queue” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “processing a corresponding document for the respective event to generate a respective feature vector” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “periodically triggering the processing of the log file having a plurality of feature vectors corresponding to each document of the plurality of documents by performing at least one of (a) grouping data in the log file; (b) separating metadata and vector data; (c) appending periodic data to a dataset; or (d) maintaining aggregated data representations in a datastore directory” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “classifying respective documents of the plurality of documents by at least analyzing similarities between documents to perform label propagation” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “identifying one or more neighbor documents based at least in part on previously generated feature vector for the respective document and feature vectors for previously classified documents” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “identifying one or more label values for the one or more neighbor documents” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “providing a recommendation of a label value for the respective document” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “assigning the label value to the respective document” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
The claim recites the additional limitation of “a memory storing a set of instructions”, which is recited at a high level of generality and merely uses a computer as a tool to perform the processes.
The claim recites the additional limitation of “a processor that executes the set of instructions to cause a set of acts”, which is recited at a high level of generality and merely uses a computer as a tool to perform the processes.
The claim recites the additional limitation of “cloud-based storage system”, which is recited at a high level of generality and merely uses a computer as a tool to perform the processes.

Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No, the claim does not recite additional elements that integrate the judicial exception into a practical application. MPEP 2106.04(d).

Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
Limitation “receiving a plurality of documents for storage in a cloud-based storage system” is directed to receiving or transmitting data over a network, or storing and retrieving information in memory. Receiving or transmitting data over a network, or storing and retrieving information in memory is well-understood, routine, conventional activity as per MPEP 2106.05(d).
Limitation “dispatching respective events from the queue for processing at a feature extraction engine” is directed to transmitting data over a network. Transmitting data over a network is well-understood, routine, conventional activity as per MPEP 2106.05(d).
Limitation “storing the respective feature vector in a log file” is directed to storing and retrieving information in memory. Storing and retrieving information in memory is well-understood, routine, conventional activity as per MPEP 2106.05(d).

In reference to Claim 13. The claim recites the additional limitation of “wherein feature vectors comprise term frequency vectors”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 14. The claim recites the additional limitation of “the log preprocessing aggregates log entries of the metadata and vector data that were written into log files by one or more feature extraction worker processes […]”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a). The claim also recites the additional limitation of “cloud-based storage system”, which is recited at a high level of generality and merely uses a computer as a tool to perform the processes.

In reference to Claim 15. The claim recites the additional limitation of “clustering is employed to perform the classification of the documents”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 16. The claim recites the additional limitation of “a similarity matrix is used to classify the documents”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 17. The claim recites the additional limitation of “feature vectors are generated based on a tokenization of corresponding documents using a token to ID map”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 18. The claim recites the additional limitation of “the label value recommended is determined based at least in part on a majority of the one or more neighbor documents having the label value”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 19. The claim recites the additional limitation of “a recommendation is not to label when there is no majority of the one or more neighbor documents having a certain label value”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 20. The claim recites the additional limitation of “the recommended label is a label that was previously accepted by a user”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 21. The claim recites the additional limitation of “a conflict resolution process selects a label value associated with a neighbor document that has a closest distance to the document”. The limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).

In reference to Claim 22
Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a machine.

Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Limitation “generating an event for each document of the plurality of documents, the event for each document of the plurality of documents being stored in a queue” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “processing a corresponding document for the respective event to generate a respective feature vector” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “periodically triggering the processing of the log file having a plurality of feature vectors corresponding to each document of the plurality of documents by performing at least one of (a) grouping data in the log file; (b) separating metadata and vector data; (c) appending periodic data to a dataset; or (d) maintaining aggregated data representations in a datastore directory” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “classifying respective documents of the plurality of documents by at least analyzing similarities between documents to perform label propagation” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “identifying one or more neighbor documents based at least in part on previously generated feature vector for the respective document and feature vectors for previously classified documents” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “identifying one or more label values for the one or more neighbor documents” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “providing a recommendation of a label value for the respective document” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
Limitation “assigning the label value to the respective document” is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using a pen and paper. MPEP 2106.04(a).
The claim recites the additional limitation of “a computer program product embodied on a non-transitory computer usable medium, having stored thereon a sequence of instructions which, when executed by a processor as set of acts”, which is recited at a high level of generality and merely uses a computer as a tool to perform the processes.
The claim recites the additional limitation of “cloud-based storage system”, which is recited at a high level of generality and merely uses a computer as a tool to perform the processes.

Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No, the claim does not recite additional elements that integrate the judicial exception into a practical application. MPEP 2106.04(d).

Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
Limitation “receiving a plurality of documents for storage in a cloud-based storage system” is directed to receiving or transmitting data over a network, or storing and retrieving information in memory. Receiving or transmitting data over a network, or storing and retrieving information in memory is well-understood, routine, conventional activity as per MPEP 2106.05(d).
Limitation “dispatching respective events from the queue for processing at a feature extraction engine” is directed to transmitting data over a network. Transmitting data over a network is well-understood, routine, conventional activity as per MPEP 2106.05(d).
Limitation “storing the respective feature vector in a log file” is directed to storing and retrieving information in memory. Storing and retrieving information in memory is well-understood, routine, conventional activity as per MPEP 2106.05(d).

In reference to Claim 23. The claim recites the additional limitation “wherein feature vectors comprise term frequency vectors”. This limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using pen and paper (MPEP 2106.04(a)).

In reference to Claim 24. The claim recites the additional limitation “the log preprocessing aggregates log entries of the metadata and vector data that were written into log files by one or more feature extraction worker processes […]”. This limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using pen and paper (MPEP 2106.04(a)). The additional limitation “cloud-based storage system” is recited at a high level of generality and merely uses computers as a tool to perform the processes.

In reference to Claim 25. The claim recites the additional limitation “clustering is employed to perform the classification of the documents”. This limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using pen and paper (MPEP 2106.04(a)).

In reference to Claim 26. The claim recites the additional limitation “a similarity matrix is used to classify the documents”. This limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using pen and paper (MPEP 2106.04(a)).

In reference to Claim 27. The claim recites the additional limitation “feature vectors are generated based on a tokenization of corresponding documents using a token to ID map”. This limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using pen and paper (MPEP 2106.04(a)).

In reference to Claim 28. The claim recites the additional limitation of “the label value recommended is determined based at least in part on a majority of the one or more neighbor 

In reference to Claim 29. The claim recites the additional limitation “a recommendation is not to label when there is no majority of the one or more neighbor documents having a certain label value”. This limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using pen and paper (MPEP 2106.04(a)).

In reference to Claim 30. The claim recites the additional limitation “the recommended label is a label that was previously accepted by a user”. This limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using pen and paper (MPEP 2106.04(a)).

In reference to Claim 31. The claim recites the additional limitation “a conflict resolution process selects a label value associated with a neighbor document that has a closest distance to the document”. This limitation is directed to the abstract idea of a mental process (thinking) that can be performed in the human mind, or by a human using pen and paper (MPEP 2106.04(a)).
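
For illustration only, the recommendation rules recited in Claims 29 and 31 (no recommendation when no label has a majority among the neighbor documents, and conflict resolution by the closest neighbor) might be sketched as follows. The function name, the (distance, label) pair representation, and the reading of "majority" as a strict majority are assumptions made for this sketch; they are not taken from the claims, the specification, or the cited references.

```python
from collections import Counter
from typing import List, Optional, Tuple


def recommend_label(
    neighbors: List[Tuple[float, str]], resolve_conflict: bool = False
) -> Optional[str]:
    """Recommend a label from (distance, label) pairs of neighbor documents.

    Hypothetical helper: returns None when the recommendation is "do not label".
    """
    if not neighbors:
        return None
    counts = Counter(label for _, label in neighbors)
    top_label, top_count = counts.most_common(1)[0]
    # Assumption: "majority" means more than half of the neighbors.
    if top_count * 2 > len(neighbors):
        return top_label
    if resolve_conflict:
        # Conflict resolution: take the label of the closest neighbor.
        return min(neighbors, key=lambda pair: pair[0])[1]
    return None
```

With two neighbors carrying different labels, the sketch returns no recommendation unless conflict resolution is enabled, in which case the label of the nearer neighbor is returned.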

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7, 10, 12-17, 20, 22-27, and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Kasturi et al. (hereinafter Kasturi) US 10157347 B1 in view of Chhichhia et al. (hereinafter Chhichhia) US 20160147891 A1.
In reference to claim 1. Kasturi teaches a computer implemented method for analyzing data, comprising:
“receiving a plurality of documents for storage in a cloud-based storage system” (Kasturi in at least Fig. 3, Fig. 4, Col. 6 lines 8-23, Col. 10 lines 32-57, and Col. 18 lines 51-58 “Data access services 166-2 provides an API or gateway to a from the System 100 to client web services 161, e.g., for transmitting queries from an enterprise connected device (e.g., 294-1, 294-2 of FIG. 2) and returning metadata or visualization data to the connected device(s). Support may include: Text file such as csv exports [and] Databases (MySQL, Microsoft Sql Server, Mango, etc.)”, “Data extraction/consumption module (DEC) 148 may receive data from a variety of data sources […] Each of these data sources 302, 304, 306, and 308 may communicate with DEC 148 via a corresponding input module 310,312,314, and 316, respectively. Domain databases 304 may provide enterprise-specific information, records and other data regarding a particular enterprise (also referred to herein as experienced-based information). Such enterprise-specific information may relate to, for example, customers, transactions, support requests, products, parts, trouble codes, diagnostic codes, repair or service information, and financial data. In an embodiment for processing 
“generating an event for each document of the plurality of documents, the event for each document of the plurality of documents being stored in a queue” (Kasturi in at least Fig. 1, Fig. 6, Col. 5 line 50 to Col. 6 line 33, Col. 7 lines 6-40, and Col. 15 lines 16-26 “Based on the type of data stream the platform 100 will scale accordingly to produce messages for DATA_QUEUE” and “Core 154 is consumer of the DATA_QUEUE. Messages taken from this queue may describe the source domain, entity, attributes of the entity and a command indicating the machine learning logic to be performed. This may include feature extraction, indexing, analyzing, classification, clustering, and supervised/unsupervised learning algorithms built into platform 30”);
“dispatching respective events from the queue for processing at a feature extraction engine” (Kasturi in at least Fig. 3, Fig. 4, Fig. 18, Fig. 19, Col. 4 line 55 to Col. 5 line 5, and Col. 19 line 44 to Col. 21 line 10 “The platform may include a data extraction and consumption (DEC) module to translate domain specific data into defined abstractions, breaking it down for consumption by a feature extraction engine”) by:
“processing a corresponding document for the respective event to generate a respective feature vector” (Kasturi in at least Fig. 4, Fig. 13, Fig. 18, Fig. 19, Col. 15 lines 30-44, and Col. 23 lines 46-59 “Data then may be processed 406 in the core 154, e.g., by feature extractor 164 to extract features from the data and create feature vectors 408”), and 
“storing the respective feature vector in a log file” (Kasturi in at least Fig. 4, Fig. 12, Fig. 13, Fig. 18, Fig. 19, Col. 15 lines 30-44, Col. 20 lines 28-38, and Col. 23 lines 46-59 “Data then may be processed 406 in the core 154, e.g., by feature extractor 164 to extract features from the data and create feature vectors 408” and “Features are then extracted 1218 and the model is annotated with features 1220 and stored as feature vectors 1222”);
“periodically triggering the processing of the log file having a plurality of feature vectors corresponding to each document of the plurality of documents by performing at least one of (a) grouping data in the log file; (b) separating metadata and vector data; (c) appending periodic data to a dataset; or (d) maintaining aggregated data representations in a datastore directory” (Kasturi in at least Fig. 4, Fig. 12, Fig. 13, Fig. 18, Fig. 19, Col. 15 lines 30-44, Col. 20 lines 28-38, and Col. 23 lines 46-59 “Data then may be processed 406 in the core 154, e.g., by feature extractor 164 to extract features from the data and create feature vectors 408” and “Features are then extracted 1218 and the model is annotated with features 1220 and stored as feature vectors 1222”. Examiner notes that in at least Col. 23 lines 46-67 Kasturi discloses that the metadata and vector data are separate from each other);
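
For illustration only, the sequence of limitations mapped above (an event per document placed in a queue, dispatch to a feature extraction step, a feature vector written to a log file, and later processing of the log that separates metadata from vector data) might be sketched as follows. All function names, the JSON-lines log layout, and the word-count feature vector are assumptions made for this sketch; it is not the implementation of Kasturi or of the present application. A scheduler invoking `process_log` at intervals would supply the "periodically triggering" aspect and is omitted here.

```python
import io
import json
import queue
from collections import Counter


def enqueue_events(documents, event_queue):
    """Generate one event per received document and store it in the queue."""
    for doc_id, text in documents.items():
        event_queue.put({"doc_id": doc_id, "text": text})


def extract_features(text):
    """Hypothetical feature extractor: word counts as a sparse vector."""
    return dict(Counter(text.lower().split()))


def dispatch_events(event_queue, log_file):
    """Take events off the queue and append one log entry per document."""
    while not event_queue.empty():
        event = event_queue.get()
        entry = {
            "metadata": {"doc_id": event["doc_id"]},
            "vector": extract_features(event["text"]),
        }
        log_file.write(json.dumps(entry) + "\n")


def process_log(log_file):
    """Read the log and separate metadata from vector data."""
    log_file.seek(0)
    metadata, vectors = [], {}
    for line in log_file:
        entry = json.loads(line)
        metadata.append(entry["metadata"])
        vectors[entry["metadata"]["doc_id"]] = entry["vector"]
    return metadata, vectors


events = queue.Queue()
log = io.StringIO()  # in-memory stand-in for a log file
enqueue_events({"d1": "cloud storage cloud", "d2": "label propagation"}, events)
dispatch_events(events, log)
metadata, vectors = process_log(log)
```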

Kasturi does not explicitly disclose:
“classifying respective documents of the plurality of documents by at least analyzing similarities between documents to perform label propagation”,
wherein label propagation comprises:
“identifying one or more neighbor documents based at least in part on previously generated feature vector for the respective document and feature vectors for previously classified documents”,
“identifying one or more label values for the one or more neighbor documents”,
“providing a recommendation of a label value for the respective document”,
“assigning the label value to the respective document”.
However, Chhichhia discloses:
“classifying respective documents of the plurality of documents by at least analyzing similarities between documents to perform label propagation” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “the model trainer 540 generates the model using an ensemble method, such as linear support vector classification, logistic regression, k-nearest neighbor, naïve Bayes, or stochastic gradient descent. The model trainer 540 outputs the learned model 545 to the classification module 550 for classifying documents of the other content entities” and “if the learner is a k-nearest neighbor classifier, the learner assigns each set aside sample an entity label based on the similarity between features of the sample and features of the other content entities”. Examiner notes that Kasturi discloses a k-nearest neighbor classifier but does not explicitly disclose that it analyzes similarities),
wherein label propagation comprises:
“identifying one or more neighbor documents based at least in part on previously generated feature vector for the respective document and feature vectors for 
“identifying one or more label values for the one or more neighbor documents” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “a k-nearest neighbor classifier, the learner assigns each set aside sample an entity label based on the similarity between features of the sample and features of the other content entities” and “The content classification system 410 assigns taxonomic labels to documents in the content catalog database 402 to classify the documents into a hierarchical taxonomy. In particular, the content classification system 410 trains a model for assigning taxonomic labels to a representative content entity, which is a content entity determined to have a high degree of similarity to the other content entities of the catalog database 402. One or more taxonomic labels are assigned to documents of other content entities using the 
“providing a recommendation of a label value for the respective document” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “a k-nearest neighbor classifier, the learner assigns each set aside sample an entity label based on the similarity between features of the sample and features of the other content entities” and “The content classification system 410 assigns taxonomic labels to documents in the content catalog database 402 to classify the documents into a hierarchical taxonomy. In particular, the content classification system 410 trains a model for assigning taxonomic labels to a representative content entity, which is a content entity determined to have a high degree of similarity to the other content entities of the catalog database 402. One or more taxonomic labels are assigned to documents of other content entities using the model trained for the representative content entity. Using the assigned labels, the content classification system 410 classifies the documents”),
“assigning the label value to the respective document” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “a k-nearest neighbor classifier, the learner assigns each set aside sample an entity label based on the similarity between features of the sample and features of the other content entities” and “The content classification system 410 assigns taxonomic labels to documents in the content catalog database 402 to classify the documents into a hierarchical taxonomy. In particular, the content classification system 410 trains a model for assigning taxonomic labels to a representative content entity, which is a content entity determined to have a high degree of similarity to the other content 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Kasturi and Chhichhia. Kasturi teaches improved utility of enterprise data by modeling data into features applicable to enterprise context and using the model to drive classification and clustering of data. Chhichhia teaches a content management system that stores electronic documents and generates topic progressions to recommend topics to users of the content management system. One of ordinary skill would have been motivated to combine Kasturi and Chhichhia because MPEP 2143 sets forth the Supreme Court rationales for obviousness, including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.
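
For illustration only, the neighbor-identification step of a k-nearest neighbor classifier of the general kind quoted from Chhichhia might be sketched as follows: the feature vector of an unlabeled document is compared against the feature vectors of previously classified documents, and the label values of the closest ones are read off. The cosine distance measure, the sparse dictionary vectors, and all names below are assumptions made for this sketch and are not taken from either reference.

```python
import math
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 minus cosine similarity for sparse vectors (1.0 if either is empty)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def nearest_neighbors(
    vector: Vector, labeled: Dict[str, Tuple[Vector, str]], k: int
) -> List[Tuple[float, str, str]]:
    """Return the k closest labeled documents as (distance, doc_id, label)."""
    scored = [
        (cosine_distance(vector, other), doc_id, label)
        for doc_id, (other, label) in labeled.items()
    ]
    return sorted(scored)[:k]


# Hypothetical previously classified documents.
labeled = {
    "d1": ({"invoice": 2, "tax": 1}, "finance"),
    "d2": ({"goal": 1, "match": 2}, "sports"),
}
neighbors = nearest_neighbors({"invoice": 1, "tax": 1}, labeled, k=1)
recommended = neighbors[0][2]  # label value of the closest neighbor
```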

In reference to claim 2. Kasturi and Chhichhia teach the method of claim 1 (as mentioned above), wherein identifying the one or more neighbor documents is performed by:
Chhichhia further discloses:
“at least determining a distance between the previously generated feature vector for the respective document and feature vectors for previously classified documents” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “a k-nearest neighbor classifier, the learner assigns each set aside sample an entity label based on the 

In reference to claim 3. Kasturi and Chhichhia teach the method of claim 1 (as mentioned above), wherein feature vectors comprise:
Chhichhia further discloses:
“term frequency vectors” (Chhichhia in at least ¶ [0091], ¶ [0099], ¶ [0106], and ¶ [0128] discloses term frequency vectors).

In reference to claim 4. Kasturi and Chhichhia teach the method of claim 1 (as mentioned above), wherein:
Kasturi further discloses:
“the log preprocessing aggregates log entries of the metadata and vector data that were written into log files by one or more feature extraction worker processes in the cloud-based storage system” (Kasturi in at least Fig. 3, Fig. 4, Col. 6 lines 8-23, Col. 10 lines 32-57, and Col. 18 lines 51-58 “Data access services 166-2 provides an API or gateway to a from the 

In reference to claim 5. Kasturi and Chhichhia teach the method of claim 1 (as mentioned above), wherein:
Kasturi further discloses:
“clustering is employed to perform the classification of the documents” (Kasturi in at least Figs. 1-4, Fig. 7, Fig. 9, Fig. 10, Fig. 14, Fig. 16, Fig. 18, and Fig. 19 discloses the classification/clustering of the documents. See also the corresponding sections).

In reference to claim 6. Kasturi and Chhichhia teach the method of claim 1 (as mentioned above), wherein:
Chhichhia further discloses:
“a similarity matrix is used to classify the documents” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] discloses classifying the documents by analyzing similarity. The classification module receives a set of taxonomic labels, which collectively define a hierarchical taxonomy. Fig. 9 illustrates an example confusion matrix used to determine feature overlap between content entities; the overlap is the similarity. ¶ [0100], ¶ [0101], and ¶ [0150] further disclose that the confusion matrix represents a similarity matrix, which is used to classify data).

In reference to claim 7. Kasturi and Chhichhia teach the method of claim 1 (as mentioned above), wherein:
Chhichhia further discloses:
“feature vectors are generated based on a tokenization of corresponding documents using a token to ID map” (Chhichhia in at least Fig. 17, Col. 10 ¶ [0120], ¶ [0126], and ¶ [0139]-

In reference to claim 10. Kasturi and Chhichhia teach the method of claim 1 (as mentioned above), wherein:
Kasturi further discloses:
“the recommended label is a label that was previously accepted by a user” (Kasturi in at least Col. 6 lines 48-65, and Col. 10 lines 14-20 “A dashboard 286 or other administration system may provide a user interface and administration tools for configuring the system 100, 230, modeling a domain, presenting visualization data, collecting and/or applying feedback from domain experts, configuring and/or monitoring classification and clustering processes, configuring other processes, and setting desired parameters and monitoring system function” and “Feedback obtained from a domain expert and/or an auto-control set is consumed back by feedback module, which reassigns the weights and relearns the classification model. This provided a constantly self-evolving ecosystem for the model to be consistent with the changing environment”).

In reference to claim 12. Kasturi teaches a system, comprising:
“a memory storing a set of instructions” (Kasturi in at least Fig. 2 and Fig. 3);
“a processor that executes the set of instructions to cause a set of acts” (Kasturi in at least Fig. 2 and Fig. 3) comprising:
“receiving a plurality of documents for storage in a cloud-based storage system” (Kasturi in at least Fig. 3, Fig. 4, Col. 6 lines 8-23, Col. 10 lines 32-57, and Col. 18 lines 51-58 “Data access services 166-2 provides an API or gateway to a from the System 100 to client web 
“generating an event for each document of the plurality of documents, the event for each document of the plurality of documents being stored in a queue” (Kasturi in at least Fig. 1, Fig. 6, Col. 5 line 50 to Col. 6 line 33, Col. 7 lines 6-40, and Col. 15 lines 16-26 “Based on the type of data stream the platform 100 will scale accordingly to produce messages for 
“dispatching respective events from the queue for processing at a feature extraction engine” (Kasturi in at least Fig. 3, Fig. 4, Fig. 18, Fig. 19, Col. 4 line 55 to Col. 5 line 5, and Col. 19 line 44 to Col. 21 line 10 “The platform may include a data extraction and consumption (DEC) module to translate domain specific data into defined abstractions, breaking it down for consumption by a feature extraction engine”) by:
“processing a corresponding document for the respective event to generate a respective feature vector” (Kasturi in at least Fig. 4, Fig. 13, Fig. 18, Fig. 19, Col. 15 lines 30-44, and Col. 23 lines 46-59 “Data then may be processed 406 in the core 154, e.g., by feature extractor 164 to extract features from the data and create feature vectors 408”), and
“storing the respective feature vector in a log file” (Kasturi in at least Fig. 4, Fig. 12, Fig. 13, Fig. 18, Fig. 19, Col. 15 lines 30-44, Col. 20 lines 28-38, and Col. 23 lines 46-59 “Data then may be processed 406 in the core 154, e.g., by feature extractor 164 to extract features from the data and create feature vectors 408” and “Features are then extracted 1218 and the model is annotated with features 1220 and stored as feature vectors 1222”);
“periodically triggering the processing of the log file having a plurality of feature vectors corresponding to each document of the plurality of documents by performing at least one of (a) grouping data in the log file; (b) separating metadata and vector data; (c) appending 
Kasturi does not explicitly disclose:
“classifying respective documents of the plurality of documents by at least analyzing similarities between documents to perform label propagation”,
wherein label propagation comprises:
“identifying one or more neighbor documents based at least in part on previously generated feature vector for the respective document and feature vectors for previously classified documents”,
“identifying one or more label values for the one or more neighbor documents”,
“providing a recommendation of a label value for the respective document”,
“assigning the label value to the respective document”.
However, Chhichhia discloses:
“classifying respective documents of the plurality of documents by at least analyzing similarities between documents to perform label propagation” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “the model trainer 540 generates the model using an ensemble method, such as linear support vector classification, logistic regression, k-nearest neighbor, naïve Bayes, or stochastic gradient descent. The model trainer 540 outputs the learned model 545 to the classification module 550 for 
wherein label propagation comprises:
“identifying one or more neighbor documents based at least in part on previously generated feature vector for the respective document and feature vectors for previously classified documents” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “a k-nearest neighbor classifier, the learner assigns each set aside sample an entity label based on the similarity between features of the sample and features of the other content entities” and “The content classification system 410 assigns taxonomic labels to documents in the content catalog database 402 to classify the documents into a hierarchical taxonomy. In particular, the content classification system 410 trains a model for assigning taxonomic labels to a representative content entity, which is a content entity determined to have a high degree of similarity to the other content entities of the catalog database 402. One or more taxonomic labels are assigned to documents of other content entities using the model trained for the representative content entity. Using the assigned labels, the content classification system 410 classifies the documents”),
“identifying one or more label values for the one or more neighbor documents” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “a k-nearest neighbor classifier, the learner assigns each set aside sample 
“providing a recommendation of a label value for the respective document” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “a k-nearest neighbor classifier, the learner assigns each set aside sample an entity label based on the similarity between features of the sample and features of the other content entities” and “The content classification system 410 assigns taxonomic labels to documents in the content catalog database 402 to classify the documents into a hierarchical taxonomy. In particular, the content classification system 410 trains a model for assigning taxonomic labels to a representative content entity, which is a content entity determined to have a high degree of similarity to the other content entities of the catalog database 402. One or more taxonomic labels are assigned to documents of other content entities using the model trained for the representative content entity. Using the assigned labels, the content classification system 410 classifies the documents”),
“assigning the label value to the respective document” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “a k-nearest neighbor classifier, the learner assigns each set aside sample an entity label based on the similarity between features of the sample and features of the other content entities” and “The content classification system 410 assigns taxonomic labels to documents in the content catalog database 402 to classify the documents into a hierarchical taxonomy. In particular, the content classification system 410 trains a model for assigning taxonomic labels to a representative content entity, which is a content entity determined to have a high degree of similarity to the other content entities of the catalog database 402. One or more taxonomic labels are assigned to documents of other content entities using the model trained for the representative content entity. Using the assigned labels, the content classification system 410 classifies the documents”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Kasturi and Chhichhia. Kasturi teaches improved utility of enterprise data by modeling data into features applicable to enterprise context and using the model to drive classification and clustering of data. Chhichhia teaches a content management system that stores electronic documents and generates topic progressions to recommend topics to users of the content management system. One of ordinary skill would have been motivated to combine Kasturi and Chhichhia because MPEP 2143 sets forth the Supreme Court rationales for obviousness, including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the 

In reference to claim 13. Kasturi and Chhichhia teach the system of claim 12 (as mentioned above), wherein feature vectors comprise:
Chhichhia further discloses:
“term frequency vectors” (Chhichhia in at least ¶ [0091], ¶ [0099], ¶ [0106], and ¶ [0128] discloses term frequency vectors).

In reference to claim 14. Kasturi and Chhichhia teach the system of claim 12 (as mentioned above), wherein:
Kasturi further discloses:
“the log preprocessing aggregates log entries of the metadata and vector data that were written into log files by one or more feature extraction worker processes in the cloud-based storage system” (Kasturi in at least Fig. 3, Fig. 4, Col. 6 lines 8-23, Col. 10 lines 32-57, and Col. 18 lines 51-58 “Data access services 166-2 provides an API or gateway to a from the System 100 to client web services 161, e.g., for transmitting queries from an enterprise connected device (e.g., 294-1, 294-2 of FIG. 2) and returning metadata or visualization data to the connected device(s). Support may include: Text file such as csv exports [and] Databases (MySQL, Microsoft Sql Server, Mango, etc.)”, “Data extraction/consumption module (DEC) 148 may receive data from a variety of data sources […] Each of these data sources 302, 304, 306, and 308 may communicate with DEC 148 via a corresponding input module 310,312,314, and 316, respectively. Domain databases 304 may provide enterprise-specific information, records and other data regarding a particular enterprise (also referred 

In reference to claim 15. Kasturi and Chhichhia teach the system of claim 12 (as mentioned above), wherein:
Kasturi further discloses:
“clustering is employed to perform the classification of the documents” (Kasturi in at least Figs. 1-4, Fig. 7, Fig. 9, Fig. 10, Fig. 14, Fig. 16, Fig. 18, and Fig. 19 discloses the classification/clustering of the documents. See also the corresponding sections).

In reference to claim 16. Kasturi and Chhichhia teach the system of claim 12 (as mentioned above), wherein:
Chhichhia further discloses:
“a similarity matrix is used to classify the documents” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] discloses classifying the documents by analyzing similarity. The classification module receives a set of taxonomic labels, which collectively define a hierarchical taxonomy. Fig. 9 illustrates an example confusion matrix used to determine feature overlap between content entities; the overlap is the similarity. ¶ [0100], ¶ [0101], and ¶ [0150] further disclose that the confusion matrix represents a similarity matrix, which is used to classify data).

In reference to claim 17. Kasturi and Chhichhia teach the system of claim 12 (as mentioned above), wherein:
Chhichhia further discloses:
“feature vectors are generated based on a tokenization of corresponding documents using a token to ID map” (Chhichhia in at least Fig. 17, Col. 10 ¶ [0120], ¶ [0126], and ¶ [0139]-[0143] “the topic extraction module 1505 tokenizes text of the document into n-gram tokens and identifies tokens likely to be topics of the document”).
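
For illustration only, generating a feature vector from a tokenization of a document using a token to ID map might be sketched as follows. The unigram tokenizer, the incrementally grown map, and the function names are assumptions made for this sketch; Chhichhia describes n-gram tokens, and nothing below is taken from that reference or from the present application.

```python
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Hypothetical tokenizer: lowercase alphanumeric unigrams."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_frequency_vector(text: str, token_to_id: Dict[str, int]) -> Dict[int, int]:
    """Count tokens by integer ID, assigning a new ID to each unseen token."""
    counts: Counter = Counter()
    for token in tokenize(text):
        token_id = token_to_id.setdefault(token, len(token_to_id))
        counts[token_id] += 1
    return dict(counts)


token_to_id: Dict[str, int] = {}
vector = term_frequency_vector("Cloud storage, cloud files", token_to_id)
```

Because the map is shared across calls, a token that appears in several documents receives the same ID in each resulting vector, which is what allows the vectors to be compared.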

In reference to claim 20. Kasturi and Chhichhia teach the system of claim 12 (as mentioned above), wherein:
Kasturi further discloses:
“the recommended label is a label that was previously accepted by a user” (Kasturi in at least Col. 6 lines 48-65, and Col. 10 lines 14-20 “A dashboard 286 or other administration 

In reference to claim 22. Kasturi teaches a computer program product embodied on a non-transitory computer usable medium, having stored thereon a sequence of instructions which, when executed by a processor, causes a set of acts (Kasturi in at least Fig. 2 and Fig. 3), comprising:
“receiving a plurality of documents for storage in a cloud-based storage system” (Kasturi in at least Fig. 3, Fig. 4, Col. 6 lines 8-23, Col. 10 lines 32-57, and Col. 18 lines 51-58 “Data access services 166-2 provides an API or gateway to a from the System 100 to client web services 161, e.g., for transmitting queries from an enterprise connected device (e.g., 294-1, 294-2 of FIG. 2) and returning metadata or visualization data to the connected device(s). Support may include: Text file such as csv exports [and] Databases (MySQL, Microsoft Sql Server, Mango, etc.)”, “Data extraction/consumption module (DEC) 148 may receive data from a variety of data sources […] Each of these data sources 302, 304, 306, and 308 may communicate with DEC 148 via a corresponding input module 310,312,314, and 316, respectively. Domain databases 304 may provide enterprise-specific information, records and other data regarding a particular enterprise (also referred to herein as experienced-based information). Such enterprise-specific information may relate to, for example, 
“generating an event for each document of the plurality of documents, the event for each document of the plurality of documents being stored in a queue” (Kasturi in at least Fig. 1, Fig. 6, Col. 5 line 50 to Col. 6 line 33, Col. 7 lines 6-40, and Col. 15 lines 16-26 “Based on the type of data stream the platform 100 will scale accordingly to produce messages for DATA_QUEUE” and “Core 154 is consumer of the DATA_QUEUE. Messages taken from this queue may describe the source domain, entity, attributes of the entity and a command indicating the machine learning logic to be performed. This may include feature extraction, indexing, analyzing, classification, clustering, and supervised/unsupervised learning algorithms built into platform 30”);
“dispatching respective events from the queue for processing at a feature extraction engine” (Kasturi in at least Fig. 3, Fig. 4, Fig. 18, Fig. 19, Col. 4 line 55 to Col. 5 line 5, and Col. 19 line 44 to Col. 21 line 10 “The platform may include a data extraction and consumption 
“processing a corresponding document for the respective event to generate a respective feature vector” (Kasturi in at least Fig. 4, Fig. 13, Fig. 18, Fig. 19, Col. 15 lines 30-44, and Col. 23 lines 46-59 “Data then may be processed 406 in the core 154, e.g., by feature extractor 164 to extract features from the data and create feature vectors 408”), and 
“storing the respective feature vector in a log file” (Kasturi in at least Fig. 4, Fig. 12, Fig. 13, Fig. 18, Fig. 19, Col. 15 lines 30-44, Col. 20 lines 28-38, and Col. 23 lines 46-59 “Data then may be processed 406 in the core 154, e.g., by feature extractor 164 to extract features from the data and create feature vectors 408” and “Features are then extracted 1218 and the model is annotated with features 1220 and stored as feature vectors 1222”);
“periodically triggering the processing of the log file having a plurality of feature vectors corresponding to each document of the plurality of documents by performing at least one of (a) grouping data in the log file; (b) separating metadata and vector data; (c) appending periodic data to a dataset; or (d) maintaining aggregated data representations in a datastore directory” (Kasturi in at least Fig. 4, Fig. 12, Fig. 13, Fig. 18, Fig. 19, Col. 15 lines 30-44, Col. 20 lines 28-38, and Col. 23 lines 46-59 “Data then may be processed 406 in the core 154, e.g., by feature extractor 164 to extract features from the data and create feature vectors 408” and “Features are then extracted 1218 and the model is annotated with features 1220 and stored as feature vectors 1222”. Examiner notes that in at least Col. 23 lines 46-67 Kasturi discloses that the metadata and vector data are separate from each other);


Kasturi does not explicitly disclose:
“classifying respective documents of the plurality of documents by at least analyzing similarities between documents to perform label propagation”,
wherein label propagation comprises:
“identifying one or more neighbor documents based at least in part on previously generated feature vector for the respective document and feature vectors for previously classified documents”,
“identifying one or more label values for the one or more neighbor documents”,
“providing a recommendation of a label value for the respective document”,
“assigning the label value to the respective document”.
However, Chhichhia discloses:
“classifying respective documents of the plurality of documents by at least analyzing similarities between documents to perform label propagation” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “the model trainer 540 generates the model using an ensemble method, such as linear support vector classification, logistic regression, k-nearest neighbor, naïve Bayes, or stochastic gradient descent. The model trainer 540 outputs the learned model 545 to the classification module 550 for classifying documents of the other content entities” and “if the learner is a k-nearest neighbor classifier, the learner assigns each set aside sample an entity label based on the similarity between features of the sample and features of the other content entities”. Examiner notes that Kasturi discloses a k-nearest neighbor classifier but does not explicitly disclose that it analyzes similarities),
wherein label propagation comprises:
“identifying one or more neighbor documents based at least in part on previously generated feature vector for the respective document and feature vectors for previously classified documents” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “a k-nearest neighbor classifier, the learner assigns each set aside sample an entity label based on the similarity between features of the sample and features of the other content entities” and “The content classification system 410 assigns taxonomic labels to documents in the content catalog database 402 to classify the documents into a hierarchical taxonomy. In particular, the content classification system 410 trains a model for assigning taxonomic labels to a representative content entity, which is a content entity determined to have a high degree of similarity to the other content entities of the catalog database 402. One or more taxonomic labels are assigned to documents of other content entities using the model trained for the representative content entity. Using the assigned labels, the content classification system 410 classifies the documents”),
“identifying one or more label values for the one or more neighbor documents” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “a k-nearest neighbor classifier, the learner assigns each set aside sample an entity label based on the similarity between features of the sample and features of the other content entities” and “The content classification system 410 assigns taxonomic labels to documents in the content catalog database 402 to classify the documents into a hierarchical taxonomy. In particular, the content classification system 410 trains a model for assigning taxonomic labels to a representative content entity, which is a content entity determined to have a high degree of 
“providing a recommendation of a label value for the respective document” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “a k-nearest neighbor classifier, the learner assigns each set aside sample an entity label based on the similarity between features of the sample and features of the other content entities” and “The content classification system 410 assigns taxonomic labels to documents in the content catalog database 402 to classify the documents into a hierarchical taxonomy. In particular, the content classification system 410 trains a model for assigning taxonomic labels to a representative content entity, which is a content entity determined to have a high degree of similarity to the other content entities of the catalog database 402. One or more taxonomic labels are assigned to documents of other content entities using the model trained for the representative content entity. Using the assigned labels, the content classification system 410 classifies the documents”),
“assigning the label value to the respective document” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] “a k-nearest neighbor classifier, the learner assigns each set aside sample an entity label based on the similarity between features of the sample and features of the other content entities” and “The content classification system 410 assigns taxonomic labels to documents in the content catalog database 402 to classify the documents into a hierarchical taxonomy. In particular, the content classification system 410 trains a 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Kasturi and Chhichhia. Kasturi teaches improved utility of enterprise data by modeling data into features applicable to enterprise context and using the model to drive classification and clustering of data. Chhichhia teaches a content management system that stores electronic documents and generates topic progressions to recommend topics to users of the content management system. One of ordinary skill would have been motivated to combine Kasturi and Chhichhia because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.
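
For illustration only, the neighbor-identification step recited above can be sketched as follows. This is a minimal sketch and not a characterization of how Kasturi, Chhichhia, or the present application implements the step: the use of cosine similarity, the function names (cosine_similarity, nearest_neighbors), and the sample documents and labels are all assumptions introduced here.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def nearest_neighbors(query_vec, labeled, k):
    """Return the k previously classified documents most similar to query_vec.

    labeled is a list of (doc_id, feature_vector, label) tuples.
    The result is a list of (doc_id, label, similarity), most similar first.
    """
    scored = [
        (doc_id, label, cosine_similarity(query_vec, vec))
        for doc_id, vec, label in labeled
    ]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:k]

# Hypothetical, previously classified documents with stored feature vectors.
labeled = [
    ("doc_a", [1.0, 0.0, 0.0], "contract"),
    ("doc_b", [0.9, 0.1, 0.0], "contract"),
    ("doc_c", [0.0, 1.0, 0.0], "invoice"),
]

# Identify neighbors of an unlabeled document, then read off their label values.
neighbors = nearest_neighbors([1.0, 0.05, 0.0], labeled, k=2)
neighbor_labels = [label for _, label, _ in neighbors]
```

In this sketch the label values of the identified neighbors are the input from which a recommendation would subsequently be derived.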

In reference to claim 23. Kasturi and Chhichhia teach the computer program product of claim 22 (as mentioned above), wherein feature vectors comprise:
Chhichhia further discloses:
“term frequency vectors” (Chhichhia in at least ¶ [0091], ¶ [0099], ¶ [0106], and ¶ [0128] disclose the term frequency vector).

In reference to claim 24. Kasturi and Chhichhia teach the computer program product of claim 22 (as mentioned above), wherein:
Kasturi further discloses:
“the log preprocessing aggregates log entries of the metadata and vector data that were written into log files by one or more feature extraction worker processes in the cloud-based storage system” (Kasturi in at least Fig. 3, Fig. 4, Col. 6 lines 8-23, Col. 10 lines 32-57, and Col. 18 lines 51-58 “Data access services 166-2 provides an API or gateway to a from the System 100 to client web services 161, e.g., for transmitting queries from an enterprise connected device (e.g., 294-1, 294-2 of FIG. 2) and returning metadata or visualization data to the connected device(s). Support may include: Text file such as csv exports [and] Databases (MySQL, Microsoft Sql Server, Mango, etc.)”, “Data extraction/consumption module (DEC) 148 may receive data from a variety of data sources […] Each of these data sources 302, 304, 306, and 308 may communicate with DEC 148 via a corresponding input module 310,312,314, and 316, respectively. Domain databases 304 may provide enterprise-specific information, records and other data regarding a particular enterprise (also referred to herein as experienced-based information). Such enterprise-specific information may relate to, for example, customers, transactions, support requests, products, parts, trouble codes, diagnostic codes, repair or service information, and financial data. In an embodiment for processing automotive service information, for example, domain DB 304 may comprise automotive service records, service orders, vehicle diagnostic information, or other historical or experience-based service information”, and “Input modules 930 may be used by the DEC 148 runtime component to extract data from external data sources and pre-process them into entity representations. Thus, input module 930 may act as connector to 

In reference to claim 25. Kasturi and Chhichhia teach the computer program product of claim 22 (as mentioned above), wherein:
Kasturi further discloses:
“clustering is employed to perform the classification of the documents” (Kasturi in at least Figs. 1-4, Fig. 7, Fig. 9, Fig. 10, Fig. 14, Fig. 16, Fig. 18, and Fig. 19 discloses the classification/clustering of the documents. See also corresponding sections).

In reference to claim 26. Kasturi and Chhichhia teach the computer program product of claim 22 (as mentioned above), wherein:
Chhichhia further discloses:
“a similarity matrix is used to classify the documents” (Chhichhia in at least ¶ [0079], ¶ [0084], ¶ [0085], ¶ [0095], ¶ [0096], ¶ [0101], and ¶ [0107] discloses classifying the documents by analyzing similarity. The classification module receives a set of taxonomic labels, which collectively define a hierarchical taxonomy. Fig. 9 illustrates an example confusion matrix used to determine feature overlap between content entities, the overlap is 

In reference to claim 27. Kasturi and Chhichhia teach the computer program product of claim 22 (as mentioned above), wherein the:
Chhichhia further discloses:
“feature vectors are generated based on a tokenization of corresponding documents using a token to ID map” (Chhichhia in at least Fig. 17, Col. 10 ¶ [0120], ¶ [0126], and ¶ [0139]-[0143] “the topic extraction module 1505 tokenizes text of the document into n-gram tokens and identifies tokens likely to be topics of the document”).
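
For illustration only, generating a term frequency feature vector from a tokenized document by way of a token to ID map can be sketched as follows. This is a minimal sketch under stated assumptions: it uses single-word tokens rather than the n-gram tokens quoted from Chhichhia, and the helper names (tokenize, build_token_to_id, term_frequency_vector) and sample texts are hypothetical.

```python
import re

def tokenize(text):
    """Split text into lowercase alphanumeric tokens (assumed tokenization rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_token_to_id(documents):
    """Assign each distinct token a stable integer ID in order of first appearance."""
    token_to_id = {}
    for text in documents:
        for token in tokenize(text):
            if token not in token_to_id:
                token_to_id[token] = len(token_to_id)
    return token_to_id

def term_frequency_vector(text, token_to_id):
    """Count token occurrences at the positions given by the token to ID map."""
    vector = [0] * len(token_to_id)
    for token in tokenize(text):
        index = token_to_id.get(token)
        if index is not None:  # tokens missing from the map are ignored
            vector[index] += 1
    return vector

# Hypothetical sample documents.
docs = ["Invoice for brake repair", "Brake repair order, brake pads"]
token_to_id = build_token_to_id(docs)
vectors = [term_frequency_vector(d, token_to_id) for d in docs]
```

Because every document is projected onto the same map, the resulting vectors have equal length and can be compared position by position.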

In reference to claim 30. Kasturi and Chhichhia teach the computer program product of claim 22 (as mentioned above), wherein:
Kasturi further discloses:
“the recommended label is a label that was previously accepted by a user” (Kasturi in at least Col. 6 lines 48-65, and Col. 10 lines 14-20 “A dashboard 286 or other administration system may provide a user interface and administration tools for configuring the system 100, 230, modeling a domain, presenting visualization data, collecting and/or applying feedback from domain experts, configuring and/or monitoring classification and clustering processes, configuring other processes, and setting desired parameters and monitoring system function” and “Feedback obtained from a domain expert and/or an auto-control set is consumed back by feedback module, which reassigns the weights and relearns the classification model. This provided a constantly self-evolving ecosystem for the model to be consistent with the changing environment”).

Claims 8, 9, 11, 18, 19, 21, 28, 29, and 31 are rejected under 35 U.S.C. 103 as being unpatentable over Kasturi et al. (hereinafter Kasturi) US 10157347 B1 in view of Chhichhia et al. (hereinafter Chhichhia) US 20160147891 A1 and further in view of Krishna et al. (hereinafter Krishna) “A New Approach to Mining Fuzzy Databases Using Nearest Neighbor”.


In reference to claim 8. Kasturi and Chhichhia teach the method of claim 1 (as mentioned above), wherein:
Kasturi and Chhichhia do not explicitly disclose:
“the label value recommended is determined based at least in part on a majority of the one or more neighbor documents having the label value”.
Krishna further discloses:
“the label value recommended is determined based at least in part on a majority of the one or more neighbor documents having the label value” (Krishna in at least § 2 “the algorithm scans all the data to find a subset of cases that is most similar to it and predicts the outcome using them. That is, in the k-NN classification, the k closest patterns are found and a voting scheme is used to determine the outcome. Thus, the k-NN technique assumes that locality in the featured space may often imply strong relationships among the class labels”. The algorithm recommends a label based on the k closest patterns; the most similar label is accepted and propagated).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Kasturi, Chhichhia, and Krishna. Kasturi teaches improved utility of enterprise data by modeling data into features applicable to enterprise context and using the model to 

In reference to claim 9. Kasturi and Chhichhia teach the method of claim 1 (as mentioned above), wherein:
Kasturi and Chhichhia do not explicitly disclose:
“a recommendation is not to label when there is no majority of the one or more neighbor documents having a certain label value”.
Krishna further discloses:
“a recommendation is not to label when there is no majority of the one or more neighbor documents having a certain label value” (Krishna in at least § 2 “The k-NN technique does not classify a case if there is no majority outcome out of the k neighbors”. The algorithm does not recommend a label if there is no majority outcome; in that case the label is neither accepted nor propagated).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Kasturi, Chhichhia, and Krishna. Kasturi teaches improved utility of 

In reference to claim 11. Kasturi and Chhichhia teach the method of claim 1 (as mentioned above), wherein:
Kasturi and Chhichhia do not explicitly disclose:
“a conflict resolution process selects a label value associated with a neighbor document that has a closest distance to the document”.
Krishna further discloses:
“a conflict resolution process selects a label value associated with a neighbor document that has a closest distance to the document” (Krishna in at least § 2 “the algorithm scans all the data to find a subset of cases that is most similar to it and predicts the outcome using them. That is, in the k-NN classification, the k closest patterns are found and a voting scheme is used to determine the outcome. Thus, the k-NN technique assumes that locality in the featured space may often imply strong relationships among the class labels”. The algorithms 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Kasturi, Chhichhia, and Krishna. Kasturi teaches improved utility of enterprise data by modeling data into features applicable to enterprise context and using the model to drive classification and clustering of data. Chhichhia teaches a content management system that stores electronic documents and generates topic progressions to recommend topics to users of the content management system. Krishna teaches an application of the k-nearest neighbor classification technique. One of ordinary skill would have been motivated to combine Kasturi, Chhichhia, and Krishna because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.
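
For illustration only, the three neighbor-based decision rules addressed in claims 8, 9, and 11 (majority vote, no recommendation absent a majority, and conflict resolution by closest distance) can be sketched together as follows. This is a minimal sketch and not a description of Krishna's algorithm or of the claimed implementation: treating "majority" as a strict majority (more than half of the neighbors), the function name recommend_label, and the tie_break_closest flag are all assumptions introduced here.

```python
from collections import Counter

def recommend_label(neighbors, tie_break_closest=False):
    """Recommend a label from a list of (label, distance) neighbor pairs.

    Returns the label held by a strict majority of neighbors (assumed rule).
    When no label has a majority, returns None (no recommendation) unless
    tie_break_closest is set, in which case the label of the neighbor with
    the smallest distance is returned.
    """
    if not neighbors:
        return None
    counts = Counter(label for label, _ in neighbors)
    top_label, top_count = counts.most_common(1)[0]
    if top_count > len(neighbors) / 2:
        return top_label
    if tie_break_closest:
        closest_label, _ = min(neighbors, key=lambda pair: pair[1])
        return closest_label
    return None

# Hypothetical neighbor sets for an unlabeled document.
majority_case = [("contract", 0.10), ("contract", 0.20), ("invoice", 0.05)]
split_case = [("contract", 0.30), ("invoice", 0.10)]

majority_result = recommend_label(majority_case)
abstain_result = recommend_label(split_case)
closest_result = recommend_label(split_case, tie_break_closest=True)
```

Here the first call recommends the majority label even though a differently labeled neighbor is closer, the second call makes no recommendation, and the third resolves the conflict in favor of the closest neighbor's label.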

In reference to claim 18. Kasturi and Chhichhia teach the system of claim 12 (as mentioned above), wherein:
Kasturi and Chhichhia do not explicitly disclose:
“the label value recommended is determined based at least in part on a majority of the one or more neighbor documents having the label value”.
Krishna further discloses:
“the label value recommended is determined based at least in part on a majority of the one or more neighbor documents having the label value” (Krishna in at least § 2 “the algorithm 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Kasturi, Chhichhia, and Krishna. Kasturi teaches improved utility of enterprise data by modeling data into features applicable to enterprise context and using the model to drive classification and clustering of data. Chhichhia teaches a content management system that stores electronic documents and generates topic progressions to recommend topics to users of the content management system. Krishna teaches an application of the k-nearest neighbor classification technique. One of ordinary skill would have been motivated to combine Kasturi, Chhichhia, and Krishna because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

In reference to claim 19. Kasturi and Chhichhia teach the system of claim 12 (as mentioned above), wherein:
Kasturi and Chhichhia do not explicitly disclose:
“a recommendation is not to label when there is no majority of the one or more neighbor documents having a certain label value”.
Krishna further discloses:
“a recommendation is not to label when there is no majority of the one or more neighbor documents having a certain label value” (Krishna in at least § 2 “The k-NN technique does not classify a case if there is no majority outcome out of the k neighbors”. The algorithm does not recommend a label if there is no majority outcome; in that case the label is neither accepted nor propagated).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Kasturi, Chhichhia, and Krishna. Kasturi teaches improved utility of enterprise data by modeling data into features applicable to enterprise context and using the model to drive classification and clustering of data. Chhichhia teaches a content management system that stores electronic documents and generates topic progressions to recommend topics to users of the content management system. Krishna teaches an application of the k-nearest neighbor classification technique. One of ordinary skill would have been motivated to combine Kasturi, Chhichhia, and Krishna because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art.

In reference to claim 21. Kasturi and Chhichhia teach the system of claim 12 (as mentioned above), wherein:

Kasturi and Chhichhia do not explicitly disclose:
“a conflict resolution process selects a label value associated with a neighbor document that has a closest distance to the document”.
Krishna further discloses:
“a conflict resolution process selects a label value associated with a neighbor document that has a closest distance to the document” (Krishna in at least § 2 “the algorithm scans all the data to find a subset of cases that is most similar to it and predicts the outcome using them. That is, in the k-NN classification, the k closest patterns are found and a voting scheme is used to determine the outcome. Thus, the k-NN technique assumes that locality in the featured space may often imply strong relationships among the class labels”. The algorithm recommends a label based on the k closest patterns; the most similar label is accepted and propagated).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Kasturi, Chhichhia, and Krishna. Kasturi teaches improved utility of enterprise data by modeling data into features applicable to enterprise context and using the model to drive classification and clustering of data. Chhichhia teaches a content management system that stores electronic documents and generates topic progressions to recommend topics to users of the content management system. Krishna teaches an application of the k-nearest neighbor classification technique. One of ordinary skill would have been motivated to combine Kasturi, Chhichhia, and Krishna because MPEP 2143 sets forth the Supreme Court rationales for obviousness including: (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the 

In reference to claim 28. Kasturi and Chhichhia teach the computer program product of claim 22 (as mentioned above), wherein:
Kasturi and Chhichhia do not explicitly disclose:
“the label value recommended is determined based at least in part on a majority of the one or more neighbor documents having the label value”.
Krishna further discloses:
“the label value recommended is determined based at least in part on a majority of the one or more neighbor documents having the label value” (Krishna in at least § 2 “the algorithm scans all the data to find a subset of cases that is most similar to it and predicts the outcome using them. That is, in the k-NN classification, the k closest patterns are found and a voting scheme is used to determine the outcome. Thus, the k-NN technique assumes that locality in the featured space may often imply strong relationships among the class labels”. The algorithm recommends a label based on the k closest patterns; the most similar label is accepted and propagated).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Kasturi, Chhichhia, and Krishna. Kasturi teaches improved utility of enterprise data by modeling data into features applicable to enterprise context and using the model to drive classification and clustering of data. Chhichhia teaches a content management system that stores electronic documents and generates topic progressions to recommend topics to users of the content management system. Krishna teaches an application of the k-nearest neighbor classification technique. One of ordinary skill would have been motivated to combine Kasturi, Chhichhia, and Krishna because MPEP 

In reference to claim 29. Kasturi and Chhichhia teach the computer program product of claim 22 (as mentioned above), wherein:
Kasturi and Chhichhia do not explicitly disclose:
“a recommendation is not to label when there is no majority of the one or more neighbor documents having a certain label value”.
Krishna further discloses:
“a recommendation is not to label when there is no majority of the one or more neighbor documents having a certain label value” (Krishna in at least § 2 “The k-NN technique does not classify a case if there is no majority outcome out of the k neighbors”. The algorithm does not recommend a label if there is no majority outcome; in that case the label is neither accepted nor propagated).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Kasturi, Chhichhia, and Krishna. Kasturi teaches improved utility of enterprise data by modeling data into features applicable to enterprise context and using the model to drive classification and clustering of data. Chhichhia teaches a content management system that stores electronic documents and generates topic progressions to recommend topics to users of the content management system. Krishna teaches an application of the k-nearest neighbor classification technique. 

In reference to claim 31. Kasturi and Chhichhia teach the computer program product of claim 22 (as mentioned above), wherein:
Kasturi and Chhichhia do not explicitly disclose:
“a conflict resolution process selects a label value associated with a neighbor document that has a closest distance to the document”.
Krishna further discloses:
“a conflict resolution process selects a label value associated with a neighbor document that has a closest distance to the document” (Krishna in at least § 2 “the algorithm scans all the data to find a subset of cases that is most similar to it and predicts the outcome using them. That is, in the k-NN classification, the k closest patterns are found and a voting scheme is used to determine the outcome. Thus, the k-NN technique assumes that locality in the featured space may often imply strong relationships among the class labels”. The algorithm recommends a label based on the k closest patterns; the most similar label is accepted and propagated).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Kasturi, Chhichhia, and Krishna. Kasturi teaches improved utility of 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Viker A. Lamardo whose telephone number is (571)270-5871. The examiner can normally be reached Mon. - Fri. 9 AM - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann J. Lo, can be reached at (571)272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To 





/VIKER A LAMARDO/Primary Examiner, Art Unit 2126