DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Claim Rejections – 35 U.S.C. § 101
35 U.S.C. § 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-10 are rejected under 35 U.S.C. § 101 because the claimed invention is directed to non-statutory subject matter.

These claims are rejected under 35 USC §101 because the claimed invention is directed to an abstract idea without significantly more.  Claim 1 recites at a very high level categorizing documents into particular classification bins, and re-categorizing if any of those classification bins contain more than 1 document.  

Regarding independent claim 1: 
Statutory Category:  Yes, recites a series of steps executed on a generic device (therefore a process or a generic product).
 
Step 2A, Prong 1 (Judicial Exception Recited?):  Yes.  The claim recites a series of steps including performing processes/steps to classify documents into groups, recognizing whether more than one document has been classified into any group, and performing another classification process in any such group.  These concepts, under a broadest reasonable interpretation, encompass the performance of the limitations in the mind (and possibly using a pen/paper), or alternatively the solving of a math problem (i.e., using vague / unstated mathematical concepts for categorizing/classifying documents).  Use of a mathematical concept integrated into a practical application may represent patent eligible subject matter, but the mere solving of a math problem is considered an abstract idea.
It is further noted that generic hardware (i.e., a “device”) is also claimed.  
For example, the claim limitation directed to “… sequentially apply a plurality of classification processes in a prescribed sequence to a plurality of documents to classify the plurality of documents into a plurality of groups” merely encompasses categorizing and re-categorizing data (in this case documents) into certain categories/classes/types.  Further, “every time one of the plurality of classification processes is applied, determine whether or not each group contains two or more documents”, merely involves recognizing that multiple documents are of the same category/class/type. And, “after applying a preceding one of the plurality of classification processes, applies a succeeding one of the plurality of classification processes to the two or more documents in each group determined as containing the two or more documents”, merely involves reperforming the first step using a different method for categorizing the data/document.  And the last limitation “the prescribed sequence is an ascending order of an amount of calculation involved in the plurality of classification processes” appears to mean that one performs the first categorizing step first, then the second step.  
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic components (or the use of pen/paper), then it falls within the “Mental Processes” grouping of abstract ideas.  
Alternatively, other than reciting additional generic elements, such as processors and storage, nothing in the claim precludes characterization as a mathematical concept.  For example, the claim encompasses the performance of an unclaimed / vague mathematical operation to establish categories for data/documents.  These limitations therefore may also reasonably be characterized as encompassing mathematical concepts (i.e., an abstract idea).  
Accordingly, the claim recites an abstract idea.  I.e., these limitations encompass mental processes, or in the alternative a mathematical concept (an abstract idea).  
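For illustration only, the sequence of steps discussed above can be sketched in a few lines of Python.  The sketch is not taken from the application: the function name cascade_classify, the representation of each classification process as a (cost, key function) pair, and the example processes named in the note that follows are all assumptions made here for clarity.

```python
from collections import defaultdict


def cascade_classify(documents, processes):
    """Hypothetical sketch of the recited steps.

    processes: list of (cost, key_fn) pairs, where cost stands in for the
    "amount of calculation" and key_fn maps a document to a group label.
    Both the pair layout and the cost measure are assumptions.
    """
    # Prescribed sequence: ascending order of the assumed cost value.
    ordered = [key_fn for _, key_fn in sorted(processes, key=lambda p: p[0])]

    groups = [list(documents)]
    for key_fn in ordered:
        next_groups = []
        for group in groups:
            # Determine whether the group contains two or more documents.
            if len(group) < 2:
                next_groups.append(group)  # nothing left to separate
                continue
            # Apply the succeeding process only within that group.
            buckets = defaultdict(list)
            for doc in group:
                buckets[key_fn(doc)].append(doc)
            next_groups.extend(buckets.values())
        groups = next_groups
    return groups
```

Under these assumptions, the documents "aa", "ab", "b" and "abc", classified first by length (assumed cost 1) and then by two-character prefix (assumed cost 2), end in four single-document groups.  Each process is used exactly once here, which is only one possible reading of the claim.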

Step 2A, Prong 2 (Integrated into a Practical Application?):  No.  The claim recites a series of steps directed to use of a plurality of classification techniques in which a first classification technique categorizes data/documents in bins/categories, recognizing whether more than one such document has been categorized into the same category/bin, then performing another classification technique thereby forming a series of classification techniques performed in sequential order.  These concepts, under a broadest reasonable interpretation and other than reciting additional generic elements, such as a device, cover performance of the limitations in the mind (or with the aid of pencil and paper).  Alternatively, other than reciting additional generic elements, such as a device, nothing in the claim precludes characterization as mental processes or a mathematical concept.  
The computing elements are recited at a high-level of generality such that the claim amounts to no more than mere instructions to apply the exception using generic computer components.  Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose meaningful limits on practicing the abstract idea.  Therefore, the claim is directed to an abstract idea. 

Step 2B (Inventive Concept Provided?):  No.  As discussed with respect to Step 2A, the elements (i.e., steps directed to use of a plurality of classification techniques in which a first classification technique categorizes data/documents in bins/categories, recognizing whether more than one such document has been categorized into the same category/bin, then performing another classification technique thereby forming a series of classification techniques performed in sequential order) in the claim amount to no more than mere instructions to apply the exception.  Mere instructions to apply an exception using generic computer components cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B.  
Therefore, the claim is not patent eligible, and is reasonably rejected under 35 USC §101.  


Independent claims 9 and 10 are each substantially similar to claim 1.  Therefore, these claims are likewise rejected.  

Claims 2-8 depend upon claim 1 and do not correct the issues set forth above.   Therefore, these claims are likewise rejected.  



35 USC § 112(f) claim interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

Use of the word “means” (or “step for”) in a claim with functional language creates a rebuttable presumption that the claim element is to be treated in accordance with 35 U.S.C. 112(f) (pre-AIA  35 U.S.C. 112, sixth paragraph).  The presumption that 35 U.S.C. 112(f) (pre-AIA  35 U.S.C. 112, sixth paragraph) is invoked is rebutted when the function is recited with sufficient structure, material, or acts within the claim itself to entirely perform the recited function.  

Absence of the word “means” (or “step for”) in a claim creates a rebuttable presumption that the claim element is not to be treated in accordance with 35 U.S.C. 112(f) (pre-AIA  35 U.S.C. 112, sixth paragraph).  The presumption that 35 U.S.C. 112(f) (pre-AIA  35 U.S.C. 112, sixth paragraph) is not invoked is rebutted when the claim element recites function but fails to recite sufficiently definite structure, material or acts to perform that function. 


Claim limitations directed to a “section” (as recited in claims 1-8 and 10) have been interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because they use a generic placeholder “section” coupled with functional language “configured to” [apply, determine, detect, extract, classifying, counting, obtaining, etc.] without reciting sufficient structure to achieve the function.  Furthermore, the generic placeholder is not preceded by a structural modifier.  
Since the claim limitation(s) invokes 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, claim(s) 1-8 and 10 have also been interpreted to cover the corresponding structure described in the specification that achieves the claimed function, and equivalents thereof.  

If applicant wishes to provide further explanation or dispute the examiner’s interpretation of the corresponding structure, applicant must identify the corresponding structure with reference to the specification by page and line number, and to the drawing, if any, by reference characters in response to this Office action. 
If applicant does not intend to have the claim limitation(s) treated under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may amend the claim(s) so that it/they will clearly not invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, or present a sufficient showing that the claim(s) recite(s) sufficient structure, material, or acts for performing the claimed function to preclude application of 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
For more information, see MPEP § 2173 et seq. and Supplementary Examination Guidelines for Determining Compliance With 35 U.S.C. 112 and for Treatment of Related Issues in Patent Applications, 76 FR 7162, 7167 (Feb. 9, 2011).




Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-10 are rejected under 35 U.S.C. § 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.



Regarding independent claim 1:  
First, it appears that Applicant’s claim encompasses an infinite loop, as a plurality of “classification processes” includes the same process being applied multiple times.  In such a case, a group that contains two or more documents will always contain two or more documents. 
Next, how is the “succeeding one” chosen?  Are classification processes reused?  (In which case, the implementation can ping-pong between two processes and never make use of a third, fourth, etc., process?) Must all such processes be used first?
Additionally, it is unclear what the claim limitation “the prescribed sequence is an ascending order of an amount of calculation involved in the plurality of classification processes” means.  What is “an ascending order” in the context of this claim?  What is an “amount of calculation”?  How is this “amount” determined?  What does “involved” mean, and how does it fit into the meaning of this limitation (i.e., involved with respect to each process, versus the total number of processes)?
Therefore, the scope of each claim is ambiguous.
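The first concern above can be illustrated with a minimal sketch.  It assumes, purely for illustration, that a classification process is a deterministic function mapping a document to a group label and that the same process is reused as the “succeeding one”; the names split_group and apply_repeatedly, and the max_rounds cap, are hypothetical and do not appear in the application.

```python
from collections import defaultdict


def split_group(group, key_fn):
    """Split one group into sub-groups by the label key_fn assigns."""
    buckets = defaultdict(list)
    for doc in group:
        buckets[key_fn(doc)].append(doc)
    return list(buckets.values())


def apply_repeatedly(group, key_fn, max_rounds=5):
    """Re-apply the same process to every group of two or more documents.

    Returns (groups, rounds_used). The max_rounds cap is an assumption
    added so that the sketch always terminates.
    """
    groups = [list(group)]
    for round_no in range(1, max_rounds + 1):
        refined = []
        for g in groups:
            refined.extend(split_group(g, key_fn) if len(g) >= 2 else [g])
        if refined == groups:
            # Fixed point: another pass of the same process changes nothing.
            return groups, round_no
        groups = refined
    return groups, max_rounds
```

With the documents "aa", "ab" and "b" and document length as the reused process, the second pass leaves the group containing "aa" and "ab" unchanged.  A loop that continued until no group held two or more documents would therefore never end under this reading, which is why the sketch stops at a fixed point or at the assumed cap.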

Claims 9 and 10 are substantially similar to claim 1.  Therefore, these claims are likewise rejected.  

Claims 2-8 depend upon claim 1 and do not correct the issues set forth above.  Therefore, these claims are likewise rejected.  




Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  Relevance is provided in at least the Abstract of each cited document.

Non-Patent Literature
Qiu, Junping, et al., “Detection and Optimized Disposal of Near-Duplicate Pages”, ICFCC 2010, Wuhan, China, May 21-24, 2010, pp. V2-604 - V2-607.
Survey of near-duplicate detection algorithms such as page-structure similarity, super-link similarity, clustering, excluding by URLs, and feature codes (page V2-604, last paragraph); Two types of current text detection algorithms:  syntax [based on Shingles] and semantic [based on terms] (page V2-605, 1st full paragraph); No creation of a classification processing hierarchy, and subsequent application of a different classification process to detect the next level of near duplicate documents.

Liao, Fei, et al., “Combined Self-Attention Mechanism for Chinese Named Entity Recognition in Military”, Future Internet 2019, Vol. 11, August 2019, pp. 1-11.
Input sentences are converted into a sequence of character vectors (page 3, Figure 1); In the Chinese language there is no natural separator between words, resulting in ambiguous word boundaries (page 3, section 3.1 Embedding Layer); Mapping of input sentences to a vector sequence so that a neural network model can process the raw textual data, use of Word2vec to pre-train on a large-scale unlabeled Chinese corpus and create a character vector dictionary (page 4, 1st paragraph); No creation of a classification processing hierarchy, and subsequent application of a different classification process to detect the next level of near duplicate documents.




US Patent Application Publications
Cooke 	 				2015/0033120
 Detection of common content in documents, for example but not limited to, the detection of near-duplication and/or plagiarism or leaked information, managing documents by grouping into collections containing overlapping content and e-mail management (para 0001); In one aspect the present invention produces a binary stream representation of a text which enables minimal encoding of a text using one binary digit per meaningful character string, typically being a word. In one minimal binary representation, one digit is used for main content words and one digit is used for auxiliary content words, and binary pattern matching approaches then become readily applicable to entire documents or segments of documents of specified sizes (para 0009); In addition, the use of binary encoding has been found to be sufficiently lossy as to render the content of the original document practically unidentifiable from the encoded document owing to the very many variants of auxiliary character strings and main character strings that can be substituted for the binary digits in the encoded document (para 0015);  This 21-bit pattern would be readily reversible with prior knowledge of the key words and their order, but might also indicate other phrases. By applying or sliding this window against each document to be matched at the defined step size, an index is created that uses the contents of the window as an address to a linked list which contains metadata identifying the document source and the location within the document, and also metadata identifying windows from other, already processed documents against which the document may be compared (para 0121);  Creates a sliding window pattern to compare against “windows” in other documents in order to detect a duplicate document.  No creation of a classification processing hierarchy, and subsequent application of a different classification process to detect the next level of near duplicate documents.

Chitiveli 	 				2010/0306204
Systems, methods and articles of manufacture are disclosed for detecting a duplicate document. A plurality of documents may be assigned to categories, each category corresponding to a collection of duplicates, or near duplicate documents. A new document may be received. The new document may be evaluated against each category to determine a similarity score between the new document and each category. The new document may be identified as a duplicate based on the similarity scores and thresholds for each category. An action may then be performed on the duplicate based on duplication rules  (Abstract); Iterative processing of category / duplicate document (Fig. 6); If the duplicate detector 150 determines that a score 164 exceeds a threshold 158 for a near duplicate document, the duplicate detector 150 marks the new document 162 as a near duplicate document and transfers control to the duplication rules engine. Otherwise, the duplicate detector 150 proceeds to step 618, where the duplicate detector 150 determines whether any of the scores 164 are above a threshold 158 for similarity. If so, the duplicate detector 150 merely marks the new document 162 as a similar document (para 0050); No creation of a classification processing hierarchy, and subsequent application of a different classification process to detect the next level of near duplicate documents.

Yih 	 				2011/0219012
A technology for measuring the similarity between two objects (e.g., documents), via a framework that learns the term-weighting function from training data, e.g., labeled pairs of objects, to develop a learned model. A learning procedure tunes the model parameters by minimizing a defined loss function of the similarity score. Also described is using the learning procedure and learned model to detect near duplicate documents (Abstract); No creation of a classification processing hierarchy, and subsequent application of a different classification process to detect the next level of near duplicate documents.


US Patents
Acharya 					8,549,014
According to a further implementation, search engine 125 may generate a similarity hash (which may be used to detect near-duplication of a document) for the document and monitor it for changes. A change in a similarity hash may be considered to indicate a relatively large change in its associated document. In other implementations, yet other techniques may be used to monitor documents for changes. In situations where adequate data storage resources exist, the full documents may be stored and used to determine changes rather than some representation of the documents (col. 8 lines 34-44); No creation of a classification processing hierarchy, and subsequent application of a different classification process to detect the next level of near duplicate documents. 


Barsony 					10,467,252
System for characterizing and defining groups within large corpuses of documents using a combination of one or more of human judgment, tiered similarity analysis techniques, and language/concept analysis (Abstract); “Classification” of a document as a near-duplicate (col. 8 lines 37-40); No creation of a classification processing hierarchy, and subsequent application of a different classification process to detect the next level of near duplicate documents.



Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Robert Stevens, whose telephone number is (571) 272-4102.  The examiner can normally be reached on M-F 6:00 – 2:30.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ashish Thomas can be reached on (571) 272-0631.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ROBERT STEVENS/Primary Examiner, Art Unit 2164




May 18, 2022