Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continuity/Reexam Information for 16/805534

Parent Data: 16805534, filed 02/28/2020. Claims priority from Provisional Application 62811723, filed 02/28/2019. Claims priority from Provisional Application 62828316, filed 04/02/2019.

1.	Claims presented for examination: 1-20

Claim Objections
2.	Claim 14 is objected to because of the following informalities: the claim limitation in lines 15-16 duplicates the claim language in lines 13-14.  Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


3.	Claim 1 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being incomplete for omitting essential steps, such omission amounting to a gap between the steps.  See MPEP § 2172.01.  The omitted step is: the step for generating the overlapping score has not been claimed.
4.	Claim 1 recites the limitation "determine a document linkage for the collection of document based on the score…" in line 26.  There is insufficient antecedent basis for this limitation in the claim.

Allowable Subject Matter
5.	Claims 1-14 will be allowed when applicant(s) overcome the claim rejections as set forth in the Office Action.
	Claims 14-20 will be allowed when applicant(s) overcome the claim objection as set forth in the Office Action.
The following is a statement of reasons for the indication of allowable subject matter:
As to claim 1, the cited reference(s) Conrad, Short and/or Patinkin, alone or in combination, fail to disclose or suggest “receive electronic data representing a collection of documents; identify multiple documents in the collection that have a group document property, the group document property including content of the document; group the identified documents based each having the group document property that includes content of the document; generate a signature for each of the documents in the group, each signature representative of content in each respective document; compare the signature for each of the documents in the group to determine a percentage of overlap of content between each of the documents in the group; generate a score for each comparison that correlates with a relative value for the percentage of overlap of content between each of the documents; identify comparisons for which the score is above a predetermined threshold value; cluster the documents associated with the scores that are above the predetermined threshold value into a first cluster; generate a first document lineage for the first cluster, the first document lineage representing a relationship between each of the documents in the first cluster based on relative scores; for the documents in the collection of documents that do not have the group document property, compare a difference in filenames between the documents to determine a percentage of overlap score between each of the

	As to claim 14, the cited reference(s) Conrad, Short and/or Patinkin, alone or in combination, fail to disclose or suggest “group multiple text-based document files based on a group document property of each of the documents; using a MinHashing technique, generate a file signature unique to each of the text-based document files, the file signature representing a user-readable content of the document; using a Jaccard similarity technique, compare the file signatures for each of the document files to identify a percentage of overlap of content between multiples of the document files; generate a score for each comparison based on the percentage of overlap of content between multiples of the text-based document files; determine the score for two or more of the text-based documents is above a predetermined threshold; determine that the score for two or more of the text-based documents is above a predetermined threshold; generate a document lineage for the multiple text-based document files based on the score for the multiple text-based documents that are above the predetermined threshold; group non-text-based documents; generate a score for each of the non-text-based documents based on the Levenshtein distance between filenames of each of the non-text-based documents; determine that the score for the non-text-based documents is above a threshold for non-text-based documents; and generate a document lineage for the multiple non-text-based document files based on the score for the multiple non-text-based documents that are above the predetermined .
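
For illustration only, the scoring techniques named in the claim 14 limitations above (MinHashing, Jaccard similarity, and Levenshtein distance) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the applicant's implementation or a construction of the claims: the function names, the 3-word shingle size, the 128 hash functions, and the normalization of Levenshtein distance into a 0-to-1 score are all hypothetical choices that appear nowhere in the application. The lineage-generation step is not shown.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles in text (k=3 is an assumed choice)."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _seeded_hash(seed, item):
    """Deterministic 64-bit hash of item, varied by seed."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=128):
    """MinHash signature: the minimum hash of the set under each seeded hash.

    An empty set yields an all-zero signature (an assumed convention).
    """
    return [
        min((_seeded_hash(seed, s) for s in items), default=0)
        for seed in range(num_hashes)
    ]


def estimated_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching positions."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


def levenshtein(a, b):
    """Edit distance between strings a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def filename_score(name_a, name_b):
    """Assumed normalization: 1.0 for identical names, 0.0 for fully different."""
    longest = max(len(name_a), len(name_b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(name_a, name_b) / longest


def above_threshold(scores, threshold):
    """Keep only the (pair, score) entries whose score exceeds the threshold."""
    return {pair: s for pair, s in scores.items() if s > threshold}
```

In this sketch, text-based files would be scored by comparing their MinHash signatures with `estimated_jaccard`, and non-text-based files by `filename_score`; the pairs retained by `above_threshold` would then feed whatever lineage step follows.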

Conclusion
6.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to BAOQUOC N. TO, whose telephone number is (571) 272-4041.  The examiner can normally be reached on Mon-Sat 9 - 10:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hosain S Alams, can be reached at 571-272-3978.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


BAOQUOC N. TO
Examiner
Art Unit 2154