DETAILED ACTION
	This final rejection is responsive to communication filed November 23, 2021.  Claims 1, 3-14, 17, and 18 are currently amended.  Claims 2, 16, and 19 are canceled.  Claims 1, 3-15, 17, 18, and 20 are pending in this application. 

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-7, 10-13, 14, 17, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ngo (US 2020/0358621 A1) in view of Chisa et al. (US 2014/0025712 A1) (‘Chisa’), and further in view of Henzinger (US 2008/0044016 A1).

With respect to claims 1, 13 and 18, Ngo teaches a system to process a set of files to determine whether a content of a target file is related to content of one or more source files, the system comprising: 

a memory (paragraph 60) including instructions that, when executed by the at least one processor, cause the at least one processor to perform operations to:
detect an operation by the first computing system to create or modify the target file (paragraph 70);
obtain a set of file references (file name or pathway) that are configured to access files stored on a data storage system associated with a first computing system, the set of file references comprising the reference to the target file and references to one or more source files (paragraphs 71 and 337); 
retrieve the target file and the one or more source files using the set of file references (paragraphs 72 and 337); 
partition the target file into a first set of tokens (partition target file into blocks and determine signature for blocks) (paragraph 292); 
partition the one or more source files into a second set of tokens (other files are partitioned into data blocks/signatures) (paragraph 294); 
identify, based on the first set of tokens and the second set of tokens, at least one source file of the one or more source files that contain a threshold quantity of tokens of the target file (detecting files having threshold number of signatures in common with target file) (paragraphs 294-296 and 338-339); and 
provide the at least one source file to a second computing system (logging relationship between files or emailing/texting due to similarity of files) (paragraphs 298 and 376).
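For illustration only, the partitioning and threshold-comparison limitations mapped above can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Ngo or from the application: the function names, the fixed 8-byte block size, and the use of SHA-256 as the block signature are all hypothetical choices made for the example.

```python
import hashlib
from typing import Dict, List, Set


def block_signatures(data: bytes, block_size: int = 8) -> Set[str]:
    """Split data into fixed-size blocks and return the set of block signatures.

    The block size and SHA-256 signature are assumptions for illustration.
    """
    return {
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    }


def related_sources(
    target: bytes,
    sources: Dict[str, bytes],
    threshold: int,
    block_size: int = 8,
) -> List[str]:
    """Return references of source files sharing at least `threshold` signatures with the target."""
    target_sigs = block_signatures(target, block_size)
    related = []
    for ref, content in sources.items():
        shared = target_sigs & block_signatures(content, block_size)
        if len(shared) >= threshold:
            related.append(ref)
    return related
```

In this sketch a source file is reported only when the count of signatures it has in common with the target meets the threshold, which corresponds to the "threshold quantity of tokens" language of the claim.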

Chisa teaches detect an operation by the first computing system to create or modify the target file (Fig. 3; paragraphs 41 and 63);
 identify the one or more source files based on files that were at least partially loaded in a memory of the first computing system within a window of time that is determined based on the detected operation to modify the target file (identifying recently used files) (paragraph 41); and 
generate a data structure that associates a reference to the target file with the references to the one or more source files (adding file to recently used file list) (paragraphs 32 and 41).
It would have been obvious to a person having ordinary skill in the art prior to the filing date of the invention to have modified Ngo to identify the source files as recent files and associate them with a target file as taught by Chisa because files used around the same time are likely to be related and Chisa enables determination of local and remote recently used files (Chisa, abstract).  Further, Ngo tracks creation time and last access and modification times for files.
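For illustration only, the window-of-time and data-structure limitations discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Chisa's recently used file list: the `FileAssociation` record, the `associate_recent_sources` function, the access-log format, and the symmetric window around the modification time are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FileAssociation:
    """Associates a reference to a target file with references to source files."""
    target: str
    sources: List[str] = field(default_factory=list)


def associate_recent_sources(
    target: str,
    modified_at: float,
    access_log: List[Tuple[str, float]],
    window_seconds: float,
) -> FileAssociation:
    """Select files accessed within `window_seconds` of the detected modification.

    `access_log` is an assumed list of (file reference, access timestamp) pairs.
    """
    sources: List[str] = []
    for ref, accessed_at in access_log:
        if ref == target:
            continue  # the target is not its own source
        if abs(accessed_at - modified_at) <= window_seconds and ref not in sources:
            sources.append(ref)
    return FileAssociation(target=target, sources=sources)
```

The window here covers time both before and after the detected operation; a one-sided window would replace the absolute difference with a signed comparison.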


Henzinger teaches remove tokens identified as boilerplate tokens from the first set of tokens and the second set of tokens (paragraph 95) and identifying similar source files based on tokens remaining after removing boilerplate tokens (similarity is determined after removing boilerplate text) (paragraph 95).
It would have been obvious to a person having ordinary skill in the art prior to the filing date of the invention to have further modified Ngo to remove boilerplate text before determining related documents because removing content (such as boilerplate text) from files to reduce the amount of noise would help document similarity algorithms to perform better and more accurately assess similarity of data (Henzinger, paragraph 95).
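For illustration only, the effect of removing boilerplate tokens before comparison can be sketched as follows. This is a minimal sketch under stated assumptions and is not Henzinger's method: the function names are hypothetical, and the boilerplate set is assumed to be supplied by the caller rather than detected automatically.

```python
from typing import Iterable, List, Set


def strip_boilerplate(tokens: Iterable[str], boilerplate: Set[str]) -> List[str]:
    """Drop every token that appears in the supplied boilerplate set."""
    return [t for t in tokens if t not in boilerplate]


def shared_token_count(
    first: Iterable[str],
    second: Iterable[str],
    boilerplate: Set[str],
) -> int:
    """Count distinct tokens common to both inputs after boilerplate removal."""
    a = set(strip_boilerplate(first, boilerplate))
    b = set(strip_boilerplate(second, boilerplate))
    return len(a & b)
```

Two otherwise unrelated files that share only a confidentiality footer would show overlap when compared on raw tokens and none once the footer tokens are removed, which is the noise reduction the rationale above relies on.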

With respect to claims 3 and 17, Ngo in view of Chisa and Henzinger teaches wherein instructions to detect the operation to create or modify the target file include instructions to detect an operation to open a file for writing or an operation to create a new file (Chisa, paragraph 41).

With respect to claim 4, Ngo in view of Chisa and Henzinger teaches wherein the instructions to detect the operation to modify the target file include instructions to select the 

With respect to claim 5, Ngo in view of Chisa and Henzinger teaches wherein the window of time comprises at least one of: a time period prior to execution of the operation to modify a target file; or a time period after execution of the operation to modify a target file (Chisa, paragraphs 54-55 and 66).

With respect to claim 6, Ngo in view of Chisa and Henzinger teaches the memory further comprising instructions that, when executed by the at least one processor, cause the at least one processor to perform operations to store file references of files that are at least partially loaded in memory of the first computing system during the window of time (Ngo, paragraphs 71 and 337; Chisa, paragraphs 32 and 41).

With respect to claim 7, Ngo in view of Chisa and Henzinger teaches wherein the instructions to retrieve the target file and the one or more source files using the set of file references include instructions to retrieve the target file or the one or more source files from a data repository that is configured to store files of the first computing system (Ngo, paragraphs 62 and 71).


partition the target file or the one or more source files into textual tokens (Ngo, paragraph 293); and 
generate hash code tokens using the textual tokens (Ngo, paragraph 293).

With respect to claim 11, Ngo in view of Chisa and Henzinger teaches wherein the instructions to identify at least one source file of the one or more source files that contain a threshold quantity of tokens of the target file include instructions to determine a quantity of tokens from the first set of tokens that are included in the second set of tokens (Ngo, paragraphs 295-296).

With respect to claim 12, Ngo in view of Chisa and Henzinger teaches wherein the instructions to provide the at least one source file to a second computing system include instructions to generate a data structure that comprises a reference to the at least one source file and a statistic that is indicative of a quantity of data of the target file that is stored in the at least one source file (logging relationship between files) (Ngo, paragraphs 298 and 376).
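For illustration only, a data structure of the kind recited in claim 12 can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Ngo's logging: the `SourceMatch` record, the `build_match_record` function, and the choice of "fraction of target tokens found in the source" as the statistic are hypothetical.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class SourceMatch:
    """A reference to a source file plus a statistic describing the overlap."""
    source_ref: str
    shared_tokens: int
    fraction_of_target: float


def build_match_record(
    source_ref: str,
    target_tokens: Set[str],
    source_tokens: Set[str],
) -> SourceMatch:
    """Record how much of the target's token set also appears in the source."""
    shared = len(target_tokens & source_tokens)
    fraction = shared / len(target_tokens) if target_tokens else 0.0
    return SourceMatch(source_ref, shared, fraction)
```

Other statistics, such as a raw byte count, would equally be "indicative of a quantity of data"; the fraction is used here only because it is easy to read.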

Claims 1, 3, 5, 6-11, 13-15, 17, 18, 20 are rejected under 35 U.S.C. 103 as being unpatentable over Knight et al. (US 2014/0082006 A1) (‘Knight’) in view of Chisa et al. (US 2014/0025712 A1) (‘Chisa’), and further in view of Henzinger (US 2008/0044016 A1).

With respect to claims 1, 13 and 18, Knight teaches a system to process a set of files to determine whether a content of a target file is related to content of one or more source files, the system comprising: 
at least one processor (paragraph 27); and
a memory (paragraph 27) including instructions that, when executed by the at least one processor, cause the at least one processor to perform operations to:
obtain a set of file references (file or document name) that are configured to access files stored on a data storage system associated with a first computing system, the set of file references comprising a reference to the target file and references to one or more source files (paragraphs 28 and 31); 
retrieve the target file and the one or more source files using the set of file references (paragraph 28); 
partition the target file into a first set of tokens (tokenize documents and create hash tokens) (paragraph 28); 
partition the one or more source files into a second set of tokens (tokenize documents and create hash tokens) (paragraph 28); 
remove tokens identified as stop words from the first set of tokens and the second set of tokens (paragraph 44);
identify, based on remaining tokens of the first set of tokens and the second set of tokens, at least one source file of the one or more source files that contain a threshold quantity of tokens of the target file (detecting documents with at least two matching segments) (paragraphs 7, 34, 39, and 43); and 
provide the at least one source file to a second computing system (paragraphs 7 and 28).
Knight does not explicitly teach detect an operation by the first computing system to create or modify the target file; identify the one or more source files based on files that were at least partially loaded in a memory of the first computing system within a window of time that is determined based on the detected operation to modify the target file; or generate a data structure that associates a reference to the target file with references to the one or more source files.
Chisa teaches detect an operation by the first computing system to create or modify the target file (Fig. 3; paragraphs 41 and 63);
 identify the one or more source files based on files that were at least partially loaded in a memory of the first computing system within a window of time that is determined based on the detected operation to modify the target file (identifying recently used files) (paragraph 41); and 
generate a data structure that associates a reference to the target file with the references to the one or more source files (adding file to recently used file list) (paragraphs 32 and 41).
It would have been obvious to a person having ordinary skill in the art prior to the filing date of the invention to have modified Knight to identify the source files as recent files and associate them with a target file as taught by Chisa because files used around the same time 

Further regarding claims 1, 13 and 18, although Knight teaches removing stop words and identifying similar files based on remaining tokens, Knight in view of Chisa does not explicitly teach remove tokens identified as boilerplate tokens from the first set of tokens and the second set of tokens.
Henzinger teaches remove tokens identified as boilerplate tokens from the first set of tokens and the second set of tokens (paragraph 95) and identifying similar source files based on tokens remaining after removing boilerplate tokens (similarity is determined after removing boilerplate text) (paragraph 95).
It would have been obvious to a person having ordinary skill in the art prior to the filing date of the invention to have further modified Knight to remove boilerplate text before determining related documents because removing content (such as boilerplate text) from files to reduce the amount of noise would help document similarity algorithms to perform better and more accurately assess similarity of data (Henzinger, paragraph 95).

With respect to claims 3 and 17, Knight in view of Chisa and Henzinger teaches wherein instructions to detect the operation to create or modify the target file include instructions to detect an operation to open a file for writing or an operation to create a new file (Chisa, paragraph 41).



With respect to claim 6, Knight in view of Chisa and Henzinger teaches the memory further comprising instructions that, when executed by the at least one processor, cause the at least one processor to perform operations to store file references of files that are at least partially loaded in memory of the first computing system during the window of time (Chisa, paragraphs 32 and 41).

With respect to claim 7, Knight in view of Chisa and Henzinger teaches wherein the instructions to retrieve the target file and the one or more source files using the set of file references include instructions to retrieve the target file or the one or more source files from a data repository that is configured to store files of the first computing system (Knight, paragraphs 28 and 24).

With respect to claim 8, Knight in view of Chisa and Henzinger teaches wherein the instructions to partition the target file into a first set of tokens or to partition the one or more source files into a second set of tokens include instructions to: 
partition at least one of the target file or the one or more source files based on a syntax (sentences, paragraphs) of the content of the target file (Knight, paragraph 28); and 


With respect to claims 9 and 15, Knight in view of Chisa and Henzinger teaches wherein the instructions to partition the target file into a first set of tokens or to partition the one or more source files into a second set of tokens include instructions to generate tokens based on sentences, phrases, paragraphs, or other logical groupings of textual content in the target file or the one or more source files (Knight, paragraph 28).

With respect to claims 10, 14, and 20, Knight in view of Chisa and Henzinger teaches wherein the instructions to partition the target file into a first set of tokens or to partition the one or more source files into a second set of tokens include instructions to: 
partition the target file or the one or more source files into textual tokens (Knight, paragraph 28); and 
generate hash code tokens using the textual tokens (Knight, paragraphs 22 and 28).
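For illustration only, the two-step limitation of partitioning into textual tokens and then generating hash code tokens can be sketched as follows. This is a minimal sketch under stated assumptions and is not Knight's tokenizer: the function names, the sentence-boundary regular expression, the case and whitespace normalization, and the use of SHA-1 are hypothetical choices.

```python
import hashlib
import re
from typing import List


def textual_tokens(text: str) -> List[str]:
    """Split text into sentence-like tokens at ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def hash_code_tokens(tokens: List[str]) -> List[str]:
    """Map each textual token to a hash code after normalizing case and spacing."""
    hashed = []
    for token in tokens:
        normalized = " ".join(token.lower().split())
        hashed.append(hashlib.sha1(normalized.encode("utf-8")).hexdigest())
    return hashed
```

Comparing hash codes rather than raw sentences lets two token sets be intersected cheaply, and the normalization step means that differences in capitalization or spacing alone do not produce distinct tokens.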

With respect to claim 11, Knight in view of Chisa and Henzinger teaches wherein the instructions to identify at least one source file of the one or more source files that contain a threshold quantity of tokens of the target file include instructions to determine a quantity of tokens from the first set of tokens that are included in the second set of tokens (Knight, paragraphs 32 and 39).

Response to Arguments
Applicant's arguments with respect to the limitation “generate a data structure that associates a reference to the target file with references to the one or more source files” filed November 23, 2021 have been fully considered but they are not persuasive.  Applicant argues that Chisa does not associate a reference to the target file with references to the one or more source files.  The examiner disagrees.  Chisa teaches generating a most recently used list that associates a reference to the target file (i.e., name, URL, or location of opened file) with references to the one or more source files (i.e., names, URLs, or locations of files already stored in the most recently used list) (adding file to recently used file list) (paragraphs 32 and 41).  The claims do not specify what the association is.  Further, a reference to a file may be a name of a file, a URL, or even the location of the file.  Chisa teaches that a name of a file, a URL, and/or the location of the file may be listed in the most recently used list (paragraphs 7, 29 and 61), and thus adding the newly opened file (target file) to the list of files already stored (source files) on the most recently used list constitutes generating a data structure that associates a reference to the target file with references to the one or more source files.
Applicant’s other arguments with respect to claims 1, 3-15, 17, 18, and 20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALICIA M WILLOUGHBY whose telephone number is (571)272-5599.  The examiner can normally be reached on 9-5:30, EST, M-F.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Robert Beausoliel, can be reached on 571-272-3645.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/ALICIA M WILLOUGHBY/
Primary Examiner, Art Unit 2167
February 2, 2022