DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 7, 8, 14, 15, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Alspector et al. (US 2010/0191819), hereafter “Alspector,” in view of Costea et al. (US 2021/0136089), hereafter “Costea.”
Regarding claim 1, Alspector teaches a computer-implemented near-duplicate document detection method (Alspector: par 0026), the method comprising: 	receiving a message having message content (Alspector: 620 of FIG. 6; par 0049); 	determining a message fingerprint based on at least part of the message content (Alspector: par 0049 […the e-mail grouper may determine the fingerprint of the received e-mail and determine whether the fingerprint matches any of the spam fingerprints saved as spam signatures.]);	determining whether the message is a near duplicate of another message by matching the message fingerprint to at least one fingerprint in a cluster of other messages (Alspector: par 0049 […e-mail grouper 320 analyzes the e-mail using the spam signatures to determine if the e-mail belongs to any of the spam clusters (i.e., is a substantially similar to the e-mails in a cluster determined to be spam) (630).]); and 	if the message fingerprint matches at least one message in the cluster of other messages, adding [the message] to the cluster of other messages (Alspector: 740 of FIG. 7; par 0052).	Alspector does not explicitly teach: 	adding an identifier for the message and the message fingerprint to the cluster of other messages. 	Costea teaches: 	adding an identifier for the message and the message fingerprint to a cluster of other messages (Costea: par 0047, 0048, 0057 [If the one or more fingerprints are identified inside of an existing centroid, then the fingerprint will be grouped into that centroids cluster.]). 	It would have been obvious to one of ordinary skill in the art to employ the clustering techniques of Costea within the Alspector system with predictable results. One would be motivated to make the combination to provide the benefit of Costea’s risk score evaluation and advanced classification techniques to improve the effectiveness of the Alspector system. 
One would further be motivated to make the combination because both Costea and Alspector disclose systems for grouping/clustering related email messages. Accordingly, implementing Costea’s methods of clustering and classification within the Alspector system would have amounted to simple substitution of one known element for another with predictable results. Further, in view of this substantial similarity, it would have been readily apparent to one of ordinary skill that beneficial features of Costea, such as grouping a fingerprint into the cluster of an existing centroid (Costea: par 0057), could have been implemented within the Alspector system with predictable results.
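For illustration only, the sequence of steps mapped above for claim 1 (receive a message, determine a fingerprint, match it against a cluster, and add the message identifier and fingerprint on a match) can be sketched as follows. The sketch is not taken from Alspector or Costea. The SHA-256 fingerprint over normalized text, the `Cluster` class, and all function names are hypothetical assumptions, and an exact match on a normalized hash only catches messages that differ in case or whitespace.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Hypothetical fingerprint: SHA-256 of case-folded, whitespace-collapsed text."""
    normalized = " ".join(content.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class Cluster:
    """Assumed cluster structure: fingerprint -> identifiers of messages sharing it."""

    def __init__(self):
        self.members = {}

    def matches(self, fp: str) -> bool:
        # A message "matches" when its fingerprint is already in the cluster.
        return fp in self.members

    def add(self, message_id: str, fp: str) -> None:
        # Store both the message identifier and the message fingerprint.
        self.members.setdefault(fp, []).append(message_id)


def process_message(message_id: str, content: str, cluster: Cluster) -> bool:
    """Add the message to the cluster only if its fingerprint matches."""
    fp = fingerprint(content)
    if cluster.matches(fp):
        cluster.add(message_id, fp)
        return True
    return False


# Usage: seed a cluster with one message, then process two more.
cluster = Cluster()
cluster.add("msg-1", fingerprint("Buy cheap watches   NOW"))
added = process_message("msg-2", "buy CHEAP watches now", cluster)
rejected = process_message("msg-3", "Meeting moved to 3pm", cluster)
```

In this sketch `added` is true because the second message normalizes to the same text as the first, while the unrelated third message leaves the cluster unchanged.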

Regarding claim 7, the method of Claim 1, wherein: 	the step of determining a message fingerprint based on at least part of the message content comprises mathematically generating a fingerprint corresponding to the part of the message content using a fingerprinting algorithm (Costea: par 0009, 0036); and 	the step of determining whether the message is near duplicate of another message by matching the message fingerprint to at least one fingerprint in a cluster of other messages comprises determining whether the message fingerprint is within a predetermined distance metric of at least one fingerprint in a cluster of other messages (Costea: par 0036). 
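As a non-authoritative illustration of matching within a predetermined distance metric, the sketch below uses a SimHash-style fingerprint compared by Hamming distance. This is a swapped-in, commonly known technique chosen for the example; it is an assumption and is not asserted to be the algorithm of Costea or of the application. The function names and the default threshold of 3 bits are hypothetical.

```python
import hashlib


def simhash(text: str, bits: int = 64) -> int:
    """SimHash-style fingerprint: similar token sets tend to yield nearby bit patterns."""
    counts = [0] * bits
    for token in text.lower().split():
        # Take the first 8 bytes of an MD5 digest as a 64-bit token hash.
        h = int.from_bytes(hashlib.md5(token.encode("utf-8")).digest()[:8], "big")
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    # Each output bit is set when the per-bit vote across tokens is positive.
    return sum(1 << i for i in range(bits) if counts[i] > 0)


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def within_distance(fp: int, cluster_fps, max_distance: int = 3) -> bool:
    """True if fp is within max_distance bits of at least one cluster fingerprint."""
    return any(hamming(fp, other) <= max_distance for other in cluster_fps)
```

A threshold of zero reduces this to exact matching; larger thresholds admit more distant fingerprints at the cost of more false matches.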

Regarding claim 8, a system for near-duplicate detection, the system comprising:	one or more processors (Alspector: par 0094, 0095); and 	one or more memory devices in communication with the one or more processors, the memory devices having computer-readable instructions stored thereupon that, when executed by the processors, cause the processors to perform a method for near-duplicate detection (Alspector: par 0094, 0095), the method comprising: 	receiving a message having message content (Alspector: 620 of FIG. 6; par 0049); 	determining a message fingerprint based on at least part of the message content (Alspector: par 0049 […the e-mail grouper may determine the fingerprint of the received e-mail and determine whether the fingerprint matches any of the spam fingerprints saved as spam signatures.]);	determining whether the message is a near duplicate of another message by matching the message fingerprint to at least one fingerprint in a cluster of other messages (Alspector: par 0049 […e-mail grouper 320 analyzes the e-mail using the spam signatures to determine if the e-mail belongs to any of the spam clusters (i.e., is a substantially similar to the e-mails in a cluster determined to be spam) (630).]); and 	if the message fingerprint matches at least one message in the cluster of other messages, adding an identifier for the message and the message fingerprint to the cluster of other messages (Alspector: 740 of FIG. 7; par 0052; Costea: par 0047, 0048, 0057 [If the one or more fingerprints are identified inside of an existing centroid, then the fingerprint will be grouped into that centroids cluster.]). 

Regarding claim 14, the near-duplicate detection system of Claim 8, wherein:	the step of determining a message fingerprint based on at least part of the message content comprises mathematically generating a fingerprint corresponding to the part of the message content using a fingerprinting algorithm (Costea: par 0009, 0036); and 	the step of determining whether the message is near duplicate of another message by matching the message fingerprint to at least one fingerprint in a cluster of other messages comprises determining whether the message fingerprint is within a predetermined distance metric of at least one fingerprint in a cluster of other messages (Costea: par 0036). 

Regarding claim 15, one or more computer storage media having computer executable instructions stored thereon which, when executed by one or more processors, cause the processors to execute a near-duplicate detection method (Alspector: par 0094, 0095), the method comprising: 	receiving a message having message content (Alspector: 620 of FIG. 6; par 0049);	determining a message fingerprint based on at least part of the message content (Alspector: par 0049 […the e-mail grouper may determine the fingerprint of the received e-mail and determine whether the fingerprint matches any of the spam fingerprints saved as spam signatures.]);	determining whether the message is a near duplicate of another message by matching the message fingerprint to at least one fingerprint in a cluster of other messages (Alspector: par 0049 […e-mail grouper 320 analyzes the e-mail using the spam signatures to determine if the e-mail belongs to any of the spam clusters (i.e., is a substantially similar to the e-mails in a cluster determined to be spam) (630).]); and 	if the message fingerprint matches at least one message in the cluster of other messages, adding an identifier for the message and the message fingerprint to the cluster of other messages (Alspector: 740 of FIG. 7; par 0052; Costea: par 0047, 0048, 0057 [If the one or more fingerprints are identified inside of an existing centroid, then the fingerprint will be grouped into that centroids cluster.]). 

Regarding claim 20, the computer storage media of Claim 15, wherein:	the step of determining a message fingerprint based on at least part of the message content comprises mathematically generating a fingerprint corresponding to the part of the message content using a fingerprinting algorithm (Costea: par 0009, 0036); and	the step of determining whether the message is near duplicate of another message by matching the message fingerprint to at least one fingerprint in a cluster of other messages comprises determining whether the message fingerprint is within a predetermined distance metric of at least one fingerprint in a cluster of other messages (Costea: par 0036). 

Claims 2-6, 9-13, and 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Alspector et al. (US 2010/0191819), in view of Costea et al. (US 2021/0136089), and further in view of Benea et al. (US 2008/0289047), hereafter “Benea.”
Regarding claim 2, Alspector-Costea teaches the near-duplicate detection method of Claim 1, the method including: 	determining a risk level for the cluster of other messages (Costea: par 0042); and 	if the risk level for the cluster is greater than a risk threshold, [taking an action] (Costea: par 0042). 	Alspector-Costea does not explicitly teach: 	if the risk level for a cluster is greater than a risk threshold, adding the fingerprints of the cluster of other messages to a risk list. 	Benea teaches: 	if a risk level is greater than a risk threshold, adding fingerprints to a risk list (Benea: par 0046, 0049).	It would have been obvious to one of ordinary skill to store the high risk score fingerprint clusters of Alspector-Costea in a list according to the technique of Benea with predictable results. One would be motivated to make the combination in order to provide the benefit of expedited and convenient visualization of risky clusters to users of the system. A high likelihood of success is anticipated given that the clusters of Alspector-Costea are already stored in a list, and as such it would have been apparent that clusters with a high risk score could likewise be readily stored in a list (Costea: par 0057). Further, in view of the substantial similarity of Benea to Alspector-Costea it would have been readily apparent to one of ordinary skill that various beneficial features of Benea could have been implemented within the Alspector-Costea system with predictable results and a beneficial effect. 

Regarding claim 3, the method of Claim 2, the method including: 	receiving an inquiry message with inquiry message content (Benea: par 0023, 0024); 	determining an inquiry message fingerprint based on at least part of the inquiry message content (Benea: par 0023, 0024);	searching the risk list for a fingerprint matching the inquiry message fingerprint (Benea: par 0023, 0024); and 	if the fingerprint matching the inquiry message is found on the risk list, generating at least one of an alert, a notification, and a blocking message (Benea: par 0023, 0024). 
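The risk-list limitations of claims 2 and 3 can be sketched, for illustration only, as a set of fingerprints that is populated when a cluster's risk level exceeds a threshold and is later searched with the fingerprint of an inquiry message. The set-based risk list, the exact-match SHA-256 fingerprint, the returned "alert" string, and all names are assumptions made for this example and do not describe Benea or Costea.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Hypothetical fingerprint: SHA-256 of case-folded, whitespace-collapsed text."""
    normalized = " ".join(content.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def update_risk_list(cluster_fps, risk_level, risk_threshold, risk_list):
    """Add every fingerprint of the cluster to the risk list if the cluster is risky."""
    if risk_level > risk_threshold:
        risk_list.update(cluster_fps)


def check_inquiry(inquiry_content, risk_list):
    """Return an alert if the inquiry fingerprint is found on the risk list."""
    if fingerprint(inquiry_content) in risk_list:
        return "alert"
    return None


# Usage: one risky cluster and one low-risk cluster, threshold 0.5 (assumed values).
risk_list = set()
update_risk_list({fingerprint("send gift cards today")}, 0.9, 0.5, risk_list)
update_risk_list({fingerprint("is this item still available")}, 0.1, 0.5, risk_list)
risky = check_inquiry("Send  GIFT cards today", risk_list)
benign = check_inquiry("Is this item still available", risk_list)
```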

Regarding claim 4, the method of Claim 1, the method including:	receiving an inquiry message with inquiry message content (Benea: par 0023, 0024); 	determining an inquiry message fingerprint based on at least part of the inquiry message content (Benea: par 0023, 0024); 	searching one or more clusters of other messages for a fingerprint matching the inquiry message fingerprint (Costea: par 0048); 	if the fingerprint matching the inquiry message fingerprint is found on a matching cluster of other messages, then determining a risk level for the matching cluster (Costea: par 0048); and	if the risk level for the matching cluster is greater than a risk threshold, generating at least one of an alert, a notification, and a blocking message if the message is a near-duplicate of any fingerprint on the risk list (Costea: par 0047, 0048, 0057).
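Claim 4 differs from claim 3 in that the inquiry fingerprint is searched against the clusters themselves and the risk level of the matching cluster is then compared to the threshold. A minimal sketch under stated assumptions follows: each cluster is a hypothetical dictionary holding a set of fingerprints and a precomputed `risk_level`, and the "block" return value stands in for any of the recited alert, notification, or blocking message.

```python
def find_matching_cluster(inquiry_fp, clusters):
    """Return the id of the first cluster containing the fingerprint, else None."""
    for cluster_id, cluster in clusters.items():
        if inquiry_fp in cluster["fingerprints"]:
            return cluster_id
    return None


def handle_inquiry(inquiry_fp, clusters, risk_threshold):
    """Block only when a matching cluster exists and its risk exceeds the threshold."""
    cluster_id = find_matching_cluster(inquiry_fp, clusters)
    if cluster_id is None:
        return None
    # Assumption: the risk level was determined earlier and stored with the cluster.
    if clusters[cluster_id]["risk_level"] > risk_threshold:
        return "block"
    return None


# Usage with placeholder fingerprints and assumed risk levels.
clusters = {
    "c1": {"fingerprints": {"fp-a", "fp-b"}, "risk_level": 0.9},
    "c2": {"fingerprints": {"fp-c"}, "risk_level": 0.1},
}
```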

Regarding claim 5, the method of Claim 4, the method including:	training a risk detection model using machine learning applied to data for one or more clusters of other messages and one or more attributes to determine a risk level (Costea: par 0067); and 	the step of determining a risk level for the matching cluster comprises predicting a risk level associated with the matching cluster of other messages using the risk detection model (Costea: par 0067).
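For illustration only, training a risk detection model on cluster attributes and then predicting a risk level for a matching cluster can be sketched with a small logistic regression fitted by gradient descent. Logistic regression is an assumption chosen for brevity; neither the application nor Costea is asserted to use it. The two scaled attributes (messages sent by the sender, accounts associated with the sender), the training values, and all names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_risk_model(features, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights and bias by batch gradient descent."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Prediction error for this cluster under the current model.
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_risk(model, x):
    """Predicted risk level in (0, 1) for one cluster's attribute vector."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical training data: [messages_sent_scaled, accounts_scaled] per cluster.
features = [[0.1, 0.1], [0.2, 0.1], [0.8, 0.9], [0.9, 0.7]]
labels = [0, 0, 1, 1]  # 1 = cluster previously judged risky (assumed labels)
model = train_risk_model(features, labels)
```

The predicted value can then be compared against the risk threshold in the same way as a stored risk level.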

Regarding claim 6, the method of Claim 5, where the one or more attributes includes one or more of a sender identifier (Costea: par 0083), a number of messages sent by the sender, a number of accounts associated with the sender, or a description, price, or age of an item listing.

Regarding claim 9, the near-duplicate detection system of Claim 8, the method including: 	determining a risk level for the cluster of other messages (Costea: par 0042); and 	if the risk level for the cluster is greater than a risk threshold, adding the fingerprints of the cluster of other messages to a risk list (Costea: par 0042; Benea: par 0046, 0049).

Regarding claim 10, the near-duplicate detection system of Claim 9, the method including: 	receiving an inquiry message with inquiry message content (Benea: par 0023, 0024); 	determining an inquiry message fingerprint based on at least part of the inquiry message content (Benea: par 0023, 0024); 	searching the risk list for a fingerprint matching the inquiry message fingerprint (Benea: par 0023, 0024); and 	if the fingerprint matching the inquiry message is found on the risk list, generating at least one of an alert, a notification, and a blocking message (Benea: par 0023, 0024).

Regarding claim 11, the near-duplicate detection system of Claim 8, the method including: 	receiving an inquiry message with inquiry message content (Benea: par 0023, 0024); 	determining an inquiry message fingerprint based on at least part of the inquiry message content (Benea: par 0023, 0024); 	searching one or more clusters of other messages for a fingerprint matching the inquiry message fingerprint (Costea: par 0048); 	if the fingerprint matching the inquiry message fingerprint is found on a matching cluster of other messages, then determining a risk level for the matching cluster (Costea: par 0048); and	if the risk level for the matching cluster is greater than a risk threshold, generating at least one of an alert, a notification, and a blocking message if the message is a near-duplicate of any fingerprint on the risk list (Costea: par 0047, 0048, 0057).

Regarding claim 12, the near-duplicate detection system of Claim 11, the method including: 	training a risk detection model using machine learning applied to data for one or more clusters of other messages and one or more attributes to determine a risk level (Costea: par 0067); and 	the step of determining a risk level for the matching cluster comprises predicting a risk level associated with the matching cluster of other messages using the risk detection model (Costea: par 0067).

Regarding claim 13, the near-duplicate detection system of Claim 12, where the one or more attributes includes one or more of a sender identifier (Costea: par 0083), a number of messages sent by the sender, a number of accounts associated with the sender, or a description, price, or age of an item listing.

Regarding claim 16, the computer storage media of Claim 15, where the near-duplicate detection method includes: 	determining a risk level for the cluster of other messages (Costea: par 0042);	if the risk level for the cluster is greater than a risk threshold, adding the fingerprints of the cluster of other messages to a risk list (Costea: par 0042; Benea: par 0046, 0049); 	receiving an inquiry message with inquiry message content (Benea: par 0023, 0024);	determining an inquiry message fingerprint based on at least part of the inquiry message content (Benea: par 0023, 0024); 	searching the risk list for a fingerprint matching the inquiry message fingerprint (Benea: par 0023, 0024); and 	if the fingerprint matching the inquiry message is found on the risk list, generating at least one of an alert, a notification, and a blocking message (Benea: par 0023, 0024).

Regarding claim 17, the computer storage media of Claim 16, where the near-duplicate detection method includes: 	receiving an inquiry message with inquiry message content (Benea: par 0023, 0024);	determining an inquiry message fingerprint based on at least part of the inquiry message content (Benea: par 0023, 0024);	searching one or more clusters of other messages for a fingerprint matching the inquiry message fingerprint (Costea: par 0048); 	if the fingerprint matching the inquiry message fingerprint is found on a matching cluster of other messages, then determining a risk level for the matching cluster (Costea: par 0048); and	if the risk level for the matching cluster is greater than a risk threshold, generating at least one of an alert, a notification, and a blocking message if the message is a near-duplicate of any fingerprint on the risk list (Costea: par 0047, 0048, 0057).

Regarding claim 18, the computer storage media of Claim 17, where the near-duplicate detection method includes:	training a risk detection model using machine learning applied to data for one or more clusters of other messages and one or more attributes to determine a risk level (Costea: par 0067); and 	the step of determining a risk level for the matching cluster comprises predicting a risk level associated with the matching cluster of other messages using the risk detection model (Costea: par 0067).

Regarding claim 19, the computer storage media of Claim 18, where the one or more attributes includes one or more of a sender identifier (Costea: par 0083), a number of messages sent by the sender, a number of accounts associated with the sender, or a description, price, or age of an item listing. 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAMES E SPRINGER whose telephone number is (571)270-5640. The examiner can normally be reached 9am - 5:30pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, GLENTON BURGESS, can be reached at 571-272-3949. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

JAMES E. SPRINGER
Primary Examiner
Art Unit 2454



/JAMES E SPRINGER/ Primary Examiner, Art Unit 2454