DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claims 1-20 are pending in this application. This is a non-final Office action in response to Application Number 16/871,258, filed on 11 May 2020.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claims 1, 8, and 15 recite the limitation “match exclusive of a top-level domain” in the second limitation (claim 1: second limitation | claim 8: fifth limitation | claim 15: second limitation); however, the term ‘exclusive’ has multiple common interpretations. It is unclear whether the claimed limitation refers to not allowing a match with a top-level domain or to a match that is only with a top-level domain. For purposes of examination, the examiner interprets this limitation such that a match is found from within a specific top-level domain.
The dependent claims 2-7, 9-14, and 16-20 do not clarify this issue and are also rejected under 35 USC 112 for the same reasons.
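
The two readings of “exclusive of a top-level domain” identified above can be illustrated with a minimal sketch. The function names, the sample domain names, and the simplification that the TLD is the final dot-separated label are assumptions made for illustration only; they are not drawn from the claims or from the cited references. The second function corresponds to the interpretation adopted for purposes of examination.

```python
def labels_excluding_tld(domain: str) -> list[str]:
    """Reading A: the TLD is left out of whatever is matched."""
    labels = domain.lower().rstrip(".").split(".")
    return labels[:-1]  # drop the last label, treated here as the TLD


def restrict_to_tld(domains: list[str], tld: str) -> list[str]:
    """Reading B: only names inside one given TLD are considered for matching."""
    suffix = "." + tld.lower().lstrip(".")
    return [d for d in domains if d.lower().rstrip(".").endswith(suffix)]


# Hypothetical sample names, chosen only to show the difference.
candidates = ["shop.example.com", "example.org", "login.example.edu"]
print(labels_excluding_tld("shop.example.com"))  # ['shop', 'example']
print(restrict_to_tld(candidates, "edu"))        # ['login.example.edu']
```

Under Reading A the TLD never participates in the comparison; under Reading B the TLD acts as a filter on which names are compared at all. The two readings yield different claim scope, which is the basis of the indefiniteness finding.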

Claims 2, 9, and 16 recite the limitation “removing the matches from the set of matches” in the last limitation (claim 2: last limitation | claim 9: last limitation | claim 16: last limitation); however, it is unclear which matches are removed, since dependent claims 2, 9, and 16 and their parent claims 1, 8, and 15 describe matches, irrelevant matches, clustered matches, and false positive matches. For purposes of examination, the examiner interprets this limitation such that FQDNs that are not compromised are not included in the suspicious list.

Claims 1, 8, and 15 recite the limitation "the seed domain string" three times in the fourth and fifth limitations (claim 1: fourth and fifth limitations | claim 8: seventh and eighth limitations | claim 15: fourth and fifth limitations). There is insufficient antecedent basis for this limitation in the claim.
The dependent claims 2-7, 9-14, and 16-20 do not clarify this issue and are also rejected under 35 USC 112 for the same reasons.

Claims 2, 7, 9, 12, 14, 16, and 19-20 recite the limitation “for each of the set of matches” in the first limitation (claim 2: first limitation | claim 7: first limitation | claim 9: first limitation | claim 12: first limitation | claim 14: first limitation | claim 16: first limitation | claim 19: first limitation | claim 20: first limitation). There is insufficient antecedent basis for this limitation in the claim. The use of “each” in these limitations would imply that there could be multiple sets of matches; however, the parent claims 1, 8, and 15 recite “a set of matches” in the last limitation (claim 1: last limitation | claim 8: last limitation | claim 15: last limitation), i.e. a singular set of matches.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (U.S. Patent 10,880,330) in view of Omoigui (U.S. Patent Publication 2011/0047148).


Regarding claim 1, Wang disclosed a method for automatically reducing false positives in a domain discovery process (see Wang 18:19-36: automatically discovering compromised FQDN with low (4.4%) false positive rate | Wang 29:49-61: competitor’s false detection rate is 90%), the method comprising:
analyzing, by a rules engine (see Wang 2:35-38: inconsistency searcher searches a set of TLD and detects at least one FQDN carrying at least one irrelevant term | 2:39-44: context analyzer, i.e. a rules engine, determines whether the frequently-used or irrelevant terms are unrelated to the FQDN’s content), a match produced by a domain discovery system (see Wang 2:35-38: inconsistency searcher, i.e. a part of a domain discovery system, searches a set of TLD and detects at least one FQDN carrying at least one irrelevant term, i.e. discovering a FQDN that matches – search results identify suspicious sites | 1:56-2:10: SEISE (Semantic Inconsistency Search) uses NLP to identify infected sites, i.e. domain discovery system | 10:42-44: SEISE includes Semantics Finder, Inconsistency Searcher, Context Analyzer, and IBT Collector), the match comprising a domain name determined by the domain discovery system as matching a seed domain (see Wang 2:35-38: inconsistency searcher searches a set of TLD, i.e. a seed domain, and detects at least one FQDN carrying at least one irrelevant term);
extracting, by the rules engine utilizing a natural language processing (NLP) library (see Wang 1:56-2:10: SEISE (Semantic Inconsistency Search) uses NLP to identify infected sites | 10:42-44: SEISE includes Semantics Finder, Inconsistency Searcher, Context Analyzer, and IBT Collector), a sequence of segments from the match (see Wang 9:31-10:12: NLP techniques include extracting words, keywords, parts-of-speech, and phrases from web content and analyzing the extracted content) exclusive of a top-level domain (TLD) (see Wang 11:1-20: identifying FQDN from within the specific sTLD);
assigning, by the rules engine utilizing the NLP library, a lexical category to each segment of the sequence of segments (see Wang 9:31-10:12: NLP techniques include extracting words, keywords, parts-of-speech, and phrases from web content and analyzing the extracted content. Each word within a text, phrase, or sentence is associated with a particular part of speech, i.e. lexical category);
determining, by the rules engine through fuzzy string matching (see Omoigui combination below), a segment of the sequence of segments that is closest to the seed domain string (see Wang 12:41-62: identifying terms within the identified FQDN that match the IBT (irrelevant bad terms), i.e. seed domain string, or have a large semantic gap with the sTLD/FQDN);
comparing, by the rules engine, the lexical category of the segment of the sequence of segments that is closest to the seed domain string with a lexical category of the seed domain string (see Wang 13:1-9: skip-gram model maps IBT or sTLD keyword with various semantics | 13:10-16: calculating the semantic distance between IBT and sTLD keyword | 13:32-46: comparing (performing a differential analysis – 14:20-52 provides details) the words extracted from the FQDN with the IBT to determine if they have a similar context, i.e. determining if the compared terms have related themes/categories | 15:26-52: identifying other terms within the same category as the FQDN’s word and IBT);
determining, by the rules engine based on the comparing, whether the match is relevant to the seed domain (see Wang 13:32-14:19: determining if the suspicious FQDN is truly compromised or if the presence of terms irrelevant to the sTLD is for a legitimate reason, i.e. determining if the matched FQDN is relevant or not – relevance is interpreted as the FQDN including IBT with a malicious context);
in response to a determination by the rules engine that the match is not relevant to the seed domain (see Wang 13:33-44: determining if the suspicious FQDN is truly compromised or if the presence of terms irrelevant to the sTLD is for a legitimate reason, i.e. relevance is interpreted as the FQDN including IBT without a legitimate reason – a compromised FQDN – and not relevant is interpreted as the FQDN including IBT for a legitimate reason – not compromised FQDN), identifying, by the rules engine, the match produced by the domain discovery system as a false positive (see Wang 2:35-38: inconsistency searcher searches a set of TLD, i.e. a seed domain, and detects at least one FQDN carrying at least one irrelevant term, i.e. identifying suspicious/compromised FQDNs. | 14:20-52: determining that the identified suspicious/compromised FQDN is not compromised, i.e. not relevant and false positive); and
automatically removing the false positive from a set of matches produced by the domain discovery system for the seed domain (see Wang 18:38-42: not compromised FQDN are not flagged, i.e. removing from suspicious list).

Wang did not explicitly disclose that the identification of closest segments is performed “through fuzzy string matching”; however, in the related art of semantic queries (see Omoigui 0033) and identifying false positives (see Omoigui 0043), Omoigui disclosed analyzing the query search results via semantic indexing and fuzzy mapping (see Omoigui 0035). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Wang and Omoigui to further disclose how fuzzy matching is used when performing semantic analysis and thereby increase precision (see Omoigui 0044).
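
As an aid to reading the mapping above, the claim 1 steps for a single match can be sketched as follows. This is an illustrative sketch only, not the applicant's, Wang's, or Omoigui's implementation: the hand-written `LEXICON` stands in for an NLP library's lexical-category tagger, `difflib.SequenceMatcher` from the Python standard library stands in for fuzzy string matching, and the splitting rule, the rule that equal categories mean "relevant", the sample names, and all function names are assumptions.

```python
import re
from difflib import SequenceMatcher

# Hypothetical stand-in for an NLP library's lexical-category lookup.
LEXICON = {"apple": "NOUN", "store": "NOUN", "grapple": "VERB", "gym": "NOUN"}


def segments_without_tld(domain):
    """Split a domain into segments, leaving out the last label (assumed TLD)."""
    labels = domain.lower().split(".")[:-1]
    segments = []
    for label in labels:
        segments.extend(s for s in re.split(r"[-_]", label) if s)
    return segments


def lexical_category(segment):
    """Assign a lexical category; unknown words fall back to 'UNKNOWN'."""
    return LEXICON.get(segment, "UNKNOWN")


def closest_segment(segments, seed):
    """Pick the segment most similar to the seed string (fuzzy matching)."""
    return max(segments, key=lambda s: SequenceMatcher(None, s, seed).ratio())


def is_relevant(domain, seed):
    """Assumed rule: relevant when the closest segment shares the seed's category."""
    segments = segments_without_tld(domain)
    if not segments:
        return False
    best = closest_segment(segments, seed)
    return lexical_category(best) == lexical_category(seed)


def remove_false_positives(matches, seed):
    """Keep only the matches judged relevant; the rest are treated as false positives."""
    return [m for m in matches if is_relevant(m, seed)]


matches = ["apple-store.com", "grapple-gym.net"]
print(remove_false_positives(matches, "apple"))  # ['apple-store.com']
```

In this toy example, "grapple" is the segment closest to the seed "apple" in the second name, but its assumed category (verb) differs from the seed's (noun), so that match is dropped as a false positive.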

Regarding claim 8, the claim contains the limitations, substantially as claimed, as described in claim 1 above and is rejected under Wang-Omoigui according to the rationale provided above. Wang-Omoigui further disclosed a system for automatically reducing false positives in a domain discovery process, the system comprising: a processor (see Wang 5:1-10: processor); a non-transitory computer-readable medium (see Wang 5:1-10: memory); and stored instructions translatable by the processor (see Wang 5:1-10: instructions stored in memory and executed by processor) for performing the method of claim 1 above.

Regarding claim 15, the claim contains the limitations, substantially as claimed, as described in claim 1 above and is rejected under Wang-Omoigui according to the rationale provided above. Wang-Omoigui further disclosed a computer program product comprising a non-transitory computer-readable medium storing instructions translatable by a processor (see Wang 5:1-10: instructions stored in memory and executed by processor) to perform the method of claim 1 above.

Regarding claim 2, Wang-Omoigui disclosed the method according to claim 1, further comprising:
performing the analyzing, the extracting, the assigning, the determining, and the comparing for each of the set of matches (see rejection of claim 1 above | Wang 2:35-38: inconsistency searcher, i.e. a part of a domain discovery system, searches a set of TLD and detects at least one FQDN carrying at least one irrelevant term, i.e. discovering a FQDN that matches | 2:39-44: context analyzer, i.e. a rules engine, determines whether the frequently-used or irrelevant terms are unrelated to the FQDN’s content | 9:31-10:12: NLP techniques include extracting words, keywords, parts-of-speech, and phrases from web content and analyzing the extracted content | 13:32-14:19: determining if the suspicious FQDN is truly compromised or if the presence of terms irrelevant to the sTLD is for a legitimate reason);
clustering matches in the set of matches (a set of matches can include just one item) determined by the rules engine as not being relevant to the seed domain (see Wang 2:35-38: inconsistency searcher searches a set of TLD, i.e. a seed domain, and detects at least one FQDN carrying at least one irrelevant term, i.e. identifying suspicious/compromised FQDNs. | 14:20-52: determining that the identified suspicious/compromised FQDN is not compromised, i.e. not relevant); and
removing the matches from the set of matches (see Wang 18:38-42: not compromised FQDN are not flagged, i.e. removing from suspicious list).

Regarding claim 9, the claim contains the limitations, substantially as claimed, as described in claim 2 above and is rejected under Wang-Omoigui according to the rationale provided above.

Regarding claim 16, the claim contains the limitations, substantially as claimed, as described in claim 2 above and is rejected under Wang-Omoigui according to the rationale provided above.

Regarding claim 3, Wang-Omoigui disclosed the method according to claim 1, further comprising: excluding the TLD from the match (see Wang 19:10-20:9: identifying compromised FQDN in other sTLDs and other categories).

Regarding claim 10, the claim contains the limitations, substantially as claimed, as described in claim 3 above and is rejected under Wang-Omoigui according to the rationale provided above.

Regarding claim 17, the claim contains the limitations, substantially as claimed, as described in claim 3 above and is rejected under Wang-Omoigui according to the rationale provided above.



Regarding claim 4, Wang-Omoigui disclosed the method according to claim 1, wherein the automatically removing the false positive from the set of matches comprises dissociating the false positive and the seed domain (see Wang 18:38-42: not compromised FQDN are not flagged, i.e. dissociated from suspicious list).

Regarding claim 11, the claim contains the limitations, substantially as claimed, as described in claim 4 above and is rejected under Wang-Omoigui according to the rationale provided above.

Regarding claim 18, the claim contains the limitations, substantially as claimed, as described in claim 4 above and is rejected under Wang-Omoigui according to the rationale provided above.

Regarding claim 5, Wang-Omoigui disclosed the method according to claim 4, further comprising: dissociating false positives identified by the rules engine from being associated with the seed domain, the dissociating producing a final result set of relevant matches for the seed domain (see Wang 18:38-42: compromised FQDN are flagged, i.e. producing a list of compromised sites).

Regarding claim 12, the claim contains the limitations, substantially as claimed, as described in claim 5 above and is rejected under Wang-Omoigui according to the rationale provided above.

Regarding claim 6, Wang-Omoigui disclosed the method according to claim 5, further comprising: providing the final result set of relevant matches for the seed domain to a user interface or as input to a computing facility downstream from the rules engine (see Wang 28:66-29:6: reporting the discovered compromised FQDNs).

Regarding claim 13, the claim contains the limitations, substantially as claimed, as described in claim 6 above and is rejected under Wang-Omoigui according to the rationale provided above.

Regarding claim 7, Wang-Omoigui disclosed the method according to claim 1, wherein each of the set of matches produced by the domain discovery system for the seed domain contains a segment that is identical to the seed domain string (see Wang 2:35-38: inconsistency searcher searches a set of TLD and detects at least one FQDN carrying at least one irrelevant term, i.e. discovering a FQDN that includes a term matching the IBT, i.e. matching the seed domain string).

Regarding claim 14, the claim contains the limitations, substantially as claimed, as described in claim 7 above and is rejected under Wang-Omoigui according to the rationale provided above.

Regarding claim 20, the claim contains the limitations, substantially as claimed, as described in claim 7 above and is rejected under Wang-Omoigui according to the rationale provided above.

Regarding claim 19, Wang-Omoigui disclosed the computer program product of claim 15, wherein the instructions are further translatable by the processor for:
performing the dissociating for each of the set of matches determined as not being relevant to the seed domain to thereby produce a final result set of relevant matches for the seed domain (see Wang 18:38-42: compromised FQDN are flagged, i.e. producing a list of compromised sites); and
providing the final result set of relevant matches for the seed domain to a user interface or as input to a computing facility (see Wang 28:66-29:6: reporting the discovered compromised FQDNs).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Angela Widhalm de Rodriguez whose telephone number is (571)272-1035. The examiner can normally be reached M-F: 6am-2:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Thu Nguyen, can be reached at (571) 272-6967. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/A.W.R./Examiner, Art Unit 2452
15 July 2022


/Patrice L Winder/Primary Examiner, Art Unit 2452