DETAILED ACTION
This Office Action is in response to the amendment and communication filed on 02/18/2021.
The present application is being examined under the pre-AIA first to invent provisions.
Claims 1-19 have been examined and are pending in this application. Claims 1, 8, and 15 are independent.
Response to Arguments/Remarks
As to the objections to claims 1 and 8, the objections are withdrawn as the claims have been amended.
As to the rejections of claims 8-14 under 35 U.S.C. § 101, the rejections are withdrawn as the claims have been amended to recite “non-transitory”.
As to the rejections of claims 6, 13, and 18 under 35 U.S.C. § 112(b), second paragraph, the rejections are withdrawn as the claims have been amended.
Applicants’ arguments in the instant Amendment, filed on 02/18/2021, with respect to the prior-art rejections of claims 1-19 and the limitations listed below have been fully considered but they are not persuasive. The applied prior art continues to teach the limitations, including the amended limitations.
Applicant’s Remarks: As to the independent claim 1, the Applicant submits that disclosure in Hubbard does not disclose or suggest the aforementioned claim 
Further, Applicant respectfully submits that removing URL links, as cited by the Examiner from Hubbard, teaches against the claim limitation because the URL links removed by Hubbard contain valuable information for identifying a phishing attempt. In contrast, the claim limitation teaches removing stop words and repeated tags, which are devoid of such valuable information. Applicant respectfully submits that Baudin also teaches against this limitation. Baudin states in paragraph 0041 that stop words are “not ignored” and, in fact, teaches a parsing script that marks stop words (paragraph 0015) (Applicant Arguments/Remarks, 02/18/2021, page 9).
The Examiner disagrees with the Applicants. The Examiner respectfully submits that Hubbard teaches transforming said HTML source strings [ ] to reduce; and filtering sequences. Hubbard discloses that the system generates an indicator of active content associated with the URL, the indicator being based on data associated with at least one component of the URL. The system removes URL links and other content from a list of URLs, and parses the returned page to identify URL links in the returned search results and embedded items of content (Hubbard: pars 0010-0011, 0042, 0062). In Hubbard’s teaching, removing URL links and other content [i.e. selective content] from a list of URLs does not result in the loss of valuable information, because the system retains the ability to maintain valuable information, to parse the returned page, and to identify URL links in the returned search results and embedded items of content. Therefore mapping of applied prior art 
Baudin teaches an input text editing/normalizing process that marks stop words and common words based on a list of words in the parsing script, and applies an editor to add usable relations and similarity information, creating normalized values (Baudin: pars 0014-0015, 0031-0033, 0085-0086). Because the primary reference Hubbard teaches removing certain content from text content, and the secondary reference Baudin teaches identifying the stop words and common words, Baudin need not itself show removal of the identified stop words and common words in order to be combined with Hubbard to teach the addressed limitations. The Examiner agrees with the Applicant that, in paragraph 0041, Baudin discloses that stop words (prepositions, etc.) and words that are too general must be underscored during comparison matches; these words are not ignored, but they add less weight than other words to a final matching weight. However, nowhere does Baudin disclose that the stop words cannot be ignored; Baudin discloses only that they are given less weight. Therefore, in combination with Hubbard, if the lower-weight words are not required, they can be removed following the teaching of Hubbard. Therefore, broadly interpreted, the combination of Hubbard and Baudin teaches the claim limitations.
Applicant’s Remarks: As to the independent claim 1, the Applicant further submits that Hubbard and Baudin do not disclose the limitations “performing string alignment on said set of transformed source strings, thereby obtaining at least a scoring matrix” and “obtaining a second set of longest common sequences responsive to said performing said string alignment” (Applicant Arguments/Remarks, 02/18/2021, page 10).
The Examiner disagrees with the Applicants. The Examiner respectfully submits that the combination of Hubbard and Baudin teaches the addressed limitations: performing string alignment on said set of transformed source strings, thereby obtaining at least a scoring matrix; and obtaining a second set of longest common sequences responsive to said performing said string alignment. Baudin teaches marking stop words and common words based on a list of words in the parsing script, applying an editor to add usable relations and similarity information, creating normalized values, and applying a second matching method to the text to identify a value in the text (Baudin: pars 0014-0015, 0031-0033, 0085-0086). The limitations do not define what specific algorithm or mechanism is used to obtain the scoring matrix, or what algorithm or mechanism is used for performing the string alignment. Therefore, broadly interpreted, Baudin teaches the limitations, as it teaches of applying 
Additionally, as to the dependent claims 3-7, the Applicant submits that the claims are allowable at least based on their dependency from the allowable base claim 1 (Applicant Arguments/Remarks, 02/18/2021, page 10).
The Examiner respectfully submits that the dependent claims 2-7 are rejected at least based on the rationale and response presented to the argument for their respective base claim 1, and the reference applied to the claims 2-7.
Applicant’s Remarks: As to claims 8-19, the Applicant submits that, for similar reasons as discussed for claims 1-8, the claims are patentably distinguished over the combination of Hubbard and Baudin (Applicant Arguments/Remarks, 02/18/2021, page 10).
The Examiner disagrees with the Applicants. The Examiner respectfully submits that the claims 8-19 are rejected at least based on the reference applied to the claims and the rationale and response presented to the argument above for claims 1-7.
Claim Rejections - 35 USC § 103
The following is a quotation of pre-AIA  35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains.  Patentability shall not be negatived by the manner in which the invention was made.


This application currently names joint inventors. In considering patentability of the claims under pre-AIA  35 U.S.C. 103(a), the examiner presumes that the subject matter of the various claims was commonly owned at the time any inventions covered therein were made absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was not commonly owned at the time a later invention was made in order for the examiner to consider the applicability of pre-AIA  35 U.S.C. 103(c) and potential pre-AIA  35 U.S.C. 102(e), (f) or (g) prior art under pre-AIA  35 U.S.C. 103(a).
Claims 1-19 are rejected under 35 U.S.C. 103(a) as being unpatentable over Hubbard et al. (“Hubbard” US 2008/0133540, filed on 02/28/2007), in view of Baudin et al. (“Baudin” US 2004/0163043, published on 08/19/2004).
As to claim 1, Hubbard teaches a computer-implemented method for generating a first set of longest common sequences from a plurality of known malicious webpages, said first set of longest common sequences representing input data which is used to generate a set of regular expressions for detecting phishing webpages (Hubbard: pars 0010-0011, 0042, 0068, a system and method of identifying and categorizing web content, including potentially executable web content and malicious content, where the web content includes HTML webpages. Analyzes URL content and processes a reputation score for allowing a user access to the website associated with the URLs by applying a security policy), comprising:
obtaining HTML source strings from said plurality of known malicious webpages (Hubbard: pars 0010-0011, 0042, 0049, receive a request for at least one uniform resource locator, where the web content includes HTML webpages. An upload module is used to receive URL data from the network. Performs identifying and categorizing web content, including malicious content);
transforming said HTML source strings (Hubbard: pars 0010-0011, 0042, generate an indicator of active content associated with the URL. The indicator is based on data associated with at least one component of the URL), to reduce (Hubbard: pars 0010-0011, remove URL links and other content [i.e. selective content] from a list of URLs);
filtering sequences (Hubbard: pars 0010-0011, 0062, filtering module determines whether to allow the request based at least partly on the at least indicator and the policy. Parse the returned page and identify URL links in the returned search results and embedded items of content); and
using said first set of longest common sequences to generate the set of regular expressions; and using the set of regular expressions to detect a phishing attack (Hubbard: pars 0010-0011, 0042, 0055, 0062, filtering module determines whether to allow the request based at least partly on the at least indicator and the security policy. Identifying and categorizing web content, including potentially executable web content and malicious content, that is found at locations identified by Uniform Resource Locators (URLs), where malicious web content may refer to interactive content such as "phishing" schemes. Generated active content, disallow URL requests that are categorized as being "Malicious" or "Spyware.").

Hubbard does not explicitly teach:
performing string alignment on said set of transformed source strings, thereby obtaining at least a scoring matrix;
obtaining a second set of longest common sequences responsive to said performing said string alignment; and
[processing] said second set of longest common sequences, thereby obtaining said first set of longest common sequences.
However, in an analogous art, Baudin teaches transforming [ ] source strings to reduce the number of at least one of stop words and repeated tags, thereby obtaining a set of transformed source strings (Baudin: pars 0014-0015, 0033, 0085, obtaining structured data from input text files such as HTML files and webpages, and transforming the structured data by applying a first matching method to the text to identify a value in the text. Marking stop words and common words based on a list of words in the parsing script);
performing string alignment on said set of transformed source strings, thereby obtaining at least a scoring matrix (Baudin: pars 0014-0015, 0031-0033, 0085-0086, marking stop words and common words based on a list of words in the parsing script, and applying an editor to add usable relations and similarity information; creating normalized values);
obtaining a second set of longest common sequences responsive to said performing said string alignment (Baudin: pars 0014-0015, 0031-0033, applying a second matching method to the text to identify a value in the text); and
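[processing] said second set of longest common sequences, thereby obtaining said first set of longest common sequences (Baudin: pars 0014-0015, 0031-0033, 0085-0086, text transformation process creates text with new values after the editing process, enabling relationships and similarities between the elements).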
Therefore, it would have been obvious to one of ordinary skill in the art at the time the invention was made to combine the teachings of Baudin with the method of Hubbard for the benefit of providing a user with a means for receiving text strings of HTML source and processing the text using text transformation and filtering to analyze potentially malicious webpages (Baudin: pars 0014-0015, 0033, 0085-0086).
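For illustration only, the claimed steps of performing string alignment to obtain a scoring matrix and obtaining a longest common sequence from that matrix can be sketched as follows. This sketch is not drawn from Hubbard, Baudin, or the application as filed. It assumes that "longest common sequence" is read as a longest common subsequence of tokens and that the scoring matrix is a conventional dynamic programming table; the function names lcs_scoring_matrix and longest_common_sequence are hypothetical.

```python
def lcs_scoring_matrix(a, b):
    """Build the dynamic programming table for two token sequences.

    score[i][j] holds the length of the longest common subsequence
    of a[:i] and b[:j]. (Assumed reading of "scoring matrix".)
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                # Matching tokens extend the best alignment of the prefixes.
                score[i][j] = score[i - 1][j - 1] + 1
            else:
                # Otherwise carry forward the better of the two neighbors.
                score[i][j] = max(score[i - 1][j], score[i][j - 1])
    return score


def longest_common_sequence(a, b):
    """Trace back through the scoring matrix to recover one common subsequence."""
    score = lcs_scoring_matrix(a, b)
    i, j = len(a), len(b)
    out = []
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif score[i - 1][j] >= score[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]
```

Under these assumptions, two tokenized HTML source strings that share a form, an input tag, and a password field would yield those shared tokens, in order, as their common sequence.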
As to claim 2, the combination of Hubbard and Baudin teaches the method of claim 1, Baudin further teaches wherein said transforming said HTML source strings is configured to retain visual key tags of said HTML source strings (Baudin: pars 0014-0015, 0031-0033, transforming unstructured/semi-structured text into structured text/criterion-value pairs, insofar as the transformed text/associated criterion input text/WEB pages, results in the editing html tags, inclusive of the tags associated URL(s) and media (e.g., visual image links embedded) data content/files references).
As to claim 3, the combination of Hubbard and Baudin teaches the method of claim 1, Baudin further teaches wherein said filtering said second set of longest common sequences includes removing similarities among said second set of longest common sequences (Baudin: pars 0014-0015, 0031-0033, transforming unstructured/semi-structured text into structured text/criterion-value pairs, insofar as the transformed text/associated criterion input text/WEB pages, results in the editing and subsequent comparison/matching strategies).
As to claim 4, the combination of Hubbard and Baudin teaches the method of claim 1, Hubbard and Baudin further teach wherein said performing string alignment includes performing dynamic programming calculations on said at least one scoring matrix (Hubbard: par 0077, generates a score or other data representative of a reputation of the URL. Transforming unstructured/semi-structured text into structured text/criterion-value pairs, insofar as the transformed text/associated criterion input text/WEB pages, results in the editing and subsequent comparison/matching strategies to produce weighing text/text criteria, said structured text/criterion having been applied a second matching method (i.e., real time and dynamic weighing/scoring of the structured content so processed) insofar as criteria/values established are a function of the extraction rules, ontological matching criteria, etc.).
As to claim 5, the combination of Hubbard and Baudin teaches the method of claim 4, Baudin further teaches wherein said performing dynamic programming includes assigning different weights to different keyword combinations (Baudin: pars 0014-0015, 0031-0033, 0077,  transforming unstructured/semi-structured text into structured text/criterion-value pairs, insofar as the transformed text/associated criterion input text/WEB pages, results in the editing and subsequent comparison/matching strategies to produce weighing text/text criteria, said structured text/criterion having been applied a second matching method (i.e., real time and dynamic weighing/scoring of the structured content so processed). Categorized and URL field is indexed so that it may be more quickly searched in real time).
As to claim 6, the combination of Hubbard and Baudin teaches the method of claim 1, Baudin further teaches wherein said transforming said HTML source strings includes removing both stop words and said repeat tags (Baudin: pars 0014-0015, 0033, 0085, obtaining structured data from input text files such as html file and webpages, and transforming the structured data applying a first matching method to the text to identify a value in the text. Performing mark stop words and common words based on a list of words in the parsing script).
As to claim 7, the combination of Hubbard and Baudin teaches the method of claim 1, Baudin further teaches wherein said transforming said HTML source strings includes removing said stop words, said stop words representing at least one of a commonly occurring verb, a commonly occurring article, a commonly occurring preposition, and a commonly occurring pronoun (Baudin: pars 0014-0015, 0031-0033, 0038, transforming unstructured/semi-structured text into structured text/criterion-value pairs, insofar as the transformed text/associated criterion input text/WEB pages, results in the editing html tags/downplaying stop words/ extraction rules editing/ontology-based analysis. List of common words used to downplay stop words or the words that are too general for the domain).
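For illustration only, the transforming step addressed in claims 1, 6, and 7 (reducing stop words and repeated tags in an HTML source string) can be sketched as follows. This sketch is not taken from Hubbard, Baudin, or the application as filed. The stop-word list, the tokenizing pattern, and the function name transform_source are assumptions chosen for the example, and "repeated tags" is assumed to mean an identical tag appearing consecutively.

```python
import re

# Hypothetical stop-word list: a few common verbs, articles,
# prepositions, and pronouns (an assumption, not from the record).
STOP_WORDS = frozenset({"a", "an", "the", "is", "are", "of", "to", "in", "it", "you"})

# A token is either a whole tag such as <p> or a run of non-space text.
TOKEN = re.compile(r"<[^>]+>|[^\s<>]+")


def transform_source(html):
    """Return the tokens of an HTML string with stop words and
    consecutively repeated tags removed."""
    out = []
    for tok in TOKEN.findall(html):
        is_tag = tok.startswith("<")
        if not is_tag and tok.lower() in STOP_WORDS:
            continue  # drop stop words
        if is_tag and out and out[-1] == tok:
            continue  # drop a tag identical to the last kept token
        out.append(tok)
    return out
```

Because stop words are dropped before the repeated-tag check, two identical tags separated only by stop words are also collapsed in this sketch; a different reading of the limitation could treat that case differently.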
As to claim 8, the claim is directed to an article of manufacture having thereon a computer readable medium, and the claim limitations are similar to the method claim 1, and therefore, rejected for the same reason set forth for claim 1.
As to claims 9-14, the claim limitations are similar to the method claims 2-7, respectively, and therefore, rejected for the same reason set forth for claims 2-7.
As to claim 15, Hubbard teaches a computer-implemented method for generating a first set of longest common sequences from a plurality of known malicious webpages, said first set of longest common sequences representing input data for generating regular expressions for detecting phishing webpages (Hubbard: pars 0010-0011, 0042, 0068, a system and method of identifying and categorizing web content, including potentially executable web content and malicious content, where the web content includes HTML webpages. Analyzes URL content and processes a reputation score for allowing a user access to the website associated with the URLs by applying a security policy), comprising:
obtaining HTML source strings from said plurality of known malicious webpages (Hubbard: pars 0010-0011, 0042, 0049, receive a request for at least one uniform resource locator, where the web content includes HTML webpages. An upload module is used to receive URL data from the network. Performs identifying and categorizing web content, including malicious content);
transforming said HTML source strings (Hubbard: pars 0010-0011, 0042, generate an indicator of active content associated with the URL. The indicator is based on data associated with at least one component of the URL), to reduce (Hubbard: pars 0010-0011, remove URL links and other content [i.e. selective content] from a list of URLs);
filtering sequences (Hubbard: pars 0010-0011, 0062, filtering module determines whether to allow the request based at least partly on the at least indicator and the policy. Parse the returned page and identify URL links in the returned search results and embedded items of content); and
using said first set of longest common sequences to generate the set of regular expressions for use in detecting a phishing attack (Hubbard: pars 0010-0011, 0042, 0055, 0062, filtering module determines whether to allow the request based at least partly on the at least indicator and the security policy. Identifying and categorizing web content, including potentially executable web content and malicious content, that is found at locations identified by Uniform Resource Locators (URLs), where malicious web content may refer to interactive content such as "phishing" schemes. Generated active content, disallow URL requests that are categorized as being "Malicious" or "Spyware.").

Hubbard does not explicitly teach:
performing string alignment on said set of transformed source strings, thereby obtaining at least a scoring matrix;
obtaining a second set of longest common sequences responsive to said performing said string alignment; and
[processing] said second set of longest common sequences to remove similarities among said second set of longest common sequences, thereby obtaining said first set of longest common sequences.
However, in an analogous art, Baudin teaches transforming said [ ] strings to retain the number of visual key tags of [ ] strings and to reduce the number of at least one of stop words and repeated tags, thereby obtaining a set of transformed source strings (Baudin: pars 0014-0015, 0033, 0085, obtaining structured data from input text files such as HTML files and webpages, and transforming the structured data by applying a first matching method to the text to identify a value in the text. Marking stop words and common words based on a list of words in the parsing script. Transforming unstructured/semi-structured text into structured text/criterion-value pairs, insofar as the transformed text/associated criterion input text/WEB pages, results in the editing html tags, inclusive of the tags associated URL(s) and media (e.g., visual image links embedded) data content/files references);
performing string alignment on said set of transformed source strings, thereby obtaining at least a scoring matrix (Baudin: pars 0014-0015, 0031-0033, 0085-0086, marking stop words and common words based on a list of words in the parsing script, and applying an editor to add usable relations and similarity information; creating normalized values);
obtaining a second set of longest common sequences responsive to said performing said string alignment (Baudin: pars 0014-0015, 0031-0033, applying a second matching method to the text to identify a value in the text); and
[processing] said second set of longest common sequences to remove similarities among said second set of longest common sequences, thereby obtaining said first set of longest common sequences (Baudin: pars 0014-0015, 0031-0033, 0085-0086, text transformation process creates text with new values after the editing process, enabling relationships and similarities between the elements. Transforming unstructured/semi-structured text into structured text/criterion-value pairs, insofar as the transformed text/associated criterion input text/WEB pages, results in the editing and subsequent comparison/matching strategies to produce weighing text/text criteria, said structured text/criterion having been applied a second matching method insofar as criteria/values established are a function of the extraction rules, ontological matching criteria, etc).
Therefore, it would have been obvious to one of ordinary skill in the art at the time the invention was made to combine the teachings of Baudin with the method of Hubbard for the benefit of providing a user with a means for receiving text strings of HTML source and processing the text using text transformation and filtering to analyze potentially malicious webpages (Baudin: pars 0014-0015, 0033, 0085-0086).
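For illustration only, the limitation of using a set of longest common sequences to generate regular expressions for detecting a phishing webpage can be sketched as follows. This sketch is not drawn from Hubbard, Baudin, or the application as filed. It assumes that each common sequence is a list of tokens and that a page is flagged when the tokens appear in order with arbitrary text between them; the function name build_regex is hypothetical.

```python
import re


def build_regex(common_sequence):
    """Compile a pattern that matches the tokens of one common
    sequence in order, allowing any text between them."""
    if not common_sequence:
        # An empty pattern would match every page, so reject it.
        raise ValueError("common sequence must contain at least one token")
    # re.escape keeps characters such as '.' or '?' in a token literal.
    body = ".*?".join(re.escape(tok) for tok in common_sequence)
    # DOTALL lets '.' span line breaks in multi-line HTML source.
    return re.compile(body, re.DOTALL)


def matches_any(page_source, patterns):
    """Return True if any compiled pattern is found in the page source."""
    return any(p.search(page_source) for p in patterns)
```

In this sketch a pattern built from the tokens of a known malicious login form would match a new page whose source contains the same tokens in the same order, regardless of the filler text between them.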
As to claim 16, the combination of Hubbard and Baudin teaches the method of claim 15, Hubbard and Baudin further teach wherein said performing string alignment includes performing dynamic programming calculations on said at least one scoring matrix (Hubbard: par 0077, generates a score or other data representative of a reputation of the URL. Transforming unstructured/semi-structured text into structured text/criterion-value pairs, insofar as the transformed text/associated criterion input text/WEB pages, results in the editing and subsequent comparison/matching strategies to produce weighing text/text criteria, said structured text/criterion having been applied a second matching method (i.e., real time and dynamic weighing/scoring of the structured content so processed) insofar as criteria/values established are a function of the extraction rules, ontological matching criteria, etc.).
As to claim 17, the combination of Hubbard and Baudin teaches the method of claim 16, Baudin further teaches wherein said performing dynamic programming includes assigning different weights to different keyword combinations (Baudin: pars 0014-0015, 0031-0033, 0077, transforming unstructured/semi-structured text into structured text/criterion-value pairs, insofar as the transformed text/associated criterion input text/WEB pages, results in the editing and subsequent comparison/matching strategies to produce weighing text/text criteria, said structured text/criterion having been applied a second matching method (i.e., real time and dynamic weighing/scoring of the structured content so processed). Categorized and URL field is indexed so that it may be more quickly searched in real time).
As to claim 18, the combination of Hubbard and Baudin teaches the method of claim 15, Baudin further teaches wherein said transforming said HTML source strings includes removing both said stop words and said repeat tags (Baudin: pars 0014-0015, 0033, 0085, obtaining structured data from input text files such as HTML files and webpages, and transforming the structured data by applying a first matching method to the text to identify a value in the text. Marking stop words and common words based on a list of words in the parsing script).
As to claim 19, the combination of Hubbard and Baudin teaches the method of claim 15, Baudin further teaches wherein said transforming said HTML source strings includes removing said stop words, said stop words representing at least one of a commonly occurring verb, a commonly occurring article, a commonly occurring preposition, and a commonly occurring pronoun (Baudin: pars 0014-0015, 0031-0033, 0038, transforming unstructured/semi-structured text into structured text/criterion-value pairs, insofar as the transformed text/associated criterion input text/WEB pages, results in the editing html tags/downplaying stop words/ extraction rules editing/ontology-based analysis. List of common words used to downplay stop words or the words that are too general for the domain).

Conclusion

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jahangir Kabir, whose telephone number is (571) 270-3355.  The examiner can normally be reached on 9:00-5:00 Mon-Thu.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Luu Pham, can be reached on (571) 270-5002.  The fax number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published 

/JAHANGIR KABIR/             Primary Examiner, Art Unit 2439