Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claims 1-20 are pending in this application.


Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 1/28/2021 has been entered.



Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
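A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.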


Claims 1-3, 5-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Glass et al., US 2005/0060643 (hereinafter Glass) in view of Lecerf et al., US 2010/0150448 (hereinafter Lecerf).

For claim 1, Glass teaches a computer system, comprising: 
a processor; and 
a non-transitory computer-readable medium having stored thereon instructions that are executable by the processor to cause the computer system to perform operations comprising: 
accessing a plurality of character strings (see [0195] – [0196], [0215] – [0218], accessing “sample documents (sample messages)” representing strings for classification); 
tokenizing each of the character strings into one or more respective sub-string tokens (see [0240] “Each finger represents a partial document content feature that has been extracted according to one or more document parsing rules”, [0247], fingers are “strings of characters representing portions” of associated document/message, and where fingers represent sub-string tokens); 
storing the sub-string tokens for each of the character strings in an initial data store (see Fig. 5, [0240], [0307], “fingerprints for each message body finger are then stored, along with a message ID code, as part of a database record representing a profile of the message”);
processing the initial data store to produce a token probability lookup table (see [0275] – [0277], [0320], “A database table including a list of common fingers and their hash codes is maintained by the system administrator”, [0391], “The handprints may be stored in a database table” where database table represents token probability lookup table), wherein the processing includes storing, in the token probability lookup table, sub-string tokens and a count associated with each sub-string token (see [0275] – [0277], “Certain document metadata is extracted from each message during the handprinting process” including “A finger count is derived and is useful for comparing the number of fingers in one message to the number of fingers in another message” where finger count represents count for each sub-string associated with handprint that is stored in database table, [0285] “Creating handprints, or profiles”, [0320], [0391]);
receiving a request to classify an unprocessed character string (see [0195] – [0197], “preferred embodiment the invention may be used to classify email messages in support of a message filtering or classification objective” for an “unclassified document”); 
tokenizing the unprocessed character string (see [0196] - [0198], “unclassified documents are automatically processed by first removing insignificant content, according to a content significance rule set. Documents then are partitioned into a set of content chunks according to a content chunk rule set. Chunks then may have additional content removed according to additional content significance rules that are dependent on chunk types”); and 
producing a cleaned string from the unprocessed character string by discarding any tokens from the unprocessed character string that are not present in the token probability lookup table (see [0195] – [0198], “unclassified documents are automatically processed by first removing insignificant content, according to a content significance rule set”, [0269] “‘Noise fingers’ represent content chunks within messages containing insignificant character sequences or subsequences” and where “insignificant or obfuscating content may be removed by an automated document noise stripping process”, [0295] – [0296] “Noise stripping”, [0320], [0391], fingerprints identifying noise stored in database table for processing messages, where the examiner interprets using the fingerprints database table to remove insignificant content as discarding tokens not present in the lookup table).
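
For illustration only, the operations recited in claim 1 can be sketched in Python as follows. This is a minimal sketch under assumptions that appear in neither the claims nor Glass: the delimiter-based tokenizer, the function names (tokenize, build_token_table, clean_string), and the sample strings are all hypothetical, and the sketch omits the threshold probability limitation.

```python
import re
from collections import Counter


def tokenize(s):
    """Split a character string into sub-string tokens.

    Assumption: tokens are runs of letters/digits; the claims do not fix a rule.
    """
    return [t for t in re.split(r"[^A-Za-z0-9]+", s.lower()) if t]


def build_token_table(strings):
    """Tokenize every string, keep the tokens in an initial store,
    then reduce that store to a table of token -> count."""
    initial_store = [tokenize(s) for s in strings]
    counts = Counter()
    for tokens in initial_store:
        counts.update(tokens)
    return dict(counts)


def clean_string(unprocessed, table):
    """Discard any token of the unprocessed string that is absent from the table."""
    return " ".join(t for t in tokenize(unprocessed) if t in table)


# Hypothetical sample data.
samples = ["example.com/login/account", "example.com/login/help"]
table = build_token_table(samples)
cleaned = clean_string("example.com/login/x9f3k", table)
print(cleaned)
```

In this sketch the token "x9f3k" never appears in the sample strings, so it is dropped when the unprocessed string is cleaned.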

Lecerf teaches a method of producing a probability lookup table (see Lecerf, [0066], “input dataset is mined in the unsupervised mode, in order to extract frequent patterns from the dataset of text sequences” includes determining “patterns based on their frequency of occurrence”, [0082], “basic patterns which are factual fragments occurring k times (the minimum support) in the dataset” and “A factual (basic) pattern represents a single, contiguous fragment, i.e., part or all of a text sequence, which is not generalized so as to read on a plurality of different fragments. Each occurrence of a factual pattern may be stored as a tuple (i.e., a set of ordered elements) of the form <featureID, (b,e)>, where ‘featureID’ corresponds to a unique identifier that denotes a feature” representing a probability lookup table, [0106], “a pattern p is considered frequent, with probability...and is infrequent with probability...” where frequently occurring sub-string tokens collected via a probability calculation represent a “token probability lookup table”) where “each sub-string stored in the token probability lookup table meeting a threshold probability level and discarding sub-string tokens that do not meet the threshold probability level” (see Lecerf, [0027], “feature extraction includes mining the dataset for the frequent patterns and selecting those most relevant to the classification task” and where selection includes “minimum support threshold”, [0117] - [0121], “filter out the less relevant features” based on calculation that includes “prior probabilities for X values” and where “features which yield values of SU(Y,X) which are below a threshold value SU.sub.t are pruned, leaving a reduced set of features with relatively higher values” where filtering of sub-string tokens based on probability values that are “below a threshold value” represents the storing tokens over a threshold and discarding tokens that do not meet the threshold, [0127], “selects relevant features, by keeping features with a threshold value SU.sub.t which may be a user-selected threshold value”).
It would have been obvious to one skilled in the art at the time of the invention to modify the teachings of Glass with the teachings of Lecerf to build recognition of certain patterns, verified by algorithms, for subsequent comparison with incoming data (see Lecerf, [0003] – [0004], [0018], “frequent pattern extractor configured for extracting frequent patterns from an input dataset of extracted text sequences”, [0106], [0117] - [0127], “reduced set of features with relatively higher values” and “selects relevant features, by keeping features with a threshold value SU.sub.t which may be a user-selected threshold value”).
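
As a hedged illustration of the threshold limitation discussed above, the following Python sketch keeps only sub-string tokens whose probability meets a threshold and discards the rest. It assumes that a token's probability is its relative frequency across all stored tokens; that choice, the function name build_probability_table, and the sample data are hypothetical, and the sketch does not implement the SU(Y,X) measure cited from Lecerf.

```python
from collections import Counter


def build_probability_table(token_lists, threshold):
    """Return {token: (count, probability)} for tokens meeting the threshold.

    Assumption: probability = token count / total number of stored tokens.
    Tokens below the threshold are discarded rather than stored.
    """
    counts = Counter(t for tokens in token_lists for t in tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        token: (count, count / total)
        for token, count in counts.items()
        if count / total >= threshold
    }


# Hypothetical initial data store: one token list per character string.
initial_store = [
    ["example", "com", "login"],
    ["example", "com", "help"],
]
prob_table = build_probability_table(initial_store, threshold=0.2)
print(sorted(prob_table))
```

With six stored tokens in total, "example" and "com" each have probability 2/6 and are kept, while "login" and "help" each have probability 1/6 and fall below the 0.2 threshold.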

For claims 8 and 16, Glass teaches a method, comprising: 
receiving, at a computer system, an unprocessed character string (see [0195] – [0197], “preferred embodiment the invention may be used to classify email messages in support of a message filtering or classification objective” for an “unclassified document”); 
the computer system tokenizing the unprocessed character string, including dividing the unprocessed character string into a plurality of sub-string tokens (see [0196] - [0198], “unclassified documents are automatically processed by first removing insignificant content, according to a content significance rule set. Documents then are partitioned into a set of content chunks according to a content chunk rule set. Chunks then may have additional content removed according to additional content significance rules that are dependent on chunk types”); and 
producing a cleaned string from the unprocessed character string by discarding any tokens from the unprocessed character string that are not present in a token probability lookup table (see [0195] – [0198], “unclassified documents are automatically processed by first removing insignificant content, according to a content significance rule set”, [0269] “‘Noise fingers’ represent content chunks within messages containing insignificant character sequences or subsequences” and where “insignificant or obfuscating content may be removed by an automated document noise stripping process”), wherein the token probability lookup table was created using operations comprising: 
accessing a plurality of character strings (see [0195] – [0196], [0215] – [0218], accessing “sample documents (sample messages)” for classification); 
tokenizing each of the character strings into one or more respective sub-string tokens (see [0240] “Each finger represents a partial document content feature that has been extracted according to one or more document parsing rules”, [0247], where fingers are “strings of characters representing portions” of associated document/message, where fingers represent sub-string tokens); 
storing the sub-string tokens for each of the character strings in an initial data store (see Fig. 5, [0240], [0307], “fingerprints for each message body finger are then stored, along with a message ID code, as part of a database record representing a profile of the message”); and 
processing the initial data store to produce a token probability lookup table (see [0320], “A database table including a list of common fingers and their hash codes is maintained by the system administrator”, [0391], “The handprints may be stored in a database table”), wherein the processing includes storing, in the token probability lookup table, sub-string tokens and a count associated with each sub-string token (see [0275] – [0277], “Certain document metadata is extracted from each message during the handprinting process” including “A finger count is derived and is useful for comparing the number of fingers in one message to the number of fingers in another message” where finger count represents count for each sub-string associated with handprint that is stored in database table, [0285] “Creating handprints, or profiles”, [0320], [0391]).

Lecerf teaches a method of producing a probability lookup table (see Lecerf, [0066], “input dataset is mined in the unsupervised mode, in order to extract frequent patterns from the dataset of text sequences” includes determining “patterns based on their frequency of occurrence”, [0082], “basic patterns which are factual fragments occurring k times (the minimum support) in the dataset” and “A factual (basic) pattern represents a single, contiguous fragment, i.e., part or all of a text sequence, which is not generalized so as to read on a plurality of different fragments. Each occurrence of a factual pattern may be stored as a tuple (i.e., a set of ordered elements) of the form <featureID, (b,e)>, where ‘featureID’ corresponds to a unique identifier that denotes a feature” representing a probability lookup table, [0106], “a pattern p is considered frequent, with probability...and is infrequent with probability...” where frequently occurring sub-string tokens collected via a probability calculation represent a “token probability lookup table”) where for “each sub-string stored in the token probability lookup table meeting a threshold probability level” (see Lecerf, [0027], “feature extraction includes mining the dataset for the frequent patterns and selecting those most relevant to the classification task” and where selection includes “minimum support threshold”, [0117] - [0121], “filter out the less relevant features” based on calculation that includes “prior probabilities for X values” and where “features which yield values of SU(Y,X) which are below a threshold value SU.sub.t are pruned, leaving a reduced set of features with relatively higher values” where filtering of sub-string tokens based on probability values that are “below a threshold value” represents the storing tokens over a threshold and discarding tokens that do not meet the threshold, [0127], “selects relevant features, by keeping features with a threshold value SU.sub.t which may be a user-selected threshold value”).
It would have been obvious to one skilled in the art at the time of the invention to modify the teachings of Glass with the teachings of Lecerf to build recognition of certain patterns, verified by algorithms, for subsequent comparison with incoming data (see Lecerf, [0003] – [0004], [0018], “frequent pattern extractor configured for extracting frequent patterns from an input dataset of extracted text sequences”, [0106], [0117] - [0127], “reduced set of features with relatively higher values” and “selects relevant features, by keeping features with a threshold value SU.sub.t which may be a user-selected threshold value”).

For claims 2 and 17, the combination teaches wherein the operations further comprise: processing the plurality of character strings into a cleaned strings table, including eliminating tokens from the plurality of character strings that are 

For claims 3 and 18, the combination teaches wherein each of the plurality of character strings is labeled as belonging to a category in a plurality of categories (see Glass, [0325], “Non-noise fingers contained in sample messages from the database are identifiable by subjective classification labels associated with each finger”; Lecerf, [0037], “automatically assign an appropriate label from a set of labels to extracted segments, particularly text sequences”); and 
wherein processing the plurality of character strings into the cleaned strings table comprises: 
avoiding any duplicate strings being stored in the cleaned strings table (see Glass, [0231], “each sample message is checked to identify and discard new sample messages that are duplicates of or substantially similar to previously received sample messages”; Lecerf, [0017], [0122], “An efficient feature selection method should cope with both irrelevant and redundant features. The set of features may be further or alternatively reduced by removing redundant features”); and 
labeling each of the strings in the cleaned strings table, according to a model, as belonging to one of the plurality of categories (see Glass, [0325], “Non-noise fingers contained in sample messages from the database are identifiable by subjective classification labels associated with each finger”; Lecerf, [0037] – [0038], “set of features can be used by a classifier to learn a model which can then be used to label text sequences with appropriate class labels”). 

For claims 5, 12 and 20, the combination teaches receiving a user request associated with the unprocessed character string (see Glass, [0195] – [0197], “preferred embodiment the invention may be used to classify email messages in support of a message filtering or classification objective” for an “unclassified document”); categorizing the cleaned string into one of a plurality of categories based on a model (see Glass, [0325], “Non-noise fingers contained in sample messages from the database are identifiable by subjective classification labels associated with each finger”; Lecerf, [0037] – [0038], “set of features can be used by a classifier to learn a model which can then be used to label text sequences with appropriate class labels”); and causing an action to be taken in response to the user request based on the category for the cleaned string (see Glass, [0348], “Various message characteristics, such as characteristics of known non-spam messages, may be used to determine whether a new sample message should be subjected to more than one review. In this embodiment unanimous agreement on message reviews would be required in order for message reviews to be considered complete. Lack of unanimous agreement would trigger an alert, requiring administrator intervention to resolve a disputed review”).

automatically accept or reject messages based on a profile of the user that the user has permitted to reside within the filter”, [0200], “If the threshold value is not exceeded then a null classification or other non-specific classification is assigned to the unclassified document” and where a null classification may also represent a denied user request to classify a message).

For claim 7, the combination teaches the computer system of claim 1, wherein the plurality of character strings comprises uniform resource locators (URLs) accessed by users of an electronically provided service (see Glass, [0252] – [0262], “Subdividing link fingers into subfingers provides greater granularity to the similarity detection process, which sometimes is needed to expose recurring content contained in links that is partially obscured by variable content within links”, [0271], “finger containing...URL”, [0299], “Link fingers, including URLs”). 

For claim 9, the combination teaches the method of claim 8, wherein the unprocessed character string is a web uniform resource locator (URL) corresponding to a web application (see Glass, [0252] – [0262], “Subdividing link fingers into subfingers provides greater granularity to the similarity detection process, which sometimes is needed to expose recurring content contained in links that is partially obscured by variable content within links”, [0271], “finger containing...URL”, [0299], “Link fingers, including URLs”). 

For claim 10, the combination teaches the method of claim 8, further comprising: labeling the cleaned string as belonging to one of a plurality of categories (see Glass, [0325], “Non-noise fingers contained in sample messages from the database are identifiable by subjective classification labels associated with each finger”; Lecerf, [0037] – [0038], “set of features can be used by a classifier to learn a model which can then be used to label text sequences with appropriate class labels”). 

For claim 11, the combination teaches the method of claim 10, wherein the labeling is performed based on a cleaned strings table created from a learning model, wherein strings in the cleaned strings table are labeled with various ones of the plurality of categories (see Glass, [0240] – [0244], “finger model” to classify and label messages, [0325]; Lecerf, [0037] – [0038], “set of features can be used by a classifier to learn a model which can then be used to label text sequences with appropriate class labels”). 

For claim 13, the combination teaches the method of claim 12, wherein causing the action to be taken includes transmitting the category for the cleaned string via an electronic communications network (see Glass, [0325], “subjective classification labels associated with each finger”, [0334], “In the subsequent manual review process the handprint may be altered by further interpretation of the content and by adding subjective classification labels to the handprint 

For claim 14, the combination teaches the method of claim 12, wherein the action comprises an escalation of a risk level for a transaction requested via the user request (see Glass, [0348], “Various message characteristics, such as characteristics of known non-spam messages, may be used to determine whether a new sample message should be subjected to more than one review. In this embodiment unanimous agreement on message reviews would be required in order for message reviews to be considered complete. Lack of unanimous agreement would trigger an alert, requiring administrator intervention to resolve a disputed review”).

Claims 4 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Glass et al., US 2005/0060643 (hereinafter Glass) and Lecerf et al., US 2010/0150448 (hereinafter Lecerf) and further in view of Zheng et al., US 2010/0145900 (hereinafter Zheng).

For claims 4 and 19, Zheng teaches wherein the model used to label the strings in the cleaned table is a Bayesian network (see [0020], “directed towards classifying messages as spam or non-spam using a two phased approach. The first phase employs a statistical classifier, such as a modified Bayesian classifier, to classify messages based on message content”).  It would have been obvious to one skilled in the art at the time of the invention to modify the teachings of Glass and Lecerf with the teachings of Zheng to provide an alternative classifier of messages that may or may not contain noise/spam (see Zheng, [0001], [0020]).


Response to Arguments

Applicant’s arguments with respect to claim(s) rejected under 35 U.S.C. 103 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.


Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Lloyd, US 8,095,530

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JENSEN HU whose telephone number is (571)270-3803.  The examiner can normally be reached on Monday - Friday 9-5 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at 571-272-4046.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.