Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3 and 9-12 are rejected under 35 U.S.C. 103 as being unpatentable over Jaiswal et al. (US 8688601 B1), hereinafter Jaiswal, in view of Finley et al. (“Supervised Clustering with Support Vector Machines”), hereinafter Finley.
Regarding claim 1, Jaiswal teaches a method comprising: 
obtaining training data comprising (i) a training list of a plurality of training items and (ii) a plurality of training input documents, wherein each training input document of the plurality of training input documents is a match with a different corresponding training item of the plurality of training items; (Section: Abstract; “obtaining a training data set for each specific category of sensitive information that includes a plurality of positive and a plurality of negative examples of the specific category of sensitive information, (3) using machine learning to train, based on an analysis of the training data sets, at least one machine learning-based classifier that is capable of detecting items of data that contain one or more of the plurality of specific categories of sensitive information, and then (4) deploying the machine learning-based classifier within the DLP system to enable the DLP system to detect and protect items of data that contain one or more of the plurality of specific categories of sensitive information in accordance with at least one DLP policy of the DLP system.” Thus, the reference’s ‘categories of sensitive information’, each of which is represented by positive and negative examples, correspond to the limitation’s training list of training items. Moreover, the reference’s training data sets correspond to the limitation’s training input documents, since the data sets are the inputs used to train the machine learning-based classifier.)
identifying features of the plurality of training items of the training list; (Col. 2, Para. 2; “In one embodiment, using machine learning to train the machine learning-based classifier may include, for each training data set, (1) extracting a feature set from the training data set that includes statistically significant features of the positive examples within the training data set and statistically significant features of the negative examples within the training data set and then (2) building a machine learning-based classification model from the feature set that is capable of indicating whether or not items of data contain the specific category of sensitive information associated with the training data set. In some embodiments, the negative examples within a particular training data set may represent the positive examples from all other training data sets” where the reference’s extracting of a feature set, i.e., the characteristics of an item used to determine its specific category of sensitive information, corresponds to the limitation’s identifying the features of the training items.)
for each of the plurality of training input documents, identifying values of the features; (Col. 9, Para. 6; “The term "feature," as used herein, may refer to any characteristic of an item of data that may be used to determine whether the item of data falls within one or more specific categories of sensitive information. Examples of such features include, without limitation, a word (e.g., "proprietary"), a pair of words (e.g., "stock market"), a phrase (e.g., "please do not distribute"), etc.” The reference’s item of data, which is evaluated to determine whether it falls within a category of sensitive information, corresponds to the limitation’s input document that is compared with other documents for matching. The specification of the invention states that a feature is merely a variable of the data, such as a total, a date, or a company name [Para. 0063, Lines 8-13], whereas the features in the reference, such as "proprietary", "stock market", or "please do not distribute", are values classified as a word, a pair of words, or a phrase.)
storing the trained machine learning model for use in matching additional input documents with one of a plurality of items in a prediction list. (Col. 9, Para. 5; “The systems described herein may perform step 306 in a variety of ways and contexts. In one example, the systems described herein may train the machine learning-based classifier by, for each training data set obtained in step 304, (1) extracting a feature set from the training data set that includes statistically significant features of the positive examples within the training data set and statistically significant features of the negative examples within the training data set and then (2) using the feature set to build a machine learning-based classification model that is capable of indicating whether or not new items of data contain information that falls within the specific category of sensitive information associated with the training data set,” where the reference builds a machine learning-based classification model that is capable of classifying new items of data. The new items of data correspond to the limitation’s additional input documents, the machine learning-based classification model corresponds to the limitation’s trained machine learning model, and the categories of sensitive information against which an item of data is matched correspond to the limitation’s prediction list.)
Regarding the further limitation in claim 1, Jaiswal teaches training a machine learning model for matching each training input document with the corresponding training item. Jaiswal does not explicitly teach training the machine learning model by learning a parameterized similarity measure, wherein the parametrized similarity measure represents a degree of match between the values of the features of a given training input document and the corresponding training item.
However, Finley teaches training a machine learning model for matching each training input document with the corresponding training item by learning a parameterized similarity measure, wherein the parametrized similarity measure represents a degree of match between the values of the features of a given training input document and the corresponding training item; (Pg. 219, Section: Model, “In our supervised clustering method, we hold the clustering algorithm constant and modify the similarity measure so that the clustering algorithm produces desirable clusterings. Our similarity measure Simw, parameterized by w, maps pairs of items to a real number indicating how similar the pair is; positive values indicate the pair is alike, negative values, unalike. Each pair of different items xa, xb ∈ x has a feature vector φ(xa, xb) ≡ φa,b to describe the pair. The similarity measure is Simw (xa, xb) = wT φa,b.” in which the reference modifies the similarity measure so that the clustering algorithm produces desirable clusterings. The reference also teaches parameterizing the similarity measure by the weights w, which map each pair of items to a real number indicating how similar the pair is [Pg. 219, Section: Model, Lines 4-10].)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of training a machine learning model for matching documents disclosed by Jaiswal to include the parameterized similarity measure taught by Finley, because optimizing to a custom loss function and exploiting transitive dependencies in the data improves performance compared to a naïve classification approach [Page 8, Section: Conclusion].
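For illustration only, the linear pairwise similarity quoted from Finley (a weight vector applied to a feature vector describing a pair) can be sketched as follows. The field names `total` and `vendor`, the feature map `pair_features`, and the helper `best_match` are hypothetical and appear in neither reference; the weights are supplied by hand here rather than learned.

```python
from typing import Dict, List, Sequence


def pair_features(doc: Dict, item: Dict) -> List[float]:
    """Hypothetical pairwise feature vector phi(doc, item); not taken from either reference."""
    total_gap = abs(doc["total"] - item["total"])
    same_vendor = 1.0 if doc["vendor"].lower() == item["vendor"].lower() else 0.0
    # A larger gap in totals should lower the similarity, so it enters negated.
    return [-total_gap, same_vendor]


def similarity(w: Sequence[float], phi: Sequence[float]) -> float:
    """Linear similarity: the dot product of the weights w and the pair features phi."""
    return sum(wi * pi for wi, pi in zip(w, phi))


def best_match(w: Sequence[float], doc: Dict, items: List[Dict]) -> int:
    """Index of the item with the highest similarity score for the document."""
    scores = [similarity(w, pair_features(doc, item)) for item in items]
    return max(range(len(items)), key=scores.__getitem__)


# Assumed, hand-picked weights; a learner would fit these from training pairs.
weights = [1.0, 2.0]
receipt = {"total": 10.0, "vendor": "Acme"}
statement_lines = [
    {"total": 10.5, "vendor": "Other"},
    {"total": 10.0, "vendor": "ACME"},
]
matched_index = best_match(weights, receipt, statement_lines)
```

In this sketch a positive score indicates a likely match and the highest-scoring item is selected; how the weights are trained is outside its scope.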
Regarding claim 2, Jaiswal-Finley teaches all the features with respect to claim 1, as outlined above.
Finley teaches the machine learning model comprising one of a structural-support vector machine, neural network, and random forest. (Pg. 219, Section: Learning Algorithm, “The structural SVM algorithm provides a general framework for learning with complex structured output spaces (Tsochantaridis et al., 2004). The method we present in the sequel is implemented in the software SVMstruct available from http://svmlight.joachims.org/svm_struct.html. We describe how to apply this method to supervised clustering. We refer to our method as SVMcluster (SVM supervised clustering). The structural SVM algorithm solves this quadratic program: [equation image omitted]” where the reference states that the structural support vector machine algorithm provides a general framework for learning with complex structured output spaces. The structural support vector machine in the reference corresponds to the limitation because the reference’s structural SVM scores the correct clustering higher than any incorrect clustering [Pg. 219, Section: Learning Algorithm, Para. 4], and the limitation’s structural SVM likewise aims to find the highest score, which corresponds to the closest match.)
The reasons of obviousness have been noted in the rejection of Claim 1, above and are applicable herein.
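For reference, the structural SVM quadratic program of Tsochantaridis et al. (2004), the work cited in the quoted passage, is commonly written in the margin-rescaling form below. This is offered as an assumption about the general shape of the program, not as a transcription of the equation image in the quotation, which is not recoverable from this text. The symbols (weight vector w, slack variables ξ_i, trade-off constant C, joint feature map Ψ, and loss Δ) follow standard notation.

```latex
\begin{aligned}
\min_{\mathbf{w},\,\boldsymbol{\xi}} \quad & \frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i \\
\text{s.t.} \quad & \forall i:\ \xi_i \ge 0, \\
& \forall i,\ \forall y \in \mathcal{Y} \setminus y_i:\
  \mathbf{w}^{\top} \Psi(x_i, y_i) \ \ge\ \mathbf{w}^{\top} \Psi(x_i, y) + \Delta(y_i, y) - \xi_i
\end{aligned}
```

Read this way, each constraint requires the correct output y_i to score higher than any incorrect output y by a margin that grows with the loss Δ(y_i, y).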
Regarding claim 3, Jaiswal-Finley teaches all the features with respect to claim 1, as outlined above.
Finley teaches learning the parametrized similarity measure comprises optimizing the parameterized similarity measure such that a correct matching between the training input document with the corresponding training item has a highest score out of all matches between the training input document and different ones of the plurality of training items. 
(Pg. 217, Section: Introduction, “Clustering algorithms accept a set of items and produce a partitioning of that set. Two items in the same partition should be more similar than two items not in the same partition. However, sometimes what “similar” means is unclear. When clustering news articles, a user could want articles clustered either by topic, or by author, or by language, etc. A clustering algorithm may not produce desirable clusterings without additional information from the user. One way to provide this information is through manual adjustment of the clustering algorithm or similarity measure. However, manual adjustment of similarity can be difficult if articles are described by many attributes of indeterminate relevance. While users often cannot easily specify the similarity measure, they can often provide examples of what constitutes the “correct” clustering of a set. We take the approach of learning to cluster from user provided example clusterings. In this paper, we present an SVM algorithm for supervised clustering. This algorithm learns an item-pair similarity measure to optimize performance of correlation clustering (Bansal et al., 2002) on a variety of performance measures. Since clustering is NP-hard, we present and empirically evaluate approximation methods to use when learning,” in which the reference notes that manual adjustment of the similarity measure is difficult when articles are described by many attributes of indeterminate relevance. The reference instead learns the item-pair similarity measure from user-provided example clusterings using a support vector machine, so as to optimize the performance of correlation clustering. In this way, adjusting the parameters according to the example clusterings corresponds to the limitation’s optimizing of the parametrized similarity measure for matching the training input document against the plurality of training items.)
The reasons of obviousness have been noted in the rejection of Claim 1, above and are applicable herein.
Regarding claim 9, Jaiswal-Finley teaches all the features with respect to claim 1, as outlined above.
Jaiswal further teaches the features comprise one or more of total, vendor, and date. (Col. 15, Para. 2, “Any of a variety of portions of an item of data or document may be redacted, including a document's file name, the title of a document, the body or content of a document, a timestamp associated with a document, the subject, body, recipient list, and/or attachment of an email, or the like” and [Col. 8, Lines 1-16] “Similarly, the term "category of sensitive information," as used herein, may refer to a specific administrator or machine-defined division or classification of sensitive information. Examples of categories of sensitive information include, without limitation, sensitive legal information (e.g., license agreements, sales agreements, partnership agreements, etc.), sensitive financial information (sales reports, loan applications, etc.), sensitive marketing information (e.g., marketing plans, product launch timelines, etc.), sensitive technical information (e.g., proprietary source code, product documentation, product formulas, training models, actuary algorithms, etc.), sensitive human-resource information (e.g., insurance claims, billing codes and/or procedures, patient health information, and personal information, such as Social Security numbers, credit card numbers, personal addresses, and resumes), or the like” thus the reference’s ‘recipient list’ as well as ‘timestamp associated with a document’ correspond to the claim’s limitation.)
The reasons of obviousness have been noted in the rejection of Claim 1, above and are applicable herein.
Regarding claim 10, Jaiswal-Finley teaches all the features with respect to claim 1, as outlined above.
Finley teaches that learning the parameterized similarity measure comprises learning a separate parameterized similarity measure for each of the features. (Pg. 219, Section: Model, “In our supervised clustering method, we hold the clustering algorithm constant and modify the similarity measure so that the clustering algorithm produces desirable clusterings. Our similarity measure Simw, parameterized by w, maps pairs of items to a real number indicating how similar the pair is; positive values indicate the pair is alike, negative values, unalike. Each pair of different items xa, xb ∈ x has a feature vector φ(xa, xb) ≡ φa,b to describe the pair. The similarity measure is Simw (xa, xb) = wT φa,b.” in which the reference modifies the similarity measure so that the clustering algorithm produces desirable clusterings, which refers to setting the parameters of the similarity measure to the values desired for each separate desired clustering, and which corresponds to the limitation. The reference also teaches parameterizing the similarity measure by the weights w, which map each pair of items to a real number indicating how similar the pair is [Pg. 219, Section: Model, Lines 4-10].)
The reasons of obviousness have been noted in the rejection of Claim 1, above and are applicable herein.
Regarding claim 11, Jaiswal-Finley teaches all the features with respect to claim 1, as outlined above.
Jaiswal teaches a computer system programmed to perform the process of claim 1. (Col. 2, Para. 6, “In some embodiments, deploying the machine learning-based classifier within the DLP system may include providing the machine learning-based classifier as part of the DLP policy to a DLP agent installed on at least one client device and/or a DLP engine installed on at least one server configured to monitor a plurality of client devices. The computer-implemented method may also include, upon deploying the machine learning-based classifier within the DLP system” in which the computer-implemented method in the reference corresponds to the limitation’s computer system programmed to perform the method.)
The reasons of obviousness have been noted in the rejection of Claim 1, above and are applicable herein.
Regarding claim 12, Jaiswal-Finley teaches all the features with respect to claim 1, as outlined above.
Jaiswal teaches a non-transitory computer storage comprising executable code that directs a computing system to perform the process of claim 1. (Col. 24, Claim 20, “A non-transitory computer-readable-storage medium comprising one or more computer-executable instructions that, when executed by at least one processor of a computing device, cause the computing device to: identify a plurality of specific categories of sensitive information to be protected by a data loss prevention (DLP) system” where the reference’s non-transitory computer-readable storage medium holds instructions, such as instructions to identify sensitive information, and corresponds to the limitation’s non-transitory computer storage.)
The reasons of obviousness have been noted in the rejection of Claim 1, above and are applicable herein.
Claims 4-8 are rejected under 35 U.S.C. 103 as being unpatentable over Jaiswal et al. (US 8688601 B1), hereinafter Jaiswal, in view of Finley et al. (“Supervised Clustering with Support Vector Machines”), hereinafter Finley, and in further view of Elfeky et al. (“TAILOR: A Record Linkage Toolbox”), hereinafter Elfeky.
Regarding claim 4, Jaiswal-Finley teaches all the features with respect to claim 1, as outlined above. Jaiswal-Finley does not explicitly teach accessing the prediction list; accessing an additional input document; and using the trained machine learning model to match the additional input document with one of the plurality of items in the prediction list.
However, Elfeky teaches accessing the prediction list and an additional input document, and using the trained machine learning model to match the additional input document with one of the plurality of items in the prediction list. (Pg. 5, Section: 3.3, “The third model proposed in this paper is the hybrid record linkage model. Such a model combines the advantages of both the induction and the clustering record linkage models. Supervised learning gives more accurate results for pattern classification than unsupervised learning. However, supervised learning relies on the presence of a training set, which is not available in practice for many applications. Unsupervised learning can be used to overcome this limitation by applying the unsupervised learning on a small set of patterns in order to predict the class of each unclassified pattern, i.e., a training set is generated. The proposed hybrid record linkage model proceeds in two steps. In the first step, clustering is applied to predict the matching status of a small set of record pairs. A training set is formed as {<c, f(c)>} where c is a comparison vector and f(c) is the predicted matching status of its corresponding record pair, i.e., f(c) ∈ {M, U, P} where P denotes a possible matched record pair, and M and U are as before. In the second step, a classifier is employed to build a classification model just like the induction record linkage model” thus the reference’s ‘supervised learning’ corresponds to the limitation’s trained machine learning model. Moreover, the reference’s ‘matching status of a small set of record pairs’ and ‘training set’ correspond to the limitation’s prediction list and additional input document.)
The disclosures of Jaiswal-Finley and Elfeky, hereinafter JFE, are analogous art to the claimed invention because they are in the same field of training a machine learning model to match an input to a corresponding training item.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of training a machine learning model for matching documents disclosed by Jaiswal-Finley to include the user’s access to the prediction list and the additional input document taught by Elfeky. One of ordinary skill in the art would have been motivated to combine the method of training a machine learning model for matching each training input document with the user’s access to the prediction list and additional input document disclosed by Elfeky, to give the user the flexibility to match and compare the input using a classification algorithm and thereby enhance the performance of the retrieval system. Elfeky’s approach strengthens the classification accuracy of matching through machine learning with clustering and hybrid record linkage models [Page 11, Section: Conclusion].
Regarding claim 5, JFE teaches all the features with respect to claim 4, as outlined above. 
Elfeky teaches generating a user interface including: an indication of the match determined between the additional input document and the one of the plurality of items in the prediction list; a user selectable element to confirm the match; and a user selectable element to deny the match. (Pg. 8, Section 4.2, “TAILOR provides its users with two different ways for interacting with the system. The users can use either a definition language or a graphical user interface. In either way, the user is able to select a searching method, a comparison function, and a decision model, as well as to tune all the required parameters. The values of the parameters determine the functionality of the various components described in Section 4.1. For example, in order for the users to make use of the sorted neighborhood searching method, they should specify values for the two parameters: the sorting key and the window size. Because of space limitations, we are not providing a full description of the interface. However, we refer the interested reader to [8]” in which the reference states that users may interact with the system through either a definition language or a graphical user interface. The reference gives the user the option to select and interact with the system to compare records and identify the matching or non-matching pairs.)
The reasons of obviousness have been noted in the rejection of Claim 4, above and are applicable herein.
Regarding claim 6, JFE teaches all the features with respect to claim 5, as outlined above. 
Elfeky teaches, upon receiving an indication of a user selection of the user selectable element to confirm the match, removing the one of the plurality of items from the prediction list. (Pg. 6, Section 4.1.2.2, “The Hamming distance function cannot be used for variable length fields since it does not take into account the possibility of a missing letter, e.g., “John” and “Jon”, or an extra letter, e.g., “John” and “Johhn”. The edit distance between two strings is the minimum cost to convert one of them to the other by a sequence of character insertions, deletions, and replacements. Each one of these modifications is assigned a cost value” thus the reference states the system’s capability to convert one string to the other by insertions, deletions, and replacements, which explicitly corresponds to the limitation of the claim on removing the items from the prediction list.)
The reasons of obviousness have been noted in the rejection of Claim 5, above and are applicable herein.
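As a point of reference for the edit distance described in the passage quoted from Elfeky, a minimal sketch of the standard dynamic-programming computation follows. The function name `edit_distance` and the unit default costs are assumptions; the reference states only that each modification is assigned a cost value and does not give the values.

```python
def edit_distance(a: str, b: str, ins_cost: int = 1, del_cost: int = 1, sub_cost: int = 1) -> int:
    """Minimum total cost of insertions, deletions, and replacements turning a into b."""
    # prev[j] holds the cost of converting the first i-1 characters of a
    # into the first j characters of b; row 0 is all insertions.
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i, ch_a in enumerate(a, start=1):
        curr = [i * del_cost]  # converting a[:i] into the empty string
        for j, ch_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + del_cost,                             # delete ch_a
                curr[j - 1] + ins_cost,                         # insert ch_b
                prev[j - 1] + (0 if ch_a == ch_b else sub_cost),  # keep or replace
            ))
        prev = curr
    return prev[-1]


missing_letter = edit_distance("John", "Jon")   # one deletion
extra_letter = edit_distance("John", "Johhn")   # one insertion
```

Both name pairs from the quoted passage come out at a distance of one modification under unit costs, which is the behavior the passage contrasts with the Hamming distance.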
Regarding claim 7, JFE teaches all the features with respect to claim 5, as outlined above. 
Elfeky teaches receiving an indication of a user selection of the user selectable element to deny the match. (Pg. 6, Para. 1, “They should be intelligent enough to exclude any record pair whose two records completely disagree, i.e., to exclude any record pair that cannot be a potentially matched pair. The selected record pairs are provided to the comparison functions to perform component-wise comparison of each record pair, and hence generate the comparison vectors. Then, the decision model is applied to predict the matching status of each comparison vector. Last, an evaluation step, to estimate the performance of the decision model, is performed” thus the reference’s ‘exclude any record pair that cannot be a potentially matched pair’ corresponds to the limitation’s user selectable element to deny the match.)
As to an additional limitation of claim 7, Elfeky teaches retrieving a next potential match between the additional input document and a different one of the plurality of items in the prediction list. (Pg. 4, Section 3.2, Para. 2, “The clustering record linkage model considers each comparison vector as a point in n-dimensional space, where n is the number of components in each record. A clustering algorithm, such as k-means clustering, is used to cluster those points into three clusters, one for each possible matching status, matched, unmatched, and possibly matched. After applying the clustering algorithm to the set of comparison vectors, the issue is to determine which cluster represents which matching status.” where the reference states that the clustering algorithm assigns each comparison vector to one of the clusters, one of which represents the matched status between the components of each record, which corresponds to the limitation of the claim.)
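To make the quoted clustering step concrete, the sketch below runs a plain k-means over comparison vectors and names the three resulting clusters. Several points are assumptions not found in the reference: the comparison vectors are taken to be agreement scores in [0, 1], the initial centroids are supplied by the caller, and clusters are named by ranking their centroid sums (highest as matched, lowest as unmatched). The quoted passage leaves the assignment of clusters to matching statuses as an open issue.

```python
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]


def _sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points: List[Point], centroids: List[Point], iterations: int = 20) -> Tuple[List[List[float]], List[int]]:
    """Plain k-means with caller-supplied initial centroids; returns centroids and labels."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda c: _sq_dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def name_clusters(centroids: List[List[float]]) -> Dict[int, str]:
    """Assumed rule: rank three clusters by centroid sum, lowest to highest."""
    order = sorted(range(len(centroids)), key=lambda c: sum(centroids[c]))
    names = ["unmatched", "possibly matched", "matched"]
    return {cluster: name for cluster, name in zip(order, names)}


# Hypothetical comparison vectors (per-field agreement scores).
vectors = [[1.0, 1.0], [0.9, 1.0], [0.5, 0.5], [0.4, 0.6], [0.0, 0.1], [0.1, 0.0]]
final_centroids, cluster_labels = kmeans(vectors, [[1.0, 1.0], [0.5, 0.5], [0.0, 0.0]])
cluster_names = name_clusters(final_centroids)
statuses = [cluster_names[label] for label in cluster_labels]
```

With these inputs the first two vectors fall in the matched cluster, the middle two in the possibly matched cluster, and the last two in the unmatched cluster.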
As to a further limitation of claim 7, Elfeky teaches generating an updated version of the user interface including an indication of the next potential match and the user selectable elements to confirm or deny the next potential match. (Pg. 2, Section: 2.1, “Having in mind that it is always better to classify a record pair as a possible match than to falsely decide on its matching status with insufficient information, a third set P, called possible matched, is introduced. In the case that a record pair is assigned to P, a domain expert should manually examine this pair. We assume that a domain expert can always identify the correct matching status (M or U) of a record pair” thus the reference’s classification of a record pair as a possible match corresponds to the limitation’s next potential match. Moreover, the reference’s statement that a domain expert can always identify the correct matching status corresponds to the limitation’s user selectable elements to confirm or deny the match.)
The reasons of obviousness have been noted in the rejection of Claim 5, above and are applicable herein.
Regarding claim 8, JFE teaches all the features with respect to claim 4, as outlined above.
Elfeky teaches the prediction list comprises a bank statement, and wherein the additional input document comprises a receipt. (Pg. 8, Col. 2, Para. 2, “In our experiments, we exploit synthetic data as well as real data. As mentioned before, a tool called DBGen [15] is used for generating synthetic data. DBGen generates records of people that include the following information for each person: SSN, Name (Last, First, Middle Initial), and Address (Street, City, Zip Code, State). DBGen associates each record with a group number in such a way that records with the same group number represent the same person, i.e., matched records. A Wal-Mart database of 70 Gigabytes, which resides on an NCR Teradata Server running the NCR Teradata Database System, is used for the real data experimental study. The results of this experimental study are reported in Section 5.4.” where the reference’s records sharing a group number correspond to the limitation’s prediction list, whereas each person’s details, such as SSN and Name, correspond to the limitation’s additional input document.)
The reasons of obviousness have been noted in the rejection of Claim 4, above and are applicable herein.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SUBASH LIMBU whose telephone number is (571)272-0633. The examiner can normally be reached Monday - Friday 0730 - 530.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, OMAR FERNANDEZ RIVAS can be reached on (571)272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/S.L./Examiner, Art Unit 2128
/OMAR F FERNANDEZ RIVAS/Supervisory Patent Examiner, Art Unit 2128