Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This Office action is in response to the amendment filed on September 15, 2022.
Claims 1-20 are pending.
Claims 1, 3, 4, 8, 10, 11, 14, and 16 have been amended.
Allowable Subject Matter
Claims 3, 10, and 16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of their base claims respectively.
Claims 4, 11, and 17 are considered allowable by virtue of their dependence on claims 3, 10, and 16 rewritten in independent form including all of the limitations of their base claims respectively.
Response to Arguments
Applicant's arguments have been considered but are moot in view of new ground(s) of rejection. As stated in the Applicant’s Remarks on page 2, Applicant has amended independent claims 1, 8, and 14 with a portion of the allowable subject matter indicated by the Examiner to reside in dependent claims 3, 10, and 16, respectively. However, the amended portion from dependent claims 3, 10, and 16 is taught by US 2021/0174258 at ¶¶ 18-19. See the claim rejection section below for more details.
Therefore, the rejections of the independent claims 1, 8, and 14 are maintained.
With respect to the remaining dependent claims, Applicant merely reiterates the argument made regarding claim 1 and asserts that any additional references cited by Examiner fail to resolve the alleged deficiencies in the rejections of the independent claims (see Remarks at pp. 2-3).  Applicant’s arguments are unpersuasive for the same reasons articulated above with respect to claims 1, 8, and 14.  
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1-2, 6, 8-9, 14-15, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over US 10,445,170 (hereinafter “Subramanian”), in view of US 2021/0118024 (hereinafter “Sollami”), and further in view of US 2021/0174258 (hereinafter “Wenchel”).
In the following claim analysis, Applicant’s claim limitations are shown boldfaced and Examiner’s explanations/notes/remarks are in square brackets and emphases are underlined.

As to claim 1, Subramanian discloses a computer system (Subramanian, Abstract, Methods and systems are described for data lineage identification and change impact prediction) comprising:
a processing device and a memory device operably coupled to the processing device; and a data storage system communicatively coupled to the processing device and the memory device (Subramanian, Fig. 1A-1B, col. 5, ln. 23-32, The system 100 includes a plurality of data sources 102a-102n, a client computing device 103, a communications network 104, a server computing device 106 comprising a metadata capture module 108, a data lineage identification module 110, a classification model training module 112, and a data object change impact module 114, and a database 116), the data storage system configured to maintain predictive model training data therein (Subramanian, Fig. 1A-1B,  col. 9, ln. 19-30, the repository 104 has many types of information related to data incidents, errors, changes, and the like. Trained data 150 is built by the system of FIG. 1B using the incident data 148; ln. 55-64, Insert/update (and thereby build) the trained data 150 repository [predictive model training data repository]), the processing device is configured to implement a modeler, the modeler configured to build a predictive model from at least a portion of the predictive model training data (Subramanian, Fig. 1A-1B,  col. 9, ln. 19-30, the Trained data 150 is built by the system of FIG. 1B using the incident data 148. The trained data 150 enables the system to predict impact; col. 14, ln. 62-col. 15, ln. 3, the input and output from prior executions of the model are added into a subsequent training set that is used to further train the model, thereby producing a refined classification model that is self-learning), wherein the processing device is further configured to:
capture lineage metadata for the predictive model training data (Subramanian, col. 13, ln. 23-31, the data lineage identification module 112 may not be able to directly use the metadata to determine the existence of the indirect relationship; ln. 62-66, Using the data lineage, the server computing device 106 can train a classification model to predict the impact of changes to the underlying data objects on the computing environment);
identify, subject to the capture, a plurality of features collected from the predictive model training data (Subramanian, col. 13, ln. 62-65, Using the data lineage, the server computing device 106 can train a classification model to predict the impact of changes to the underlying data objects on the computing environment; col. 14, ln. 20-25, The classification model training module 112 then generates (412) a multidimensional vector for one or more of the data objects based upon the data lineage associated with the data objects and the unstructured text from one or more incident tickets associated with the data objects. For example, the multidimensional vector can define a feature set of the data object based upon the data lineage and the unstructured text. The multidimensional vector is in a form that is usable as input to the classification model …  the module 112 can further refine the feature set for specific data objects);
populate a feature catalog with the identified plurality of features (Subramanian, col. 14, ln. 20-45, The classification model training module 112 then generates (412) a multidimensional vector for one or more of the data objects based upon the data lineage associated with the data objects … the multidimensional vector can define a feature set of the data object based upon the data lineage … the module 112 can further refine [populate] the feature set [a feature catalog] for specific data objects … to identify the most relevant features in the feature set for change impact prediction).
Subramanian does not appear to explicitly disclose execute one or more analyses on the predictive model to capture one or more predictive model runtime measurements.
However, in an analogous art to the claimed invention in the field of machine learning, Sollami teaches execute one or more analyses on the predictive model to capture one or more predictive model runtime measurements (Sollami, ¶ 52, The machine learning models 120 may also be trained when, for example, the list of categories for the catalog is updated and updated selected categories, reflecting the changes to the list of categories, are received for catalog entries with feature vectors that are in the training data set 150; ¶ 75, The labeler 510 may only use the category probabilities output by the machine learning model that had the highest performance [a predictive model runtime measurement], for example, least error, as determined by the evaluator 410. The labeler 510 may determine which categories have a category probability above a threshold among a plurality of the different sets of category probabilities).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Subramanian and Sollami before him/her, to modify Subramanian’s system to include Sollami’s method of multi-label product categorization, with a reasonable expectation of success. The modification would be obvious because one of ordinary skill in the art would be motivated to generate feature vectors in the live data set, where the machine learning models output category probabilities, which may be sets of probabilities that the item in the catalog entry used to generate the feature vector belongs to categories from the list of categories for the catalog as determined by the machine learning models 120. Each of the machine learning models 120 that receives the feature vector as input may output its own category probabilities (Sollami, ¶ 74).
Subramanian as modified does not appear to explicitly disclose correlate, subject to the one or more analyses, the one or more captured predictive model runtime measurements to the plurality of features. However, in an analogous art to the claimed invention in the field of machine learning, Wenchel teaches correlate, subject to the one or more analyses, the one or more captured predictive model runtime measurements to the plurality of features (Wenchel, ¶ 18, both underlying data and the health and/or other statistical metrics run on that data are used to compute the impact of each feature in the underlying data on the health and performance of one or more ML models; ¶ 19, sampling some or all of the data within the ML system (i.e., raw input data, health metrics, model outputs, metrics run on the model outputs and other statistical metrics) over time … compute one or more aggregate health scores for the ML model, the data, and/or the full ML-based system … the aggregate health score(s) can be used to trigger one or more actions including … re-sampling training data to address one or more identified problem areas).
Therefore, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention having the teaching of Subramanian as modified and Wenchel before him/her to modify Subramanian’s modified system to include Wenchel’s method of monitoring performance of a ML system, with a reasonable expectation of success. The modification would be obvious because one of ordinary skill in the art would be motivated to generate an alert based on the aggregated health scores for the ML model, display it to a user, and perform a remedial action in response to the alert.

As to claim 2, the rejection of claim 1 is incorporated. Subramanian as modified further discloses wherein the processing device is further configured to: identify the predictive model training data used to build the predictive model from the captured lineage metadata (Subramanian, col. 13, ln. 23-31, the data lineage identification module 112 may not be able to directly use the metadata to determine the existence of the indirect relationship; ln. 62-66, Using the data lineage, the server computing device 106 can train a classification model to predict the impact of changes to the underlying data objects on the computing environment).

As to claim 6, the rejection of claim 1 is incorporated. Subramanian as modified further discloses wherein the processing device is further configured to: determine an accuracy measurement of the predictive model and correlate the accuracy measurement with each feature of the plurality of features (Subramanian, col. 14, ln. 40-45, Using specific characteristics such as frequency, depth, and penetration, the module 112 can further refine the feature set for specific data objects using, e.g., Random Forest techniques like mean decrease impurity and mean decrease accuracy—to identify the most relevant features in the feature set for change impact prediction).
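For illustration only, the “mean decrease accuracy” technique named in the cited passage can be sketched as a permutation test: each feature column is shuffled in turn, and the resulting drop in model accuracy is taken as a measure of how strongly accuracy depends on that feature. The sketch below is not taken from Subramanian; the function names, the toy model, and the synthetic data are all hypothetical, and a Random Forest implementation would typically compute the drop on out-of-bag samples rather than on a single data set.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction equals the label."""
    correct = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return correct / len(rows)

def mean_decrease_accuracy(model, rows, labels, n_features, repeats=20, seed=0):
    """Average drop in accuracy when each feature column is shuffled.

    `rows` is a list of tuples; `model` is any callable mapping a row to a label.
    """
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)  # break the link between feature j and the labels
            permuted = [row[:j] + (v,) + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, permuted, labels))
        importances.append(sum(drops) / repeats)
    return importances

# Hypothetical stand-in for a trained classifier: it only looks at feature 0.
def toy_model(row):
    return 1 if row[0] > 0.5 else 0

_rng = random.Random(1)
rows = [(_rng.random(), _rng.random()) for _ in range(200)]
labels = [toy_model(row) for row in rows]

scores = mean_decrease_accuracy(toy_model, rows, labels, n_features=2)
```

In this toy setting, shuffling feature 1 cannot change any prediction, so its score is zero, while shuffling feature 0 produces a clearly positive drop; this per-feature score is one way an accuracy measurement can be associated with each feature.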

Claims 8-9 and 14-15 are essentially the same as claims 1-2, except that they set forth the claimed invention as a product and a method, respectively. Therefore, they are rejected for the same reasons as applied to claims 1-2.

As to claim 19, the rejection of claim 14 is incorporated, and the claim is a method claim corresponding to system claim 6. Accordingly, it is rejected under the same rationale set forth in the rejection of claim 6.

Claims 5, 12, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over US 10,445,170 (hereinafter “Subramanian”), in view of US 2021/0118024 (hereinafter “Sollami”), in view of US 2021/0174258 (hereinafter “Wenchel”), and further in view of WO 2013/121224 (hereinafter “William”).

As to claim 5, the rejection of claim 1 is incorporated. Subramanian as modified further discloses wherein the processing device is further configured to: identify one or more features of the plurality of features introducing indirect biases in the predictive model (Subramanian, col. 14, ln. 20-45, the module 112 can further refine the feature set for specific data objects … to identify the most relevant features in the feature set for change impact prediction) and capture one or more relationships between each feature of the plurality of features (Subramanian, col. 4, ln. 5-20, identify one or more indirect relationships comprises a Bayesian network model. In some embodiments, the change impact feature set is based upon the relationships in the data lineage associated with the data object, a depth of the data object in the data lineage, and the database incident tickets associated with the data object), but does not appear to explicitly disclose determine one or more of overlapping features and gaps between features. However, in an analogous art to the claimed invention in the field of machine learning, William teaches determining one or more of overlapping features and gaps between features (William, pg. 38, ln. 3-5, the plural derived feature vectors 12 are compared with each other and the similarity therebetween is determined in overlapping parts of the feature vectors 12; pg. 39, ln. 4-7, The feature vectors 14 stored in the memory 15 may comprise two or more feature vectors having overlapping regions; pg. 40, ln. 11-13, These algorithms output an alignment score that is a function of the two feature vectors, the distance function and the gap penalties. The alignment score can be used to determine similarity).
Therefore, it would have been obvious to one of ordinary skill before the effective filing date of the claimed invention having the teaching of Subramanian as modified and William before him/her to modify Subramanian’s system to include William’s machine-learning classification techniques, with a reasonable expectation of success. The modification would be obvious because one of ordinary skill in the art would be motivated to determine similarity between the derived feature vector and at least one other feature vector to provide information that is useful in many applications (William, pg. 4, ln. 17-20).
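For illustration only, an alignment score of the general kind quoted from William (a function of two feature vectors, a distance function, and gap penalties) can be sketched with a standard global-alignment dynamic program. The sketch below is not taken from William; the function name, the absolute-difference distance, the linear gap penalty, and the convention that a lower score means greater similarity are all assumptions made for the example.

```python
def alignment_score(a, b, distance=lambda x, y: abs(x - y), gap_penalty=1.0):
    """Minimum total cost of globally aligning sequences a and b.

    Aligned pairs cost distance(x, y); each skipped element costs gap_penalty.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the best cost of aligning a[:i] with b[:j].
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap_penalty  # a[:i] aligned entirely against gaps
    for j in range(1, m + 1):
        cost[0][j] = j * gap_penalty  # b[:j] aligned entirely against gaps
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + distance(a[i - 1], b[j - 1]),  # pair the two elements
                cost[i - 1][j] + gap_penalty,                       # gap opposite a[i-1]
                cost[i][j - 1] + gap_penalty,                       # gap opposite b[j-1]
            )
    return cost[n][m]

identical = alignment_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
one_gap = alignment_score([1.0, 2.0, 3.0], [1.0, 3.0])
```

Under these assumptions, identical vectors score zero, and a vector missing one element scores a single gap penalty, so overlapping regions and gaps between two feature vectors both contribute to the resulting similarity measure.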

Claims 12 and 18 correspond to system claim 5. Therefore, they are rejected under the same rationale set forth in the rejection of claim 5.

Claims 7, 13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over US 10,445,170 (hereinafter “Subramanian”), in view of US 2021/0118024 (hereinafter “Sollami”), in view of US 2021/0174258 (hereinafter “Wenchel”), and further in view of US 2020/0134493 (hereinafter “Bhide”).

As to claim 7, the rejection of claim 1 is incorporated. Subramanian as modified does not appear to explicitly disclose wherein the processing device is further configured to: capture, from the predictive model, predictive model design time measurements; and capture predictive model design time metadata to be used to facilitate improvements in efficiency and effectiveness of predictive model building. However, in an analogous art to the claimed invention in the field of machine learning, Bhide teaches wherein the processing device is further configured to: capture, from the predictive model, predictive model design time measurements (Bhide, ¶ 69, In design time bias detection, the method focuses on detecting bias using the training data which is used for building the machine learning model); and capture predictive model design time metadata to be used to facilitate improvements in efficiency and effectiveness of predictive model building (Bhide, ¶ 75, the indirect bias module 410 (or another module of the server 404) performing bias mitigation by selecting a different estimator of the machine learning model and running it against the same design data (e.g., training data) or runtime data, to determine a difference in bias estimation. The selecting of the different estimator and running it against the same design data or runtime data, to determine a difference in bias estimation, may be automated (e.g., based on rules) or based on user input. In embodiments, step 525 comprises the system revising the machine learning model based on the bias prevention and/or bias mitigation).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Subramanian as modified and Bhide before him/her, to modify Subramanian’s modified system to include Bhide’s techniques for detecting and mitigating bias in machine learning models, with a reasonable expectation of success. The modification would be obvious because one of ordinary skill in the art would be motivated to detect indirect bias in machine learning models, perform bias mitigation by selecting a different estimator of the machine learning model and running it against the same design data, and revise the machine learning model based on the bias prevention and/or bias mitigation (Bhide, Abstract and ¶ 75).

Claims 13 and 20 correspond to system claim 7. Therefore, they are rejected under the same rationale set forth in the rejection of claim 7.
Conclusion
Applicant’s amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAXIN WU, whose telephone number is (571) 270-7721. The examiner can normally be reached M-F (7 am - 11:30 am; 1:30 pm - 5 pm).
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Wei Zhen, can be reached at (571) 272-3708. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/DAXIN WU/
Primary Examiner, Art Unit 2191