Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by US Pat. No. 10,810,491 to Xia et al. (hereinafter Xia). 
Per claim 1, Xia discloses a method (see figures…method using visualization tool for development of a plurality of machine learning models and selection of subset of machine learning models based on performance metrics; col. 20:17-25: “A visualization tool that collects information from all the execution platforms being used for the different variants, and presents an easy-to-understand representation of metrics such as loss function values, test scores, and internal layer parameter values may help clients verify whether progress is being made towards convergence at desired rates, and debug or tune model variants which require attention in real time”) for presenting inference models (fig. 1:134 and col. 3:58-col. 5:20…visualization manager presents machine learning models based on metadata associated with each model; fig. 5…example presentation of loss function values at specific training iterations for a plurality of models being developed; fig. 6…example presentation of test scores for test runs performed for a plurality of models being developed; fig. 7…example presentation of layer-specific information for selected models, e.g., for Layer L1 of ModelID MID21; fig. 8…example presentation of web-based visual interface for visualization tool showing various example models being developed, i.e., MID1 and MID2; fig. 9…presentation of low-dimensional mapping of models’ outputs) based on interrelationships among inference models (fig. 1 and col. 3:25-33 and col. 10:45-52…various models to be presented can be interrelated with each other as variants of a particular model or variants of several different models, i.e., the models of Model IDs MID1 and MID2 are interrelated as having the same model structure MS1 while models of Model IDs MID3 and MID4 have different model structures MS2 and MS3; fig. 1:123A-123T, fig. 3:326A-D and col. 5:51-57…each machine learning model has metadata generated from log data associated with attributes of the model as well as other metadata derived from execution/training of the models, where the models are each run on execution platforms 122A-122T; col. 4: col. 5:28-31: “extracting the requisite log entries or other metadata and processing the collected metadata to provide dynamically updated displays of various characteristics of one or more machine learning models”), the method comprising:
obtaining information (fig. 3:326A-D, fig. 11:1101,1104 and col. 19:20-27…a plurality of models are executed to be trained and tested, and log data associated with each model is obtained, the log data construed as ‘information’ obtained: “…each execution platform (e.g., a GPU-based or CPU-based compute engine) at which a model variant is being trained may generate a growing local collection of log entries which may contain meta data indicative of the current state of the variant, and the visualization tool may obtain such log entries from the execution platforms”; col. 7:11-24: “At various execution platforms 122, a respective local log 123 may be maintained to track the training and/or testing operations being performed---e.g., log 123A may be generated at execution platform 122A, log 123B may be generated at execution platform 122B, and so on. A given log 123 may comprise a plurality of entries, and a given entry may include various elements of data and/or metadata associated with the model(s) for which processing is being performed at the execution platform. For example, a log entry may contain information about the number and nature of the layers of a neural network model, the parameters associated with subcomponents at various layers, the loss function or objective function value computed for a recent training iteration, the scores obtained from a recent test run, and so on”) related to a group of at least three inference models (fig. 3 and fig. 5…a group of at least three models shown, i.e., Model IDs MID1 thru MID4; fig. 4…a group of at least three models shown, i.e., Model IDs MID1-MID3; fig. 6…MID10-MID12; log data is obtained for all the models being developed/executed);
obtaining a plurality of interrelationship records (col. 12:24-32…metadata obtained off the log data for each model being developed/executed and visualization is produced from the metadata, the metadata for the models being a plurality of interrelationship records: “The visualization manager 434 may obtain metadata pertaining to the different model variants, e.g., by extracting various log entries generated at the execution nodes where the models are being trained/tested. A number of different types of output may be displayed by the visualization manager using the collected data, e.g., to facilitate tuning and debugging of the models, to provide feedback regarding the progress being made as more iterations of training followed by testing are performed, and so on.”; col. 19:20-27: “A visualization manager or tool, which may be implemented using one or more computing devices, may collect several kinds of metadata pertaining to the training and testing of the model variants in the depicted embodiment (element 1104) while the training process is still ongoing”), wherein each interrelationship record corresponds to one subgroup of the group of at least three inference models (fig. 5…MID1-MID3 is a subgroup of three inference models out of an original group of four models MID1-MID4 of which MID4 was terminated due to abnormal loss function, the graph presentation of MID1-MID3 being derived from metadata from loss function values per each model; fig. 6…MID10 and MID11 is a subgroup of two models out of an original group of three models MID10-MID12 of which MID12 was terminated due to low test score; col. 7:15-25…All models, including those in a ‘subgroup’, such as MID1-MID3 of fig. 5 and MID10-MID11 of fig. 6, are presented based on metadata obtained from those models: “…a given entry may include various elements of data and/or metadata associated with the model(s) for which processing is being performed at the execution platform. For example, a log entry may contain information about the number and nature of the layers of a neural network model, the parameters associated with subcomponents at various layers, the loss function or objective function value computed for a recent training iteration, the scores obtained from a recent test run, and so on”), and each subgroup comprises at least two inference models (fig. 5…the subgroup the user keeps has three models MID1-MID3; fig. 6…the subgroup the user keeps has two models MID10 and MID11);
using the plurality of interrelationship records to determine information related to a first inference model of the group of at least three inference models (fig. 5 and col. 13:38-col. 14:4…MID4 is a model determined from meta data to be abnormal relative to the subgroup MID1-MID3, MID4 construed to be the first inference model, and abnormality construed as information pertaining to poor performing model; fig. 6 and col. 14:24-60…MID12 is model determined from meta data to be a low score for test runs relative to subgroup MID10-MID11, MID12 also can be construed as the first inference model, and low score construed as information pertaining to a poor performing model); and
using the determined information to present the first inference model to a user (figs. 5-6 and col. 13:38-col. 14:60…present MID4 and MID12 to user as shown in the graphs from which the user can decide to terminate training of the specific models based on bad performance/metric displayed, i.e., MID4 and/or MID12).
Per claim 2, Xia discloses claim 1, further disclosing analyzing a first subgroup of the group of at least three inference models to generate the interrelationship record corresponding to the first subgroup (figs. 5 and 6…good performance of a subgroup of models, i.e., MID1-MID3 and MID10-MID11, will continue to be executed/trained and have their metadata continue to be used and analyzed; col. 5:25-30… extraction/processing/evaluation, e.g., analysis, of metadata associated with subgroup of models : “visualization tool installed at one or more computing devices unaffiliated with any particular service may be instantiated in some embodiments, capable of extracting the requisite log entries or other metadata and processing the collected metadata to provide dynamically updated displays of various characteristics of one or more machine learning models”); and using the generated interrelationship record corresponding to the first subgroup to determine the information related to the first inference model (figs. 5-6 and col. 13:38-col. 14:60…the metadata for the models used to present the graph of loss function values and test run scores, where poor performing models MID4 and MID12 are determined based on comparison with graphs of the good performing subgroups MID1-MID3 and MID10-MID11).
Per claim 3, Xia discloses claim 1, further disclosing using an inexact graph matching algorithm (fig. 5…graph of loss function values only, with no other metrics graphed for the models, thus construed as an inexact graph not fully representative of the entire model; fig. 6…algorithm generating a graph of test scores, not fully representative of the entire model, thus inexact; fig. 9…algorithm generating ‘low-dimensional’ representations of model outputs can be construed as inexact graphs) to compare a structure associated with the first inference model (fig. 3:322D…model such as MID4 has model structure MS3 associated with it) and a structure associated with a second inference model (fig. 3:322A-C…other models such as MID1-MID3 have model structures MS1-3 associated with them) and determine a matching score related to the first inference model and the second inference model (fig. 6…test scores generated for each model and compared, where matching scores are similar scores indicating that the models are performing well enough to keep and continue training); and using the determined matching score to determine the information related to the first inference model (fig. 6…scores that do not match, e.g., very dissimilar, will have the model terminated, e.g., MID12).
Per claim 4, Xia discloses claim 1, further disclosing the determined information related to the first inference model is textual information describing the first inference model (col. 7:25-27…”A variety of data structures and/or objects may be used for logs and their entries in different embodiments—e.g., in one embodiment log entries may be stored in text format”; col. 16:60-65…”data underlying graphical display or visualization may be exportable in text format…for offline viewing of the model information”).
Per claim 5, Xia discloses claim 1, further disclosing using the plurality of interrelationship records to determine an embedding of the at least three inference models in a two dimensional space (figs. 5, 6 and 9…underlying loss function values and test run scores are a series of metadata embedded in logs); and visually presenting the embedding of the at least three inference models in the two dimensional space (figs. 5, 6 and 9…graphical representations of metadata are in two dimensional space).
Per claim 6, Xia discloses claim 1, further disclosing using the plurality of interrelationship records to determine an embedding of the at least three inference models in a three dimensional space; and visually presenting the embedding of the at least three inference models in the three dimensional space (col. 12:60-66…high-dimensional outputs from models can be represented in three dimensions: ”In many cases, at least some of the outputs or predictions produced by a given model may be expressed as a vector or matrix of high dimensionality. Such high-dimensional output from different variants may be mapped to two dimensions or three dimensions and displayed to the client by the visualization manager…”).
Per claim 7, Xia discloses claim 1, further disclosing using the plurality of interrelationship records to determine a hierarchical graph of the at least three inference models (figs. 5-6…loss function values and test run scores are obtained from metadata to graph, y-axis goes from low to high values, i.e., hierarchical); and visually presenting the hierarchical graph of the at least three inference models (figs. 5-6…graphs presented).
Per claim 8, Xia discloses claim 1, further disclosing using an interrelationship record corresponding to a subgroup of the group of at least three inference models (fig. 6…MID10-MID12 is group of at least three inference models, where MID10-MID11 is the subgroup of good performing models), where the subgroup comprises exactly two inference models (fig. 6… MID10-MID11 is the subgroup of exactly two models), to determine the information related to the first inference model (fig. 6…MID10-MID11 is used to compare/relate to MID12, used to determine that MID12 is poor performing and should terminate training).
Per claim 9, Xia discloses claim 1, further disclosing using an interrelationship record corresponding to a subgroup of the group of at least three inference models (fig. 5…MID1-MID4 is group of at least three inference models, where MID1-MID3 is the subgroup of good performing models), where the subgroup comprises at least three inference models (fig. 5…MID1-MID3 is the subgroup of three models), to determine the information related to the first inference model (fig. 5…MID1-MID3 is used to compare/relate to MID4, used to determine that MID4 is abnormal and should terminate training).
Per claim 10, Xia discloses claim 1, further disclosing using interrelationship information based on commonality among at least two inference models to determine the information related to the first inference model (fig. 1 and col. 3:25-33 and col. 10:45-52…various models to be presented can be interrelated, having commonality, with each other as variants of a particular model or variants of several different models, i.e., the models of Model IDs MID1 and MID2 have common model structure MS1 while model ModelID MID4 has different model structure MS3, which may explain abnormality in fig. 5 for MID4).
Per claim 11, Xia discloses claim 1, further disclosing using interrelationship information related to a measure (fig. 5…loss function value is a measure; fig. 6…test run score is a measure) based on an interrelationship between a second inference model and a third inference model (fig. 5…MID1 and MID2 can be second and third inference models whose loss function values are graphed and compared with other models; fig. 6…MID10 and MID11 can be second and third inference models whose test run scores are graphed and compared with other models) to determine the information related to the first inference model (fig. 5…MID4, the first inference model, performance is compared with that of MID1 and MID2; fig. 6…MID12, the first inference model, performance is compared with that of MID10 and MID11), wherein both the second inference model and the third inference model differ from the first inference model (fig. 5…MID1 and MID2 loss function values consistently drop as more training iterations are completed, differing from MID4 which experiences an increase; fig. 6…MID10 and MID11 test run scores consistently increase with higher test runs, differing from MID12 which experiences a decrease).
Per claim 12, Xia discloses claim 1, further disclosing using interrelationship information based on a number of layers associated with the first inference model and a number of layers associated with a second inference model to determine the information related to the first inference model (fig. 3…good or bad performance of the models may depend on both the structure of the model and the parameters associated with the model, the number of layers being part of the structure of the model; col. 3:25-33…“Because the quality of a model's results may typically depend on the structure of the model (e.g., how many layers are included in the model, the kinds of processing performed at each layer, the interconnections between the layers and so on) and the parameters (e.g., weights, activation biases and the like) selected for the model, a number of model variants with differing initial parameters or structures may often be trained in parallel using a given input data set”; col. 7:2-6…“The model variants may differ from one another in various characteristics-e.g., in the model structure (e.g., the number of layers of various types of a convolutional neural network model), the initial parameters, the learning rates, etc.”).
Per claim 13, Xia discloses claim 1, further disclosing using interrelationship information based on a type of at least one layer of the first inference model and a type of at least one layer of a second inference model to determine the information related to the first inference model (col. 8:44-50…visualization tool can be applied to a wide variety of ML algorithms and models including but not limited to various types of neural network based models which may contain multiple internal or hidden layers, e.g., different types of possible layers between models).
Per claim 14, Xia discloses claim 1, further disclosing using interrelationship information based on a number of artificial neurons associated with the first inference model and a number of artificial neurons associated with a second inference model to determine the information related to the first inference model (col. 8:44-50…visualization tool can be applied to a wide variety of ML algorithms and models including but not limited to various types of neural network based models which may contain multiple internal or hidden layers, where the number of layers is directly associated with the number of neurons per model; col. 9:11-19…“The model comprises a number of layers, such as convolution layers C1 and C2 of model 202, pooling or sub-sampling layers P1 and P2, and fully-connected layers F1 and F2. With respect to the convolution layers and the pooling layers, a given layer comprises a number of units (logically representing respective artificial neurons being trained), with each unit receiving input from a small set of units located in a common neighborhood in the previous layer”).
Per claim 15, Xia discloses claim 1, further disclosing using interrelationship information based on a type of at least one activation function of the first inference model and a type of at least one activation function of a second inference model to determine the information related to the first inference model (fig. 3…good or bad performance of the models may depend on both the structure of the model and the parameters associated with the model, the number of layers being part of the structure of the model; col. 3:25-33…activation function/bias is one of the parameters for each model: “Because the quality of a model's results may typically depend on the structure of the model (e.g., how many layers are included in the model, the kinds of processing performed at each layer, the interconnections between the layers and so on) and the parameters (e.g., weights, activation biases and the like) selected for the model, a number of model variants with differing initial parameters or structures may often be trained in parallel using a given input data set”).
Per claim 16, Xia discloses claim 1, further disclosing using interrelationship information based on a measure of non-linearity associated with the first inference model and a measure of non-linearity associated with a second inference model to determine the information related to the first inference model (figs. 5 and 6…presented graphs are non-linear measures of the models used to determine which model to terminate training for).
Per claim 17, Xia discloses claim 1, further disclosing using interrelationship information based on a plurality of results produced using the first inference model and a plurality of results produced using a second inference model to determine the information related to the first inference model (fig. 5…loss function values can be results produced by the models used to determine which model to terminate training for).
Per claim 18, Xia discloses claim 1, further disclosing using interrelationship information based on a plurality of confidence levels associated with results produced using the first inference model and a plurality of confidence levels associated with results produced using a second inference model to determine the information related to the first inference model (fig. 6 and col. 14:8-23…test run scores are model quality scores/metrics, which are construed as confidence levels in a model’s predictive capability and are used to determine which model to terminate training for).
Claim 19 is substantially similar in scope and spirit to claim 1.  Therefore, the rejection of claim 1 is applied accordingly.  Xia further discloses using a system (fig. 12) comprising one or more processors (fig. 12:9010a-n) to implement the process/functionality disclosed.
Claim 20 is substantially similar in scope and spirit to claim 1.  Therefore, the rejection of claim 1 is applied accordingly.  Xia further discloses using non-transitory storage media (col. 21:53-62) to implement the process/functionality disclosed.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  Patents and/or related publications are cited in the Notice of References Cited (Form PTO-892) attached to this action to further show the state of the art with respect to presenting/displaying groups and/or subgroups of machine learning models based on information associated with each machine learning model.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALAN CHEN whose telephone number is (571)272-4143. The examiner can normally be reached M-F 10-7.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ALAN CHEN/Primary Examiner, Art Unit 2125