Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This communication is responsive to the Amendment filed 02/28/2022.
 	Claims 1-21 are pending in this application. This action is made Final.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Claims 1-4, 6-13, 15-21 are rejected under 35 U.S.C. 103 as being unpatentable over Stetson et al. (US Pat No. 9,348,947), in view of Weng et al. (US Pat No. 8,019,593).
As to claims 1, 10, and 19, Stetson teaches a computer-implemented method comprising:
receiving a plurality of individual subsets of features of a dataset of features (i.e. to obtain a graph database, wherein the graph database includes a set of nodes and a set of edges, wherein an edge in a set of edges defines a relationship between a first node in the set of nodes and a second node in the set of nodes, and metadata describing a numeric value attributed to an edge or node that can be pre-assigned as a static attribute of the node or edge stored in memory and/or calculated as a function of the connection patterns of nodes and edges to be found within one or more degrees of separation to the node or edge, col. 8, lines 20-62), each subset represented as a graph based on a predefined template and on the subset of features (i.e. Graph databases in accordance with embodiments of the invention are configured to store conceptual data in nodes and the relationships between the nodes in the edges, col. 13, lines 25-43), wherein each received subset of features includes relevance data representing a ranked relevance or ranked relative relevance of features within the subset (i.e. The obtained (310) source data includes concepts and relationships between the concepts, col. 13, line 44 to col. 14, line 8);
for each received subset of features, processing the relevance data associated with the features of the subset to determine (i.e. determining relationship weights, col. 13, line 44 to col. 14, line 8) an edge weight (i.e. edge weight metadata, col. 18, lines 20-53) for each of the edges of the graph based at least in part upon the ranked relevance or ranked relative relevance (i.e. The determined (314) weights can be based on a variety of factors as appropriate to the requirements of specific applications in accordance with embodiments of the invention, such as the obtained concepts, the relationships between the concepts, and the determined (312) associated data, col. 14, lines 9-49);
merging the plurality of graphs (i.e. graph database manipulation systems allow the successors of parent nodes to be combined ... a node perspective can extend to not only the immediate sub-nodes, but also recursively to their cross-links, col. 5, lines 15-42) by combining nodes representing a same feature of the graphs and combining edge weights representing a same relationship between features to form a merged feature graph (i.e. aggregating nodes and/or edges to generate an approximate graph from the perspective of the obtained (610) source node ... if the generated representation of a set of nodes would be too small for a user to effectively explore, the nodes can be aggregated so that useful information can still be analyzed by the user, col. 20, line 61 to col. 21, line 15);
displaying (i.e. Visualizing Graph Databases, col. 14, line 58 to col. 15, line 8) the merged feature graph to a user to enable the user to select aspects of the merged feature graph to be included as a training graph (i.e. if the generated representation of a set of nodes would be too small for a user to effectively explore, the nodes can be aggregated so that useful information can still be analyzed by the user, col. 20, line 61 to col. 21, line 15), wherein displaying the merged feature graph comprises displaying how nodes and edges were determined (i.e. the generated (318) node data includes node metadata (including a string representing the concept represented by the node data) and references to one or more pieces of edge data, col. 14, lines 9-49).
Stetson does not seem to specifically teach the following limitations, but Weng teaches:
a ranked relevance or ranked relative relevance of features within the subset (i.e. The gain for a feature needs to be re-computed only when the feature reaches the top of a list sorted in descending order by gain. This generally occurs when the feature is the top candidate for inclusion in the model. If the re-computed gain is smaller than that of the next candidate in the list, the feature is re-ranked according to its newly computed gain, and the feature now at the top of the list goes through the same gain re-computing process, col. 5, lines 5-36);
training a machine learning model based on the training graph (i.e. the learning algorithm contains a feature generation module 103 that generates the features from the training data 102, col. 3, lines 18-36).
It would have been obvious to one of ordinary skill in the art, having the teachings of Stetson and Weng before the effective filing date of the claimed invention, to modify the system of Stetson to include the limitations taught by Weng. One of ordinary skill in the art would have been motivated to make this combination in order to generate a number of feature subsets in view of Weng (col. 6, lines 5-27), as doing so would give the added benefit that these feature subsets are then merged together to produce a second (subsequent) feature space, as taught by Weng (col. 6, lines 5-27).

As to claims 2, 11, 20, Stetson teaches the predefined graph template comprises a simple graph comprising nodes that represent features in the subset and edges between pairs of nodes, wherein merging the plurality of graphs to form a merged feature graph comprises:
combining overlapping nodes from different graphs into a merged node, wherein the overlapping nodes from different graphs represent the same feature in the dataset (i.e. wherein at least one generated representation includes a partially overlapping subset of at least one other generated representation, Claim 21), and
combining overlapping edges from different graphs into a merged edge, wherein the overlapping edges from different graphs extend between a pair of nodes representing the same pair of features in the dataset (i.e. graph database manipulation systems allow the successors of parent nodes to be combined ... a node perspective can extend to not only the immediate sub-nodes, but also recursively to their cross-links, col. 5, lines 15-42).

As to claims 3, 12, Stetson teaches the predefined graph template comprises a weighted graph, wherein each of the edges has a respective edge weight representing a relationship between the feature pair represented by the pair of nodes connected to the corresponding edge, the method further comprising:
combining the edge weights of overlapping edges from different graphs to determine a merged edge weight for each merged edge of the merged feature graph (i.e. the node weight determines the size (e.g. area) of the node within the generated (418) layout and the edge display metadata determines the position of the nodes within the generated (418) layout, col. 15, line 30 to col. 16, line 13; a node includes modifying edge weight metadata for an edge connected to the node by computing a new complex number based on the previous edge weight metadata and the obtained (510) node update, col. 18, line 54 to col. 19, line 20).
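For illustration only, the merging recited in claims 2-3 and 11-12 (combining overlapping nodes, combining overlapping edges, and combining their edge weights) can be sketched as follows. This is a minimal sketch and is not the method of Stetson, Weng, or the application: the function name merge_feature_graphs, the representation of a graph as a (nodes, edges) pair, and the use of summation to combine overlapping edge weights are all assumptions, since the claim language quoted above does not state how the weights are combined.

```python
from collections import defaultdict


def merge_feature_graphs(graphs):
    """Merge several weighted feature graphs into one.

    Each graph is a (nodes, edges) pair, where nodes is an iterable of
    feature names and edges maps a (feature_a, feature_b) tuple to a weight.
    Summing overlapping weights is an assumed combination rule.
    """
    merged_nodes = set()
    merged_edges = defaultdict(float)
    for nodes, edges in graphs:
        # Nodes naming the same feature collapse into one merged node.
        merged_nodes.update(nodes)
        for (a, b), weight in edges.items():
            # Edges are undirected, so (a, b) and (b, a) share one key.
            key = tuple(sorted((a, b)))
            merged_edges[key] += weight
    return merged_nodes, dict(merged_edges)


# Hypothetical example data: two feature subsets over the same dataset.
graph_1 = ({"age", "income"}, {("age", "income"): 0.5})
graph_2 = ({"age", "income", "zip"},
           {("income", "age"): 0.25, ("age", "zip"): 1.0})
nodes, edges = merge_feature_graphs([graph_1, graph_2])
```

In this sketch the edge between "age" and "income" appears in both graphs, so its merged weight is the sum of the two contributions, while the edge between "age" and "zip" is carried over unchanged.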

As to claims 4, 13, Stetson teaches:
for each of the received individual subsets of features of a dataset, creating said graph based on a predefined template (i.e. Graph database manipulation systems in accordance with embodiments of the invention are configured to visualize and manipulate graph databases. Graph databases contain a set of nodes defining concepts and a set of edges indicating relationships between pairs of nodes, col. 5, line 43 to col. 6, line 45) comprising nodes and edges by mapping each feature of the subset to a node, and connecting at least some pairs of nodes by an edge (i.e. a one-to-one mapping exists between the edge display metadata and some particular perspective of the associated data (e.g. the visualized representation of that data), col. 5, line 43 to col. 6, line 45).
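For illustration only, the graph creation recited in claims 4 and 13 (mapping each feature of a subset to a node and connecting at least some pairs of nodes by an edge) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name graph_from_subset is hypothetical, and connecting every pair of nodes is only one possible choice of template, since the claim language requires only that at least some pairs be connected.

```python
from itertools import combinations


def graph_from_subset(ranked_features):
    """Build a simple graph from one subset of (unique) feature names.

    Every feature becomes a node; as an assumed template, every pair of
    nodes is connected by an undirected edge.
    """
    nodes = list(ranked_features)
    # Sort each pair so an undirected edge has a single canonical key.
    edges = {tuple(sorted(pair)) for pair in combinations(nodes, 2)}
    return nodes, edges


# Hypothetical subset of three features, listed in ranked order.
subset_nodes, subset_edges = graph_from_subset(["income", "age", "zip"])
```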

As to claims 6, 15, Stetson teaches:
using the merged feature graph to identify the single subset of features by selecting a subset of features from the merged feature graph (i.e. a node perspective can extend to not only the immediate sub-nodes, but also recursively to their cross-links. In this way, the node perspective can operate as a fundamental unit of computation within the framework provided by a graph database manipulation system, col. 5, lines 15-42) based on one or more of:
all of the features represented by nodes in the merged feature graph (i.e. graph database manipulation systems allow the successors of parent nodes to be combined, differenced, or otherwise manipulated in order to capture the set operations done in the analysis of naturalistic data, col. 5, lines 15-42);
a threshold number of the most relevant features represented by nodes in the merged feature graph (i.e. the threshold can be based on readability metric(s) and/or the amount of visualized space the node consumes as displayed using a graph database manipulation device, col. 5, line 43 to col. 6, line 45);
all features represented by nodes in the merged feature graph meeting a threshold relevance value (i.e. The number of related nodes so viewed can be limited by a preset threshold, determined dynamically by the resolution or readability limits of the system, or by processing constraints imposed to maintain the graph database manipulation device simultaneously across a network of portals, col. 5, line 43 to col. 6, line 45);
all features of feature pairs represented by edges in the merged feature graph meeting a threshold merged edge weight (i.e. the threshold value is based on one or more nodes and/or edges selected within the graph database, col. 5, line 43 to col. 6, line 45); and
features represented by nodes or connected by edges in the merged feature graph meeting any other suitable threshold (i.e. The visualization of the generated representation includes a representation of the nodes and the layout and visual appearance of the generated representation can be based on edge weight metadata and edge display metadata contained in the edges connecting the nodes being visualized. In several embodiments, the threshold value can be based on the visualized representation of the nodes, e.g. the threshold can be based on readability metric(s) and/or the amount of visualized space the node consumes as displayed using a graph database manipulation device, col. 5, line 43 to col. 6, line 45).
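For illustration only, three of the selection criteria recited in claims 6 and 15 can be sketched as follows. This is a minimal sketch and not the method of any cited reference: the function names, the per-node relevance dictionary, and the example values are all hypothetical, and a "greater than or equal to" comparison is assumed for meeting a threshold.

```python
def select_top_k(relevance, k):
    """Return the k features with the highest relevance values."""
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    return set(ranked[:k])


def select_by_node_threshold(relevance, threshold):
    """Return all features whose relevance meets the threshold."""
    return {f for f, r in relevance.items() if r >= threshold}


def select_by_edge_threshold(edges, threshold):
    """Return all features of feature pairs whose merged edge weight
    meets the threshold."""
    selected = set()
    for (a, b), weight in edges.items():
        if weight >= threshold:
            selected.update((a, b))
    return selected


# Hypothetical merged feature graph: node relevance and merged edge weights.
node_relevance = {"age": 0.9, "income": 0.6, "zip": 0.2}
merged_edge_weights = {("age", "income"): 0.75, ("age", "zip"): 0.1}

top_two = select_top_k(node_relevance, 2)
relevant = select_by_node_threshold(node_relevance, 0.5)
strongly_linked = select_by_edge_threshold(merged_edge_weights, 0.5)
```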

As to claims 7, 16, Stetson teaches:
presenting a visualized representation of the merged feature graph, to enable user selection of the single subset of features (i.e. Graph database manipulation systems in accordance with embodiments of the invention are configured to visualize and manipulate graph databases, col. 10, lines 29-54).

As to claims 8, 17, Weng teaches the received plurality of subsets of features of the dataset are generated using two or more feature selection methods (i.e. the feature selection process that chooses from a feature space a subset of good features to be included in the model; and the parameter estimation process that estimates the weighting factors for each selected feature in the exponential model, col. 3, lines 51-65).
As to claims 9, 18, Weng teaches:
selecting a single subset of features using the merged feature graph (i.e. the feature selection process that chooses from a feature space a subset of good features to be included in the model; and the parameter estimation process that estimates the weighting factors for each selected feature in the exponential model, col. 3, lines 51-65; This iterative splitting, feature selection, and merging process facilitates parallel processing of the initial feature space, col. 5, lines 37-59);
creating a model using training data based on the single subset of features (i.e. FIG. 1 is a block diagram of a machine learning system for a system that includes feature generation process and a feature selection process, according to an embodiment. System 100 illustrated in FIG. 1 generally provides a learning algorithm 101 that learns a model 108 based on training data 102, col. 3, lines 4-17), and
applying the model to live and/or real world data in a system (i.e. the learning algorithm 101 that includes the PFS system incorporates a feature generation module 103 that generates the number of features in the initial feature space, col. 8, lines 40-54).

As per claim 21, Stetson teaches the method of claim 1, wherein the edge weight for each of the edges of the graph represents a normalized distance between the nodes (i.e. Node weights are determined (416) based on the edge weight metadata included in the edges connecting the related nodes to the source node, col. 15, line 30 to col. 16, line 13; The length of an edge can be determined based on a variety of criteria, such as the distance between nodes within a visualized representation of the graph database ... the weight of an edge is determined by computing the squared complex weight based on the edge weight metadata, col. 21, lines 16-49).

Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Stetson et al. (US Pat No. 9,348,947), in view of Weng et al. (US Pat No. 8,019,593), as applied to the claims above, and further in view of Sturlaugson et al. (US Pub No. 2016/0358099).
As to claims 5, 14, Weng teaches the received relevance data comprises, a rank value of each feature of the subset of features based on the relevance value (i.e. The gain for a feature needs to be re-computed only when the feature reaches the top of a list sorted in descending order by gain. This generally occurs when the feature is the top candidate for inclusion in the model. If the re-computed gain is smaller than that of the next candidate in the list, the feature is re-ranked according to its newly computed gain, and the feature now at the top of the list goes through the same gain re-computing process, col. 5, lines 5-36).
Stetson and Weng do not seem to specifically teach:
wherein weighted edge value is calculated using the following formula:
	Edge Weight = [1 / (1 + Pos. Diff)] * [W * e^(-Decay/5)]
		wherein Pos. Diff is the difference in position between the feature pair in the feature ranking;
		W is a weight for decay; and
		Decay is top position of any two connected features in the feature ranking, starting from 0.
However, Sturlaugson teaches:
“wherein the received relevance data comprises, a rank value of each feature of the subset of features based on the relevance value” (i.e. determining a statistic of the input feature data. Where the dataset is a time-dependent dataset, the statistic may be related to the time-dependence of the dataset, e.g., the statistic may be a statistic during a time window, i.e., during a period of time and/or at one or more specified times. Additionally or alternatively, the statistic may be related to one or more input feature data values. For example, the statistic may be a time average of a sensor value and/or a difference between two sensor values (e.g., measured at different times and/or different locations). More generally, statistics may include, and/or may be, a minimum, a maximum, an average, a variance, a deviation, a cumulative value, a rate of change, an average rate of change, a sum, a difference, a ratio, a product, and/or a correlation, [0031]);
Pos. Diff is the difference in position between the feature pair in the feature ranking (i.e. the statistic may be a time average of a sensor value and/or a difference between two sensor values (e.g., measured at different times and/or different locations), [0031]);
W is a weight for decay (i.e. an artificial neural network may include parameters specifying the number of nodes, the cost function, the learning rate, the learning rate decay, and the maximum iterations, [0022]);
decay is top position of any two connected features in the feature ranking, starting from 0 (i.e. to identify a group of machine learning models 32 ... an artificial neural network with 10 nodes and a learning rate decay of 0, an artificial neural network with 10 nodes and a learning rate decay of 0.01, an artificial neural network with 20 nodes and a learning rate decay of 0, and an artificial neural network with 20 nodes and a learning rate decay of 0.01, [0034]).
Therefore, even though Stetson, Weng, and Sturlaugson do not seem to particularly teach the exact formula, it would have been obvious to one of ordinary skill in the art, having the teachings of Stetson, Weng, and Sturlaugson before the effective filing date of the claimed invention, to modify the system of Stetson and Weng to include the limitations taught by Sturlaugson. One of ordinary skill in the art would have been motivated to make this combination in order to compare the machine learning models with the performance comparison statistics and select one or more of the machine learning models to deploy in view of Sturlaugson ([0061]), as doing so would give the added benefit of building a better deployable machine learning model, which includes training the corresponding machine learning model with the entire input feature dataset, as taught by Sturlaugson ([0061]).
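For illustration only, the edge weight formula recited in claims 5 and 14 can be evaluated as in the following minimal sketch. The function name edge_weight and its parameter names are hypothetical. Two readings of the claim language are assumed: Pos. Diff is taken as the absolute difference between the two 0-based ranking positions, and Decay ("top position of any two connected features") is taken as the smaller of the two positions, i.e. the position of the higher-ranked feature. The default W of 1.0 is also an assumption.

```python
import math


def edge_weight(pos_a, pos_b, w=1.0):
    """Edge Weight = [1 / (1 + Pos. Diff)] * [W * e^(-Decay/5)].

    pos_a and pos_b are 0-based positions of two connected features in
    the feature ranking. Decay is assumed to be the smaller position.
    """
    pos_diff = abs(pos_a - pos_b)   # assumed reading of Pos. Diff
    decay = min(pos_a, pos_b)       # assumed reading of "top position"
    return (1.0 / (1.0 + pos_diff)) * (w * math.exp(-decay / 5.0))


# Features ranked first and second (positions 0 and 1), with W = 1.
adjacent_top = edge_weight(0, 1)
# Features at positions 2 and 5, with W = 2.
lower_pair = edge_weight(2, 5, w=2.0)
```

Under these assumptions, the pair at positions 0 and 1 has Pos. Diff = 1 and Decay = 0, giving an edge weight of 0.5. Pairs that are farther apart in the ranking, or that sit lower in the ranking, receive smaller weights for the same W.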

Response to Arguments
Applicant's arguments with respect to claims 1-21 have been considered but are moot in view of the new ground(s) of rejection. 

Conclusion
	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MIRANDA LE whose telephone number is (571)272-4112.  The examiner can normally be reached on M-F 7AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alford W Kindred, can be reached at 571-272-4037.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MIRANDA LE/Primary Examiner, Art Unit 2153