DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 2/11/2021 has been entered. Claims 1, 7, 10, 12, 18, and 21-23 stand amended. Claims 1-5 and 7-23 are currently pending.
 
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-5 and 7-23 are rejected under 35 U.S.C. 103 as being unpatentable over Arasan et al. in US Patent Application Publication № 2018/0357255, hereinafter called Arasan, in combination with Abuelsaad et al. in US Patent Application Publication № 2015/0032609, hereinafter called Abuelsaad, and Sheu et al. in US Patent Application Publication № 2010/0063968, hereinafter called Sheu.

In regard to claim 1, Arasan teaches a method of processing data comprising:
 at one or more machines comprising one or more processors and memory storing one or more programs executed by the one or more processors to perform the method (paragraph 0056), performing operations comprising:
accessing a data store; 
for each of the group identifiers, a plurality of data sets associated with a corresponding plurality of variable types;
at least a first plurality of operation definitions (i.e. transformations) defining one or more operations on at least one of the variable types (“The data stored in data storage formats 201 may be generated and/or ingested by applying a series of transformations to input data using DFS 100.” Paragraph 0034, wherein “Transformation platform 504 may access metadata describing existing transformations in metadata store 514.” Paragraph 0041);
the first operation definition configured to generate a derived data set of a derived variable type (“Put another way, metadata may describe what the new output variables are and where the output variables came from. For example, metadata may include a data type, original source variables, logic used to generate the variables, timestamp, access restrictions, sensitivity of the data, and/or other descriptive metadata.” Paragraph 0045);
 receiving from a user interface selection of the first operation definition and at least one data set of said at least one variable type operated on by the selected at least one operation definition (“With reference to FIG. 4, a logic map 400 is shown in a graphical form depicting transformations 304 applied to source data 302 at a variable (e.g., column of a table) level, in accordance with various embodiments. A user may request an output variable by using a graphical tool to generate logic maps for the output variable.” Paragraph 0036);
 and processing the automatically determined data sets according to the selected first operation definition to generate a first derived data set (i.e. output variables, “In various embodiments, output variables 408 may originate from multiple source variables 402. For example, as illustrated, source variable 2 is transformed into derived variable 1, source variable 3 is transformed into derived variable 2, derived variable 1 and derived variable 2 are both transformed into derived variable 5, and derived variable 5 is transformed into output 2. Thus, output variables 408 are derived from source variables 402 and derived variables 404 by applying transformations.” Paragraph 0038).
However, Arasan fails to expressly teach that the data store stores a plurality of group identifiers;
that the data sets are stored for each of the group identifiers;
receiving from a user interface selection of the first operation definition and a first group identifier;
and processing based on the selected first operation definition, data sets associated with the selected group identifier to generate a first derived data set associated with the first group identifier.
O’Halloran teaches that the data store stores a plurality of group identifiers (i.e. database name);
that the data sets are stored for each of the group identifiers (“In general, in another aspect still, the invention features a computer-based method of collecting metadata for a dataset, including prompting a user to provide a database name.” paragraph 0014);
receiving from a user interface selection of the first operation definition and a first group identifier (“In general, in another aspect still, the invention features a computer-based method of collecting metadata for a dataset, including prompting a user to provide a database name.” paragraph 0014);
and processing based on the selected first operation definition, data sets associated with the selected group identifier to generate a first derived data set associated with the first group identifier (“the method also includes adding a dataset corresponding to the table to a collection of datasets.” Paragraph 0014).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the database analysis system taught by Arasan to include the named data sets, as taught by O’Halloran. It would have been obvious because it represents the simple 
However, neither Arasan nor O’Halloran explicitly teaches automatically determining data sets associated with both the selected first group identifier and at least one variable type defined in the selected first operation definition.
Sheu teaches automatically determining data sets associated with both the selected first group identifier and at least one variable type defined in the selected first operation definition (“When the condition "contains" is selected by the user, the system 100 automatically generates a variable definition phrase to define the two input parameters as range variables of polygon class.” Paragraph 0074; further, “At the block 440, the user is prompted to select a command from a list of defined commands. In one embodiment, the user is prompted to select from a list of defined commands that are applicable to the objects defined in the variable definition phrase. For example, if the variable definition phrase defines a variable "polygon", then the user can select from those commands that are applicable to the "polygon" variable, such as "enlarge", "reduce", "rotate", "measure area", and so forth.” Paragraph 0075).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the data analysis tool taught by Arasan and O’Halloran to identify data sets using both the group identifier and the data type for an operation, as taught by Sheu. It would have been obvious because it represents the application of a known technique (i.e. using the data types needed by an operation to 
In regard to claims 12 and 23, they are substantially similar to claim 1 and accordingly are rejected under similar reasoning.
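For illustration only, the combination of limitations addressed in the rejection of claim 1 can be sketched in Python. This is a minimal sketch under stated assumptions and is not taken from Arasan, O’Halloran, Sheu, or the application: the names `OperationDefinition`, `determine_data_sets`, `run_operation`, and the sample group identifiers and variable types are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical data store: (group identifier, variable type) -> data set.
DataStore = Dict[Tuple[str, str], List[float]]


@dataclass(frozen=True)
class OperationDefinition:
    """Hypothetical operation definition naming the variable types it uses."""
    name: str
    input_types: Tuple[str, ...]        # variable types operated on
    derived_type: str                   # variable type of the derived data set
    apply: Callable[..., List[float]]   # the operation itself


def determine_data_sets(store: DataStore, group_id: str,
                        op: OperationDefinition) -> List[List[float]]:
    """Select the data sets matching BOTH the selected group identifier
    and the variable types named in the selected operation definition."""
    return [store[(group_id, var_type)] for var_type in op.input_types]


def run_operation(store: DataStore, group_id: str,
                  op: OperationDefinition) -> List[float]:
    """Process the determined data sets and store the derived data set
    under the same group identifier."""
    inputs = determine_data_sets(store, group_id, op)
    derived = op.apply(*inputs)
    store[(group_id, op.derived_type)] = derived
    return derived


def elementwise_ratio(num: List[float], den: List[float]) -> List[float]:
    return [n / d for n, d in zip(num, den)]


# Assumed sample data: two groups, each with two variable types.
store: DataStore = {
    ("plant_A", "flow"): [10.0, 20.0],
    ("plant_A", "pressure"): [2.0, 4.0],
    ("plant_B", "flow"): [9.0, 9.0],
    ("plant_B", "pressure"): [3.0, 3.0],
}
ratio_op = OperationDefinition(
    name="flow_per_pressure",
    input_types=("flow", "pressure"),
    derived_type="flow_per_pressure",
    apply=elementwise_ratio,
)

# The "user selection" is the pair (ratio_op, "plant_A").
result = run_operation(store, "plant_A", ratio_op)
```

In this sketch the user supplies only the operation and the group identifier; the input data sets are resolved from the variable types listed in the operation definition, which is the feature the rejection attributes to Sheu.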

In regard to claim 2, O’Halloran further teaches associating a subset of the data sets of a plurality of variable types as related data sets having a related data set designation, wherein the receiving of the selection of the first group identifier comprises receiving a selection of a related data set designation (“A data group 34 is a collection of datasets 24 having identical blocking variables.” Paragraph 0049).
In regard to claim 13, it is substantially similar to claim 2 and accordingly is rejected under similar reasoning.

In regard to claim 3, Arasan further teaches associating the derived data set as one of the variable types (“In various embodiments, output data 510 may be stored in a one or more data storage formats of BDMS 200, as disclosed above. Transformation platform 504 may also generate metadata for storage in metadata store 514. The metadata may describe the newly generated output data 510 and/or the transformation used to generate the output data 510. Put another way, metadata may describe what the new output variables are and where the output variables came from. For example, metadata may include a data type, original source variables, logic used to generate 
In regard to claim 14, it is substantially similar to claim 3 and accordingly is rejected under similar reasoning.

In regard to claim 4, Arasan further teaches that the method is repeated to generate a further derived data set by accessing the derived data set as one of the plurality of data sets (“In various embodiments, existing data 516 may also be processed for analysis 518 resulting in metadata generation and updates.” Paragraph 0046), wherein the processing to generate the derived data set includes processing at least one data set including the derived data set in response to the selection according to the at least one operation definition to generate the further derived data set (“Transformation platform 504 may receive a request to derive an output variable (Block 602). The request may include parameters such as proposed logic for application to a source variable and a proposed execution time and/or frequency. The logic may be used to derive output data for an output variable.” Paragraph 0047; alternatively or additionally, “In various embodiments, output variables 408 may originate from multiple source variables 402. For example, as illustrated, source variable 2 is transformed into derived variable 1, source variable 3 is transformed into derived variable 2, derived variable 1 and derived variable 2 are both transformed into derived variable 5, and derived variable 5 is transformed into output 2
In regard to claim 15, it is substantially similar to claim 4, and accordingly is rejected under similar reasoning.
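The derivation chain quoted from Arasan paragraph 0038 (source variable 2 to derived variable 1, source variable 3 to derived variable 2, both to derived variable 5, and then to output 2) can be pictured as a small lineage graph. The following Python sketch is illustrative only; the dictionary layout and the function `source_variables` are hypothetical and are not Arasan's implementation.

```python
from typing import Dict, List, Set

# Each variable maps to the variables it was derived from,
# following the example quoted from Arasan paragraph 0038.
LINEAGE: Dict[str, List[str]] = {
    "derived_1": ["source_2"],
    "derived_2": ["source_3"],
    "derived_5": ["derived_1", "derived_2"],
    "output_2": ["derived_5"],
}


def source_variables(variable: str, lineage: Dict[str, List[str]]) -> Set[str]:
    """Walk the lineage back to the original source variables."""
    parents = lineage.get(variable)
    if not parents:
        # No recorded parents: treat the variable as a source variable.
        return {variable}
    found: Set[str] = set()
    for parent in parents:
        found |= source_variables(parent, lineage)
    return found


sources = source_variables("output_2", LINEAGE)
```

Here a derived variable (derived variable 5) is itself the input to a further transformation, which corresponds to the repeated processing recited in claim 4.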

In regard to claim 5, Arasan further teaches storing the derived data set to be available for the accessing (“In various embodiments, output data 510 may be stored in a one or more data storage formats of BDMS 200, as disclosed above.” Paragraph 0045).
In regard to claim 16, it is substantially similar to claim 5, and accordingly is rejected under similar reasoning.


In regard to claim 7, Arasan teaches the method of claim 1, as above. However, Arasan fails to explicitly teach that at least some of the associations between variable types and data sets are based on user input.
Abuelsaad teaches that at least some of the associations between variable types and data sets are based on user input (“In an alternate embodiment, the comparison program presents the data types of the compared data sets to a user and receives input from the user identifying matches,” paragraph 0092).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the data set analysis system taught by Arasan to include the matching of related data types, as taught by Abuelsaad. One would have been motivated to do so in order to determine if a data set could be sold for a higher price, as taught by Abuelsaad (“In an alternate embodiment, storefront program 104 may use the relevancy score in the 
In regard to claim 18, it is substantially similar to claim 7 and accordingly is rejected under similar reasoning.

In regard to claim 8, Arasan further teaches that the data sets have associated meta data, and at least parameters of the meta data are processed as meta data for the at least one variable type (“The variables may be cataloged as they are ingested and stored using data storage formats 201. The catalog may track the location of variables by identifying the storage format, the table, and/or the variable name for each variable available through virtualized database structure 220. The catalog may also include metadata describing what the variables are and where the variables came from such as data type, original source variables, timestamp, access restrictions, sensitivity of the data, and/or other descriptive metadata.” Paragraph 0031).
In regard to claim 19, it is substantially similar to claim 8, and accordingly is rejected under similar reasoning.

In regard to claim 9, Abuelsaad further teaches that the meta data for the at least one variable type is compared with the meta data for the data sets and a data set is 

In regard to claim 10, Arasan teaches the method of claim 1, as above. However, Arasan fails to explicitly teach that at least some of the associations between variable types and data sets are automatically identified based on associating data sets to a variable type dependent upon a comparison of the metadata between the data sets.
Abuelsaad teaches that at least some of the associations between variable types and data sets are automatically identified based on associating data sets to a variable type dependent upon a comparison of the metadata between the data sets (“Column ID program 106 parses the column data of the column (step 304) to determine the data type of the column data. Column ID program 106 may use one or more methods, alone or in combination, to parse the column data (step 304). An exemplary embodiment including some such methods is discussed in more detail in connection with FIG. 4.”, paragraph 0061, wherein “Column ID program 106 associates the column with the ADT (step 306) by metadata associating the column with the data type determined in step 304” paragraph 0062).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the data set analysis system taught by Arasan to include the matching of related data types, as taught by Abuelsaad. One would have been motivated to do so in order to allow for more accurate comparison of data, as taught by Abuelsaad (“Embodiments of the present invention recognize that identifying data by an abstract data type enables more accurate comparisons. For example, comparing two integers comprising digits identical to one another would result in a high degree of similarity. However, identifying the first integer as a dollar amount and the second integer as a phone number enables a more accurate comparison, which results in a low degree of similarity.” Paragraph 0018). Alternatively, it represents the application of a known technique (i.e. automatically generating metadata match candidates from identified metadata, as taught by Abuelsaad) to a known system (i.e. the data analysis system taught by Arasan) ready for improvement to yield predictable results (i.e. data types may be automatically determined).
In regard to claim 21, it is substantially similar to claim 10, and accordingly is rejected under similar reasoning.
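As a rough illustration of the kind of automatic association discussed for claim 10, the following Python sketch infers an abstract data type for each column from its values and groups columns by the inferred type. It is a simplified, hypothetical stand-in: the patterns, the function names `infer_type` and `associate`, and the sample columns are assumptions and do not reproduce the Column ID program described by Abuelsaad.

```python
import re
from typing import Dict, List

# Hypothetical patterns for two abstract data types, echoing the dollar
# amount versus phone number example quoted from Abuelsaad paragraph 0018.
PATTERNS = {
    "dollar_amount": re.compile(r"^\$\d+(\.\d{2})?$"),
    "phone_number": re.compile(r"^\d{3}-\d{3}-\d{4}$"),
}


def infer_type(values: List[str]) -> str:
    """Return the first abstract type whose pattern matches every value."""
    for type_name, pattern in PATTERNS.items():
        if values and all(pattern.match(v) for v in values):
            return type_name
    return "unknown"


def associate(columns: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Group column names by inferred abstract data type (the metadata)."""
    by_type: Dict[str, List[str]] = {}
    for column_name, values in columns.items():
        by_type.setdefault(infer_type(values), []).append(column_name)
    return by_type


# Assumed sample columns; the phone number is fictional.
columns = {
    "price": ["$10.00", "$3.50"],
    "contact": ["555-010-0199"],
    "notes": ["n/a"],
}
associations = associate(columns)
```

Two columns holding digits are kept apart here because the inferred metadata differs, which is the point of the passage quoted from paragraph 0018.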

In regard to claim 11, Arasan further teaches generating a user interface displaying a schematic diagram of interconnected components of a system (i.e. logic map, “With reference to FIG. 4, a logic map 400 is shown in a graphical form depicting transformations 304 applied to source data 302 at a variable (e.g., column of a table) level in accordance with various embodiments. A user may request an output variable by using a graphical tool to generate logic maps for the output variable.” Paragraph 0036; further, “[...] a submission from a graphical tool. The request may contain proposed logic for transforming source data into output data.” Paragraph 0039), and the accessing the data sets comprises accessing data sets for the at least one selected component (“In various embodiments, DFS 100 may process hundreds of thousands of records from a single data source. DFS 100 may also ingest data from hundreds of data sources. The data may be processed through data transformations to generate output variables from input variables. In that regard, input variables may be mapped to output variables by applying data transformations to the input variables and intermediate variables generated from the input values.” Paragraph 0022).
In regard to claim 22, it is substantially similar to claim 11 and accordingly is rejected under similar reasoning.

Response to Arguments
Applicant’s arguments, see pages 7-11, filed 2/11/2021, with respect to the rejection(s) of claim(s) 1-5 and 7-23 under 35 U.S.C. 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of Arasan, O'Halloran, and Sheu. For more information please refer to the relevant sections above.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ARTHUR GANGER whose telephone number is (571)272-0270.  The examiner can normally be reached on 10:00 AM - 7:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Robert Beausoliel, can be reached on (571) 272-3645. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 






/ROBERT W BEAUSOLIEL JR/
Supervisory Patent Examiner, Art Unit 2167