DETAILED ACTION
1.	The present application, No. 16/726,223, filed on 12/23/2019, is being examined under the first inventor to file provisions of the AIA.

Drawings
2.	The drawings received on 12/23/2019 are accepted by the Examiner.
Review under 35 USC § 101
3.	Claims 1-20 are directed to a process, an article of manufacture and a machine.
Claims 1-9 appear to fall within one of the statutory categories [e.g. a process]. The process is a computer-implemented method for pre-processing data, the method comprising normalizing raw data included in each data set included in a plurality of data sets to generate normalized data within the data set; aggregating the normalized data within the data set based on a time duration associated with a first data set to generate aggregated data within the data set; and joining the plurality of data sets that include aggregated data to the first data set to generate a joined data set. Claims 1-9 do not appear to fall within any of the groupings of abstract ideas enumerated in the 2019 PEG.
Claims 10-18 appear to fall within one of the statutory categories [e.g. an article of manufacture]. The article of manufacture is a non-transitory computer-readable storage medium including instructions to perform steps for pre-processing data comprising normalizing raw data included in each data set included in a plurality of data sets to generate normalized data within the data set; aggregating the normalized data within the data set based on a time duration associated with a first data set to generate aggregated data within the data set; and joining the plurality of data sets that include aggregated data to the first data set to generate a joined data set. Claims 10-18 do not appear to fall within any of the groupings of abstract ideas enumerated in the 2019 PEG.
Claims 19-20 appear to fall within one of the statutory categories [e.g. an apparatus]. Claims 19-20 recite a system comprising a memory storing instructions; and a processor that is coupled to the memory and, when executing the instructions, is configured to normalize raw data included in each data set included in a plurality of data sets to generate normalized data within the data set; aggregate the normalized data within the data set based on a time duration associated with a first data set to generate aggregated data within the data set; and join the plurality of data sets that include aggregated data to the first data set to generate a joined data set. Claims 19-20 do not appear to fall within any of the groupings of abstract ideas enumerated in the 2019 PEG.
Claims 1-20 therefore qualify as eligible subject matter under 35 USC 101.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
4.	The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1, 2, 6, 7, 10, 13, 14, 16, 18 and 19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Bath et al. (US 2018/0089328 A1), hereinafter Bath.
Referring to claims 1, 10 and 19, Bath discloses a computer-implemented method (See Figure 4, para. [0544], para. [0545], a computer system includes computer-readable media for storing program instructions and data, including all forms of nonvolatile memory, media and memory devices) for pre-processing data (See Figure 5, para. [0135], the system receives data blocks from a forwarder and parses the data to organize it into events and into buckets, where each bucket stores events associated with a specific time range based on the timestamps associated with each event), the method comprising:
for each data set included in a plurality of data sets (See para. [0095], para. [0130] and para. [0283], receiving data from input source(s), the received data stream(s) is/are event data sets), normalizing raw data included in the data set to generate normalized data within the data set (See para. [0087], para. [0099], para. [0262], the system formats the raw data to be stored as events and/or fields, the raw data comprises various data items of different data types that may be stored at different locations within the raw data, each event 1 through K includes a field that is nine characters in length beginning after a semicolon on a first line of the raw data, note add-on 224 of the system helps to transform and normalize data from various sources);
for each data set included in the plurality of data sets, aggregating the normalized data within the data set based on a time duration associated with a first data set to generate aggregated data within the data set (See para. [0104], para. [0145], para. [0422] and table 1, the system identifies metrics of the stored data events [e.g. formatted/normalized data] and aggregates metrics of the stored data event(s) into predefined aggregate time windows to reduce the quantity of data, note the metrics can be defined as metric time series, each of which includes a series of timestamp/value tuples for a specific tuple of dimension values, for example, CPU statistics for a predetermined number of hosts can be collected every minute such that each host’s stream of per-minute timestamp/CPU pairs constitutes a single metric time series, also note in table 1, the time duration was 30 days, total samples per day was 86400 and there were 7 aggregations); and
joining the plurality of data sets that include aggregated data to the first data set to generate a joined data set (See para. [0237], para. [0243], the system receives the raw data obtained from the data source(s), formats or processes the raw data into events, timestamps the events, and integrates/joins the result data with the result data from other external data source(s) and/or from data stores; the system provides reports of the integrated/joined data set concurrently).
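For illustration only, the three steps recited in claims 1, 10 and 19 (normalizing, aggregating over a time duration, and joining) can be sketched as follows. This is a minimal sketch and is not taken from the application or from Bath. The function names, the min-max normalization, the 60-second duration and the sample values are all assumptions.

```python
from collections import defaultdict

def normalize(rows):
    """Min-max normalize (timestamp, value) rows to [0, 1] (assumed scheme)."""
    values = [v for _, v in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    return [(t, (v - lo) / span) for t, v in rows]

def aggregate(rows, duration):
    """Average the values that fall in each window of `duration` seconds."""
    buckets = defaultdict(list)
    for t, v in rows:
        buckets[t - t % duration].append(v)
    return {start: sum(vs) / len(vs) for start, vs in buckets.items()}

def join(first, others):
    """Left-join aggregated data sets onto the first data set by window start."""
    return [(t, v) + tuple(o.get(t) for o in others) for t, v in first]

# Hypothetical data: the first data set has one row per 60 s,
# while the second data set is sampled every 20 s.
first = [(0, 1.0), (60, 2.0)]
second = [(0, 10), (20, 20), (40, 30), (60, 40), (80, 50), (100, 60)]

joined = join(first, [aggregate(normalize(second), 60)])
```

In this sketch the time duration of the first data set (60 s) fixes the aggregation window, so every row of the first data set has at most one matching aggregated row in each other data set.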
As to claims 2 and 13, Bath discloses determining that the raw data included in each data set included in the plurality of data sets has a different frequency than raw data included in the first data set (See para. [0409], para. [0450] and para. [0451], the collection/storage rate (collection frequency) of a metric associated with machine-generated data [e.g. raw data] can be increased or decreased, for example, a first metric 356-1 and a second metric 356-2 are selected by a user using the interface 322. The first metric 356-1 has a first plurality of metric values associated therewith, each corresponding to a measurement of the metric 356-1 at an instance of time that the data is collected. The first metric 356-1 has a first collection frequency. The second metric 356-2 has a second plurality of metric values associated therewith, each corresponding to a measurement of the metric 356-2 at an instance in time the data is collected. The second metric 356-2 has a second collection frequency. The first collection frequency and the second collection frequency can be different. When two or more metrics 356-1 and 356-2 are selected for simultaneous display, the time axis and/or default time range of the selected metrics 356-1 and 356-2 is adjusted such that each selected metric 356-1 and 356-2 is displayed in an overlapping time frame).
As to claims 6 and 14, Bath discloses wherein joining the plurality of data sets that include aggregated data to the first data set (See para. [0237], para. [0243], the system receives the raw data obtained from the data source(s), formats or processes the raw data into events, timestamps the events, and integrates/joins the result data with the result data from other external data source(s) and/or from data stores; the system provides reports of the integrated/joined data set concurrently) comprises assigning one or more primary keys to rows within the plurality of data sets that include aggregated data and the first data set and joining the plurality of data sets that include aggregated data to the first data set based on the one or more primary keys (See para. [0172] and para. [0520], the system performs an aggregation analysis of metrics for the data events that include values in particular keys, the data events are stored as field-value pairs [e.g. in rows], the fields are associated with keys, the system performs aggregation analysis on the specific values in the specific keys [specific fields]).
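For illustration only, the key-based joining recited in claims 6 and 14 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from Bath: rows are modeled as dictionaries, the primary key is a tuple of field values, and the field names ("source", "window", "speed", "temp") are hypothetical.

```python
def assign_keys(rows, key_fields):
    """Index rows (dicts) by a primary key built from the given fields."""
    return {tuple(row[f] for f in key_fields): row for row in rows}

def join_on_keys(first, others, key_fields):
    """Merge rows from `others` into the rows of `first` that share a key."""
    keyed_others = [assign_keys(rows, key_fields) for rows in others]
    joined = []
    for key, row in assign_keys(first, key_fields).items():
        merged = dict(row)
        for keyed in keyed_others:
            merged.update(keyed.get(key, {}))  # no match: row is left as is
        joined.append(merged)
    return joined

# Hypothetical aggregated rows keyed by (source, window start).
first = [{"source": "a", "window": 0, "speed": 0.5},
         {"source": "a", "window": 60, "speed": 0.7}]
temps = [{"source": "a", "window": 0, "temp": 0.1}]

joined = join_on_keys(first, [temps], ("source", "window"))
```

Here the second row of the first data set has no counterpart in the other data set, so it is carried through without a "temp" field.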
As to claims 7 and 16, Bath discloses wherein the plurality of data sets comprises a plurality of database tables (See para. [0082], the data sets are associated with relational data tables). 
As to claim 18, Bath also discloses wherein each data set included in the plurality of data sets includes data from at least one of a Controller Area Network (CAN) bus, an event data recorder (EDR), on-board diagnostic information, a head unit, an infotainment system, an electronic control unit (ECU), or a sensor (See para. [0093], para. [0096] and para. [0100], each machine data set includes data from web servers, routers, sensors, Internet of Things devices, etc., the data generated by such data sources includes server log files, activity log files, configuration files, sensor measurements, performance measurements). 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

5.	Claims 3, 4 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Bath (US 2018/0089328 A1) in view of Ackerman et al. (US 2016/0246811 A1), hereinafter Ackerman.
As to claims 3 and 15, Bath discloses normalizing the raw data included in the data set (See para. [0087], para. [0099], para. [0262], the system formats the raw data to be stored as events and/or fields, the raw data comprises various data items of different data types that may be stored at different locations within the raw data, each event 1 through K includes a field that is nine characters in length beginning after a semicolon on a first line of the raw data, note add-on 224 of the system helps to transform and normalize data from various sources) but does not explicitly disclose determining a scaling value for the data set and scaling the raw data included in the data set based on the scaling value and an offset value.
Ackerman discloses determining a scaling value for the data set (See para. [0074] and Figure 5, the system determines shifting [e.g. bit values] or scaling factor [e.g. greatest common divisor of the difference between values] for the incoming data streams or data sets) and scaling the raw data included in the data set based on the scaling value and an offset value (See para. [0055], para. [0074] and para. [0093], para. [0094], applying offset value and scaling values to one or more received input streams or datasets).
Therefore, it would have been obvious to a person of ordinary skill in the computer art before the effective filing date of the claimed invention to modify Bath's system to determine a scaling factor and/or an offset value for the data set, as taught by Ackerman. A skilled artisan would have been motivated to store data more efficiently and allow fewer memory resources to be consumed for storage (See Ackerman, para. [0026]). In addition, both references (Ackerman and Bath) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as processing and preparing data obtained from data streams. This close relation between both references highly suggests an expectation of success.
As to claim 4, Ackerman further discloses wherein the scaling value for the data set is determined by subtracting a minimum data value included in the data set from a maximum data value included in the data set (See para. [0074] and Figure 5, subtracting the minimum value from each value, or shifting by the number of bits determined by the bitwise statistics, or scaling by a scale factor determined by the greatest common divisor of the differences between values, using only the number of bits necessary for the difference [e.g. maximum value - minimum value]).
Therefore, it would have been obvious to a person of ordinary skill in the computer art before the effective filing date of the claimed invention to modify Bath's system to determine a scaling factor and/or an offset value for the data set, as taught by Ackerman. A skilled artisan would have been motivated to store data more efficiently and allow fewer memory resources to be consumed for storage (See Ackerman, para. [0026]). In addition, both references (Ackerman and Bath) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as processing and preparing data obtained from data streams. This close relation between both references highly suggests an expectation of success.
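For illustration only, the scaling recited in claims 3 and 4 can be sketched as follows. This is a minimal sketch and is not taken from the application or from Ackerman. The scaling value is computed as the maximum minus the minimum, as claim 4 recites; the use of the minimum as the offset value, and the handling of constant data, are assumptions.

```python
def scale(values):
    """Scale values using scaling value = max - min and offset = min."""
    offset = min(values)              # assumed choice of offset value
    scaling = max(values) - offset    # maximum minus minimum, per claim 4
    if scaling == 0:
        return [0.0 for _ in values]  # constant data: nothing to scale
    return [(v - offset) / scaling for v in values]

scaled = scale([10, 15, 20])
```

With these assumptions the smallest raw value maps to 0.0 and the largest to 1.0.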
6.	Claims 5, 11 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Bath (US 2018/0089328 A1) in view of Wang et al. (US 2019/0155672 A1), hereinafter Wang.
As to claim 5, Bath discloses, for each data set included in the plurality of data sets, sampling the normalized data within the data set by at least one of up-sampling or down-sampling the normalized data (See para. [0387], para. [0422] and table 1, each data set is associated with a gauge metric, which includes any metric that has a value that can go up or down across samples, for example, in table 1, total samples per day is 86400).
In addition, Wang also discloses, for each data set included in the plurality of data sets, resampling the normalized data within the data set by at least one of up-sampling or down-sampling the normalized data (See para. [0004] and para. [0006], for each time-series data set received from external sources, performing time-stamp up-sampling or time-stamp down-sampling and removing noise from the received plurality of time-series data).
Therefore, it would have been obvious to a person of ordinary skill in the computer art before the effective filing date of the claimed invention to modify Bath's system to up-sample or down-sample the normalized data, as taught by Wang. A skilled artisan would have been motivated to evaluate data stream sizes since large groups of data points would degrade network efficiency and operation (See Wang, para. [0002]). In addition, both references (Wang and Bath) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as processing and preparing data obtained from data streams. This close relation between both references highly suggests an expectation of success.
As to claim 11, Bath discloses, for each data set included in the plurality of data sets, sampling the normalized data within the data set (See para. [0387], para. [0422] and table 1, each data set is associated with a gauge metric, which includes any metric that has a value that can go up or down across samples, for example, in table 1, total samples per day is 86400).
In addition, Wang also discloses, for each data set included in the plurality of data sets, resampling the normalized data within the data set (See para. [0004] and para. [0006], for each time-series data set received from external sources, performing time-stamp up-sampling or time-stamp down-sampling and removing noise from the received plurality of time-series data).
Therefore, it would have been obvious to a person of ordinary skill in the computer art before the effective filing date of the claimed invention to modify Bath's system to up-sample or down-sample the normalized data, as taught by Wang. A skilled artisan would have been motivated to evaluate data stream sizes since large groups of data points would degrade network efficiency and operation (See Wang, para. [0002]). In addition, both references (Wang and Bath) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as processing and preparing data obtained from data streams. This close relation between both references highly suggests an expectation of success.
As to claim 12, Wang also discloses wherein the re-sampling comprises at least one of up-sampling or down-sampling the normalized data (See para. [0004] and para. [0006], for each received time-series data sets from external sources, performing time-stamp up-sampling or time stamp down-sampling and removing noise from the received plurality of time-series data).
Therefore, it would have been obvious to a person of ordinary skill in the computer art before the effective filing date of the claimed invention to modify Bath's system to up-sample or down-sample the normalized data, as taught by Wang. A skilled artisan would have been motivated to evaluate data stream sizes since large groups of data points would degrade network efficiency and operation (See Wang, para. [0002]). In addition, both references (Wang and Bath) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as processing and preparing data obtained from data streams. This close relation between both references highly suggests an expectation of success.
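For illustration only, the up-sampling and down-sampling recited in claims 5, 11 and 12 can be sketched as follows. This is a minimal sketch and is not taken from the application, Bath or Wang. Down-sampling by averaging consecutive samples and up-sampling by repeating each sample (forward fill) are assumed techniques, and the function names and sample values are hypothetical.

```python
def down_sample(values, factor):
    """Average each run of `factor` consecutive samples into one sample."""
    chunks = [values[i:i + factor] for i in range(0, len(values), factor)]
    return [sum(chunk) / len(chunk) for chunk in chunks]

def up_sample(values, factor):
    """Repeat each sample `factor` times (forward fill)."""
    return [v for v in values for _ in range(factor)]

down = down_sample([1.0, 3.0, 5.0, 7.0], 2)
up = up_sample([1.0, 2.0], 2)
```

Either operation brings a data set sampled at one frequency onto the sampling grid of another data set before the two are joined.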
7.	Claims 8, 9 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Bath (US 2018/0089328 A1) in view of Walters et al. (US 2020/00112626 A1), hereinafter Walters.
As to claims 8 and 17, Bath does not explicitly disclose training at least one machine learning model based on the joined data set.
Walters discloses training at least one machine learning model based on the joined data set (See para. [0099], para. [0050], para. [0054], para. [0067], the system is trained using one of the training modules based on data stored in aggregated data profiles).
Therefore, it would have been obvious to a person of ordinary skill in the computer art before the effective filing date of the claimed invention to modify Bath's system to train at least one machine learning model based on the joined data set, as taught by Walters. A skilled artisan would have been motivated to identify and compare statistical metrics of a data set using a machine learning model trained to use learned features of data to improve search accuracy (See Walters, para. [0059]). In addition, both references (Walters and Bath) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as deriving statistical metrics of a data set using machine learning models. This close relation between both references highly suggests an expectation of success.
As to claim 9, Bath discloses joining at least one other data set including raw data having a same frequency as raw data included in the first data set to the first data set (See para. [0409], para. [0450] and para. [0451], the collection frequency of a metric can be at a default rate associated with machine-generated data [e.g. raw data], for example, a first metric 356-1 and a second metric 356-2 are selected by a user using the interface 322. The first metric 356-1 has a first plurality of metric values associated therewith, each corresponding to a measurement of the metric 356-1 at an instance of time that the data is collected. The first metric 356-1 has a first collection frequency. The second metric 356-2 has a second plurality of metric values associated therewith, each corresponding to a measurement of the metric 356-2 at an instance in time the data is collected. The second metric 356-2 has a second collection frequency. The first collection frequency and the second collection frequency can be the same or different).
Walters discloses joining at least one other data set including raw data having a same frequency as raw data included in the first data set to the first data set (See para. [0052], para. [0098] and para. [0099], aggregating a sample and a reference data set using a similarity metric, the data sets are combined or aggregated when the data scheme and/or data values match according to a frequency).
Therefore, it would have been obvious to a person of ordinary skill in the computer art before the effective filing date of the claimed invention to modify Bath's system to join at least one other data set including raw data having a same frequency as raw data included in the first data set, as taught by Walters. A skilled artisan would have been motivated to reduce overlapping values with another data set (See Walters, para. [0099]). In addition, both references (Walters and Bath) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as deriving statistical metrics of a data set using machine learning models. This close relation between both references highly suggests an expectation of success.
8.	Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Bath (US 2018/0089328 A1) in view of Ruvio et al. (US 2019/0036946 A1), hereinafter Ruvio.
As to claim 20, Bath does not explicitly disclose each data set included in the plurality of data sets comprises data collected by a respective sensor on a vehicle. 
Ruvio discloses each data set included in the plurality of data sets comprises data collected by a respective sensor on a vehicle (See para. [0107] and para. [0124], the acquired raw data can be collected by sensor(s) of a vehicle).
Therefore, it would have been obvious to a person of ordinary skill in the computer art before the effective filing date of the claimed invention to modify Bath's system to collect data from different data sources including a vehicle sensor, as taught by Ruvio. A skilled artisan would have been motivated to extend the pre-processing machine learning method to be used in different industries including vehicle data communication networks. In addition, both references (Ruvio and Bath) teach features that are directed to analogous art and they are directed to the same field of endeavor, such as processing and preparing data obtained from external source(s) using machine learning. This close relation between both references highly suggests an expectation of success.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Serle et al. (US Patent 9,891,907 B2) discloses a method to determine and generate visualizations of updates timelines indicating device components associated with a remote connected device at different times. A device component status detection and illustration apparatus may include a memory and processor with instructions to: obtain device selection parameters, determine one or more remote connected devices that satisfy the device selection parameters, and identify a remote connected device selected from one or more remote connected devices by a user. The instructions may further be executed to generate visualizations of updates timelines associated with the remote connected device, including information regarding device components associated with the identified remote connected device as of selected update times. This allows the apparatus to present information regarding the status of a particular device at different points in time.
Miller et al. (US 10,394,946 B2) discloses a method to formulate and refine field extraction rules that are used at query time on raw data with a late-binding schema. The field extraction rules identify portions of the raw data, as well as their data types and hierarchical relationships. These extraction rules are executed against very large data sets not organized into relational structures that have not been processed by standard extraction or transformation methods. By using sample events, a focus on primary and secondary example events help formulate either a single extraction rule spanning multiple data formats, or multiple rules directed to distinct formats. Selection tools mark up the example events to indicate positive examples for the extraction rules, and to identify negative examples to avoid mistaken value selection. The extraction rules can be saved for query-time use, and can be incorporated into a data model for sets and subsets of event data.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YUK TING CHOI whose telephone number is (571)270-1637.  The examiner can normally be reached on Monday-Friday 9am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alford W Kindred, can be reached at (571)272-4037.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/YUK TING CHOI/Primary Examiner, Art Unit 2153