Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-4, 6-8, 12-15, 17-19, 23-26, and 29-30 are presented for examination.
Oath/Declaration
For the record, the Examiner acknowledges that the Oaths/Declarations submitted on 2/18/2020 have been received.
Information Disclosure Statement
The information disclosure statements submitted on 3/21/2018, 7/30/2020, and 3/30/2021 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are considered by the examiner.
Drawings
The drawings filed on 3/21/2018 are acceptable for examination purposes.
Specification
The Specification filed on 3/21/2018 is acceptable for examination purposes.
Claim Objections
Claim 8 recites “kurtossis” in line 10.  This should be changed to “kurtosis.”
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 12-13, and 23-24 are rejected under 35 U.S.C. 103 as being unpatentable over Birvinskas et al. (“EEG Dataset Reduction and Feature Extraction using Discrete Cosine Transform,” herein Birvinskas).
Regarding Claim 12, 
Birvinskas teaches a method comprising:
	accessing a data structure having input data for an automated modeling algorithm, (Birvinskas, Page 200, Column 1, Paragraph 8, Line 1 “Artificial neural network (ANN) is a mathematical model that mimics some functional aspects of a biological neuron network [13].  The ANN consists of an interconnected group of artificial neurons.  These neurons are basic computational elements, often called either nodes or units.  The node receives input from some other nodes or from an external source.  Each input has an associated weight w, which can be modified so as to model synaptic learning.  The node computes some function f of the weighted sum of its inputs: [equation image in original: media_image1.png]” In other words, input from an external source is input data from a data structure, and artificial neural network is an automated modeling algorithm.)
	the input data comprising data items with attribute values for multiple entities over a time period (Birvinskas, Page 200, Column 1, Paragraph 2, Line 1 “DCT is a transformation method for converting a time series signal into basic frequency components.  Low frequency components are concentrated in first coefficients [remainder of quotation and equation image in original: media_image2.png]” and Page 200, Column 1, Paragraph 4, Line 1 “The input is a set of N data values (EEG samples, audio samples, or other data) and the output is a set of N DCT transform coefficients Y(u).” In other words, N data values (EEG samples, audio samples, or other data) is data items with attribute values for multiple entities over a time period.)
	wherein each data item in the input data indicates (i) a respective attribute value for at least one attribute, (ii) a respective time value within the time period, and (iii) a respective entity associated with the respective attribute value; (Birvinskas, Page 200, Column 2, Paragraph 6, Line 3 “The datasets were recorded using a healthy subject.  The subject was asked to move a cursor up and down on a computer screen, while his cortical potentials were taken.  During the recording, the subject received visual feedback of his slow cortical potentials (SCPs).  Training dataset consists of 268 labeled trials.  First 135 trials belong to class 0 and 133 trials belong to class 1.  Testing dataset consist of 293 unlabeled trials.  Each trial consists of 896 samples from each of 6 channels.  The sampling rate of 256 Hz and the recording length is 3.5s.” In other words, each move of the cursor is input data that indicates (i) a respective attribute value for at least one attribute, (ii) a respective time value within the time period, and (iii) a respective entity associated with the respective attribute value.  A healthy subject is the respective entity associated with the respective attribute value.)
	generating, with a processing device, at least one trend attribute that is a function of a time series of attribute values, the at least one trend attribute indicating a trend in the time series of attribute values, (Birvinskas, Page 200, Column 1, Paragraph 6, Line 1 “Applying this feature for EEG signals allow compressing useful data to the first few coefficients.  Therefore, only these coefficients can be used for classification using machine learning algorithms.  This kind of data compression may dramatically reduce input vector size and decrease time required for training and classification.” In other words, compressing useful data to the first few coefficients is generating at least one trend attribute, and coefficients of an EEG signal is indicating a trend in a time series of attribute values.)
	wherein generating the at least one trend attribute comprises: applying, for each entity, a frequency transform to a respective subset of attribute values for the at least one attribute based on the respective subset of attribute values being associated with the entity, and (Birvinskas, Page 200, Column 1, Paragraph 7, Line 1 “In this paper, we propose the use of the first few per cents of DCT transform coefficients for EEG signal classification.” In other words, DCT transform is frequency transform, EEG signal is respective subset of attribute values, coefficient is at least one attribute based on the respective subset of attribute values, and (from Column 2, Paragraph 6, Line 3, above) healthy subject is the entity.)
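	selecting, as the at least one trend attribute, at least one coefficient generated by the applied frequency transform; (Birvinskas, Page 201, Column 1, Paragraph 1, Line 1 “In Fig. 1a, an example of raw EEG data in time domain is shown. The signal is non-linear and non-periodic, this means no data can be excluded from classification using ANN. Fig. 1b shows discrete cosine transform of EEG data.  The resulting signal in frequency domain is less complex, energy is compressed into first few coefficients and all others are relatively small. These small coefficients can be omitted from classification.” In other words, coefficients is at least one coefficient, and discrete cosine transform is frequency transform.)

[Image in original: media_image3.png]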

	modifying, with the processing device, the data structure to include the at least one trend attribute; (Birvinskas, Page 201, Column 1, Paragraph 2, Line 2 “In this experiment, we chose first 50 DCT coefficients for classification, ignoring the rest of DCT data.  Since the signal has 896 samples, we are using less than 6% of original dataset size.” In other words, adding coefficients is modifying the data structure and using coefficients is at least one trend attribute.)
	updating, with the processing device, the input data to include trend attribute values for the at least one trend attribute, the trend attribute values including values of the at least one coefficient; and (Birvinskas, Page 201, Column 1, Paragraph 2, Line 2 “In this experiment, we chose first 50 DCT coefficients for classification, ignoring the rest of DCT data.  Since the signal has 896 samples, we are using less than 6% of original dataset size.” In other words, choosing 50 DCT coefficients for classification is updating the input data to include trend attribute values for the at least one trend attribute, and coefficients is at least one coefficient.)
	outputting, with the processing device, the trend attribute values from the data structure to a computing system that executes the automated modeling algorithm. (Birvinskas, Fig. 1b, and Page 201, Column 1, Paragraph 2, Line 1 “DCT data plot in Fig. 1b shows that higher coefficients are less than our predefined value.”  In other words, Fig. 1b shows outputting the trend attribute values from the data structure which are then used as input to the ANN, and ANN is automated modeling algorithm.)
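For illustration only, the passages of Birvinskas relied upon above (keeping the first 50 DCT coefficients of an 896-sample trial, and a node computing a function f of the weighted sum of its inputs) can be sketched roughly as follows. This sketch is not taken from Birvinskas or from the application: the unnormalized DCT-II convention, the synthetic signal, the tanh activation, the placeholder weights, and the names `dct_coefficients` and `node_output` are all assumptions made for the example.

```python
import math


def dct_coefficients(samples, keep):
    """Return the first `keep` coefficients of an unnormalized DCT-II.

    Assumed convention: Y(u) = sum_n x(n) * cos(pi * (2n + 1) * u / (2N)).
    """
    n_total = len(samples)
    if not 0 < keep <= n_total:
        raise ValueError("keep must be between 1 and len(samples)")
    return [
        sum(x * math.cos(math.pi * (2 * n + 1) * u / (2 * n_total))
            for n, x in enumerate(samples))
        for u in range(keep)
    ]


def node_output(inputs, weights, activation=math.tanh):
    """One artificial neuron: an activation applied to the weighted sum of inputs."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return activation(sum(w * x for w, x in zip(weights, inputs)))


# Synthetic stand-in for one 896-sample trial (not real EEG data).
signal = [0.5 + math.sin(2 * math.pi * 3 * n / 896) for n in range(896)]

features = dct_coefficients(signal, keep=50)  # reduced input vector
fraction_kept = len(features) / len(signal)   # 50 / 896, i.e. under 6%

weights = [0.001] * len(features)             # placeholder weights, untrained
y = node_output(features, weights)
```

Under this assumed convention, `features[0]` equals the sum of the samples, so dividing it by the number of samples gives the average signal value that the quoted passage attributes to the DC coefficient.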
Regarding claim 13,
	Birvinskas teaches the method of claim 12,
	wherein the input data comprises training data for the automated modeling algorithm, wherein the method further comprises training the automated modeling algorithm using the trend attribute values in the training data.  (Birvinskas, Page 201, Column 1, Paragraph 3, Line 1 “Classification of data was performed using the MATLAB Neural Network Toolbox.” And Page 201, Column 2, Paragraph 3, Line 1 “Created ANN networks were trained using Levenberg-Marquardt algorithm and training dataset.” In other words, classification of data was performed using the MATLAB Neural Network Toolbox is input data, training dataset is training data, and ANN is automated modeling algorithm.)
Claim 1 is a system claim corresponding to the combination of method claims 12 and 13.  Outside of that, claim 1 is the same as the combination of claims 12 and 13.  Birvinskas teaches training a neural network which implicitly requires a system with a processing device and a non-transitory computer-readable medium for storage.  Therefore, claim 1 is rejected for the same reasons as claims 12 and 13.
Regarding claim 2, 
	Birvinskas teaches the server system of claim 1,
	wherein the processing device is configured for generating the at least one trend attribute by performing, for each entity, additional operations comprising: identifying a respective subset of attribute values for the at least one attribute based on the selected attribute values being associated with the entity; (Birvinskas, Page 200, Column 1, Paragraph 7, Line 1 “In this paper, we propose the use of the first few per cents of DCT transform coefficients for EEG signal classification.” In other words, EEG signal is respective subset of attribute values, coefficient is at least one attribute based on the respective subset of attribute values, and (from Column 2, Paragraph 6, Line 3) healthy subject is the entity.)
Claim 23 is a non-transitory computer-readable medium claim corresponding to the combination of method claims 12 and 13.  Outside of that, claim 23 is the same as the combination of claims 12 and 13.  Birvinskas teaches training a neural network which implicitly requires a system with a processing device and a non-transitory computer-readable medium for storage.  Therefore, claim 23 is rejected for the same reasons as claims 12 and 13.
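Claim 24 is a non-transitory computer-readable medium claim corresponding to server system claim 2. Outside of that, they are the same.  Birvinskas teaches training a neural network which implicitly requires a system with a processing device and a non-transitory computer-readable medium for storage.  Therefore, claim 24 is rejected for the same reasons as claim 2.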
Claims 3-4, 6-8, 14-15, 17-19, 25-26, 29-30 are rejected under 35 U.S.C. 103 as being unpatentable over Birvinskas and Ghosh (US 20130132390 A1, herein Ghosh).
Regarding Claim 14,
	Birvinskas teaches the method of claim 13,
	[further comprising generating at least one additional trend attribute by performing operations comprising: identifying intervals in the time period; generating multiple cluster series, wherein each cluster series comprises clusters respectively associated with the intervals,]
	[wherein the processing device is configured for generating each cluster series by performing operations comprising: grouping, for each interval, a respective subset of attribute values into a respective cluster based on the grouped attribute values being associated with time values in the interval, and adding the respective cluster for the interval to the cluster series; and]
	[computing, for the multiple cluster series, respective additional trend attribute values, wherein each additional trend attribute value is computed as a function of the respective cluster series;]
	wherein updating the input data comprises performing, for at least some of the entities in the input data, operations comprising: identifying, for each entity, a respective cluster series having a respective behavioral attribute value that is similar to a respective behavior of a respective time series of attributes values for the entity, assigning a cluster membership to the entity based on the respective behavioral attribute value being similar to the respective behavior of the respective time series of attributes values for the entity, and (Birvinskas, Page 200, Column 2, Paragraph 6, Line 3 “The datasets were recorded using a healthy subject.  The subject was asked to move a cursor up and down on a computer screen, while his cortical potentials were taken.  During the recording, the subject received visual feedback of his slow cortical potentials (SCPs).  Training dataset consists of 268 labeled trials.  First 135 trials belong to class 0 and 133 trials belong to class 1.  Testing dataset consist of 293 unlabeled trials.  Each trial consists of 896 samples from each of 6 channels.  The sampling rate of 256 Hz and the recording length is 3.5s.” and Page 201, Column 1, Paragraph 1, Line 1 “In Fig. 1a, an example of raw EEG data in time domain is shown. The signal is non-linear and non-periodic, this means no data can be excluded from classification using ANN. Fig. 1b shows discrete cosine transform of EEG data.  The resulting signal in frequency domain is less complex, energy is compressed into first few coefficients and all others are relatively small. These small coefficients can be omitted from classification.”  In other words, each move of the cursor is input data, a healthy subject is entity, 133 trials belong to class 1 is at least some of the entities, EEG data in time domain is respective behavior of a respective time series of attribute values for the entity, and compressed into first few coefficients is assigning a cluster membership for the entity based on the respective behavioral attribute value.)
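	selecting, for the entity, an identifier of the cluster membership as a trend attribute value for the entity.  (Birvinskas, Page 200, Column 1, Paragraph 4, Line 1 “The input is a set of N data values (EEG samples, audio samples, or other data) and the output is a set of N DCT transform coefficients Y(u).  The first coefficient Y(0) is called the DC coefficient and holds average signal value.  The rest coefficients are referred to as the AC coefficients (these terms have been inherited from electrical engineering) [8].” In other words, DC coefficient is identifier for cluster membership as a trend attribute for the entity.)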
	Thus far, Birvinskas does not explicitly teach further comprising generating at least one additional trend attribute by performing operations comprising: identifying intervals in the time period; generating multiple cluster series, wherein each cluster series comprises clusters respectively associated with the intervals,
	Ghosh teaches further comprising generating at least one additional trend attribute by performing operations comprising: identifying intervals in the time period; generating multiple cluster series, wherein each cluster series comprises clusters respectively associated with the intervals, (Ghosh, See FIG 4. In other words, FIG. 4 shows a method for generating at least one additional trend attribute by performing operations, Step 404 is at least one additional trend, Step 408 is identifying intervals in the time period, Step 410 is generate a cluster, and the entire method is generating multiple clusters, respectively associated with intervals.)

[Image in original: media_image4.png (Ghosh, FIG. 4)]

	Thus far, Birvinskas does not explicitly teach wherein the processing device is configured for generating each cluster series by performing operations comprising: grouping, for each interval, a respective subset of attribute values into a respective cluster based on the grouped attribute values being associated with time values in the interval, and adding the respective cluster for the interval to the cluster series; and
	Ghosh teaches wherein the processing device is configured for generating each cluster series by performing operations comprising: grouping, for each interval, a respective subset of attribute values into a respective cluster based on the grouped attribute values being associated with time values in the interval, and adding the respective cluster for the interval to the cluster series; and (Ghosh, Paragraph [0007], Line 1, “In another aspect, a system for selectively providing an aggregated trend obtained at least in part by aggregating at least a subset of a plurality of individual trends is provided.  The system comprises at least one processor configured to: compute a measure of disorder in at least the subset of the plurality of individual trends; and decide whether to provide the aggregated trend based on the computed measure of disorder.” In other words, one processor configured is processing device configured for generating, aggregated trend is cluster series, subset of a plurality of individual trends is subset of attribute values, and decide whether to provide the aggregated trend is adding the respective cluster for the interval to the cluster series.)
	Thus far, Birvinskas does not explicitly teach computing, for the multiple cluster series, respective additional trend attribute values, wherein each additional trend attribute value is computed as a function of the respective cluster series;
	Ghosh teaches computing, for the multiple cluster series, respective additional trend attribute values, wherein each additional trend attribute value is computed as a function of the respective cluster series; (Ghosh, See FIG. 4. In other words, Step 406 “Provide Aggregate Trend” is additional trend attribute value computed as a function of the respective cluster series.)
	It would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Ghosh into the teaching of Birvinskas.  This would result in 
	Ghosh and Birvinskas are both directed to processing series of data that change over time. One of ordinary skill in the art would be motivated to combine the teaching of Ghosh into Birvinskas in order to effectively handle data that changes over time by clustering the data and aggregating trends.
Regarding claim 15,
	The combination of Birvinskas and Ghosh teaches the method of claim 14,
	wherein the updated input data comprises first data items having a first trend attribute value and second data items having a second trend attribute value; (Birvinskas, Page 200, Column 1, Paragraph 4, Line 1 “The input is a set of N data values (EEG samples, audio samples, or other data) and the output is a set of N DCT transform coefficients Y(u).  The first coefficient Y(0) is called the DC coefficient and holds average signal value.  The rest coefficients are referred to as the AC coefficients (these terms have been inherited from electrical engineering) [8].” In other words, DC coefficient is first trend attribute value and AC coefficients is second trend attribute value.)
	wherein training the automated modeling algorithm comprises executing segmentation logic based on the additional trend attribute values, wherein executing the segmentation logic comprises: (Birvinskas, Page 200, Column 2, Paragraph 5, Line 1 “In this paper, the classification error (3) is measured as a ratio of false results and total number of trials: [equation image in original: media_image5.png] true is the number of trials with correct classification result.”  In other words, classification error is segmentation logic based on the additional trend attributes.)
	applying, based on the first data items having the first trend attribute value, a first modeling function to the first data items, and applying, based on the second data items having the second trend attribute value, a second modeling function to the second data items. (Birvinskas, Page 200, Column 2, Paragraph 6, Line 3 “The datasets were recorded using a healthy subject.  The subject was asked to move a cursor up and down on a computer screen, while his cortical potentials were taken.  During the recording, the subject received visual feedback of his slow cortical potentials (SCPs).  Training dataset consists of 268 labeled trials. First 135 trials belong to class 0 and 133 trials belong to class 1…. The sampling rate of 256 Hz and the recording length is 3.5s.”  In other words, first trial is the first data items, and second trial is the second data items.)
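For illustration only, the classification error quoted from Birvinskas in the rejection of claim 15 is described in words as a ratio of false results to the total number of trials. Because the equation itself is not reproduced in this action, the following is merely one plausible reading of that description. The function name `classification_error` and the choice to return a fraction rather than a percentage are assumptions.

```python
def classification_error(predicted, actual):
    """Fraction of trials whose predicted class differs from the actual class.

    Assumed reading: error = (N_total - N_true) / N_total, where N_true is
    the number of trials with a correct classification result.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one trial is required")
    n_total = len(actual)
    n_true = sum(1 for p, a in zip(predicted, actual) if p == a)
    return (n_total - n_true) / n_total


# Hypothetical labels for four trials; one of the four is misclassified.
error = classification_error([0, 1, 1, 0], [0, 1, 0, 0])
```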
Regarding claim 17,
	The combination of Birvinskas and Ghosh teaches the method of claim 14,
	wherein the respective subset of attribute values are grouped into the respective cluster based on the at least one coefficient generated by the frequency transform.  (Birvinskas, Page 200, Column 1, Paragraph 4, Line 1 “The input is a set of N data values (EEG samples, audio samples, or other data) and the output is a set of N DCT transform coefficients Y(u).  The first coefficient Y(0) is called the DC coefficient and holds average signal value.  The rest coefficients are referred to as the AC coefficients (these terms have been inherited from electrical engineering) [8].” In other words, DC coefficient is at least one coefficient generated by the frequency transform.)
Regarding claim 18,
	The combination of Birvinskas and Ghosh teaches the method of claim 14,
	wherein grouping the respective subset of attribute values into the respective cluster comprises: identifying a first time series of attribute values for a first attribute and a second time series of attribute values for a second attribute; performing a principal component analysis on the first time series and the second time series; outputting a principal component data series from the principal component analysis; and grouping the principal component data series into the clusters. (Birvinskas, Fig. 1b, and Page 201, Column 1, Paragraph 2, Line 1 “DCT data plot in Fig. 1b shows that higher coefficients are less than our predefined value.”  In other words, Fig. 1a shows identifying a first time series of attribute values for a first attribute and a second time series of attribute values for a second attribute. Fig. 1b shows the results of performing a principal component analysis and outputting a principal component data series from the principal component analysis, which is then used as input data to the ANN.)
Regarding claim 19,
	The combination of Birvinskas and Ghosh teaches the method of claim 12,
	wherein the function of the respective time series uses changes in the respective time series over the time period to compute a trend attribute value, (Birvinskas, Page 200, Column 1, Paragraph 2, Line 1 “DCT is a transformation method for converting a time series signal into basic frequency components.  Low frequency components are concentrated in first coefficients [remainder of quotation and equation image in original: media_image6.png]” In other words, Y(u) is a function of the respective time series over the time period to compute a trend attribute value.)
	wherein the at least one trend attribute comprises at least one of: a statistical attribute, a duration attribute computed based on peaks and valleys in the respective time series, or a depression/recovery attribute computed based on rates of change between the peaks and the valleys in the respective time series, a skewness of a probability distribution of the respective time series, or a kurtossis of a probability distribution of the respective time series. (Birvinskas, Figs. 1a and 1b, and Page 201, Column 1, Paragraph 1, Line 1 “In Fig. 1a, an example of raw EEG data in time domain is shown. The signal is non-linear and non-periodic, this means no data can be excluded from classification using ANN. Fig. 1b show discrete cosine transform of EEG data.  The resulting signal in frequency domain is less complex, energy is compressed into first few coefficients and all others are relatively small. These small coefficients can be omitted from classification.” In other words, Fig. 1b shows a duration attribute computed based on peaks and valleys in the respective time series of Fig. 1a.)
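For illustration only, two of the alternatives recited in the claim language above, the skewness and the kurtosis of the distribution of a time series, can be computed from central moments as sketched below. The application's own definitions are not reproduced in this action, so the population (biased) moment formulas, the non-excess form of kurtosis, and the function names are assumptions.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    if not values:
        raise ValueError("at least one value is required")
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Third standardized moment: m3 / m2 ** 1.5."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("skewness is undefined for a constant series")
    return central_moment(values, 3) / m2 ** 1.5


def kurtosis(values):
    """Fourth standardized moment (not excess kurtosis): m4 / m2 ** 2."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("kurtosis is undefined for a constant series")
    return central_moment(values, 4) / m2 ** 2


# Hypothetical attribute values observed over five time steps.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
series_skewness = skewness(series)
series_kurtosis = kurtosis(series)
```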

[Image in original: media_image7.png (Birvinskas, Figs. 1a and 1b)]

Claims 3-4, and 6-8 are system claims corresponding to method claims 14-15, and 17-19 respectively.  Outside of that, they are the same. Birvinskas teaches training a neural network which implicitly requires a system with a processing device and a non-transitory computer-readable medium for storage.  Therefore, claims 3-4, and 6-8 are rejected for the same reasons as claims 14-15 and 17-19 respectively.
Claims 26 and 29-30 are non-transitory computer-readable medium claims corresponding to method claims 15 and 18-19 respectively.  Outside of that, they are the same. Birvinskas teaches training a neural network which implicitly requires a system with a processing device and a non-transitory computer-readable medium for storage.  Therefore, claims 26 and 29-30 are rejected for the same reasons as claims 15 and 18-19 respectively.
Conclusion
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to BART RYLANDER whose telephone number is (571)272-8359.  The examiner can normally be reached on Monday - Thursday 8:00 to 5:30.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached on 571-270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/B.I.R./Examiner, Art Unit 2124
/Vincent Gonzales/Primary Examiner, Art Unit 2124