Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office Action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 12/14/2020 has been entered.

Remarks
This Office Action is in response to applicant’s response filed on 12/14/2020 and RCE filed on 1/1/2021. Claims 1, 3-10, 12-16, and 18-20 are pending and under consideration.

Response to Arguments
Applicant’s amendments have overcome most of the previous rejections under § 112(b) for indefiniteness. Therefore, the previous rejections under § 112(b) have been withdrawn to the extent that they are not maintained in the current Office Action. However, as noted in the rejections below, the issue pertaining to “the time-series data” has not yet been resolved for independent claims 10 and 16.
Applicant’s arguments with respect to the rejection under § 103 over Madge in view of Grzegorowski have been considered but are moot under the new ground of rejection. In the new 

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 10, 12-16 and 18-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 10 and 16 recite the limitation “the time series data” in the last sub-paragraph of these claims. There is insufficient antecedent basis for this limitation in the claim. For purposes of examination, the above phrase has been interpreted to be “the time series data record.”
Claims that depend from one or more claims discussed above are also rejected for the same reasons, because they inherit the indefinite recitation of their parent claims.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


1.	Claims 1, 3-5, 8, 10, 12-14, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Madge, “Predicting Stock Price Direction using Support Vector Machines” (2015) (cited by applicant in IDS of 10/18/2018) in view of Kim, “Financial Time Series Forecasting using Support Vector Machines,” Neurocomputing 55 (2003) 307–319 (provided with the Office Action of October 2, 2020) and Grzegorowski et al., "Window-based feature extraction framework for multi-sensor data: A posture recognition case study," 2015 Federated Conference on Computer Science and Information Systems (FedCSIS), Lodz, 2015, pp. 397-405, doi: 10.15439/2015F425 (“Grzegorowski”).
As to claim 1, Madge teaches a method comprising:
receiving, by a computing device, a plurality of time-series data records associated with individuals in a population of individuals; [§ 1, paragraph 3: “daily closing prices for each stock from the years 2007 through 2014” (see also abstract: “daily closing prices for 34 technology stocks”), represented as series Ci (see page 3, table 1). With respect to the limitation “by a computing device,” Madge is understood to teach performing its method by a computing device (see, e.g., § 1, paragraph 3, which refers to the “code written” for the disclosed method, and the abstract, which refers to machine learning).]
generating, by the computing device, a feature vector comprising a first plurality of time-based features using attribute data for a first attribute collected over a period of time and contained in a first time-series data record, [§ 2.4: feature vector “x” for a support vector machine (SVM) model. Page 3, table 1, teaches that the feature vector may comprise a plurality of time-based features (e.g., stock price volatility and stock momentum) computed using attribute data for a first attribute (e.g., values for the attribute of closing price C, as described in the caption of the table: “We let Ct be the stock’s closing price at time t…”), which are part of the collected historical time series of closing prices noted above.]
generating, by the computing device, a label by computing a value using a subset of data in the first time-series data record, wherein the feature vector and the label define a training vector; [§ 2.4: label “y” of the support vector machine model. § 3.4 teaches that for a particular feature vector (in the training set Xtrain) the label is calculated as the stock’s directional change m days in the future (see table 1 caption and paragraph 4 in particular). The directional change is calculated from the stock’s price movement and is thus computed using a subset of the stock’s price data. The feature vector and the label define a training vector (Xtrain, ytrain), as described in § 3.4, paragraph 5.]
creating, by the computing device, a training set comprising a plurality of training vectors by repeating the foregoing operations for each time-series data record in the plurality of time-series data records; [Abstract: “This study uses daily closing prices for 34 technology stocks to calculate price volatility and momentum for individual stocks and for the overall sector. These are used as parameters to the SVM model.” That is, Madge teaches the same process for all 34 technology stocks.] 
providing, by the computing device, (a part of the training set) to a machine learning model to train the machine learning model; and [§ 3.4, last paragraph: “We supply the feature vectors Xtrain as well as their corresponding output vectors ytrain to the SVM model. This is the training phase.”]
forecasting an attribute represented by the time-series data record for any individual in the population of individuals using the trained machine learning model. [§ 3.4, last paragraph: “We then supply only the testing feature vectors Xtest and have the model predict their corresponding output vectors.”]
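For illustration only, the feature and label mapping discussed above can be sketched as follows. This is a minimal sketch and is not code from Madge. The function names, the window convention, and the exact feature formulas (an average daily relative change standing in for the volatility indicator, and an average of +1/-1 day-over-day directions standing in for the momentum indicator) are assumptions made for the example. The label follows the description above: the direction of the closing price m days after time t.

```python
def daily_changes(closes):
    """Relative day-over-day change for each day after the first."""
    return [(closes[i] - closes[i - 1]) / closes[i - 1]
            for i in range(1, len(closes))]


def training_vector(closes, t, n, m):
    """Return (features, label) for reference index t.

    Features use only closes[t-n .. t]; the label compares closes[t+m]
    with closes[t]. The formulas are illustrative assumptions.
    """
    if t - n < 0 or t + m >= len(closes):
        raise ValueError("window falls outside the series")
    window = closes[t - n:t + 1]          # n + 1 prices give n daily changes
    changes = daily_changes(window)
    avg_change = sum(changes) / n         # stand-in for a volatility indicator
    # A flat day (zero change) is counted as -1 in this sketch.
    momentum = sum(1 if c > 0 else -1 for c in changes) / n
    label = 1 if closes[t + m] > closes[t] else -1
    return [avg_change, momentum], label
```

Here the features are computed only from data at or before t, while the label is computed from a later portion of the same record.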
 Madge does not specifically teach: 
(1)	The generation of the feature vector including “computing the first plurality of time-based features by aggregating a respective plurality of subsets of the attribute data over different respective periods of time”; and
(2) 	The feature that the training data provided to the machine learning model is the (entire) “training set” generated from all of the plurality of time-series data records. (Instead, Madge generally teaches training the SVM model using training data associated with a particular stock (a particular time-series data record) and an index related to the stock. Therefore, the training data used in Madge does not appear to include labeled vectors from other stocks.)
Kim, in an analogous art, teaches “computing the first plurality of time-based features by aggregating a respective plurality of subsets of the attribute data over different respective periods of time.” Kim relates to financial time series forecasting using support vector machines (see title), and a feature vector that includes a variety of features (see Table 1 on page 311). Therefore, Kim is analogous art for at least the reason of being in the same field of endeavor as the claimed invention. The Examiner also notes that Kim’s techniques are similar to those of Madge, as they both use support vector machines (SVMs) to perform time series forecasts relating to stock or financial market prices.
In particular, Kim teaches generating, by the computing device, a feature vector comprising a first plurality of time-based features [§ 2.1 describing attributes serving as input variables of a support vector machine], including “computing the first plurality of time-based features” [§ 3.1: “This study selects 12 technical indicators to make up the initial attributes, as determined by the review of domain experts and prior research [12]. The descriptions of initially selected attributes are presented in Table 1.” Note that “attribute” refers to an input feature, as described in § 2.1, paragraph 2.] “by aggregating a respective plurality of subsets of the attribute data over different respective periods of time” [Table 1 on page 311 lists the features “Disparity5” and “Disparity10”, which are computed from MA5 and MA10, respectively. As described in the caption for Table 1, MA5 and MA10 are the 5-day and 10-day moving averages of the price. The 5-day and 10-day moving averages are based on a 5-day period and a 10-day period, which correspond to different respective periods of time. With respect to the limitation of “aggregating,” a moving average is an aggregation of data.]
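For illustration only, the aggregation over different periods of time discussed above can be sketched as follows. This is a minimal sketch, not code from Kim. It assumes that a disparity feature expresses the current closing price as a percentage of the moving average over the stated window; the function names and the zero-based index convention are hypothetical.

```python
def moving_average(closes, t, window):
    """Mean of the `window` closing prices ending at index t (inclusive)."""
    if t - window + 1 < 0 or t >= len(closes):
        raise ValueError("window falls outside the series")
    return sum(closes[t - window + 1:t + 1]) / window


def disparity(closes, t, window):
    """Closing price at t as a percentage of its moving average (assumed form)."""
    return closes[t] / moving_average(closes, t, window) * 100.0


def multi_window_features(closes, t, windows=(5, 10)):
    """One aggregated feature per window length, e.g. 5-day and 10-day."""
    return [disparity(closes, t, w) for w in windows]
```

Each feature aggregates a different subset of the same attribute data: the 5-day feature uses the last five closing prices, and the 10-day feature uses the last ten.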
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Madge and Kim by modifying the first plurality of time-based features to include the aforementioned features of Kim (Disparity5 and Disparity10) and by modifying the generation of the feature vector to include “computing the first plurality of time-based features by aggregating a respective plurality of subsets of the attribute data over different respective periods of time.” The motivation for doing so would have been to include further technical indicators that are suitable as input variables for stock market prediction, as suggested by Kim (see § 3.1, as quoted above (describing technical indicators) and abstract: “The experimental results show that SVM provides a promising alternative to stock market prediction.”).     
Grzegorowski, in an analogous art, teaches the remaining limitation that the training data provided to the machine learning model is the (entire) “training set” generated from all of the plurality of time-series data records. Grzegorowski generally relates to window-based feature extraction (title) for preparing features that are “suitable for machine learning” (§ III, last paragraph). Therefore, Grzegorowski is analogous art for at least the reason of being in the same field of endeavor as the claimed invention.
In particular, Grzegorowski teaches providing “the training set” to a machine learning model to train the machine learning model [§ II.A, second paragraph, last sentence: “the training and test data sets consist of recordings from disjoint groups of firefighters.” That is, the training data is from multiple individuals in a population. This training data is used to train a single model (e.g., Model 1 or Model 2), as described in § III, that is used for prediction for a particular individual (e.g., the posture of an individual firefighter, as described in the caption for FIG. 1). In other examples, as shown in FIG. 6, training data from a plurality of individuals (mine 1, mine 2) is used to train a general model for prediction.] Grzegorowski teaches that data from multiple sources (see abstract: “multiple streams of readings generated by sensors”) can be beneficially used to train a model. In particular, § IV, paragraph 2, last sentence, teaches that “a variety of data from multiple systems enables performing a wide-ranging analysis.” See also § V, teaching that feature extraction was performed on “readings from multiple sensors” to obtain features “suitable for machine learning algorithms.”
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Madge and Kim with the teachings of Grzegorowski by modifying the method of Madge such that the entire “training set” is provided to a machine learning model to train the machine learning model, in order to utilize a variety of data (i.e., multiple time series data from different individuals of a population) to perform a wide-ranging analysis, as suggested by Grzegorowski (§ IV, paragraph 2, last sentence, as quoted above). 
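For illustration only, the proposed modification (providing a single training set built from all of the time-series data records) can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce code from any cited reference. The record layout (a dictionary mapping an individual identifier to a list of values), the single mean-based feature, and all function names are hypothetical.

```python
def vectors_for_record(series, n, m):
    """Build (features, label) pairs from one time-series data record."""
    vectors = []
    for t in range(n - 1, len(series) - m):
        past = series[t - n + 1:t + 1]
        features = [sum(past) / n]                     # mean over the feature window
        label = 1 if series[t + m] > series[t] else -1  # direction m steps ahead
        vectors.append((features, label))
    return vectors


def pooled_training_set(records, n, m):
    """Repeat the per-record step for every record and pool the results."""
    training_set = []
    for individual_id in sorted(records):
        training_set.extend(vectors_for_record(records[individual_id], n, m))
    return training_set
```

The pooled list, rather than the vectors of any single record, is what would be supplied to the model for training in this sketch.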

As to claim 3, the combination of Madge, Kim and Grzegorowski teaches the method of claim 1, wherein the respective plurality of subsets of the attribute data and the label time period are referenced relative to a reference time tref. [Madge, § 3.4: Feature periods are for the past “n” days (Table 1) and extend to time “t,” while the label is for “m” days in the future (i.e., t + m). Therefore, the periods are referenced with respect to a reference time, which in this case may be t or, alternatively, a time between t and t + m. This limitation is also taught in Kim, Table 1, which also uses reference time “t” in the same manner; thus, this limitation is taught by the combination of Madge and Kim set forth above.]

As to claim 4, the combination of Madge, Kim and Grzegorowski teaches the method of claim 3, wherein each feature time period occurs prior in time to the reference time tref, wherein the label time period occurs subsequent in time to the reference time tref. [Madge, § 3.4: Feature periods are for the past “n” days (Table 1) and extend to time t. A period that extends to t is considered to be a period prior in time to time t (alternatively, the reference time may be understood as a time between t and t + m, such as a transition from t to the next day). These periods are considered to be prior in time to the reference time. The period for the label is m days in the future (t + m), which is subsequent to the reference time.]

As to claim 5, the combination of Madge, Kim and Grzegorowski teaches the method of claim 1, wherein the plurality of feature time periods and the label time period are referenced relative to a reference time tref [Madge, § 3.4: Feature periods are for the past “n” days (Table 1) and extend to time “t,” while the label is for “m” days in the future (i.e., t + m). Therefore, the periods are referenced with respect to a reference time, which in this case may be t or, alternatively, a time between t and t + m.] that differs from one training vector to another. [Madge, § 3.4, teaches that a feature vector is generated for multiple trading days between 2007 and 2014. Therefore, the feature vectors for different trading days would have different reference times.]

As to claim 8, the combination of Madge, Kim and Grzegorowski teaches the method of claim 5, further comprising, by the computing device:
selecting an initial value of the reference time tref for a first training vector; [Madge, § 3.4, teaches that a feature vector is generated for multiple trading days between 2007 and 2014. The value of “t” (reference time) for the feature vector for the first trading day in the period of 2007 to 2014 corresponds to an initial value.] and
monotonically incrementing the reference time tref for each subsequent training vector. [The feature vectors for the subsequent trading days have monotonically incremented values for reference time “t” as compared to the “t” for the first trading day. As noted in the rejection of claim 5 above, feature periods are for the past “n” days (Table 1) and extend to time “t,” while the label is for “m” days in the future (i.e., t + m). Therefore, the periods for the subsequent trading days are referenced with respect to a reference time, which in this case is represented by “t.”]
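For illustration only, selecting an initial reference time and incrementing it for each subsequent training vector can be sketched as follows. This is a minimal sketch; the function name, the zero-based indexing, and the `step` parameter are assumptions made for the example and are not taken from Madge.

```python
def reference_times(series_length, n, m, step=1):
    """Monotonically increasing reference indices for one record.

    The first usable index is n - 1 (so that n past observations exist);
    the last is series_length - m - 1 (so that the label at t + m exists).
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    first = n - 1
    last = series_length - m - 1
    return list(range(first, last + 1, step))
```

With daily data and `step=1`, each successive trading day yields the next reference time, which corresponds to the reading of Madge applied above.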

As to claims 10 and 12-14, these claims are directed to “a non-transitory computer-readable storage medium having stored thereon computer executable instructions, which when executed by a processing unit, cause the processing unit to” perform operations that are the same or substantially the same as those of the method of claims 1 and 3-5. Therefore, the rejections made to claims 1 and 3-5 are applied to claims 10 and 12-14, respectively.
Additionally, Madge teaches the “non-transitory computer-readable storage medium” and “computer executable instructions” limitations of the instant claims, because Madge teaches performing its method by a computing device (see, e.g., § 1, paragraph 3, which refers to the “code written” for the disclosed method, and the abstract, which refers to machine learning). It is implicitly disclosed that a computing device used in Madge’s method has a memory storing the instructions.

As to claims 16 and 18, these claims are directed to “an apparatus comprising: one or more computer processors; and a computer-readable storage medium comprising instructions for controlling the one or more computer processors to be operable to” perform operations that are the same or substantially the same as those of the method of claims 1 and 5. Therefore, the rejections made to claims 1 and 5 are applied to claims 16 and 18, respectively.
Additionally, Madge teaches the “processor” and “computer-readable storage medium comprising instructions for controlling the one or more processors” limitations of the instant claims, because Madge teaches performing its method by a computing device (see, e.g., § 1, paragraph 3, which refers to the “code written” for the disclosed method, and the abstract, which refers to machine learning). It is implicitly disclosed that such a computing device has a processor and a storage medium for the code.

2.	Claims 6, 15 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Madge in view of Kim and Grzegorowski, and further in view of Lee et al. (US 5,757,964) (“Lee”).
As to claim 6, the combination of Madge, Kim and Grzegorowski teaches the method of claim 5, as set forth in the rejection above, but does not teach the method further comprising the additional limitations recited in claim 6. 
Lee, in the same field of endeavor, teaches “including, by the computing device, the reference time tref as a feature in the feature vector.” Lee generally pertains to time sequence analysis (see Col. 4, lines 22-30). In particular, Lee teaches “including, by the computing device, the reference time tref as a feature in the feature vector” [FIG. 4, time index 76 included in feature vector 70. Col. 5, lines 35-58.] Further, col. 5, lines 49-51 teach the benefit of the time index: “The feature vector's time index preferably indicates the particular time interval to which the feature vector's associated segment pertains.”
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Madge, Kim and Grzegorowski with the teachings of Lee by performing the further operation of including, by the computing device, the reference time tref as a feature in the feature vector, in order to obtain a feature vector that indicates the particular time interval to which the feature vector's associated segment pertains, as suggested by Lee. Furthermore, the instant limitation would have been obvious as a combination of prior art elements according to known methods to yield predictable results.
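For illustration only, including the reference time as a feature in the feature vector can be sketched as follows. This is a minimal sketch and not code from Lee; the single mean-based feature and the function names are hypothetical, and the reference time is represented as a zero-based index.

```python
def window_mean(series, t_ref, n):
    """Mean of the n observations ending at index t_ref (inclusive)."""
    return sum(series[t_ref - n + 1:t_ref + 1]) / n


def feature_vector_with_time_index(series, t_ref, n):
    """Feature vector whose last element is the reference time itself."""
    if t_ref - n + 1 < 0 or t_ref >= len(series):
        raise ValueError("window falls outside the series")
    return [window_mean(series, t_ref, n), float(t_ref)]
```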

As to claims 15 and 20, because the further limitations recited in these claims are substantially the same as those recited in claim 6, the rejection made to claim 6 is applied to claims 15 and 20.

3.	Claims 7 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Madge in view of Kim and Grzegorowski, and further in view of Baydogan et al. (M. G. Baydogan, G. Runger and E. Tuv, "A Bag-of-Features Framework to Classify Time Series," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2796-2802, Nov. 2013) (“Baydogan”).
As to claim 7, the combination of Madge, Kim and Grzegorowski teaches the method of claim 5, as set forth in the rejection above, but does not teach the method further comprising the additional limitations recited in claim 7.
Baydogan, in the same field of endeavor, teaches “for each training vector, randomly selecting, by the computing device, a value of the reference time tref.” Baydogan generally pertains to analysis of time series (abstract). In particular, Baydogan teaches “for each training vector, randomly selecting, by the computing device, a value of the reference time tref” [Abstract: “Multiple subsequences selected from random locations and of random lengths are partitioned into shorter intervals to capture the local information.” As illustrated in FIG. 1, the random subsequences have a randomly selected reference time.] Baydogan, Section 5 (Conclusion) teaches that random subsequences may be used to represent a time series, and to detect patterns represented by a series of measurements over shorter time segments. It is noted that the subsequences in Baydogan are analogous to the training vectors in Madge.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Madge, Kim and Grzegorowski with the teachings of Baydogan by performing the further operation of, for each training vector, randomly selecting, by the computing device, a value of the reference time tref, in order to obtain training data that represents a time series and/or to detect patterns represented by the time series, as suggested by Baydogan. Furthermore, the instant limitation would have been obvious as a combination of prior art elements according to known methods to yield predictable results.
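For illustration only, randomly selecting a value of the reference time for each training vector can be sketched as follows. This is a minimal sketch under stated assumptions: it randomizes only the location of the reference index and does not reproduce Baydogan's framework, which also draws subsequences of random lengths. The function name and the `seed` parameter are hypothetical.

```python
import random


def random_reference_times(series_length, n, m, count, seed=None):
    """Draw `count` reference indices uniformly from the valid range.

    A reference index t is valid when n past observations exist
    (t >= n - 1) and the label position t + m lies inside the series.
    """
    first = n - 1
    last = series_length - m - 1
    if first > last:
        raise ValueError("series too short for the requested windows")
    rng = random.Random(seed)   # local generator; seed allows repeatable draws
    return [rng.randint(first, last) for _ in range(count)]
```

Unlike sequential incrementing, the drawn indices may repeat and are not ordered.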

As to claim 19, because the further limitations recited in this claim are substantially the same as those recited in claim 7, the rejection made to claim 7 is applied to claim 19.

4.	Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Madge in view of Kim and Grzegorowski, and further in view of Shim et al. (US 2016/0078485 A1) (“Shim”).
As to claim 9, the combination of Madge, Kim and Grzegorowski teaches the method of claim 1, as set forth in the rejection above, but does not teach the method further comprising the additional limitations recited in claim 9.
Shim, in the same field of endeavor, teaches “randomly selecting, by the computing device, a sample of individuals from the population and creating the training set from the sampled individuals.” Shim relates to predictive models applicable to users in a population. In particular, Shim teaches “randomly selecting, by the computing device, a sample of individuals from the population and creating the training set from the sampled individuals” [[0063]: “a training data set is first selected from a random sample of users and their corresponding behaviors and features. The sampled data set is then used for training the two-level statistical model to predict a conversion rate for each cell”]. Shim generally teaches that a model can be suitably trained using a random sample of training data (see [0013] and [0026]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Madge, Kim and Grzegorowski with the teachings of Shim by modifying the combination of Madge, Kim and Grzegorowski to include the additional operation of “randomly selecting, by the computing device, a sample of individuals from the population and creating the training set from the sampled individuals,” in order to obtain a training set that is suitable to train the model. Furthermore, the instant limitation would have been obvious as a combination of prior art elements according to known methods to yield predictable results.
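For illustration only, creating the training set from a random sample of individuals can be sketched as follows. This is a minimal sketch and not code from Shim. The record layout (a dictionary keyed by individual identifier), the caller-supplied `build_vectors` function, and all names are assumptions made for the example.

```python
import random


def sample_individuals(population_ids, sample_size, seed=None):
    """Randomly choose `sample_size` distinct individuals from the population."""
    rng = random.Random(seed)
    return rng.sample(sorted(population_ids), sample_size)


def training_set_from_sample(records, sample_size, build_vectors, seed=None):
    """Build training vectors only for the randomly sampled individuals."""
    chosen = sample_individuals(records.keys(), sample_size, seed)
    training_set = []
    for individual_id in chosen:
        training_set.extend(build_vectors(records[individual_id]))
    return chosen, training_set
```

Sampling is done without replacement, so each selected individual contributes its record at most once.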

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Chen, “SVM application of financial time series forecasting using empirical technical indicators,” teaches features computed from different combinations of moving averages (see table II) corresponding to different subsets of data.
 Any inquiry concerning this communication or earlier communications from the examiner should be directed to YAO DAVID HUANG whose telephone number is (571)270-1764.  The examiner can normally be reached on Monday - Friday 8:30 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571) 270-7092.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Y.D.H./Examiner, Art Unit 2124




/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124