Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 02/02/2022 has been entered.

Status of Claims
This action is in reply to the amendments and remarks filed on 02/02/2022.
Claims 1-20 are pending.
Claims 1-4, 9, 13-14, and 18-19 have been amended.

Response to Arguments
Applicant’s arguments, with respect to the rejection(s) of claim(s) 1, 9, and 14 under 35 U.S.C. 103, have been considered but are not persuasive. More specifically, the applicant argues that no art of record teaches the amended claim language, which now states “at least one said observation indicating the event did not occur in a particular outcome window of the plurality of outcome windows and including a portion of the data included in a respective said feature window” [claim 9], and “a first observation indicating the event did not occur”, “a second observation indicating the event did occur”, “the second observation window subsequent to the first observation window and the second feature window subsequent to the first feature window in the dataset” [claims 1 and 14] since “if the event did not occur in the training window (e.g. any of the time bins), Dugger indicates globally that the event did not occur, at all. Thus, Dugger does not teach or suggesting generating observations for the time bins (within the training window), individually, for when events do not occur”, “for the negative scenario, Dugger records that the event did not occur in the training window but not the training bin”, and “by setting beyond training window in Dugger, this is set outside of any of the training bins, i.e., occurs beyond the training window”. The examiner respectfully disagrees with all of the presented arguments.
The claim merely recites “defining” two “outcome windows” and shifting between the windows if no event is detected within the first window, thereby finding an event once the window is newly shifted. As such, due to the breadth of the claim language, Dugger was found to teach all requirements of the claim language. Regarding claim 9, Dugger, ¶[0043-0049], teaches selecting training data within a “training window” including variable length “time bins” for developing a model, wherein “If the response data samples 126 for an entity indicate the non-occurrence of the event of interest in the training window (at least one said observation indicating the event did not occur in a particular outcome window), the model-development engine 116 can set I to any time value that occurs beyond the end of the training window” (of the plurality of outcome windows and including a portion of the data included in a respective said feature window). ¶[0081], teaches “Iterating block 306 can create a set of timing-prediction models that span the entire training windows” (plurality of outcome windows). Further, Dugger also teaches observing an event’s occurrence or non-occurrence over different length “time bins” and including the observed training data in training the model.
Regarding claims 1 and 14: Dugger, ¶[0043-0049], teaches selecting training data within a “training window” including variable length “time bins” for developing a model, wherein “If the response data samples 126 for an entity indicate the non-occurrence of the event of interest in the training window (at least one said observation indicating the event did not occur in a particular outcome window/ a point in time indicative of the first observation), the model-development engine 116 can set I to any time value that occurs beyond the end of the training window” (that is later than a point in time indicative of the first observation). As such, “the model-development engine 116 can count the number of time bins (months) until the first time the event occurs in the training window” (a second observation indicating the event did occur in a second said outcome window and including a portion of data from the subset included in a second said feature window, the second observation indicative of a point in time in the data). Further, Dugger also teaches observing an event’s occurrence or non-occurrence over different length “time bins” and including the observed training data in training the model.
See the 35 U.S.C. 103 section below for the full mapping of the claim limitations necessitated by applicant's amendments.
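For illustration only, the time-bin handling described in the cited passages of Dugger (¶[0043-0049]) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name bins_until_first_event, the one-boolean-per-bin input, and the choice of "one bin past the end of the window" as the censoring value are hypothetical and are not taken from Dugger, which states only that any time value beyond the end of the training window may be used.

```python
from typing import Sequence


def bins_until_first_event(event_flags: Sequence[bool]) -> int:
    """Count time bins until the event first occurs in the training window.

    event_flags holds one flag per time bin (e.g., per month), in order.
    The return value is the 1-based index of the first bin with the event.

    ASSUMPTION (hypothetical): when the event never occurs in the window,
    return len(event_flags) + 1, i.e., a value beyond the end of the
    training window, which acts as a right-censoring marker.
    """
    for bin_index, occurred in enumerate(event_flags, start=1):
        if occurred:
            return bin_index
    return len(event_flags) + 1
```

Under these assumptions, an entity whose event first occurs in the third of four monthly bins is assigned the value 3, while an entity with no event in any of the four bins is assigned the value 5, which lies beyond the training window.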

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Achin et al. (US 2018/0046926 A1), hereafter referred to as Achin, in view of Dugger et al. (WO 2019/217876 A1), hereafter referred to as Dugger.

Regarding claim 1, Achin teaches in a digital medium analytics environment, a method implemented by at least one computing device, the method comprising: generating, by the at least one computing device, training data from a dataset, the generating including:
defining … a second outcome window and a corresponding second feature window defined between a second initial point in time and a second observation time (Achin: ¶[0013], “the values of one or more output variables (“targets”) at one or more future times based on the values of one or more input variables (“features”) at one or more past times”. ¶[0017], “setting the time interval of the time-series data to the time interval of the data sets. In some embodiments, determining the time interval of the data set includes: for one or more pairs of successive observations included in the data set, determining a respective time period between the successive observations”. ¶[0025], “fitting the predictive model to the training data includes fitting the predictive model to a subset of the training data corresponding to a portion of the training-input time range, wherein the portion of the training-input time range starts at a time subsequent to a starting time of the training-input time range and ends at an ending time of the training-input time range”, here, features at one or more past times or starting time is representing as initial point in time, one or more future times or ending time is representing as end time for the observation, starting time to corresponding observation is representing as feature window, and observation time to end time when event occurs is representing as outcome window);

generating a second observation indicating the event did occur in the second outcome window and a portion of data from the dataset included in the second feature window (Achin: ¶[0017], “determining that the time periods between the successive pairs of observations are uniform; and setting the time interval of the data set to the time period between the successive pairs of observations”. ¶[0019], “identifying all observations in the data set associated with times corresponding to the respective instance of the time interval of the time-series data; aggregating the identified observations to generate an aggregate observation”. ¶[0028-0029], “each observation included in the time-series data is assigned to the partition corresponding to the portion of the time period that matches the time associated with the observation”, here, generate an aggregate observation is representing as generating an observation, and portion of the time period associated with the observation is representing as event occurred in the outcome window);
training, by the at least one computing device, a classification model based on … second observations in the training data (Achin: ¶[0014], “the amount of training data used to train the models, the time interval between observations of the input variables, the length of the time period covered by the training data, the recentness of the time period covered by the training data, the period of time (“skip range”) between the times associated with the feature values provided to the models and the times associated with the target values predicted by the models, and the period of time (“forecast range”) for which the models predict values of the targets”, here, input variables, feature values are representing as entities, and train the model is representing as train a classification model);
identifying, by the at least one computing device, a category of a plurality of categories, to which, a subsequent observation belongs based on the trained classification model (Achin: ¶[0149], “selecting a subset of the modeling procedures may comprise selecting the modeling procedures assigned to one or more suitability categories (e.g., all modeling procedures assigned to the “suitable category””. ¶[0466], “This capability may be useful in situations where certain values occur infrequently and corresponding observations should be carefully allocated to different partitions. This capability may be useful in situations where the user has trained a model using a different machine learning system”, here, selecting one or more suitability categories is representing as identifying a category that corresponds to the observation of the trained model); and
outputting, by the at least one computing device, a result of the identifying (Achin: ¶[0283], “server 550 may use communications module 556 to communicate the outputs of the predictive modeling module 552 to the client 510”, here, communicate the outputs to the client is representing as outputting the results of the identifying).
Although Achin, in ¶[0015] and ¶[0470-0471], describes generating training data that involves modeling the time until an event occurs, Achin does not distinctly disclose:
generating, by the at least one computing device, training data from a dataset based on survival analysis to predict occurrence of an event, defining a first outcome window and a corresponding first feature window defined between a first initial point in time and a first observation time, generating a first observation indicating the event did not occur in the first outcome window and a portion of data from the dataset included in the first feature window; and…the second observation window subsequent to the first observation window and the second feature window subsequent to the first feature window in the dataset
However, Dugger teaches:
generating, by the at least one computing device, training data from a dataset based on survival analysis to predict occurrence of an event (Dugger: ¶[0047-0048], “The model-development engine 116 can use one or more approaches for training or updating a given modeling algorithm. Examples of these approaches can include overlapping survival models, non-overlapping hazard models, and interval probability models … Survival analysis predicts the probability of when an event will occur” (to predict occurrence of an event)),
defining a first outcome window and a corresponding first feature window defined between a first initial point in time and a first observation time…generating a first observation indicating the event did not occur in the first outcome window and a portion of data from the dataset included in the first feature window (Dugger: ¶[0043-0049], teaches selecting training data within a “training window” including variable length “time bins” for developing a model (defining a first outcome window and a corresponding first feature window defined between a first initial point in time and a first observation time), wherein “If the response data samples 126 for an entity indicate the non-occurrence of the event of interest in the training window (a first observation indicating the event did not occur in a first said outcome window), the model-development engine 116 can set I to any time value that occurs beyond the end of the training window” (and including a portion of data from the subset included in a first said feature window). Further, this is also taught to observe an event’s occurrence or non-occurrence over different length “time bins” and including the observation training data in training the model.); and
generating a second observation indicating the event did occur in the second outcome window and a portion of data from the dataset included in the second feature window, the second observation window subsequent to the first observation window and the second feature window subsequent to the first feature window in the dataset; training, by the at least one computing device, a classification model based on the first and second observations in the training data (Dugger: ¶[0043-0049], teaches selecting training data within a “training window” including variable length “time bins” for developing a model, wherein “If the response data samples 126 for an entity indicate the non-occurrence of the event of interest in the training window (a point in time indicative of the first observation), the model-development engine 116 can set I to any time value that occurs beyond the end of the training window” (that is later than a point in time indicative of the first observation). As such, “the model-development engine 116 can count the number of time bins (months) until the first time the event occurs in the training window” (a second observation indicating the event did occur in a second said outcome window and including a portion of data from the subset included in a second said feature window, the second observation indicative of a point in time in the data). Further, this is also taught to observe an event’s occurrence or non-occurrence over different length “time bins” and including the observation training data in training the model.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method of generating the training data as taught by Achin with the training dataset based on survival analysis and observed event occurrence or non-occurrence as taught by Dugger in order to predict the occurrence of future events.
One would be motivated to do so to compute the probability of “surviving” up to an instant of time at which an event occurs. The survival analysis involves censoring, i.e., right-censoring, which means the event occurs beyond the training window; in this case, the right-censoring is equivalent to an entity remaining advantageous throughout the training window (Dugger: ¶[0048]).

Regarding claim 2, Achin in view of Dugger teaches the method as described in claim 1 and Achin further teaches:
wherein the feature window defined between a start or end point in the dataset and the observation time (Achin: ¶[0013], “the values of one or more output variables (“targets”) at one or more future times based on the values of one or more input variables (“features”) at one or more past times”. ¶[0017], “determining the time interval of the data set includes: for one or more pairs of successive observations included in the data set, determining a respective time period between the successive observations”. ¶[0025], “a subset of the training data corresponding to a portion of the training-input time range, wherein the portion of the training-input time range starts at a time subsequent to a starting time of the training-input time range and ends at an ending time of the training-input time range”, here, features at one or more past times or starting time is representing as initial point in time, and one or more future times or ending time is representing as end time for the observation).

Regarding claim 3, Achin in view of Dugger teaches the method as described in claim 1 and Dugger further teaches:
wherein the generating is performed for second feature window by shifting an observation time from the first feature window (Dugger: ¶[0043-0049], teaches selecting training data within a “training window” including variable length “time bins” for developing a model, wherein “If the response data samples 126 for an entity indicate the non-occurrence of the event of interest in the training window, the model-development engine 116 can set I to any time value that occurs beyond the end of the training window” (wherein the generating is performed for second feature window by shifting an observation time from the first feature window). As such, “the model-development engine 116 can count the number of time bins (months) until the first time the event occurs in the training window” (second feature window). Further, this is also taught to observe an event’s occurrence or non-occurrence over different length “time bins” and including the observation training data in training the model.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method of generating the training data as taught by Achin with the training dataset based on survival analysis and observed event occurrence or non-occurrence as taught by Dugger in order to predict the occurrence of future events.
One would be motivated to do so to compute the probability of “surviving” up to an instant of time at which an event occurs. The survival analysis involves censoring, i.e., right-censoring, which means the event occurs beyond the training window; in this case, the right-censoring is equivalent to an entity remaining advantageous throughout the training window (Dugger: ¶[0048]).

Regarding claim 4, Achin in view of Dugger teaches the method as described in claim 3 and Dugger further teaches:
wherein the shifting is based on a sliding interval that describes a defined amount of time (Dugger: ¶[0043-0049], teaches selecting training data within a “training window” including variable length “time bins” for developing a model, wherein “If the response data samples 126 for an entity indicate the non-occurrence of the event of interest in the training window, the model-development engine 116 can set I to any time value that occurs beyond the end of the training window” such as “months” (wherein the shifting is based on a sliding interval that describes a defined amount of time). As such, “the model-development engine 116 can count the number of time bins (months) until the first time the event occurs in the training window”. Further, this is also taught to observe an event’s occurrence or non-occurrence over different length “time bins” and including the observation training data in training the model.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method of generating the training data as taught by Achin with the training dataset based on survival analysis and observed event occurrence or non-occurrence as taught by Dugger in order to predict the occurrence of future events.
One would be motivated to do so to compute the probability of “surviving” up to an instant of time at which an event occurs. The survival analysis involves censoring, i.e., right-censoring, which means the event occurs beyond the training window; in this case, the right-censoring is equivalent to an entity remaining advantageous throughout the training window (Dugger: ¶[0048]).
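For illustration only, the claimed shifting of an observation time by a sliding interval, as the claim language is read in this action, can be sketched as follows. The sketch treats the span from the start of the data to the observation time as the feature window and a fixed-length span beginning at the observation time as the outcome window. All names (Observation, generate_observations, outcome_length, sliding_interval), the integer time units, and the choice to stop once the event has been observed are hypothetical assumptions; the sketch does not reproduce the implementation of Achin, Dugger, or the application.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Observation:
    observation_time: int
    features: List[Tuple[int, float]]  # records inside the feature window
    event_occurred: bool               # label for the outcome window


def generate_observations(
    records: Sequence[Tuple[int, float]],  # (timestamp, value) pairs
    event_times: Sequence[int],
    start: int,
    end: int,
    outcome_length: int,
    sliding_interval: int,
) -> List[Observation]:
    """Shift the observation time by sliding_interval and label each window."""
    observations: List[Observation] = []
    t = start + sliding_interval
    while t + outcome_length <= end:
        # Feature window: [start, t). Outcome window: [t, t + outcome_length).
        features = [(ts, v) for ts, v in records if start <= ts < t]
        occurred = any(t <= e < t + outcome_length for e in event_times)
        observations.append(Observation(t, features, occurred))
        if occurred:
            break  # ASSUMPTION: no further observations after the event
        t += sliding_interval
    return observations
```

Under these assumptions, with records at times 0, 2, 5, and 7, a single event at time 8, a data range of 0 to 12, and both the outcome length and the sliding interval set to 3, the sketch yields a first observation at time 3 indicating the event did not occur and a second, subsequent observation at time 6 indicating the event did occur.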

Regarding claim 5, Achin in view of Dugger teaches the method as described in claim 1 and Achin further teaches:
wherein the plurality of categories is based on the occurrence of the event (Achin: ¶[0470], “the engine 110 may simplify their representation by combining one or more categories into a single category. Based on the relative frequency of each observed category and the frequency with which they appear relative to the values of other features”, here, frequency of each observed category is representing the occurrence of the event).

Regarding claim 6, Achin in view of Dugger teaches the method as described in claim 5 and Achin further teaches:
wherein a first said category indicates the event has occurred and a second said category indicates the event has not occurred (Achin: ¶[0470], “Based on the relative frequency of each observed category and the frequency with which they appear relative to the values of other features, the engine 110 may calculate the optimal way to combine categories. Optionally, the user may override these calculations by removing original categories from a combined category and/or putting existing categories into a combined category”, here, combining categories is representing as combining multiple categories (first and second category) where first category includes original categories that have occurred, and second category removes the original category that have not occurred).

Regarding claim 7, Achin in view of Dugger teaches the method as described in claim 1 and Achin further teaches:
wherein the trained classification model is a statistical model (Achin: ¶[0132], “A modeling technique may provide a focal point for developers and analysts to conceptualize an entire predictive modeling procedure, with all the steps expected based on the best practices in the field. In some embodiments, modeling techniques encapsulate best practices from statistical learning disciplines”, here, modeling technique from statistical learning is representing as classification model is a statistical model).

Regarding claim 8, Achin in view of Dugger teaches the method as described in claim 1 and Achin further teaches:
wherein the training is performed using machine learning (Achin: ¶[0466], “This capability may be useful in situations where the user has trained a model using a different machine learning system and wants to perform a comparison where the training, validation, and holdout partitions are the same”).

Regarding claim 9, Achin teaches in a digital medium analytics environment, a system comprising: 
a processing system; and a computer-readable storage medium having instructions stored thereon that, responsive to execution by the processing system (Achin: ¶[0016, 0060-0061, and 0474-0478] teaches an “apparatus (system), including a memory (CRM) configured to store processor-executable instructions; and a processor configured to execute the processor-executable instructions, wherein executing the processor-executable instructions causes the apparatus to perform steps” listed in the embodiments of the disclosure), causes the processing system to perform operations including:
generating training data from a dataset, the generating including (Achin: ¶[0007], “Data analysts can use analytic techniques and computational infrastructures to build predictive models from electronic data, including operations and evaluation data”. ¶[0015], “generating training data from the time-series data, wherein the training data include a first subset of the observations of at least one of the data sets”, here, time-series data is representing as modeling time to event data that involves survival analysis. ¶[0017], “the time interval of the time-series data is determined based, at least in part, on the times associated with at least a subset of the observations included in at least one of the data sets”. ¶[0042], “obtaining time-series data including one or more data sets, wherein each data set includes a plurality of observations, wherein each observation includes … identifying one or more of the variables as targets, and identifying zero or more other variables as features”. ¶[0336], “the exploration engine 110 loads a dataset (e.g., at step 404 of the method 400 illustrated in FIG. 4), it may automatically detect whether the dataset appears to contain time series data”, here, exploration engine 110 loads a dataset at step 404 in Fig. 4 is representing as dataset location module, and identifying & obtaining data sets with variables as features is representing as locating data set to corresponding entities):
shifting an observation time within a dataset, the shifting defining a plurality of outcome windows and a corresponding plurality of feature windows (Achin: ¶[0028], “a sliding training window covering a first range of training times and each observation included in the first subset is associated with a time within the first range of training times, the third subset of observations corresponds to the sliding training window (shifting) covering a second range of training times and each observation included in the third subset is associated with a time within the second range of training times, and an earliest time in the first range of training times is earlier than an earliest time in the second range of training times” (a plurality of outcome windows and a corresponding plurality of feature windows), here, sliding window is performing the shifting of observation time and a time within the first range of training times is representing as the specified amount of time for the outcome window, and predictive module in Fig. 5 includes sliding or time shifting module); and
generating a plurality of observations, the plurality of observations based on the shifting (Achin: ¶[0017], “determining that the time periods between the successive pairs of observations are uniform; and setting the time interval of the data set to the time period between the successive pairs of observations”. ¶[0019], “identifying all observations in the data set associated with times corresponding to the respective instance of the time interval of the time-series data; aggregating the identified observations to generate an aggregate observation”. ¶[0029], “each observation included in the time-series data is assigned to the partition corresponding to the portion of the time period that matches the time associated with the observation”, here, generate an aggregate observation is representing as generate an observation, portion of the time period associated with the observation is representing as event occurred in the outcome window (event did…occur), and predictive module in Fig. 5 includes an observation module), 
generating a classification model based on the plurality of observations in the training data (Achin: ¶[0014], “the amount of training data used to train the models, the time interval between observations of the input variables, the length of the time period covered by the training data, the recentness of the time period covered by the training data, the period of time (“skip range”) between the times associated with the feature values provided to the models and the times associated with the target values predicted by the models, and the period of time (“forecast range”) for which the models predict values of the targets”. ¶[0466], “This capability may be useful in situations where the user has trained a model using a different machine learning system”, here, input variables, feature values are representing as entities, and train the model is representing as train a classification model).
Although Achin, in ¶[0015] and ¶[0470-0471], describes generating training data that involves modeling the time until an event occurs, Achin does not distinctly disclose:
generate training data from a dataset based on survival analysis to predict occurrence of an event, and at least one said observation indicating the event did not occur in a particular outcome window of the plurality of outcome windows and including a portion of the data included in a respective said feature window.
However, Dugger teaches:
generate training data from a dataset based on survival analysis to predict occurrence of an event (Dugger: ¶[0047-0048], “The model-development engine 116 can use one or more approaches for training or updating a given modeling algorithm. Examples of these approaches can include overlapping survival models, non-overlapping hazard models, and interval probability models … Survival analysis predicts the probability of when an event will occur” (to predict occurrence of an event)), and
at least one said observation indicating the event did not occur in a particular outcome window of the plurality of outcome windows and including a portion of the data included in a respective said feature window (Dugger: ¶[0043-0049], teaches selecting training data within a “training window” including variable length “time bins” for developing a model, wherein “If the response data samples 126 for an entity indicate the non-occurrence of the event of interest in the training window (at least one said observation indicating the event did not occur in a particular outcome window), the model-development engine 116 can set I to any time value that occurs beyond the end of the training window” (of the plurality of outcome windows and including a portion of the data included in a respective said feature window). ¶[0081], teaches “Iterating block 306 can create a set of timing-prediction models that span the entire training windows” (plurality of outcome windows). Further, this is also taught to observe an event’s occurrence or non-occurrence over different length “time bins” and including the observation training data in training the model.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method of generating the training data as taught by Achin with the training dataset based on survival analysis and observed event occurrence or non-occurrence as taught by Dugger in order to predict the occurrence of future events.
One would be motivated to do so to compute the probability of “surviving” up to an instant of time at which an event occurs. The survival analysis involves censoring, i.e., right-censoring, which means the event occurs beyond the training window; in this case, the right-censoring is equivalent to an entity remaining advantageous throughout the training window (Dugger: ¶[0048]).

Regarding claim 10, Achin in view of Dugger teaches the system as described in claim 9 and Achin further teaches:
wherein the classification model is configured to determine which of a plurality of categories corresponds to a subsequent observation (Achin: ¶[0149], “selecting a subset of the modeling procedures may comprise selecting the modeling procedures assigned to one or more suitability categories (e.g., all modeling procedures assigned to the “suitable category””. ¶[0466], “This capability may be useful in situations where certain values occur infrequently and corresponding observations should be carefully allocated to different partitions. This capability may be useful in situations where the user has trained a model using a different machine learning system”, here, the model represents the classification model, and selecting the model assigned to the suitable category for the corresponding observations represents determining which categories correspond to a subsequent observation).

Regarding claim 11, Achin in view of Dugger teaches the system as described in claim 10 and Achin further teaches:
wherein a first said category indicates the event has occurred and a second said category indicates the event has not occurred (Achin: ¶[0470], “Based on the relative frequency of each observed category and the frequency with which they appear relative to the values of other features, the engine 110 may calculate the optimal way to combine categories. Optionally, the user may override these calculations by removing original categories from a combined category and/or putting existing categories into a combined category”, here, combining categories represents combining multiple categories (a first and a second category), where the first category includes original categories that have occurred and the second category removes the original categories that have not occurred).

Regarding claim 12, Achin in view of Dugger teaches the system as described in claim 9 and Achin further teaches:
wherein the shifting of the observations times is based on a sliding interval that describes a defined amount of time used to shift the observation times (Achin: ¶[0028], “the first subset of observations corresponds to a sliding training window covering a first range of training times and each observation included in the first subset is associated with a time within the first range of training times, the third subset of observations corresponds to the sliding training window covering a second range of training times and each observation included in the third subset is associated with a time within the second range of training times, and an earliest time in the first range of training times is earlier than an earliest time in the second range of training times”, here, the sliding window performs the shifting of the observation time, and an earliest time in the first range being earlier than an earliest time in the second range represents generating the observation based on the shifted observation time).

Regarding claim 13, Achin in view of Dugger teaches the system as described in claim 12 and Achin further teaches:
wherein the shifting of the observations times by the time shifting module is based on a sliding interval that describes a defined amount of time used to shift at least one said observation times (Achin: ¶[0028], “the first subset of observations corresponds to a sliding training window covering a first range of training times and each observation included in the first subset is associated with a time within the first range of training times, the third subset of observations corresponds to the sliding training window covering a second range of training times and each observation included in the third subset is associated with a time within the second range of training times, and an earliest time in the first range of training times is earlier than an earliest time in the second range of training times”, here, a time within the second range of training times represents the specified amount of time for the outcome window).

Regarding claim 14, Achin teaches in a digital medium analytics environment, a method implemented by at least one computing device, the method comprising: receiving, by the at least one computing device, data describing an observation (Achin: ¶[0016, 0060-0061, and 0474-0478] teaches a “computer system, apparatus”…etc. “including a memory configured to store processor-executable instructions; and a processor configured to execute the processor-executable instructions, wherein executing the processor-executable instructions causes the apparatus to perform steps” listed in the embodiments of the disclosure; including as taught in ¶[0357], “The time-series data may be obtained from any suitable source using any suitable technique (measured using sensors, received via a communication network, loaded from a computer-readable medium, etc.). The time-series data may include one or more data sets, each of which may include one or more observations”); and
classifying, by the at least one computing device, the observation into a respective category of a plurality of categories using a classification model, the classification model trained using training data generated from a dataset by analyzing an expected duration of time until an event occurs, the generation of the training data including:
locating subsets of data from the dataset as corresponding to respective entities of a plurality of entities (Achin: ¶[0016, 0060-0061, and 0474-0478] teaches a “computer system”, that as taught in ¶[0149 and 0245], “selecting [at least] a subset of the modeling procedures may comprise selecting the modeling procedures assigned to one or more suitability categories (e.g., all modeling procedures assigned to the “suitable category””. ¶[0466], “This capability may be useful in situations where certain values occur infrequently and corresponding observations should be carefully allocated to different partitions. This capability may be useful in situations where the user has trained a model using a different machine learning system”. ¶[0015], “generating training data from the time-series data, wherein the training data include a first subset of the observations of at least one of the data sets”. ¶[0017], “the time interval of the time-series data is determined based, at least in part, on the times associated with at least a subset of the observations included in at least one of the data sets”. ¶[0042], “obtaining time-series data including one or more data sets, wherein each data set includes a plurality of observations, wherein each observation includes … identifying one or more of the variables as targets, and identifying zero or more other variables as features”, here, selecting one or more suitability categories is represented as identifying a category that corresponds to the observation of the trained model, time-series data is represented as modeling time to event occurs, and identifying & obtaining data sets with variables as features is represented as locating data set to corresponding entities);
setting observation times with respect to the subsets, the observation times defining respective outcome windows and respective feature windows (Achin: ¶[0013], “the values of one or more output variables (“targets”) at one or more future times based on the values of one or more input variables (“features”) at one or more past times”. ¶[0017], “setting the time interval of the time-series data to the time interval of the data sets. In some embodiments, determining the time interval of the data set includes: for one or more pairs of successive observations included in the data set, determining a respective time period between the successive observations”. ¶[0025-0029 and 0346], “the training data corresponding to a portion of the training-input time range, wherein the portion of the training-input time range starts at a time subsequent to a starting time of the training-input time range and ends at an ending time of the training-input time range” and collecting “observations…of training times” within the training data and “windows”, here, starting time to corresponding observation is represented as feature windows, and observation times to end time when event occurs is represented as outcome windows); and
generating a plurality of observations based on the observation times, the plurality of observations including: 
a second observation indicating the event did occur in a second said outcome window and including a portion of data from the subset included in a second said feature window (Achin: ¶[0017], “determining that the time periods between the successive pairs of observations are uniform; and setting the time interval of the data set to the time period between the successive pairs of observations”. ¶[0019], “identifying all observations in the data set associated with times corresponding to the respective instance of the time interval of the time-series data; aggregating the identified observations to generate an aggregate observation”. ¶[0028-0029], “each observation included in the time-series data is assigned to the partition corresponding to the portion of the time period that matches the time associated with the observation”, here, generating an aggregate observation represents generating an observation, and the portion of the time period associated with the observation represents the event occurring in the outcome window).
Although Achin, in ¶[0015] and ¶[0470-0471], describes generating training data that involves modeling the time until an event occurs, Achin does not distinctly disclose:
training data generated from a dataset based on survival analysis to predict occurrence of an event, a first observation indicating the event did not occur in a first said outcome window and including a portion of data from the subset included in a first said feature window; and…the second observation indicative of a point in time in the data that is later than a point in time indicative of the first observation.
However, Dugger teaches:
training data generated from a dataset based on survival analysis to predict occurrence of an event (Dugger: ¶[0047-0048], “The model-development engine 116 can use one or more approaches for training or updating a given modeling algorithm. Examples of these approaches can include overlapping survival models, non-overlapping hazard models, and interval probability models … Survival analysis predicts the probability of when an event will occur” (to predict occurrence of an event)), and
a first observation indicating the event did not occur in a first said outcome window and including a portion of data from the subset included in a first said feature window (Dugger: ¶[0043-0049], teaches selecting training data within a “training window” including variable length “time bins” for developing a model, wherein “If the response data samples 126 for an entity indicate the non-occurrence of the event of interest in the training window (a first observation indicating the event did not occur in a first said outcome window), the model-development engine 116 can set I to any time value that occurs beyond the end of the training window” (and including a portion of data from the subset included in a first said feature window). Further, Dugger also teaches observing an event’s occurrence or non-occurrence over different length “time bins” and including the observation training data in training the model.); and
a second observation indicating the event did occur in a second said outcome window and including a portion of data from the subset included in a second said feature window, the second observation indicative of a point in time in the data that is later than a point in time indicative of the first observation (Dugger: ¶[0043-0049], teaches selecting training data within a “training window” including variable length “time bins” for developing a model, wherein “If the response data samples 126 for an entity indicate the non-occurrence of the event of interest in the training window (a point in time indicative of the first observation), the model-development engine 116 can set I to any time value that occurs beyond the end of the training window” (that is later than a point in time indicative of the first observation). As such, “the model-development engine 116 can count the number of time bins (months) until the first time the event occurs in the training window” (a second observation indicating the event did occur in a second said outcome window and including a portion of data from the subset included in a second said feature window, the second observation indicative of a point in time in the data). Further, Dugger also teaches observing an event’s occurrence or non-occurrence over different length “time bins” and including the observation training data in training the model.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method of generating the training data as taught by Achin with the training dataset based on survival analysis and observed event occurrence or non-occurrence as taught by Dugger in order to predict the occurrence of future events.
One would be motivated to do so to compute the probability of “surviving” up to an instant of time at which an event occurs. Survival analysis involves censoring, i.e., right-censoring, which means the event occurs beyond the training window; in this case, right-censoring is equivalent to an entity remaining advantageous throughout the training window (Dugger: ¶[0048]).

Regarding claim 15, Achin in view of Dugger teaches the method as described in claim 14 and Achin further teaches:
wherein the classification model is configured to determine which of a plurality of categories corresponds to a subsequent observation (Achin: ¶[0149], “selecting a subset of the modeling procedures may comprise selecting the modeling procedures assigned to one or more suitability categories (e.g., all modeling procedures assigned to the “suitable category””. ¶[0466], “This capability may be useful in situations where certain values occur infrequently and corresponding observations should be carefully allocated to different partitions. This capability may be useful in situations where the user has trained a model using a different machine learning system”, here, the model represents the classification model, and selecting the model assigned to the suitable category for the corresponding observations represents determining which categories correspond to a subsequent observation).

Regarding claim 16, Achin in view of Dugger teaches the method as described in claim 15 and Achin further teaches:
wherein a first said category indicates the event has occurred and a second said category indicates the event has not occurred (Achin: ¶[0470], “Based on the relative frequency of each observed category and the frequency with which they appear relative to the values of other features, the engine 110 may calculate the optimal way to combine categories. Optionally, the user may override these calculations by removing original categories from a combined category and/or putting existing categories into a combined category”, here, combining categories represents combining multiple categories (a first and a second category), where the first category includes original categories that have occurred and the second category removes the original categories that have not occurred).

Regarding claim 17, Achin in view of Dugger teaches the method as described in claim 14 and Achin further teaches:
wherein the generating is performed for the plurality of said observations by shifting the observation time and repeating the generating of the observation based on the shifted observation time (Achin: ¶[0028], “the first subset of observations corresponds to a sliding training window covering a first range of training times and each observation included in the first subset is associated with a time within the first range of training times, the third subset of observations corresponds to the sliding training window covering a second range of training times and each observation included in the third subset is associated with a time within the second range of training times, and an earliest time in the first range of training times is earlier than an earliest time in the second range of training times”, here, the sliding window performs the shifting of the observation time, and an earliest time in the first range being earlier than an earliest time in the second range represents generating the observation based on the shifted observation time).

Regarding claim 18, Achin in view of Dugger teaches the method as described in claim 17 and Dugger further teaches:
wherein the first observation indicates a particular observation window of a plurality of said observation windows in the data, during which, the event did not occur (Dugger: ¶[0043-0049], teaches selecting training data within a “training window” including variable length “time bins” for developing a model, wherein “If the response data samples 126 for an entity indicate the non-occurrence of the event of interest in the training window (first observation indicates a particular observation window of a plurality of said observation windows in the data, during which, the event did not occur), the model-development engine 116 can set I to any time value that occurs beyond the end of the training window”. Further, Dugger also teaches observing an event’s occurrence or non-occurrence over different length “time bins” and including the observation training data in training the model.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method of generating the training data as taught by Achin with the training dataset based on survival analysis and observed event occurrence or non-occurrence as taught by Dugger in order to predict the occurrence of future events.
One would be motivated to do so to compute the probability of “surviving” up to an instant of time at which an event occurs. Survival analysis involves censoring, i.e., right-censoring, which means the event occurs beyond the training window; in this case, right-censoring is equivalent to an entity remaining advantageous throughout the training window (Dugger: ¶[0048]).

Regarding claim 19, Achin in view of Dugger teaches the method as described in claim 18 and Dugger further teaches:
wherein the shifting is performed for a defined amount of time corresponds to an amount of time specified for the first said outcome window or the second outcome window (Dugger: ¶[0043-0049], teaches selecting training data within a “training window” including variable length “time bins” for developing a model, wherein “If the response data samples 126 for an entity indicate the non-occurrence of the event of interest in the training window, the model-development engine 116 can set I to any time value that occurs beyond the end of the training window” such as “months” (wherein the shifting is performed for a defined amount of time corresponds to an amount of time specified for the first said outcome window or the second outcome window). As such, “the model-development engine 116 can count the number of time bins (months) until the first time the event occurs in the training window”. Further, Dugger also teaches observing an event’s occurrence or non-occurrence over different length “time bins” and including the observation training data in training the model.).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to combine the method of generating the training data as taught by Achin with the training dataset based on survival analysis and observed event occurrence or non-occurrence as taught by Dugger in order to predict the occurrence of future events.
One would be motivated to do so to compute the probability of “surviving” up to an instant of time at which an event occurs. Survival analysis involves censoring, i.e., right-censoring, which means the event occurs beyond the training window; in this case, right-censoring is equivalent to an entity remaining advantageous throughout the training window (Dugger: ¶[0048]).

Regarding claim 20, Achin in view of Dugger teaches the method as described in claim 14 and Achin further teaches:
wherein the event involves operation of a device (Achin: ¶[0016], “A system of one or more computers can be configured to perform particular actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes or cause the system to perform the actions”, here, the particular action represents the event).

Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Pyle et al (US Pub 20120185414) teaches using a statistical forecast model for predicting based on observed events.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CLINT MULLINAX whose telephone number is 571-272-3241.  The examiner can normally be reached on Mon - Fri 8:00-4:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached at 571-270-3428.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/C.M./Examiner, Art Unit 2123


/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123