Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
Status of the Application
The following is a non-final Office Action.
In response to the Examiner’s communication of 1/13/2022, Applicant filed a Request for Continued Examination on 3/14/2022, amending claims 1, 13, and 17 and adding claim 21.


Claims 1-8 and 10-21 are now pending in this application and have been examined. 


Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 3/14/2022 has been entered. 

Response to Amendment

Applicant's amendments to claims 1, 13, and 17 are not sufficient to overcome the 35 USC 101 rejections set forth in the previous action.

Applicant's amendments to claims 1, 13, and 17 are not sufficient to overcome the prior art rejections set forth in the previous action.



Response to Arguments – 35 USC § 101
Applicant’s arguments with respect to the rejections have been fully considered, but they are not persuasive. 

Applicant submits, “...the Present Application is directed to the practical application of using and updating multiple machine learning prediction models and determining and using a best model...describes a technical benefit of the claimed solution, in that "[t]he compressed sequences 314 can be generated and used (instead of uncompressed data) for performance reasons...The various machine learning approaches can provide other technical advantages...the claims describe a technical solution that solves a problem not solved by other approaches. For example, "[f]or other time series based prediction systems, traditional time series based estimates can be made when sufficient transaction data exists. When data is sparse, the models and approaches described [in this Application] can be used." Id. at [0019]. The claimed solution provides other technical benefits. For example, with the claimed solution, "resource use, including use of order tracking and other computing devices or systems, can be reduced....” Examiner respectfully disagrees.

Examiner notes that, while Applicant’s amendments and new additional elements further prosecution, they do not recite sufficient specificity to integrate the identified abstract ideas into a technical solution or a practical application.

For example, under the broadest reasonable interpretation, the computer elements and machine learning models are recited at a high level of generality, and the outcomes of the machine learning models can be compared by a human using the human mind. As such, these elements do not integrate the abstract ideas into a practical application or a technical solution.

Examiner invites Applicant to schedule an interview with the Examiner at the Applicant’s convenience to discuss clarifying amendments to expedite the prosecution of the present application.



Response to Arguments – Prior Art
Applicant’s arguments with respect to the rejections have been fully considered, but they are not persuasive. In any event, Applicant’s arguments are moot in light of the new grounds of rejection necessitated by Applicant’s amendments.

Examiner invites Applicant to schedule an interview with the Examiner at the Applicant’s convenience to discuss clarifying amendments to expedite the prosecution of the present application.

Claim Rejections – 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-8 and 10-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Claim 1 (and similarly claims 13 and 17) recites:
“A ... method comprising: 
receiving a request to predict transaction quantities for a plurality of transaction entities for a future time period; 
identifying historical transaction data for the transaction entities for a plurality of categories of transacted items, wherein the plurality of categories are organized using a hierarchy of levels; 
dividing the future time period into a set of multiple time units; 
for each respective time unit of the multiple time units:
iterating over multiple levels of the hierarchy starting at a lowest level, wherein the iterating includes, for each current level in the iteration: 
identifying features to include in a quantity forecasting model for the current level and the time unit; 
training the quantity forecasting model for the current level and the time unit including building the identified features into the quantity forecasting model for the current level and the time unit; 
training multiple ... transaction date prediction models using training data retrieved from the historical transaction data for the current level; 
identifying, by each respective ... transaction date prediction model, predicted transaction dates in the time unit predicted for the current level; 
comparing each of the multiple ... transaction date prediction models based on prediction accuracy data generated from testing data retrieved from the historical transaction data for the current level; 
selecting a best ... transaction date prediction model for the current level based on the comparison of all of the multiple ... transaction date prediction models, wherein a first best ... transaction date prediction model selected for a first level is a different type of ... model than a second best ... transaction date prediction model selected for a different second level; 
training the best ... transaction date prediction model based on both the testing data for the current level and the training data for the current level; 
computing values for the features using predicted transaction dates in the time unit generated by the best ... transaction date prediction model and the historical transaction data; 
using the quantity forecasting model to generate predicted quantity information for the current level for the predicted transaction dates in the time unit; and 
using the predicted quantity information for the current level for the predicted transaction dates in the time unit to identify updated features to use in an updated quantity forecasting model for a next time unit of the multiple time units; 
aggregating predicted quantity information for multiple levels into aggregated quantity prediction information; and 
providing the aggregated quantity prediction information in response to the request.”
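
For illustration only, and not as a characterization of Applicant's disclosed implementation, the per-level model comparison, selection, retraining, and aggregation steps recited above can be sketched in Python as follows. Every name here (MeanGapModel, LastGapModel, select_best_model, forecast), the toy inter-transaction-gap models, the hold-out split, and the single mean-quantity feature are hypothetical assumptions made for the sketch. The step of updating features between time units is omitted for brevity.

```python
from statistics import mean


class MeanGapModel:
    """Toy date model: next gap (in days) is the mean of the past gaps."""

    def fit(self, gaps):
        self.gap = mean(gaps)
        return self


class LastGapModel:
    """Toy date model: next gap (in days) repeats the most recent gap."""

    def fit(self, gaps):
        self.gap = gaps[-1]
        return self


def select_best_model(train_gaps, test_gaps, candidates=(MeanGapModel, LastGapModel)):
    """Train every candidate, compare on held-out gaps, retrain the winner on all data."""

    def test_error(model_cls):
        predicted = model_cls().fit(train_gaps).gap
        return mean(abs(predicted - actual) for actual in test_gaps)

    best_cls = min(candidates, key=test_error)  # ties go to the earlier candidate
    return best_cls().fit(train_gaps + test_gaps)


def forecast(history, time_units):
    """Return (per-level totals, aggregated total) of predicted quantities.

    history: {level: {"gaps": [...], "quantities": [...], "last_day": int}},
             ordered from the lowest hierarchy level upward; each level is
             assumed to have at least two positive gaps.
    time_units: list of half-open (start_day, end_day) ranges.
    """
    per_level = {level: 0 for level in history}
    for start, end in time_units:
        for level, data in history.items():
            gaps = data["gaps"]
            split = len(gaps) - max(1, len(gaps) // 4)  # hold out roughly the last quarter
            model = select_best_model(gaps[:split], gaps[split:])

            # Predicted transaction dates that fall inside this time unit.
            dates = []
            day = data["last_day"] + model.gap
            while day < end:
                if day >= start:
                    dates.append(day)
                day += model.gap

            # Single feature of the toy quantity model: mean historical quantity.
            per_level[level] += len(dates) * mean(data["quantities"])
    return per_level, sum(per_level.values())
```

In this sketch, different hierarchy levels can end up with different model types, because the comparison is run separately on each level's own held-out data.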


Analyzing under Step 2A, Prong 1:
The limitations of claim 1 quoted above, from “receiving a request to predict transaction quantities for a plurality of transaction entities for a future time period” through “providing the aggregated quantity prediction information in response to the request,” under the broadest reasonable interpretation, may be interpreted to include a human, using their mind and pen and paper, performing each of the recited receiving, identifying, dividing, iterating, training, comparing, selecting, computing, using, aggregating, and providing steps; therefore, the claims are directed to a mental process.

Further, the limitations of claim 1 quoted above, from “receiving a request to predict transaction quantities for a plurality of transaction entities for a future time period” through “providing the aggregated quantity prediction information in response to the request,” under the broadest reasonable interpretation, may be interpreted as managing the historical transactions of human retailer transaction entities, human customer transaction entities, and human salesperson transaction entities in order to predict future human transactions; therefore, these limitations recite managing personal behavior or relationships or interactions between people. Moreover, “identifying historical transaction data for the transaction entities for a plurality of categories of transacted items, wherein the plurality of categories are organized using a hierarchy of levels,” “aggregating predicted quantity information for multiple levels into aggregated quantity prediction information,” and “providing the aggregated quantity prediction information in response to the request,” under the broadest reasonable interpretation, are fundamental economic practices and commercial or legal interactions. Thus, the claims are directed to certain methods of organizing human activity.

Additionally, the limitations of claim 1 quoted above, from “dividing the future time period into a set of multiple time units” through “aggregating predicted quantity information for multiple levels into aggregated quantity prediction information,” are directed to mathematical concepts.

Accordingly, the claims are directed to a mental process, certain methods of organizing human activity, and mathematical concepts; thus, the claims are directed to an abstract idea under the first prong of Step 2A.

Analyzing under Step 2A, Prong 2:
This judicial exception is not integrated into a practical application under the second prong of Step 2A. 
In particular, the claims recite the additional elements beyond the recited abstract idea identified under Step 2A, Prong 1, such as:

Claims 1, 13, and 17: “computer-implemented”; “training multiple machine learning transaction date prediction models using training data”; “machine learning”; “training the best machine learning transaction date prediction model”; “A system comprising: one or more computers; and a computer-readable medium coupled to the one or more computers having instructions stored thereon which, when executed by the one or more computers, cause the one or more computers to perform operations”; and “A computer program product encoded on a non-transitory storage medium, the product comprising non-transitory, computer readable instructions for causing one or more processors to perform operations.”
Claim 21: “compressed” and “decompressing.”

Pursuant to the broadest reasonable interpretation, and considered as an ordered combination, each of these additional elements is a computing element recited at a high level of generality that implements the abstract idea, and thus the additional elements amount to no more than applying the abstract idea with generic computer components. Further, these additional elements generally link the abstract idea to a technical environment, namely the environment of a computer.

Additionally, the limitations “receiving a request…,” “identifying historical transaction data...,” and “providing the aggregated quantity prediction information...” do not add meaningful limitations that integrate the abstract idea into a practical application because they are extra-solution activity: receiving a request and identifying historical transaction data are pre-solution data gathering, and providing the aggregated quantity prediction information is post-solution data output.


Analyzing under Step 2B:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under Step 2B. 
As noted above, the aforementioned additional elements beyond the recited abstract idea are not sufficient to amount to significantly more than the recited abstract idea because, as an ordered combination, the additional elements are no more than mere instructions to implement the idea using generic computer components (i.e., “apply it”).
Additionally, as an ordered combination, the additional elements append the recited abstract idea to well-understood, routine, and conventional activities in the field, as evinced by Applicant’s own disclosure, as required by the Berkheimer Memo, in at least:
[0021] FIG. 1 is a block diagram illustrating an example system 100 for proactively predicting demand based on sparse transaction data. Specifically, the illustrated system 100 includes or is communicably coupled with a server 102, a client device 104, and a network 106. Although shown separately, in some implementations, functionality of two or more systems or servers may be provided by a single system or server. In some implementations, the functionality of one illustrated system, server, or component may be provided by multiple systems, servers, or components, respectively. 
[0030] As used in the present disclosure, the term "computer" is intended to encompass any suitable processing device. For example, although FIG. 1 illustrates a single server 102, and a single client device 104, the system 100 can be implemented using a single, stand-alone computing device, two or more servers 102, or two or more client devices 104. Indeed, the server 102 and the client device 104 may be any computer or processing device such as, for example, a blade server, general-purpose personal computer (PC), Mac®, workstation, UNIX-based workstation, or any other suitable device. In other words, the present disclosure contemplates computers other than general purpose computers, as well as computers without conventional operating systems. Further, the server 102 and the client device 104 may be adapted to execute any operating system, including Linux, UNIX, Windows, Mac OS®, Java™, Android™, iOS or any other suitable operating system. According to one implementation, the server 102 may also include or be communicably coupled with an e-mail server, a Web server, a caching server, a streaming data server, and/or other suitable server.
[0031] Interfaces 150 and 152 are used by the client device 104 and the server 102, respectively, for communicating with other systems in a distributed environment - including within the system 100 - connected to the network 106. Generally, the interfaces 150 and 152 each comprise logic encoded in software and/or hardware in a suitable combination and operable to communicate with the network 106. More specifically, the interfaces 150 and 152 may each comprise software supporting one or more communication protocols associated with communications such that the network 106 or interface's hardware is operable to communicate physical signals within and outside of the illustrated system 100. 
[0032] The server 102 includes one or more processors 154. Each processor 154 may be a central processing unit (CPU), a blade, an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another suitable component. Generally, each processor 154 executes instructions and manipulates data to perform the operations of the server 102. Specifically, each processor 154 executes the functionality required to receive and respond to requests from the client device 104, for example. 
[0033] Regardless of the particular implementation, "software" may include computer-readable instructions, firmware, wired and/or programmed hardware, or any combination thereof on a tangible medium (transitory or non-transitory, as appropriate) operable when executed to perform at least the processes and operations described herein. Indeed, each software component may be fully or partially written or described in any appropriate computer language including C, C++, Java™, JavaScript®, Visual Basic, assembler, Perl®, any suitable version of 4GL, as well as others. While portions of the software illustrated in FIG. 1 are shown as individual modules that implement the various features and functionality through various objects, methods, or other processes, the software may instead include a number of sub-modules, third-party services, components, libraries, and such, as appropriate. Conversely, the features and functionality of various components can be combined into single components as appropriate.
[0034] The server 102 includes memory 156. In some implementations, the server 102 includes multiple memories. The memory 156 may include any type of memory or database module and may take the form of volatile and/or non-volatile memory including, without limitation, magnetic media, optical media, random access memory (RAM), read-only memory (ROM), removable media, or any other suitable local or remote memory component. The memory 156 may store various objects or data, including caches, classes, frameworks, applications, backup data, business objects, jobs, web pages, web page templates, database tables, database queries, repositories storing business and/or dynamic information, and any other appropriate information including any parameters, variables, algorithms, instructions, rules, constraints, or references thereto associated with the purposes of the server 102. 
[0035] The client device 104 may generally be any computing device operable to connect to or communicate with the server 102 via the network 106 using a wireline or wireless connection. In general, the client device 104 comprises an electronic computer device operable to receive, transmit, process, and store any appropriate data associated with the system 100 of FIG. 1. The client device 104 can include one or more client applications, including the prediction application 108. A client application is any type of application that allows the client device 104 to request and view content on the client device 104. In some implementations, a client application can use parameters, metadata, and other information received at launch to access a particular set of data from the server 102. In some instances, a client application may be an agent or client-side version of the one or more enterprise applications running on an enterprise server (not shown). 
[0036] The client device 104 further includes one or more processors 158. Each processor 158 included in the client device 104 may be a central processing unit (CPU), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another suitable component. Generally, each processor 158 included in the client device 104 executes instructions and manipulates data to perform the operations of the client device 104. Specifically, each processor 158 included in the client device 104 executes the functionality required to send requests to the server 102 and to receive and process responses from the server 102. 
[0037] The client device 104 is generally intended to encompass any client computing device such as a laptop/notebook computer, wireless data port, smart phone, personal data assistant (PDA), tablet computing device, one or more processors within these devices, or any other suitable processing device. For example, the client device 104 may comprise a computer that includes an input device, such as a keypad, touch screen, or other device that can accept user information, and an output device that conveys information associated with the operation of the server 102, or the client device 104 itself, including digital data, visual information, or a GUI 160. 
[0040] There may be any number of client devices 104 associated with, or external to, the system 100. For example, while the illustrated system 100 includes one client device 104, alternative implementations of the system 100 may include multiple client devices 104 communicably coupled to the server 102 and/or the network 106, or any other number suitable to the purposes of the system 100. Additionally, there may also be one or more additional client devices 104 external to the illustrated portion of system 100 that are capable of interacting with the system 100 via the network 106. Further, the term "client", "client device" and "user" may be used interchangeably as appropriate without departing from the scope of this disclosure. Moreover, while the client device 104 is described in terms of being used by a single user, this disclosure contemplates that many users may use one computer, or that one user may use multiple computers. 
[0146] The preceding figures and accompanying description illustrate example processes and computer-implementable techniques. But system 100 (or its software or other components) contemplates using, implementing, or executing any suitable technique for performing these and other tasks. It will be understood that these processes are for illustration purposes only and that the described or similar techniques may be performed at any appropriate time, including concurrently, individually, or in combination. In addition, many of the operations in these processes may take place simultaneously, concurrently, and/or in different orders than as shown. Moreover, system 100 may use processes with additional operations, fewer operations, and/or different operations, so long as the methods remain appropriate.
[0147] In other words, although this disclosure has been described in terms of certain embodiments and generally associated methods, alterations and permutations of these embodiments and methods will be apparent to those skilled in the art. Accordingly, the above description of example embodiments does not define or constrain this disclosure.  Other changes, substitutions, and alterations are also possible without departing from the spirit and scope of this disclosure. 

Furthermore, as an ordered combination, these elements amount to generic computer components receiving or transmitting data over a network, performing repetitive calculations, electronic record keeping, and storing and retrieving information in memory, which, as held by the courts, are well-understood, routine, and conventional. See MPEP 2106.05(d).

Moreover, the remaining elements of dependent claims do not transform the recited abstract idea into a patent eligible invention because these remaining elements merely recite further abstract limitations that provide nothing more than simply a narrowing of the abstract idea recited in the independent claims. 

Looking at these limitations as an ordered combination adds nothing additional that is sufficient to amount to significantly more than the recited abstract idea because they simply provide instructions to use a generic arrangement of generic computer components to “apply” the recited abstract idea, perform insignificant extra-solution activity, and generally link the abstract idea to a technical environment. Thus, the elements of the claims, considered both individually and as an ordered combination, are not sufficient to ensure that the claim as a whole amounts to significantly more than the abstract idea itself. Since there are no limitations in these claims that transform the exception into a patent eligible application such that these claims amount to significantly more than the exception itself, claims 1-8, 10-21 are rejected under 35 U.S.C. 101 as being directed to non-statutory subject matter.

Claim Rejections – 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-8 and 10-20 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent Publication US20200184494A1 to Joseph et al. (hereinafter referred to as “Joseph”) in view of US Patent Publication US20140222506A1 to Frazer (hereinafter referred to as “Frazer”).

As per Claim 1, Joseph teaches: (Currently Amended) A computer-implemented method comprising: 
receiving a request to predict transaction quantities for a plurality of transaction entities for a future time period; (in at least [0023] This can be accomplished during any user-configurable time interval but can be performed as often as the user wants.)
identifying historical transaction data for the transaction entities for a plurality of categories of transacted items, wherein the plurality of categories are organized using a hierarchy of levels; (in at least [0024] to collect and maintain vast historical databases of transaction history. Conventional applications called Enterprise Resource Planning (“ERP”) applications have been developed over the years to generate such data. Examples of such conventional ERP packages include SAP, Baan, PeopleSoft, and others. Accordingly, volumes of historical transaction data are available to those businesses that have archived data produced by various ERP applications. Transaction data related to orders, service requests, and other activities is potentially available. [0057]  a separate demand forecast may be generated for each unique good, product and/or service in the dataset and for each different location using a separate machine learning model for each.)
dividing the future time period into a set of multiple time units; (in at least [0038]  the raw data updates 316 may be indexed into a plurality of time intervals to generate time series data. For instance, the raw data may be indexed into 30-minute time intervals so that the demand forecasts can be computed to obtain time series forecast results corresponding to each 30-minute time increment.  [0057] a separate demand forecast may be generated for each unique good, product and/or service in the dataset and for each different location using a separate machine learning model for each. The data may also be divided into time slots or buckets as time series data.)
for each respective time unit of the multiple time units:(in at least [0038]  the raw data updates 316 may be indexed into a plurality of time intervals to generate time series data. For instance, the raw data may be indexed into 30-minute time intervals so that the demand forecasts can be computed to obtain time series forecast results corresponding to each 30-minute time increment. [0057] a separate demand forecast may be generated for each unique good, product and/or service in the dataset and for each different location using a separate machine learning model for each. The data may also be divided into time slots or buckets as time series data.)
iterating over multiple levels of the hierarchy starting at a lowest level, wherein the iterating includes, for each current level in the iteration: (in at least [0031] forecasting demand to a level of granularity that includes the particular good, product, and/or service, and can be performed separately (e.g., using a separate machine learning model) for each specific location of the business or other entity. [0044] observer unit 325 determines that the currently selected model 330 needs to be trained/retrained, it can send a signal to the model (re-)training unit 111 via 328 and the currently selected model can thereafter be trained/retrained using a training data to obtain better results from the selected machine learning model 1-n.)
identifying features to include in a quantity forecasting model for the current level and the time unit; (in at least [0028] to select an optimal machine learning model from among a plurality of available models on the system to apply to one or more datasets that may be continually changing with time in response to updates in the input data, new sources of input data, and/or other external factors. In at least certain embodiments, the system may be adapted to predict demand drivers for determining the workforce requirements for an entity such as past sales, store traffic, seasonality, weather, nearby events, etc. In other cases, the demand may be forecast for other purposes, such as inventory management for example. [0031] Each dataset may contain data referring to different locations and/or categories of items. The system is enabled to select separate models and compute separate forecasts for each of these. In one embodiment a dataset may contain data for just one item category from one location. [0036]  demand forecasts can be generated for 30-minute time increments for each product or service, or other such granularity that can be configured by users for the specific business or entity [0057] a separate demand forecast may be generated for each unique good, product and/or service in the dataset and for each different location using a separate machine learning model for each. The data may also be divided into time slots or buckets as time series data. External factors can also be configured in computing an overall demand forecast. For example, the system may be able to take as inputs information relating to new sources of data or other externals such as sales, store traffic, seasonality, weather, nearby events, etc.)
training the quantity forecasting model for the current level and the time unit including building the identified features into the quantity forecasting model for the current level and the time unit; (in at least [0028] to select an optimal machine learning model from among a plurality of available models on the system to apply to one or more datasets that may be continually changing with time in response to updates in the input data, new sources of input data, and/or other external factors. In at least certain embodiments, the system may be adapted to predict demand drivers for determining the workforce requirements for an entity such as past sales, store traffic, seasonality, weather, nearby events, etc. In other cases, the demand may be forecast for other purposes, such as inventory management for example.  [0039] In FIG. 3 the observer component 325 is shown to be in communication with the model (re-)training component 111 via direct or indirect connection 328 and in communication with the model (re-)selection component 113 via direct or indirect connection 326. The observer component 325 monitors changes in the dataset(s) 322 and other external factors 318 and based thereon determines whether to initiate a model retraining process via component 111 or a model reselection process via component 113. [0057] a separate demand forecast may be generated for each unique good, product and/or service in the dataset and for each different location using a separate machine learning model for each. The data may also be divided into time slots or buckets as time series data. External factors can also be configured in computing an overall demand forecast. For example, the system may be able to take as inputs information relating to new sources of data or other externals such as sales, store traffic, seasonality, weather, nearby events, etc. )
training multiple machine learning transaction ... prediction models using training data retrieved from the historical transaction data for the current level; (in at least [0023] actuals may be received hourly, daily, weekly, etc., and so the model can be automatically recomputed in response to such changes. The models used for forecasting can therefore be constantly refined to get an accurate forecast [0029] Models can be selected from a variety of machine learning algorithms 103 and a number of variations for each algorithm to build machine learning models 105. There are numerous machine learning models to choose from and each may include characteristics that are a better match with a particular dataset than other models. A chart displaying many known types of machine learning algorithms is depicted in FIG. 2. The model or type of model that may be suitable for one dataset or type of data may not be suitable for another. The described embodiments are responsive to different types of data and are adapted to select a machine learning model that is best suited to each unique dataset for providing demand forecast results with the highest accuracy, for example, or that satisfies another one of the various different user-configurable criteria provided to the system.)
identifying, by each respective machine learning transaction ... prediction model, predicted transaction ... in the time unit predicted for the current level; (in at least [0036]  the demand forecasts can be generated for 30-minute time increments for each product or service, or other such granularity that can be configured by users for the specific business or entity [0038] the raw data updates 316 may be indexed into a plurality of time intervals to generate time series data. For instance, the raw data may be indexed into 30-minute time intervals so that the demand forecasts can be computed to obtain time series forecast results corresponding to each 30-minute time increment. Other granularities of time increments are possible and are configurable to the user's particular requirements. The indexed datasets 322 can then be used as the basis for selecting the appropriate machine learning model for demand forecasting.)
comparing each of the multiple machine learning transaction ... prediction models based on prediction accuracy data generated from testing data retrieved from the historical transaction data for the current level; (in at least [0037] The machine learning models 105 can be selected from various machine learning algorithms 103 and variations within each algorithm. Model selection may occur automatically in response to change in the accuracy of the results, changes/updates to the dataset, and/or other external factors. The system can determine when and whether to select a new model or to retrain the currently selected model for optimal results. In other cases the currently selected model may already be providing the most accurate demand forecasts. In those cases the system determines not to initiate either a model retraining or reselection process. Further details of the model (re-) training component 111 and the model selection component 113 are described below in connection with FIG. 3 as they relate to the machine learning models and algorithms. [0038] FIG. 3 depicts a conceptual block diagram of an example embodiment of a system for forecasting demand using automatic machine learning model selection according to the techniques described in this disclosure. System 300 may include some combination of the one or more servers 106, 108 and 114 as shown and described above with respect system 100 of FIG. 1, which may be identified according to the amount and/or type of data that is available for processing. In the illustrated embodiment, system 300 includes a machine learning model (re-)training component 111, a model (re-)selection component 113, an observer component 325, and one or more datasets 322. Raw data updates 316 can be received over one or more networks from one or more data sources (e.g., Square POS system data). The raw data can be indexed in an indexer 320 to generate one or more datasets 322. 
In one embodiment, the raw data updates 316 may be indexed into a plurality of time intervals to generate time series data. For instance, the raw data may be indexed into 30-minute time intervals so that the demand forecasts can be computed to obtain time series forecast results corresponding to each 30-minute time increment. Other granularities of time increments are possible and are configurable to the user's particular requirements. The indexed datasets 322 can then be used as the basis for selecting the appropriate machine learning model for demand forecasting. [0043] The observer 325 can compare the demand forecasts output by the model 330 against the actual historical data once it is received to determine how accurate the demand forecast results were, and to adjust the machine learning model/algorithm in response thereto as appropriate.)
selecting a best machine learning transaction ... prediction model for the current level based on the comparison of all of the multiple machine learning transaction ... prediction models, wherein a first best machine learning transaction ... prediction model selected for a first level is a different type of machine learning model than a second best machine learning transaction ... prediction model selected for a different second level; (in at least [0037] The machine learning models 105 can be selected from various machine learning algorithms 103 and variations within each algorithm. Model selection may occur automatically in response to change in the accuracy of the results, changes/updates to the dataset, and/or other external factors. The system can determine when and whether to select a new model or to retrain the currently selected model for optimal results. In other cases the currently selected model may already be providing the most accurate demand forecasts. In those cases the system determines not to initiate either a model retraining or reselection process. Further details of the model (re-) training component 111 and the model selection component 113 are described below in connection with FIG. 3 as they relate to the machine learning models and algorithms. [0038] FIG. 3 depicts a conceptual block diagram of an example embodiment of a system for forecasting demand using automatic machine learning model selection according to the techniques described in this disclosure. System 300 may include some combination of the one or more servers 106, 108 and 114 as shown and described above with respect system 100 of FIG. 1, which may be identified according to the amount and/or type of data that is available for processing. In the illustrated embodiment, system 300 includes a machine learning model (re-)training component 111, a model (re-)selection component 113, an observer component 325, and one or more datasets 322. 
Raw data updates 316 can be received over one or more networks from one or more data sources (e.g., Square POS system data). The raw data can be indexed in an indexer 320 to generate one or more datasets 322. In one embodiment, the raw data updates 316 may be indexed into a plurality of time intervals to generate time series data. For instance, the raw data may be indexed into 30-minute time intervals so that the demand forecasts can be computed to obtain time series forecast results corresponding to each 30-minute time increment. Other granularities of time increments are possible and are configurable to the user's particular requirements. The indexed datasets 322 can then be used as the basis for selecting the appropriate machine learning model for demand forecasting. [0042] For every dataset and every location there are unique characteristics and thus different algorithms may be better suited to be the basis for a model depending on the dataset. In a preferred embodiment, each unique location and good, product and/or service within a business or other entity may be associated with its own machine learning model that is best suited for the corresponding dataset. For example the best machine learning algorithm for a particular dataset and location could be a “neural network” machine learning algorithm. In such a case there may be different configuration settings for the number of inputs, number of layers and number nodes, for example, to use with the neural network algorithm. Each layer may have a different number of nodes which can be configured in the configuration files/data structures. Configurations can also be used for determining what nodes are connected together in the algorithm or some specified subset of the nodes that are interconnected. 
[0051] If the criteria are satisfied for reselection of the machine learning model (for example to increase its accuracy or make it fit better with the input dataset, etc.), process 400 continues to operation 412 on FIG. 4B which depicts a flow chart of an example embodiment of a process for reselecting a machine learning model for forecasting demand according to the techniques described in this disclosure. Once the system determines that the criteria for model reselection has been satisfied, process 400 continues by training each of the plurality of different machine learning models available on the computer hardware server (or otherwise available to the computer hardware server via one or more network connections) for forecasting demand using the training data to build a plurality of different trained machine learning models (operation 410). A demand forecast for the transaction data for a dataset can then be computed separately for each of the different newly trained machine learning models available to the system (operation 412) and the resulting demand forecast results output from each of the models can be evaluated for accuracy for or for other user-configurable criteria (operation 414). [0052] the evaluation can be performed based on comparing the actual demand from historical data with the forecasted demand results provided by each of the machine learning models to determine which of the available models produces the best demand forecast accuracy for the particular dataset (or is best suited in some other way for the dataset). The model that produces the best accuracy (or satisfies some other user-configurable criteria) can then be selected (operation 416) and used to thereafter process any updates to the dataset(s) (operation 418). 
In one embodiment, the machine learning model that provides a demand forecast with the highest accuracy is automatically selected or retrained based on continually evaluating updates to the transaction data received in real time from the data sources over the computer network(s).)
training the best machine learning transaction ... prediction model based on both the testing data for the current level and the training data for the current level; (in at least [0040] the appropriate machine learning model may be selected and trained using a training dataset. The training dataset may be part of the dataset 322 to be processed, such as 60% as discussed above, while the remaining 40% of the dataset may be used to validate the model once it is trained. Other percentages of training data/processing data are possible since the process of training forecasting models using a set of training data is well known by persons of ordinary skill in the art. The trained model 330 can then be used to process subsequent data updates 316 received over the network(s) to compute demand forecasts using the trained model 330 (or to process new data received from a new data sources using the trained model 330). In one embodiment training model 330 may comprise applying at least a portion of the input data 316 to the selected model 330 to obtain values for the undefined parameters in the model. The trained model 330 can then be used to process the validation data for the dataset (e.g., 40% of the dataset). [0052] provides a demand forecast with the highest accuracy is automatically selected or retrained based on continually evaluating updates to the transaction data received in real time from the data sources over the computer network(s).)
... the features using predicted transaction ... in the time unit generated by the best machine learning transaction ... prediction model and the historical transaction data;  (in at least [0038] The raw data can be indexed in an indexer 320 to generate one or more datasets 322. In one embodiment, the raw data updates 316 may be indexed into a plurality of time intervals to generate time series data. For instance, the raw data may be indexed into 30-minute time intervals so that the demand forecasts can be computed to obtain time series forecast results corresponding to each 30-minute time increment. Other granularities of time increments are possible and are configurable to the user's particular requirements. The indexed datasets 322 can then be used as the basis for selecting the appropriate machine learning model for demand forecasting.)
using the quantity forecasting model to generate predicted quantity information for the current level for the predicted transaction ... in the time unit; and (in at least [0038] the raw data updates 316 may be indexed into a plurality of time intervals to generate time series data. For instance, the raw data may be indexed into 30-minute time intervals so that the demand forecasts can be computed to obtain time series forecast results corresponding to each 30-minute time increment. Other granularities of time increments are possible and are configurable to the user's particular requirements. The indexed datasets 322 can then be used as the basis for selecting the appropriate machine learning model for demand forecasting. [0048] process 400 monitors any new or updated transaction data received over one or more computer networks from one or more data sources and stores the data into one or more of a collection of datasets (operation 402). The demand forecast(s) can then be computed for each dataset in the collection (operation 404).)
using the predicted quantity information for the current level for the predicted transaction ... in the time unit to identify updated features to use in an updated quantity forecasting model for a next time unit of the multiple time units; (in at least [0040] The trained model 330 can then be used to process subsequent data updates 316 received over the network(s) to compute demand forecasts using the trained model 330 (or to process new data received from a new data sources using the trained model 330). )
aggregating predicted quantity information for multiple levels into aggregated quantity prediction information; and (in at least [0032] The demand results for each good, product and/or service available can then be aggregated to compute an overall demand forecast for each particular location of the coffee house. This information can be used to determine the number and qualifications of personnel needed for each category of tasks to be accomplished within each location of the coffee house chain. In other embodiments a dataset may include the data for each location and item category as a separate dataset, or any combination of these item categories and locations.)
providing the aggregated quantity prediction information in response to the request.  (in at least [0036] one or more applications 112 that consume the selected models run on application servers 114 and receive output from the models via a direct communication channel 115 or over the network 120. Employee users and employers (e.g., administrators) may access the applications 112 on the system via network 120. In one embodiment, the system may be accessible via mobile wired or wireless devices with access to the network. User administrators may be provided with account information, such as login information that can be used to set up an account to access and configure the system for each business (or other entity) and for each of its different locations. Employers can login and configure to the system to receive demand forecasts and to generate employee work schedules that match the forecasted demand. )

Although implied, Joseph does not expressly disclose the following limitations, which, however, are taught by Frazer:
...transaction date prediction model... (in at least [0047] The time frames can be as short or as long as desired. For example, the time frame may be a second, or it may be several days. The risk factor is based on the risk that the action takes place over the time frame. The subsequent time frame presents yet another risk factor. The time frames can be equal or can be unequal. The method 100 also includes selecting at least one action based on the predicted likelihood of the occurrence of a future event 114. In marketing, most of the time the at least one action will have a monetary component. In other words, the actions will cost money to perform. In business, it is desirable to get the most effect for the dollar spent. Therefore, selecting the action 114 may also include optimization so that the predictions made can be leveraged across customers and products to meet business goals and objectives within the bounds of resource constraints placed by the business. [0048] feeding back information regarding the occurrence of the event 116. This information is useful in determining or tweaking the relationships or insights between the entities associated with the data as well as predicting the likelihood of occurrence of a future event. Statistics can be kept as to the effectiveness of the predictions for the purpose of pricing the services. The statistics can also be used to determine the timing for retraining models for the predictive component or if some relationships found are no longer significant of it new ones have emerged. [0049] determining the probability of a future event occurring in a first selected time period based on the relationship between the first entity and the second entity 214, and determining the probability of a future action occurring in a second selected time period based on the relationship between the first entity and the second entity 216. 
In some embodiments, the method 200 also includes selecting one of the first selected time period or the second selected time period based on the ranking of the possibility of a future event occurring in the first selected time period 218. The first entity can be a first product and the second entity can be a second product. In other embodiments of the method 200, the first entity can be a product and the second entity can be a customer or consumer. [0164]  The insight/relationship determination module 320 retail mining framework is context rich, i.e. it supports a wide variety of contexts that may be grouped into two types as shown in FIG. 12: market basket context and purchase sequence context. Each type of context allows is further parameterized to define contexts as necessary and appropriate for different applications and for different retailer types. [0410] Demand Forecasting: Because each customer's future purchase can be predicted using purchase sequence analysis, aggregating these by each product gives a good estimate of when, which product might be sold more. )
...predicted transaction dates... (in at least [0047] The time frames can be as short or as long as desired. For example, the time frame may be a second, or it may be several days. The risk factor is based on the risk that the action takes place over the time frame. The subsequent time frame presents yet another risk factor. The time frames can be equal or can be unequal. The method 100 also includes selecting at least one action based on the predicted likelihood of the occurrence of a future event 114. In marketing, most of the time the at least one action will have a monetary component. In other words, the actions will cost money to perform. In business, it is desirable to get the most effect for the dollar spent. Therefore, selecting the action 114 may also include optimization so that the predictions made can be leveraged across customers and products to meet business goals and objectives within the bounds of resource constraints placed by the business. [0048] feeding back information regarding the occurrence of the event 116. This information is useful in determining or tweaking the relationships or insights between the entities associated with the data as well as predicting the likelihood of occurrence of a future event. Statistics can be kept as to the effectiveness of the predictions for the purpose of pricing the services. The statistics can also be used to determine the timing for retraining models for the predictive component or if some relationships found are no longer significant of it new ones have emerged. [0049] determining the probability of a future event occurring in a first selected time period based on the relationship between the first entity and the second entity 214, and determining the probability of a future action occurring in a second selected time period based on the relationship between the first entity and the second entity 216. 
In some embodiments, the method 200 also includes selecting one of the first selected time period or the second selected time period based on the ranking of the possibility of a future event occurring in the first selected time period 218. The first entity can be a first product and the second entity can be a second product. In other embodiments of the method 200, the first entity can be a product and the second entity can be a customer or consumer. [0164]  The insight/relationship determination module 320 retail mining framework is context rich, i.e. it supports a wide variety of contexts that may be grouped into two types as shown in FIG. 12: market basket context and purchase sequence context. Each type of context allows is further parameterized to define contexts as necessary and appropriate for different applications and for different retailer types. [0410] Demand Forecasting: Because each customer's future purchase can be predicted using purchase sequence analysis, aggregating these by each product gives a good estimate of when, which product might be sold more. )
computing values for the features using predicted transaction dates in the time unit generated by the best machine learning transaction date prediction model and the historical transaction data;  (in at least [0053] The model can then be used to project future actions of a person or consumer based on other entities, such as promotions or the product. The future event prediction module 430 is used to determine the possibility of a future event occurring within a number of time frames. The future event prediction module 430 determines the possibility of a future event over at least two selected time frames. The future event prediction module 430 uses a proportional hazard type model. The possibility that an event will occur within a time frame is set forth as a number. The number represents the possibility that the event will occur in the particular time frame. The number is between zero (where it absolutely will not occur) and one (where the event will occur during that particular time frame). The number assigned is actually a probability of the event occurring. Assigning the probability for the various time frames may also be referred to as scoring the possibility or propensity of the future event happening during the time frame. The future event prediction module shifts the emphasis to when an event, such as a purchase, will occur. In other words, the emphasis is not merely a prediction that the event will occur but the prediction is made with finer granularity with respect to the timing of the future event. [0058] The scorecards take into account previous transaction information (in the form of recency and frequency attributes), as well as seasonal information. This information is often very rich and predictive of future behavior. Other potential inputs are customer demographics, behavior summary features, marketing variables, pricing information, economic and competitor data, etc. [0188] The time t in the transaction data is in days. 
Typically, it is not useful to create purchase sequence context at this resolution because at this resolution we may not have enough data, moreover, this may be a finer resolution than the retailer can make actionable decisions on. Therefore, to allow a different time resolution, we introduce a parameter: ρ that quantifies the number of days in each time unit (Δt). For example, if ρ=7, the purchase sequence context is computed at week resolution. [0486] FIG. 28 is a schematic diagram of the analytic process 2710 performed by the predictive time-to-event component 320. The TTE analytic process 2800 is a highly automated process of generating data for and building a large number of scorecards. [0487] There are potentially thousands or even millions of features. The training dataset 2814, 2815, 2816 is appropriately down sampled and labeled for the target.)
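
For illustration only, the time-resolution parameter ρ described in Frazer at [0188] can be sketched as follows. The function names and the use of floor division to assign a day offset to a time unit are assumptions made for this sketch and are not taken from Frazer or from the claims.

```python
from datetime import date


def to_time_unit(day_offset: int, rho: int) -> int:
    """Map a day offset to a coarser time-unit index, with rho days per unit.

    Floor division is an assumed bucketing rule, not one stated in the reference.
    """
    if rho <= 0:
        raise ValueError("rho must be a positive number of days")
    return day_offset // rho


def time_lag_in_units(from_day: date, to_day: date, rho: int) -> int:
    """Time lag between two transaction dates, expressed in units of rho days."""
    return to_time_unit((to_day - from_day).days, rho)


# With rho = 7, two purchases 14 days apart are 2 week-units apart.
lag = time_lag_in_units(date(2022, 1, 1), date(2022, 1, 15), rho=7)
```

With ρ=7 the lag is expressed at week resolution, consistent with the example given at [0188].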


At the time the invention was filed, it would have been obvious for one of ordinary skill in the art to have modified the teachings of Joseph by, ...selecting a next action includes reading transaction data, determining insights and relationships between a first entity and a second entity from the collected transaction data. Once these relationships and insights have been determined, the possibility of a future event occurring in one of a number of selected time periods can be determined using a predictive time-to-event component...The time frames can be as short or as long as desired....feeding back information regarding the occurrence of the event 116. This information is useful in determining or tweaking the relationships or insights between the entities associated with the data as well as predicting the likelihood of occurrence of a future event. Statistics can be kept as to the effectiveness of the predictions for the purpose of pricing the services. The statistics can also be used to determine the timing for retraining models for the predictive component or if some relationships found are no longer significant of it new ones have emerged....determining the probability of a future event occurring in a first selected time period based on the relationship between the first entity and the second entity 214, and determining the probability of a future action occurring in a second selected time period based on the relationship between the first entity and the second entity 216. In some embodiments, the method 200 also includes selecting one of the first selected time period or the second selected time period based on the ranking of the possibility of a future event occurring in the first selected time period 218. The first entity can be a first product and the second entity can be a second product. 
In other embodiments of the method 200, the first entity can be a product and the second entity can be a customer or consumer....The insight/relationship determination module 320 retail mining framework is context rich, i.e. it supports a wide variety of contexts that may be grouped into two types as shown in FIG. 12: market basket context and purchase sequence context. Each type of context allows is further parameterized to define contexts as necessary and appropriate for different applications and for different retailer types. ...Demand Forecasting: Because each customer's future purchase can be predicted using purchase sequence analysis, aggregating these by each product gives a good estimate of when, which product might be sold more...The model can then be used to project future actions of a person or consumer based on other entities, such as promotions or the product. The future event prediction module 430 is used to determine the possibility of a future event occurring within a number of time frames. The future event prediction module 430 determines the possibility of a future event over at least two selected time frames. The future event prediction module 430 uses a proportional hazard type model. The possibility that an event will occur within a time frame is set forth as a number. The number represents the possibility that the event will occur in the particular time frame. The number is between zero (where it absolutely sill not occur) and one (where the event will occur during that particular time frame). The number assigned is actually a probability of the event occurring. Assigning the probability for the various time frames may also be referred to as scoring the possibility or propensity of the future event happening during the time frame. The future event prediction module shifts the emphasis to when an event, such as a purchase, will occur. 
In other words, the emphasis is not merely a prediction that the event will occur but the prediction is made with finer granularity with respect to the timing of the future event....The scorecards take into account previous transaction information (in the form of recency and frequency attributes), as well as seasonal information. This information is often very rich and predictive of future behavior. Other potential inputs are customer demographics, behavior summary features, marketing variables, pricing information, economic and competitor data, etc.)...The time t in the transaction data is in days. Typically, it is not useful to create purchase sequence context at this resolution because at this resolution we may not have enough data, moreover, this may be a finer resolution than the retailer can make actionable decisions on. Therefore, to allow a different time resolution, we introduce a parameter: ρ that quantifies the number of days in each time unit (Δt). For example, if ρ=7, the purchase sequence context is computed at week resolution ... FIG. 28 is a schematic diagram of the analytic process 2710 performed by the predictive time-to-event component 320. The TTE analytic process 2800 is a highly automated process of generating data for and building a large number of scorecards....There are potentially thousands or even millions of features. The training dataset 2814, 2815, 2816 is appropriately down sampled and labeled for the target...propensity matrices can be reviewed for a number of time frames and the occurrences of time for customers for a set of events can be compiled into an optimized offer schedule. FIG. 29 depicts this process 2900. A series of customers and offers are compiled along with multiple selected time periods 2910. The compiled results are input to the offer scheduling optimization process. Constraints 2920 are placed on the process. 
The result is that by considering the constraints a schedule of offers that is substantially optimized 2930 can be produced...Indirect or Derived properties such as aggregates of the line item properties, e.g. total margin of the transaction, total number of products purchased, and market basket diversity across higher level product categories, etc. ...compute a seasonal value of a product in each season as well as its expected value across all seasons...total value of the product u across all seasons..., as taught by Frazer, with a reasonable expectation of success in arriving at the claimed invention. One of ordinary skill in the art would have been motivated to make this modification to the teachings of Joseph for the purpose of ...understanding consumer spending habits...to identify and categorize consumer interests, in order to learn how consumers spend money...advertising and promotions related to these interests will be more successful in obtaining a positive consumer response, such as purchases of the advertised products or services.... ability to model consumer financial behavior based on actual historical spending patterns that reflect the time-related nature of each consumer's purchase. 
Further, it is desirable to extract meaningful classifications of merchants based on the actual spending patterns, and from the combination of these, predict future spending of an individual consumer in specific, meaningful merchant groupings....to encourage repeat purchase behavior and to identify customers with high value growth potential....online analytical processing (OLAP) capabilities to “slice and dice” the purchase data to extract basic statistical reports and use them and other domain language to make marketing decisions....interpolating between a segment and the overall population to create more insights and improve the accuracy of the recommendation engine if it is possible....optimization so that the predictions made can be leveraged across customers and products to meet business goals and objectives within the bounds of resource constraints placed by the business....timing recommendations that will be the most effective in causing the future event...use a customer's past purchase behavior and current market basket to develop accurate, timely, and very effective cross-sell and up-sell offers...effectively in creating product assortment promotions because they capture the latent intentions of customers in a way that was not possible before...., as recited in Frazer.


As per Claim 2, Joseph teaches: (Original) The method of Claim 1, 
where a lowest level of the hierarchy corresponds to a transacted item.  (in at least [0022]  They can be computed not only for different locations, but also for different goods, products and/or services to obtain a plurality of machine learning models for demand forecasting. The framework described herein is adaptable to provide a unique model for each location and each type of data, and to automatically evaluate which is the best model among dozens of available models (for each location/category). Food or beverage orders, mobile orders, in-store orders, different channels, different locations, etc. are some of the examples of the specific categories that can be modeled for forecasting demand. [0032]  a coffee house chain may have three locations and the system can select an optimal machine learning model for each different type of coffee, pastry, or other product available at each of the different locations to yield multiple separate machine learning models for each item category for each location. The separate models can then be optimized using the techniques described herein. The demand results for each good, product and/or service available can then be aggregated to compute an overall demand forecast for each particular location of the coffee house. This information can be used to determine the number and qualifications of personnel needed for each category of tasks to be accomplished within each location of the coffee house chain. In other embodiments a dataset may include the data for each location and item category as a separate dataset, or any combination of these item categories and locations. [0057]  a separate demand forecast may be generated for each unique good, product and/or service in the dataset and for each different location using a separate machine learning model for each. The data may also be divided into time slots or buckets as time series data. External factors can also be configured in computing an overall demand forecast. 
For example, the system may be able to take as inputs information relating to new sources of data or other externals such as sales, store traffic, seasonality, weather, nearby events, etc.)


As per Claim 3, Joseph teaches: (Original) The method of Claim 2, 
wherein higher levels of the hierarchy correspond to more general categories of transacted items.   (in at least [0022]  They can be computed not only for different locations, but also for different goods, products and/or services to obtain a plurality of machine learning models for demand forecasting. The framework described herein is adaptable to provide a unique model for each location and each type of data, and to automatically evaluate which is the best model among dozens of available models (for each location/category). Food or beverage orders, mobile orders, in-store orders, different channels, different locations, etc. are some of the examples of the specific categories that can be modeled for forecasting demand.  [0032]  a coffee house chain may have three locations and the system can select an optimal machine learning model for each different type of coffee, pastry, or other product available at each of the different locations to yield multiple separate machine learning models for each item category for each location. The separate models can then be optimized using the techniques described herein. The demand results for each good, product and/or service available can then be aggregated to compute an overall demand forecast for each particular location of the coffee house. This information can be used to determine the number and qualifications of personnel needed for each category of tasks to be accomplished within each location of the coffee house chain. In other embodiments a dataset may include the data for each location and item category as a separate dataset, or any combination of these item categories and locations. [0057]  a separate demand forecast may be generated for each unique good, product and/or service in the dataset and for each different location using a separate machine learning model for each. The data may also be divided into time slots or buckets as time series data. 
External factors can also be configured in computing an overall demand forecast. For example, the system may be able to take as inputs information relating to new sources of data or other externals such as sales, store traffic, seasonality, weather, nearby events, etc.)


As per Claim 4, Joseph teaches:  (Original) The method of Claim 1, 
wherein the aggregated quantity prediction information is aggregated with ... and quantity prediction.  (in at least [0032] The demand results for each good, product and/or service available can then be aggregated to compute an overall demand forecast for each particular location of the coffee house. This information can be used to determine the number and qualifications of personnel needed for each category of tasks to be accomplished within each location of the coffee house chain. In other embodiments a dataset may include the data for each location and item category as a separate dataset, or any combination of these item categories and locations.)

Although implied, Joseph does not expressly disclose the following limitations, which however, are taught by Frazer,
wherein the aggregated quantity prediction information is aggregated with corresponding predicted transaction dates into an aggregated transaction date and quantity prediction.  (in at least [0167] A time elapsed intention may not cover all its products in a single visit. Sometimes the customer just forgets to buy all the products that may be needed for a particular intention, e.g. a multi-visit birthday party shopping, and may visit the store again the same day or the very next day or week. Sometimes the customer buys products as needed in a time-elapsed intention for example a garage re-modeling or home theater set up that might happen in different stages, the customer may choose to shop for each stage separately. To accommodate both these behaviors, it is useful to have a parametric way to define the appropriate time resolution for a forgot visit, e.g. a week, to a intentional subsequent visit, e.g. 15 to 60 days. [0195] FIG. 14 shows the basic idea of Technique 2. In FIG. 14, each non-empty cell represents a transaction. If the last grey square on the right is the TO transaction, then there are two FROM sets: the union of the two center grey square transactions and the union of the two left grey square transactions resulting, correspondingly, in two context instances...The union of the two becomes the first FROM set resulting in the purchase sequence context instance (the grey square above the time line union=FROM, last grey square on the right=TO, Δt=1). Going further back there are two transactions at Δt=2 (two left most grey squares). The union of these two becomes the second FROM set resulting in the purchase sequence context instance (grey square below the time line union=FROM, last grey square on the right=TO, Δt=1).  [0440] input customer history and the target products are interpreted as market baskets. 
For retailers where timing of purchase is important, the insight/relationship determination module 320 framework provides the ability to use not just what was bought in the past but also when it was bought and use that to recommend not just what will be bought in the future by the customer but also when it is to be bought... As shown in FIG. 21, the purchase sequence context uses the time-lag between any past purchase and the time of recommendation to create both timely and precise recommendations.)

The reason and rationale to combine Joseph and Frazer is the same as recited above.




As per Claim 5, Joseph teaches:  (Original) The method of Claim 4, 
wherein the aggregated transaction ... and quantity prediction comprises ... visits to transaction entities that are predicted to have certain transactions on certain dates an in certain quantities. (in at least [0032] a coffee house chain may have three locations and the system can select an optimal machine learning model for each different type of coffee, pastry, or other product available at each of the different locations to yield multiple separate machine learning models for each item category for each location. The separate models can then be optimized using the techniques described herein. The demand results for each good, product and/or service available can then be aggregated to compute an overall demand forecast for each particular location of the coffee house. This information can be used to determine the number and qualifications of personnel needed for each category of tasks to be accomplished within each location of the coffee house chain. In other embodiments a dataset may include the data for each location and item category as a separate dataset, or any combination of these item categories and locations. [0057] demand forecast may be generated for each unique good, product and/or service in the dataset and for each different location using a separate machine learning model for each. The data may also be divided into time slots or buckets as time series data. External factors can also be configured in computing an overall demand forecast. For example, the system may be able to take as inputs information relating to new sources of data or other externals such as sales, store traffic, seasonality, weather, nearby events, etc.)

Although implied, Joseph does not expressly disclose the following limitations, which however, are taught by Frazer,
wherein the aggregated transaction date and quantity prediction comprises a transaction entity visit schedule for scheduling visits to transaction entities that are predicted to have certain transactions on certain dates an in certain quantities. (in at least [0045] The method 100 also includes predicting the likelihood of the occurrence of a future event 112. In retail situations, the future event many times is the purchase of another product. For example, when a consumer buys a personal computer many times the consumer will follow with purchases of other hardware or software. The consumer may buy a printer or may buy a word processing program shortly after making a computer purchase. The future event can actually include other items, such as an in-store visit. [0057] Predicting store visits, purchases in various departments, or of various products, can be exploited by sending brochures, discount coupons, or by means of a product recommendation engine. [0167] a market basket context instance is defined as a SET of products purchased on one or more consecutive visits. This definition generalizes the notion of a market basket context in a systematic, parametric way. The set of all products purchased by a customer (i) in a single visit, or (ii) in consecutive visits within a time window of (say) two weeks, or (iii) all visits of a customer are all valid parameterized instantiations of different market basket contexts...Time elapsed intentions—As mentioned above, transaction data is a mixture of projections of possibly time-elapsed latent intentions of customers. A time elapsed intention may not cover all its products in a single visit. Sometimes the customer just forgets to buy all the products that may be needed for a particular intention, e.g. a multi-visit birthday party shopping, and may visit the store again the same day or the very next day or week. 
Sometimes the customer buys products as needed in a time-elapsed intention for example a garage re-modeling or home theater set up that might happen in different stages, the customer may choose to shop for each stage separately. To accommodate both these behaviors, it is useful to have a parametric way to define the appropriate time resolution for a forgot visit, e.g. a week, to an intentional subsequent visit, e.g. 15 to 60 days. [0494] Such a propensity matrix 2300 can be used as part of a recommendation engine to answer any of the following questions: What are the best products to recommend to a customer at a certain time, e.g. say today or next week? What are the best customers to whom a particular product should be recommended at a certain time? What is the best time to recommend a particular product to a particular customer?)

The reason and rationale to combine Joseph and Frazer is the same as recited above.



As per Claim 6,  Although implied, Joseph does not expressly disclose the following limitations, which however, are taught by Frazer, (Original) The method of Claim 5, 
further comprising applying at least one inclusion rule to the transaction entity visit schedule.  (in at least [0167] a market basket context instance is defined as a SET of products purchased on one or more consecutive visits. This definition generalizes the notion of a market basket context in a systematic, parametric way. The set of all products purchased by a customer (i) in a single visit, or (ii) in consecutive visits within a time window of (say) two weeks, or (iii) all visits of a customer are all valid parameterized instantiations of different market basket contexts...Time elapsed intentions—As mentioned above, transaction data is a mixture of projections of possibly time-elapsed latent intentions of customers. A time elapsed intention may not cover all its products in a single visit. Sometimes the customer just forgets to buy all the products that may be needed for a particular intention, e.g. a multi-visit birthday party shopping, and may visit the store again the same day or the very next day or week. Sometimes the customer buys products as needed in a time-elapsed intention for example a garage re-modeling or home theater set up that might happen in different stages, the customer may choose to shop for each stage separately. To accommodate both these behaviors, it is useful to have a parametric way to define the appropriate time resolution for a forgot visit, e.g. a week, to a intentional subsequent visit, e.g. 15 to 60 days.  [0436] insight/relationship determination module 320's Market Basket Recommendation Engine may be used. In MBRE customer history is interpreted as a market basket, i.e. current visit, union of recent visits, history weighted all visit.  [0502] propensity matrices can be reviewed for a number of time frames and the occurrences of time for customers for a set of events can be compiled into an optimized offer schedule. FIG. 29 depicts this process 2900. 
A series of customers and offers are compiled along with multiple selected time periods 2910. The compiled results are input to the offer scheduling optimization process. Constraints 2920 are placed on the process. The result is that by considering the constraints a schedule of offers that is substantially optimized 2930 can be produced.)

The reason and rationale to combine Joseph and Frazer is the same as recited above.




As per Claim 7,  Although implied, Joseph does not expressly disclose the following limitations, which however, are taught by Frazer, (Original) The method of Claim 6, 
wherein the at least one inclusion rule includes a minimum visit rule or a minimum predicted transaction value rule.  (in at least [0425] The customer may just browse the product to consider for purchasing such as in clothing, the customer might try-it-on or read the table of contents before buying a book or sampling the music before buying a CD or read the reviews before buying a high end product. The fact that the customer took time at least to browse these products shows that he has some interest in them and, therefore, even if he does not purchase them, they can still be used as part of the customer history along with the products he did purchase.  [0502] propensity matrices can be reviewed for a number of time frames and the occurrences of time for customers for a set of events can be compiled into an optimized offer schedule. FIG. 29 depicts this process 2900. A series of customers and offers are compiled along with multiple selected time periods 2910. The compiled results are input to the offer scheduling optimization process. Constraints 2920 are placed on the process. The result is that by considering the constraints a schedule of offers that is substantially optimized 2930 can be produced. [0503] selecting actions with respect to a plurality of customers also includes selecting a combination of the first, second, third or fourth future events based on optimizing a select amount of resources associated with at least one of the first entity, the second entity and the third entity. )

The reason and rationale to combine Joseph and Frazer is the same as recited above.



As per Claim 8, Joseph teaches:  (Original) The method of Claim 1, 
wherein the predicted quantity information comprises ....   (in at least [0032] The demand results for each good, product and/or service available can then be aggregated to compute an overall demand forecast for each particular location of the coffee house. This information can be used to determine the number and qualifications of personnel needed for each category of tasks to be accomplished within each location of the coffee house chain. In other embodiments a dataset may include the data for each location and item category as a separate dataset, or any combination of these item categories and locations.)


Although implied, Joseph does not expressly disclose the following limitations, which however, are taught by Frazer,
wherein the predicted quantity information comprises predicted transaction value and the method further comprises determining predicted number of units based on the predicted transaction value.  (in at least [0447] compute a seasonal value of a product in each season as well as its expected value across all seasons. Deviation from the expected value quantify the degree of seasonality adjustment. More formally: Let $S=\{s_1, \ldots, s_K\}$ be K seasons. Each season could simply be a start-day and end-day pair. Let $\{V(u \mid s_k)\}_{k=1}^{K}$ denote value, e.g. revenue, margin, etc., of a product u across all seasons. Let $\{N(s_k)\}_{k=1}^{K}$ be the normalizer, e.g. number of customers/transactions for each season. Let $V(u)=\sum_{k=1}^{K} V(u \mid s_k)$ be the total value of the product u across all seasons. [0464] This up-sell business objective might be combined with the recommendation scores by creating a value-score for each product and the value property. i.e. revenue, margin, margin percent, etc. These value-scores are then normalized, e.g. max, z-score, rank, and combined with the recommendation score to increase or decrease the overall score of a high/low value product.)

The reason and rationale to combine Joseph and Frazer is the same as recited above.
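
For illustration only, the total value of a product across seasons described in Frazer at [0447] can be sketched as follows. The function names and the sample figures are hypothetical, and dividing each seasonal value by its normalizer N(s_k) is an assumption about how the normalizer would be applied; the cited passage defines the normalizer but the quoted portion does not state that formula.

```python
def total_value(season_values: dict[str, float]) -> float:
    """V(u): the sum of the per-season values V(u | s_k) of one product."""
    return sum(season_values.values())


def normalized_season_values(
    season_values: dict[str, float], normalizers: dict[str, float]
) -> dict[str, float]:
    """Per-season value divided by N(s_k), e.g. value per customer.

    Assumed use of the normalizer; seasons with a zero or missing
    normalizer are skipped to avoid division by zero.
    """
    return {
        season: value / normalizers[season]
        for season, value in season_values.items()
        if normalizers.get(season)
    }


# Hypothetical revenue of one product in two seasons, with customer counts.
values = {"s1": 100.0, "s2": 50.0}
customers = {"s1": 10, "s2": 5}
overall = total_value(values)
per_customer = normalized_season_values(values, customers)
```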


As per Claim 10, Joseph teaches:  (Original) The method of Claim 1, 
wherein the quantity forecasting model is trained using ... information that has been merged with the historical transaction data.  (in at least [0038] system 300 includes a machine learning model (re-)training component 111, a model (re-)selection component 113, an observer component 325, and one or more datasets 322. Raw data updates 316 can be received over one or more networks from one or more data sources (e.g., Square POS system data). The raw data can be indexed in an indexer 320 to generate one or more datasets 322. In one embodiment, the raw data updates 316 may be indexed into a plurality of time intervals to generate time series data. For instance, the raw data may be indexed into 30-minute time intervals so that the demand forecasts can be computed to obtain time series forecast results corresponding to each 30-minute time increment. Other granularities of time increments are possible and are configurable to the user's particular requirements. The indexed datasets 322 can then be used as the basis for selecting the appropriate machine learning model for demand forecasting. [0057] demand forecast may be generated for each unique good, product and/or service in the dataset and for each different location using a separate machine learning model for each. The data may also be divided into time slots or buckets as time series data. External factors can also be configured in computing an overall demand forecast. For example, the system may be able to take as inputs information relating to new sources of data or other externals such as sales, store traffic, seasonality, weather, nearby events, etc.)

Although implied, Joseph does not expressly disclose the following limitations, which however, are taught by Frazer,
wherein the quantity forecasting model is trained using promotion information that has been merged with the historical transaction data.  (in at least [0491] The result of the process associated with the TTE component 320 and the process 2710, is that a set of propensity matrices can be produced for several future time periods so as to define the relationship between the risk of an event occurring in each of several discrete time periods. It should be noted that the predictors can change their values in each of the future time periods so that a decision can be made to send a marketing offer while it has the most probability of maturing into a sale. [0492] The results as time moves on are fed back to both the insight/relationship determination component 310 and the predictive time-to-event component 320. Scoring is repeated at regular time intervals, as determined by the business (e.g. every night, every weekend, or the like). The score value of a particular individual and a particular event can change over the course of time, either due to recent events experienced by the individual, or due to the passage of time itself. The score values (i.e. likelihoods) of all individuals for all events of interest are input into a decision optimization. For example, a retailer may use the scores in a recommendation engine, which matches customers to products for which they have a high propensity. [0502] propensity matrices can be reviewed for a number of time frames and the occurrences of time for customers for a set of events can be compiled into an optimized offer schedule. FIG. 29 depicts this process 2900. A series of customers and offers are compiled along with multiple selected time periods 2910. The compiled results are input to the offer scheduling optimization process. Constraints 2920 are placed on the process. The result is that by considering the constraints a schedule of offers that is substantially optimized 2930 can be produced.)

The reason and rationale to combine Joseph and Frazer is the same as recited above.


As per Claim 11, Joseph teaches: (Original) The method of Claim 10, 
wherein the features include ...-related features.  (in at least [0038] system 300 includes a machine learning model (re-)training component 111, a model (re-)selection component 113, an observer component 325, and one or more datasets 322. Raw data updates 316 can be received over one or more networks from one or more data sources (e.g., Square POS system data). The raw data can be indexed in an indexer 320 to generate one or more datasets 322. In one embodiment, the raw data updates 316 may be indexed into a plurality of time intervals to generate time series data. For instance, the raw data may be indexed into 30-minute time intervals so that the demand forecasts can be computed to obtain time series forecast results corresponding to each 30-minute time increment. Other granularities of time increments are possible and are configurable to the user's particular requirements. The indexed datasets 322 can then be used as the basis for selecting the appropriate machine learning model for demand forecasting. [0057] demand forecast may be generated for each unique good, product and/or service in the dataset and for each different location using a separate machine learning model for each. The data may also be divided into time slots or buckets as time series data. External factors can also be configured in computing an overall demand forecast. For example, the system may be able to take as inputs information relating to new sources of data or other externals such as sales, store traffic, seasonality, weather, nearby events, etc.)

Although implied, Joseph does not expressly disclose the following limitations, which, however, are taught by Frazer:
wherein the features include promotion-related features. (in at least [0489] The predictive time-to-event component 320 can also produce one propensity matrix or more propensity matrices (which are discussed in more detail below along with FIGS. 23-24) for all customers in the input dataset. [0491] The result of the process associated with the TTE component 320 and the process 2710, is that a set of propensity matrices can be produced for several future time periods so as to define the relationship between the risk of an event occurring in each of several discrete time periods. It should be noted that the predictors can change their values in each of the future time periods so that a decision can be made to send a marketing offer while it has the most probability of maturing into a sale. [0502] propensity matrices can be reviewed for a number of time frames and the occurrences of time for customers for a set of events can be compiled into an optimized offer schedule. FIG. 29 depicts this process 2900. A series of customers and offers are compiled along with multiple selected time periods 2910. The compiled results are input to the offer scheduling optimization process. Constraints 2920 are placed on the process. The result is that by considering the constraints a schedule of offers that is substantially optimized 2930 can be produced.)

The reason and rationale to combine Joseph and Frazer is the same as recited above.



As per Claim 12, although implied, Joseph does not expressly disclose the following limitations, which, however, are taught by Frazer: (Original) The method of Claim 1, 
wherein the features include cumulative sum or expanding mean features.  (in at least [0155] Indirect or Derived properties such as aggregates of the line item properties, e.g. total margin of the transaction, total number of products purchased, and market basket diversity across higher level product categories, etc. [0447] compute a seasonal value of a product in each season as well as its expected value across all seasons...total value of the product u across all seasons)

The reason and rationale to combine Joseph and Frazer is the same as recited above.
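
For illustration only, the following minimal sketch shows how cumulative sum and expanding mean features of the kind recited in Claim 12 could be derived from a series of per-period quantities. The function names and the sample quantities are hypothetical; they are not taken from Joseph, Frazer, or the application, and the sketch is not asserted to be the implementation of any of them.

```python
from itertools import accumulate


def cumulative_sum(values):
    """Running total of all values up to and including each position."""
    return list(accumulate(values))


def expanding_mean(values):
    """Mean of all values up to and including each position."""
    return [
        total / count
        for count, total in enumerate(accumulate(values), start=1)
    ]


# Hypothetical per-period transaction quantities (assumed sample data).
quantities = [4, 6, 2, 8]
cum_sum_feature = cumulative_sum(quantities)
exp_mean_feature = expanding_mean(quantities)
```

Both features at a given position use only values up to that position, which is why features of this kind are commonly used with time-ordered transaction data.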


Claims 13-16 and 17-20, directed to a system (see at least Frazer [0038]) and a computer program product (see at least Frazer [0509]), respectively, substantially recite the subject matter of Claims 1-4 and are rejected based on the same reasoning and rationale.


Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Publication US20200184494A1 to Joseph et al. (hereinafter referred to as “Joseph”) in view of US Patent Publication US20140222506A1 to Frazer (hereinafter referred to as “Frazer”), and further in view of US Patent Publication US20160134723A1 to Gupta et al. (hereinafter referred to as “Gupta”).

As per Claim 21, Joseph teaches: (New) The computer-implemented method of Claim 1, wherein: 
at least some of the machine learning transaction ... prediction models are based on ... training data; and (in at least [0040] the appropriate machine learning model may be selected and trained using a training dataset. The training dataset may be part of the dataset 322 to be processed, such as 60% as discussed above, while the remaining 40% of the dataset may be used to validate the model once it is trained. Other percentages of training data/processing data are possible since the process of training forecasting models using a set of training data is well known by persons of ordinary skill in the art. The trained model 330 can then be used to process subsequent data updates 316 received over the network(s) to compute demand forecasts using the trained model 330 (or to process new data received from a new data sources using the trained model 330). In one embodiment training model 330 may comprise applying at least a portion of the input data 316 to the selected model 330 to obtain values for the undefined parameters in the model. The trained model 330 can then be used to process the validation data for the dataset (e.g., 40% of the dataset). [0052] provides a demand forecast with the highest accuracy is automatically selected or retrained based on continually evaluating updates to the transaction data received in real time from the data sources over the computer network(s).)
wherein comparing each of the multiple machine learning transaction ... prediction models comprises ... prediction output before the comparing.  (in at least [0037] The machine learning models 105 can be selected from various machine learning algorithms 103 and variations within each algorithm. Model selection may occur automatically in response to change in the accuracy of the results, changes/updates to the dataset, and/or other external factors. The system can determine when and whether to select a new model or to retrain the currently selected model for optimal results. In other cases the currently selected model may already be providing the most accurate demand forecasts. In those cases the system determines not to initiate either a model retraining or reselection process. Further details of the model (re-) training component 111 and the model selection component 113 are described below in connection with FIG. 3 as they relate to the machine learning models and algorithms. [0038] FIG. 3 depicts a conceptual block diagram of an example embodiment of a system for forecasting demand using automatic machine learning model selection according to the techniques described in this disclosure. System 300 may include some combination of the one or more servers 106, 108 and 114 as shown and described above with respect system 100 of FIG. 1, which may be identified according to the amount and/or type of data that is available for processing. In the illustrated embodiment, system 300 includes a machine learning model (re-)training component 111, a model (re-)selection component 113, an observer component 325, and one or more datasets 322. Raw data updates 316 can be received over one or more networks from one or more data sources (e.g., Square POS system data). The raw data can be indexed in an indexer 320 to generate one or more datasets 322. In one embodiment, the raw data updates 316 may be indexed into a plurality of time intervals to generate time series data. 
For instance, the raw data may be indexed into 30-minute time intervals so that the demand forecasts can be computed to obtain time series forecast results corresponding to each 30-minute time increment. Other granularities of time increments are possible and are configurable to the user's particular requirements. The indexed datasets 322 can then be used as the basis for selecting the appropriate machine learning model for demand forecasting. [0043] The observer 325 can compare the demand forecasts output by the model 330 against the actual historical data once it is received to determine how accurate the demand forecast results were, and to adjust the machine learning model/algorithm in response thereto as appropriate. [0040] the appropriate machine learning model may be selected and trained using a training dataset. The training dataset may be part of the dataset 322 to be processed, such as 60% as discussed above, while the remaining 40% of the dataset may be used to validate the model once it is trained. Other percentages of training data/processing data are possible since the process of training forecasting models using a set of training data is well known by persons of ordinary skill in the art. The trained model 330 can then be used to process subsequent data updates 316 received over the network(s) to compute demand forecasts using the trained model 330 (or to process new data received from a new data sources using the trained model 330). In one embodiment training model 330 may comprise applying at least a portion of the input data 316 to the selected model 330 to obtain values for the undefined parameters in the model. The trained model 330 can then be used to process the validation data for the dataset (e.g., 40% of the dataset). 
[0052] provides a demand forecast with the highest accuracy is automatically selected or retrained based on continually evaluating updates to the transaction data received in real time from the data sources over the computer network(s).)
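
For illustration only, the following minimal sketch shows the general pattern described in the cited passages of Joseph: part of a dataset (here 60%) is used for training, the remainder (40%) for validation, and the candidate model with the highest validation accuracy is selected. The two candidate models, the error metric (mean absolute error), the function names, and the sample data are assumptions made for this sketch and are not taken from Joseph.

```python
def split_dataset(series, train_percent=60):
    """Split a time-ordered series into training and validation parts."""
    cut = len(series) * train_percent // 100
    return series[:cut], series[cut:]


def fit_mean_model(train):
    """Hypothetical candidate: always forecast the training mean."""
    mean = sum(train) / len(train)
    return lambda: mean


def fit_last_value_model(train):
    """Hypothetical candidate: always forecast the last training value."""
    last = train[-1]
    return lambda: last


def mean_absolute_error(predict, validation):
    """Average absolute gap between the forecast and each actual value."""
    return sum(abs(predict() - actual) for actual in validation) / len(validation)


def select_best_model(series, fitters):
    """Train every candidate on the training part; keep the lowest-error one."""
    train, validation = split_dataset(series)
    errors = {
        name: mean_absolute_error(fit(train), validation)
        for name, fit in fitters.items()
    }
    return min(errors, key=errors.get), errors


# Assumed sample demand data, not drawn from any cited reference.
demand = [10, 12, 11, 13, 12, 14, 15, 14, 16, 15]
best_name, errors = select_best_model(
    demand, {"mean": fit_mean_model, "last_value": fit_last_value_model}
)
```

Repeating `select_best_model` whenever new data arrives would correspond, in this simplified form, to the re-selection behavior the cited passages describe.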


 
 
Although implied, Joseph does not expressly disclose the following limitations, which, however, are taught by Frazer:
...transaction date prediction models ... (in at least [0047] The time frames can be as short or as long as desired. For example, the time frame may be a second, or it may be several days. The risk factor is based on the risk that the action takes place over the time frame. The subsequent time frame presents yet another risk factor. The time frames can be equal or can be unequal. The method 100 also includes selecting at least one action based on the predicted likelihood of the occurrence of a future event 114. In marketing, most of the time the at least one action will have a monetary component. In other words, the actions will cost money to perform. In business, it is desirable to get the most effect for the dollar spent. Therefore, selecting the action 114 may also include optimization so that the predictions made can be leveraged across customers and products to meet business goals and objectives within the bounds of resource constraints placed by the business. [0048] feeding back information regarding the occurrence of the event 116. This information is useful in determining or tweaking the relationships or insights between the entities associated with the data as well as predicting the likelihood of occurrence of a future event. Statistics can be kept as to the effectiveness of the predictions for the purpose of pricing the services. The statistics can also be used to determine the timing for retraining models for the predictive component or if some relationships found are no longer significant or if new ones have emerged. [0049] determining the probability of a future event occurring in a first selected time period based on the relationship between the first entity and the second entity 214, and determining the probability of a future action occurring in a second selected time period based on the relationship between the first entity and the second entity 216. 
In some embodiments, the method 200 also includes selecting one of the first selected time period or the second selected time period based on the ranking of the possibility of a future event occurring in the first selected time period 218. The first entity can be a first product and the second entity can be a second product. In other embodiments of the method 200, the first entity can be a product and the second entity can be a customer or consumer. [0164] The insight/relationship determination module 320 retail mining framework is context rich, i.e. it supports a wide variety of contexts that may be grouped into two types as shown in FIG. 12: market basket context and purchase sequence context. Each type of context is further parameterized to define contexts as necessary and appropriate for different applications and for different retailer types. [0410] Demand Forecasting: Because each customer's future purchase can be predicted using purchase sequence analysis, aggregating these by each product gives a good estimate of when, which product might be sold more.)

The reason and rationale to combine Joseph and Frazer is the same as recited above.


 
Although implied, Joseph in view of Frazer does not expressly disclose the following limitations, which, however, are taught by Gupta:
... compressed ... ; and (in at least [0002] configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems. [0045] The client 502 may also include a decompression unit 506 configured to decompress the received data, if the data has been compressed by the web server. [0046] ACM 310 may determine that compressed data should be returned, and the web server 402 may return 606 compressed data to the web client 502. The decompression module 602 on the client 502 may then decompress the received data. In certain embodiments, communication of compressed data may be a default setting)
... decompressing ... (in at least [0045] The client 502 may also include a decompression unit 506 configured to decompress the received data, if the data has been compressed by the web server. [0046] ACM 310 may determine that compressed data should be returned, and the web server 402 may return 606 compressed data to the web client 502. The decompression module 602 on the client 502 may then decompress the received data. In certain embodiments, communication of compressed data may be a default setting.)

At the time the invention was filed, it would have been obvious to one of ordinary skill in the art to have modified the teachings of Joseph in view of Frazer by, ... adaptive compression management for web services ... receiving a request for a web-service data operation. The method may also include identifying, using a data processing device, a network performance statistic for characterizing a data link between a web server and a web client. ... determining, using the data processing device, a data size threshold in response to the network performance statistic... determining, using the data processing device, whether to compress data associated with the web-service operation in response to the data size threshold....configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems....The client 502 may also include a decompression unit 506 configured to decompress the received data, if the data has been compressed by the web server...ACM 310 may determine that compressed data should be returned, and the web server 402 may return 606 compressed data to the web client 502. The decompression module 602 on the client 502 may then decompress the received data. In certain embodiments, communication of compressed data may be a default setting, as taught by Gupta, with a reasonable expectation of success in arriving at the claimed invention. One of ordinary skill in the art would have been motivated to make this modification to the teachings of Joseph in view of Frazer for the reasons of, ... handling needs and requirements vary between different users or applications... reduce overall data transmission times and improve overall system performance...saving system resources and improving system performance...saving the overhead of data compression...configured for a specific user or specific use such as financial transaction processing..., as recited in Gupta.
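
For illustration only, the following minimal sketch shows threshold-based compression of the general kind described in the cited passages of Gupta: the sender compresses data only when its size exceeds a data size threshold, and the receiver decompresses the data if it was compressed. The function names, the use of Python's standard `zlib` module, and the fixed threshold value are assumptions made for this sketch; Gupta describes determining the threshold from a network performance statistic, which is not modeled here.

```python
import zlib


def prepare_response(payload: bytes, size_threshold: int):
    """Sender side: compress only when the payload exceeds the threshold."""
    if len(payload) > size_threshold:
        return zlib.compress(payload), True
    return payload, False


def read_response(body: bytes, is_compressed: bool) -> bytes:
    """Receiver side: decompress only if the sender compressed the data."""
    return zlib.decompress(body) if is_compressed else body


# Assumed sample payloads and threshold (hypothetical values).
large_payload = b"transaction," * 200
small_payload = b"ok"
threshold = 1024

large_body, large_flag = prepare_response(large_payload, threshold)
small_body, small_flag = prepare_response(small_payload, threshold)
```

Skipping compression for small payloads avoids the compression overhead where it would not pay off, which is the trade-off the quoted motivation refers to.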


 









Conclusion
Relevant prior art not relied upon:
Jumper, US20210117851A1, Disclosed herein are system, method, and computer program product embodiments for generating labels for training a machine learning model using an incremental time window process. The described process may be used in a recurrence detection system. A dataset may be analyzed using incremental split dates to divide the dataset into an analysis portion and a holdout portion. The analysis portion may be analyzed to determine input features related to a predicted recurrence in the dataset. The holdout portion may be tested against the analysis portion and the input features to generate a label. The label may indicate whether or not the holdout portion confirms the prediction. The testing of the holdout portion against the analysis portion may be repeated by incrementally using different split dates and multiple separate analysis portions and holdout portions to generate multiple labels and corresponding input features.

Faro, US20140108209A1, A method of monitoring cashless transaction data includes extracting transaction history data for a merchant within a first predetermined time period from a transaction data storage. A plurality of forecast models that forecast transaction patterns for the merchant over the first predetermined time period are provided, each forecast model of the first plurality having a different forecast period. The forecast models may be constructed according to a Holt-Winters time-series forecast. The forecast model which most accurately forecasts periodic fluctuations in the transaction history data is selected, and a first forecast of transactions for the merchant for a first forecast period outside the first predetermined time period is provided. A score comparing the forecast with actual data for the first forecast period is further provided. An alert is generated when the score exceeds a predetermined threshold. Also disclosed are a computer and recording medium to carry out the method.

Barsness, US20180268003A1, Disclosed aspects relate to managing a database management system (DBMS) using a set of stream computing data derived from a stream computing environment. The set of stream computing data which indicates a set of stream computing environment statistics may be collected with respect to the stream computing environment. A proactive database management operation may be determined for performance with respect to the DBMS based on the set of stream computing data which indicates the set of stream computing environment statistics. The proactive database management operation may be performed to manage the DBMS using the set of stream computing data.

Arelakis, US20180138921A1, Methods, devices and systems enhance compression and decompression of data blocks of data values by selecting the best suited compression method and device among two or a plurality of compression methods and devices, which are combined together and which said compression methods and devices compress effectively data values of particular data types; said best suited compression method and device is selected using as main selection criterion the dominating data type in a data block by predicting the data types within said data block.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PO HAN MAX LEE whose telephone number is (571) 272-3821.  The examiner can normally be reached on Mon-Thurs 8:00 am - 7:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Rutao Wu can be reached on (571) 272-6045.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/PO HAN MAX LEE/Examiner, Art Unit 3623


/CHARLES GUILIANO/Primary Examiner, Art Unit 3623