Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of the Application 
	Claims 1-20 have been examined in this application. This communication is the first action on the merits. 

Allowable Subject Matter
Claims 3, 4, and 8 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 9, 11, and 12 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication Number 20190385080 (“Wu”) in view of US Patent Application Publication Number 20210110429 (“Keng”).
Claim 1
	As per claim 1, Wu teaches a computer implemented method comprising: 
	training a base model with existing customer records describing purchases within an item collection having a plurality of categories, the existing customer records being associated with items across the plurality of categories, the base model comprising a neural network ([0011] “creating structured data based on the transaction data; identifying transactions in the structured data that are associated with a purchase category; labelling the transactions associated with the purchase category; and training the model using the structured data and the labelled transactions.” And, [0022] “label the transactions associated with the purchase category; use a recurrent neural network to build a model, using the structured data and the labels as training data.” And, [0068] “purchases or transactions in data structure may be identified and labeled, for example, to a purchase category. Labelling transactions, for example, when a client has or has not purchased a car based on transaction history.” Examiner interprets the labeled transactions associated with purchase categories as customer records describing purchases within an item collection having a plurality of categories.);
generate relevancy predictions for one or more of the plurality of categories, the relevancy predictions being specific to at least one particular user ([0012] “predicting a likelihood of the user purchasing a product associated with the purchase category, using the model.” Examiner interprets the predicted likelihood as a relevancy prediction being specific to the particular user.)
Wu teaches generating relevancy predictions (Wu [0012]) but does not explicitly teach the following feature taught by Keng: 
tuning the base model for at least one specific promotion task via a transfer learning process to generate relevancy predictions ([0152] “The system 100 uses a combination of constrained optimization, prediction, and reinforcement learning to directly optimize the generation of promotional materials . . . system 100 leverages the historical data to make predictions about what products and promotions should be placed on the promotional materials to optimize the results of the promotional materials.” And, [0169] “Prior promotional materials data can be imported into the system to enable the machine learning module to benefit from prior experience. This instantiation step is a mix of constrained optimization and conditional rules.” And, [0170] “the steady state of the machine learning module shifts over to a reinforcement learning hybrid approach. As further data is collected by the system 100, the building blocks are re-trained and re-scored, and, as a result, new predictions are made for to be provided to the machine learning module 120. This reinforcement learning and feedback approach can be invoked repeatedly to further hone the scores. As this process continues, various iterations occur whereby promotional materials are distributed, outcomes are received and the building blocks are re-trained and re-scored.” And, [0166] “generate predictive or explanatory scores of the outcomes of the promotional materials.”).
Wu teaches relevancy predictions (Wu [0012]) but does not explicitly teach the following feature taught by Keng:
combining the relevancy predictions and advertising revenue to estimate overall performance of the at least one specific promotion task ([0015] “model to predict an average effective discounted price of the promotion based on a category of products.” And, [0152] “system 100 uses a combination of constrained optimization, prediction, and reinforcement learning to directly optimize the generation of promotional materials. For example, optimization can result in any output measure, such as revenue . . . system leverages the historical data to make predictions about what products and promotions should be placed on the promotional materials to optimize the results of the promotional material.” And, [0166] “[t]he scores can be based on the historical data . . . the predicted selling quantities of one or more products due to the distribution of the promotional materials, the predicted profit or revenue generated by the promotional materials, the return on investment of the promotional materials.”). 
Therefore, it would have been obvious for Wu to include tuning the base model for at least one specific promotion task via a transfer learning process to generate relevancy predictions and combining the relevancy predictions and advertising revenue to estimate overall performance of the at least one specific promotion task as further taught by Keng in order to “use[] a combination of constrained optimization, prediction, and reinforcement learning to directly optimize the generation of promotional materials” which “can result in . . . revenue uplift” (Keng [0152]).

Claim 9
As per claim 9, Wu does not explicitly teach but Keng teaches: 
wherein the at least one specific promotion task corresponds to a particular brand or particular category from among the plurality of categories ([0111] “The regression promotion forecasting model can be trained both on a per SKU and, in some cases, on a per brand or subcategory level to predict demand.” And, [0121] “model can be trained on a pooled SKU basis, such as across a subcategory or brand.”). 
Therefore, it would have been obvious to modify the combination of Wu and Keng to include wherein the at least one specific promotion task corresponds to a particular brand or particular category from among the plurality of categories as further taught by Keng in order to “use[] a combination of constrained optimization, prediction, and reinforcement learning to directly optimize the generation of promotional materials” which “can result in . . . revenue uplift” (Keng [0152]).

Claim 11
	As per claim 11, Wu does not explicitly teach but Keng teaches: 
	further comprising, based on the estimate of overall performance, implementing the at least one specific promotion task ([0152] “system 100 uses a combination of constrained optimization, prediction, and reinforcement learning to directly optimize the generation of promotional materials. For example, optimization can result in any output measure, such as revenue.” And, [0155] “the machine learning module 120 selects which products are to be included in the promotional materials based on a machine learning model. The selection of products can include determining which products, from a predetermined roster of products, are optimally ready to be promoted.” And, [0156] “the machine learning module 120 selects the configuration and layout of the products on the promotional materials based on the machine learning model.”).


Claim 12
	As per claim 12, Wu teaches a computer implemented method comprising: 
a computing system including a processor ([0022] “processor”) operatively coupled to a memory subsystem, the memory subsystem storing customer records and instructions ([0022] “memory in communication with the processor, the memory storing instructions that, when executed by the processor cause the processor to: receive purchase transaction data associated with one or more first users over time; create structured data based on the purchase transaction data.” And, [0057] “data structuring module for structuring data from raw data sources.”) which, when executed, cause the computing system to:
train a base model with the customer records, the customer records describing purchases within an item collection having a plurality of categories and being associated with items across the plurality of categories, the base model comprising a neural network ([0011] “creating structured data based on the transaction data; identifying transactions in the structured data that are associated with a purchase category; labelling the transactions associated with the purchase category; and training the model using the structured data and the labelled transactions.” And, [0022] “label the transactions associated with the purchase category; use a recurrent neural network to build a model, using the structured data and the labels as training data.” And, [0068] “purchases or transactions in data structure may be identified and labeled, for example, to a purchase category. Labelling transactions, for example, when a client has or has not purchased a car based on transaction history.” Examiner interprets the labeled transactions associated with purchase categories as customer records describing purchases within an item collection having a plurality of categories.);
generate relevancy predictions for one or more of the plurality of categories, the relevancy predictions being specific to at least one particular user ([0012] “predicting a likelihood of the user purchasing a product associated with the purchase category, using the model.” Examiner interprets the predicted likelihood as a relevancy prediction being specific to the particular user.)
Wu teaches generating relevancy predictions (Wu [0012]) but does not explicitly teach the following feature taught by Keng: 
tune the base model for at least one specific promotion task via a transfer learning process to generate relevancy predictions ([0152] “The system 100 uses a combination of constrained optimization, prediction, and reinforcement learning to directly optimize the generation of promotional materials . . . system 100 leverages the historical data to make predictions about what products and promotions should be placed on the promotional materials to optimize the results of the promotional materials.” And, [0169] “Prior promotional materials data can be imported into the system to enable the machine learning module to benefit from prior experience. This instantiation step is a mix of constrained optimization and conditional rules.” And, [0170] “the steady state of the machine learning module shifts over to a reinforcement learning hybrid approach. As further data is collected by the system 100, the building blocks are re-trained and re-scored, and, as a result, new predictions are made for to be provided to the machine learning module 120. This reinforcement learning and feedback approach can be invoked repeatedly to further hone the scores. As this process continues, various iterations occur whereby promotional materials are distributed, outcomes are received and the building blocks are re-trained and re-scored.” And, [0166] “generate predictive or explanatory scores of the outcomes of the promotional materials.”).
Wu teaches relevancy predictions (Wu [0012]) but does not explicitly teach the following feature taught by Keng:
combine the relevancy predictions and advertising revenue to estimate overall revenue ([0015] “model to predict an average effective discounted price of the promotion based on a category of products.” And, [0152] “system 100 uses a combination of constrained optimization, prediction, and reinforcement learning to directly optimize the generation of promotional materials. For example, optimization can result in any output measure, such as revenue . . . system leverages the historical data to make predictions about what products and promotions should be placed on the promotional materials to optimize the results of the promotional material.” And, [0166] “[t]he scores can be based on the historical data . . . the predicted selling quantities of one or more products due to the distribution of the promotional materials, the predicted profit or revenue generated by the promotional materials, the return on investment of the promotional materials.”). 
Therefore, it would have been obvious for Wu to include tune the base model for at least one specific promotion task via a transfer learning process to generate relevancy predictions and combine the relevancy predictions and advertising revenue to estimate overall revenue as further taught by Keng in order to “use[] a combination of constrained optimization, prediction, and reinforcement learning to directly optimize the generation of promotional materials” which “can result in . . . revenue uplift” (Keng [0152]).


Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication Number 20190385080 (“Wu”) in view of US Patent Application Publication Number 20210110429 (“Keng”) as applied to claim 1 above, and in further view of US Patent Application Publication Number 20140244361 (“Zhang”).
Claim 2
	As per claim 2, Wu teaches predicting a likelihood of the user purchasing a product associated with the purchase category ([0012]) but does not explicitly teach the following feature taught by Zhang:
wherein the base model has an input feature vector X and an output feature matrix Y, wherein each of a plurality of values in the input feature vector X corresponds to a score for a customer associated with a particular product category, and wherein each of a plurality of values in the output feature matrix Y corresponds to a purchase propensity score ([0081] “his rank is obtained by first estimating the probability P(u, e) of a user u buying in each category e, and by successively ranking the probabilities.” And, [0106] “For each user u the system may rank categories by assigning to each category e the ranking score.” And, [0123] “For training, a user u is represented by a feature vector, and the label is the ranking score gsRank(u, e). During testing, for each user the predicted gsRank scores for each category are gathered as produced by the 35 models, and the categories are ranked accordingly. The L2 regularization parameter is optimized on a subset of the training set.” And, [0121] “A standard Naive Bayes model can be used, which for each user-category pair predicts the probability that the user will purchase from the category. The algorithm returns the ranked list of categories for each user.”).
Therefore, it would have been obvious to modify the combination of Wu and Keng to include wherein the base model has an input feature vector X and an output feature matrix Y, wherein each of a plurality of values in the input feature vector X corresponds to a score for a customer associated with a particular product category, and wherein each of a plurality of values in the output feature matrix Y corresponds to a purchase propensity score as taught by Zhang in order to “improv[e] existing product recommendation engines [i.e., the system in Wu], by providing category-level priors that can guide the recommender system to find domains of interest for the user” (Zhang [0023]).
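For illustration only, the per-user category ranking described in the cited Zhang passages (estimating a purchase probability for each category and ordering the categories accordingly) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the category names, and the score values are hypothetical and are not taken from Zhang, Wu, or the application.

```python
def rank_categories(propensity):
    """Return category names ordered from highest to lowest purchase propensity."""
    return sorted(propensity, key=propensity.get, reverse=True)


# Hypothetical propensity scores for a single user (made-up values).
scores = {"grocery": 0.62, "apparel": 0.17, "electronics": 0.35}

# Categories ordered by the estimated probability that the user buys from them.
ranking = rank_categories(scores)
```

In this sketch the dictionary stands in for one row of propensity scores; how such scores are produced is outside the scope of the example.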


Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication Number 20190385080 (“Wu”) in view of US Patent Application Publication Number 20210110429 (“Keng”) as applied to claim 1 above, and in further view of US Patent Application Publication Number 20200167448 (“Modarresi”).
Claim 5
	As per claim 5, Wu teaches predicting a likelihood of the user purchasing a product associated with the purchase category ([0012]) but does not explicitly teach the following feature taught by Modarresi:
wherein tuning the base model comprises removing a final layer of the base model and retraining a new final layer to predict campaign-specific sales ([0003] “deep neural network can be automatically grown by adding a neuron to the output layer with new connections to each neuron in the preceding layer.” And, [0045] “adding a dimension (e.g., a neuron) to the model (e.g., the output layer).” And, [0050] “new connections are added between the new neuron and each neuron in the previous layer of the network.” And, [0045] “model may be automatically and/or periodically retrained using the updated training dataset.” And, [0046] “output prediction can be used to tailor content delivery or advertisements for a predicted user, to associate a purchase or other revenue generation event . . . with other digital interactions from the same user.”).
Therefore, it would have been obvious to modify the combination of Wu and Keng to include wherein tuning the base model comprises removing a final layer of the base model and retraining a new final layer to predict campaign-specific sales as taught by Modarresi “in order to optimize the return/reward derived from the offers” (Modarresi [0001]). 
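For illustration only, the claimed operation of removing a final layer and retraining a new final layer can be sketched as follows. This is a minimal sketch under stated assumptions: the single tanh hidden layer, the linear head, the gradient-descent fit, and all weights and data values are hypothetical and are not drawn from Modarresi, Wu, Keng, or the application.

```python
import math


def hidden(x, w_hidden):
    """Frozen feature extractor: one tanh layer kept from the base model."""
    return [math.tanh(sum(wi * xi for wi, xi in zip(row, x))) for row in w_hidden]


def retrain_head(features, targets, lr=0.1, epochs=500):
    """Fit a new linear final layer on the frozen features by gradient descent."""
    w = [0.0] * len(features[0])
    for _ in range(epochs):
        for f, t in zip(features, targets):
            err = sum(wi * fi for wi, fi in zip(w, f)) - t
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
    return w


# Hypothetical base model: the hidden weights are kept, the old final layer is dropped.
base = {"hidden": [[0.5, -0.2], [0.1, 0.8]], "final": [0.3, -0.7]}
frozen = [row[:] for row in base["hidden"]]

# Made-up campaign data: inputs and campaign-specific sales targets.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]

# Only the new final layer is trained; the hidden layer is never updated.
feats = [hidden(x, frozen) for x in X]
tuned = {"hidden": frozen, "final": retrain_head(feats, y)}
```

The design point of the sketch is that `retrain_head` never touches the hidden weights, so whatever the base model learned from the broader customer records is reused unchanged for the campaign-specific output.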

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication Number 20190385080 (“Wu”) in view of US Patent Application Publication Number 20210110429 (“Keng”) as applied to claim 1 above, and in further view of Australian Patent Application Publication Number 2011279407 (“Gandhi”).
Claim 6
	As per claim 6, Wu teaches predicting a likelihood of the user purchasing a product associated with the purchase category ([0012]) but does not explicitly teach the following feature taught by Gandhi:
wherein combining the relevancy predictions and advertising revenue to estimate overall revenue comprises optimizing total revenue by selecting one or more advertising campaigns for each customer to balance total revenue (Gandhi [page 13, lines 25-31] “the selection of advertisements for distribution to an individual in response to an advertisement request balances advertiser's preferences, contextual relevance, revenue considerations, and/or advertisement quality considerations to select the best advertisements or other content to the individual.”). 
Therefore, it would have been obvious to modify the combination of Wu and Keng to include wherein combining the relevancy predictions and advertising revenue to estimate overall revenue comprises optimizing total revenue by selecting one or more advertising campaigns for each customer to balance total revenue as taught by Gandhi in order to “improve the performance of an advertisement, or to achieve a desired value of one or more performance metrics” (Gandhi [page 34, lines 25-31]).

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication Number 20190385080 (“Wu”) in view of US Patent Application Publication Number 20210110429 (“Keng”) in view of Australian Patent Application Publication Number 2011279407 (“Gandhi”) as applied to claim 6 above, and in further view of Japanese Patent Application Publication Number 2015219611 (“Sotaro”).
Claim 7
As per claim 7, Wu does not explicitly teach but Sotaro teaches:
wherein total revenue comprises a sum of retail revenue from a customer and promotional revenue from a sponsor (Sotaro [page 3] “the advertising device calculates, as the mobilization unit price, the sum of the revenue obtained from the provision of advertising content . . . and the revenue obtained from the provision of sales service on the advertising device side.”). 
Therefore, it would have been obvious to modify the combination of Wu, Keng, and Gandhi to include wherein total revenue comprises a sum of retail revenue from a customer and promotional revenue from a sponsor as taught by Sotaro in order to take into account all forms of revenue so that “the error of a predicted value may be reduced” (Sotaro [page 24]).
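For illustration only, the limitations of claims 6 and 7 (total revenue as a sum of retail revenue and promotional revenue, used to select a campaign for a customer) can be sketched as follows. This is a minimal sketch under stated assumptions: the revenue formula, the campaign names, and every numeric value are hypothetical and are not drawn from Gandhi, Sotaro, or the application.

```python
def total_revenue(relevancy, basket_value, sponsor_fee):
    """Expected retail revenue from the customer plus promotional revenue from the sponsor."""
    return relevancy * basket_value + sponsor_fee


# Hypothetical campaigns for one customer:
# (relevancy prediction, basket value if the customer buys, sponsor fee).
campaigns = {
    "brand_a": (0.30, 40.0, 0.50),
    "brand_b": (0.10, 90.0, 2.00),
    "brand_c": (0.45, 20.0, 0.25),
}

# Estimate total revenue per campaign, then pick the campaign with the largest estimate.
estimates = {name: total_revenue(*args) for name, args in campaigns.items()}
best = max(estimates, key=estimates.get)
```

With these made-up numbers the campaign with the highest sponsor fee is not selected, because its low relevancy prediction reduces the expected retail revenue.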

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication Number 20190385080 (“Wu”) in view of US Patent Application Publication Number 20210110429 (“Keng”) as applied to claim 1 above, and in further view of US Patent Application Publication Number 20210264195 (“Ingram”) and US Patent Application Publication Number 20090198602 (“Wang”).
Claim 10
As per claim 10, Wu does not explicitly teach but Ingram teaches:
wherein the existing customer records include records of item views, additions of items to a shopping cart ([0138] “The logic can track a product-level event, such as a product image viewed on a web page hosting a product detail, a product added to an electronic shopping cart or wallet, a product purchased, or others. In particular, the computing platform 104 can auto-track product views.” And, [0135] “various image views are recorded along with a number of times the image has been viewed.”). 
Therefore, it would have been obvious to modify the combination of Wu and Keng to include wherein the existing customer records include records of item views, additions of items to a shopping cart as taught by Ingram in order to “allow[] for future correlation of user behavior with user characteristics” (Ingram [0131]) so that user interest in products (and associated advertising effectiveness) can be more easily predicted. 
Wu does not explicitly teach but Wang teaches:
customer records include category spend for each of a plurality of users ([0009] “the system classifies the spending data for the set of users into the set of categories so that each category is associated with a set of spending records for the set of users. Next, for each category, the system creates a set of bins, wherein each bin represents an interval of spending strength for the category.”). 
Therefore, it would have been obvious to modify the combination of Wu, Keng, and Ingram to include customer records include category spend for each of a plurality of users as taught by Wang in order to “allow[] personal spending behaviors for different users to be easily compared and measured” (Wang [0050]) and “provide a rough estimate of how ‘interesting’ the offer is to a specific user” (Wang [0063]).

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication Number 20190385080 (“Wu”) in view of US Patent Application Publication Number 20210110429 (“Keng”) as applied to claim 12 above, and in further view of US Patent Application Publication Number 20130332275 (“Takami”).
Claim 13 
	As per claim 13, Wu does not explicitly teach but Takami teaches: 
	wherein the computing system is further configured to automatically identify one or more recommended promotions and transmit an identifier of the one or more recommended promotions to a retail website ([0007] “an advertisement most suitable for content of a web page or a preference of a user may not be selected.” And, [0067] “The advertisement selection unit has a function of transmitting information on the selected advertisement to the web server. The information to be transmitted may be, for example, only a selected advertisement ID.” And, [0042] “web system 2 includes a web server 10 that provides the web page.”). 
Therefore, it would have been obvious to modify the combination of Wu and Keng to include wherein the computing system is further configured to automatically identify one or more recommended promotions and transmit an identifier of the one or more recommended promotions to a retail website as taught by Takami in order “to select the advertisement most suitable for content of the web page” (Takami [0021]) resulting in increased advertisement effectiveness. 

Claims 14-16 are rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication Number 20190385080 (“Wu”) in view of US Patent Application Publication Number 20210110429 (“Keng”) as applied to claim 12 above, and in further view of US Patent Application Publication Number 20100235241 (“Wang II”).
Claim 14
	As per claim 14, Wu does not explicitly teach but Wang II teaches: 
wherein the computing system is communicatively connected to a retailer website, and wherein the instructions cause the computing system to: identify the at least one specific promotion task from among a plurality of different promotion task candidates as an optimized promotion task to be presented to a particular customer based on the customer records ([0026] “A publisher 106 is any web site that hosts and provides electronic access to a resource (e.g., web page content).” And, [0068] “the advertisement selection module 122 can select advertisements for user sessions (e.g., current user sessions) of a user based on the user identifier profile data for that user. For example, if a user's user identifier profile data indicates that the user has a high user identifier interest weight in a vertical category (e.g., as compared to other users) then the advertisement selection module 122 can account for such interest when selecting, ranking, and/or ordering advertisements (e.g., select advertisements categorized in the vertical category corresponding to the high user identifier interest weight). The advertisement selection module 122 can select advertisements from the advertisement data store.”  And, [0069]-[0070]). 
Wu does not explicitly teach but Wang II teaches:
	automatically present to the particular customer a promotion in accordance with the at least one specific promotion task via the retailer website ([0070] “The highest ranked advertisements can be provided for presentation for a user session associated with the user identifier profile data used to select the advertisements.” And, [0026] “A publisher 106 is any web site that hosts and provides electronic access to a resource (e.g., web page content).”). 
Therefore, it would have been obvious to modify the combination of Wu and Keng to include identify the at least one specific promotion task from among a plurality of different promotion task candidates as an optimized promotion task to be presented to a particular customer based on the customer records and automatically present to the particular customer a promotion in accordance with the at least one specific promotion task via the retailer website as taught by Wang II in order to “target advertisements to users sessions associated with the user identifier” ([0023]) resulting in increased advertisement effectiveness.

Claim 15
	As per claim 15, Wu does not explicitly teach but Wang II teaches: 
	wherein each of the plurality of different promotion task candidates is assessed relative to at least the particular customer to identify the at least one specific promotion task ([0068] “the advertisement selection module 122 can select advertisements for user sessions (e.g., current user sessions) of a user based on the user identifier profile data for that user. For example, if a user's user identifier profile data indicates that the user has a high user identifier interest weight in a vertical category (e.g., as compared to other users) then the advertisement selection module 122 can account for such interest when selecting, ranking, and/or ordering advertisements (e.g., select advertisements categorized in the vertical category corresponding to the high user identifier interest weight). The advertisement selection module 122 can select advertisements from the advertisement data store.”  And, [0069]-[0070]). 
Therefore, it would have been obvious to modify the combination of Wu and Keng to include wherein each of the plurality of different promotion task candidates is assessed relative to at least the particular customer to identify the at least one specific promotion task as taught by Wang II in order to “target advertisements to users sessions associated with the user identifier” ([0023]) resulting in increased advertisement effectiveness.  

Claim 16
	As per claim 16, Wu does not explicitly teach but Wang II teaches: 
wherein the computing system is communicatively connected to a retailer website, and wherein the instructions cause the computing system to: identify the particular customer to present a promotion corresponding to the at least one specific promotion task as an optimized promotion task, the particular customer being identified from among a plurality of customers ([0026] “A publisher 106 is any web site that hosts and provides electronic access to a resource (e.g., web page content).” And, [0068] “the advertisement selection module 122 can select advertisements for user sessions (e.g., current user sessions) of a user based on the user identifier profile data for that user. For example, if a user's user identifier profile data indicates that the user has a high user identifier interest weight in a vertical category (e.g., as compared to other users) then the advertisement selection module 122 can account for such interest when selecting, ranking, and/or ordering advertisements (e.g., select advertisements categorized in the vertical category corresponding to the high user identifier interest weight). The advertisement selection module 122 can select advertisements from the advertisement data store.” And, [0069]-[0070]).
Wu does not explicitly teach but Wang II teaches:
	automatically present to the particular customer a promotion in accordance with the at least one specific promotion task via the retailer website ([0070] “The highest ranked advertisements can be provided for presentation for a user session associated with the user identifier profile data used to select the advertisements.” And, [0026] “A publisher 106 is any web site that hosts and provides electronic access to a resource (e.g., web page content).”).
Therefore, it would have been obvious to modify the combination of Wu and Keng to include identify the at least one specific promotion task from among a plurality of different promotion task candidates as an optimized promotion task to be presented to a particular customer based on the customer records and automatically present to the particular customer a promotion in accordance with the at least one specific promotion task via the retailer website as taught by Wang II in order to “target advertisements to users sessions associated with the user identifier” ([0023]) resulting in increased advertisement effectiveness.  


Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication Number 20190385080 (“Wu”) in view of US Patent Application Publication Number 20210110429 (“Keng”) as applied to claim 12 above, and in further view of US Patent Publication Number 10438216 (“Parekh”).
Claim 17
	As per claim 17, Wu does not explicitly teach but Parekh teaches: 
wherein identifying the particular customer comprises optimizing total revenue by selecting one or more advertising campaigns for each customer to balance total revenue ([col. 3, lines 10-13] “selection of the subset of the plurality of consumers maximizes expected revenue generated by the promotion.” And, [col. 30, lines 5-20] “selection of the subset of the plurality of consumers maximizes expected revenue generated by the promotion. To this end, while the ranking of the consumers affects the expected revenue generated by the promotion, the size of the subset may also affect this expected revenue generation, and depending on preexisting interest in the good, service, or experience offered by the promotion, increasing or shrinking the size of the subset may either increase or decrease the expected revenue produced by offering the promotion. As a result, maximizing expected revenue may require utilizing a size that is not a predetermined percentage of highest ranked consumers, but instead is a variable size that is calculated to maximize the expected revenue from offering the promotion.”).
Therefore, it would have been obvious to modify the combination of Wu and Keng to include wherein identifying the particular customer comprises optimizing total revenue by selecting one or more advertising campaigns for each customer to balance total revenue as taught by Parekh in order to “maximize[] expected revenue generated by the promotion” (Parekh [col. 3, lines 10-13]). 

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over US Patent Application Publication Number 20190385080 (“Wu”) in view of US Patent Application Publication Number 20210110429 (“Keng”) and in further view of US Patent Application Publication Number 20170316450 (“Kobylkin”).
Claim 18
	As per claim 18, Wu teaches a promotion generation system comprising:
	a computing system including a processor ([0022] “processor”) operatively coupled to a memory subsystem, the memory subsystem storing customer records and instructions ([0022] “memory in communication with the processor, the memory storing instructions that, when executed by the processor cause the processor to: receive purchase transaction data associated with one or more first users over time; create structured data based on the purchase transaction data.” And, [0057] “data structuring module for structuring data from raw data sources.”) which, when executed, cause the computing system to:
train a base model with the customer records, the customer records describing purchases within an item collection having a plurality of categories and being associated with items across the plurality of categories, the base model comprising a neural network ([0011] “creating structured data based on the transaction data; identifying transactions in the structured data that are associated with a purchase category; labelling the transactions associated with the purchase category; and training the model using the structured data and the labelled transactions.” And, [0022] “label the transactions associated with the purchase category; use a recurrent neural network to build a model, using the structured data and the labels as training data.” And, [0068] “purchases or transactions in data structure may be identified and labeled, for example, to a purchase category. Labelling transactions, for example, when a client has or has not purchased a car based on transaction history.” Examiner interprets the labeled transactions associated with purchase categories as customer records describing purchases within an item collection having a plurality of categories.);
generate relevancy predictions for one or more of the plurality of categories, the relevancy predictions being specific to at least one particular user ([0012] “predicting a likelihood of the user purchasing a product associated with the purchase category, using the model.” Examiner interprets the predicted likelihood as a relevancy prediction being specific to the particular user.)
Wu teaches generating relevancy predictions (Wu [0012]) but does not explicitly teach the following feature taught by Keng: 
tuning the base model for at least one specific promotion task via a transfer learning process to generate relevancy predictions ([0152] “The system 100 uses a combination of constrained optimization, prediction, and reinforcement learning to directly optimize the generation of promotional materials . . . system 100 leverages the historical data to make predictions about what products and promotions should be placed on the promotional materials to optimize the results of the promotional materials.” And, [0169] “Prior promotional materials data can be imported into the system to enable the machine learning module to benefit from prior experience. This instantiation step is a mix of constrained optimization and conditional rules.” And, [0170] “the steady state of the machine learning module shifts over to a reinforcement learning hybrid approach. As further data is collected by the system 100, the building blocks are re-trained and re-scored, and, as a result, new predictions are made for to be provided to the machine learning module 120. This reinforcement learning and feedback approach can be invoked repeatedly to further hone the scores. As this process continues, various iterations occur whereby promotional materials are distributed, outcomes are received and the building blocks are re-trained and re-scored.” And, [0166] “generate predictive or explanatory scores of the outcomes of the promotional materials.”).
Wu teaches relevancy predictions (Wu [0012]) but does not explicitly teach the following feature taught by Keng:
combine the relevancy predictions and advertising revenue to estimate overall revenue associated with the at least one particular user and the at least one specific promotion task ([0015] “model to predict an average effective discounted price of the promotion based on a category of products.” And, [0152] “system 100 uses a combination of constrained optimization, prediction, and reinforcement learning to directly optimize the generation of promotional materials. For example, optimization can result in any output measure, such as revenue . . . system leverages the historical data to make predictions about what products and promotions should be placed on the promotional materials to optimize the results of the promotional material.” And, [0166] “[t]he scores can be based on the historical data . . . the predicted selling quantities of one or more products due to the distribution of the the predicted profit or revenue generated by the promotional materials, the return on investment of the promotional materials.”). 
Therefore, it would have been obvious to modify Wu to include tuning the base model for at least one specific promotion task via a transfer learning process to generate relevancy predictions and combine the relevancy predictions and advertising revenue to estimate overall revenue associated with the at least one particular user and the at least one specific promotion task as further taught by Keng in order to “use[] a combination of constrained optimization, prediction, and reinforcement learning to directly optimize the generation of promotional materials” which “can result in . . . revenue uplift” (Keng [0152]).
Wu does not explicitly teach but Kobylkin teaches:
transmit an instruction to an online retail platform to automatically present a promotion associated with the at least one specific promotion task to the at least one particular user ([0089] “In implementations where certain selected users and/or prospective users correspond to merchant and/or prospective merchants, the advertisement modeling application 510 may communicate with the merchant advertisement application 170 to transmit advertisements to the corresponding current merchant server(s) 110 and/or prospective merchant servers 130.” And, [0097] “For example, one or more advertisements may be transmitted (e.g., by the merchant advertisement application 170 and/or the advertisement modeling application 510) to the user computer(s) 502 and the prospective user computer(s) 504.” And, [0089] “The advertisement response model may facilitate a selection of which users of the user computers 502 and which prospective users of the prospective user computers 504 to target.”). 
Therefore, it would have been obvious to modify the combination of Wu and Keng to include transmit an instruction to an online retail platform to automatically present a promotion associated with the at least one specific promotion task to the at least one particular user as taught by Kobylkin in order to “target the determined units that have a positive predicted response” (Kobylkin [0032]) resulting in increased advertisement effectiveness and “make certain campaigns profitable that would otherwise have been unprofitable otherwise” (Kobylkin [0034]).
Claim 19
	As per claim 19, Wu does not explicitly teach but Kobylkin teaches: 
	further comprising the online retail platform, wherein the online retail platform comprises at least one of a retail website server and a mobile application server ([0089] “In implementations where certain selected users and/or prospective users correspond to merchant and/or prospective merchants, the advertisement modeling application 510 may communicate with the merchant advertisement application 170 to transmit advertisements to the corresponding current merchant server(s) 110 and/or prospective merchant servers 130.”).
Therefore, it would have been obvious to modify the combination of Wu, Keng, and Kobylkin to further comprise the online retail platform, wherein the online retail platform comprises at least one of a retail website server and a mobile application server as further taught by Kobylkin in order to “target the determined units that have a positive predicted response” (Kobylkin [0032]) resulting in increased advertisement effectiveness and “make certain campaigns profitable that would otherwise have been unprofitable otherwise” (Kobylkin [0034]).

Claim 20
	As per claim 20, Wu further teaches: 
wherein the customer records are associated with a plurality of customers including the at least one particular user ([0022] “purchase transaction data associated with one or more first users over time; create structured data based on the purchase transaction data.” And, [0061] “raw data may include, for each client of a number of clients, or user of a number of users, client transaction history.”).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US Patent Application Publication 20150371279 (“Gerard”) teaches computing a model score that may be based on a consumer spending model, such as a consumer's propensity to spend in a related category (e.g., merchant, industry, product, etc.).
US Patent Application Publication 20180084078 (“Yan”) teaches generating a score indicative of a likelihood that the target user will be interested in information describing the item.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALLAN J WOODWORTH, II whose telephone number is (571) 272-6904.  The examiner can normally be reached on Mon-Fri 9:00-5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ilana Spar can be reached at 571-270-7537. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/ALLAN J WOODWORTH, II/Examiner, Art Unit 3622
/ILANA L SPAR/Supervisory Patent Examiner, Art Unit 3622