DETAILED ACTION
Status of Claims
This is a first office action on the merits in response to the application filed on 28 January 2022. 
Claims 1-20 are currently pending and have been examined. 

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 31 July 2022 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Priority
This application claims priority of US Provisional Application No. 62/693295 filed on 2 July 2018 and US Provisional Application No. 62/828084 filed on 2 April 2019.
This application claims continuation status priority of US Application No. 16/448419 filed on 21 June 2019 and US Application No. 16/824446 filed on 19 March 2020. 
Applicant’s claim for the benefit of these prior filed applications is acknowledged. 

Claim Objections
Claims 14, 16, 18, and 19 are objected to because of the following informalities:  
Claim 14 recites “further comprising the reinforcement learning model configured to”, which appears to contain an error of omission and should recite “further comprising the reinforcement learning model being configured to”. Claims 16, 18, and 19 are similarly objected to. 
Appropriate correction is required.

Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. Claims not listed below are rejected for dependency.

Claim 1 recites “a reinforcement learning model configured at a given point in time: to receive digital data about a state of a user at the given point in time; to receive digital data about an environment at the given point in time; to receive digital data about a campaign at the given point in time; to optimize total expected future number of positive rewards at the given point in time; and to execute an action at the given point in time.” As written, the limitation appears to describe receiving information, optimizing rewards, and executing an action simultaneously. However, one of ordinary skill in the art would not understand these processes as being able to be performed simultaneously, and the specification does not appear to provide details regarding performing them simultaneously. Because of this confusion, one of ordinary skill in the art would not know the boundaries of the claim, rendering the claim indefinite.
For the purposes of examination, the processes will be interpreted as occurring within a short timespan but not simultaneously. 
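The interpretation above can be pictured as a single decision step in which the three inputs are received, the optimization is carried out, and the action is executed one after another. The following is a minimal sketch for illustration only. The names Observation, decide, step, and expected_rewards are hypothetical and do not come from the claims or the specification, and the optimization is reduced here to picking the action with the largest estimated value.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical container for the three inputs named in claim 1."""
    user_state: dict
    environment: dict
    campaign: dict


def decide(obs, expected_rewards):
    """Return the action whose estimated future reward count is largest.

    expected_rewards is an assumed mapping from action name to an estimate
    that a real model would compute from obs; here it is simply passed in.
    """
    return max(expected_rewards, key=expected_rewards.get)


def step(user_state, environment, campaign, expected_rewards):
    """Run one decision step as an ordered sequence, not simultaneously."""
    trace = []

    # 1. Receive the data about the user, the environment, and the campaign.
    obs = Observation(user_state, environment, campaign)
    trace.append("receive")

    # 2. Optimize: choose the action with the highest estimated value.
    action = decide(obs, expected_rewards)
    trace.append("optimize")

    # 3. Execute: a real system would send a message or place a bid here.
    trace.append("execute")

    return action, trace
```

Under this sketch the three operations share one short decision step but are still strictly ordered, which matches the interpretation adopted for examination.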

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  
The claims do not fall within at least one of the four categories of patent eligible subject matter because the claims are directed to a “reinforcement learning model.” A “reinforcement learning model” is not a series of acts or steps, nor is it a concrete thing, nor is it a tangible article, nor is it a combination of substances. Similar to the claimed “reinforcement learning model”, MPEP 2106 notes that a “business model” does not fall within any statutory category. Thus the claimed invention does not fall within any of the four categories of patent eligible subject matter. 

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Per MPEP 2106.03: “when a claim fails under Step 1 (Step 1: NO), but it appears from applicant’s disclosure that the claim could be amended to fall within a statutory category (Step 1: YES), the analysis should proceed to determine whether such an amended claim would qualify as eligible at Pathway A, B or C.” Applicant’s disclosure describes a computer system which can be used to implement embodiments of the invention (e.g., [0074]). A system with a processor configured to implement the reinforcement learning model would likely resolve the rejection for being directed to non-statutory subject matter. Thus the claims will be considered, for the Mayo/Alice analysis, as if the preamble instead recited “a system comprising: a processor configured to implement a reinforcement learning model configured at a given point in time:”
Claim 1 recites “a reinforcement learning model configured at a given point in time: to receive data about a state of a user at the given point in time; to receive data about an environment at the given point in time; to receive data about a campaign at the given point in time; to optimize total expected future number of positive rewards at the given point in time; and to execute an action at the given point in time.” These limitations set forth a concept of collecting data, analyzing the data to optimize for a future total reward, and performing an action. Acting on information to maximize a reward is present in virtually all economic practice, and thus the claims set forth a fundamental economic practice. Further, as disclosed, this concept appears intended to be applied to the management of a marketing campaign, which would unambiguously be a marketing or advertising activity. Thus the claims are determined to recite an abstract idea. 
Under the 2019 PEG, the additional elements of the claims are considered for whether they integrate a recited abstract idea into a practical application. As noted above, for the purpose of this eligibility analysis, the claims will be treated as if they recited a processor configured to implement the steps of the claim. Thus the claims recite an additional element of a processor. However, this additional element is recited at an extreme level of generality, and may be interpreted as a generic computing device used to implement the abstract idea. Under the 2019 PEG, the use of a generic computing device to implement an abstract idea does not integrate that abstract idea into a practical application. The claims further recite the additional element of digital data. However, this additional element reflects no improvement to technology, no particular device, and does not effect a transformation of an article. Further, this additional element does not meaningfully limit the implementation of the claims. Instead, this additional element, individually and in combination with the prior computing device, only generally links the abstract idea to a technological environment of a computing device. As such, this additional element, individually and in combination with the prior additional element, does not integrate the abstract idea into a practical application. Thus the claims are determined to be directed to an abstract idea. 
At Step 2B of the Mayo/Alice analysis, the additional elements of the claims are considered for whether they amount to significantly more than the abstract idea. As previously noted, the claims recite an additional element which may be interpreted as a generic computing device. However, implementing an abstract idea on a generic computing device does not add significantly more, similar to how the recitation of the computer in the claim in Alice amounted to mere instructions to apply the abstract idea of intermediated settlement on a generic computer. As such, these elements do not provide an inventive concept and do not constitute significantly more. As previously noted, the claims recite the additional element of digital data. However, as noted before, this additional element, individually and in combination, only generally links the abstract idea to a technological environment involving a computing device. Per MPEP 2106, the courts have found generally linking an abstract idea to a technological environment to not be enough to qualify as significantly more. Thus this additional element, individually and in combination with the prior additional element, does not amount to significantly more. There are no further additional elements. Therefore, when considered individually and as an ordered combination, the additional elements of the independent claims do not amount to significantly more than the judicial exception. Thus the independent claims are not patent eligible.  
Dependent claims 2-16 and 18-20 further narrow the abstract idea, but the claims continue to recite an abstract idea. These claims recite no further additional elements. The previously identified additional elements, individually and as a combination, do not integrate the narrowed abstract idea into a practical application for reasons similar to those explained above. Therefore these claims continue to be directed to an abstract idea. The previously identified additional elements, individually and as a combination, do not amount to significantly more than the narrowed abstract idea for reasons similar to those explained above. Dependent claim 17 recites the additional element of a neural network. However, this additional element may be interpreted as implementing the abstract idea with a computing device. As such this additional element, individually and in combination with the prior additional elements, does not integrate the abstract idea into a practical application or amount to significantly more. Thus as the dependent claims remain directed to a judicial exception, and as the additional elements of the claims do not amount to significantly more, the dependent claims are not patent eligible.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 6, 14, and 17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Cai et al. (“Real-Time Bidding by Reinforcement Learning in Display Advertising”). 

Regarding Claim 1: Cai discloses a reinforcement learning model configured at a given point in time:
to receive digital data about a state of a user at the given point in time;
to receive digital data about an environment at the given point in time;
to receive digital data about a campaign at the given point in time;
to optimize total expected future number of positive rewards at the given point in time; and
to execute an action at the given point in time.


[Image: media_image1.png]
(See at least Page 3). 

[Image: media_image2.png]
(See at least Page 3). 

[Image: media_image3.png]
(See at least Page 3). 
Examiner’s note: The broadest reasonable interpretation of “environment” includes users. This is confirmed by the specification at [0040] which states “An agent is interacting with an environment (customers)”.

Claim 6: Cai discloses the above limitations. Additionally, Cai discloses wherein the environment is a date and time (“Each entry of x corresponds to a category in a field, such as the category London in the field City, and the category Friday in the field Weekday. The fields consist of the campaign’s ad information (e.g., ad creative ID and campaign ID) and the auctioned impression contextual information (e.g., user cookie ID, location, time, publisher domain and URL).” See at least Page 3). 
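The quoted passage describes a feature vector x in which each entry corresponds to one category within one field. For illustration only, the following is a minimal sketch of such an encoding. The function name one_hot_encode and the vocabulary argument are hypothetical, and the sketch is not Cai's implementation.

```python
def one_hot_encode(impression, vocabulary):
    """Encode an impression as a 0/1 vector, one entry per (field, category).

    impression: assumed dict mapping a field name to its observed category.
    vocabulary: assumed ordered list of (field, category) pairs that fixes
                the position of each entry in the vector.
    """
    return [
        1 if impression.get(field) == category else 0
        for field, category in vocabulary
    ]


# Hypothetical vocabulary built from the fields named in the quotation.
vocabulary = [
    ("City", "London"),
    ("City", "Paris"),
    ("Weekday", "Friday"),
    ("Weekday", "Monday"),
]

x = one_hot_encode({"City": "London", "Weekday": "Friday"}, vocabulary)
```

In this sketch the Weekday entries carry the date and time information relied on for the claimed environment.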

Claim 14: Cai discloses the above limitations. Additionally, Cai discloses the reinforcement learning model configured to receive digital data about a constraint. 


[Image: media_image3.png]
(See at least Page 3). 

Claim 17: Cai discloses the above limitations. Additionally, Cai discloses wherein the reinforcement learning model is a neural network (“Furthermore, to handle the scalability problem for the real-world auction volume and campaign budget, we propose to leverage a neural network model to approximate the value function.” See at least Page 2. Also: “RLB-NN is our proposed model for the large-scale problem, which uses the neural network NN(t, b) to approximate D(t, b).” See at least Page 7).


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 2 and 3 are rejected under 35 U.S.C. 103 as being unpatentable over Cai et al. (“Real-Time Bidding by Reinforcement Learning in Display Advertising”) in view of Abe et al. (US 2004/0015386 A1).

Claim 2: Cai discloses the above limitations. Cai does not explicitly disclose wherein the state of the user at the given point in time is a number of communications the user has received in a particular time period. However, Abe teaches a number of communications the user has received in a particular time period (The inventors used the training data portion of the original data set, which contains data for approximately 100 thousand selected individuals (This is contained in "cup981rn.zip" on the URL "http://kdd.ics.uci.edu/databases/kddcup98/kddcup98.html"). Out of the large number of demographic features contained in the data set, the inventors selected only age and income bracket. Based on the campaign information in the data, the inventors generated a number of temporal features that are designed to capture the state of that individual at the time of each campaign. These features include the frequency of gifts, the recency of gifts and promotions, the number of recent promotions in the last 6 months, etc., and are summarized in Table 1 which is provided in FIG. 7. See at least [0121] and Fig. 7, noting “numprom | number of promotions to date” in Table 1 of Fig. 7). 
	Cai provides a reinforcement learning model which receives data used by the model to determine an action, which differs from the claimed invention by the substitution of Cai’s data for a number of communications the user has received. However, Abe demonstrates that the prior art already knew of the concept of a number of communications the user has received. One of ordinary skill in the art could have trivially substituted Abe’s data in for the data of Cai. Further, one of ordinary skill in the art would have recognized that such a substitution would have predictably resulted in a system which would make decisions based on a number of communications the user has received. As such, the identified substitution and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Abe. 

Claim 3: Cai discloses the above limitations. Cai does not explicitly disclose wherein the state of the user at the given point in time is a time since a last communication. However, Abe teaches a time since a last communication (The inventors used the training data portion of the original data set, which contains data for approximately 100 thousand selected individuals (This is contained in "cup981rn.zip" on the URL "http://kdd.ics.uci.edu/databases/kddcup98/kddcup98.html"). Out of the large number of demographic features contained in the data set, the inventors selected only age and income bracket. Based on the campaign information in the data, the inventors generated a number of temporal features that are designed to capture the state of that individual at the time of each campaign. These features include the frequency of gifts, the recency of gifts and promotions, the number of recent promotions in the last 6 months, etc., and are summarized in Table 1 which is provided in FIG. 7. See at least [0121] and Fig. 7, noting “promrecency | months since last promotion” in Table 1 of Fig. 7). 
	Cai provides a reinforcement learning model which receives data used by the model to determine an action, which differs from the claimed invention by the substitution of Cai’s data for a time since a last communication. However, Abe demonstrates that the prior art already knew of the concept of a time since a last communication. One of ordinary skill in the art could have trivially substituted Abe’s data in for the data of Cai. Further, one of ordinary skill in the art would have recognized that such a substitution would have predictably resulted in a system which would make decisions based on a time since a last communication. As such, the identified substitution and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Abe. 
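The two Abe features relied on above, “numprom” (number of promotions to date) and “promrecency” (months since last promotion), can be derived from a per-user list of promotion dates. For illustration only, the following is a minimal sketch of that derivation. The function names and the whole-calendar-month calculation are assumptions and are not taken from Abe.

```python
from datetime import date


def months_between(earlier, later):
    """Whole calendar months from earlier to later (assumed convention)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def user_state_features(promotion_dates, as_of):
    """Compute the two temporal features for one user as of a given date.

    promotion_dates: assumed list of dates on which the user was sent a
                     promotion; dates after as_of are ignored.
    """
    past = sorted(d for d in promotion_dates if d <= as_of)
    numprom = len(past)
    # With no earlier promotion there is no recency to report.
    promrecency = months_between(past[-1], as_of) if past else None
    return {"numprom": numprom, "promrecency": promrecency}


features = user_state_features(
    [date(2021, 1, 15), date(2021, 6, 1), date(2022, 3, 1)],
    as_of=date(2021, 9, 30),
)
```

Here the promotion dated after the as_of date is excluded, so features holds a count of 2 and a recency of 3 months.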

Claims 4 and 8-11 are rejected under 35 U.S.C. 103 as being unpatentable over Cai et al. (“Real-Time Bidding by Reinforcement Learning in Display Advertising”) in view of Sterns et al. (US 2015/0100412 A1). 

Claim 4: Cai discloses the above limitations. Cai does not explicitly disclose wherein the state of the user at the given point in time is the user’s past behavior. However, Sterns teaches a user’s past behavior (In some embodiments, the respective user is a member of a group of users (e.g., an group of users who have registered as members of the group or an implicit group of users who meet predefined demographic criteria such as age, sex, income level, geographic location, etc.) and the campaign report for the respective user includes a comparison between the respective user's interactions with the first campaign (and the respective user's interactions with the second campaign) and interactions of the group of users with the first campaign. In some embodiments, after the message-interaction report has been obtained (e.g., generated by the server system or retrieved from a remote storage system) the server system provides (1014) the respective campaign report to the requestor. Comparing the interactions of a particular user with message campaigns as compared to the group of similar users can provide valuable information for better tailoring future message campaigns to the user as well as to other users in the group. While the examples described above refer to a campaign report that includes information about a first message campaign and a second message campaign for a respective user, it should be understood that the information in the campaign report optionally includes information about the respective user's interactions with any number of message campaigns. Optionally the campaign report includes statistics and activity detail for the respective user over the respective user's entire history with the company/brand whose products/services are being promoted by the message campaigns (or some subset of that history such as a time-limited subset of that history). See at least [0141]). 
	Cai provides a reinforcement learning model which receives data used by the model to determine an action, which differs from the claimed invention by the substitution of Cai’s data for a user’s past behavior. However, Sterns demonstrates that the prior art already knew of the concept of a user’s past behavior. One of ordinary skill in the art could have trivially substituted Sterns’ data in for the data of Cai. Further, one of ordinary skill in the art would have recognized that such a substitution would have predictably resulted in a system which would make decisions based on a user’s past behavior. As such, the identified substitution and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Sterns.

Claim 8: Cai discloses the above limitations. Cai does not explicitly disclose wherein the digital data about the campaign is the user’s past interaction with the campaign. However, Sterns teaches a user’s past interaction with the campaign (In some embodiments, the respective user is a member of a group of users (e.g., an group of users who have registered as members of the group or an implicit group of users who meet predefined demographic criteria such as age, sex, income level, geographic location, etc.) and the campaign report for the respective user includes a comparison between the respective user's interactions with the first campaign (and the respective user's interactions with the second campaign) and interactions of the group of users with the first campaign. In some embodiments, after the message-interaction report has been obtained (e.g., generated by the server system or retrieved from a remote storage system) the server system provides (1014) the respective campaign report to the requestor. Comparing the interactions of a particular user with message campaigns as compared to the group of similar users can provide valuable information for better tailoring future message campaigns to the user as well as to other users in the group. While the examples described above refer to a campaign report that includes information about a first message campaign and a second message campaign for a respective user, it should be understood that the information in the campaign report optionally includes information about the respective user's interactions with any number of message campaigns. Optionally the campaign report includes statistics and activity detail for the respective user over the respective user's entire history with the company/brand whose products/services are being promoted by the message campaigns (or some subset of that history such as a time-limited subset of that history). See at least [0141]). 
	Cai provides a reinforcement learning model which receives data used by the model to determine an action, which differs from the claimed invention by the substitution of Cai’s data for a user’s past interaction with the campaign. However, Sterns demonstrates that the prior art already knew of the concept of a user’s past interaction with the campaign. One of ordinary skill in the art could have trivially substituted Sterns’ data in for the data of Cai. Further, one of ordinary skill in the art would have recognized that such a substitution would have predictably resulted in a system which would make decisions based on a user’s past interaction with the campaign. As such, the identified substitution and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Sterns.

Claim 9: Cai discloses the above limitations. Cai does not explicitly disclose wherein the digital data about the campaign is the user’s past interaction with other campaigns. However, Sterns teaches a user’s past interaction with other campaigns (In some embodiments, the respective user is a member of a group of users (e.g., an group of users who have registered as members of the group or an implicit group of users who meet predefined demographic criteria such as age, sex, income level, geographic location, etc.) and the campaign report for the respective user includes a comparison between the respective user's interactions with the first campaign (and the respective user's interactions with the second campaign) and interactions of the group of users with the first campaign. In some embodiments, after the message-interaction report has been obtained (e.g., generated by the server system or retrieved from a remote storage system) the server system provides (1014) the respective campaign report to the requestor. Comparing the interactions of a particular user with message campaigns as compared to the group of similar users can provide valuable information for better tailoring future message campaigns to the user as well as to other users in the group. While the examples described above refer to a campaign report that includes information about a first message campaign and a second message campaign for a respective user, it should be understood that the information in the campaign report optionally includes information about the respective user's interactions with any number of message campaigns. Optionally the campaign report includes statistics and activity detail for the respective user over the respective user's entire history with the company/brand whose products/services are being promoted by the message campaigns (or some subset of that history such as a time-limited subset of that history). See at least [0141]). 
	Cai provides a reinforcement learning model which receives data used by the model to determine an action, which differs from the claimed invention by the substitution of Cai’s data for a user’s past interaction with other campaigns. However, Sterns demonstrates that the prior art already knew of the concept of a user’s past interaction with other campaigns. One of ordinary skill in the art could have trivially substituted Sterns’ data in for the data of Cai. Further, one of ordinary skill in the art would have recognized that such a substitution would have predictably resulted in a system which would make decisions based on a user’s past interaction with other campaigns. As such, the identified substitution and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Sterns.

Claim 10: Cai discloses the above limitations. Cai does not explicitly disclose wherein the digital data about the campaign is a plurality of users’ past interactions with the campaign. However, Sterns teaches a plurality of users’ past interactions with the campaign (In some embodiments, the respective user is a member of a group of users (e.g., an group of users who have registered as members of the group or an implicit group of users who meet predefined demographic criteria such as age, sex, income level, geographic location, etc.) and the campaign report for the respective user includes a comparison between the respective user's interactions with the first campaign (and the respective user's interactions with the second campaign) and interactions of the group of users with the first campaign. In some embodiments, after the message-interaction report has been obtained (e.g., generated by the server system or retrieved from a remote storage system) the server system provides (1014) the respective campaign report to the requestor. Comparing the interactions of a particular user with message campaigns as compared to the group of similar users can provide valuable information for better tailoring future message campaigns to the user as well as to other users in the group. While the examples described above refer to a campaign report that includes information about a first message campaign and a second message campaign for a respective user, it should be understood that the information in the campaign report optionally includes information about the respective user's interactions with any number of message campaigns. Optionally the campaign report includes statistics and activity detail for the respective user over the respective user's entire history with the company/brand whose products/services are being promoted by the message campaigns (or some subset of that history such as a time-limited subset of that history). See at least [0141]). 
	Cai provides a reinforcement learning model which receives data used by the model to determine an action, which differs from the claimed invention by the substitution of Cai’s data for a plurality of users’ past interactions with the campaign. However, Sterns demonstrates that the prior art already knew of the concept of a plurality of users’ past interactions with the campaign. One of ordinary skill in the art could have trivially substituted Sterns’ data in for the data of Cai. Further, one of ordinary skill in the art would have recognized that such a substitution would have predictably resulted in a system which would make decisions based on a plurality of users’ past interactions with the campaign. As such, the identified substitution and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Sterns.

Claim 11: Cai discloses the above limitations. Cai does not explicitly disclose wherein the digital data about the campaign is a plurality of users’ past interactions with other campaigns. However, Sterns teaches a plurality of users’ past interactions with other campaigns (In some embodiments, the respective user is a member of a group of users (e.g., an group of users who have registered as members of the group or an implicit group of users who meet predefined demographic criteria such as age, sex, income level, geographic location, etc.) and the campaign report for the respective user includes a comparison between the respective user's interactions with the first campaign (and the respective user's interactions with the second campaign) and interactions of the group of users with the first campaign. In some embodiments, after the message-interaction report has been obtained (e.g., generated by the server system or retrieved from a remote storage system) the server system provides (1014) the respective campaign report to the requestor. Comparing the interactions of a particular user with message campaigns as compared to the group of similar users can provide valuable information for better tailoring future message campaigns to the user as well as to other users in the group. While the examples described above refer to a campaign report that includes information about a first message campaign and a second message campaign for a respective user, it should be understood that the information in the campaign report optionally includes information about the respective user's interactions with any number of message campaigns. Optionally the campaign report includes statistics and activity detail for the respective user over the respective user's entire history with the company/brand whose products/services are being promoted by the message campaigns (or some subset of that history such as a time-limited subset of that history). See at least [0141]). 
	Cai provides a reinforcement learning model which receives data used by the model to determine an action, which differs from the claimed invention by the substitution of Cai’s data for a plurality of users’ past interactions with other campaigns. However, Sterns demonstrates that the prior art already knew of the concept of a plurality of users’ past interactions with other campaigns. One of ordinary skill in the art could have trivially substituted Sterns’ data in for the data of Cai. Further, one of ordinary skill in the art would have recognized that such a substitution would have predictably resulted in a system which would make decisions based on a plurality of users’ past interactions with other campaigns. As such, the identified substitution and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Sterns. 

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Cai et al. (“Real-Time Bidding by Reinforcement Learning in Display Advertising”) in view of Sorg (US 2016/0373396 A1).

Claim 5: Cai discloses the above limitations. Cai does not explicitly disclose wherein the state of the user at the given point in time is the user’s engagement score from a predictive model to engage with a communication. However, Sorg teaches a user’s engagement score from a predictive model to engage with a communication (The scoring module 240 generates engagement scores for each of the selected new content items and for the content items in the unread section of the previous content feed. Additionally, in some embodiments, the scoring module 240 generates engagement scores for one or more read content items in the read section of the previous content feed. An engagement score measures a predicted level of interaction the user would have with a content item. See at least [0041]). 
	Cai provides a reinforcement learning model which receives data used by the model to determine an action, which differs from the claimed invention by the substitution of Cai’s data for a user’s engagement score. However, Sorg demonstrates that the prior art already knew of the concept of a user’s engagement score. One of ordinary skill in the art could have trivially substituted Sorg’s data in for the data of Cai. Further, one of ordinary skill in the art would have recognized that such a substitution would have predictably resulted in a system which would make decisions based on a user’s engagement score. As such, the identified substitution and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Sorg. 

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Cai et al. (“Real-Time Bidding by Reinforcement Learning in Display Advertising”) in view of Law et al. (US 2016/0189238 A1). 

Claim 7: Cai discloses the above limitations. Cai does not explicitly disclose wherein the digital data about the campaign is a campaign type. However, Law teaches a campaign type (the processor are configured to provide a suggestion for one of the campaign goal and the campaign strategy based on the campaign type. See at least [0014]). 
	Cai provides a reinforcement learning model which receives data used by the model to determine an action, which differs from the claimed invention by the substitution of Cai’s data for a campaign type. However, Law demonstrates that the prior art already knew of the concept of a campaign type. One of ordinary skill in the art could have trivially substituted Law’s data in for the data of Cai. Further, one of ordinary skill in the art would have recognized that such a substitution would have predictably resulted in a system which would make decisions based on a campaign type. As such, the identified substitution and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Law. 

Claims 12 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Cai et al. (“Real-Time Bidding by Reinforcement Learning in Display Advertising”) in view of Moreau et al. (US 2017/0017971 A1).

Claim 12: Cai discloses the above limitations. Cai does not explicitly disclose wherein the action is transmitting a communication. However, Moreau teaches an action of transmitting a communication (FIG. 8 shows a process flow 800 for deciding whether to transmit a marketing communication according to certain embodiments. See at least [0177]). 
Cai provides a reinforcement learning model which receives data used by the model to determine an action, which differs from the claimed invention by the substitution of Cai’s action for the transmission of a communication. However, Moreau demonstrates that the prior art already knew of the concept of transmitting a communication. One of ordinary skill in the art could have trivially substituted Moreau’s action in for the action of Cai. Further, one of ordinary skill in the art would have recognized that such a substitution would have predictably resulted in a system which would make decisions to send a communication. As such, the identified substitution and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Moreau. 

Claim 13: Cai discloses the above limitations. Cai does not explicitly disclose wherein the action is refraining from transmitting a communication. However, Moreau teaches an action of refraining from transmitting a communication (FIG. 8 shows a process flow 800 for deciding whether to transmit a marketing communication according to certain embodiments. See at least [0177]). 
Cai provides a reinforcement learning model which receives data used by the model to determine an action, which differs from the claimed invention by the substitution of Cai’s action for not transmitting a communication. However, Moreau demonstrates that the prior art already knew of the concept of not transmitting a communication. One of ordinary skill in the art could have trivially substituted Moreau’s action in for the action of Cai. Further, one of ordinary skill in the art would have recognized that such a substitution would have predictably resulted in a system which would make decisions to not transmit a communication. As such, the identified substitution and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Moreau. 

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Cai et al. (“Real-Time Bidding by Reinforcement Learning in Display Advertising”) in view of Bohrmann et al. (US 2019/0122259 A1).

Claim 15: Cai discloses the above limitations. Cai does not explicitly disclose wherein the constraint is a maximum number of communications to send in a particular time period. However, Bohrmann teaches a maximum number of communications to send in a particular time period (To prevent multiple presentations of the same content items to the same user of an online system within a short period of time, online systems may place frequency caps on content items. A frequency cap placed on a content item limits the frequency with which the content item may be presented to each online system user. For example, a frequency cap placed on a content item may limit the number of times the content item may be presented to each online system user within the same day or within the same week. See at least [0007]).
Cai provides a reinforcement learning model which receives data used by the model to determine an action, which differs from the claimed invention by the substitution of Cai’s constraint for a frequency cap constraint. However, Bohrmann demonstrates that the prior art already knew of the concept of a frequency cap constraint. One of ordinary skill in the art could have trivially substituted Bohrmann’s constraint in for the constraint of Cai. Further, one of ordinary skill in the art would have recognized that such a substitution would have predictably resulted in a system which would make decisions subject to a frequency cap constraint. As such, the identified substitution and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Bohrmann. 

Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over Cai et al. (“Real-Time Bidding by Reinforcement Learning in Display Advertising”) in view of Montero (US 2017/0316448 A1).

Claim 16: Cai discloses the above limitations. Cai does not explicitly disclose the reinforcement learning model being configured to aggregate data from multiple clients. However, Montero teaches aggregating data from multiple clients (Using the normalized score, a model may be generated via machine learning. This model may either be as a decision tree or a perceptron neural network model based upon the scale of the data involved. Decision tree and neural network algorithm are known, and may be employed in the generation of the model. Generally, for the data originating from a single client (brand or retailer) the relatively small scale of data is conducive to utilization of a decision tree model, whereas when the information from multiple clients are being aggregated a neural network model is more effective. See at least [0157]).
	Cai provides a reinforcement learning model which receives data used by the model to determine an action, upon which the claimed invention’s aggregation of data from multiple clients can be seen as an improvement. However, Montero demonstrates that the prior art already knew of basing models on the data of multiple clients. One of ordinary skill in the art could have easily applied the techniques of Montero to the model of Cai. Further, one of ordinary skill in the art would have recognized that such an application of Montero would have resulted in an improved model which would be based on a wider set of data. As such, the application of Montero and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Montero. 

Claims 18 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Cai et al. (“Real-Time Bidding by Reinforcement Learning in Display Advertising”) in view of Edelen et al. (US 2017/0004408 A1). 

Claim 18: Cai discloses the above limitations. Cai does not explicitly disclose the reinforcement learning model being configured at the given point in time to perform a comparison of an output of the reinforcement learning model to an actual output generated from application of the output. However, Edelen teaches performing a comparison of an output of the model to an actual output generated from application of the output (Training is performed on the generic predictive model to generate a new and personalized predictive model based on the user's actual responses to the analyzed inputs. The personalized predictive model is then utilized for predicting user response to future inputs of the same type. At a prescribed frequency, the generated personalized predictive model is updated by analyzing actual user responses to predictions provided by the personalized predictive model.  See at least [0003]. Also: The trainer module 145 is a software module, system or device operative to analyze actual user decisions/responses based on/to actual inputs in comparison to predicted user decisions/responses and is further operative to update user decision thresholds for updating the operation of a personalized predictive model generated for the user.  See at least [0025]. Also:  the in-use decision threshold is updated for improving prediction accuracy. As described in detail below, precision/recall curves are developed for each user and are updated from time-to-time, for example, daily based on accumulated prediction data versus actual decision data, and updated decision thresholds are then determined from the updated precision/recall curves. The personalized predictive model for a given user is then updated with a corresponding updated decision threshold so that the personalized predictive model is continually being updated to reflect variations in user behavior. See at least [0019]). 
	Cai provides a reinforcement learning model which receives data used by the model to determine an action, upon which the claimed invention’s updating based on a comparison to actual data can be seen as an improvement. However, Edelen demonstrates that the prior art already knew of updating based on a comparison to actual data. One of ordinary skill in the art could have easily applied the techniques of Edelen to the model of Cai. Further, one of ordinary skill in the art would have recognized that such an application of Edelen would have resulted in a system which could update and improve model performance as additional data is acquired. As such, the application of Edelen and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Edelen. 

Claim 19: Cai in view of Edelen makes obvious the above limitations. As previously noted in combination with Cai, Edelen teaches the model being configured at the given point in time to update the model (Training is performed on the generic predictive model to generate a new and personalized predictive model based on the user's actual responses to the analyzed inputs. The personalized predictive model is then utilized for predicting user response to future inputs of the same type. At a prescribed frequency, the generated personalized predictive model is updated by analyzing actual user responses to predictions provided by the personalized predictive model.  See at least [0003]. Also: The trainer module 145 is a software module, system or device operative to analyze actual user decisions/responses based on/to actual inputs in comparison to predicted user decisions/responses and is further operative to update user decision thresholds for updating the operation of a personalized predictive model generated for the user.  See at least [0025]. Also:  the in-use decision threshold is updated for improving prediction accuracy. As described in detail below, precision/recall curves are developed for each user and are updated from time-to-time, for example, daily based on accumulated prediction data versus actual decision data, and updated decision thresholds are then determined from the updated precision/recall curves. The personalized predictive model for a given user is then updated with a corresponding updated decision threshold so that the personalized predictive model is continually being updated to reflect variations in user behavior. See at least [0019]). The motivation to combine Cai and Edelen is the same as explained under claim 18 above, and is incorporated herein.

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Cai et al. (“Real-Time Bidding by Reinforcement Learning in Display Advertising”) in view of Zhang (US 2011/0264511 A1). 

Claim 20: Cai discloses the above limitations. Cai does not explicitly disclose wherein the action is prioritizing between communications. However, Zhang teaches prioritizing between communications (In some embodiments, serving policies may be adjusted, for example, to prioritize serving in particular categories. As a simple example, a particular user may be qualified to receive advertisements from a number of categories, but available applicable serving opportunities to the user may mean that only advertisements from some of the categories may be served. In such circumstances, if monitored online information leads to a determination that predicted CTR for a particular category is currently above that which was projected using the offline-trained model, then that category can be prioritized in terms of advertisement serving, such as by adjusting a delivery policy to emphasize or prioritize advertisement serving in that category. See at least [0046]). 
Cai provides a reinforcement learning model which receives data used by the model to determine an action, which differs from the claimed invention by the substitution of Cai’s action for prioritizing between communications. However, Zhang demonstrates that the prior art already knew of the concept of prioritizing between communications. One of ordinary skill in the art could have trivially substituted Zhang’s action in for the action of Cai. Further, one of ordinary skill in the art would have recognized that such a substitution would have predictably resulted in a system which would make decisions to prioritize between communications. As such, the identified substitution and the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosures of Cai and the teachings of Zhang. 

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b). 
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claim 1 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 11270340. Although the claims at issue are not identical, they are not patentably distinct from each other because claim 1 of the reference patent claims: a reinforcement learning model configured at a given point in time: to receive digital data about a state of a user at the given point in time; to receive digital data about an environment at the given point in time; to receive digital data about a campaign at the given point in time; to optimize total expected future number of positive rewards at the given point in time; and to execute an action at the given point in time (“a reinforcement learning model to learn and optimize a personalized frequency for sending the electronic communications to the particular individual customer or the potential individual customer, the reinforcement learning model iteratively basing its actions, defined as a minimum time between emails, on (i) type and other email metadata, (ii) a state for the particular individual customer or the potential individual customer, including a number of emails clicked for a predetermined period of time, opened for a predetermined period of time, a number of purchases for a predetermined period of time, and a subscription state and (iii) a past outcome including click, no click, purchase or no purchase for the particular individual customer or the potential individual customer, with a model-based policy iteration receiving (i)-(iii) as inputs to determine an optimal policy comprising a set of rules for which action to take when the particular individual customer or the potential individual customer is in a given state and wherein the action is executed by a multi-layered gate for the particular individual customer or the potential individual customer, the personalized frequency designated for the particular individual customer or the potential individual customer; wherein the optimal policy includes optimizing the personalized frequency to maximize a number of rewards received from the particular individual customer or the potential individual customer in a manner that does not result in disengagement of the particular individual customer or the potential individual customer; based on the content and the audience, creating at least one of the electronic communications to send to the particular individual customer or the potential individual customer; causing the at least one of the electronic communications to be sent at the personalized frequency to the particular individual customer or the potential individual customer.” Claim 1). 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Bion A Shelden whose telephone number is (571)270-0515. The examiner can normally be reached M-F, 12pm-10pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hajime S Rojas, can be reached at (571)270-5491. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Bion A Shelden/
Examiner, Art Unit 3681
2022-12-17