DETAILED ACTION
Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
2.	According to the paper filed July 23, 2021, claims 1-14 are pending for examination with a priority date of November 29, 2016 under 35 USC §317.
	According to the present Amendment, claims 1 and 3-12 are amended, claim 2 is canceled, and claims 13-14 are newly added.
	In view of the present Amendment, the rejection of claim 11 under 35 USC §112(d) is withdrawn.

Claim Rejections - 35 USC § 103
3.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. §102 and §103 (or as subject to pre-AIA 35 U.S.C. §102 and §103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
4.	The following is a quotation of 35 U.S.C. §103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


5.	This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.
6.	Claims 1, 5-6, and 9-14 are rejected under 35 U.S.C. §103 as being unpatentable over Kluckner et al. (US 2018/0032841), hereinafter Kluckner, and further in view of Bowers et al. (US 2017/0161779), hereinafter Bowers.
Claim 1
“a reward estimating part configured to execute estimation of a reward for an action on a basis of a first user input for the action; a presentation control part configured to execute control for presentation of the estimated reward” Bowers [0029] discloses “the user action value module 220 can be a function of user-specific information known by the advertising platform 140…. The advertising platform 140 interfaces with a social networking system…. The user action value module 220 calculates the estimated user action value based on the received profile information…. the estimated user action value can be based on the time of day the advertisement is displayed…. the viewing user’s geographic location”;

“an output device configured to execute the presentation of the estimated reward to the user” Kluckner [0035] discloses “reward system may output the positive reward values if a current state from the state space is proximity to the target state”,

“wherein the reward estimating part is further configured to execute correction of the reward for the action on a basis of a second user input that is input after the presentation of the estimated reward is executed” Kluckner [0009] discloses “creating one or more new target states based on those user inputs, and updating the reward system based on the one or more new target states”,

“wherein the reward estimating part and the presentation control part are each implemented via at least one processor” Kluckner [0010] discloses “one or more processors” for communication and “to establish a reward system based on the target state and actions which modify the available parameters”.

Kluckner and Bowers are analogous art. Kluckner, however, does not expressly disclose the “estimation of a reward” as recited above; that feature is disclosed in Bowers. Hence, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate said feature of Bowers into Kluckner to enhance its estimated value correction functions.
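
For illustration only, the sequence recited in claim 1 (estimate a reward from a first user input, present it, then correct it on the basis of a later second user input) can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from the application, Kluckner, or Bowers: the names RewardEstimator, toy_score, and run_once are hypothetical, and replacing the estimate with the reward scored from the second input is only one possible correction rule (the one recited in claim 6).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RewardEstimator:
    """Hypothetical stand-in for the claimed 'reward estimating part'."""

    # Assumed scoring function mapping a user input to a scalar reward.
    score: Callable[[str], float]

    def estimate(self, first_input: str) -> float:
        # Estimate the reward for the action from the first user input.
        return self.score(first_input)

    def correct(self, estimated: float, second_input: Optional[str]) -> float:
        # With no second input, the presented estimate stands.
        if second_input is None:
            return estimated
        # Assumed rule: replace the estimate with the second-input reward.
        return self.score(second_input)


def toy_score(text: str) -> float:
    # Hypothetical lookup table; unknown inputs score as neutral.
    return {"good": 1.0, "ok": 0.5, "bad": -1.0}.get(text, 0.0)


def run_once(
    estimator: RewardEstimator,
    first_input: str,
    get_second_input: Callable[[], Optional[str]],
    present: Callable[[float], None],
) -> float:
    estimated = estimator.estimate(first_input)
    present(estimated)           # presentation of the estimated reward
    second = get_second_input()  # collected only after the presentation
    return estimator.correct(estimated, second)
```

In this sketch the second input is requested only after present has been called, which mirrors the claim language “input after the presentation of the estimated reward is executed”.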

Claim 5
“wherein the presentation control part is further configured to execute control for presentation of a reward after the correction” Bowers [0031] discloses “[t]he expected user action value is the estimated user action value adjusted for the likelihood of the user action. The expected user action value is based on both the estimated user action value and the user action likelihood”.

Claim 6
“wherein the reward estimating part is further configured to correct the reward for the action to a reward estimated on a basis of the second user input” Bowers [0004] discloses “user action such as a user clicking on a link associated with an advertisement or purchasing a product”; the “clicking” can certainly be executed more than once.

Claim 9
“a first learning part configured to execute learning of a model used for the estimation of the reward, using a combination of the first user input and the reward after the correction, wherein the first learning part is implemented via at least one processor” Bowers [0028] discloses “a machine learning algorithm” and the “reward correction (updating)” is disclosed in Kluckner [0009]. 

Claim 10
“a second learning part configured to execute learning of a model used for execution of the action, using a combination of the action and the reward after the correction, wherein the second learning part is implemented via at least one processor” Bowers [0028] discloses “a machine learning algorithm” and the “reward correction (updating)” is disclosed in Kluckner [0009].

Claim 11
“wherein the presentation control part is further configured to execute control such that the estimated reward is presented using a first presentation method and an emotion of the information processing apparatus is presented using a second presentation method different than the first presentation method” Bowers [0031] discloses “[t]he expected user action value is the estimated user action value adjusted for the likelihood of the user action. The expected user action value is based on both the estimated user action value and the user action likelihood”.
	It is unclear whether the newly amended feature of “a first presentation method” and “a second presentation method” involves two different display screens. Otherwise, a first presentation of an estimated reward and a second presentation of “an emotion” differ in content only; accordingly, no prior art is cited for said amended feature.

Claims 12-13
Claims 12 and 13 are each rejected for the rationale given for claim 1.

Claim 14
“wherein the second user input is different from the first user input, and wherein the second user input is independent of the first user input” Piche [0155] discloses “a first input-output pairing that has a confidence score that is preferable compared to a second input-output pairing”.

7.	Claims 3-4 and 7-8 are rejected under 35 U.S.C. §103 as being unpatentable over Kluckner et al. (US 2018/0032841), hereinafter Kluckner, in view of Bowers et al. (US 2017/0161779), hereinafter Bowers, and further in view of Piche et al. (US 2018/0025288), hereinafter Piche.
Claim 3
“wherein the reward estimating part is further configured to execute correction of the reward for the action on a basis of the second user input that is input in a predetermined time period after the presentation of the estimated reward” Piche [0155] discloses “[t]he calculation of the confidence score may be repeated for at least one other of the input-output pairing such that the confidence score is determined for two or more of the input-out pairing. The confidence scores determined for the two or more of the input-output pairings are each compared against a predetermined threshold”.

Kluckner, Bowers, and Piche are analogous art. Kluckner, however, does not expressly disclose the “second input in a predetermined period of time” as recited above; that feature is disclosed in Piche. Hence, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate said feature of Piche into Kluckner to enhance its estimated value correction functions.

Claim 4
“wherein the presentation control part is further configured to execute control for presentation indicating that the second user input is being accepted” Piche [0155] discloses “a second input-output pairing”.

Claim 7
“wherein the reward estimating part is further configured to correct the reward for the action to a reward produced by weighted-adding the reward estimated on a basis of the first user input, and the reward estimated on a basis of the second user input, to each other” Piche [0053] discloses “an algorithm for the calculation of the second derivative of an error function with respect to the weights is provided”.
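
For illustration only, the “weighted-adding” recited in claim 7 can be read as a convex combination of the two estimated rewards. The sketch below assumes a single weight w_second in [0, 1] applied to the second-input reward; the function name and the default weight are hypothetical and appear in neither the claim nor Piche.

```python
def weighted_reward(r_first: float, r_second: float, w_second: float = 0.5) -> float:
    """Combine two reward estimates by weighted addition.

    r_first:  reward estimated on the basis of the first user input
    r_second: reward estimated on the basis of the second user input
    w_second: assumed weight on the second estimate, between 0 and 1
    """
    if not 0.0 <= w_second <= 1.0:
        raise ValueError("w_second must lie in [0, 1]")
    # The weights sum to 1, so the result lies between the two estimates.
    return (1.0 - w_second) * r_first + w_second * r_second
```

With w_second = 0 the first estimate is kept unchanged, and with w_second = 1 it is fully replaced by the second estimate.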

Claim 8
“wherein the reward estimating part is further configured to determine necessity or unnecessity of any correction for the reward, on a basis of at least one of a difference between the reward estimated on a basis of the first user input and the reward estimated on a basis of the second user input, and a time period up to a time when the second user input is executed” Piche [0053] discloses “a review of a training algorithm for deterministic neural networks with disturbance rejection is provided”.
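
For illustration only, the necessity determination recited in claim 8 can be sketched as a threshold test. The claim requires only “at least one of” the reward difference and the elapsed time; the sketch below happens to use both. The function name and the thresholds min_diff and max_delay_s are hypothetical values chosen for the example and are not taken from the claim or from Piche.

```python
def correction_needed(
    r_first: float,
    r_second: float,
    elapsed_s: float,
    min_diff: float = 0.2,
    max_delay_s: float = 5.0,
) -> bool:
    """Decide whether the estimated reward should be corrected.

    elapsed_s is the time, in seconds, from presentation of the
    estimated reward until the second user input is received.
    """
    # Assumed rule 1: a second input that arrives too late is ignored.
    if elapsed_s > max_delay_s:
        return False
    # Assumed rule 2: correct only when the two estimates differ enough.
    return abs(r_second - r_first) >= min_diff
```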

Response to Arguments
8.	Applicant's arguments filed July 23rd 2021 have been fully considered but they are not persuasive.
	Applicant argues that “in Bowers, the estimated user action value is not presented to a user, instead, in Bowers, the estimated user action is merely used for computing a bid amount associated with impression opportunities for advertisement.” Said argument is not persuasive because presenting an estimated value to a user is not novel and is disclosed in Kluckner. In Kluckner, claim 11 recites “the reward system outputs reward values”.
	Further, applicant argues that “Bowers clearly does not teach or suggest… ‘an output device… execute correction of the reward for the action on a basis of a second user input that is input after the presentation of the estimated reward is executed,’ as recited by claim 1”. Said argument is not persuasive because, although Bowers does not expressly disclose an output device that presents a corrected reward, the “presentation of the estimated reward” is disclosed in Kluckner.
As indicated above, Kluckner claim 11 recites outputting reward values; that is, the reward is presented to users. Hence, it is inherently disclosed that updated reward values can be outputted as well.

Conclusion
9.	Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
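
For illustration only, the reply-period rule stated above can be sketched as date arithmetic. This is a minimal sketch under stated assumptions: the function names are hypothetical, a period that would end on a day missing from a shorter month is assumed to end on that month's last day, and weekend or holiday carry-over and extension fees are not modeled.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    # Move forward by whole calendar months, clamping to the month's last day.
    idx = d.month - 1 + months
    year = d.year + idx // 12
    month = idx % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def shortened_period_end(
    mailed: date,
    first_reply: Optional[date] = None,
    advisory_mailed: Optional[date] = None,
) -> date:
    """Return the end of the shortened statutory period for reply."""
    three = add_months(mailed, 3)
    six = add_months(mailed, 6)
    # Exception: a first reply within two months, followed by an advisory
    # action mailed after the three-month date, moves the end of the period
    # to the advisory mailing date, capped at six months.
    if (
        first_reply is not None
        and advisory_mailed is not None
        and first_reply <= add_months(mailed, 2)
        and advisory_mailed > three
    ):
        return min(advisory_mailed, six)
    return three
```

The mailing date is an input here because this action does not state it in the text above; any dates passed to the function are the caller's own.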

10.	The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

11.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to RUAY HO whose telephone number is (571)272-6088; RightFax number is (571) 273-6088.  The examiner can normally be reached on Monday to Friday 9am - 5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, William Bashore, can be reached at 571-272-4088. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair.
Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Ruay Ho/
Primary Patent Examiner, Art Unit 2175