DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

This action is in response to the application filed on 07/02/2019. Claims 1-15 are pending in the application and have been considered below.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) use a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) are highlighted below with the generic placeholder in bold and the functional language italicized:
Claim 1:
a learning execution unit configured to execute learning processing.
an annotation input unit configured to input annotation information.
Claim 3:
an action determination unit configured to determine an action.
the processing execution unit is caused to execute.
Claim 4:
a data input unit configured to input the respective pieces.
Claim 5:
the annotation input unit configured to input annotation information.
Claim 6:
a control unit configured to store the respective pieces of information.
Claim 13:
the annotation input unit configured to input annotation information.
Claim 14:
a learning execution unit configured to execute learning processing.
an annotation input unit configured to input annotation information.
Claim 15:
a learning execution unit configured to execute learning processing.
an annotation input unit configured to input annotation information.

 Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.



Claims 2-3 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding Claim 2:
Claim 2 recites limitations such as “derives an action determination rule …” that are part of the abstract idea and do not amount to an inventive concept.
For Step 2A, Prong 2, the claim recites additional elements: “learning execution …” and “learning processing …”.
Step 2B
The additional elements “learning execution” and “learning processing” are generally linked to the judicial exception and do not integrate the abstract idea into a practical application. The claim is directed to the abstract idea.
Regarding Claim 3:
Claim 3 recites limitations such as “determine an action …” that are part of the abstract idea and do not amount to an inventive concept.
For Step 2A, Prong 2, the claim recites an additional element: “processing execution …”.
Step 2B
The additional element “processing execution” is generally linked to the judicial exception and does not integrate the abstract idea into a practical application. The claim is directed to the abstract idea.

Claim 15 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. 
 
Claim 15 recites a program causing information processing to be executed in an information processing apparatus. The claim recites an information processing apparatus and the various components of the information processing apparatus. As such, the claim is directed to a computer program (software “per se”). A claim that recites a piece of software alone without any link to a hardware component is directed to non-statutory subject matter since there is no relationship between the computer software and hardware components which permits the functionality of the software to be realized. The claims lack the necessary physical articles or objects to constitute a machine or a manufacture within the meaning of 35 U.S.C. 101. They are clearly not a series of steps or acts to be a process, nor are they a combination of chemical compounds to be a composition of matter. As such, they fail to fall within a statutory category.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-9, 12, and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over KOBAYASHI (US 2013/0097107 A1, hereinafter referred to as KOBAYASHI), in view of Jonnalagadda et al. (US 2020/0143247 A1, hereinafter referred to as Jonnalagadda).
As to claim 1, KOBAYASHI teaches an information processing apparatus comprising: 
a database configured to store respective pieces of information of a state, an action, and a reward of a processing execution unit (paragraphs [0006]-[0007] … action history data, a reward estimator that estimates the reward value from inputted state data and action data…; [0249]-[0250]… FIG. 39, FIGs. 45 & 52…ACTION HISTORY DATA...; [0250] …storage apparatus); 
a learning execution unit configured to execute learning processing in accordance with a reinforcement learning algorithm to which the respective pieces of information of the state, the action, and the reward stored in the database are applied (paragraphs [0006]-[0007] …learning data to generate, through machine learning, a reward estimator that estimates the reward value from inputted state data and action data…; [0241] …estimated rewards).
However, KOBAYASHI fails to explicitly teach: 
an annotation input unit configured to input annotation information including sub reward setting information and store the annotation information in the database, wherein the learning execution unit executes learning processing to which the respective pieces of information of the state, the action, and the reward input from the processing execution unit and the sub reward setting information input via the annotation input unit are applied.  
Jonnalagadda, in combination with KOBAYASHI, teaches:
an annotation input unit configured to input annotation information including sub reward setting information and store the annotation information in the database, wherein the learning execution unit executes learning processing to which the respective pieces of information of the state, the action, and the reward input from the processing execution unit and the sub reward setting information input via the annotation input unit are applied (paragraphs [0015]-[0016]…annotation database…; [0029] Fig 5E….; [0080]… reinforcement learning …an annotation platform requests annotation of sentence intents 592 using active learning…; [0085]…Paragraph intents are annotated in the annotation workflow when sentence level intents and entities are not able to convey enough information to the reinforcement learning agent to be able to take an action…; [0150]… optimize for an objective reward and determine the action (at 1360). If the action cannot be determined with a suitable degree of confidence (at 1370), the process may institute an annotation procedure (at 1380)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of KOBAYASHI to add annotation with reinforcement learning, as taught by Jonnalagadda above. The modification would have been obvious because one of ordinary skill would be motivated to allow for more effective artificial intelligence (AI) operations, improvements to the experience of a conversation target, and increased productivity through AI assistance, as suggested by Jonnalagadda ([0010]).

As to claim 2, which incorporates the rejection of claim 1, KOBAYASHI teaches
 wherein the learning execution unit derives an action determination rule for estimating an action to be executed to raise an expected reward by the learning processing (paragraphs [0006]– [0007] ...reward estimator generating unit…; [0010]-[0011] … reward estimator generating function; [0137]…calculates an estimated value y from input data X using the constructed estimator. The estimated value y is used in recognizing the input data X. For example, a recognition result of "Yes" is obtained if the estimated value y is equal to or larger than a specified threshold Th and a recognition result of "No" is obtained if the estimated value y is smaller than the specified threshold Th; [0178] …the estimator will increase if learning data is added in every iteration, the performance of the estimator will improve).  

As to claim 3, which incorporates the rejection of claim 1, KOBAYASHI teaches:
an action determination unit configured to determine an action which the processing execution unit is caused to execute in accordance with the action determination rule ([0148] …determines whether a specified end condition is satisfied (S105). If the specified end condition is satisfied, the information processing apparatus 10 advances to step S106. Meanwhile, if the specified end condition is not satisfied, the information processing apparatus 10 returns to step S102 and the processing in steps S102 to S104 is executed once again…).

As to claim 4, which incorporates the rejection of claim 1, KOBAYASHI teaches:
a data input unit configured to input the respective pieces of information of the state, the action, and the reward input from the processing execution unit wherein the database stores input data of the data input unit (paragraphs [0006]-[0007] … action history data, a reward estimator that estimates the reward value from inputted state data and action data…; [0249]-[0250]… FIG. 39, FIGs. 45 & 52…ACTION HISTORY DATA...).
However, KOBAYASHI fails to explicitly teach:
stores the sub reward setting information input via the annotation input unit.  
Jonnalagadda, in combination with KOBAYASHI, teaches:
stores the sub reward setting information input via the annotation input unit (paragraphs [0015]-[0016]…annotation database…; [0029] Fig 5E….; [0080]… reinforcement learning …an annotation platform requests annotation of sentence intents 592 using active learning…; [0085]…Paragraph intents are annotated in the annotation workflow… reinforcement learning agent to be able to take an action…; [0150] …If the action cannot be determined with a suitable degree of confidence (at 1370), the process may institute an annotation procedure (at 1380), wherein using the broadest reasonable interpretation, Examiner interprets the annotation database and platform to include the sub reward setting information).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of KOBAYASHI to add annotation with reinforcement learning, as taught by Jonnalagadda above. The modification would have been obvious because one of ordinary skill would be motivated to allow for more effective artificial intelligence (AI) operations, improvements to the experience of a conversation target, and increased productivity through AI assistance, as suggested by Jonnalagadda ([0010]).

As to claim 5, which incorporates the rejection of claim 1, Jonnalagadda, in combination with KOBAYASHI, teaches wherein the annotation input unit inputs the annotation information including the sub reward setting information input via an annotation input apparatus enabling input processing at an arbitrary time to be performed by a user and stores the annotation information in the database (paragraphs [0015]-[0016]…receiving an annotation work in an annotation queue, prioritizing the annotations, and sending the highest priority annotations to the annotator in order.  This is used to update the production annotation database…human annotator...; [0096]- [0097] …AI annotation database schema (a relational database storing the structured annotation relations and the human annotation metrics)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of KOBAYASHI to add annotation with reinforcement learning, as taught by Jonnalagadda above. The modification would have been obvious because one of ordinary skill would be motivated to allow for more effective artificial intelligence (AI) operations, improvements to the experience of a conversation target, and increased productivity through AI assistance, as suggested by Jonnalagadda ([0010]).

As to claim 6, which incorporates the rejection of claim 1, Jonnalagadda, in combination with KOBAYASHI, teaches a control unit configured to store the respective pieces of information of the state and the action of the processing execution unit at time of input of the annotation in the database in association with the sub reward setting information included in the annotation (paragraphs [0015]-[0017]…receiving an annotation work in an annotation queue, prioritizing the annotations, and sending the highest priority annotations to the annotator in order.  This is used to update the production annotation database…human annotator... control versioning in the conversation system; [0096]- [0097] …AI annotation database schema (a relational database storing the structured annotation relations and the human annotation metrics)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of KOBAYASHI to add annotation with reinforcement learning, as taught by Jonnalagadda above. The modification would have been obvious because one of ordinary skill would be motivated to allow for more effective artificial intelligence (AI) operations, improvements to the experience of a conversation target, and increased productivity through AI assistance, as suggested by Jonnalagadda ([0010]).

As to claim 7, which incorporates the rejection of claim 6, KOBAYASHI teaches wherein the learning execution unit executes learning processing to which the respective pieces of information of the state, the action, and the reward input from the processing execution unit and respective pieces of information of a state, an action (paragraphs [0006]-[0007] … action history data, a reward estimator that estimates the reward value from inputted state data and action data…; [0249]-[0250]… FIG. 39, FIGs. 45 & 52…ACTION HISTORY DATA...).
However, KOBAYASHI fails to explicitly teach: 
 a sub reward stored in the database in association with the sub reward setting information input via the annotation input unit are applied.  
Jonnalagadda, in combination with KOBAYASHI, teaches:
a sub reward stored in the database in association with the sub reward setting information input via the annotation input unit are applied (paragraphs [0015]-[0017]…receiving an annotation work in an annotation queue, prioritizing the annotations, and sending the highest priority annotations to the annotator in order.  This is used to update the production annotation database…human annotator...; [0096]- [0097] …AI annotation database schema (a relational database storing the structured annotation relations and the human annotation metrics)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of KOBAYASHI to add annotation with reinforcement learning, as taught by Jonnalagadda above. The modification would have been obvious because one of ordinary skill would be motivated to allow for more effective artificial intelligence (AI) operations, improvements to the experience of a conversation target, and increased productivity through AI assistance, as suggested by Jonnalagadda ([0010]).

As to claim 8, which incorporates the rejection of claim 1, Jonnalagadda, in combination with KOBAYASHI, teaches wherein the sub reward setting information input via the annotation input unit is information input by a user observing processing that the processing execution unit executes (paragraphs [0015]-[0017]…receiving an annotation work in an annotation queue, prioritizing the annotations, and sending the highest priority annotations to the annotator in order.  This is used to update the production annotation database…human annotator...; [0096]- [0097] …AI annotation database schema (a relational database storing the structured annotation relations and the human annotation metrics)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of KOBAYASHI to add annotation with reinforcement learning, as taught by Jonnalagadda above. The modification would have been obvious because one of ordinary skill would be motivated to allow for more effective artificial intelligence (AI) operations, improvements to the experience of a conversation target, and increased productivity through AI assistance, as suggested by Jonnalagadda ([0010]).

As to claim 9, which incorporates the rejection of claim 1, Jonnalagadda, in combination with KOBAYASHI, teaches wherein the sub reward setting information input via the annotation input unit is information input by a user controlling processing that the processing execution unit executes  (paragraphs [0015]-[0017]…receiving an annotation work in an annotation queue, prioritizing the annotations, and sending the highest priority annotations to the annotator in order.  This is used to update the production annotation database…human annotator...; [0096]- [0097] … (a relational database storing the structured annotation relations and the human annotation metrics)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system of KOBAYASHI to add annotation with reinforcement learning, as taught by Jonnalagadda above. The modification would have been obvious because one of ordinary skill would be motivated to allow for more effective artificial intelligence (AI) operations, improvements to the experience of a conversation target, and increased productivity through AI assistance, as suggested by Jonnalagadda ([0010]).

As to claim 12, which incorporates the rejection of claim 1, KOBAYASHI teaches
wherein the processing execution unit is an independent apparatus different from the information processing apparatus, and the information processing apparatus performs data transmission and reception by communication processing with the processing execution unit and controls the processing execution unit (paragraphs [0335]-[0337]…CPU 902 functions as a computational processing apparatus or a control apparatus… communication unit 926).  

Claim 14 recites substantially the same functionalities recited in claim 1, and is directed to an information processing method performed by the information processing apparatus of claim 1.  Therefore claim 14 is rejected for the same reasons as applied to claim 1 above.

Claim 15 recites substantially the same functionalities recited in claim 1, and is directed to a program causing information processing to be executed in an information processing apparatus similar to the information processing apparatus of claim 1. Therefore claim 15 is rejected for the same reasons as applied to claim 1 above.

Claims 10-11 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over KOBAYASHI (US 2013/0097107 A1, hereinafter referred to as KOBAYASHI), in view of Jonnalagadda et al. (US 2020/0143247 A1, hereinafter referred to as Jonnalagadda), and further in view of SAKAKI et al. (US 2015/0254223 A1, hereinafter referred to as SAKAKI).

As to claim 10, which incorporates the rejection of claim 1, SAKAKI, in combination with KOBAYASHI and Jonnalagadda, teaches wherein the sub reward setting information input via the annotation input unit is reward setting information which is input by a user observing processing that the processing execution unit executes and includes a positive reward value input by the user that has confirmed that the processing that the processing execution unit executes is correct (see paragraphs [0014]-[0018]… annotation may be of a binary type, such as "positive" and "negative", or may be categorized into multiple values by preparing multiple categories…).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of KOBAYASHI and Jonnalagadda to add positive reward, as taught by SAKAKI above. The modification would have been obvious because one of ordinary skill would be motivated to add annotations to the same range as an annotator with low reliability, so that redundant addition of low-reliability annotations can be avoided, as suggested by SAKAKI ([0035]-[0036]).
  
As to claim 11, which incorporates the rejection of claim 1, SAKAKI, in combination with KOBAYASHI and Jonnalagadda, teaches wherein the sub reward setting information input via the annotation input unit is reward setting information which is input by a user observing processing that the processing execution unit executes and includes a negative reward value input by the user that has confirmed that the processing that the processing execution unit executes is not correct (see paragraphs [0014]-[0018]… annotation may be of a binary type, such as "positive" and "negative", or may be categorized into multiple values by preparing multiple categories…).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of KOBAYASHI and Jonnalagadda to add negative reward, as taught by SAKAKI above. The modification would have been obvious because one of ordinary skill would be motivated to add annotations to the same range as an annotator with low reliability, so that redundant addition of low-reliability annotations can be avoided, as suggested by SAKAKI ([0035]-[0036]).

As to claim 13, which incorporates the rejection of claim 1, SAKAKI, in combination with KOBAYASHI and Jonnalagadda, teaches wherein the annotation input unit is configured to input the annotation information input by an independent annotation input apparatus different from the information processing apparatus (see paragraphs [0016]-[0018]…annotation adding unit 100 receives an annotation input by an annotator and adds the annotation to some of multiple annotation targets included in the annotation target information 111. …).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined system of KOBAYASHI and Jonnalagadda to add negative reward, as taught by SAKAKI above. The modification would have been obvious because one of ordinary skill would be motivated to add annotations to the same range as an annotator with low reliability, so that redundant addition of low-reliability annotations can be avoided, as suggested by SAKAKI ([0035]-[0036]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  Patents and patent related publications are cited in the Notice of References Cited (Form PTO-892) attached to this action to further show the state of the art with respect to the invention.

Noda et al. (US 8,527,434 B2) teach “Information processing device, has learning unit learning state transition probability model defined by observation probability of predetermined observed value being observed from state and observed value.”

PETANDER (US 2019/0222895 A1) teaches “Method for selectively playing advertisement video on mobile communication device, involves determining whether is to be delayed playing of video or played video based on relationship, and playing video upon determining to play video.”

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABABACAR SECK whose telephone number is (571)270-7146. The examiner can normally be reached Monday-Friday 8:00 A.M.-6:00 P.M.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached on (571)272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ABABACAR SECK/Examiner, Art Unit 2122                                                                                                                                                                                                        
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122