DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112(a)
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-12 and 18 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.  Specifically, exemplary claim 1 recites, “generates a set of gain vectors…by summing a product of cumulative expected gains for state transition and the transition parameter.”  Generating gain vectors by this process is not disclosed in the original disclosure.  On the contrary, the specification in paragraph 43 recites, “the first generation section 140 generates gain vectors αan at the target time point n as a sum of cumulative expected gains for state transition from the transition parameter P* (-,-|s,a) and immediate expected gains ra in the state s.”

Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-12 and 18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.  Specifically, exemplary claim 1 recites, “generates a set of gain vectors…by summing a product of cumulative expected gains for state transition and the transition parameter.”  Generating gain vectors by this process is not disclosed in the original disclosure.  On the contrary, the specification in paragraph 43 recites, “the first generation section 140 generates gain vectors αan at the target time point n as a sum of cumulative expected gains for state transition from the transition parameter P* (-,-|s,a) and immediate expected gains ra in the state s.”  This contrary recitation makes the scope of the claim unclear and indefinite.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6, 8, and 11-19 are rejected under 35 U.S.C. 103 as being unpatentable over Oh et al. (hereinafter Oh), U.S. Patent Application Publication 2014/0196065, in view of Gupta et al. (hereinafter Gupta), U.S. Patent Application Publication 2010/0094786, further in view of Pakzad, U.S. Patent 8,594,701.
Regarding Claim 1, Oh discloses a generation apparatus [“mobile video streaming” ¶1; Note: Mobile video streaming requires a computer.] for generating gain vectors for a transition model, the apparatus comprising: 
an acquisition section that acquires a set of gain vectors for a state of a next time point after state of a target time point, and a cumulative expected gain obtained from each gain vector of the set of gain vectors at and after the next time point for each state 
a first determination section that determines a value of a transition parameter used for transitioning from the state of the target time point to the state of the next time point [“transition probabilities” ¶26], and 
wherein the set of gain vectors for the state of the next time point are used to calculate the cumulative expected gain in which transition from a current state to a next state occurs in response to an action [“reward for each pair of a state and an action” ¶26].
However, Oh fails to explicitly disclose a first generation section that generates a set of gain vectors for the state of the target time point from the gain vectors for the state of the next time point, using the transition parameter.
Gupta discloses a first generation section that generates a set of gain vectors for the state of the target time point from the gain vectors for the state of the next time point [“relies on the future reward from the next time interval when computing the Q value for the current time interval” ¶6], by summing a product of cumulative expected gains for state transition and transition parameter [“the Q value of the state at time t (s.sub.t) and the action at time t (a.sub.t) is the sum of the reward function of the current state at the current time interval (s.sub.t) added to the expected value of the policy at a state in the next consecutive time interval (s.sub.t+1), determined by the state transition probabilities (T)” ¶37].
It would have been obvious to one having ordinary skill in the art, having the teachings of Oh and Gupta before him before the effective filing date of the claimed invention, to modify the apparatus of Oh to incorporate the reward calculations of Gupta.
Given the advantage of using future rewards to determine current rewards in order to increase reward accuracy, one having ordinary skill in the art would have been motivated to make this obvious modification.
However, Oh fails to explicitly disclose from a valid range of the transition parameter, based on the cumulative expected gain obtained from each gain vector at the state of the next time point.
Pakzad discloses from a valid range of the transition parameter, based on the cumulative expected gain obtained from each gain vector at the state of the next time point [“assigned a likelihood of transition within a certain range of probabilities, wherein the range for each edge is based on a relative disposition of the edge relative to the direction of the target cell.” col. 11, lines 20-23].
It would have been obvious to one having ordinary skill in the art, having the teachings of Oh, Gupta, and Pakzad before him before the effective filing date of the claimed invention, to modify the combination to incorporate the range of transition values of Pakzad.
Given the advantage of using a range of transition values based on likelihood of transition, one having ordinary skill in the art would have been motivated to make this obvious modification.

Regarding Claim 2, Oh, Gupta, and Pakzad disclose the generation apparatus according to claim 1.  Oh further discloses wherein the first determination section 

Regarding Claim 3, Oh, Gupta, and Pakzad disclose the generation apparatus according to claim 1.
However, Oh fails to explicitly disclose wherein the first determination section determines the value of the transition parameter that minimizes the cumulative expected gains obtained from the set of gain vectors for the state of the next time point.
Gupta discloses wherein the first determination section determines the value of the transition parameter that minimizes the cumulative expected gains obtained from the set of gain vectors for the state of the next time point [“a discount factor” ¶35; Note: the discount factor is between 0 and 1.  When set to 0, immediate rewards are maximized, but when set to 1, future rewards are maximized.].
It would have been obvious to one having ordinary skill in the art, having the teachings of Oh, Gupta, and Pakzad before him before the effective filing date of the claimed invention, to modify the combination to incorporate the reward calculations of Gupta.
Given the advantage of prioritizing immediate rewards over future rewards (i.e., a greedy algorithm), one having ordinary skill in the art would have been motivated to make this obvious modification.


However, Oh fails to explicitly disclose wherein the generation apparatus generates the gain vectors for the state of the target time point, going back from the state of the future time point.
Gupta discloses wherein the generation apparatus generates the gain vectors for the state of the target time point, going back from the state of the future time point [“relies on the future reward from the next time interval when computing the Q value for the current time interval” ¶6].
It would have been obvious to one having ordinary skill in the art, having the teachings of Oh, Gupta, and Pakzad before him before the effective filing date of the claimed invention, to modify the combination to incorporate the reward calculations of Gupta.
Given the advantage of using future rewards to determine current rewards in order to increase reward accuracy, one having ordinary skill in the art would have been motivated to make this obvious modification.

Regarding Claim 5, Oh, Gupta, and Pakzad disclose the generation apparatus according to claim 1.  Oh further discloses wherein 

the first determination section determines the value of the transition parameter for each gain vector included in the set of gain vectors for the state of the next time point [“transition probability matrix for each action” ¶26]; and 
for each gain vector included in the set of gain vectors for the state of the next time point, the first generation section generates the gain vector for the state of the target time point using the transition parameter and adds the gain vector to the set of the gain vectors for the state of the target time point [“reward for each pair of a state and an action” ¶26].

Regarding Claim 6, Oh, Gupta, and Pakzad disclose the generation apparatus according to claim 1.  Oh further discloses wherein the first determination section determines a transition probability from each state at the target time point to each state at the next time point, from a valid range of the transition parameter [“transition probability matrix for each action” ¶26].

Regarding Claim 8, Oh, Gupta, and Pakzad disclose the generation apparatus according to claim 6.  Oh further discloses wherein the first determination section determines the valid range of the transition parameter as being from a reference value up to a constant multiple of the reference value [“transition probability matrix” ¶26].



Regarding Claim 12, Oh, Gupta, and Pakzad disclose the generation apparatus according to claim 1.  Oh further discloses wherein said apparatus is implemented by a program of instructions executable by a computer, tangibly embodied in one or more computer readable program storage devices [¶34].

Regarding Claim 13, Oh discloses a selection apparatus that selects an action in a transition model, the apparatus comprising: 
a set acquisition section that acquires a set of gain vectors for a state of the target time point [“immediate reward r(st, at)” ¶26] and a cumulative expected gain obtained from each gain vector of the set of gain vectors for and after the state of the target time point, for each state at the target time point [“cumulative rewards” ¶26]; 
a probability acquisition section that acquires an assumed probability of being in each state at the target time point [“transition probability matrix for each action” ¶26]; 
a selection section that selects a gain vector from the set of gain vectors based on the set of gain vectors and the assumed probability [“immediate reward r(st, at)” ¶26]; 

a second generation section that generates an assumed probability of being in each state at the next time point after the state of the target time point, using the transition parameter [“transition probability matrix for each action” ¶26], 
wherein a transition from a current state to a next state occurs in response to the action [“transition probability matrix for each action” ¶26].
However, Oh fails to explicitly disclose an output section that selects and outputs the action corresponding to the selected gain vector.
Gupta discloses an output section that selects and outputs the action corresponding to the selected gain vector [“an action taken at a state” ¶5; “display…data as described herein” ¶27]. 
It would have been obvious to one having ordinary skill in the art, having the teachings of Oh and Gupta before him before the effective filing date of the claimed invention, to modify the apparatus of Oh to incorporate displaying the functioning of the process of Gupta.
Given the advantage of displaying information, one having ordinary skill in the art would have been motivated to make this obvious modification.
However, Oh fails to explicitly disclose from a valid range of the transition parameter.
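Pakzad discloses from a valid range of the transition parameter [“assigned a likelihood of transition within a certain range of probabilities” col. 11, lines 20-23].
It would have been obvious to one having ordinary skill in the art, having the teachings of Oh, Gupta, and Pakzad before him before the effective filing date of the claimed invention, to modify the combination to incorporate the range of transition values of Pakzad.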
Given the advantage of using a range of transition values based on likelihood of transition, one having ordinary skill in the art would have been motivated to make this obvious modification.

Claim 14 is rejected with the same mappings as claim 2.
Claim 15 is rejected with the same mappings as claim 3.

Regarding Claim 16, Oh, Gupta, and Pakzad disclose the selection apparatus according to claim 13.  Oh further discloses further comprising a generation apparatus that generates the set of gain vectors for calculating cumulative expected gains for the transition from the current state to the next state, wherein the set acquisition section acquires the set of gain vectors generated by the generation section [“cumulative rewards” ¶26].

Claim 17 is rejected with the same mappings as claim 12.
Claim 18 is rejected on the same grounds as claim 1.
Claim 19 is rejected on the same grounds as claim 13.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Oh, Gupta, and Pakzad, further in view of Levchuk et al. (hereinafter Levchuk), U.S. Patent Application Publication 2011/0016067.
Regarding Claim 7, Oh, Gupta, and Pakzad disclose the generation apparatus according to claim 6.
However, Oh fails to explicitly disclose wherein the first determination section determines the transition probability by linear programming, from the valid range of the transition parameter, the range being expressed by a linear inequality of the transition parameter.
Levchuk discloses wherein the first determination section determines the transition probability by linear programming, from the valid range of the transition parameter, the range being expressed by a linear inequality of the transition parameter [“a value function can be given by a piece-wise linear and convex representation” ¶81; “a piece-wise linear function with support areas for each linear component represented as an interval in the range between 0 and 1” ¶100].
It would have been obvious to one having ordinary skill in the art, having the teachings of Oh, Gupta, Pakzad, and Levchuk before him before the effective filing date of the claimed invention, to modify the combination to incorporate the linear programming of Levchuk.
Given the advantage of using a well-known optimization method in optimizing the transitions, one having ordinary skill in the art would have been motivated to make this obvious modification.
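
For illustration only, the kind of computation recited in claims 3, 7, and 8 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Levchuk, Oh, Gupta, Pakzad, or the instant specification. It assumes the valid range is a simple interval per next state (a reference value up to a constant multiple of it) plus the requirement that the probabilities sum to one, and it substitutes a greedy fill for a general linear programming solver, which gives the same minimizer only for this restricted form of constraints. The function name `worst_case_transition` and all parameter names are hypothetical.

```python
from typing import List


def worst_case_transition(
    lower: List[float],        # assumed lower bound (reference value) per next state
    upper: List[float],        # assumed upper bound (constant multiple of reference)
    next_values: List[float],  # cumulative expected gain of each next state
) -> List[float]:
    """Minimize sum(p[i] * next_values[i]) subject to
    lower[i] <= p[i] <= upper[i] and sum(p) == 1.

    Greedy stand-in for an LP solver: start every probability at its
    lower bound, then hand the remaining mass to the lowest-valued
    next states first, up to their upper bounds.
    """
    if sum(lower) > 1.0 or sum(upper) < 1.0:
        raise ValueError("bounds admit no probability distribution")
    p = list(lower)
    remaining = 1.0 - sum(lower)
    # Visit next states from lowest to highest cumulative expected gain.
    for i in sorted(range(len(p)), key=lambda k: next_values[k]):
        add = min(upper[i] - lower[i], remaining)
        p[i] += add
        remaining -= add
    return p


# Reference value 0.25 for each next state, valid range up to 2x the reference.
reference = [0.25, 0.25, 0.25]
p_min = worst_case_transition(reference, [2 * x for x in reference], [5.0, 1.0, 3.0])
expected_gain = sum(pi * vi for pi, vi in zip(p_min, [5.0, 1.0, 3.0]))
```

In this example the spare probability mass goes to the next state with the lowest gain, so `p_min` is `[0.25, 0.5, 0.25]` and the minimized expected gain is 2.5.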

Claims 9-10 are rejected under 35 U.S.C. 103 as being unpatentable over Oh, Gupta, and Pakzad, further in view of Boutilier et al. (hereinafter Boutilier), “Computing Optimal Policies for Partially Observable Decision Processes using Compact Representations.”
Regarding Claim 9, Oh, Gupta, and Pakzad disclose the generation apparatus according to claim 5.
However, Oh fails to explicitly disclose further comprising an elimination section that eliminates a gain vector that does not maximize a value within a probability distribution range of each state, from the set of the gain vectors for the state of the target time point generated by the first generation section.
Boutilier discloses further comprising an elimination section that eliminates a gain vector that does not maximize a value within a probability distribution range of each state, from the set of the gain vectors for the state of the target time point generated by the first generation section [“eliminating such dominated vectors” pg. 12, ¶2].
It would have been obvious to one having ordinary skill in the art, having the teachings of Oh, Gupta, Pakzad, and Boutilier before him before the effective filing date of the claimed invention, to modify the combination to incorporate the vector elimination of Boutilier.
Given the advantage of reducing computation time, one having ordinary skill in the art would have been motivated to make this obvious modification.

Regarding Claim 10, Oh, Gupta, Pakzad, and Boutilier disclose the generation apparatus according to claim 9.

Boutilier discloses wherein the gain vector eliminated by the elimination section from the set of the gain vectors for the state of the target time point generated by the first generation section does not maximize the value of the cumulative expected gains in a predetermined probability distribution within the range of probability distribution of each state [“eliminating such dominated vectors” pg. 12, ¶2; “expected rewards accumulated” pg. 7, ¶2].
It would have been obvious to one having ordinary skill in the art, having the teachings of Oh, Gupta, Pakzad, and Boutilier before him before the effective filing date of the claimed invention, to modify the combination to incorporate the vector elimination of Boutilier.
Given the advantage of reducing computation time, one having ordinary skill in the art would have been motivated to make this obvious modification.
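
For illustration only, eliminating dominated gain vectors can be sketched as follows. This is a minimal sketch under stated assumptions and is not Boutilier's procedure or the Applicant's elimination section. It uses the simplest test, pointwise dominance, in which a vector is dropped when some other vector is at least as good in every state and strictly better in at least one. Pruning over a range of probability distributions generally needs a stronger test than this. The function name `prune_pointwise_dominated` is hypothetical.

```python
from typing import List


def prune_pointwise_dominated(vectors: List[List[float]]) -> List[List[float]]:
    """Keep only the gain vectors that no other vector pointwise dominates."""

    def dominates(a: List[float], b: List[float]) -> bool:
        # a dominates b if it is never worse and is strictly better somewhere.
        return all(x >= y for x, y in zip(a, b)) and any(
            x > y for x, y in zip(a, b)
        )

    kept = []
    for i, v in enumerate(vectors):
        if not any(dominates(w, v) for j, w in enumerate(vectors) if j != i):
            kept.append(v)
    return kept


# Each inner list holds one cumulative expected gain per state.
candidates = [[4.0, 2.0], [3.0, 1.0], [1.0, 5.0]]
survivors = prune_pointwise_dominated(candidates)
```

Here `[3.0, 1.0]` is removed because `[4.0, 2.0]` is better in both states, while `[1.0, 5.0]` survives because it is the best choice when the second state is likely.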

Examiner’s Note
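The Examiner respectfully requests that the Applicant, in preparing responses, fully consider the entirety of the reference(s) as potentially teaching all or part of the claimed invention.  It is noted, REFERENCES ARE RELEVANT AS PRIOR ART FOR ALL THEY CONTAIN.  “The use of patents as references is not limited to what the patentees describe as their own inventions or to the problems with which they are concerned. They are part of the literature of the art, relevant for all they contain.” In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)).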

Response to Arguments
Applicant’s arguments with respect to a valid range have been considered but are moot because the arguments do not apply to the references being used in the current rejection.
Regarding the 103 rejections, Applicant's arguments have been fully considered but have been found unpersuasive.  As a preliminary matter, Applicant argues that Oh does not disclose the valid range; however, that limitation is taught by Pakzad, not Oh.  Turning to the other arguments, Applicant argues that Gupta does not disclose “a first generation section that generates a set of gain vectors for the target time point from the set of gain vectors for the next time point, by summing a product of cumulative expected gains for state transition and transition parameter” of claim 1 and “an output section that selects and outputs the action corresponding to the selected gain vector” of claim 13.
Regarding the first issue, the argued limitation both lacks written description and is indefinite.  As stated in the 112 rejections, the specification actually provides a different way to generate gain vectors.  Gupta discloses, “the Q value of the state at time t (s.sub.t) and the action at time t (a.sub.t) is the sum of the reward function of the current state at the current time interval (s.sub.t) added to the expected value of the policy at a state in the next consecutive time interval (s.sub.t+1), determined by the state transition probabilities (T)” in at least paragraph 37.  As the instant specification states in paragraph 43, “generates gain vectors αan at the target time point n as a sum of cumulative expected gains for state transition from the transition parameter P* (-,-|s,a) and immediate expected gains ra in the state s.”  Essentially, these passages both disclose summing expected gains for the state transitions to determine the gain vectors.  Accordingly, as best as can be understood, this limitation is rendered obvious by the combination of references.
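
For illustration only, the computation that both quoted passages appear to describe can be sketched as a single backward step: the gain for each state at the target time point is the immediate expected gain plus the transition-weighted sum of the gains at the next time point. This is a minimal sketch under stated assumptions, not code from Gupta or from the instant specification. The function name `backup_gain_vector`, the parameter names, and the optional `discount` argument are hypothetical.

```python
from typing import List


def backup_gain_vector(
    immediate_gain: List[float],    # assumed r_a(s): immediate expected gain per state
    transition: List[List[float]],  # transition[s][s2] = P(s2 | s, a) for one action a
    next_gain: List[float],         # cumulative expected gain per state at time n+1
    discount: float = 1.0,          # assumed discount factor in [0, 1]
) -> List[float]:
    """Return the gain vector at time n for one action, one entry per state."""
    n_states = len(immediate_gain)
    result = []
    for s in range(n_states):
        # Expected cumulative gain after the transition out of state s.
        expected_future = sum(
            transition[s][s2] * next_gain[s2] for s2 in range(n_states)
        )
        result.append(immediate_gain[s] + discount * expected_future)
    return result


r = [1.0, 0.0]
P = [[0.5, 0.5], [0.0, 1.0]]
alpha_next = [4.0, 2.0]
alpha_now = backup_gain_vector(r, P, alpha_next)
alpha_myopic = backup_gain_vector(r, P, alpha_next, discount=0.0)
```

With these inputs `alpha_now` is `[4.0, 2.0]`. Setting `discount=0.0` leaves only the immediate gains, `[1.0, 0.0]`, which corresponds to the greedy case noted for the discount factor under claim 3.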
Regarding the second issue, Applicant argues that while Gupta outputs information, it does not select it.  However, the cited reference need not explicitly state or use the claim language from the application.  “The use of patents as references is not limited to what the patentees describe as their own inventions or to the problems with which they are concerned. They are part of the literature of the art, relevant for all they contain.” In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)).  Furthermore, a reference may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art.
For at least these reasons, the rejections are maintained.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ROBERT H BEJCEK II whose telephone number is (571)270-3610.  The examiner can normally be reached on Monday - Friday: 9:00am - 5:00pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov can be reached on (571) 270-3428.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/R.B./            Examiner, Art Unit 2123

/ALEXEY SHMATOV/           Supervisory Patent Examiner, Art Unit 2123