DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
Claims 13, 17, 19, 22, 26, 28, 30 and 34 were amended. Claims 1-12, 18, 21, and 27 have been canceled. Claims 35-36 were added. Claims 13-17, 19-20, 22-26, and 28-36 are pending and are examined herein.
Applicant’s amendment overcomes the previous grounds of rejection under 35 USC 101.
The rejection of claims 13-17, 19-20, 22-26, and 28-34 under 35 USC 103 has been updated responsive to Applicant’s amendment. Newly added claims 35-36 are rejected under 35 USC 103. See response to arguments.

Response to Arguments
Applicant’s arguments filed 09/21/2022 regarding the rejection under 35 USC 103 have been fully considered, but are not persuasive. Applicant argues on page 8 that the references do not teach the language added to the independent claims in the most recent amendment. Examiner respectfully disagrees. The rejection has been updated to clarify how the references are relied upon to teach the claimed subject matter. Responsive to Applicant’s amendment clarifying that the deep neural network machine learning model is configured to output an action, the rejection has been updated to clarify that the model is being interpreted as corresponding to the machine learning model comprising both the value network and the policy network taught by Fox. The artificial intelligence model taught by Fox uses these DNNs together to generate actions/plans. See current rejection for further details.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 09/29/2022 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

	Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 13-17, 19, 22-26, 28 and 30-35 are rejected under 35 U.S.C. 103 as being unpatentable over “Fox” (US 2019/0302310 A1) in view of “Garrett” (Learning to Rank for Synthesizing Planning Heuristics, incorporated into Fox by reference), further in view of “Tapia” (A PDDL-Based Simulation System).

	Regarding claim 13, Fox teaches
	A method, comprising: (Abstract describes a method for representing plans as pixels and then training a neural network based on the identified plans.)
	…a specification of a domain specified using an artificial intelligence planning language; (Fox, [0144] describes representing a planning task or workflow using a planning domain definition language (PDDL).)
	…using a computer processor (Figure 14 shows a system 1490 comprising a computer processor for implementing the techniques taught by Fox. See also [0170].)
	… a plurality of problem specifications ([0147] describes training a deep neural network using planning instances generated by a planner for a number of planning domains and problems. [0147] also suggests using the techniques of Garrett, incorporated by reference (see below).)
	using an automated artificial intelligence planner to generate [planning instances]; and ([0147]: “As to a supervised learning phase, a DNN can be trained by planning instances generated by a planner for a number of planning domains and problems”. [0147] goes on to suggest generating the training data using the method taught by Garrett, incorporated by reference into Fox. Garrett is discussed in detail below. The “planner” is described in more detail at [0144] and is being interpreted as corresponding to the recited “automated artificial intelligence planner”.)
	…training, using the generated [planning instances], a deep neural network machine learning model version of the automated artificial intelligence planner, ([0133] indicates that the model of Fox may include a plurality of neural networks including a value network and a policy network. The “deep neural network machine learning model” is being interpreted as encompassing both of these networks. The training is described at [0146-0148]. In particular, [0147] includes: “a DNN can be trained by planning instances generated by a planner for a number of planning domains and problems”. [0147-0148] indicate that the training instances (generated using the method taught by Garrett, incorporated by reference and described in more detail below) are used at least to generate the value network, which in turn is used to generate the policy network. Since the model is trained using the instances generated by the planner, it is reasonable to interpret the resulting model as a “version” of the automated artificial intelligence planner.)
	wherein the deep neural network machine learning model version of the automated artificial intelligence planner is configured to receive as input, features extracted from a different specification in the artificial intelligence planning language and output an action that is decoded into the artificial intelligence planning language. ([0149] indicates that the resulting trained planner may operate as follows: “At each state, the value network is used to decide the most promising state to expand, and the policy network is used to decide the most promising applicable action to apply to expand that state.” [0144] indicates that the planning task or workflow may be represented in an artificial intelligence planning language such as PDDL. A person of ordinary skill in the art would recognize that the PDDL coding of the task or workflow taught at [0144] includes coding the actions which are part of the task or workflow in PDDL (i.e., the action output by the model would be “decoded” into PDDL since the whole workflow is expressed in PDDL). [0167-0168] further describe deploying the DNN to generate a plan including at least one action based on pixels (i.e., features) representing a problem specification.)
	Fox does not appear to explicitly describe 
	receiving a specification of a domain 
	…parsing the received specification of the domain;
	receiving problem-set generator parameters; 
	…to generate a plurality of problem specifications based on the parsed specification of the domain and the received problem-set generator parameters
	using an automated artificial intelligence planner to generate a plurality of problem solutions; and
	training, using the generated plurality of problem solutions, a deep neural network machine learning model version of the automated artificial intelligence planner,
	However, Garrett—directed to analogous art and incorporated into Fox by reference—teaches
	receiving a specification of a domain…; (Section 3, on the second page describes generating training data for training a machine learning planner. Section 3, right hand column: “The overall approach will be, for each planning domain, to train a learning algorithm on several planning problem instances, and then to use the learned heuristic to improve planning performance on additional planning problems from that same domain. Note that the new problem instances use the same predicate and action schemas, but may have different universes of objects, initial states, and goal states.” Section 6 indicates that this was actually implemented: “We experimented on four domains from the 2014 IPC learning track [Vallati et al., 2015]: elevators, transport, parking, and no-mystery. For each domain, we constructed a set of unique examples with the competition problem generators by sampling parameters that cover competition parameter space. We use a variant of the 2014 FastDownward Stone Soup portfolio [Helmert et al., 2011] planner, with a large timeout and memory limit, to generate training example plans. We trained on at most 10 examples randomly selected from the set of problems our training portfolio planner was able to solve, and then tested on the remaining problems”. In particular, the domains were provided to the system.)
	…receiving problem-set generator parameters; (Section 3: “In order to learn a heuristic for a particular domain, we must first gather training examples from a set of existing training problems within the domain [Jimenez et al., 2012]. Suppose that we have a distribution over problems for a domain D, which will be used to generate testing problems. We will sample a set of training problems {Π1, ..., Πn} from D.” The number “n” of training problems is a problem-set generator parameter. The experiments used n=10 as described in the second paragraph of section 6. The distribution over the problems is also being interpreted as a parameter. Note the published instant specification’s broad reading of “parameters” at [0022].)
	…to generate a plurality of problem specifications based on …the domain and the received problem-set generator parameters. (Section 3, two paragraphs following “Definition 2” describe generating a plurality of problem specifications for a given domain and given a number of training problems n to sample from a given distribution of D. See also second paragraph of section 6.)
	using an automated artificial intelligence planner to generate a plurality of problem solutions; and (Section 3, two paragraphs following “Definition 2” describe generating a plurality of problem specifications for a given domain and given a number of training problems n to sample from a given distribution of D. Approximately optimal plans (i.e., problem solutions) are determined for the problems. See also second paragraph of section 6 where the particular AI planners which were used are described.)
	training, using the generated plurality of problem solutions, a [model] (Section 3, two paragraphs following “Definition 2” describe training a learning algorithm/model on planning problem instances. The training data is of the form <x,y> where the x represents a state and problem and the y represents a length of an approximately optimal plan determined based on the solutions identified above. As the solutions are used to generate the training data, they are used to train the model. In the combination with Fox, this would correspond to training the “value network”, which is described in more detail above with respect to Fox.)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify Fox to generate training data as taught by Garrett and described above because Garrett is incorporated into Fox by reference (see Fox at [0147]), where Fox explicitly suggests using the method of Garrett for generating training examples. 
	The combination of Fox and Garrett does not appear to explicitly teach 
	parsing the received specification of the domain;
	… the parsed specification of the domain
	However, Tapia—directed to analogous art—teaches
	parsing the received specification of the domain;… the parsed specification of the domain (Abstract describes a planning simulation system that allows for automated planning based on the PDDL language. Section 2.3. describes a parser for parsing PDDL problems.)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify Fox and Garrett to use a PDDL parser as taught by Tapia because Fox teaches representing problems using PDDL and the parser taught by Tapia parses and stores PDDL data as convenient structures that can then be used to explore, analyze or translate the problem as described by Tapia in the third paragraph of section 2.3.
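For illustration only, the training-data form attributed to Garrett above (pairs <x,y>, where x represents a state and problem and y represents the length of an approximately optimal plan) can be sketched as follows. This is a simplified sketch and is not taken from Fox, Garrett, or Tapia: the function name training_pairs, the tuple representation of states, and the use of the remaining plan length as the label y are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: a state is a tuple of ground facts (strings).
State = Tuple[str, ...]
# x = (problem identifier, state); y = number of plan actions remaining.
Example = Tuple[Tuple[str, State], int]


def training_pairs(problem_id: str, states: Sequence[State]) -> List[Example]:
    """Build <x, y> pairs from one solved problem.

    `states` is the sequence of states visited by the planner's solution,
    starting at the initial state and ending at a goal state. The label y
    for the j-th state is the number of actions left in the plan, used here
    as an assumed stand-in for the approximately optimal plan length.
    """
    n_actions = len(states) - 1
    return [((problem_id, s), n_actions - j) for j, s in enumerate(states)]


# Usage: a two-action plan yields three labeled states.
pairs = training_pairs("p1", [("at-a",), ("at-b",), ("at-c",)])
```

Under these assumptions, each solved problem contributes one labeled example per visited state, so the problem solutions feed directly into the training data.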
	
	Regarding claim 14, the rejection of claim 13 is incorporated herein. Furthermore, Fox teaches
	further comprising generating a training corpus for a machine learning model ([0147] describes training a deep neural network using planning instances generated by a planner for a number of planning domains and problems. The collection of training data is being interpreted as a training corpus.)
	Fox does not appear to explicitly teach 
	further comprising generating a training corpus for a machine learning model using the … specification of the domain and the generated plurality of problem specifications. 
	However, Garrett—directed to analogous art—teaches 
	further comprising generating a training corpus for a machine learning model using the … specification of the domain and the generated plurality of problem specifications. (Section 3, two paragraphs following “Definition 2” describe generating a plurality of problem specifications for a given domain and given a number of training problems n to sample from a given distribution of D. See also second paragraph of section 6. The collection of training data is being interpreted as a training corpus.)
	It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention to have performed this combination for the reasons given above with respect to claim 13.
	The combination of Fox and Garrett does not appear to explicitly teach 
	parsed specification of the domain
	However, Tapia—directed to analogous art—teaches
	parsed specification of the domain (Abstract describes a planning simulation system that allows for automated planning based on the PDDL language. Section 2.3. describes a parser for parsing PDDL problems.)
	It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention to have performed this combination for the reasons given above with respect to claim 13.

	Regarding claim 15, the rejection of claim 13 is incorporated herein. Furthermore, Fox teaches
	further comprising determining machine learning features from the generated plurality of problem specifications. ([0147] describes generating training data based on a plurality of problem specifications. [0167] describes representing the plan information (i.e., features) as pixels.)

	Regarding claim 16, the rejection of claim 15 is incorporated herein. Furthermore, Fox teaches
	wherein the determined machine learning features are encoded as a pixel-based image. ([0147] describes generating training data based on a plurality of problem specifications. [0167] describes representing the plan information (i.e., features) as pixels.)

	Regarding claim 17, the rejection of claim 13 is incorporated herein. Fox does not appear to explicitly teach 
	wherein the plurality of problem solutions based on the parsed specification of the domain and the received problem-set generator parameters.
	However, Garrett—directed to analogous art—teaches
	wherein the plurality of problem solutions based on the parsed specification of the domain and the received problem-set generator parameters. (Section 3, two paragraphs following “Definition 2” describe generating a plurality of problem specifications for a given domain and given a number of training problems n to sample from a given distribution of D. Approximately optimal plans are determined for the problems. See also second paragraph of section 6 where the particular AI planners which were used are described.)
	It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention to have performed this combination for the reasons given above with respect to claim 13.
	The combination of Fox and Garrett does not appear to explicitly teach 
	the parsed specification of the domain
	However, Tapia—directed to analogous art—teaches
	the parsed specification of the domain (Abstract describes a planning simulation system that allows for automated planning based on the PDDL language. Section 2.3. describes a parser for parsing PDDL problems.)
	It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention to have performed this combination for the reasons given above with respect to claim 13.

	Regarding claim 19, the rejection of claim 17 is incorporated herein. Fox does not appear to explicitly teach 
	wherein a first action of each of the generated plurality of problem solutions is extracted and encoded.
	However, Garrett—directed to analogous art—teaches
	wherein a first action of each of the generated plurality of problem solutions is extracted and encoded. (Section 4: “We can now view our training inputs as xij =<sij, gi, πijh> where πijh is the DAG generated by heuristic h for state sij and goal gi.” The determination of the π values is being interpreted as an extraction step and the representation in the feature vector is being interpreted as an encoding step. See also sections 4.1 and 4.2 for further details of the representation.)
	It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention to have performed this combination for the reasons given above with respect to claim 13.

Regarding claim 22, Fox teaches
A system, comprising: a processor; and a memory coupled with the processor, wherein the memory is configured to provide the processor with instructions which when executed cause the processor to: (Figure 14, system 1490 includes a processor and memory which stores instructions which are executable by the processor. See also [0170].)
The remainder of claim 22 is substantially similar to claim 13 and is rejected with the same rationale, mutatis mutandis.

Claims 23-26 and 28 recite substantially similar subject matter to claims 14-17 and 19, respectively, and are rejected with the same rationale in view of the rejection of claim 22, mutatis mutandis.

Regarding claim 30, Fox teaches
A computer program product embodied in a non-transitory computer readable medium and comprising computer instructions for: (Figure 14, system 1490 includes a processor and memory which stores instructions which are executable by the processor. See also [0170]. [0197] describes an embodiment as a computer program product on a non-transitory computer readable medium.)
The remainder of claim 30 is substantially similar to claim 13 and is rejected with the same rationale, mutatis mutandis.

	Claims 31-35 recite substantially similar subject matter to claims 14-17 and 19, respectively, and are rejected with the same rationale in view of the rejection of claim 30, mutatis mutandis.

	Claims 20, 29, and 36 are rejected under 35 U.S.C. 103 as being unpatentable over “Fox” (US 2019/0302310 A1) in view of “Garrett” (Learning to Rank for Synthesizing Planning Heuristics), further in view of “Tapia” (A PDDL-Based Simulation System), and further in view of “Williams” (US 2017/0330077 A1).

	Regarding claim 20, the rejection of claim 19 is incorporated herein. The combination of Fox, Garrett and Tapia does not appear to explicitly teach 
	wherein the first action is encoded as a one-hot vector.
	However, Williams—directed to analogous art—teaches
	wherein the first action is encoded as a one-hot vector. (Abstract describes using supervised and reinforcement learning to train bots to perform actions. [0062] indicates that the actions may be represented using one-hot encoding.)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify Fox, Garrett and Tapia to use one-hot encoding of actions as taught by Williams because this allows for supervised training of a neural network using a cross-entropy loss function as described by Williams at [0062]. [0003] indicates that the implementation taught by Williams (which would include the one-hot encoding and use of cross-entropy loss function) allows for the autonomous improvement of bot behavior.
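For illustration only, one-hot encoding of an action and its use with a cross-entropy loss can be sketched as follows. This is a minimal sketch and is not taken from Williams or Fox: the action names, the functions one_hot and cross_entropy, and the example probabilities are hypothetical.

```python
import math
from typing import List, Sequence


def one_hot(action: str, actions: Sequence[str]) -> List[float]:
    """Return a vector with 1.0 at the position of `action` and 0.0 elsewhere."""
    if action not in actions:
        raise ValueError(f"unknown action: {action}")
    return [1.0 if a == action else 0.0 for a in actions]


def cross_entropy(target: Sequence[float], probs: Sequence[float]) -> float:
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)


# Hypothetical action vocabulary and a model's predicted distribution.
ACTIONS = ["pick-up", "move", "drop"]
target = one_hot("move", ACTIONS)
loss = cross_entropy(target, [0.25, 0.5, 0.25])
```

With a one-hot target, the loss reduces to the negative log of the probability the model assigns to the correct action, which is what makes the encoding convenient for supervised training.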

	Claims 29 and 36 recite substantially similar subject matter to claim 20 and are rejected with the same rationale in view of the rejection of claims 28 and 35, mutatis mutandis.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Markus A Vasquez whose telephone number is (303)297-4432. The examiner can normally be reached Monday to Friday 9AM to 4PM PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li Zhen, can be reached at (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MARKUS A. VASQUEZ/Examiner, Art Unit 2121