DETAILED ACTION
Introduction
This office action is in response to applicant’s request for continued examination and IDS filed 5/6/2021. Claims 1-4 and 6-20 are currently pending and have been examined. Applicant’s IDS has been considered. There is no claim to foreign priority.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after allowance or after an Office action under Ex Parte Quayle, 25 USPQ 74, 453 O.G. 213 (Comm'r Pat. 1935). Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, prosecution in this application has been reopened pursuant to 37 CFR 1.114.  Applicant's submission filed on 5/6/2021 has been entered.
Response to Arguments
Applicant’s arguments, see remarks, filed 10/21/2020, with respect to the rejection(s) of claim(s) 1-20 under 35 USC 102 and 35 USC 103 and the Double Patenting rejection (see applicant’s arguments), as based on the applicant’s current amendments, have been fully considered and are persuasive.  Therefore, the rejections have been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of the previously cited references and Carbune et al. (Carbune, US 2018/0061400), with respect to 35 USC 103.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4, 6-8, 11-16 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (Liu, US 2018/0357566) in view of Pock et al. (Pock, Diagonal preconditioning for first order primal-dual algorithms in convex optimization) and further in view of Carbune et al. (Carbune, US 2018/0061400).
As per claim 1, Liu teaches a system, comprising a processor to: 
receive data comprising text input (paragraph [0001, 0018-0020]-see his text and classification thereof) [corresponding to a conversation comprising an input partial response from an agent to be completed as a multi-objective generative text task] for a multi-objective generative text task (Figs. 1 and 5, paragraphs [0002, 0016, 0035, 0044]-his input, training, classifier, multiple tasks and optimizations); and 
generate a [completed] response to the text input (ibid-see below primal-dual network discussion) via a trained primal network, [wherein the completed response comprises the input partial response and generated text that completes the input partial response], wherein the primal network and a dual network are trained for [the multi-objective generative] text task using a Lagrangian loss function representing a plurality of objectives, wherein the primal network is trained to minimize the Lagrangian loss function and the dual network is trained to maximize the Lagrangian loss function (paragraph [0042, 0001, 0002, 
[wherein the primal network comprises a step size that is smaller than a step size of the dual network during training]. 
The Examiner notes Liu lacks explicitly teaching that which Pock et al. teaches wherein the primal network comprises a step size that is smaller than a step size of the dual network during training (page 5 section 3, see his sigma and tau, primal and dual step size for gradient-wherein the primal step size is smaller than the dual network step size, and Lagrange multipliers discussion, further pages 1-8, define the primal dual algorithm, including parameterization, step size and corresponding convergence, alternating steps and producing an optimized and significantly faster convergence).
Thus, it would have been obvious to one of ordinary skill in the linguistics art, before the effective filing date of the invention, as all the claimed elements were known in the prior art and one skilled in the art could have combined the elements as claimed by known methods (computer implemented techniques and algorithms combining processes and steps in natural language processing), in view of KSR International Co. v. Teleflex Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007), wherein the predictable result would be producing an optimized and significantly faster convergence of primal-dual networks (ibid, Pock).
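For illustration only, and not as a characterization of Liu, Pock, or the application, the min-max training recited in claim 1 can be sketched on a toy problem. The objective (x - 2)^2, the constraint x - 1 <= 0, the step-size values, and the function names below are all assumptions chosen for the example. The primal variable takes gradient descent steps on the Lagrangian with the smaller step size, and the dual variable takes gradient ascent steps with the larger step size.

```python
def lagrangian_grads(x, lam):
    """Gradients of the toy Lagrangian L(x, lam) = (x - 2)**2 + lam * (x - 1)."""
    grad_x = 2.0 * (x - 2.0) + lam   # dL/dx, used by the primal (minimizing) update
    grad_lam = x - 1.0               # dL/dlam, used by the dual (maximizing) update
    return grad_x, grad_lam


def train_primal_dual(primal_step=0.01, dual_step=0.1, iterations=5000):
    """Alternate a primal descent step and a dual ascent step.

    The step sizes are illustrative assumptions; primal_step is deliberately
    smaller than dual_step to mirror the claimed relationship.
    """
    x, lam = 0.0, 0.0
    for _ in range(iterations):
        grad_x, _ = lagrangian_grads(x, lam)
        x -= primal_step * grad_x                    # primal: minimize L
        _, grad_lam = lagrangian_grads(x, lam)
        lam = max(0.0, lam + dual_step * grad_lam)   # dual: maximize L, keep lam >= 0
    return x, lam


if __name__ == "__main__":
    x, lam = train_primal_dual()
    print(f"x = {x:.4f}, lambda = {lam:.4f}")
```

On this toy problem the iterates approach the constrained optimum x = 1 with multiplier 2, which is the saddle point of the Lagrangian.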
The above combination lacks teaching that which Carbune teaches receive data comprising text input corresponding to a conversation comprising an input partial response from an agent to be completed as a multi-objective generative text task (paragraph [0010]-his “receiving an initial textual output” as the partial response from the agent, to be completed, Fig. 7 item 731); and 
generate a completed response to the text input via a trained primal network, wherein the completed response comprises the input partial response and generated text that completes the input partial response (ibid-paragraph [0010]-his “generating the reply content”, “modifying the initial textual output”, which subsequently completes the response, using the trained machine learning model, including the input partial response and generated text, see his generated the multi-objective generative text task using a Lagrangian loss function representing a plurality of objectives (ibid-see his machine learning discussion, for the multi-objective generative text task).
Thus, it would have been obvious to one of ordinary skill in the linguistics art, before the effective filing date of the invention, as all the claimed elements were known in the prior art and one skilled in the art could have combined the elements as claimed by known methods (computer implemented techniques and algorithms combining processes and steps in natural language processing), in view of the teachings of Liu and Pock with Carbune to combine the prior art element of the primal-dual gradient method as taught by Liu with the alternating gradient having a different step size in the primal network as compared to the dual network as taught by Pock with the generative text task, including completion of a response from an agent in a conversation mode, as each element performs the same function as it does separately, as the combination would yield predictable results, KSR International Co. v. Teleflex Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007), wherein the predictable result would be producing an optimized and significantly 
As per claims 4, 11 and 20, Liu further makes obvious the system of claim 1, wherein the dual network is randomly initialized during training (ibid, Liu, paragraph [0038-0042]-his randomly generated parameter vectors applied to his dual formulation). 
As per claims 6, 12 and 18, Liu further makes obvious the system of claim 1, wherein the processor is to estimate gradients based on a likelihood ratio estimate (ibid, paragraph [0037, 0038]-see his estimated gradients discussion, and cost function ratio). 
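For illustration only, a likelihood ratio (score function) gradient estimate of the kind recited in claims 6, 12 and 18 can be sketched as follows. The one-parameter Bernoulli policy, the reward function, the sample count, and the function name are assumptions made for the example and are not taken from Liu or the application. The estimator averages reward(action) times the derivative of the log-probability of the sampled action.

```python
import math
import random


def likelihood_ratio_gradient(theta, reward, num_samples=200_000, seed=0):
    """Monte Carlo estimate of d/dtheta E[reward(a)], a ~ Bernoulli(sigmoid(theta)).

    Uses the identity grad E[r(a)] = E[r(a) * grad log pi(a)]. For this policy,
    grad log pi(a) with respect to theta equals (a - p), where p = sigmoid(theta).
    """
    rng = random.Random(seed)
    p = 1.0 / (1.0 + math.exp(-theta))
    total = 0.0
    for _ in range(num_samples):
        action = 1 if rng.random() < p else 0
        score = action - p            # derivative of log pi(action) w.r.t. theta
        total += reward(action) * score
    return total / num_samples


if __name__ == "__main__":
    # With reward(a) = a, the expected reward is p and its exact gradient is p * (1 - p).
    estimate = likelihood_ratio_gradient(0.0, lambda a: float(a))
    print(f"estimated gradient = {estimate:.4f}")
```

With theta = 0 the exact gradient is 0.25, so the estimate can be checked against a known value.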
As per claim 7, Liu further makes obvious the system of claim 1, wherein the multi-objective generative text task comprises a selection, a classification, a regression, a recommendation, a generation, or a prediction task (ibid, paragraph [0035, 0044]-see his multiple tasks, recommendation, classification, etc., see also Carbune generative text tasks, Fig. 7, abstract, Figs. 1-7-and generating responsive reply content, based upon multiple objectives).
As per claim 8, claim 8 sets forth limitations similar to claim 1 and is thus rejected under similar reasons and rationale, wherein the system is deemed to embody the computer-implemented method, such that Liu teaches a computer-
generative text task using a Lagrangian loss function representing a plurality of objectives (ibid-see claim 1, corresponding and similar discussion), wherein training the primal network and the dual network comprises training the primal network to minimize the Lagrangian loss function and training the dual network to maximize the Lagrangian loss function (ibid), wherein the primal network comprises a step size that is smaller than a step size of the dual network during training (ibid); receiving data comprising text input corresponding to a conversation comprising an input partial response from an agent to be completed as the multi-objective generative text task (ibid); and generate a completed response to the text input via the trained primal network (ibid), wherein the completed response comprises the input partial response and generated text that completes the input partial response (ibid). 
As per claims 13 and 19, Liu further makes obvious the computer-implemented method of claim 8, comprising updating policy gradients of the primal network and the dual network based on different step sizes for the primal network and the dual network (ibid-paragraphs [0038-0042]-his training/updating gradient descent with respect to primal/dual parameters as policies, and 
As per claim 14, Liu further makes obvious the computer-implemented method of claim 8, wherein training the primal network and the dual network comprises alternately training the primal network and the dual network (ibid, paragraphs [0003, 0037-0042]-his sequential training, first the primal network, then dual network, see his sequential training discussion).
As per claim 15, claim 15 sets forth limitations similar to claim 1 and is thus rejected under similar reasons and rationale, wherein the computer-readable storage medium is deemed to embody the method, such that Liu teaches a computer program product for training neural networks to perform multi-objective tasks, the computer program product comprising a computer-readable storage medium having program code embodied therewith, wherein the computer-readable storage medium is not a transitory signal per se, the program code executable by a processor to cause the processor to (paragraph [0005, 0006]-see his machine-readable medium discussion): train a primal network and a dual network for a multi-objective generative text task using a Lagrangian loss function representing a plurality of objectives (ibid-see claim 1, corresponding and similar discussion); train the primal network to minimize the Lagrangian loss 
As per claim 16, Liu further makes obvious the computer program product of claim 15, further comprising program code executable by the processor to train the primal network and the dual network using a pre-existing dataset, a simulator, a feedback from an environment, or any combination thereof (paragraphs [0003, 0004, 0017, 0037-0042]-his training primal dual network using pre-existing data, see his training samples discussion). 
Claims 2, 3, 9, 10 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (Liu, US 2018/0357566) in view of Pock in view of Carbune et al. (Carbune, US 2018/0061400), as applied to claim 1, and further in view of Peng et al. (Peng, Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning).
As per claims 2 and 9, Liu with Pock with Carbune further makes obvious the system of claim 1, but lacks teaching that which Peng teaches wherein (formulating-as per claim 9) the multi-objective generative text task comprises a Markov Decision process comprising a finite state space and a finite action space (pages 1-3-his Markov Decision Process (MDP) and state-action space, wherein the options comprise Markov decision processes).
Thus, it would have been obvious to one of ordinary skill in the linguistics art, before the effective filing date of the invention, as all the claimed elements were known in the prior art and one skilled in the art could have combined the elements as claimed by known methods (computer implemented techniques and algorithms combining processes and steps in natural language processing), in view of the teachings of Liu and Peng to combine the prior art element of performing a task utilizing multiple objective training model as taught by Liu with Markov Decision process as taught by Peng as each element performs the same function as it does separately, as the combination would yield predictable results, KSR International Co. v. Teleflex Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007), wherein the predictable result would be having a learning technique to optimize prediction results in complex tasks (ibid, Peng).
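For illustration only, a Markov Decision Process with a finite state space and a finite action space, as recited in claims 2 and 9, can be sketched as a small data structure. The two states, the two actions, the rewards, the discount value, and all names below are assumptions made for the example and are not taken from Peng or the application. The value iteration routine is a standard textbook procedure included only to show that the finite structure is directly computable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiniteMDP:
    states: tuple        # finite state space
    actions: tuple       # finite action space
    transitions: dict    # (state, action) -> list of (probability, next_state, reward)


def value_iteration(mdp, discount=0.9, sweeps=100):
    """Return an estimate of the optimal state values after a fixed number of sweeps."""
    values = {s: 0.0 for s in mdp.states}
    for _ in range(sweeps):
        values = {
            s: max(
                sum(p * (r + discount * values[s2])
                    for p, s2, r in mdp.transitions[(s, a)])
                for a in mdp.actions
            )
            for s in mdp.states
        }
    return values


# Hypothetical toy task: a partial response is either completed (reward 1) or left waiting.
TOY_MDP = FiniteMDP(
    states=("partial", "done"),
    actions=("complete", "wait"),
    transitions={
        ("partial", "complete"): [(1.0, "done", 1.0)],
        ("partial", "wait"): [(1.0, "partial", 0.0)],
        ("done", "complete"): [(1.0, "done", 0.0)],
        ("done", "wait"): [(1.0, "done", 0.0)],
    },
)

if __name__ == "__main__":
    print(value_iteration(TOY_MDP))
```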
As per claims 3, 10 and 17, Liu further makes obvious the system of claim 1, but lacks teaching that which Peng teaches wherein the primal network is pretrained using a general policy learned from another setting or a random initialization (ibid-page 12-his learning algorithm/model using set of policies and random initialization).
Thus, it would have been obvious to one of ordinary skill in the linguistics art, before the effective filing date of the invention, as all the claimed elements were known in the prior art and one skilled in the art could have combined the elements as claimed by known methods (computer implemented techniques and algorithms combining processes and steps in natural language processing), in view of the teachings of Liu and Peng to combine the prior art element of performing a task utilizing multiple objective training model as taught by Liu with pretraining using policies as taught by Peng as each element performs the same function as it does separately, as the combination would yield predictable results, KSR International Co. v. Teleflex Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007), wherein the predictable result would be having a learning technique optimized via policy/reward prediction results (ibid, Peng see also abstract and pages 1-3, his policy learning discussion).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure (See PTO-892).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LAMONT M SPOONER whose telephone number is (571)272-7613. The examiner can normally be reached on 8:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached on (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private 






/LAMONT M SPOONER/
Primary Examiner, Art Unit 2657
lms
8/14/2021