DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
This action is in reply to the application filed on 11/27/2019.
Claims 1-12 are currently pending and have been examined.


Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-12 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Claims 1-12 are directed to either a system or a method, each of which is one of the statutory categories of invention. (Step 1: YES).
The Examiner has identified method Claim 12 as the claim that represents the claimed invention for analysis; it is similar to system Claim 1.
Claim 12 recites the limitations of:

generating a subsidiary prediction value by inputting the trading data into a pre-trained first deep learning model based on supervised learning; 
deriving an order execution strategy for the at least one item during a current period of time based on the trading data and the subsidiary prediction value by using a pre-trained second deep learning model based on reinforcement learning; and 
instructing order execution for the at least one item during the current period of time by using order information including the order execution strategy.

These above limitations, under their broadest reasonable interpretation, cover certain methods of organizing human activity. The claim recites elements, highlighted in bold above, which cover performance of the limitations as a fundamental economic practice. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation as a fundamental economic practice, then it falls within the “Certain Methods of Organizing Human Activity” grouping of abstract ideas. Accordingly, the claim recites an abstract idea. Therefore, Claims 1 and 12 are abstract. (Step 2A-Prong 1: YES. The claims are abstract)
	Claim 12 is also abstract as a mental process, as at least one of the steps must be performed by a computer.  (For example: generating/deriving, by the order execution server, a subsidiary value/order execution strategy…).

i.e., as a generic processor performing a generic computer function) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements, when considered separately and as an ordered combination, do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
Furthermore, the claims recite the additional elements of “collecting trading data on at least one item” and “by using a pre-trained second deep learning model based on reinforcement learning.” The additional elements do not amount to significantly more than extra-solution activity, because they amount to no more than mere instructions to apply the exception using a generic computer. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Therefore, claims 1 and 12 are directed to an abstract idea without a practical application. (Step 2A-Prong 2: NO. The additional claimed elements are not integrated into a practical application)
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered separately and as an ordered combination, they do not add significantly more (also known as an “inventive concept”) to the exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using computer hardware amounts to no more than mere instructions to apply the exception using a generic computer component.
The claim is not patent eligible. Steps such as receiving and transmitting are steps that are considered insignificant extra solution activity and mere instructions to apply the exception using general computer components (see MPEP 2106.05(d), II). Thus claims 1 and 12 are not patent eligible. (Step 2B: NO. The claims do not provide significantly more)  
Dependent claims 2-11 further define the abstract idea that is present in their respective independent claims 1 and 12 and thus correspond to Certain Methods of Organizing Human Activity and hence are abstract for the reasons presented above. The dependent claims do not include any additional elements that integrate the abstract idea into a practical application or are sufficient to amount to significantly more than the judicial exception when considered both individually and as an ordered combination. Therefore, claims 2-11 are directed to an abstract idea. Thus, claims 1-12 are not patent-eligible.


Claim Rejections - 35 USC § 103
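In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.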
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-4 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Burhani et al. (PG PUB US 2019/0361739 A1) in view of Singh et al. (PG PUB US 2019/0244288).
Regarding claims 1 and 12 

Burhani teaches:
An order execution server for stock trading, comprising: a data collection unit configured to collect trading data on at least one item; (See at least Burhani [0048] The 

an order execution strategy deriving unit configured to derive an order execution strategy for the at least one item during a current period of time based on the trading data and the subsidiary prediction value by using a pre-trained second deep learning model based on reinforcement learning; and (See at least Burhani [0045] “The computing system 100 can train one or more reinforcement learning network(s) 110 (which can also be referred to as reinforcement learning agent 110) using training engine 118 and matching engine 114. The trained reinforcement learning networks 110 can be used by computing system 100 or other system. In some embodiments, the trained reinforcement learning networks 110 can receive data from one or more data sources 160 which can provide market data, and based on this received data and its previous training, can transmit one or more data processing tasks representing electronic trade instructions to trade entities 150a, 150b, in some embodiments. The computing system 100 can process trade orders using the reinforcement learning network 110 in response to commands or data messages from trade entities 150a, 150b, in some embodiments. Trade entities 150a, 150b can interact with the computing system to receive output data and provide input data. The trade entities 150a, 150b can have at least one computing device. In some embodiments, a trade entity can represent one or more order processing engines at a stock exchange.”)

an order execution instruction unit configured to instruct order execution for the at least one item during the current period of time by using order information including the order execution strategy. (See at least Burhani [0052] and [0132]: [0052] “Reinforcement learning network 110 receives input data (via a data collection unit) and generates output data for provision to trade entities 150a, 150b. Reinforcement learning network 110 can refer to a neural network that implements reinforcement learning.” And [0132] “In other words, in some embodiments, the processor advances the clock based on a timing critical path for the reinforcement learning agent(s), the matching engine, and the resource generating agent to complete their respective tasks for a time interval.”)


However, Burhani does not teach “a subsidiary prediction value generation unit configured to generate a subsidiary prediction value by inputting the trading data into a pre-trained first deep learning model based on supervised learning;”
However, Singh teaches, at least at [0012]: A machine learnt network using a large set of historic market data identifies optimal timing and sizing by predicting future market states in order to generate the transaction placement strategy.


It would have been obvious to a person of ordinary skill in the art before the effective filing date to combine the Trade platform with reinforcement learning network and machine 

Regarding claim 2
Burhani teaches:
The order execution server of Claim 1, further comprising: a model generation unit configured to generate the second deep learning model, wherein the second deep learning model includes two or more actors which are neural networks that determine an action policy of a reinforcement learning agent and a critic which is a neural network that estimates an action value of the reinforcement learning agent. (See at least Burhani [0041] For example, in some embodiments, aspects of the embodiments described herein may provide a dynamic resource environment for training machine learning agents to generate and interact with its own or other agents' data processing tasks which require different electronic resources.)
Regarding claim 3
Burhani teaches:

The order execution server of Claim 2, wherein the model generation unit is configured to train the two or more actors to improve a reward for the order execution strategy based on the trading data in a reinforcement learning environment.  (See at least Burhani [0008] “to a reinforcement learning agent associated with the first data processing task or the second data processing task, executed task data identifying the matched data processing task, the consumed resource, the cost of the resource, and the quantity of the consumed resource; the executed task data providing input or state data for the reinforcement learning agent.”)

Regarding claim 4

Burhani teaches:
The order execution server of Claim 2, wherein the two or more actors include: a first actor configured to determine an order volume of the at least one item; and (See at least Burhani [0057] and [0107]: [0057] “Scheduler 116 is configured to control the reinforcement learning network 110 within schedule satisfaction bounds computed using order volume and order duration.” And [0107] “In some embodiments, the executed task data can be combined, normalized and/or used to created derivative or other data based on the executed task data to be used in training or providing inputs or state data to the reinforcement learning agents. For example, in some embodiments, the executed task data can be used to calculated volume weighted average prices, daily volumes, cumulative volumes, price movements/momentum, etc.”)

a second actor configured to determine an order cancellation volume of the at least one item. (See at least Burhani [0078] and [0126]-[0127]: [0078] “The liquidity filter 204 implements random cxl generation at 210 for randomly sampling prices, for example. This can also generate controls to cancel pricing and volume.” And [0126] “With reference to FIG. 3, in some embodiments, the processor is configured to generate cancellation data processing tasks. In some embodiments, the cancellations are randomly generated to cancel/remove an order previously generated by the resource generation agent after a cancellation delay. In some embodiments, the cancellations can be generated for all data processing tasks or a randomly selected subset of the data processing tasks.” And [0127] “In some embodiments, the cancellation delay can be defined (e.g. 10 seconds), or can be randomly generated (e.g. based on a Gaussian distribution).”)
Regarding claim 10
Burhani teaches: 
The order execution server of Claim 1, wherein the order execution instruction unit is configured to instruct order execution for the at least one item during the current period of time by transmitting the order information including the order execution strategy to a stock trading management server.   (See at least Burhani [0052] and [0132]: [0052] “Reinforcement learning network 110 receives input data (via a data collection unit) and generates output data for provision to trade entities 150a, 150b. Reinforcement learning network 110 can refer to a neural network that implements reinforcement learning.” And [0132] “In other words, in some embodiments, the processor advances the clock based on a timing critical path for the reinforcement learning agent(s), the matching engine, and the resource generating agent to complete their respective tasks for a time interval.”)

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over the references applied to claims 1-4 and 10, and further in view of Fong (PG PUB US 2019/0197244 A1).
Regarding claim 8
Burhani teaches:
The order execution server of Claim 2, wherein the critic includes: a first critic configured to calculate a reward based on a result of performing order execution according to the order execution strategy; and (See at least Burhani [0056] “A reward system 126 integrates with the reinforcement learning network 110, to control what constitutes good and bad results within the environment (e.g. the simulation environment generated by matching engine 114). In some embodiments, the reward system 126 can implement a process in which reward data is normalized and converted into the reward that is fed into models of reinforcement learning networks 110.”)

However, Burhani does not specifically teach “a second critic configured to update the calculated reward by a reward average method.”

However, Fong teaches:

At least at Fong [0042] Embodiments described herein generally relate to using an Asynchronous Advantage Actor-Critic (A3C) reinforcement learning algorithm to help build a payload based on how a system reacts to other synthetically created payloads.


It would have been obvious to a person of ordinary skill in the art before the effective filing date to combine the Trade platform with reinforcement learning network and machine engine of Burhani and Singh with the Reinforcement based system of Fong since the claimed invention is merely a combination of old elements and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable. Further motivation is provided because “time and computing resources may be saved, since the machine learning agent iteratively updates the machine learning model to generate action data based on incremental feedback of the system and computer application.” (Fong [0040]). Therefore, Claim 8 is obvious over the disclosure of Burhani, in view of Singh and Fong.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over the references applied to claim 8, and further in view of Wang et al. (PG PUB US 2019/0042761 A1).

Regarding claim 9 

Burhani does not specifically teach “The order execution server of Claim 8, wherein the model generation unit is configured to train the first critic to calculate a reward based on a result of performing order execution according to the order execution strategy, and the model generation unit is configured to train the second critic to update the calculated reward by a reward average method.”

However Wang teaches:

At least at [0030] The critic agent 147 determines a prediction of a future reward based on the observation and reward from the processing environment 111. More specifically, the critic agent is a value function that measures how good each state or state-action pair is. The goal of the critic agent 147 is to find a policy that maximizes the total accumulated reward, also called the return. By following a given policy and processing the rewards, the critic agent 147 can build estimates of the return. In the training stage, the critic agent 147 may use temporal-difference (TD) learning to improve itself and the actor agent 108 performs an action. The critic agent 147 accesses how good the action and environment state to compute a gradient for training the actor agent 108, for example.
It would have been obvious to a person of ordinary skill in the art before the effective filing date to combine the Trade platform with reinforcement learning network and machine engine of Burhani, Singh and Fong with the Techniques to detect perturbation attacks with an actor-critic framework of Wang since the claimed invention is merely a combination of old elements and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over the references applied to claims 1-4 and 10, and further in view of Yu (PG PUB US 2019/0114711 A1).

Regarding claim 11 


However, Burhani does not teach “The order execution server of Claim 1, wherein the subsidiary prediction value is a volume curve.”
However, Yu teaches, at least at [0058]: The financial analysis system for unstructured text data provided by the present disclosure is easy to operate. After inputting a key word, the user only needs to picks up one of indexes (e.g., a node of the stock volume curve, a node of the stock price cure or a k-line) at a time point shown in the display image of the user interface, the financial analysis system for unstructured text data provided by the present disclosure can generate many kinds of analysis results which are structured data, such as the exposure factor, the optimistic factor, the encouraging factor, the positive article number, the negative article number and the word cloud.

It would have been obvious to a person of ordinary skill in the art before the effective filing date to combine the Trade platform with reinforcement learning network and machine engine of Burhani and Singh with the financial analysis system for unstructured text data of Yu since the claimed invention is merely a combination of old elements and in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable. Further motivation is provided because “By using the financial analysis system and method for unstructured text data provided by the present disclosure, unstructured text data, such as daily news in different industries, can be converted into many kinds of analysis results which are represented as structured data. In this manner, the trend of the stock market, such as stock volume, stock index . . . etc., can be more effectively predicted.” (Yu [0008]). Therefore, Claim 11 is obvious over the disclosure of Burhani, in view of Singh and Yu.


Prior art Rejection
After search and consideration, no prior art rejection is made regarding claims 5-7 at this time.




Double Patenting
Claims 1-12 of this application are patentably indistinct from claims 1-13 of Application No. 16/698,094. Pursuant to 37 CFR 1.78(f), when two or more applications filed by the same applicant or assignee contain patentably indistinct claims, elimination of such claims from all but one application may be required in the absence of good and sufficient reason for their retention during pendency in more than one application. Applicant is required to either cancel the patentably indistinct claims from all but one application or maintain a clear line of demarcation between the applications. See MPEP § 822.


The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-12 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-13 of copending Application No. 16/698,094. Although the claims at issue are not identical, they are not patentably distinct from each other.
Regarding claims 1 and 12
An order execution server for stock trading, comprising: a data collection unit configured to collect trading data on at least one item;  (See 16/698,094 claim 1 “An order execution server for stock trading, comprising: a data collection unit configured to collect trading data on at least one item;”)
a subsidiary prediction value generation unit configured to generate a subsidiary prediction value by inputting the trading data into a pre-trained first deep learning model based on supervised learning; (See 16/698,094 Claims 2 and 3: Claim 2 “The order execution server of Claim 1, wherein the model generation unit is configured to generate a supervised learning-based deep learning model that derives a subsidiary prediction value by inputting the trading data.” And Claim 3 “The order execution server of Claim 2, further comprising: a subsidiary prediction value generation unit configured to generate a subsidiary prediction value by inputting the trading data into the supervised learning-based deep learning model.”)
an order execution strategy deriving unit configured to derive an order execution strategy for the at least one item during a current period of time based on the trading data and the subsidiary prediction value by using a pre-trained second deep learning model based on reinforcement learning; and (See 16/698,094  Claim 4 “an order execution strategy deriving unit configured to derive an order execution strategy for the at least one item during the current period of time based on the trading data and the subsidiary prediction value by using the reinforcement learning-based deep learning model; and”)
an order execution instruction unit configured to instruct order execution for the at least one item during the current period of time by using order information including the order execution strategy.  (See 16/698,094  Claim 4 “an order execution instruction unit configured to instruct order execution for the at least one item during the current period of time by transmitting the order information including the order execution strategy to a stock trading management server.”) 

Regarding Claim 2
The order execution server of Claim 1, further comprising: a model generation unit configured to generate the second deep learning model, wherein the second deep learning model includes two or more actors which are neural networks that determine an action policy of a reinforcement learning agent and a critic which is a neural network that estimates an action value of the reinforcement learning agent.   (See 16/698,094  claim 1 “a model generation unit configured to generate a reinforcement learning-based deep learning model including two or more actors which are neural networks that determine an action policy of a reinforcement learning agent and a critic which is a neural network that estimates an action value of the reinforcement learning agent and train the reinforcement learning-based deep learning model to derive an order execution strategy for the at least one item based on the trading data;”)

Regarding Claim 3

The order execution server of Claim 2, wherein the model generation unit is configured to train the two or more actors to improve a reward for the order execution strategy based on the trading data in a reinforcement learning environment. (See 16/698,094 claim 5 “The order execution server of Claim 3, wherein the model generation unit is configured to train the two or more actors to improve a reward for the order execution strategy based on the trading data in a reinforcement learning environment.”)

Regarding Claim 4

The order execution server of Claim 2, wherein the two or more actors include: a first actor configured to determine an order volume of the at least one item; and a second actor configured to determine an order cancellation volume of the at least one item. (See 16/698,094  Claim 6 “The order execution server of Claim 2, wherein the two or more actors include: a first actor configured to determine an order volume of the at least one item; and a second actor configured to determine an order cancellation volume of the at least one item.”)

Regarding claim 5

The order execution server of Claim 4, wherein the two or more actors include a third actor configured to determine a final order volume of the at least one item based on the order volume determined by the first actor and the order cancellation volume determined by the second actor.  (See 16/698,094  Claim 7 “The order execution server of Claim 6, wherein the two or more actors include a third actor configured to determine a final order volume of the at least one item based on the order volume determined by the first actor and the order cancellation volume determined by the second actor.”)

Regarding claim 6

The order execution server of Claim 5, wherein the model generation unit is configured to train the first actor to determine an order volume of the at least one item for the current period of time based on the subsidiary prediction value, and the model generation unit is configured to train the second actor to determine an order cancellation volume of the at least one item for the current period of time based on the subsidiary prediction value.   (See 16/698,094  Claim 8 “wherein the model generation unit is configured to train the first actor to determine an order volume of the at least one item for the current period of time based on the trading data, and the model generation unit is configured to train the second actor to determine an order cancellation volume of the at least one item for the current period of time based on the trading data.”)


Regarding claim 7
The order execution server of Claim 5, wherein the model generation unit is configured to train the third actor to determine a final order volume based on the order volume determined by the first actor and the order cancellation volume determined by the second actor. (See 16/698,094 Claim 9 “The order execution server of Claim 8, wherein the model generation unit is configured to train the third actor to determine a final order volume based on the order volume determined by the first actor and the order cancellation volume determined by the second actor.”)

Regarding Claim 8 
The order execution server of Claim 2, wherein the critic includes: a first critic configured to calculate a reward based on a result of performing order execution according to the order execution strategy; and a second critic configured to update the calculated reward by a reward average method.  (See 16/698,094  Claim 10 “The order execution server of Claim 1, wherein the critic includes: a first critic configured to calculate a reward based on a result of performing order execution according to the order execution strategy; and a second critic configured to update the calculated reward by a reward average method.”)

Regarding claim 9 
The order execution server of Claim 8, wherein the model generation unit is configured to train the first critic to calculate a reward based on a result of performing order execution according to the order execution strategy, and the model generation unit is configured to train the second critic to update the calculated reward by a reward average method. (See 16/698,094  Claim 11 “The order execution server of Claim 10, wherein the model generation unit is configured to train the first critic to calculate a reward based on a result of performing order execution according to the order execution strategy, and the model generation unit is configured to train the second critic to update the calculated reward by a reward average method.”)

Regarding claim 10 
The order execution server of Claim 1, wherein the order execution instruction unit is configured to instruct order execution for the at least one item during the current period of time by transmitting the order information including the order execution strategy to a stock trading management server.   (See 16/698,094  Claim 4 “an order execution instruction unit configured to instruct order execution for the at least one item during the current period of time by transmitting the order information including the order execution strategy to a stock trading management server.”)
Regarding claim 11 
The order execution server of Claim 1, wherein the subsidiary prediction value is a volume curve.   (See 16/698,094  Claim 12 “The order execution server of Claim 3, wherein the subsidiary prediction value is a volume curve.”)

This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GREGORY MARK JAMES whose telephone number is (571) 272-5155. The examiner can normally be reached on M-F 8:30am - 5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Shahid Merchant, can be reached on (571) 270-1360. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/GREGORY M JAMES/Examiner, Art Unit 3693
/KENNETH BARTLEY/Primary Examiner, Art Unit 3693