Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to the application filed November 19, 2019.  Claims 1-9 are pending.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-5 and 7-9 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Chu et al., United States Patent Application Publication No. 2020/0099733.

As per claim 1, Chu teaches a system for training adaptive real-time streaming using deep reinforcement learning (DRL), comprising:
one or more agents, one or more environment units, and one or more deep reinforcement learning networks [ABR controllers (pp 0023); user device (pp 0024); ML model and training a learning agent (pp 0025); DRL (pp 0065)],
wherein each agent takes an action towards said one or more environment units at time t, the action including transmitting video data at a bitrate [stream 
each agent receives one or more network states from said one or more environment units, said network states including one or more network quality of service (QoS) factors and one or more playback statuses [network data (pp 0036); each ABR controller associated with specific network information (pp 0060)];
each agent takes another action at time t+1 based on a reward received from said one or more environment units [reward determined (pp 0048); inputs to ABR controller include reward (pp 0060); action space and feedback space including reward (pp 0066)]; and
wherein said one or more environment units receive the action from each agent, provide said network states to each agent, and provide said reward to each agent [ABR weighted based on reward function (pp 0051); determine bitrate based on network information (pp 0061)];
said one or more environment units determining said reward by balancing multiple network quality of experience (QoE) requirements [reward includes QoE and the reward is normalized (pp 0048)].  
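
For illustration only, the agent/environment interaction recited in claim 1 can be sketched as a simple loop. This sketch is not taken from the application or from Chu; the bitrate ladder, the toy throughput model, the placeholder reward, and the epsilon-greedy value table (a stand-in for a deep reinforcement learning network) are all assumptions chosen to keep the example short.

```python
import random

BITRATES_KBPS = [300, 750, 1200, 2400]  # hypothetical bitrate ladder


class Environment:
    """Toy environment unit: receives an action, returns network state and reward."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def step(self, bitrate_kbps):
        # Assumed QoS factor: a random throughput sample for this time step.
        throughput = self.rng.uniform(200.0, 3000.0)
        # Assumed playback status: frame rate drops when bitrate exceeds throughput.
        frame_rate = 30.0 * min(1.0, throughput / bitrate_kbps)
        state = {"throughput_kbps": throughput, "frame_rate": frame_rate}
        # Placeholder reward: bitrate utility minus a penalty for lost frames.
        reward = bitrate_kbps / 1000.0 - (30.0 - frame_rate) / 10.0
        return state, reward


class Agent:
    """Toy agent: an epsilon-greedy value table standing in for a DRL network."""

    def __init__(self, seed=0, epsilon=0.2):
        self.rng = random.Random(seed)
        self.epsilon = epsilon
        self.value = {b: 0.0 for b in BITRATES_KBPS}
        self.count = {b: 0 for b in BITRATES_KBPS}

    def act(self):
        # Explore occasionally, otherwise pick the bitrate with the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(BITRATES_KBPS)
        return max(BITRATES_KBPS, key=lambda b: self.value[b])

    def learn(self, action, reward):
        # Incremental mean of the rewards observed for this action.
        self.count[action] += 1
        self.value[action] += (reward - self.value[action]) / self.count[action]


def run(steps=10, seed=0):
    env, agent = Environment(seed), Agent(seed)
    history = []
    for t in range(steps):
        action = agent.act()               # action at time t: transmit at a bitrate
        state, reward = env.step(action)   # environment returns state and reward
        agent.learn(action, reward)        # reward informs the action at time t+1
        history.append((t, action, state, reward))
    return history
```

The toy agent records the returned network state but, for brevity, conditions its next action only on the reward; a DRL agent would typically feed the state into its network as well.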

As per claim 2, Chu teaches the system of Claim 1, wherein said deep reinforcement learning networks are deployed in said one or more agents to receive said network states, make determinations on said actions and update said one or more agents' networks [deep learning is deployed in agent (pp 0009, 0025, 0065)].



As per claim 4, Chu teaches the system of Claim 1, wherein said playback statuses comprise a received frame rate, a maximum received frame interval, and a minimum received frame interval [playback states (pp 0033-0034); player simulator (pp 0053)].  

As per claim 5, Chu teaches the system of Claim 1, wherein said multiple QoE requirements include maximizing the video quality by utilizing highest average bitrate, minimizing video freezing events, maintaining the video quality smoothness, and minimizing the video latency [QoE indicators (pp 0033, 0048, 0066-0067)].

As per claim 7, Chu teaches the system of Claim 1, wherein the action is taken at a frequency to enable fast reaction to a change in said network states, including one action per second or one action per group of picture [network data includes bandwidth per time or other network information (pp 0036)].   

As per claim 8, Chu teaches the system of Claim 1, wherein the one or more agents comprise one or more regular agents and one or more central agents, wherein the central agent receives information from one or more regular agents, computes one or more network parameters based on the information, and passes said network parameters to said one or more 
  
As per claim 9, Chu teaches the system of Claim 1, wherein a simulation is constructed to provide network states to train the deep reinforcement learning networks offline [platform includes simulator which can include offline real-world data conditions (pp 0046, 0052)].


Allowable Subject Matter
Claim 6 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: Claim 6 is deemed to be directed to a nonobvious improvement over the teaching in United States Patent Application Publication No. 2020/0099733. The claim recites a reward calculated by subtracting a freezing penalty, a smoothing penalty and a latency penalty from a bitrate utility.
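
For illustration only, the reward structure described above can be written as a one-line computation. The function and parameter names are hypothetical, and how each term is measured or weighted is not stated in this action, so the sketch simply takes the four quantities as given.

```python
def reward(bitrate_utility, freezing_penalty, smoothing_penalty, latency_penalty):
    """Hypothetical reward: bitrate utility minus the three penalties."""
    return bitrate_utility - freezing_penalty - smoothing_penalty - latency_penalty
```

For example, a bitrate utility of 5.0 with penalties of 1.0, 0.5 and 0.25 yields a reward of 3.25.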

Conclusion
The prior art made of record and not relied upon, which is considered pertinent to applicant's disclosure, is noted in PTO-892.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ario Etienne, can be reached on 571-272-4001. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/UZMA ALAM/             Primary Examiner, Art Unit 2457