DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 6/30/2022 has been entered.
This Non-Final rejection is in response to the amendment filed on 6/30/2022. Claims 1, 3-13, 15-17, and 21-25 are pending. Claims 1, 7, 13, and 15 are currently amended. Claims 1 and 13 are independent claims. Claims 2, 14, and 18-20 are cancelled. Claims 21-25 are new and Claim 21 is an independent claim.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 3-13, 15-17, and 21-25 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (IDS entry, NPL, Chen et al. “DeepRMSA: A Deep Reinforcement Learning Framework for Routing, Modulation and Spectrum Assignment in Elastic Optical Networks”, Journal article dated 5/19/2019, hereinafter Chen) in view of Li et al. (US 2021/0406147 A1, hereinafter Li, EFD 6/27/2020).
Regarding claim 1, Chen teaches a non-transitory computer-readable medium having instructions stored thereon for programming one or more processors to perform steps of:
obtaining a network state of a network having a plurality of nodes interconnected by a plurality of links and with services configured and operating between the plurality of nodes on the plurality of links, [Figure 1 shows DeepRMSA architecture, nodes, applications, DRL agents, SDN agents; Page 2, Section III DeepRMSA Framework, lines 28-36 describes SDN agents to collect network states and lightpath requests, topology derived is shown in Figure 1 with nodes and links, on Page 3, Column 1, lines 7-10 describe deploying multiple parallel DRL agents, each for a particular application or functionality (services); see Abstract and Introduction for context];
utilizing a reinforcement learning engine to analyze the services and the network state to determine modifications to one or more candidate services of the services to increase a value of the network state, [Page 2, Section III DeepRMSA Framework, lines 28-53 describe seven steps using reinforcement learning framework and modifying states based on actions adopted to increase rewards; Figure 1 shows (state, action) pairs implemented using DRL agents via SDN controller and SDN agents; see Abstract and Introduction for context]; 
wherein the modifications include changes to any of routing, modulation, and spectral assignment to the one or more candidate services, [Title and Abstract, and Page 2, Column 2, lines 1-4 describe provisioning a routing path with proper modulation format and allocating a number of spectrally contiguous FS’s on each link in the routing path]; and
responsive to implementation of the modification to the one or more candidate services, updating the network state based thereon, [Page 2, Section III DeepRMSA Framework, lines 28-53 and in particular on line 38 for step 1 where the SDN controller retrieves from the traffic engineering database key network state representations, including the in-service light paths, resource utilization and topology abstraction and Figure 1 describes (state, action) pairs and actions implemented update the network state for the next iteration; see Abstract and Introduction for context];
Chen does not explicitly teach wherein the reinforcement learning engine considers cost for disrupting the one or more candidate services;
Li teaches wherein the reinforcement learning engine considers cost for disrupting the one or more candidate services, [Par.[0296] describes reinforcement learning engine using rewards and penalties(cost) and Par.[0307] describes a scenario where assigning resource allocation with best effort workloads would involve penalties (disruption cost to other services) for priority workloads based on performance metrics];
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen to include cost functions in RL optimization. The motivation/suggestion would have been to implement a learning environment using a penalty/reward framework for choosing the optimal policies that are beneficial, [Li: Par.[0296]].
Method claim 13 is a corresponding claim to claim 1 and is rejected as above.
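For illustration only, and not as a characterization of the actual implementation in either reference, the combination relied upon above (a reward for increasing the value of the network state, reduced by a penalty for disrupting a candidate service) can be sketched as follows. This is a minimal sketch of reward shaping with a greedy selection step, not a reinforcement learning algorithm. The names CandidateModification, value_gain, disruption_cost, penalty_weight, net_reward, and select_modification are hypothetical and do not appear in Chen or Li.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CandidateModification:
    """Hypothetical record of one proposed change to an in-service lightpath."""
    service_id: str
    value_gain: float       # assumed increase in the value of the network state
    disruption_cost: float  # assumed penalty for interrupting the service


def net_reward(mod: CandidateModification, penalty_weight: float = 1.0) -> float:
    """Reward net of the disruption penalty (the assumed cost term)."""
    return mod.value_gain - penalty_weight * mod.disruption_cost


def select_modification(
    candidates: Iterable[CandidateModification],
    penalty_weight: float = 1.0,
) -> Optional[CandidateModification]:
    """Return the candidate with the highest positive net reward, else None."""
    best = max(
        candidates,
        key=lambda m: net_reward(m, penalty_weight),
        default=None,
    )
    if best is None or net_reward(best, penalty_weight) <= 0:
        return None
    return best
```

Under these assumptions, a modification with a large raw gain can lose to a smaller one once the cost of disrupting the affected service is subtracted, and no modification is selected when every candidate's penalty exceeds its gain.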
Regarding claim 21, Chen teaches a non-transitory computer-readable medium having instructions stored thereon for programming one or more processors to perform steps of:
obtaining a network state of a network having a plurality of nodes interconnected by a plurality of links and with services configured and operating between the plurality of nodes on the plurality of links, [Figure 1 shows DeepRMSA architecture, nodes, applications, DRL agents, SDN agents; Page 2, Section III DeepRMSA Framework, lines 28-36 describes SDN agents to collect network states and lightpath requests, topology derived is shown in Figure 1 with nodes and links, on Page 3, Column 1, lines 7-10 describe deploying multiple parallel DRL agents, each for a particular application or functionality (services); see Abstract and Introduction for context];
utilizing a reinforcement learning engine to analyze the services and the network state to determine modifications to one or more candidate services of the services to increase a value of the network state, [Page 2, Section III DeepRMSA Framework, lines 28-53 describe seven steps using reinforcement learning framework and modifying states based on actions adopted to increase rewards; Figure 1 shows (state, action) pairs implemented using DRL agents via SDN controller and SDN agents; see Abstract and Introduction for context]; 
wherein the modifications include any of adding physical hardware to the network, migrating the one or more candidate services to use higher capacity modems, and changing provisioning of physical hardware, [RMSA scheme in Chen provides for flexible resource allocation mechanisms in EON which means that resources are allocated as needed, (adding/migrating/changing), Page 1, Column 2, lines 1-45]; and
responsive to implementation of the modification to the one or more candidate services, updating the network state based thereon, [Page 2, Section III DeepRMSA Framework, lines 28-53 and in particular on line 38 for step 1 where the SDN controller retrieves from the traffic engineering database key network state representations, including the in-service light paths, resource utilization and topology abstraction and Figure 1 describes (state, action) pairs and actions implemented update the network state for the next iteration; see Abstract and Introduction for context];
Chen does not explicitly teach wherein the reinforcement learning engine considers cost for disrupting the one or more candidate services;
Li teaches wherein the reinforcement learning engine considers cost for disrupting the one or more candidate services, [Par.[0296] describes reinforcement learning engine using rewards and penalties(cost) and Par.[0307] describes a scenario where assigning resource allocation with best effort workloads would involve penalties (disruption cost to other services) for priority workloads based on performance metrics];
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chen to include cost functions in RL optimization. The motivation/suggestion would have been to implement a learning environment using a penalty/reward framework for choosing the optimal policies that are beneficial, [Li: Par.[0296]].
Regarding claim 3, Chen and Li teach the non-transitory computer-readable medium of claim 1, and Chen teaches wherein the modifications include any of adding physical hardware to the network, migrating the one or more candidate services to use higher capacity modems, and changing provisioning of physical hardware, [RMSA scheme in Chen provides for flexible resource allocation mechanisms in EON which means that resources are allocated as needed, (adding/migrating/changing), Page 1, Column 2, lines 1-45]. 
Regarding claim 4, Chen and Li teach the non-transitory computer-readable medium of claim 1, and Chen teaches wherein the reinforcement learning engine is configured to evaluate the network state and provide the modifications to one or more candidate services each providing some increase in the value of the network state, [Page 2, Section III DeepRMSA Framework, lines 28-53, in particular step 3 where the DNNs of Deep RMSA read the state data and output an RMSA policy (modification) for the SDN controller with an associated reward].
Regarding claim 5, Chen and Li teach the non-transitory computer-readable medium of claim 1, and Chen teaches wherein the network state includes signals sensitive to any of topology of the network, link utilization, link spectral fragmentation, link participation in earlier blocking events, cost to increase link optical bandwidth, link contribution to latency, link optical path length, link path redundancy, customer supplied value, and value returned by a value function, [Page 2, Section III DeepRMSA Framework, lines 28-53, in particular network state representations, including the in-service lightpaths, resource utilization (link or spectrum utilization), topology abstraction].
Regarding claim 6, Chen and Li teach the non-transitory computer-readable medium of claim 1, and Chen teaches wherein the value of the network state is quantified by values for the services based on any of source node, destination node, links which a corresponding service crosses, path length relative to a shortest path in the absence of spectral contention, difficulty to route, latency, cost of disrupting the corresponding service, and customer value, [Page 2, Section III DeepRMSA Framework, lines 28-53, in particular step 6 that refers to a reward quantifying a value r for the (state, action) pair, see also Page 3, Column 2]. 
Regarding claim 7, Chen and Li teach the non-transitory computer-readable medium of claim 1, and Chen teaches wherein the reinforcement learning engine includes a determination of a reward after each action that includes the modifications, wherein the reward is utilized to determine the value of the network state, and wherein the reward is determined from any of fragmentation, survivability, latency, capacity, and output of a customer supplied value function, [Page 2, Section III DeepRMSA Framework, lines 28-53, in particular step 6 that refers to a reward quantifying a value r for the (state, action) pair, see also Page 3, Column 2]. 
Regarding claim 8, Chen and Li teach the non-transitory computer-readable medium of claim 1, and Chen teaches wherein the steps further include training the reinforcement learning engine for estimating a cumulative reward with respect to the value of the network state for each of the modifications, [Page 2, Section III DeepRMSA Framework, lines 28-53, in particular step 7 which indicates using training signals from previous RMSA operations for the (state, action) pair with the reward r, and continued on Page 3, Column 1, after servicing lightpath request R, maximizing long-term cumulative reward through self-learning]. 
Regarding claim 9, Chen and Li teach the non-transitory computer-readable medium of claim 8, and Chen teaches wherein the estimating is based on any of a parameterized deep neural network, a parameterized function, and a lookup table, [Page 3, Col. 2, under A. Modeling section, DNNs are described used in the modeling; see also Abstract].
Regarding claim 10, Chen and Li teach the non-transitory computer-readable medium of claim 8, and Chen teaches wherein the estimating is determined through one or more of simulation of events on the network and analyzing historical network data, [Page 6, Col. 1 describes offline training with an RMSA simulator before incorporating it in online lightpath provisioning for fine tuning and by definition, an offline simulator is using historical network data]. 
Regarding claim 11, Chen and Li teach the non-transitory computer-readable medium of claim 1, and Chen teaches wherein the implementation is based on an opportunity in the network, [it is unclear what this limitation is referring to; Page 2, Section III DeepRMSA Framework, lines 28-53, in particular step 1 of receiving a lightpath request R which presents an ‘opportunity’].
Regarding claim 12, Chen and Li teach the non-transitory computer-readable medium of claim 1, and Chen teaches wherein the services include any of optical channels and Time Division Multiplexed (TDM) channels, [Abstract indicates elastic optical networks and lightpath request indicates optical channels]. 
Claim 15 corresponds to claim 7 and is rejected as above.
Claim 16 corresponds to claim 8 and is rejected as above.
Claim 17 corresponds to claim 11 and is rejected as above.
Claim 22 corresponds to claim 4 and is rejected as above.
Claim 23 corresponds to claim 5 and is rejected as above.
Claim 24 corresponds to claim 6 and is rejected as above.
Claim 25 corresponds to claim 7 and is rejected as above.
Response to Arguments
Applicant’s arguments filed on 6/30/2022 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Upon further consideration, Examiner agrees with the Applicant that the previous reference Ciena2 raises a 102(b)(2)(C) exception. This reference is replaced with Li in the current rejection. Claims 2 and 3 were previously rejected using Chen, and that rejection stands with respect to the current independent claims 1, 13, and 21, which incorporate those dependent claims (previous claim 2 into current claims 1 and 13, and previous claim 3 into new claim 21).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PADMA MUNDUR whose telephone number is (571)272-5383. The examiner can normally be reached 9:30 AM to 6:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Wing Chan, can be reached at 571-272-7493. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/PADMA MUNDUR/Primary Examiner, Art Unit 2441