DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Examiner notes the entry of the following papers:
Amended claims filed 5/24/2022.
Applicant arguments/remarks made in amendment filed 5/24/2022.

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 5/24/2022 has been entered.
 
Claims 1, 4, 5, 10, 13, 14, 17, and 20 are amended.
Claims 1-20 are presented for examination.
Response to Arguments
Applicant’s arguments that the prior art of record does not disclose the amended limitations are moot in view of the new grounds of rejection necessitated by the amendments.  See the detailed rejection below.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 6-8, 10, and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Vu et al (A Deep Learning Based Method for Handling Imbalanced Problem in Network Traffic Classification, herein Vu), Morris et al (Industrial Control System Traffic Data Sets for Intrusion Detection Research, herein Morris), Khan et al (QoE Prediction Model and its Application in Video Quality Adaptation Over UMTS Networks, herein Khan), and Sarigiannidis et al (DIANA: A Machine Learning Mechanism for Adjusting the TDD Uplink-Downlink Configuration in XG-PON-LTE Systems, herein Sarigiannidis).
Regarding claim 1, 
	Vu teaches a method comprising (Vu, page 333, column 1, paragraph 1, line 9 “We used a recent proposed deep network for unsupervised learning called Auxiliary Classifier Generative Adversarial Network to generate synthesized data samples for balancing between the minor and the major classes.  We tested our method on a well-known network traffic dataset and the results showed that our proposed method achieved better performance compared to a recent proposed method for handling imbalanced problem in network traffic classification.” In other words, method is method.):
collecting, at a network device, a training dataset representing one or more states of the network device deployed in a network, the training dataset including at least a conditional class [that represents an operational state of the network device] (Vu, Figure 1, and page 336, column 1, paragraph 1, line 4 “This is the traffic dataset collected from the internal network with many applications of Dalhousie University Computing and Information Services Centre (UCIS) in 2007 on the campus network between the university and the Internet.  In NIMS dataset, both SSH traffic and non-SSH traffic are generated from the applications. There are six SSH services as Shell login; X11; Local tunneling; Remote tunneling; SCP; and SFTP.  Rest of applications are non-SSH traffic including DNS, HTTP, FTP, P2P (limewire), and telnet.  In this dataset, SSH traffic is generated by SSH connections from client computers to four SSH servers outside of Dalhousi network.” And page 336, column 1, paragraph 3, line 1 “NIMS dataset groups packets into flows based on the statistical features.” And, page 335, column 1, paragraph 2, line 6 “Another different property is the output of Discriminator D including a probability distribution over sources LS (Equation 2) and over classes labels LC (Equation 3).” 

[Image: media_image1.png]

In other words, internal network is network device, traffic dataset is training dataset, packets and statistical features are states of the network device, and from equation (3), LC, which is the likelihood of the correct conditional class, is conditional class.);
	training, by the network device and based on the training dataset, a first model that generates one or more fabricated attribute sets of artificial network traffic through the network device, wherein each fabricated attribute set is associated with a fabricated network packet (Vu, Figure 1, Figure 3, and,  page 334, column 2, paragraph 4, line 1 “Generative Adversarial Network (GAN) was proposed by GoodFellow et al. in [10] for unsupervised learning.  A GAN has two neuron networks which are trained in an opposition way.  The first neuron network is a Generator (G) and the second neuron network is a Discriminator (D).  The main idea behind GAN is to have two competing neural network models.  The generator takes noise as input and generates samples.  The discriminator receives samples from both the generator and the training data and attempt to distinguish between the two sources.  These two networks play a continuous game, where the generator is learning to produce more and more realistic samples, and the discriminator is learning to get better and better at distinguishing the generated data from the real data.  The two networks are trained simultaneously, and hope that the competition will drive the generated samples to be indistinguishable from the real data.” And, page 335, column 1, paragraph 2, line 6 “Another different property is the output of Discriminator D including a probability distribution over sources LS (Equation 2) and over class labels LC (Equation 3).” And, page 336, column 1, paragraph 1, line 1 “In order to test the effectiveness of the proposed method we used a well-known network traffic dataset: network Information Management and Security Group (NIMS) dataset [6].” And page, 336, column 1, paragraph 3, line 1 “NIMS dataset groups packets into flows based on the statistical features.  
Traffic flows are defined by the sequence of packets that have same five tuples as source IP address, destination IP address, source port, destination port, and protocol type [11].  Each flow is described by 22 statistical features [6] show in Table 1.” And, page 336, column 1, paragraph 4, line 1 “We divided NIMS into two parts: a half of samples for training set and rest for testing set described in Table 2. We used the training set to train AC-GAN.  Then, the generator (After training) is used to generate new synthesized samples.”  And, specification of the instant application, paragraph [0014], line 13 “More specifically, the system 100 may observe or collect characteristics or attributes (e.g., loss per packet, delay per packet, etc.) of actual network packets that travel through a network device in order to formulate the training dataset 130.”
 
[Image: media_image2.png]
[Image: media_image3.png]

The GAN is attempting to mimic the “attributes” from the training data by producing “samples”.  In other words, the training process is the training, the NIMS dataset is the training dataset, the GAN (generator and discriminator combined) is the first model, generated is generates, the five tuples are the attribute set, a flow defined by packets that have the same five tuples is the attribute set associated with a network packet, and the synthesized samples are the one or more fabricated attribute sets.);
	[training, by the network device and based on the training dataset, a second model that is modeled after an application program, of a plurality of application programs, of a client device that is connected to the network device and is communicating traffic via the network device] 
	[wherein the second model generates a predictive experience metric distribution that represents a predicted performance of the application program,] 
	[wherein the predictive experience metric distribution is unique for the application program;]
	generating the one or more fabricated attribute sets based on the training of the first model; (Vu, page 335, Figure 1, and page 334, column 2, paragraph 5, line 1 “The input of the generator (G) is a vector of random noise z and it outputs a synthesized sample Xfake = G(z).  Network D takes the input of a real data sample or a synthesized sample from the generator and the output is a probability distribution P(S|X) = D(X) over possible sources.” And, specification of the instant application, paragraph [0024], line 4 “…each network packet attribute set is associated with a distribution of predictive experience metrics 150, one predictive experience metric 150 for each replay of an audio content set.”

[Image: media_image4.png]

In other words, generator generates, and synthesized sample is fabricated attribute set.)
	[generating the predictive experience metric distribution for each fabricated attribute set of the one or more fabricated attribute sets based on the training of the second model; and]
	[altering, by the network device, one or more configurations of the network based on the predictive experience metric distribution.]
	Thus far, Vu does not explicitly teach the training dataset … that represents an operational state of the network device.
	Morris teaches the training dataset … that represents an operational state of the network device (Morris, page 68, paragraph 2, line 1 “The data sets presented in this paper include network traffic, process control and process measurement features from normal operations and attack against the two SCADA systems.” And, page 67, paragraph 2, line 2 “Much of the research uses training and validation data sets created by the same researchers who developed the intrusion detection systems.  Indeed, no standardized data set counting the normal SCADA network traffic and attack traffic is currently available to researchers.  In order to evaluate the performance of data mining and machine learning algorithms for SCADA intrusion detection systems, a network data set used for benchmarking intrusion detection system performance is sorely needed.  This paper presents four data sets, which include network traffic, process control and process measurement features from a set of 28 attacks against two laboratory-scale industrial control systems that use the MODBUS application layer protocol.” In other words, data sets are training data sets, and process control of a SCADA is operational state of the network device. )
Both Morris and Vu are directed to monitoring networks.  Vu teaches a training dataset including at least a conditional class but does not teach a training dataset … that represents an operational state of the network device.  Morris teaches a training dataset … that represents an operational state of the network device.  In view of the teaching of Vu, it would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Morris into Vu.  This would result in the training dataset including at least a conditional class that represents an operational state of the network device.
One of ordinary skill in the art would be motivated to do this to provide more verifiable assurance and validation in intrusion detection systems and network security. (Morris, page 1, paragraph 1, line 5 “Researchers primarily rely on unique threat models and the corresponding network traffic data set to train and validate their intrusion detection systems.  This leads to a situation in which researchers cannot independently verify the results, cannot compare the effectiveness of different intrusion detection systems, and cannot adequately validate the ability of intrusion detection systems to detect various classes of attacks.  Indeed, a common data set is needed that can be used by researchers to compare intrusion detection approaches and implementations.  This paper describes four data sets, which include network traffic, process control and process measurement features from a set of 28 attacks against two laboratory-scale industrial control systems that use the MODBUS application layer protocol.”)
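For illustration only, the flow definition quoted from Vu in the mapping of claim 1 (packets sharing the same five tuples of source IP address, destination IP address, source port, destination port, and protocol type) can be sketched as follows. The `Packet` record, its field names, and the two example statistics are hypothetical; they are not taken from Vu or the NIMS dataset and merely stand in for the 22 statistical features of Vu's Table 1.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    # Hypothetical packet record; field names are illustrative only.
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    size: int


def group_into_flows(packets):
    """Group packets that share the same five-tuple into one flow."""
    flows = defaultdict(list)
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.protocol)
        flows[key].append(p)
    return dict(flows)


def flow_features(flow):
    """Two simple per-flow statistics (assumed stand-ins for real features)."""
    sizes = [p.size for p in flow]
    return {"packet_count": len(sizes), "mean_size": sum(sizes) / len(sizes)}


packets = [
    Packet("10.0.0.1", "10.0.0.9", 5000, 22, "tcp", 100),
    Packet("10.0.0.1", "10.0.0.9", 5000, 22, "tcp", 300),
    Packet("10.0.0.2", "10.0.0.9", 5001, 53, "udp", 60),
]
flows = group_into_flows(packets)
ssh_features = flow_features(flows[("10.0.0.1", "10.0.0.9", 5000, 22, "tcp")])
```

Each key of `flows` is one five-tuple, and each value is the sequence of packets belonging to that flow.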
	Thus far, the combination of Vu and Morris does not explicitly teach training, by the network device and based on the training dataset, a second model that is modeled after an application program, of a plurality of application programs, of a client device that is connected to the network device and is communicating traffic via the network device. 
	Khan teaches training, [by the network device (see Vu Figure 1, and page 336, column 1, paragraph 1, line 4), page 4 of office action] and based on the training dataset,  a second model that is modeled after an application program, of a plurality of application programs, of a client device that is connected to the network device and is communicating traffic via the network device (Khan, page 431, column 1, paragraph 1, line 1 “The primary aim of this paper is to present a new content-based, non-intrusive quality of experience (QoE) prediction model for low bitrate and resolution (QCIF) H.264 encoded videos and to illustrate its application in video quality adaptation over Universal Mobile Telecommunication Systems (UMTS) networks.” And page 432, column 2, paragraph 4, line 3 “The test material comprises of six clips – three chosen for model training and three for validation.” And, page 432, column 1, paragraph 2, line 1 “Recent work on video quality assessment [20]-[22] has shown that video quality is affected by parameters associated with the encoder (e.g., sender bitrate) and the network (e.g., packet loss).” And, page 432, column 2, paragraph 2, line 2 “A new and efficient model to predict video quality over UMTS networks non-intrusively. The model uses a combination of parameters associated with the encoder and the UMTS access network for different types of content.” In other words, training is training, model is model, encoded videos is an application program, model for low bitrate and resolution…encoded videos is modeled after an application, encoder is network device, UMTS network is network and bitrate and packets is communicating traffic via the network device.),
Khan teaches wherein the second model generates a predictive experience metric distribution that represents a predicted performance of the application program (Khan, page 432, column 2, paragraph 4, line 1 “In this section, we present the development of the non-intrusive content-based QoE prediction model for low bitrate H.264 video for mobile streaming application. A new and efficient model to predict video quality over UMTS networks, non-intrusively.” And, Fig. 4, and page 434, column 1, paragraph 2, line 7 “Fig. 4 indicates that the MOS distribution is biased towards high MOS values.” And, page 431, paragraph 1, line 18 “The performance of the model was evaluated with unseen dataset with good prediction accuracy (~ 93%).”

[Image: media_image5.png]

In other words, the model to predict video quality is the second model that generates a predictive metric, the MOS distribution is the distribution, and the statement that the performance of the model was evaluated with an unseen dataset with good prediction accuracy (~ 93%) is the predicted performance of the application program.),
Khan teaches wherein the predictive experience metric distribution is unique for the application program (Khan, page 434, column 1, paragraph 3, line 1 “We analyzed the relationships of the four chosen parameters that impacts on QoE – sender bitrate, content type, block error rate and mean burst length on end-to-end video quality.” And, page 434, column 1, paragraph 1, line 1 “Table III shows the results of the analysis.  The fourth column shows the F statistic and the fifth column gives the p-value, which is derived from the cumulative distribution function (cdf) of F [28]. A small p-value (p<= 0.01) indicates that MOS is significantly affected by the corresponding parameter.”

[Image: media_image6.png]

In other words, the p-value is the predictive experience metric, the p-value derived from the cumulative distribution function is the predictive experience metric distribution, and Table III, which shows the p-value and the cdf derived from the application program, is the metric distribution being unique for the application program.);
	Khan teaches generating the predictive experience metric distribution for each [fabricated attribute set of the one or more fabricated attribute sets (see Vu, page 335, Figure 1, and page 334, column 2, paragraph 5, line 1) page 7 of office action] based on the training of the second model (Khan, Fig. 4, and page 434, column 1, paragraph 2, line 7 “Fig. 4 indicates that the MOS distribution is biased towards high MOS values.” And, page 432, column 2, paragraph 4, line 1 “In this section, we present the development of the non-intrusive content-based QoE prediction model for low bitrate H.264 video for mobile streaming application. A new and efficient model to predict video quality over UMTS networks, non-intrusively.” And page 432, column 2, paragraph 4, line 3 “The test material comprises of six clips – three chosen for model training and three for validation.” In other words, MOS distribution is predictive experience metric distribution, and model training is based on training the second model.  Examiner notes that Vu teaches fabricated sets, and once the second model is trained, it will generate predictive experience metric distributions regardless of the source of the dataset.); and
	Both Khan and the combination of Vu and Morris are directed to modeling network traffic for the purpose of improving quality of experience (QoE), among other things.  In view of the teaching of the combination of Vu and Morris, it would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Khan into the combination of Vu and Morris.  This would result in being able to model and generate the network traffic for an application QoE prediction model.
	One of ordinary skill in the art would be motivated to do this because the increased transmission of content over telecommunications networks has had a negative impact on the quality of experience, causing reduced usage and reduced revenue. (Khan, page 1, column 1, paragraph 1, line 1 “Transmission of video content over Universal Mobile Telecommunication Systems (UMTS) networks is growing exponentially and gaining popularity. Digital videos are now available everywhere – from handheld devices to  personal computers.   However, due to the bandwidth constraints of UMTS networks quality of experience (QoE) still remains of concern.  This is because low video quality leads to poor QoE which in turn leads to reduced usage of the applications/services and hence reduced revenues.”)


	Thus far, the combination of Vu, Morris, and Khan does not explicitly teach altering, by the network device, one or more configurations of the network based on the predictive experience metric distribution.  
	Sarigiannidis teaches altering, by the network device, one or more configurations of the network based on the predictive experience metric distribution (Sarigiannidis, Figure 3, and page 2, column 1, paragraph 3, line 11 “This concept implies that both devices are connected independently, for example, by using a single cable.  Hence, the bridging of the two domains (optical and wireless) is simple and flexible, needing no additional hardware. What this paper proposes is a novel machine learning mechanism that adjusts the uplink-downlink configuration based on the traffic conditions in the hybrid network.” And, page 2, column 1, paragraph 3, line 31 “The proposed framework succeeds in suitably changing this configuration in a periodic fashion by sensing the traffic changes in the network based on the SDN controller knowledge.  Numerical results indicate the improvements of the proposed framework when applied in multiple channel scenarios in terms of latency and jitter.” And, page 7, column 2, paragraph 2, line 6 “A probability vector is defined (each f LTE frame) for facilitating the decision making of the OLT SDN controller as follows: 
[Image: media_image7.png]

Pi(f) includes 7 member probabilities in line with the total uplink-downlink configuration options…. This probability vector implies how likely each configuration is to appear as the most suitable one.” Examiner notes that “predictive experience metrics 150(1) – 150(L) represent a calculated metric of a predicted performance of a respective application program when given a set of network packet attribute inputs”. (Paragraph [0015], line 13, specification of instant application.). In other words, adjusts is altering, device is device, configuration is one or more configurations, and machine learning mechanism using numerical results (probability vector) that were sensed by the SDN, is the predictive experience metric distribution. See Figure 3 steps 3 -5 “The OLT SDN controller runs Algorithms (1-3) and calculates..”).

[Image: media_image8.png]

	
	Both Sarigiannidis and the combination of Vu, Morris, and Khan are directed to, among other things, improving wireless service. In view of the teaching of the combination of Vu, Morris, and Khan it would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Sarigiannidis into the combination of Vu, Morris, and Khan. This would result in being able to adjust device configurations based on observed performance in order to improve service.
	One of ordinary skill in the art would be motivated to do this to improve network performance by reducing latency and jitter. (Sarigiannidis, page 2, column 2, paragraph 1, line 2 “Numerical results indicate the improvements of the proposed framework when applied in multiple channel scenarios in terms of latency and jitter.  The results have been obtained by using real traffic traces in both downlink and uplink directions of the hybrid network.”)
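For illustration only, the quoted Sarigiannidis passage describing a probability vector with 7 member probabilities, which "implies how likely each configuration is to appear as the most suitable one," can be sketched as a selection of the highest-probability configuration. The function name, the 0 to 6 indexing, and the example values are assumptions; the update rules of Algorithms 1-3 in Sarigiannidis are not reproduced here.

```python
def most_suitable_configuration(probabilities):
    """Return the index of the configuration with the highest probability.

    Assumes one probability per uplink-downlink configuration option,
    indexed 0..6 (the indexing is an assumption made for this sketch).
    """
    if len(probabilities) != 7:
        raise ValueError("expected 7 member probabilities")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


# Hypothetical probability vector for one LTE frame.
p = [0.05, 0.10, 0.40, 0.15, 0.10, 0.10, 0.10]
chosen = most_suitable_configuration(p)
```

With the example values above, the configuration at index 2 carries the largest probability and is the one selected.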
Regarding claim 6,
	the combination of Vu, Morris, Khan, and Sarigiannidis teaches the method of claim 1,
	wherein altering the one or more configurations of the network further comprises: causing, by the network device, a network controller to alter the one or more configurations of the network to improve performance of traffic flows passing through the network device (Sarigiannidis, page 1, column 1, line 7 “The proposed mechanism receives traffic-aware knowledge from the SDN controllers and applies an adjustment on the uplink-downlink configuration in the LTE radio communication.”  In other words, applies an adjustment is altering, configuration is the one or more configurations, SDN controller is network controller, traffic is traffic flow, applies adjustment is to improve the performance of the traffic flow.).
Regarding claim 7,
	the combination of Vu, Morris, Khan, and Sarigiannidis teaches the method of claim 1,
	wherein altering the one or more configurations of the network further comprises: causing, by the network device, the client device to alter connectivity of the client device to the network device (Sarigiannidis, page 1, column 1, line 2 “At the same time, the proliferation of Software Defined Networking (SDN) enables the efficient reconfiguration of the underlying network components dynamically using SDN controllers.” In other words, reconfiguration is altering the one or more configurations, underlying network components is network device and client device, and reconfiguration of underlying network components using SDN controllers is altering connectivity of the client device to the network device.).
Regarding claim 8,
	the combination of Vu, Morris, Khan, and Sarigiannidis teaches the method of claim 1,
wherein altering the one or more configurations of the network further comprises: providing, by the network device, the client device with a recommended application mode for the application program of the client device (Sarigiannidis, page 1, column 1, line 7 “The proposed mechanism receives traffic-aware knowledge from the SDN controllers and applies an adjustment on the uplink-downlink configuration in the LTE radio communication.  This traffic-aware mechanism is capable of determining the most suitable configuration based on the traffic dynamics in the whole hybrid network. The introduced scheme is evaluated in a realistic environment using real traffic traces such as Voice over IP (VoIP), real-time video, and streaming video.” In other words, adjustments is altering the one or more configurations, network is network, determining the most suitable configuration is providing the client device with a recommended application mode, and VoIP, real-time video, and streaming video are applications.).
Claim 10 is an apparatus claim corresponding to method claim 1.  Otherwise, they are the same.  It is implicit that a computer implemented method, monitoring, among other things, network traffic, requires an apparatus including a processor coupled to a network interface unit in order to be executed.  Therefore, claim 10 is rejected for the same reasons as claim 1.
Claims 15-16 are apparatus claims corresponding to method claims 6-7, respectively.  Otherwise, they are the same.  Therefore, claims 15-16 are rejected for the same reasons as claims 6-7, respectively.
Claim 17 is a non-transitory computer readable storage media claim corresponding to method claim 1.  Otherwise, they are the same.  It is implicit that a computer implemented method requires a non-transitory computer readable storage media in order to execute.  Therefore, claim 17 is rejected for the same reasons as claim 1.
Claims 2-5, 9, 11-14, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Vu, Morris, Khan, Sarigiannidis, and Bujlow et al (A method for classification of network traffic based on C5.0 Machine Learning Algorithm, herein, Bujlow).
Regarding claim 2,
	the combination of Vu, Morris, Khan, and Sarigiannidis teaches the method of claim 1,
wherein the training dataset further includes: 
	Thus far, the combination of Vu, Morris, Khan, and Sarigiannidis does not explicitly teach attributes of actual network traffic experienced by the network device.
	Bujlow teaches attributes of actual network traffic experienced by the network device (Bujlow, page 238, column 2, paragraph 3, line 1 “A good solution for obtaining accurate training data can rely on collecting the flows at the user side along with the name of the associated application.” And, line 8 “The task for the client is to register information about each flow passing the Network Interface Card (NIC), with the exception of traffic to and from the local subnet, to prevent capturing transfers between local peers.” And page 1, column 1, paragraph 1, line 13 “This high accuracy was achieved by using high quality training data collected by our system, a unique set of parameters used for both training and classification, an algorithm for recognizing flow direction and the C5.0 itself.” In other words, training data is training dataset, and each flow passing the Network Interface Card (NIC) is attributes of actual network traffic experienced by the network device.).
Bujlow and the combination of Vu, Morris, Khan, and Sarigiannidis are directed to monitoring and measuring performance of high-speed multi-hop networks.  In view of the teaching of the combination of Vu, Morris, Khan, and Sarigiannidis it would be obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Bujlow into the combination of Vu, Morris, Khan, and Sarigiannidis. This would result in being able to collect high quality traffic data.
	One of ordinary skill in the art would be motivated to do this because collecting higher quality traffic data for use in training would help provide higher accuracy of classification. (Bujlow, page 1, column 1, paragraph 1, line 8 “On the basis of statistical traffic information received from volunteers and C5.0 algorithm we constructed a boosted classifier, which was shown to have ability to distinguish between 7 different applications in test set of 76,632-1,622,710 unknown cases with average accuracy of 99.3-99.9%.  This high accuracy was achieved by using high quality training data collected by our system, a unique set of parameters used for both training and classification, an algorithm for recognizing flow direction and the C5.0 itself.”)	
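For illustration only, the Bujlow passage quoted for claim 2 (registering each flow passing the NIC "with the exception of traffic to and from the local subnet, to prevent capturing transfers between local peers") can be sketched as a simple filter. The subnet value and function names are hypothetical, and treating a flow as local when both endpoints fall inside the subnet is an interpretation made for this sketch, not a detail confirmed by Bujlow.

```python
import ipaddress

# Hypothetical local subnet; not taken from Bujlow.
LOCAL_SUBNET = ipaddress.ip_network("192.168.1.0/24")


def is_local(addr):
    """True if the address lies inside the assumed local subnet."""
    return ipaddress.ip_address(addr) in LOCAL_SUBNET


def should_register(src_ip, dst_ip):
    """Register a flow unless both endpoints are local peers."""
    return not (is_local(src_ip) and is_local(dst_ip))


registered = [
    (src, dst)
    for src, dst in [
        ("192.168.1.10", "8.8.8.8"),       # leaves the subnet: registered
        ("192.168.1.10", "192.168.1.20"),  # local peers: skipped
    ]
    if should_register(src, dst)
]
```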
Regarding claim 3,
	the combination of Vu, Morris, Khan, Sarigiannidis and Bujlow teaches the method of claim 2,
	wherein the first model is an auxiliary classifier generative adversarial network model that includes a generator model and a discriminator model.  (Vu, page 333, column 1, paragraph 1, line 9 “We used a recent proposed deep network for unsupervised learning called Auxiliary Classifier Generative Adversarial Network to generate synthesized data samples for balancing between the minor and the major classes.  We tested our method on a well-known network traffic dataset and the results showed that our proposed method achieved better performance compared to a recent proposed method for handling imbalanced problem in network traffic classification.” And page 334, column 2, paragraph 5, line 1 “The input of the generator (G) is a vector of random noise z and it outputs a synthesized sample Xfake = G(z).  Network D takes the input of a real data sample or a synthesized sample from the generator and the output is a probability distribution P(S|X) = D(X) over possible sources.” In other words, auxiliary classifier generative adversarial network is auxiliary classifier generative adversarial network model, generator is generator model, and discriminator is discriminator model.)
Regarding claim 4,
	the combination of Vu, Morris, Khan, Sarigiannidis and Bujlow teaches the method of claim 3,
	wherein the generator model is trained to generate the one or more fabricated attribute sets based on the conditional class of the network device and a random noise value.  (Vu, page 335, Figure 2, and page 334, column 2, paragraph 5, line 1 “The input of the generator (G) is a vector of random noise z and it outputs a synthesized sample Xfake = G(z).  Network D takes the input of a real data sample or a synthesized sample from the generator and the output is a probability distribution P(S|X) = D(X) over possible sources.” In other words, synthesize is generate, generator is the generator model, sample Xfake = G(z) is fabricated attribute set, input of a real data sample is conditional class of the network device, and noise z is random noise.)

[Image: media_image9.png]

Regarding claim 5,
	the combination of Vu, Morris, Khan, Sarigiannidis and Bujlow teaches the method of claim 3,
	wherein the discriminator model is trained to differentiate between the one or more fabricated attribute sets generated by the generator model and the attributes of the training dataset.  (Vu, page 334, column 2, paragraph 4, line 1 “Generative Adversarial Network (GAN) was proposed by GoodFellow et al. in [10] for unsupervised learning.  A GAN has two neuron networks which are trained in an opposition way.  The first neuron network is a Generator (G) and the second neuron network is a Discriminator (D).  The main idea behind GAN is to have two competing neural network models.  The generator takes noise as input and generates samples.  The discriminator receives samples from both the generator and the training data and attempt to distinguish between the two sources.  These two networks play a continuous game, where the generator is learning to produce more and more realistic samples, and the discriminator is learning to get better and better at distinguishing the generated data from the real data.  The two networks are trained simultaneously, and hope that the competition will drive the generated samples to be indistinguishable from the real data.” In other words, discriminator is the discriminator model, and attempt to distinguish between the two sources is differentiate between the one or more fabricated attribute sets and attributes of the training dataset.)
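For illustration only, the generator input discussed for claims 3-5 (a vector of random noise z together with a class) can be sketched as below. Concatenating the noise with a one-hot class label is a common way to condition a generator and is an assumption of this sketch; the passages quoted from Vu do not state how the class is encoded. All names and dimensions are hypothetical, and no network is trained here.

```python
import random


def one_hot(class_index, num_classes):
    """Encode a class label as a one-hot list of floats."""
    return [1.0 if i == class_index else 0.0 for i in range(num_classes)]


def generator_input(class_index, num_classes, noise_dim, rng):
    """Build a generator input: Gaussian noise followed by the class label."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + one_hot(class_index, num_classes)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
z = generator_input(class_index=1, num_classes=3, noise_dim=4, rng=rng)
```

The first four entries of `z` are the noise and the last three entries identify the requested class.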
Claims 11-14 are apparatus claims corresponding to method claims 2-5, respectively.  Otherwise, they are the same.  Therefore, claims 11-14 are rejected for the same reasons as claims 2-5, respectively.
Claims 18-19 are non-transitory computer readable storage media claims corresponding to method claims 2-3, respectively.  Otherwise, they are the same.  Therefore, claims 18-19 are rejected for the same reasons as claims 2-3, respectively.
Claim 20 is a non-transitory computer readable storage media claim corresponding to the combination of method claims 4 and 5.  Otherwise, it is the same. Claim 20 is rejected for the same reasons as the combination of claims 4 and 5.
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Vu, Morris, Khan, Sarigiannidis, and Abeysooriya et al (US 9336483 B1, herein Abeysooriya).
Regarding claim 9,
	the combination of Vu, Morris, Khan, and Sarigiannidis teaches the method of claim 1,
	Thus far, the combination of Vu, Morris, Khan, and Sarigiannidis does not explicitly teach wherein the first model is a database.  
	Abeysooriya teaches wherein the first model is a database.  (Abeysooriya, FIG. 3, and column 1, line 49 “In some embodiments, neural network training data may be stored in and retrieved from a training database, training batch files, or other computer storage, and used to perform an initial training process for a neural network data structure.” In other words, the training database is the first model, which is a database.)
	Both Abeysooriya and the combination of Vu, Morris, Khan, and Sarigiannidis are directed to, among other things, performance of content distribution networks.  In view of the teaching of the combination of Vu, Morris, Khan, and Sarigiannidis, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Abeysooriya into the combination of Vu, Morris, Khan, and Sarigiannidis.  This would result in the ability to use a database for training a machine learning model.
	One of ordinary skill in the art would be motivated to do this in order to better train neural networks to perform decision-making and predictive analyses. (Abeysooriya, column 1, line 17 “After a neural network data structure has been generated and trained with an appropriate training data set, it may be used to perform decision-making processes and predictive analyses for various systems.  For instance, a trained neural network may be deployed with a content distribution network and used to perform tasks such as detecting patterns, predicting user behavior, data processing, function approximation, and the like.”)
Conclusion
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to BART RYLANDER whose telephone number is (571)272-8359. The examiner can normally be reached Monday - Thursday 8:00 to 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at 571-270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/B.I.R./Examiner, Art Unit 2124

/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124