Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
Claims 1-5, 9-12, and 15-20 are pending in the present application. Claims 1 and 11 are newly amended. Claims 6-8 and 13-14 are cancelled.

Response to Arguments
Applicant's arguments filed 3/31/2022 have been fully considered but they are not persuasive. 
Arguments Regarding Specification (page 8):
The amended title is considered satisfactory.
Arguments Regarding Rejections under §103 (pages 8-11):
Per applicant’s argument that “Li fails to describe or suggest any "variance model" as recited in claim 1.” (pp. 9-10):
The examiner thanks the applicant for their response, and would like to provide clarification. The examiner views the broadest reasonable interpretation of a “variance model” as recited in claim 1 to be any model that outputs any form of variance. Li discloses a model which measures uncertainty between multiple samples by measuring the difference between predicted outputs for the multiple samples (see p.1, Abstract). The examiner views measuring the uncertainty between samples as equivalent to measuring the variance between those samples. The examiner further notes measuring the difference between predicted outputs is a closed form function, and that the model comprises a closed form function. Thus, Li’s model qualifies as a variance model as recited in claim 1. Accordingly, the rejections are upheld.
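For illustration only, the margin-based measure quoted from Li's abstract (the difference between the largest two class outputs, where a smaller difference means more uncertainty) can be sketched as below. The function names, the example values, and the negation of the margin (so that the most uncertain sample carries the maximum score) are assumptions made for this sketch and do not appear in Li.

```python
def margin_uncertainty(outputs):
    """Score one sample from its top-layer outputs.

    The margin is the difference between the two largest outputs.
    It is negated (an assumption of this sketch) so that a smaller
    margin yields a larger, i.e. more uncertain, score.
    """
    if len(outputs) < 2:
        raise ValueError("need at least two class outputs")
    top, second = sorted(outputs, reverse=True)[:2]
    return -(top - second)


def most_uncertain(pool_outputs, k=1):
    """Return indices of the k samples with the highest uncertainty score."""
    scores = [margin_uncertainty(o) for o in pool_outputs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Hypothetical top-layer outputs for three unlabeled samples.
pool_outputs = [
    [0.70, 0.20, 0.10],  # clear winner, low uncertainty
    [0.40, 0.35, 0.25],  # two close classes, high uncertainty
    [0.90, 0.05, 0.05],  # very clear winner
]
selected = most_uncertain(pool_outputs, k=1)
```

In this sketch the score is a single subtraction over the sorted outputs, and selection reduces to taking the maximum of that score over the pool.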

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 5, 11, 15, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over "Estimating the spectral gain and the noise figure of EDFA using artificial neural networks" to Bastos-Filho et al (hereinafter, Bastos-Filho) in view of US 20120269519 A1 to Jiang et al (hereinafter, Jiang), further in view of “Active learning for hyperspectral image classification with a stacked autoencoders based neural network” to Li (hereinafter, Li).

As per claim 1, Bastos-Filho teaches A method comprising: training a machine learning (ML) model using a set of labeled training objects and a ML algorithm to generate a ML model for predicting a gain associated with a target channel of a plurality of channels of an amplifier device (p.2, section 3, “The ANN proposed in [7] was altered to consider new inputs and outputs (as shown in Fig. 1) to include the channel information stored in the amplifier mask. The new inputs are the total input power of the signal (Pin), the gain in which the amplifier should operate (Gset) and the frequency of the channel. The new outputs are channel gain (Gch) and channel noise figure (NFch)… The ANN training used all the points in the amplifier mask since the main goal was to estimate the values correctly within the mask, but not considering the extrapolation of values outside the mask. The training data was shuffled considering the Gset and Pin, but they were sorted by frequencies. We performed the sorting process to present the complete spectrum of an operating point to the ANN. The validation data was selected by randomly choose half of the points in the training data, while the test data was composed of the other half not chosen for the validation phase. We used the backpropagation training approach with a learning rate of 0.3, a momentum alpha equal to 0.3. The stop criterion is a maximum of 5,000 training epochs.” Figure 1. Examiner Note: Bastos-Filho discloses an ANN trained using backpropagation where the outputs of the ANN include a gain of an amplifier channel.).

Bastos-Filho does not explicitly teach each labeled training object in the set of labeled training objects comprising an indication of a channel loading for the amplifier device, and an indication of a gain for the target channel; receiving a plurality of unlabeled training objects, each unlabeled training object comprising an indication of a channel loading for the amplifier device; determining an additional labeled training object from the unlabeled training objects for further training the generated ML model, the determining comprising: determining, based on a type of the generated ML model and the set of unlabeled training objects, a variance model; selecting, based on a maximum of the variance model, a candidate training object from the plurality of unlabeled training objects; receiving a measured gain value for the target channel based on the channel loading indicated by the candidate training object; and generating, based on the candidate training object and the gain value, the additional labeled training object; and adding the additional labeled training object to the set of labeled training objects; and further training the generated ML model using the set of labeled training objects including the additional labeled training object.

Jiang teaches training a machine learning (ML) model using a set of labeled training objects and a ML algorithm to generate a ML model for predicting a gain associated with a target channel of a plurality of channels of an amplifier device, each labeled training object in the set of labeled training objects comprising an indication of a channel loading for the amplifier device, and an indication of a gain for the target channel ([0047] “The measurement apparatus 400 may be used to obtain the noise figure and/or gain measurements (as a function of wavelength) for a plurality of different channel loading conditions, e.g., for different sets of selected wavelength or signal channels.” Examiner Note: Bastos-Filho discloses the training of an ML model for predicting the channel specific gain of a multichannel amplifier, as cited above, but does not cite specific features regarding the training data used to train said model. Jiang discloses a measurement apparatus that measures the output of the channel specific gain when given a channel loading. The examiner recognizes the data output from Jiang’s measurement system as equivalent to a labelled training object comprising an indication of a channel loading for an amplifier device, and an indication of a gain for a target channel. When Jiang is applied to Bastos-Filho, the resulting system would train a machine learning (ML) model using a set of labeled training objects and a ML algorithm to generate a ML model for predicting a gain associated with a target channel of a plurality of channels of an amplifier device, each labeled training object in the set of labeled training objects comprising an indication of a channel loading for the amplifier device, and an indication of a gain for the target channel.).

Bastos-Filho and Jiang are analogous art because they are both directed to data processing systems. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data. The combination would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention because he/she would have been motivated to increase the system’s accuracy, which can be accomplished by using more accurate training data (Jiang, [0004] “The optical signal may also be amplified before or after transmission to enhance performance, for example to compensate for attenuation or noise during transport. Erbium Doped Fiber Amplifiers (EDFAs) are one type of optical amplifiers that are commonly used in optical systems. However, EDFAs may also contribute noise in the optical signals, which needs to be accounted for.”).
	

The combination of Bastos-Filho and Jiang does not explicitly teach receiving a plurality of unlabeled training objects, each unlabeled training object comprising an indication of a channel loading for the amplifier device; determining an additional labeled training object from the unlabeled training objects for further training the generated ML model, the determining comprising: determining, based on a type of the generated ML model and the set of unlabeled training objects, a variance model; selecting, based on a maximum of the variance model, a candidate training object from the plurality of unlabeled training objects; receiving a measured gain value for the target channel based on the channel loading indicated by the candidate training object; and generating, based on the candidate training object and the gain value, the additional labeled training object; and adding the additional labeled training object to the set of labeled training objects; and further training the generated ML model using the set of labeled training objects including the additional labeled training object.

Li teaches receiving a plurality of unlabeled training objects, each unlabeled training object comprising an indication of a channel loading for the amplifier device (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget” Examiner Note: Bastos-Filho discloses training an ML model of a multichannel amplifier with training data. Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches receiving a batch of unlabeled training samples. When Li is applied to Jiang and Bastos-Filho, the resulting combination would receive a plurality of unlabeled training objects, each unlabeled training object comprising an indication of a channel loading for the amplifier device.);
determining an additional labeled training object from the unlabeled training objects for further training the generated ML model, the determining comprising (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget” Examiner Note: Bastos-Filho discloses training an ML model of a multichannel amplifier with training data. Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches classifying additional samples from the batch of unlabeled training data. When Li is applied to Jiang and Bastos-Filho, the resulting combination would determine an additional labeled training object from the unlabeled training objects for further training the generated ML model.):
determining, based on a type of the generated ML model and the plurality of unlabeled training objects, a variance model comprising a closed form function (p. 1, Abstract “Uncertainty for a given sample is measured by the difference between the largest two class outputs of the neural network. The less difference there is, the more uncertainty the sample has.” p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget. Here the uncertainty criterion is based on the top layer outputs of the neural network. When using neural network for classification, the number of units in the top layer is equal to the number of label classes. Given a sample, we forward pass it through the network, and label prediction of the input sample can be determined by the largest value of the top layer outputs. Therefore, if there is little difference between the largest two top layer outputs, we consider the input is uncertain with the current neural network. This measure is similar to the margin motivation behind MCLU, except that neural network is a natural multi-class classifier and the boundary should be more smooth than SVMs.” Examiner Note: Bastos-Filho discloses training an ML model of a multichannel amplifier with training data. Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches modeling the variance of a sample based on a neural network uncertainty. When Li is applied to Jiang and Bastos-Filho, the resulting combination would determine, based on a type of the generated ML model and the set of unlabeled training objects, a variance model. The examiner recognizes that the subtraction operation performed by Li constitutes a closed form solution to the variance model described above.);
determining a solution to the variance model (p. 1, Abstract “Uncertainty for a given sample is measured by the difference between the largest two class outputs of the neural network. The less difference there is, the more uncertainty the sample has.” Examiner Note: The examiner recognizes that the subtraction operation performed by Li constitutes a closed form solution to the variance model described above.);
determining, based on the solution to the variance model, a maximum of the variance model (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget. Here the uncertainty criterion is based on the top layer outputs of the neural network. When using neural network for classification, the number of units in the top layer is equal to the number of label classes. Given a sample, we forward pass it through the network, and label prediction of the input sample can be determined by the largest value of the top layer outputs. Therefore, if there is little difference between the largest two top layer outputs, we consider the input is uncertain with the current neural network. This measure is similar to the margin motivation behind MCLU, except that neural network is a natural multi-class classifier and the boundary should be more smooth than SVMs.” Examiner Note: Picking the most uncertain sample from the model measuring uncertainty (i.e., variance) is seen as equivalent to determining the maximum of the variance model.);
selecting, based on the maximum of the variance model, a candidate training object from the plurality of unlabeled training objects (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget. Here the uncertainty criterion is based on the top layer outputs of the neural network. When using neural network for classification, the number of units in the top layer is equal to the number of label classes. Given a sample, we forward pass it through the network, and label prediction of the input sample can be determined by the largest value of the top layer outputs. Therefore, if there is little difference between the largest two top layer outputs, we consider the input is uncertain with the current neural network. This measure is similar to the margin motivation behind MCLU, except that neural network is a natural multi-class classifier and the boundary should be more smooth than SVMs.” Examiner Note: Bastos-Filho discloses training an ML model of a multichannel amplifier with training data. Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches modeling the variance of a sample based on a neural network uncertainty, and selecting the most uncertain samples for labeling. When Li is applied to Jiang and Bastos-Filho, the resulting combination would select, based on a maximum of the variance model, a candidate training object from the plurality of unlabeled training objects.);
receiving a measured gain value for the target channel based on the channel loading indicated by the candidate training object (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget” Examiner Note: Bastos-Filho discloses predicting the gain value of a target channel based on an input to the amplifier (i.e., channel loading). Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches sending unlabeled training samples to a neural network for classification. When Li is applied to Bastos-Filho and Jiang, the resulting system would measure the gain of an unlabeled training object via Bastos-Filho’s neural network.); and
generating, based on the candidate training object and the measured gain value, the additional labeled training object (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget” Examiner Note: Bastos-Filho discloses predicting the gain value of a target channel based on an input to the amplifier (i.e., channel loading). Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches sending unlabeled training samples to a neural network for classification. When Li is applied to Bastos-Filho and Jiang, the resulting system would apply the measured gain of an unlabeled training object via Bastos-Filho’s neural network, applying a label to the object.); and
adding the additional labeled training object to the set of labeled training objects (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget” Examiner Note: Bastos-Filho discloses predicting the gain value of a target channel based on an input to the amplifier (i.e., channel loading). Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches adding a newly labeled training object to a set of training objects. When Li is applied to Bastos-Filho and Jiang, the resulting system would add the newly labeled sample to the set of training samples.); and
further training the generated ML model using the set of labeled training objects including the additional labeled training object (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget” Examiner Note: Bastos-Filho discloses predicting the gain value of a target channel based on an input to the amplifier (i.e., channel loading). Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches retraining a neural network with a set of training data including the newly generated training data. When Li is applied to Bastos-Filho and Jiang, the resulting system would re-train the neural network with the updated set of labeled training data.).
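For illustration only, the loop quoted above from Li's Algorithm 1 (classify the pool, pick the K most uncertain samples, add them to the training set, retrain, stop at the label budget) can be sketched as follows. This is a minimal sketch under stated assumptions: a toy one-dimensional nearest-centroid classifier stands in for the deep neural network, `oracle` is a hypothetical label source, and every function name and data value is invented for the example rather than taken from any cited reference.

```python
import math


def train_centroids(labeled):
    """Toy classifier (stand-in for the neural network): one centroid per class."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def class_outputs(centroids, x):
    """Normalized scores per class, playing the role of top-layer outputs."""
    scores = {y: math.exp(-abs(x - c)) for y, c in centroids.items()}
    total = sum(scores.values())
    return {y: s / total for y, s in scores.items()}


def margin(outputs):
    """Difference between the two largest outputs; small means uncertain."""
    top, second = sorted(outputs.values(), reverse=True)[:2]
    return top - second


def active_learning(labeled, pool, oracle, k, budget):
    """Pool-based loop; assumes the initial labeled set covers at least two classes."""
    labeled, pool = list(labeled), list(pool)
    queries = 0
    model = train_centroids(labeled)
    while queries < budget and pool:
        # Steps 2-3: score the pool and take the smallest-margin samples.
        pool.sort(key=lambda x: margin(class_outputs(model, x)))
        n = min(k, budget - queries)
        picked, pool = pool[:n], pool[n:]
        # Query the hypothetical oracle for a label and grow the training set.
        labeled.extend((x, oracle(x)) for x in picked)
        queries += len(picked)
        # Step 4: train the classifier again on the enlarged set.
        model = train_centroids(labeled)
    return model, labeled, pool


# Hypothetical data: two labeled points, four unlabeled points.
model, labeled, remaining = active_learning(
    labeled=[(0.0, "a"), (10.0, "b")],
    pool=[1.0, 4.9, 9.0, 5.2],
    oracle=lambda x: "a" if x < 5 else "b",
    k=1,
    budget=2,
)
```

With these values, the two points nearest the class boundary are queried first and the two points far from the boundary remain in the pool.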

Bastos-Filho, Jiang, and Li are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data and Li’s ML training system. The combination would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention because he/she would have been motivated to increase efficiency of the system, which can be accomplished by using active learning methods (Li, Abstract “Experimental results on Pavia university dataset showed that our method outperforms the current support vector machines (SVMs) based multiclass/level uncertainty (MCLU) method both in classification accuracy and generalization capability.”).

As per claim 5, the combination of Bastos-Filho, Jiang, and Li thus far teaches The method of claim 1. 
Bastos-Filho teaches wherein the generated ML model comprises a linear model (p.1, section 2 “The implementation of module 3 is performed in more than one way and, because of this, more than one estimator is proposed. Nevertheless, the models 1 and 2 are the ones that use the information in the amplifier mask, and they follow a linear interpolation strategy to calculate their output.”).

 As per claim 11, Bastos-Filho teaches A method of generating additional training objects for training a generated machine learning (ML) model of a gain for a target channel of a plurality of channels of an amplifier device, the generated ML model having been learned using a set of labeled training objects (p.2, section 3, “The ANN proposed in [7] was altered to consider new inputs and outputs (as shown in Fig. 1) to include the channel information stored in the amplifier mask. The new inputs are the total input power of the signal (Pin), the gain in which the amplifier should operate (Gset) and the frequency of the channel. The new outputs are channel gain (Gch) and channel noise figure (NFch)… The ANN training used all the points in the amplifier mask since the main goal was to estimate the values correctly within the mask, but not considering the extrapolation of values outside the mask. The training data was shuffled considering the Gset and Pin, but they were sorted by frequencies. We performed the sorting process to present the complete spectrum of an operating point to the ANN. The validation data was selected by randomly choose half of the points in the training data, while the test data was composed of the other half not chosen for the validation phase. We used the backpropagation training approach with a learning rate of 0.3, a momentum alpha equal to 0.3. The stop criterion is a maximum of 5,000 training epochs.” Figure 1. Examiner Note: Bastos-Filho discloses an ANN trained using backpropagation where the outputs of the ANN include a gain of an amplifier channel.).

Bastos-Filho does not explicitly teach each labeled training object in the set of labeled training objects comprising an indication of a channel loading for the amplifier device, and an indication of a gain for the target channel; receiving a plurality of unlabeled training objects, each unlabeled training object comprising an indication of a channel loading for the amplifier device; determining an additional labeled training object from the unlabeled training objects for further training the generated ML model, the determining comprising: determining, based on a type of the generated ML model and the set of unlabeled training objects, a variance model; selecting, based on a maximum of the variance model, a candidate training object from the plurality of unlabeled training objects; receiving a measured gain value for the target channel based on the channel loading indicated by the candidate training object; and generating, based on the candidate training object and the gain value, the additional labeled training object; and adding the additional labeled training object to the set of labeled training objects; and further training the generated ML model using the set of labeled training objects including the additional labeled training object.

Jiang teaches A method of generating additional training objects for training a generated machine learning (ML) model of a gain for a target channel of a plurality of channels of an amplifier device, the generated ML model having been learned using a set of labeled training objects, each labeled training object of the set of labeled training objects comprising an indication of a channel loading for the amplifier device, and an indication of a gain for the target channel, the method comprising: ([0047] “The measurement apparatus 400 may be used to obtain the noise figure and/or gain measurements (as a function of wavelength) for a plurality of different channel loading conditions, e.g., for different sets of selected wavelength or signal channels.” Examiner Note: Bastos-Filho discloses the training of an ML model for predicting the channel specific gain of a multichannel amplifier, as cited above, but does not cite specific features regarding the training data used to train said model. Jiang discloses a measurement apparatus that measures the output of the channel specific gain when given a channel loading. The examiner recognizes the data output from Jiang’s measurement system as equivalent to a labelled training object comprising an indication of a channel loading for an amplifier device, and an indication of a gain for a target channel. When Jiang is applied to Bastos-Filho, the resulting system would generate additional training objects for training a generated machine learning (ML) model of a gain for a target channel of a plurality of channels of an amplifier device, the generated ML model having been learned using a set of labeled training objects, each labeled training object of the set of training objects comprising an indication of a channel loading for the amplifier device, and an indication of a gain for the target channel, the method comprising.).

Bastos-Filho and Jiang are analogous art because they are both directed to data processing systems. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data. The combination would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention because he/she would have been motivated to increase the system’s accuracy, which can be accomplished by using more accurate training data (Jiang, [0004] “The optical signal may also be amplified before or after transmission to enhance performance, for example to compensate for attenuation or noise during transport. Erbium Doped Fiber Amplifiers (EDFAs) are one type of optical amplifiers that are commonly used in optical systems. However, EDFAs may also contribute noise in the optical signals, which needs to be accounted for.”).
	

The combination of Bastos-Filho and Jiang does not explicitly teach receiving a plurality of unlabeled training objects, each unlabeled training object comprising an indication of a channel loading for the amplifier device; determining an additional labeled training object from the unlabeled training objects for further training the generated ML model, the determining comprising: determining, based on a type of the generated ML model and the set of unlabeled training objects, a variance model; selecting, based on a maximum of the variance model, a candidate training object from the plurality of unlabeled training objects; receiving a measured gain value for the target channel based on the channel loading indicated by the candidate training object; and generating, based on the candidate training object and the gain value, the additional labeled training object; and adding the additional labeled training object to the set of labeled training objects; and further training the generated ML model using the set of labeled training objects including the additional labeled training object.

Li teaches receiving a plurality of unlabeled training objects, each unlabeled training object comprising an indication of a channel loading for the amplifier device (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget” Examiner Note: Bastos-Filho discloses training an ML model of a multichannel amplifier with training data. Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches receiving a batch of unlabeled training samples. When Li is applied to Jiang and Bastos-Filho, the resulting combination would receive a plurality of unlabeled training objects, each unlabeled training object comprising an indication of a channel loading for the amplifier device.);
determining an additional labeled training object from the unlabeled training objects for further training the generated ML model, the determining comprising (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget” Examiner Note: Bastos-Filho discloses training an ML model of a multichannel amplifier with training data. Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches classifying additional samples from the batch of unlabeled training data. When Li is applied to Jiang and Bastos-Filho, the resulting combination would determine an additional labeled training object from the unlabeled training objects for further training the generated ML model.):
determining, based on a type of the generated ML model and the plurality of unlabeled training objects, a variance model (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget. Here the uncertainty criterion is based on the top layer outputs of the neural network. When using neural network for classification, the number of units in the top layer is equal to the number of label classes. Given a sample, we forward pass it through the network, and label prediction of the input sample can be determined by the largest value of the top layer outputs. Therefore, if there is little difference between the largest two top layer outputs, we consider the input is uncertain with the current neural network. This measure is similar to the margin motivation behind MCLU, except that neural network is a natural multi-class classifier and the boundary should be more smooth than SVMs.” Examiner Note: Bastos-Filho discloses training an ML model of a multichannel amplifier with training data. Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches modeling the variance of a sample based on a neural network uncertainty. When Li is applied to Jiang and Bastos-Filho, the resulting combination would determine, based on a type of the generated ML model and the set of unlabeled training objects, a variance model.);
determining a plurality of channel loadings for the variance model (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget. Here the uncertainty criterion is based on the top layer outputs of the neural network. When using neural network for classification, the number of units in the top layer is equal to the number of label classes. Given a sample, we forward pass it through the network, and label prediction of the input sample can be determined by the largest value of the top layer outputs. Therefore, if there is little difference between the largest two top layer outputs, we consider the input is uncertain with the current neural network. This measure is similar to the margin motivation behind MCLU, except that neural network is a natural multi-class classifier and the boundary should be more smooth than SVMs.” Examiner Note: Bastos-Filho discloses training an ML model of a multichannel amplifier with training data. Jiang discloses a plurality of channel loadings, and measurement of a plurality of channel loadings. Li teaches modeling the variance of a sample based on a neural network uncertainty. When Li is applied to Jiang and Bastos-Filho, the resulting combination would determine a plurality of channel loadings for the variance model.);
determining, using the variance model and the plurality of channel loadings, a plurality of variance parameters, wherein each variance parameter of the plurality of variance parameters corresponds to a channel loading of the plurality of channel loadings (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget. Here the uncertainty criterion is based on the top layer outputs of the neural network. When using neural network for classification, the number of units in the top layer is equal to the number of label classes. Given a sample, we forward pass it through the network, and label prediction of the input sample can be determined by the largest value of the top layer outputs. Therefore, if there is little difference between the largest two top layer outputs, we consider the input is uncertain with the current neural network. This measure is similar to the margin motivation behind MCLU, except that neural network is a natural multi-class classifier and the boundary should be more smooth than SVMs.” Examiner Note: Bastos-Filho discloses training an ML model of a multichannel amplifier with training data. Jiang discloses a plurality of channel loadings, and measurement of a plurality of channel loadings. Li teaches modeling the variance of outputs from a neural network. When Li is applied to Jiang and Bastos-Filho, the resulting combination would input at least one channel loading of the network, and determine a variance corresponding to each of the input channel loadings.);
selecting, based on the plurality of variance parameters, a channel loading of the plurality of channel loadings (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget. Here the uncertainty criterion is based on the top layer outputs of the neural network. When using neural network for classification, the number of units in the top layer is equal to the number of label classes. Given a sample, we forward pass it through the network, and label prediction of the input sample can be determined by the largest value of the top layer outputs. Therefore, if there is little difference between the largest two top layer outputs, we consider the input is uncertain with the current neural network. This measure is similar to the margin motivation behind MCLU, except that neural network is a natural multi-class classifier and the boundary should be more smooth than SVMs.” Examiner Note: Bastos-Filho discloses training an ML model of a multichannel amplifier with training data. Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches modeling the variance of a sample based on a neural network uncertainty, and selecting the most uncertain samples for labeling. When Li is applied to Jiang and Bastos-Filho, the resulting combination would select, based on a maximum of the variance model, a candidate training object from the plurality of unlabeled training objects.);
determining, based on the selected channel loading, a candidate training object from the plurality of unlabeled training objects (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget. Here the uncertainty criterion is based on the top layer outputs of the neural network. When using neural network for classification, the number of units in the top layer is equal to the number of label classes. Given a sample, we forward pass it through the network, and label prediction of the input sample can be determined by the largest value of the top layer outputs. Therefore, if there is little difference between the largest two top layer outputs, we consider the input is uncertain with the current neural network. This measure is similar to the margin motivation behind MCLU, except that neural network is a natural multi-class classifier and the boundary should be more smooth than SVMs.” Examiner Note: Bastos-Filho discloses training an ML model of a multichannel amplifier with training data. Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches modeling the variance of a sample based on a neural network uncertainty, and selecting the most uncertain samples for labeling. When Li is applied to Jiang and Bastos-Filho, the resulting combination would select, based on a maximum of the variance model, a candidate training object from the plurality of unlabeled training objects.);
receiving a measured gain value for the target channel based on the channel loading indicated by the candidate training object (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget” Examiner Note: Bastos-Filho discloses predicting the gain value of a target channel based on an input to the amplifier (i.e., channel loading). Jiang teaches training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches sending unlabeled training samples to a neural network for classification. When Li is applied to Bastos-Filho and Jiang, the resulting system would measure the gain of an unlabeled training object via Bastos-Filho’s neural network.); and
generating, based on the candidate training object and the measured gain value, an additional labeled training object (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget” Examiner Note: Bastos-Filho discloses predicting the gain value of a target channel based on an input to the amplifier (i.e., channel loading). Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches sending unlabeled training samples to a neural network for classification. When Li is applied to Bastos-Filho and Jiang, the resulting system would apply the measured gain of an unlabeled training object via Bastos-Filho’s neural network, applying a label to the object.); and
adding the additional labeled training object to the set of labeled training objects (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget” Examiner Note: Bastos-Filho discloses predicting the gain value of a target channel based on an input to the amplifier (i.e., channel loading). Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches adding a newly labeled training object to a set of training objects. When Li is applied to Bastos-Filho and Jiang, the resulting system would add the newly labeled sample to the set of training samples.); and
further training the generated ML model using the set of labeled training objects including the additional labeled training object (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget” Examiner Note: Bastos-Filho discloses predicting the gain value of a target channel based on an input to the amplifier (i.e., channel loading). Jiang discloses training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches retraining a neural network with a set of training data including the newly generated training data. When Li is applied to Bastos-Filho and Jiang, the resulting system would re-train the neural network with the updated set of labeled training data.).

Bastos-Filho, Jiang, and Li are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data, and Li’s ML training system. The combination would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention because he/she would have been motivated to increase efficiency of the system, which can be accomplished by using active learning methods (Li, Abstract “Experimental results on Pavia university dataset showed that our method outperforms the current support vector machines (SVMs) based multiclass/level uncertainty (MCLU) method both in classification accuracy and generalization capability.”).
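
Illustrative sketch (not taken from any cited reference): the pool-based selection loop quoted from Li's Algorithm 1, together with the quoted uncertainty criterion (the difference between the largest two top layer outputs), might be expressed in Python roughly as follows. All function names, and the classify, train, and oracle callables, are hypothetical and supplied only for illustration. In the context of the claims, the oracle would presumably correspond to obtaining a measured gain value, which is an assumption of this sketch rather than a statement of Li.

```python
import numpy as np

def top_two_margin(outputs: np.ndarray) -> np.ndarray:
    """Difference between the largest two top-layer outputs, per sample.

    outputs has shape (n_samples, n_classes) with n_classes >= 2.
    A small margin means the sample is treated as uncertain.
    """
    sorted_out = np.sort(outputs, axis=1)
    return sorted_out[:, -1] - sorted_out[:, -2]

def pick_most_uncertain(outputs: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k samples with the smallest top-two margin."""
    return np.argsort(top_two_margin(outputs), kind="stable")[:k]

def active_learning(classify, train, oracle, labeled, pool, k, budget):
    """Pool-based loop following quoted steps 1-5 (hypothetical interface).

    classify(pool) -> per-sample class outputs
    train(labeled) -> retrains the classifier on the labeled set
    oracle(sample) -> label for one queried sample
    """
    labeled, pool = list(labeled), list(pool)
    queries = 0
    while queries < budget and pool:                   # steps 1 and 5
        outputs = np.asarray(classify(pool))           # step 2
        batch = min(k, budget - queries, len(pool))
        chosen = pick_most_uncertain(outputs, batch)   # step 3
        # Pop from the highest index down so remaining indices stay valid.
        for i in sorted(chosen.tolist(), reverse=True):
            sample = pool.pop(i)
            labeled.append((sample, oracle(sample)))
        queries += batch
        train(labeled)                                 # step 4
    return labeled, pool
```

The sketch stops when either the label budget is spent or the pool is empty; the quoted algorithm states only the budget condition, so the empty-pool check is an added safeguard.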


As per claim 15, the combination of Bastos-Filho, Jiang, and Li thus far teaches the method of claim 11.
Bastos-Filho does not explicitly teach further comprising transmitting, to a device configured to measure one or more gain values of the amplifier device, the candidate training object.
Jiang teaches further comprising transmitting, to a device configured to measure one or more gain values of the amplifier device, the candidate training object ([0039] “The express channel source/input 110 may be a data channel and/or a carrier for data channels. The first multiplexer 112 may be any device or component configured to combine a plurality of different wavelength channels from one or more transmitters into a single combined channel and redirect the single combined channel to the first WSS 120. The different wavelength channels may be data channels that may be transmitted from one or a plurality of transmitters coupled to the first multiplexer 112.” [0047] “The measurement apparatus 400 may be used to obtain the noise figure and/or gain measurements (as a function of wavelength) for a plurality of different channel loading conditions, e.g., for different sets of selected wavelength or signal channels.” Examiner Note: Jiang teaches a gain measurement device, a transmitter, a data receiver, and training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches classifying additional samples from the batch of unlabeled training data. When Li is applied to Bastos-Filho and Jiang, the resulting system would transmit the generated training samples to a device configured to measure one or more gain values of the amplifier device.).

Bastos-Filho, Jiang, and Li are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data, and Li’s ML training system. The combination would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention because he/she would have been motivated to increase the system’s accuracy, which can be accomplished by using more accurate training data (Jiang, [0004] “The optical signal may also be amplified before or after transmission to enhance performance, for example to compensate for attenuation or noise during transport. Erbium Doped Fiber Amplifiers (EDFAs) are one type of optical amplifiers that are commonly used in optical systems. However, EDFAs may also contribute noise in the optical signals, which needs to be accounted for.”).

As per claim 19, the combination of Bastos-Filho, Jiang, and Li thus far teaches the method of claim 11.
Bastos-Filho does not explicitly teach wherein determining the plurality of variance parameters comprises, for each channel loading of the plurality of channel loadings: determining a plurality of predicted gain values corresponding to the respective channel loading; and determining an empirical variance of the plurality of predicted gain values.
Jiang teaches wherein determining the plurality of variance parameters comprises, for each channel loading of the plurality of channel loadings: determining a plurality of predicted gain values corresponding to the respective channel loading ([0047] “The measurement apparatus 400 may be used to obtain the noise figure and/or gain measurements (as a function of wavelength) for a plurality of different channel loading conditions, e.g., for different sets of selected wavelength or signal channels.” Examiner Note: Bastos-Filho discloses predicting the gain value of a target channel based on an input to the amplifier (i.e., channel loading). Jiang teaches training data comprising an indication of a channel loading for an amplifier device as explained above. Li teaches determining the variance of a neural network output by comparing the outputs of each node of the last layer of the network. Thus, the examiner sees each node of the last layer as a predicted output of the neural network. When Li is applied to Bastos-Filho and Jiang, the resulting system would produce a plurality of predicted gains for each of the input channel loadings.).

Bastos-Filho, Jiang, and Li are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data, and Li’s ML training system. The combination would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention because he/she would have been motivated to increase the system’s accuracy, which can be accomplished by using more accurate training data (Jiang, [0004] “The optical signal may also be amplified before or after transmission to enhance performance, for example to compensate for attenuation or noise during transport. Erbium Doped Fiber Amplifiers (EDFAs) are one type of optical amplifiers that are commonly used in optical systems. However, EDFAs may also contribute noise in the optical signals, which needs to be accounted for.”).

Jiang does not explicitly teach determining an empirical variance of the plurality of predicted gain values.

Li teaches determining an empirical variance of the plurality of predicted gain values (p. 1, Abstract “Uncertainty for a given sample is measured by the difference between the largest two class outputs of the neural network. The less difference there is, the more uncertainty the sample has.” Examiner Note: Bastos-Filho discloses predicting the gain value of a target channel based on an input to the amplifier (i.e., channel loading). Li teaches an empirical measurement (i.e., subtraction) of predicted variance of a neural network. When Li is applied to Bastos-Filho and Jiang, the resulting system would determine an empirical variance of the plurality of predicted gain values.).

Bastos-Filho, Jiang, and Li are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data, and Li’s ML training system. The combination would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention because he/she would have been motivated to increase efficiency of the system, which can be accomplished by using active learning methods (Li, Abstract “Experimental results on Pavia university dataset showed that our method outperforms the current support vector machines (SVMs) based multiclass/level uncertainty (MCLU) method both in classification accuracy and generalization capability.”).
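
Illustrative sketch (not taken from any cited reference): the two quantities discussed for claim 19, namely an empirical variance over a plurality of predicted gain values and the subtraction-based measure quoted from Li's Abstract, might each be computed as shown below. The function names are hypothetical. How the plurality of predictions per channel loading is produced is not stated in the passages quoted above, so the input array is simply assumed to be given, and the use of the unbiased estimator (ddof=1) is likewise an assumption.

```python
import numpy as np

def empirical_variance(predicted_gains: np.ndarray) -> np.ndarray:
    """Sample variance of the predicted gain values, one per channel loading.

    predicted_gains has shape (n_loadings, n_predictions); row i holds the
    predictions assumed to be available for channel loading i.
    ddof=1 (unbiased estimator) is an assumption of this sketch.
    """
    return np.var(predicted_gains, axis=1, ddof=1)

def top_two_difference(outputs: np.ndarray) -> np.ndarray:
    """Difference between the largest two outputs in each row.

    This mirrors the measure quoted from Li's Abstract, where a smaller
    difference indicates a more uncertain sample.
    """
    sorted_out = np.sort(outputs, axis=1)
    return sorted_out[:, -1] - sorted_out[:, -2]

def select_loading(variances: np.ndarray) -> int:
    """Index of the channel loading with the largest variance parameter."""
    return int(np.argmax(variances))
```

For example, with predictions [[1.0, 1.0, 1.0], [1.0, 2.0, 3.0]], the first function returns one variance parameter per row and select_loading picks the second row, whose predictions disagree the most.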

Claims 2-4, 9-10, 12, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of "Estimating the spectral gain and the noise figure of EDFA using artificial neural networks" to Bastos-Filho et al (hereinafter, Bastos-Filho), US 20120269519 A1 to Jiang et al (hereinafter, Jiang), and “Active learning for hyperspectral image classification with a stacked autoencoders based neural network” to Li (hereinafter, Li), further in view of US 20180005136 A1 to Gai et al (hereinafter, Gai).

As per claim 2, the combination of Bastos-Filho, Jiang, and Li thus far teaches the method of claim 1.

Bastos-Filho does not explicitly teach wherein training comprises: receiving the set of labeled training objects; determining a list of features corresponding to the set of labeled training objects; ranking the features in the list of features, thereby generating a ranked list of features; augmenting the labeled training objects to include a subset of features in the ranked list of features; and training the ML model using the augmented labeled training objects and the ML algorithm to generate the ML model. 

Li teaches wherein training comprises: receiving the set of labeled training objects (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget”).

Bastos-Filho, Jiang, and Li are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data, and Li’s ML training system. The combination would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention because he/she would have been motivated to increase efficiency of the system, which can be accomplished by using active learning methods (Li, Abstract “Experimental results on Pavia university dataset showed that our method outperforms the current support vector machines (SVMs) based multiclass/level uncertainty (MCLU) method both in classification accuracy and generalization capability.”).

The combination of Bastos-Filho, Li, and Jiang does not explicitly teach determining a list of features corresponding to the set of labeled training objects; ranking the features in the list of features, thereby generating a ranked list of features; augmenting the labeled training objects to include a subset of features in the ranked list of features; and training the ML model using the augmented labeled training objects and the ML algorithm to generate the ML model.

Gai teaches determining a list of features corresponding to the set of labeled training objects ([0020] “In some implementations, one or more filter feature selection methods, such as the Chi squared test, information gain, and correlation scores may be employed by the feature extraction circuitry 112 to identify at least some of the features 106 present in the initial data set 102. Such filter feature selection methods may employ one or more statistical measures to apply or otherwise assign a scoring to each feature. Features may then be ranked by score and, based at least in part on the score, selected for inclusion in the data set or rejected for inclusion in the data set.”); 
ranking the features in the list of features, thereby generating a ranked list of features ([0020] “In some implementations, one or more filter feature selection methods, such as the Chi squared test, information gain, and correlation scores may be employed by the feature extraction circuitry 112 to identify at least some of the features 106 present in the initial data set 102. Such filter feature selection methods may employ one or more statistical measures to apply or otherwise assign a scoring to each feature. Features may then be ranked by score and, based at least in part on the score, selected for inclusion in the data set or rejected for inclusion in the data set.”); 
augmenting the labeled training objects to include a subset of features in the ranked list of features ([0020] “In some implementations, one or more filter feature selection methods, such as the Chi squared test, information gain, and correlation scores may be employed by the feature extraction circuitry 112 to identify at least some of the features 106 present in the initial data set 102. Such filter feature selection methods may employ one or more statistical measures to apply or otherwise assign a scoring to each feature. Features may then be ranked by score and, based at least in part on the score, selected for inclusion in the data set or rejected for inclusion in the data set.” Examiner Note: Selecting certain ranked features for inclusion in a data set is seen as equivalent to augmenting a set of training objects to include a subset of ranked features.); and 
training the ML model using the augmented labeled training objects and the ML algorithm to generate the ML model ([0020] “In some implementations, one or more filter feature selection methods, such as the Chi squared test, information gain, and correlation scores may be employed by the feature extraction circuitry 112 to identify at least some of the features 106 present in the initial data set 102. Such filter feature selection methods may employ one or more statistical measures to apply or otherwise assign a scoring to each feature. Features may then be ranked by score and, based at least in part on the score, selected for inclusion in the data set or rejected for inclusion in the data set.” Examiner Note: Bastos-Filho teaches training an ML model with training data. Gai teaches augmenting training data as above. When Gai is applied to the combination of Bastos-Filho, Jiang, and Li, the resulting combination would train an ML model using the augmented training data set.).

Bastos-Filho, Jiang, Li, and Gai are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data, Li’s ML training system, and Gai’s training data processing. The combination would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention because he/she would have been motivated to decrease error rate, which can be accomplished by adversarial data processing methods (Gai, [0009] “The systems and methods described herein minimize the worst case loss over all possible compromised features. Such systems and methods improve the performance of the adversarial environment classifier by providing a reduced overall error rate and a more robust defense to adversaries.”).
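
Illustrative sketch (not taken from any cited reference): the filter feature selection quoted from Gai [0020] (assign a score to each feature, rank by score, then select or reject based on the score) might look as follows in Python. Of the statistical measures Gai lists, this sketch uses a correlation score; the choice of absolute Pearson correlation, the function names, and the n_keep parameter are assumptions made only for illustration. Feature columns are assumed to be non-constant, since the correlation is undefined otherwise.

```python
import numpy as np

def rank_features_by_correlation(X: np.ndarray, y: np.ndarray):
    """Score each feature column and return (ranked indices, scores).

    X has shape (n_samples, n_features); y has shape (n_samples,).
    The score is the absolute Pearson correlation with y (an assumption).
    """
    scores = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    )
    # Negate so that the highest score is ranked first; stable for ties.
    ranked = np.argsort(-scores, kind="stable")
    return ranked, scores

def select_top_features(X: np.ndarray, y: np.ndarray, n_keep: int):
    """Keep the n_keep highest-ranked feature columns, in original order."""
    ranked, _ = rank_features_by_correlation(X, y)
    keep = np.sort(ranked[:n_keep])
    return X[:, keep], keep
```

The returned column subset stands in for the selected features; whether keeping a subset of columns amounts to "augmenting" the training objects is the interpretive question addressed in the Examiner Note above, and the sketch takes no position on it.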

As per claim 3, the combination of Bastos-Filho, Jiang, Li, and Gai thus far teaches the method of claim 2.
Bastos-Filho does not explicitly teach wherein the subset of features comprises a number of highest-ranked features in the ranked list.

Gai teaches wherein the subset of features comprises a number of highest-ranked features in the ranked list ([0020] “In some implementations, one or more filter feature selection methods, such as the Chi squared test, information gain, and correlation scores may be employed by the feature extraction circuitry 112 to identify at least some of the features 106 present in the initial data set 102. Such filter feature selection methods may employ one or more statistical measures to apply or otherwise assign a scoring to each feature. Features may then be ranked by score and, based at least in part on the score, selected for inclusion in the data set or rejected for inclusion in the data set.”).

Bastos-Filho, Jiang, Li, and Gai are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data, Li’s ML training system, and Gai’s training data processing. The combination would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention because he/she would have been motivated to decrease error rate, which can be accomplished by adversarial data processing methods (Gai, [0009] “The systems and methods described herein minimize the worst case loss over all possible compromised features. Such systems and methods improve the performance of the adversarial environment classifier by providing a reduced overall error rate and a more robust defense to adversaries.”).

As per claim 4, the combination of Bastos-Filho, Jiang, Li, and Gai thus far teaches the method of claim 3.
Bastos-Filho does not explicitly teach further comprising determining the number of highest-ranked features by determining a combination of features in the ranked list of features that maximizes a performance measure of the generated ML model.

Gai teaches further comprising determining the number of highest-ranked features by determining a combination of features in the ranked list of features that maximizes a performance measure of the generated ML model ([0020] “In some implementations, one or more filter feature selection methods, such as the Chi squared test, information gain, and correlation scores may be employed by the feature extraction circuitry 112 to identify at least some of the features 106 present in the initial data set 102. Such filter feature selection methods may employ one or more statistical measures to apply or otherwise assign a scoring to each feature. Features may then be ranked by score and, based at least in part on the score, selected for inclusion in the data set or rejected for inclusion in the data set.”).

Bastos-Filho, Jiang, Li, and Gai are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data, Li’s ML training system, and Gai’s training data processing. The combination would have been obvious to one of ordinary skill in the art before the filing date of the claimed invention because he/she would have been motivated to decrease error rate, which can be accomplished by adversarial data processing methods (Gai, [0009] “The systems and methods described herein minimize the worst case loss over all possible compromised features. Such systems and methods improve the performance of the adversarial environment classifier by providing a reduced overall error rate and a more robust defense to adversaries.”).
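
Illustrative sketch (not taken from any cited reference): the claim 4 limitation as recited, determining the number of highest-ranked features that maximizes a performance measure, might be sketched in Python as below. This illustrates the claim language only and is not asserted to be a procedure described in Gai [0020]. The function name and the evaluate callable (which would train and score a model on the given features) are hypothetical, and restricting the search to prefixes of the ranked list is an assumption.

```python
from typing import Callable, Sequence, Tuple

def best_number_of_features(
    ranked: Sequence[int],
    evaluate: Callable[[Sequence[int]], float],
) -> Tuple[int, float]:
    """Return (n, score) for the top-n prefix of the ranked list that
    maximizes the performance measure.

    ranked lists feature indices from highest-ranked to lowest-ranked.
    evaluate(features) is a hypothetical callable returning a performance
    measure (higher is better) for a model using only those features.
    """
    best_n, best_score = 0, float("-inf")
    for n in range(1, len(ranked) + 1):
        score = evaluate(ranked[:n])
        # Strict comparison keeps the smallest n among tied scores.
        if score > best_score:
            best_n, best_score = n, score
    return best_n, best_score
```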

As per claim 9, the combination of Bastos-Filho, Jiang, and Li thus far teaches The method of claim 1. 
Bastos-Filho does not explicitly teach further comprising causing display of an indication of the target channel.

Gai teaches further comprising causing display of an indication of the target channel ([0038] “In some implementations, the one or more configurable circuits 204 may execute one or more machine-readable instruction sets that cause all or a portion of the one or more configurable circuits 204 to provide the feature extraction circuitry 110, the sample allocation circuitry 120, and the machine learning circuitry 130. The one or more configurable circuits 204 may communicate with the Northbridge (and other system components) via one or more buses.” [0043] “The one or more video control circuits 240 may receive data from the configurable circuit 204 via the Northbridge 210.” [0044] “One or more video output devices 242 may be communicably coupled to the one or more video control circuits. Such video output devices 242 may be wirelessly communicably coupled to the one or more video control circuits 240 or tethered (i.e., wired) to the one or more video control circuits 242. The one or more video output devices 242 may include, but are not limited to, one or more liquid crystal (LCD) displays; one or more light emitting diode (LED) displays; one or more polymer light emitting diode (PLED) displays; one or more organic light emitting diode (OLED) displays; one or more cathode ray tube (CRT) displays; or any combination thereof.” Examiner Note: Jiang discloses a target channel, as explained above. Gai teaches machine learning circuitry connected to a display. When Gai and Jiang are applied to Bastos-Filho, the resulting system would display an indication of a target channel.).

Bastos-Filho, Jiang, Li, and Gai are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data, Li’s ML training system, and Gai’s training data processing. The combination would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention because he/she would have been motivated to decrease the error rate, which can be accomplished by adversarial data processing methods (Gai, [0009] “The systems and methods described herein minimize the worst case loss over all possible compromised features. Such systems and methods improve the performance of the adversarial environment classifier by providing a reduced overall error rate and a more robust defense to adversaries.”).

As per claim 12, the combination of Bastos-Filho, Jiang, and Li thus far teaches The method of claim 11. 
Bastos-Filho does not explicitly teach wherein training comprises: receiving the set of labeled training objects; determining a list of features corresponding to the set of labeled training objects; removing one or more features in the list of features; generate a ranked list of feature by ranking the features in the list of features; and generating the ML model using an amount of highest-ranked features in the ranked list.
 
Li teaches wherein training comprises: receiving the set of labeled training objects (p. 3, section 2.2, Algorithm 1, “Require: An initial deep neural network A, unlabeled samples of the pool U, batch size (number of samples for label query in each iteration) K and upper limit of label queries (budget). 1: repeat 2: Use A to classify the samples in U. 3: Pick K most uncertain samples from U, and add them to the training set, update U. 4: Train classifier A again. 5: until # of current label queries reaches the budget”).
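For illustration only, the loop of Li's Algorithm 1 as quoted above (classify the pool, pick the K most uncertain samples, add them to the training set, retrain, stop at the query budget) might be sketched as follows. This is a minimal sketch under stated assumptions and not Li's implementation. The callables `oracle`, `fit`, and `uncertainty` are hypothetical stand-ins for the labeling source, the classifier training routine, and Li's uncertainty measure.

```python
def active_learning(labeled, pool, oracle, fit, uncertainty, k, budget):
    """Pool-based active learning with a fixed batch size and label budget.

    labeled:     list of (sample, label) pairs used for the initial training
    pool:        list of unlabeled samples
    oracle:      hypothetical function returning the true label of a sample
    fit:         hypothetical function training a model from (sample, label) pairs
    uncertainty: hypothetical function (model, sample) -> score, higher = less certain
    k:           number of samples queried per iteration
    budget:      upper limit on the total number of label queries
    """
    labeled = list(labeled)  # copy so the caller's lists are not modified
    pool = list(pool)
    queries = 0
    model = fit(labeled)
    while queries < budget and pool:
        # Rank the pool so the most uncertain samples come first.
        pool.sort(key=lambda x: uncertainty(model, x), reverse=True)
        batch = pool[: min(k, budget - queries)]
        pool = pool[len(batch):]
        # Query labels for the selected batch and add them to the training set.
        labeled.extend((x, oracle(x)) for x in batch)
        queries += len(batch)
        # Retrain the classifier on the enlarged training set.
        model = fit(labeled)
    return model, labeled
```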

Bastos-Filho, Jiang, and Li are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data and Li’s ML training system. The combination would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention because he/she would have been motivated to increase the efficiency of the system, which can be accomplished by using active learning methods (Li, Abstract “Experimental results on Pavia university dataset showed that our method outperforms the current support vector machines (SVMs) based multiclass/level uncertainty (MCLU) method both in classification accuracy and generalization capability.”).

The combination of Bastos-Filho, Jiang, and Li does not explicitly teach determining a list of features corresponding to the set of labeled training objects; removing one or more features in the list of features; generate a ranked list of feature by ranking the features in the list of features; and generating the ML model using an amount of highest-ranked features in the ranked list.

Gai teaches determining a list of features corresponding to the set of labeled training objects ([0020] “In some implementations, one or more filter feature selection methods, such as the Chi squared test, information gain, and correlation scores may be employed by the feature extraction circuitry 112 to identify at least some of the features 106 present in the initial data set 102. Such filter feature selection methods may employ one or more statistical measures to apply or otherwise assign a scoring to each feature. Features may then be ranked by score and, based at least in part on the score, selected for inclusion in the data set or rejected for inclusion in the data set.”);
removing one or more features in the list of features ([0020] “In some implementations, one or more filter feature selection methods, such as the Chi squared test, information gain, and correlation scores may be employed by the feature extraction circuitry 112 to identify at least some of the features 106 present in the initial data set 102. Such filter feature selection methods may employ one or more statistical measures to apply or otherwise assign a scoring to each feature. Features may then be ranked by score and, based at least in part on the score, selected for inclusion in the data set or rejected for inclusion in the data set.” Examiner Note: Rejecting a feature for inclusion in the data set is seen as equivalent to removing the feature from the list.); 
generate a ranked list of feature by ranking the features in the list of features ([0020] “In some implementations, one or more filter feature selection methods, such as the Chi squared test, information gain, and correlation scores may be employed by the feature extraction circuitry 112 to identify at least some of the features 106 present in the initial data set 102. Such filter feature selection methods may employ one or more statistical measures to apply or otherwise assign a scoring to each feature. Features may then be ranked by score and, based at least in part on the score, selected for inclusion in the data set or rejected for inclusion in the data set.”); and 
generating the ML model using an amount of highest-ranked features in the ranked list ([0020] “In some implementations, one or more filter feature selection methods, such as the Chi squared test, information gain, and correlation scores may be employed by the feature extraction circuitry 112 to identify at least some of the features 106 present in the initial data set 102. Such filter feature selection methods may employ one or more statistical measures to apply or otherwise assign a scoring to each feature. Features may then be ranked by score and, based at least in part on the score, selected for inclusion in the data set or rejected for inclusion in the data set.” Examiner Note: Bastos-Filho teaches training an ML model with training data. Gai teaches augmenting training data as above. When Gai is applied to the combination of Bastos-Filho, Jiang, and Li, the resulting combination would generate an ML model using the augmented training data set.).

Bastos-Filho, Jiang, Li, and Gai are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data, Li’s ML training system, and Gai’s training data processing. The combination would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention because he/she would have been motivated to decrease the error rate, which can be accomplished by adversarial data processing methods (Gai, [0009] “The systems and methods described herein minimize the worst case loss over all possible compromised features. Such systems and methods improve the performance of the adversarial environment classifier by providing a reduced overall error rate and a more robust defense to adversaries.”).

As per claim 18, the combination of Bastos-Filho, Jiang, and Li thus far teaches The method of claim 11. 
Bastos-Filho does not explicitly teach wherein determining the plurality of channel loadings comprises determining a set of all possible channel loadings for the variance model.

Gai teaches wherein determining the plurality of channel loadings comprises determining a set of all possible channel loadings for the variance model ([0032] “The minmax problem presented in equation (4) may be solved directly by varying all possible scenarios.” Examiner Note: Jiang teaches a gain measurement device, a transmitter, a data receiver, and training data comprising an indication of a channel loading for an amplifier device as explained above. Gai teaches testing all possible scenarios for a model. When Gai is applied to the combination of Bastos-Filho, Jiang, and Li, the resulting system would determine a set of all possible channel loadings for the variance model.).
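For illustration only, determining a set of all possible channel loadings might be sketched as an exhaustive enumeration. This is a minimal sketch that assumes, purely for the example, that a channel loading is a tuple of on/off flags with one flag per channel. Neither Gai nor Jiang is relied upon for that representation, and the function name is hypothetical.

```python
from itertools import product


def all_channel_loadings(num_channels):
    """Return every on/off combination for `num_channels` channels.

    Each loading is a tuple of 0/1 flags (assumed representation), so the
    result contains 2 ** num_channels loadings.
    """
    return list(product((0, 1), repeat=num_channels))
```

Because the number of loadings doubles with each added channel, exhaustive enumeration of this kind is practical only for a small number of channels.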

Bastos-Filho, Jiang, Li, and Gai are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data, Li’s ML training system, and Gai’s training data processing. The combination would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention because he/she would have been motivated to decrease the error rate, which can be accomplished by adversarial data processing methods (Gai, [0009] “The systems and methods described herein minimize the worst case loss over all possible compromised features. Such systems and methods improve the performance of the adversarial environment classifier by providing a reduced overall error rate and a more robust defense to adversaries.”).

Claims 16, 17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of “Estimating the spectral gain and the noise figure of EDFA using artificial neural networks” to Bastos-Filho et al (hereinafter, Bastos-Filho), US 20120269519 A1 to Jiang et al (hereinafter, Jiang), and “Active learning for hyperspectral image classification with a stacked autoencoders based neural network” to Li (hereinafter, Li), further in view of “Ensemble Classification and Regression-Recent Developments, Applications and Future Directions [Review Article]” to Ren et al (hereinafter, Ren).

As per claim 16, the combination of Bastos-Filho, Jiang, and Li thus far teaches The method of claim 11.

Bastos-Filho does not explicitly teach wherein the generated ML model comprises a quadratic model.

Ren teaches wherein the generated ML model comprises a quadratic model (p. 48, left column, “In [152], the authors proposed to train the MKL with sequential minimal optimization (SMO) which is simple, easy to implement and adapt, and efficiently scales to large problems. ‘Support Kernel Machine’ based on the dual formulation of the quadratically constrained quadratic program (QCQP) as a second-order cone programming was proposed in [153]. This work also shows how to exploit the Moreau-Yosida regularization to yield a formulation which can be combined with SMO.” Examiner Note: Bastos-Filho discloses generating an ML model. Ren discloses a quadratic ML model. When Ren is applied to Bastos-Filho, the resulting system would generate a quadratic ML model.).

Bastos-Filho, Jiang, Li, and Ren are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data, Li’s ML training system, and Ren’s ML model variants. The combination would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention because he/she would have been motivated to increase the efficiency of the system, which can be accomplished by a quadratic multiple kernel model (Ren, p.47, middle column “Numerous research works have demonstrated the advantage of multiple kernels over a pre-defined kernel function with optimized parameter values”).

As per claim 17, the combination of Bastos-Filho, Jiang, and Li thus far teaches The method of claim 11. 
Bastos-Filho does not explicitly teach wherein the generated ML model comprises one or more tree models.

Ren teaches wherein the generated ML model comprises one or more tree models (Page 44, middle column, “There are improved versions of random forest reported in the literature: stratified random forest [68] with weighted feature sampling; Instance based random forest [69] for better tree selection; Extremely randomized trees [70] with randomly generated threshold for randomly selected features and its extended versions such as: oblique random forest with additional random combination effect” Examiner Note: Bastos-Filho discloses generating an ML model. Ren discloses a random forest model consisting of a plurality of trees. When Ren is applied to Bastos-Filho, the resulting system would generate a random forest model consisting of a plurality of trees.).

Bastos-Filho, Jiang, Li, and Ren are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data, Li’s ML training system, and Ren’s ML model variants. The combination would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention because he/she would have been motivated to decrease the variance of the system, which can be accomplished by using ensemble learning methods such as decision forests (Ren, p.43, left column “Numerous research indicates that some ensemble methods (such as bagging based ensemble learning) can significantly reduce the variance of the base classifiers [23], [24], while other ensemble methods (such as boosting type approach) can achieve significant bias and variance reduction [25], [26].”).

As per claim 20, the combination of Bastos-Filho, Jiang, and Li thus far teaches The method of claim 19. 
Bastos-Filho does not explicitly teach wherein determining the plurality of predicted gain values comprises determining the output of each tree of a plurality of trees.

Ren teaches wherein determining the plurality of predicted gain values comprises determining the output of each tree of a plurality of trees (Page 44, middle column, “There are improved versions of random forest reported in the literature: stratified random forest [68] with weighted feature sampling; Instance based random forest [69] for better tree selection; Extremely randomized trees [70] with randomly generated threshold for randomly selected features and its extended versions such as: oblique random forest with additional random combination effect” Examiner Note: Bastos-Filho discloses generating an ML model predicting gain values. Ren discloses a random forest model consisting of a plurality of trees. Li teaches determining the variance of a model output by comparing the predictions of the last layer of the model. When Ren is applied to Bastos-Filho, the resulting system would generate a random forest model consisting of a plurality of trees, where the output of each tree is a predicted gain value.).
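For illustration only, determining the output of each tree of a plurality of trees, and the variance across those outputs, might be sketched as follows. This is a minimal sketch and not the implementation of Ren, Li, or applicant. Each tree is modeled as a hypothetical callable that returns a predicted gain value, and the population variance is an assumed choice of variance measure.

```python
from statistics import mean, pvariance


def ensemble_predict(trees, x):
    """Evaluate every tree on input x.

    trees: list of callables, each standing in for one trained tree that
           returns a predicted gain value (hypothetical interface)
    Returns (per-tree outputs, their mean, their population variance).
    """
    outputs = [tree(x) for tree in trees]
    return outputs, mean(outputs), pvariance(outputs)
```

Under these assumptions, the list of per-tree outputs corresponds to the plurality of predicted gain values, and the variance across that list is the quantity a variance model would report.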

Bastos-Filho, Jiang, Li, and Ren are analogous art because they are directed to data processing methods. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Bastos-Filho’s amplifier ML model with Jiang’s amplifier data, Li’s ML training system, and Ren’s ML model variants. The combination would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention because he/she would have been motivated to decrease the variance of the system, which can be accomplished by using ensemble learning methods such as decision forests (Ren, p.43, left column “Numerous research indicates that some ensemble methods (such as bagging based ensemble learning) can significantly reduce the variance of the base classifiers [23], [24], while other ensemble methods (such as boosting type approach) can achieve significant bias and variance reduction [25], [26].”).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. “AGC EDFA transient suppression algorithm assisted by cognitive neural network” to Carvalho et al is considered pertinent due to the EDFA ML model disclosed. “Active Learning Literature Survey” to Settles is considered pertinent due to the active learning methods disclosed.
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL G SMITH whose telephone number is (571)272-9730. The examiner can normally be reached M-F 9:30-18:00 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached at (571)272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



Respectfully Submitted,

/P.G.S./Examiner, Art Unit 2126
/NICHOLAS KLICOS/Primary Examiner, Art Unit 2145