Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This office action is in response to the claims filed on 06/14/2019. 
Claims 1-20 are presented for examination.
Information Disclosure Statement
The information disclosure statements (IDS) filed 06/14/2019, 12/09/2019, 11/04/2020, 05/12/2021, and 11/04/2022 are in compliance with the provisions of 37 CFR 1.97 and 1.98. Accordingly, the information disclosure statements are being considered by the examiner.
Priority
The following claimed benefit is acknowledged: the instant application, filed 06/14/2019, claims priority from provisional application 62830131, filed 04/05/2019.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the applicant regards as the invention.
Claim 1 recites “forming a fourth training dataset based on the third dataset” in line 9. However, the scope of this limitation is unclear, because the claim does not specify what “forming” means or how the fourth training dataset is formed based on the third dataset. “Forming” in this context does not appear to be a widely used term of art, and the applicant does not appear to clearly define the term in the written description. Because the specification does not make clear the scope of the limitation, the term is ambiguous, and a person of ordinary skill in the art would not be able to understand the scope of the claim with reasonable certainty. Therefore, the claim is indefinite. In the interest of compact prosecution, the examiner interprets the limitation “forming a fourth training dataset based on the third dataset” as --the fourth training dataset is the third training dataset-- for the purpose of further examination.
Claims 2-13 depend from claim 1 and are likewise indefinite.
Claim 14 is rejected for the same reasons as claim 1.
Claims 15-19 depend from claim 14 and are likewise indefinite.
Claim 20 is rejected for the same reasons as claim 1.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. 
 The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:


A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-3, 7-9, 14-16, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Dong et al. (Pub. No. US20200210808 – hereinafter, Dong) in view of Haerterich et al. (Pub. No. US20200097763 – hereinafter, Haerterich) and further in view of Agarwal et al. (Pub. No. US20190087728 – hereinafter, Agarwal).
Regarding claim 1, Dong teaches a method for classification, the method comprising: training a variational auto encoder with the second training dataset (Dong, [Par.0018], “Referring to FIGS. 1 and 2, the method may begin at block 102, where a first training process is performed on a semi-supervised adversarial autoencoder model, to generate a first trained semi-supervised adversarial autoencoder model. A first training dataset of transactions may be used in the first training process.” Examiner’s note: the words “first” and “second” are not claimed in a manner that requires an order; they are only names for the training datasets. Therefore, Dong’s first training dataset is considered the second training dataset.),
the variational auto encoder comprising an encoder and a decoder (Dong, [Par.0020], “As shown in the example FIG. 2, the autoencoder 218 includes an encoder 202 and a decoder 204, each of the encoder 202 and decoder 204 may be implemented using a neural network model.”);
generating a third dataset, by feeding pseudorandom vectors into the decoder (Dong, [Par.0020], [Image: media_image1.png (greyscale)]. Examiner’s note: a decoder decompresses a latent variable into reconstructed data that closely matches the input data, which helps to reduce noise in the data. Therefore, the latent variables (xi) are considered pseudorandom vectors, and the reconstructed data is considered the third dataset.);
However, Dong does not teach forming a first training dataset and a second training dataset from a labeled input dataset; 
On the other hand, Haerterich teaches forming a first training dataset and a second training dataset from a labeled input dataset (Haerterich, Par.0052, “The systems and methods described herein improve the performance of the computer system for detecting training data usage in generative models. To demonstrate this several tests were performed using two generative models. A first generative model was a deep convolutional generative adversarial network (DCGAN). A second generative model was a variational autoencoder (VAE) model. The generative models were trained and tested using the Modified National Institute of Standards and Technology (MNIST) dataset. The MNIST dataset is a standard dataset in machine learning and computer vision that includes 70,000 labeled, handwritten digits which are separated into 60,000 training samples and 10,000 other test samples. Each sample includes one digit and is a 28x28 gray scale image. In the demonstrations described herein, a 10% subset of the training images were uses so as to provoke overfitting to the training data” Examiner’s note: the training samples are considered the first training dataset and the test samples are considered the second training dataset. Both come from the MNIST dataset and are labeled; therefore, the MNIST dataset is considered the labeled input dataset.);
Dong and Haerterich are analogous art because they are in the same field of endeavor of using machine learning to classify a dataset.
Accordingly, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dong’s method of improving data classification by using a neural network by combining it with the method of forming a first training dataset and a second training dataset from a labeled input dataset taught by Haerterich. The modification would have been obvious because one of ordinary skill in the art would be motivated to provoke the overfitting to the training data (Haerterich, Par.0052, “The systems and methods described herein improve the performance of the computer system for detecting training data usage in generative models. To demonstrate this several tests were performed using two generative models. A first generative model was a deep convolutional generative adversarial network (DCGAN). A second generative model was a variational autoencoder (VAE) model. The generative models were trained and tested using the Modified National Institute of Standards and Technology (MNIST) dataset. The MNIST dataset is a standard dataset in machine learning and computer vision that includes 70,000 labeled, handwritten digits which are separated into 60,000 training samples and 10,000 other test samples. Each sample includes one digit and is a 28x28 gray scale image. In the demonstrations described herein, a 10% subset of the training images were uses so as to provoke overfitting to the training data”).
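For illustration only, the limitation “forming a first training dataset and a second training dataset from a labeled input dataset” can be sketched as a shuffle-and-split of labeled samples, analogous to the training/test separation described in the cited passage. The function name `split_labeled_dataset`, the split fraction, and the toy data are hypothetical and appear in neither the claims nor the cited references.

```python
import random


def split_labeled_dataset(samples, first_fraction=0.5, seed=0):
    """Shuffle (features, label) pairs and split them into two disjoint subsets.

    Hypothetical helper: the fraction and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * first_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy labeled input dataset: ten (features, label) pairs.
labeled_input = [((i, i + 1), i % 2) for i in range(10)]
first_training, second_training = split_labeled_dataset(labeled_input, 0.6)
```

Every labeled sample ends up in exactly one of the two subsets, so both datasets are derived from the same labeled input dataset.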
However, Dong and Haerterich do not teach training a first classifier with the first training dataset; labeling the third dataset, using the first classifier, to form a third training dataset; forming a fourth training dataset based on the third dataset; and training a second classifier with the fourth training dataset.
On the other hand, Agarwal teaches training a first classifier with the first training dataset (Agarwal, [Par.0036], “For example, the first classifier model (Ml) is a single layer recurrent neural network with LSTM units for classification trained on the first set of training data. This is used as a baseline for classification.”);
labeling the third dataset, using the first classifier, to form a third training dataset (Agarwal, [Par.0005], “classifying the one or more selected queries as queries that exists in the first set of training data and as new queries using a first classified model; augmenting the first set of training data with the new queries to obtain a second set of training data;” Examiner’s note: the first classifier, trained on the first training dataset, classifies the new queries, wherein the new queries are considered the third dataset, which is generated by the augmentation process of the autoencoder. The obtained second set of training data is labeled data; therefore, the classified second set of training data is considered the third training dataset. For further detail on the classification, see [Par.0037], “In an example implementation, the training bias correction module 108 selects one or more of the new queries (top k) which are correctly classified by the first classifier model based on an entropy of a Softmax distribution function. In this example implementation, to obtain a label for the novel questions generated by the VAE, the training bias correction module 108 uses Ml and chooses the top K sentences, based on the entropy of the softmax distribution, as candidates for augmenting the training data. Also, the training bias correction module 108 enables the user to identify the selected queries which are wrongly classified by the first classifier model. In an embodiment, the training bias correction module 108 enables the user to verify the label and correct the label if it is incorrectly classified by Ml. Also, the training bias correction module 108 removes the questions that clearly correspond to new classes.”);
forming a fourth training dataset based on the third dataset (Agarwal, [Par.0038], “Further, the training bias correction module 108 augments the first set of training data with top k new queries correctly classified by the first classifier model (Ml) and the queries which are wrongly classified by the first classified model to obtain a second set of training data. Furthermore, the training bias correction module 108 trains a second classifier model using the second set of training data, thus correcting linguistic training bias in training data.” Examiner’s note: the obtained second set of training data is considered the third training dataset. However, the claim does not clearly define what “forming” means; therefore, the examiner interprets the obtained second set of training data (third training dataset) as being the same as the fourth training dataset.);
and training a second classifier with the fourth training dataset (Agarwal, [Par.0041], “In an example embodiment, one or more of the new queries which are correctly classified by the first classifier model are selected based on an entropy of a softmax distribution function. Further, the first set of training data is augmented with the one or more of the new queries which are correctly classified by the first classifier model. In some embodiments, the user is enabled to identify the selected queries which are wrongly classified by the first classifier model. Further, the second set of training data is augmented with the queries which are wrongly classified by the first classifier model. At block 516, a second classifier model is trained using the second set of training data, thus correcting linguistic training bias in training data.” Examiner’s note: the second classifier is trained on the obtained second set of training data; therefore, the second set of training data is considered the fourth training dataset).
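For illustration only, the selection “based on an entropy of a softmax distribution function” quoted above can be sketched as follows. The sketch assumes that “top k” means the k candidates with the lowest softmax entropy (the most confident predictions); the cited passage does not state the direction of the ranking. The names `select_top_k` and `toy_logits` are hypothetical and do not come from Agarwal.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_top_k(candidates, classifier_logits, k):
    """Label each candidate with the classifier and keep the k lowest-entropy ones.

    Assumption: lower entropy is treated as "better"; this is an interpretation.
    """
    scored = []
    for sample in candidates:
        probs = softmax(classifier_logits(sample))
        label = max(range(len(probs)), key=probs.__getitem__)
        scored.append((entropy(probs), sample, label))
    scored.sort(key=lambda item: item[0])
    return [(sample, label) for _, sample, label in scored[:k]]


def toy_logits(x):
    # Hypothetical stand-in for a trained first classifier with two classes.
    return [x, -x]


selected = select_top_k([0.1, 3.0, -2.0, 0.5], toy_logits, k=2)
```

In this toy run the two candidates farthest from the decision boundary (3.0 and -2.0) are kept, each paired with the label assigned by the stand-in classifier.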
Dong, Haerterich, and Agarwal are analogous art because they are in the same field of endeavor of using machine learning to classify a dataset.
Accordingly, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dong’s method of improving data classification by using a neural network to include training a first classifier with the first training dataset, labeling the third dataset, using the first classifier, to form a third training dataset, forming a fourth training dataset based on the third dataset, and training a second classifier with the fourth training dataset, as taught by Agarwal. The modification would have been obvious because one of ordinary skill in the art would be motivated to correct overfitting due to training bias in the training data (Agarwal, [Par.0043], “The various embodiments described in FIGS. 1-6 propose an approach for a generative model, which uses LSTM-VAE followed by sentence selection using a LM for correcting linguistic training bias in training data. In this approach, weighted cost annealing technique is used for training the LSTM-VAE. When such sentences are added to the training set, it indirectly forces the model to learn to distinguish the classes based on some other words than such non-concept words. Thus, augmenting training data with automatically generated sentences is able to correct over fitting due to linguistic training bias. The newly generated sentences sometimes belonged to completely new classes not present in the original training data. Further, augmenting training data with automatically generated sentences results in an improved accuracy (2%) of the deep-learning classifier.”).
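For illustration only, the data flow of the claim 1 limitations mapped above (generate samples from a decoder, label them with a first classifier, combine, train a second classifier) can be sketched as follows. This is a simplified sketch under stated assumptions: no variational auto encoder is actually trained, `toy_decoder` is a hypothetical fixed function standing in for a trained decoder, and a nearest-centroid classifier stands in for both classifiers. None of these names or choices come from the application or the cited references.

```python
import random


def train_centroid_classifier(labeled):
    """Fit a nearest-centroid classifier on (vector, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def predict(x):
        # Return the label whose centroid is closest in squared distance.
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return predict


def toy_decoder(z):
    # Hypothetical stand-in for a trained decoder: latent vector -> data space.
    return [2.0 * z[0] + 2.0, 2.0 * z[1] + 2.0]


# First training dataset: labeled 2-D points (toy data).
first_training = [
    ([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 0),
    ([4.0, 4.0], 1), ([4.0, 3.0], 1), ([3.0, 4.0], 1),
]
first_classifier = train_centroid_classifier(first_training)

# Third dataset: decoder outputs for pseudorandom latent vectors.
rng = random.Random(0)
third_dataset = [
    toy_decoder([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]) for _ in range(20)
]

# Third training dataset: the generated samples, labeled by the first classifier.
third_training = [(x, first_classifier(x)) for x in third_dataset]

# Fourth training dataset: here, labeled input data combined with the pseudo-labeled data.
fourth_training = first_training + third_training
second_classifier = train_centroid_classifier(fourth_training)
```

Under the interpretation adopted for examination, the fourth training dataset could equally be set to `third_training` alone; the combined form shown here corresponds to the variant recited in claim 8.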
Claim 14 is rejected for the same reasons as claim 1.
Additionally, Dong further teaches a system, comprising: a processing circuit configured (Dong, [Par.0051], “The computer system 800 may transmit and receive messages, data, information and instructions, including one or more programs (i.e., application code) through the communication link 824 and the network interface component 812. The network interface component 812 may include an antenna, either separate or integrated, to enable transmission and reception via the communication link 824. Received program code may be executed by processor 804 as received and/or stored in disk drive component 810 or some other non-volatile storage component for execution”).
Claim 20 is rejected for the same reasons as claim 1.
Additionally, Dong further teaches a system for classifying manufactured parts as good or defective, the system comprising: a data collection circuit; and a processing circuit, the processing circuit being configured (Dong, [Par.0018-0019], “Referring to FIG. 1, an embodiment of a method 100 for providing data augmentation for fraud detection using a neural network system is illustrated. Referring to FIGS. 1 and 2, the method may begin at block 102, where a first training process is performed on a semi-supervised adversarial autoencoder model, to generate a first trained semi-supervised adversarial autoencoder model. A first training dataset of transactions may be used in the first training process. FIG. 1 will be further discussed in detail below after an explanation of FIG. 2. All or a portion of the operations referred to in FIGS. 1, 2, 3, 4, 5, and elsewhere herein may be performed in various embodiments by any suitable computer system including system 800 as discussed in FIG. 8. Such a computer system may comprise multiple processors and/or server systems in some instances (e.g. a cloud cluster or other computer cluster). [0019] Referring to FIG. 2, a neural network system 200 for fraud detection including a semi-supervised adversarial autoencoder model is illustrated, according to some embodiments. The neural network system 200 as shown includes an autoencoder 218, a generative adversarial network (GAN) 214 with a prior distribution discriminator 210 (also referred to as a prior distribution GAN 214), and a GAN 216 with a fraud discriminator 206 (also referred to as a fraudulent transaction GAN 216).”).
Regarding claim 2, Dong as modified in view of Haerterich teaches the method of claim 1, wherein the first training dataset is the labeled input dataset (Haerterich, Par.0052, “The systems and methods described herein improve the performance of the computer system for detecting training data usage in generative models. To demonstrate this several tests were performed using two generative models. A first generative model was a deep convolutional generative adversarial network (DCGAN). A second generative model was a variational autoencoder (VAE) model. The generative models were trained and tested using the Modified National Institute of Standards and Technology (MNIST) dataset. The MNIST dataset is a standard dataset in machine learning and computer vision that includes 70,000 labeled, handwritten digits which are separated into 60,000 training samples and 10,000 other test samples. Each sample includes one digit and is a 28x28 gray scale image. In the demonstrations described herein, a 10% subset of the training images were uses so as to provoke overfitting to the training data” Examiner’s note: the training samples are considered the first training dataset, which comes from the MNIST dataset and is labeled; therefore, the MNIST dataset is considered the labeled input dataset.);
Dong and Haerterich are analogous art because they are in the same field of endeavor of using machine learning to classify a dataset.
Accordingly, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dong’s method of improving data classification by using a neural network so that the first training dataset is the labeled input dataset, as taught by Haerterich. The modification would have been obvious because one of ordinary skill in the art would be motivated to provoke the overfitting to the training data (Haerterich, Par.0052, “The systems and methods described herein improve the performance of the computer system for detecting training data usage in generative models. To demonstrate this several tests were performed using two generative models. A first generative model was a deep convolutional generative adversarial network (DCGAN). A second generative model was a variational autoencoder (VAE) model. The generative models were trained and tested using the Modified National Institute of Standards and Technology (MNIST) dataset. The MNIST dataset is a standard dataset in machine learning and computer vision that includes 70,000 labeled, handwritten digits which are separated into 60,000 training samples and 10,000 other test samples. Each sample includes one digit and is a 28x28 gray scale image. In the demonstrations described herein, a 10% subset of the training images were uses so as to provoke overfitting to the training data”).
Claim 15 is rejected for the same reasons as claim 2.
Regarding claim 3, Dong as modified in view of Haerterich teaches the method of claim 1, wherein the second training dataset is the labeled input dataset (Haerterich, Par.0052, “The systems and methods described herein improve the performance of the computer system for detecting training data usage in generative models. To demonstrate this several tests were performed using two generative models. A first generative model was a deep convolutional generative adversarial network (DCGAN). A second generative model was a variational autoencoder (VAE) model. The generative models were trained and tested using the Modified National Institute of Standards and Technology (MNIST) dataset. The MNIST dataset is a standard dataset in machine learning and computer vision that includes 70,000 labeled, handwritten digits which are separated into 60,000 training samples and 10,000 other test samples. Each sample includes one digit and is a 28x28 gray scale image. In the demonstrations described herein, a 10% subset of the training images were uses so as to provoke overfitting to the training data” Examiner’s note: the test samples are considered the second training dataset, which comes from the MNIST dataset and is labeled; therefore, the MNIST dataset is considered the labeled input dataset.);
Dong and Haerterich are analogous art because they are in the same field of endeavor of using machine learning to classify a dataset.
Accordingly, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dong’s method of improving data classification by using a neural network so that the second training dataset is the labeled input dataset, as taught by Haerterich. The modification would have been obvious because one of ordinary skill in the art would be motivated to provoke the overfitting to the training data (Haerterich, Par.0052, “The systems and methods described herein improve the performance of the computer system for detecting training data usage in generative models. To demonstrate this several tests were performed using two generative models. A first generative model was a deep convolutional generative adversarial network (DCGAN). A second generative model was a variational autoencoder (VAE) model. The generative models were trained and tested using the Modified National Institute of Standards and Technology (MNIST) dataset. The MNIST dataset is a standard dataset in machine learning and computer vision that includes 70,000 labeled, handwritten digits which are separated into 60,000 training samples and 10,000 other test samples. Each sample includes one digit and is a 28x28 gray scale image. In the demonstrations described herein, a 10% subset of the training images were uses so as to provoke overfitting to the training data”).
Claim 16 is rejected for the same reasons as claim 3.
Regarding claim 7, Dong as modified in view of Agarwal teaches the method of claim 4, wherein the fourth training dataset is the same as the third training dataset (Agarwal, [Par.0041], “In an example embodiment, one or more of the new queries which are correctly classified by the first classifier model are selected based on an entropy of a softmax distribution function. Further, the first set of training data is augmented with the one or more of the new queries which are correctly classified by the first classifier model. In some embodiments, the user is enabled to identify the selected queries which are wrongly classified by the first classifier model. Further, the second set of training data is augmented with the queries which are wrongly classified by the first classifier model. At block 516, a second classifier model is trained using the second set of training data, thus correcting linguistic training bias in training data.” Examiner’s note: the second set of training data (third training dataset) is obtained by data augmentation, as described in [Par.0005], “classifying the one or more selected queries as queries that exists in the first set of training data and as new queries using a first classified model; augmenting the first set of training data with the new queries to obtain a second set of training data;” The second classifier is then trained on the obtained second set of training data (third training dataset), so the second set of training data is also considered the fourth training dataset; therefore, the fourth training dataset is the same as the third training dataset.).
Dong, Haerterich, and Agarwal are analogous art because they are in the same field of endeavor of using machine learning to classify a dataset.
Accordingly, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dong’s method of improving data classification by using a neural network, in combination with Agarwal, so that the fourth training dataset is the same as the third training dataset. The modification would have been obvious because one of ordinary skill in the art would be motivated to correct overfitting due to training bias in the training data (Agarwal, [Par.0043], “The various embodiments described in FIGS. 1-6 propose an approach for a generative model, which uses LSTM-VAE followed by sentence selection using a LM for correcting linguistic training bias in training data. In this approach, weighted cost annealing technique is used for training the LSTM-VAE. When such sentences are added to the training set, it indirectly forces the model to learn to distinguish the classes based on some other words than such non-concept words. Thus, augmenting training data with automatically generated sentences is able to correct over fitting due to linguistic training bias. The newly generated sentences sometimes belonged to completely new classes not present in the original training data. Further, augmenting training data with automatically generated sentences results in an improved accuracy (2%) of the deep-learning classifier.”).
Regarding claim 8, Dong as modified in view of Agarwal teaches the method of claim 4, wherein the forming of the fourth training dataset comprises combining: a first portion of the labeled input dataset, and the third training dataset to form the fourth training dataset (Agarwal, [Par.0005], “classifying the one or more selected queries as queries that exists in the first set of training data and as new queries using a first classified model; augmenting the first set of training data with the new queries to obtain a second set of training data;” Examiner’s note: the second set of training data (fourth training dataset) is obtained from the first set of training data through the data augmentation process.).
Dong, Haerterich, and Agarwal are analogous art because they are in the same field of endeavor of using machine learning to classify a dataset.
Accordingly, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dong’s method of improving data classification by using a neural network, in combination with Agarwal, so that the forming of the fourth training dataset comprises combining: a first portion of the labeled input dataset, and the third training dataset to form the fourth training dataset. The modification would have been obvious because one of ordinary skill in the art would be motivated to correct overfitting due to training bias in the training data (Agarwal, [Par.0043], “The various embodiments described in FIGS. 1-6 propose an approach for a generative model, which uses LSTM-VAE followed by sentence selection using a LM for correcting linguistic training bias in training data. In this approach, weighted cost annealing technique is used for training the LSTM-VAE. When such sentences are added to the training set, it indirectly forces the model to learn to distinguish the classes based on some other words than such non-concept words. Thus, augmenting training data with automatically generated sentences is able to correct over fitting due to linguistic training bias. The newly generated sentences sometimes belonged to completely new classes not present in the original training data. Further, augmenting training data with automatically generated sentences results in an improved accuracy (2%) of the deep-learning classifier.”).
Regarding claim 9, Dong as modified in view of Agarwal teaches the method of claim 4, wherein the forming of the fourth training dataset comprises combining: a first portion of the labeled input dataset, the first supplementary dataset, and the third training dataset to form the fourth training dataset (Agarwal, [Par.0005], “classifying the one or more selected queries as queries that exists in the first set of training data and as new queries using a first classified model; augmenting the first set of training data with the new queries to obtain a second set of training data;”).
Dong, Haerterich, and Agarwal are analogous art because they are in the same field of endeavor of using machine learning to classify a dataset.
Accordingly, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dong’s method of improving data classification by using a neural network so that the forming of the fourth training dataset comprises combining: a first portion of the labeled input dataset, the first supplementary dataset, and the third training dataset to form the fourth training dataset, as taught by Agarwal. The modification would have been obvious because one of ordinary skill in the art would be motivated to correct overfitting due to linguistic training bias during a training process (Agarwal, [Par.0043], “The various embodiments described in FIGS. 1-6 propose an approach for a generative model, which uses LSTM-VAE followed by sentence selection using a LM for correcting linguistic training bias in training data. In this approach, weighted cost annealing technique is used for training the LSTM-VAE. When such sentences are added to the training set, it indirectly forces the model to learn to distinguish the classes based on some other words than such non-concept words. Thus, augmenting training data with automatically generated sentences is able to correct over fitting due to linguistic training bias. The newly generated sentences sometimes belonged to completely new classes not present in the original training data. Further, augmenting training data with automatically generated sentences results in an improved accuracy (2%) of the deep-learning classifier.”).

Claims 4, 11, 12, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Dong et al. (Pub. No. US20200210808 – hereinafter, Dong) in view of Haerterich et al. (Pub. No. US20200097763 – hereinafter, Haerterich), further in view of Agarwal et al. (Pub. No. US20190087728 – hereinafter, Agarwal), and further in view of Chitta et al. (Pub. No. US20200143274 – hereinafter, Chitta).
Regarding claim 4, Dong as modified in view of Chitta teaches the method of claim 1, wherein the forming of the first training dataset comprises (Chitta, [Par.0022], “The data set balancing module 108 may create a balanced training data set from the imbalanced training data set. A training data set may be said to be imbalanced is there exists substantial inequality between the majority class of instances and the minority class of instances. As an example, the question "Can this contract be assigned without consent?" may be answered either as "yes" or "no". There may be 90 instances where the answer may be "yes" and, only 10 instances where the answer may be "no". The 90 instances where the answer may be 'yes' may constitute a majority class of instances, whereas the 10 instances where the answer may be 'no' may constitute a minority class of instances. Such an imbalanced training data may lead to an inaccurate and unreliable output when the system tries to predict answer to multiple choice questions. The data set balancing module 108 is configured to counter the effect of the imbalanced training data on the output by converting the imbalanced training data set to a balanced training data set. Example of how the balancing is carried out in discussed later in this document.” Examiner’s note: balancing the training dataset between a majority class (the “yes” class) and a minority class (the “no” class) is considered forming a training dataset; therefore, the balanced training dataset with the majority class, which is generated (formed) by the data set balancing module, is considered the first training dataset.):
oversampling the labeled input dataset, to produce a first supplementary dataset (Chitta, [Par.0032-0038], “Referring to a step 206 the evidence features 204a, the question features 202b, and the answers 200c may pass through data set balancing module 108, in accordance to an embodiment. The input to the data set balancing module 108 comprising the evidence features 204a, the question features 202b, and the answers 200c, may include an imbalanced number of the instances, as discussed earlier. Balanced data set may be generated from the imbalanced data set by the data set balancing module 108. The generation of the balanced data set may be achieved by the implementation of SMOTE (Synthetic Minority Oversampling Technique) algorithm

[Image: media_image2.png]
…The process described above, may be repeated till the number of the instances of the minority class is approximately equal to the number of the instances of the majority class. According to an embodiment, the output of the data set balancing module 108 may be the balanced data set generated by the SMOTE algorithm.” Examiner’s note: the group of instances Xnm selected from the instances X (the labeled input dataset) is considered the supplementary dataset, which is generated by the SMOTE (Synthetic Minority Oversampling Technique) algorithm. Therefore, applying the SMOTE algorithm is considered oversampling. The claim does not define what oversampling means.);
and combining the labeled input dataset and the first supplementary dataset to form the first training dataset (Chitta, [Par.0032-0038], “



[Image: media_image2.png]
[Image: media_image3.png]
”
Examiner’s note: the Xnew dataset generated (formed) by the data set balancing module is the combination of the instances X (labeled input dataset) and Xnm (supplementary dataset).).
Dong, Haerterich, Agarwal and Chitta are analogous art because they are in the same field of endeavor of using machine learning to classify a dataset.
Accordingly, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dong’s method of improving a data classification by using a neural network in combination with the method in which the forming of the first training dataset comprises: oversampling the labeled input dataset, to produce a first supplementary dataset; and combining the labeled input dataset and the first supplementary dataset to form the first training dataset, as taught by Chitta. The modification would have been obvious because one of ordinary skill in the art would be motivated to balance the imbalanced training dataset (Chitta, [Par.0032], “Referring to a step 206 the evidence features 204a, the question features 202b, and the answers 200c may pass through data set balancing module 108, in accordance to an embodiment. The input to the data set balancing module 108 comprising the evidence features 204a, the question features 202b, and the answers 200c, may include an imbalanced number of the instances, as discussed earlier. Balanced data set may be generated from the imbalanced data set by the data set balancing module 108. The generation of the balanced data set may be achieved by the implementation of SMOTE (Synthetic Minority Oversampling Technique) algorithm”).
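For illustration only, the oversample-then-combine sequence mapped in the rejection of claim 4 can be sketched as follows. The sketch assumes the standard SMOTE interpolation of Chawla et al. (a synthetic point is placed on the line segment between a minority instance and one of its nearest minority neighbors) and a two-class dataset. It is not taken from Chitta's figures or from the applicant's specification, and the names `smote` and `form_training_set` are hypothetical.

```python
import random

def smote(minority, n_new, k=5, rng=None):
    """Return n_new synthetic points interpolated between minority neighbors.

    Assumption: standard SMOTE interpolation, not any cited reference's exact steps.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority instances to interpolate")
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(max(0, n_new)):
        x = rng.choice(minority)
        # k nearest minority neighbors of x by squared Euclidean distance
        others = [p for p in minority if p is not x]
        neighbors = sorted(
            others, key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p))
        )[:k]
        nn = rng.choice(neighbors)
        u = rng.random()  # position along the segment from x to nn
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nn)))
    return synthetic

def form_training_set(labeled, minority_label):
    """Oversample the minority class, then combine with the labeled input."""
    minority = [x for x, y in labeled if y == minority_label]
    n_majority = len(labeled) - len(minority)  # two-class assumption
    supplementary = [
        (x, minority_label) for x in smote(minority, n_majority - len(minority))
    ]
    # Training dataset = labeled input dataset + supplementary dataset
    return labeled + supplementary, supplementary
```

With 90 majority and 10 minority instances, as in the example quoted from Chitta, this sketch would generate 80 synthetic minority instances so that the two classes are equal in number.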
Claim 17 is rejected for the same reasons as claim 4.
Regarding claim 11, Dong as modified in view of Chitta teaches the method of claim 1, wherein the forming of the second training dataset comprises (Chitta, [Par.0022], “The data set balancing module 108 may create a balanced training data set from the imbalanced training data set. A training data set may be said to be imbalanced is there exists substantial inequality between the majority class of instances and the minority class of instances. As an example, the question "Can this contract be assigned without consent?" may be answered either as "yes" or "no". There may be 90 instances where the answer may be "yes" and, only 10 instances where the answer may be "no". The 90 instances where the answer may be 'yes' may constitute a majority class of instances, whereas the 10 instances where the answer may be 'no' may constitute a minority class of instances. Such an imbalanced training data may lead to an inaccurate and unreliable output when the system tries to predict answer to multiple choice questions. The data set balancing module 108 is configured to counter the effect of the imbalanced training data on the output by converting the imbalanced training data set to a balanced training data set. Example of how the balancing is carried out in discussed later in this document.” Examiner’s note: balancing the training dataset between a majority class (the “yes” class) and a minority class (the “no” class) is considered forming a training dataset; therefore, a balanced training dataset with the minority class is considered the second training dataset, which is generated (formed) by the data set balancing module.):
oversampling the labeled input dataset, to produce a first supplementary dataset (Chitta, [Par.0032-0038], “Referring to a step 206 the evidence features 204a, the question features 202b, and the answers 200c may pass through data set balancing module 108, in accordance to an embodiment. The input to the data set balancing module 108 comprising the evidence features 204a, the question features 202b, and the answers 200c, may include an imbalanced number of the instances, as discussed earlier. Balanced data set may be generated from the imbalanced data set by the data set balancing module 108. The generation of the balanced data set may be achieved by the implementation of SMOTE (Synthetic Minority Oversampling Technique) algorithm

[Image: media_image2.png]
…The process described above, may be repeated till the number of the instances of the minority class is approximately equal to the number of the instances of the majority class. According to an embodiment, the output of the data set balancing module 108 may be the balanced data set generated by the SMOTE algorithm.” Examiner’s note: the group of instances Xnm selected from the instances X (the labeled input dataset) is considered the supplementary dataset, which is generated by the SMOTE (Synthetic Minority Oversampling Technique) algorithm. Therefore, applying the SMOTE algorithm is considered oversampling. The claim does not define what oversampling means.);
and combining the labeled input dataset and the first supplementary dataset to form the second training dataset ((Chitta, [Par.0032-0038], “



[Image: media_image2.png]
[Image: media_image3.png]
”
Examiner’s note: the Xnew dataset generated (formed) by the data set balancing module is the combination of the instances X (labeled input dataset) and Xnm (supplementary dataset). Therefore, the balanced training dataset Xnew generated by the data set balancing module is considered the second training dataset.)).
Dong, Haerterich, Agarwal and Chitta are analogous art because they are in the same field of endeavor of using machine learning to classify a dataset.
Accordingly, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dong’s method of improving a data classification by using a neural network in combination with the method in which the forming of the second training dataset comprises: oversampling the labeled input dataset, to produce a first supplementary dataset; and combining the labeled input dataset and the first supplementary dataset to form the second training dataset, as taught by Chitta. The modification would have been obvious because one of ordinary skill in the art would be motivated to balance the imbalanced training dataset (Chitta, [Par.0032], “Referring to a step 206 the evidence features 204a, the question features 202b, and the answers 200c may pass through data set balancing module 108, in accordance to an embodiment. The input to the data set balancing module 108 comprising the evidence features 204a, the question features 202b, and the answers 200c, may include an imbalanced number of the instances, as discussed earlier. Balanced data set may be generated from the imbalanced data set by the data set balancing module 108. The generation of the balanced data set may be achieved by the implementation of SMOTE (Synthetic Minority Oversampling Technique) algorithm”).
Regarding claim 12, Dong as modified in view of Chitta teaches the method of claim 1, wherein the labeled input dataset comprises: majority class data comprising a first number of data elements and minority class data comprising a second number of data elements, the first number exceeding the second number by a factor of at least 5 (Chitta, [Par.0022], “The data set balancing module 108 may create a balanced training data set from the imbalanced training data set. A training data set may be said to be imbalanced is there exists substantial inequality between the majority class of instances and the minority class of instances. As an example, the question "Can this contract be assigned without consent?" may be answered either as "yes" or "no". There may be 90 instances where the answer may be "yes" and, only 10 instances where the answer may be "no". The 90 instances where the answer may be 'yes' may constitute a majority class of instances, whereas the 10 instances where the answer may be 'no' may constitute a minority class of instances. Such an imbalanced training data may lead to an inaccurate and unreliable output when the system tries to predict answer to multiple choice questions. The data set balancing module 108 is configured to counter the effect of the imbalanced training data on the output by converting the imbalanced training data set to a balanced training data set. Example of how the balancing is carried out in discussed later in this document.” Examiner’s note: the first number is the 90 “yes” instances of the majority class, and the second number is the 10 “no” instances of the minority class; 90 is 9 times 10, and therefore the first number exceeds the second number by a factor of at least 5.).
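For illustration only, the ratio relied upon in the examiner's note can be checked with a short sketch. The function name `imbalance_factor` is hypothetical, and the label counts are the 90 "yes" and 10 "no" instances of the example quoted from Chitta.

```python
from collections import Counter

def imbalance_factor(labels):
    """Ratio of the largest class count to the smallest class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

# Example quoted from Chitta: 90 "yes" answers and 10 "no" answers
labels = ["yes"] * 90 + ["no"] * 10
factor = imbalance_factor(labels)
meets_factor_of_5 = factor >= 5
```

For these counts the factor is 90 / 10 = 9, which satisfies the "at least 5" threshold recited in claim 12.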
Claims 5, 13, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Dong et al. (Pub. No. US20200210808– hereinafter, Dong) in view of Haerterich et al. (Pub. No. US20200097763– hereinafter, Haerterich) and further in view of AGARWAL et al. (Pub. No. US20190087728– hereinafter, Agarwal) and further in view of Chitta et al. (Pub. No. US20200143274– hereinafter, Chitta) and further in view of Dalek et al. (Pub. No.: US20190370384-hereinafter, Dalek).
Regarding claim 5, Dong as modified in view of Dalek teaches the method of claim 4, wherein the oversampling of the labeled input dataset comprises using a synthetic minority over-sampling technique (Dalek, [Par.0023], “The data labeling process 14 is the process that is being improved by the below described label propagation system. The data labeling process 14 may include the processes of identifying the classes and attribute noise that may impact separation of the classes. The data labeling process 14 may also include identifying class imbalance issues and possible steps for addressing these, including under-sampling the majority class, over-sampling the minority class, or by creating synthetic samples using techniques such as SMOTE (N. V. Chawla, K. W. Bowyer, L. O. Hall and W. P. Kegelmeyer (2002) "SMOTE: Synthetic Minority Over-sampling Technique"). As described above, this labeling process may be performed with known techniques, but those known techniques have the technical problems of inaccurate and/or inconsistent labels that adversely affect the supervised machine learning. In contrast, the label propagation process and system described below provides a technical solution to the above problem and provides accurate and consistent labeled datasets that enhance the supervised machine learning process.”).
Dong, Haerterich, Agarwal, Chitta and Dalek are analogous art because they are in the same field of endeavor of using a machine learning method to improve data classification.
Accordingly, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dong’s method of improving a data classification by using a neural network such that the oversampling of the labeled input dataset comprises using a synthetic minority over-sampling technique, as taught by Dalek. The modification would have been obvious because one of ordinary skill in the art would be motivated to address a class imbalance issue (Dalek, [Par.0023], “The data labeling process 14 is the process that is being improved by the below described label propagation system. The data labeling process 14 may include the processes of identifying the classes and attribute noise that may impact separation of the classes. The data labeling process 14 may also include identifying class imbalance issues and possible steps for addressing these, including under-sampling the majority class, over-sampling the minority class, or by creating synthetic samples using techniques such as SMOTE (N. V. Chawla, K. W. Bowyer, L. O. Hall and W. P. Kegelmeyer (2002) "SMOTE: Synthetic Minority Over-sampling Technique"). As described above, this labeling process may be performed with known techniques, but those known techniques have the technical problems of inaccurate and/or inconsistent labels that adversely affect the supervised machine learning. In contrast, the label propagation process and system described below provides a technical solution to the above problem and provides accurate and consistent labeled datasets that enhance the supervised machine learning process.”).
Claim 18 is rejected for the same reasons as claim 5.
Regarding claim 13, Dong as modified in view of Dalek teaches the method of claim 12, wherein the first number exceeds the second number by a factor of at least 15 (Dalek, [Par.0036], “If each URL request is treated as input data to a binary classifier, this dataset would be unbalanced at a ratio of 2000000 to 3 since all non-RIG entries belongs to the majority class. A significant increase in the amount of RIG samples is necessary before pursuing a supervised machine learning approach.”).
Dong, Haerterich, Agarwal, Chitta and Dalek are analogous art because they are in the same field of endeavor of using a machine learning method to improve data classification.
Accordingly, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dong’s method of improving a data classification by using a neural network such that the first number exceeds the second number by a factor of at least 15, as taught by Dalek. The modification would have been obvious because one of ordinary skill in the art would be motivated to increase performance (Dalek, [Par.0037], “The system and method performance may be increased by adding a significant amount of majority class samples, described further below when using a Diverse Resistance undersampling method to balance the training data. Note that an unbalanced data input and binary classification scenario is just a common special case and the approach will work equally well for balanced datasets and multi-label classification.”).
Claims 6 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Dong et al. (Pub. No. US20200210808– hereinafter, Dong) in view of Haerterich et al. (Pub. No. US20200097763– hereinafter, Haerterich) and further in view of AGARWAL et al. (Pub. No. US20190087728– hereinafter, Agarwal) and further in view of Chitta et al. (Pub. No. US20200143274– hereinafter, Chitta) and further in view of Siriseriwan et al. (NPL: Adaptive neighbor synthetic minority oversampling technique - Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Pathum Wan, Bangkok, 10300 Thailand, hereinafter, Siriseriwan).
Regarding claim 6, Dong as modified in view of Siriseriwan teaches the method of claim 4, wherein the oversampling of the labeled input dataset comprises using an adaptive synthetic over-sampling technique (Siriseriwan, [Sec.4], “Adaptive neighbor Synthetic Minority Oversampling TEchnique under 1NN outcast handling or ANS is introduced based on two objectives. The first objective is to override the decision on a single value of K from a user using Ki for each positive instance pi. Ki is the number of possible positive neighbors that is chosen to pair up with a positive instance pi in order to create synthetic instances along the line segment of between that pair…

[Image: media_image4.png]
”)
Dong, Haerterich, Agarwal, Chitta and Siriseriwan are analogous art because they are in the same field of endeavor of using a machine learning method to improve data classification.
Accordingly, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dong’s method of improving a data classification by using a neural network such that the oversampling of the labeled input dataset comprises using an adaptive synthetic over-sampling technique, as taught by Siriseriwan. The modification would have been obvious because one of ordinary skill in the art would be motivated to increase the performance of data balancing (Siriseriwan, [Abstract], “This paper introduces a new adaptive algorithm called Adaptive neighbor Synthetic Minority Oversampling Technique (ANS) to dynamically adapt the number of neighbors needed for oversampling around different minority regions. This technique also defines a minority outcast as a minority instance having no minority class neighbors. Minority outcasts are neglected by most oversampling techniques but instead, an additional outcast handling method is proposed for the performance improvement via a 1-nearest neighbor model. Based on our experiments in UCI and PROMISE datasets, generated datasets from this technique have improved the accuracy performance of a classification, and the improvement can be verified statistically by the Wilcoxon signed-rank test.”).
Claim 19 is rejected for the same reasons as claim 6.
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Dong et al. (Pub. No. US20200210808– hereinafter, Dong) in view of Haerterich et al. (Pub. No. US20200097763– hereinafter, Haerterich) and further in view of AGARWAL et al. (Pub. No. US20190087728– hereinafter, Agarwal) and further in view of Bakker et al. (Pub. No. US20150278470, hereinafter, Bakker).
Regarding claim 10, Dong as modified in view of Bakker teaches the method of claim 9, further comprising validating the second classifier with a second portion of the labeled input dataset, different from the first portion of the labeled input dataset (Bakker, [Par.0051], “After the start of the optimization procedure in step S300, the dataset of the database 50 is divided in step S301 into three equally sized sets, called training set, validation set and test set, each containing the same ratio of cases to controls. In step S302, the training set is used for training or parameter tuning, i.e. search for that set of parameter values that minimizes the prediction, or in this case classification error. Most machine learning methods suffer from so-called 'overfitting', where the method's performance on the training set is much better than its performance on new data that has not been used for training. Therefore, in step S303, a separate validation set is used to test whether such over-fitting occurs. The combination of training and validation data allows to find that type of machine learning function and choice of model parameters that is able to grasp the true pattern that hides in the (training) data, yet is still sufficiently general to predict well on the separate validation data and thus on future data as well. The thus optimized classifiers are used in step S304 to make a prediction on each of the patients in the test set, which has remained unused throughout the foregoing optimization steps. The quality of this prediction (e.g. in terms of sensitivity and specificity) is the final test of the validity of the selected classifier. The test set is selected at random to obtain solid statistics.”).
Dong, Haerterich, Agarwal and Bakker are analogous art because they are in the same field of endeavor of using a machine learning method to improve data classification.
Accordingly, it would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dong’s method of improving a data classification by using a neural network to further comprise validating the second classifier with a second portion of the labeled input dataset, different from the first portion of the labeled input dataset, as taught by Bakker. The modification would have been obvious because one of ordinary skill in the art would be motivated to improve classification performance (Bakker, [Par.0051], “After the start of the optimization procedure in step S300, the dataset of the database 50 is divided in step S301 into three equally sized sets, called training set, validation set and test set, each containing the same ratio of cases to controls. In step S302, the training set is used for training or parameter tuning, i.e. search for that set of parameter values that minimizes the prediction, or in this case classification error. Most machine learning methods suffer from so-called 'overfitting', where the method's performance on the training set is much better than its performance on new data that has not been used for training. Therefore, in step S303, a separate validation set is used to test whether such over-fitting occurs. The combination of training and validation data allows to find that type of machine learning function and choice of model parameters that is able to grasp the true pattern that hides in the (training) data, yet is still sufficiently general to predict well on the separate validation data and thus on future data as well. The thus optimized classifiers are used in step S304 to make a prediction on each of the patients in the test set, which has remained unused throughout the foregoing optimization steps. The quality of this prediction (e.g. in terms of sensitivity and specificity) is the final test of the validity of the selected classifier. The test set is selected at random to obtain solid statistics.”).
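For illustration only, the division quoted from Bakker (three equally sized sets, each containing the same ratio of cases to controls, selected at random) can be sketched as a stratified three-way split. The sketch is an assumption about one way to realize that description and is not Bakker's implementation; the names `three_way_split` and `label_of` are hypothetical.

```python
import random
from collections import defaultdict

def three_way_split(records, label_of, seed=0):
    """Split records into training, validation, and test sets of about equal
    size while preserving the class ratio in each set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for record in records:
        by_class[label_of(record)].append(record)
    training, validation, test = [], [], []
    for members in by_class.values():
        rng.shuffle(members)  # random selection within each class
        # Deal every third member to each set so each class is split evenly
        training.extend(members[0::3])
        validation.extend(members[1::3])
        test.extend(members[2::3])
    return training, validation, test

# Hypothetical data: 30 cases and 60 controls
records = [("case", i) for i in range(30)] + [("control", i) for i in range(60)]
training, validation, test = three_way_split(records, label_of=lambda r: r[0])
```

Because the validation and test sets are disjoint from the training set, a classifier trained on the first portion can be validated on a different portion of the same labeled data.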

Conclusion
The prior art made of record and not relied upon, which is considered pertinent to applicant’s disclosure, is provided below.
Bhanot et al. (Pub. No.: US 20080025591-hereinafter, Bhanot) teaches a system comprising multiple classifiers for a robust classification strategy.
Zhang et al. (Pub. No.:US20180165554-hereinafter, Zhang) teaches a system using a semi-supervised autoencoder for data analysis.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EM N TRIEU whose telephone number is (571)272-5747.  The examiner can normally be reached on 7:30 - 5:00 M_TH.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas can be reached on (571) 272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/E.T./Examiner, Art Unit 2128  

/OMAR F FERNANDEZ RIVAS/Supervisory Patent Examiner, Art Unit 2128