Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 2021-09-13, 2021-09-17, 2019-10-08, and 2021-03-05 are in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statements are being considered by the examiner.
Specification
Applicant is reminded of the proper content of an abstract of the disclosure.
A patent abstract is a concise statement of the technical disclosure of the patent and should include that which is new in the art to which the invention pertains. The abstract should not refer to purported merits or speculative applications of the invention and should not compare the invention with the prior art.
If the patent is of a basic nature, the entire technical disclosure may be new in the art, and the abstract should be directed to the entire disclosure. If the patent is in the nature of an improvement in an old apparatus, process, product, or composition, the abstract should include the technical disclosure of the improvement. The abstract should also mention by way of example any preferred modifications or alternatives. 

Extensive mechanical and design details of an apparatus should not be included in the abstract. The abstract should be in narrative form and generally limited to a single paragraph within the range of 50 to 150 words in length.
See MPEP § 608.01(b) for guidelines for the preparation of patent abstracts.
The abstract of the disclosure is objected to because it is over 15 lines in length (17 lines) and over 150 words in length (155 words) when hyphenated compounds such as “reliability-index” are counted as separate words.  Correction is required.  See MPEP § 608.01(b).
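The word count in the objection above depends on how hyphenated compounds are counted. The following is a minimal sketch of the two counting conventions, offered only as an illustration. The function name `count_words` and the sample sentence are hypothetical and are not taken from the abstract under examination.

```python
def count_words(text: str, split_hyphens: bool = True) -> int:
    """Count whitespace-separated words.

    When split_hyphens is True, each part of a hyphenated compound
    (e.g. "reliability-index") is counted as its own word.
    """
    if split_hyphens:
        text = text.replace("-", " ")
    return len(text.split())


# Hypothetical sample text, not the abstract of the application.
sample = "A reliability-index acquisition unit acquires a reliability index."

print(count_words(sample, split_hyphens=False))  # compounds count once
print(count_words(sample, split_hyphens=True))   # compounds count per part
```

Under the second convention every hyphenated compound adds at least one extra word, which is how an abstract can exceed the 150-word limit only when its compounds are counted separately.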
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 
The disclosure is objected to because of the following informalities:
In Specification [0046], “plays a roll” should be changed to read “plays a role”
In Specification [0054], “fist errors” should be changed to read “first errors”
Appropriate correction is required.
Claim Objections
Claims 1 and 10-13 are objected to because of the following informality:  At the end of the preamble “, comprising” should have the comma removed to read “ comprising”. Appropriate correction is required.
Claims 1 and 10-14 are objected to because of the following informality:  “input nodes corresponding to the input data and each located” should be changed to read “specifies input nodes corresponding to the input data that are each located” for clarity. Appropriate correction is required.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
“an input-node specification unit that…specifies” in Claims 1, 10, and 13
“a reliability-index acquisition unit that acquires” in Claims 1, 10, and 13
“an output-node specification unit that…specifies” in Claims 1 and 13, “that specifies” in Claims 2 and 5, and “that makes a comparison” in Claims 4, 6, and 7
“a prediction-output generation unit that generates” in Claims 1 and 13
“a highly reliable node specification unit that…selects” in Claim 8
“a calculation possibility determination unit that determines” in Claim 8
“A selective output-node specification unit that specifies” in Claim 8
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-14 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 1 and 11-14 recite the limitation “the predetermined learning processing”.  There is insufficient antecedent basis for this limitation in the claim.  Examiner is interpreting the limitation “reliability index obtained through the predetermined learning processing and indicating prediction accuracy” as “reliability index that is obtained through the learning of a predetermined set of pieces of to-be-learned data and indicates prediction accuracy”.
Claim 3 recites the limitation “the predetermined learning processing”.  There is insufficient antecedent basis for this limitation in the claim.  Examiner is interpreting the limitation “obtained through the predetermined learning processing” as “obtained through the learning of a predetermined set of pieces of to-be-learned data”.
Claims 2-7 recite the limitation “the learned data included in the state spaces”.  There is insufficient antecedent basis for this limitation in the claim. Examiner is interpreting the limitation as “learned data included in the state spaces”
Claims 11 and 12 recite the limitations “the reliability-index acquisition unit” and “the output-node specification unit”. There is insufficient antecedent basis for these limitations in the claim.  Examiner is interpreting the limitations as “the reliability-index acquisition step” and “the output-node specification step”, respectively.
The following claim limitations invoke 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
“an input-node specification unit that…specifies” in Claims 1, 10, and 13
“a reliability-index acquisition unit that acquires” in Claims 1, 10, and 13
“an output-node specification unit that…specifies” in Claims 1 and 13, “that specifies” in Claims 2 and 5, and “that makes a comparison” in Claims 4, 6, and 7
“a prediction-output generation unit that generates” in Claims 1 and 13
“a highly reliable node specification unit that…selects” in Claim 8
“a calculation possibility determination unit that determines” in Claim 8
“A selective output-node specification unit that specifies” in Claim 8
However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function.  There is no tangible hardware specified for these “units”, nor is there an indication that these “units” are programmed computers, in which case a specific algorithm would be sufficient to provide structure.  There is also not an association between any specific algorithms in the Specification, and any of the “units” recited.
Therefore, Claims 1, 2, 4, 6, 7, 8, 10, and 13 are indefinite and are rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph.
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 
(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 

If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.
Dependent claims 2-9 are rejected because they inherit the deficiencies of their parent claims.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claim 12 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  The claim does not fall within at least one of the four categories of patent eligible subject matter because it is directed to “a computer program”.  While a programmed computer and a non-transitory storage medium comprising a computer program are both examples of eligible subject matter, merely claiming a “computer program” is not eligible as it is considered “software per se”, regardless of how it would theoretically “cause a computer to function” if it were stored or executed on a computer.
Claim 14 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  The claim does not fall within at least one of the four categories of patent eligible subject matter because it is directed to “a learned model”.  While a programmed computer and a non-transitory storage medium comprising a computer program are both examples of eligible subject matter, merely claiming a “computer program” is not eligible as it is considered “software per se”.  The Specification does not define the “learned model” as a piece of hardware, and machine learning models are commonly understood to be computer programs, thus the claim amounts to “software per se”.
Claims 1-14 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis:
In the instant case, Claims 1-10 are directed to an information processing device, Claim 11 is directed to a method, and Claim 13 is directed to an IC chip.  These claims fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).  Claims 12 and 14, as shown above, do not fall within one of the four statutory categories.
Step 2 Analysis:
For the claims determined to be within one of the four categories (Step 1), it must be determined whether the claims are directed to a judicial exception (i.e., law of nature, natural phenomenon, or abstract idea).  In this case the claims fall within the judicial exception of an abstract idea, specifically, “Mental Processes (processes that can be performed in the human mind, or by a human using a pen and paper)”.
Step 2A: Prong 1 analysis:
The claims recite:
Claims 1 and 10-14:
 “generates a prediction output corresponding to input data, based on a learned model that is obtained by causing a learning model having a tree structure configured by a plurality of hierarchically arranged nodes each associated with a corresponding one of hierarchically divided state spaces to learn a predetermined set of pieces of to-be-learned data”.  Generating a prediction output corresponding to input data, based on a learned model is something that can be performed in the human mind.  Using a tree-based model that has already been learned can be performed by a human with pen and paper.  The claim here merely states that the model is already learned, and does not provide details on the learning process itself.  Thus, the limitation recites a mental process.
“based on the input data, specifies input nodes corresponding to the input data and each located on a corresponding one of layers from beginning to end of the learning 
“acquires a reliability index obtained through the predetermined learning processing and indicating prediction accuracy”.  Acquiring a value indicating an accuracy can be performed by a human with pen and paper.  The “predetermined learning processing” merely indicates that the model has already been learned, and no details are given on the learning process itself.  The limitation recites a mental process.
“based on the reliability index acquired by the reliability-index acquisition unit, specifies, from the input nodes corresponding to the input data, an output node that is a basis of the generation of a prediction output”.  Specifying based on an index falls under “observation, evaluation, judgment, or opinion”, and is thus a mental process.
“generates a prediction output, based on the to-be-learned data that is included in the state spaces that corresponds to the output node specified by the output-node specification unit”.  Generating an output based on data falls under “observation, evaluation, judgment, or opinion”, and is thus a mental process.
Step 2A:  Prong 2 analysis:
This judicial exception is not integrated into a practical application for Claims 1 and 10-14.  Additional elements “input-node specification unit”, “reliability-index acquisition unit”, “output-node specification unit”, and “prediction-output generation unit” are given no structure in the specification.  However, even if they are assumed to be software modules or computer hardware, they amount to mere instructions to apply the exception on a computer.
Step 2B analysis:
Claims 1 and 10-14 do not include additional elements that are sufficient to amount to significantly more than the judicial exception.  As discussed above with respect to integration of the abstract idea into a practical application, additional elements “input-node specification unit”, “reliability-index acquisition unit”, “output-node specification unit”, and “prediction-output generation unit” are given no structure in the specification.  However, even if they are assumed to be software modules or computer hardware, they amount to mere instructions to apply the exception on a computer.  This applies no meaningful limits on the practice of the judicial exception, and therefore the claims do not amount to significantly more than the judicial exception.
	Dependent claim(s) 2-9 when analyzed as a whole are held to be patent ineligible under 35 U.S.C. 101 because the additional recited limitation(s) fail(s) to establish that the claim(s) is/are not directed to an abstract idea, as they recite further embellishment of the judicial exception.  
	Claim 2 recites the same limitations as Claim 1, providing further details on the reliability index and specifying the output node, and is also directed to a mental process.
	Claim 3 recites the same limitations as Claim 2, providing details on error calculation with a forgetting factor, and is also directed to a mental process.
Claim 4 recites the same limitations as Claim 1, providing further details on the reliability index and output node specification, and is also directed to a mental process.

Claim 6 recites the same limitations as Claim 5, providing further details on the reliability index and output node specification, and is also directed to a mental process.
Claim 7 recites the same limitations as Claim 5, providing further details on the reliability index and output node specification, and is also directed to a mental process.
Claim 8 recites the same limitations as Claim 1, providing further details on output node specification.  Additional elements “highly reliable node specification unit”, “calculation possibility determination unit”, “selective output-node specification unit” are given no structure in the specification.  However, even if they are assumed to be software modules or computer hardware, they amount to mere instructions to apply the exception on a computer.
Claim 9 recites the same limitations as Claim 8, providing further details on calculation possibility, and is also directed to a mental process.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1 and 10-14 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Gama et al. (“Learning Decision Trees from Dynamic Data Streams”; hereinafter Gama), as further graphically evidenced by corresponding tutorial Gama et al. (“Learning with Local Drift”; hereinafter GamaTutorial).  Note that Gama, Abstract, discloses “This paper presents a system for induction of forest of functional trees from data streams able to detect concept drift. The Ultra Fast Forest of Trees (UFFT) is an incremental algorithm, which works online, processing each example in constant time, and performing a single scan over the training examples”, while GamaTutorial also discloses “concept drift” on Page 3 and “UFFT” on Page 18.
As per Claim 1, Gama teaches An information processing device that generates a prediction output corresponding to input data, based on a learned model that is obtained by causing a learning model having a tree structure configured by a plurality of hierarchically arranged nodes each associated with a corresponding one of hierarchically divided state spaces to learn a predetermined set of pieces of to-be-learned data (Gama, Section 4.1, end of first paragraph, discloses:  “All algorithms ran on a Centrino at 1.5GHz with 512 MB of RAM and using Linux Mandrake”, thus disclosing an information processing device. Gama, Section 3, discloses:  “UFFT is an algorithm for supervised classification learning that generates a forest of binary trees”, thus disclosing a tree structure, which inherently has a plurality of hierarchically arranged nodes.  Gama, Section 3 Line 7, continues:  “During the training phase the algorithm maintains a short term memory”, and thus by a “training phase”, Gama discloses based on a learned model that is obtained by causing a learning model to learn a predetermined set of pieces of to-be-learned data.  Gama, Bottom of page 1361, discloses:  “All decision nodes contain naive Bayes to detect changes in the class distribution of the examples that traverse the node, that correspond to detect shifts in different regions of the instance space.”  Here, Gama discloses the “nodes”, which are hierarchically arranged, correspond to “instance space” and thus discloses nodes each associated with a corresponding one of hierarchically divided state spaces.  Gama, Section 3.1.5.1, discloses:  “Each tree in the forest makes a prediction”, and thus Gama discloses generates a prediction output corresponding to input data.  GamaTutorial illustrates this on Pages 17:

[Image: media_image1.png, tree illustration from GamaTutorial])

an input-node specification unit that, based on the input data, specifies input nodes corresponding to the input data and each located on a corresponding one of layers from beginning to end of the learning tree structure (Gama, Section 3.1.5.1, discloses:  “Each tree in the forest makes a prediction”.  A prediction output must be based on input data, otherwise a prediction is impossible.  The input data traverses the tree as per Gama, Bottom of page 1361:  “All decision nodes contain naive Bayes to detect changes in the class distribution of the examples that traverse the node”, and thus the traversed nodes are input nodes corresponding to the input data.  The tree has layers from root to leaf, as is shown in the illustration from GamaTutorial, and thus the input nodes are each located on a corresponding one of layers from beginning to end of the learning tree structure.)
a reliability-index acquisition unit that acquires a reliability index obtained through the predetermined learning processing and indicating prediction accuracy (Note from the 112(b) rejections that Examiner is interpreting “the predetermined learning processing” as “the learning of a predetermined set of pieces of to-be-learned data”. Gama, Section 3.2, discloses:  “The UFFT algorithm maintains, at each node of all decision trees, a naive-Bayes classifier. Those classifiers were constructed using the sufficient statistics needed to evaluate the splitting criteria when that node was a leaf. When the leaf becomes a node the naive-Bayes classifier will classify the examples that traverse the node. The basic idea of the drift detection method is to control this online error-rate. If the distribution of the examples that traverse a node is stationary, the error rate of naive-Bayes decreases. If there is a change on the distribution of the examples the naive-Bayes error will increase.”  Here, Gama discloses acquires a reliability index (“error-rate”) indicating prediction accuracy (“If the distribution of the examples that traverse a node is stationary, the error rate of naive-Bayes decreases. If there is a change on the distribution of the examples the naive-Bayes error will increase”).  The tree is built by training, as indicated by Gama, Top of page 1361:  “When a new training example becomes available, it will cross the corresponding binary decision trees from the root node till a leaf. At each node, the naïve Bayes installed at that node classifies the example.”  Training comprises training data that is to be learned.  Thus the tree itself, and consequently the reliability index (“error-rate”), is obtained through the learning of a predetermined set of pieces of to-be-learned data.)
an output-node specification unit that, based on the reliability index acquired by the reliability-index acquisition unit, specifies, from the input nodes corresponding to the input data, an output node that is a basis of the generation of a prediction output (Gama, Section 3.1.4, discloses:  “To classify an unlabeled example, the example traverses the tree from the root to a leaf. It follows the path established, at each decision node, by the splitting test at the appropriate attribute-value. The leaf reached classifies the example.”  Here, Gama establishes an output node (“the leaf reached”) that is a basis of the generation of a prediction output (“classifies the example”).  Thus, Gama discloses that the output node is a leaf node.  Gama, Section 3.2, discloses:  “If the distribution of the examples that traverse a node is stationary, the error rate of naive-Bayes decreases. If there is a change on the distribution of the examples the naive-Bayes error will increase [17]. When the system detect an increase of the naive-Bayes error in a given node, an indication of a change in the distribution of the examples, this suggest that the splitting-test that has been installed at this node is no longer appropriate. In such cases, all the subtree rooted at that node is pruned, and the node becomes a leaf.”  Here, Gama discloses that leaf nodes can be destroyed and created from internal nodes (input nodes) based on the reliability index (“error-rate”).  Since the output node must be a leaf node, Gama discloses specifies an output node from the input nodes based on the reliability index.)
a prediction-output generation unit that generates a prediction output, based on the to-be-learned data that is included in the state spaces that corresponds to the output node specified by the output-node specification unit (Recall that Gama, Bottom of page 1361, discloses:  “All decision nodes contain naive Bayes to detect changes in the class distribution of the examples that traverse the node, that correspond to detect shifts in different regions of the instance space.”  Here, Gama discloses the “instance space”, and thus discloses the to-be-learned data that is included in the state spaces.  Recall also that Gama, Section 3.1.5.1, discloses:  “Each tree in the forest makes a prediction”, and thus Gama discloses generates a prediction output.  Recall that Gama, Section 3.1.4, discloses:  “To classify an unlabeled example, the example traverses the tree from the root to a leaf. It follows the path established, at each decision node, by the splitting test at the appropriate attribute-value. The leaf reached classifies the example.”  Thus, Gama discloses that the prediction (“classifies”) corresponds to the output node (“the leaf reached”).)
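For reference only, the relationship between the claim terms mapped above (input nodes on each layer, a per-node reliability index, and an output node chosen from the input nodes) can be sketched as follows. This is a minimal illustration under stated assumptions and is neither the UFFT algorithm of Gama nor the implementation of the application: the names `Node`, `path_nodes`, and `select_output_node`, the single scalar input, and the "smallest error wins" rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    prediction: float                  # output stored for this node's region
    error: float                       # assumed reliability index (lower is better)
    threshold: Optional[float] = None  # split point; None marks a leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def path_nodes(root: Node, x: float) -> List[Node]:
    """Collect the nodes visited from root to leaf for input x, one per layer."""
    path: List[Node] = []
    node: Optional[Node] = root
    while node is not None:
        path.append(node)
        if node.threshold is None:  # reached a leaf
            break
        node = node.left if x < node.threshold else node.right
    return path


def select_output_node(path: List[Node]) -> Node:
    """Pick the node on the path with the smallest error (ties go to the shallower node)."""
    return min(path, key=lambda n: n.error)


# Hypothetical two-layer tree.
leaf_a = Node(prediction=1.0, error=0.30)
leaf_b = Node(prediction=3.0, error=0.05)
root = Node(prediction=2.0, error=0.10, threshold=0.5, left=leaf_a, right=leaf_b)

for x in (0.2, 0.8):
    chosen = select_output_node(path_nodes(root, x))
    print(x, chosen.prediction)
```

In this sketch the prediction for `x = 0.2` comes from the root because the leaf on that path has a larger error, while the prediction for `x = 0.8` comes from the leaf.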

As per Claim 5, Gama teaches the information processing device according to claim 1.  Gama teaches wherein the reliability index is generated for each of the input nodes under a predetermined condition by referring to a prediction output at the each input node or a node among the input nodes that is located on a layer among the layers that is lower than the each input node (Recall that Gama, Section 3.2, discloses:  “The UFFT algorithm maintains, at each node of all decision trees, a naive-Bayes classifier. Those classifiers were constructed using the sufficient statistics needed to evaluate the splitting criteria when that node was a leaf. When the leaf becomes a node the naive-Bayes classifier will classify the examples that traverse the node. The basic idea of the drift detection method is to control this online error-rate. If the distribution of the examples that traverse a node is stationary, the error rate of naive-Bayes decreases. If there is a change on the distribution of the examples the naive-Bayes error will increase.”  Here, Gama discloses a reliability index (“error-rate”) under a predetermined condition by referring to a prediction output (“If the distribution of the examples that traverse a node is stationary, the error rate of naive-Bayes decreases. If there is a change on the distribution of the examples the naive-Bayes error will increase”).  The “predetermined condition” could be anything, but may be interpreted as the condition that the prediction output be compared to the true output.  It is generated for each of the input nodes as the nodes are traversed:  “classify the examples that traverse the node”.  This is referring to a prediction output at the each input node or a node among the input nodes that is located on a layer among the layers that is lower than the each input node – in this case, at the each input node (“examples that traverse the node”))
and wherein the output-node specification unit specifies the output node based on the reliability index having been generated for the each input node. (Gama, Section 3.1.4, discloses:  “To classify an unlabeled example, the example traverses the tree from the root to a leaf. It follows the path established, at each decision node, by the splitting test at the appropriate attribute-value. The leaf reached classifies the example.”  Here, Gama establishes an output node (“the leaf reached”) that is a basis of the generation of a prediction output (“classifies the examples”).  Thus, Gama discloses that the output node is a leaf node.  Gama, Section 3.2, discloses:  “If the distribution of the examples that traverse a node is stationary, the error rate of naive-Bayes decreases. If there is a change on the distribution of the examples the naive-Bayes error will increase [17]. When the system detect an increase of the naive-Bayes error in a given node, an indication of a change in the distribution of the examples, this suggest that the splitting-test that has been installed at this node is no longer appropriate. In such cases, all the subtree rooted at that node is pruned, and the node becomes a leaf.”  Here, Gama discloses that leaf nodes can be destroyed and created from internal nodes (input nodes) based on the reliability index (“error-rate”).  Since the output node must be a leaf node, then Gama discloses specifies an output node from the input nodes based on the reliability index generated for each input node.)

As per Claim 10, Claim 10 is an information processing device claim corresponding to information processing device claim 1.  The difference is a lack of an output node specification unit and a prediction output generation unit.  Another difference is the language that the reliability index is “gradually updated”.  Gama, Section 3.2, discloses:  “If the distribution of the examples that traverse a node is stationary, the error rate of naive-Bayes decreases. If there is a change on the distribution of the examples the naive-Bayes error will increase [17].”  Here, Gama discloses that the error rate increases or decreases, and thus it is being gradually updated.  Claim 10 is rejected for the same reasons as Claim 1.
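For reference only, an error rate that rises or falls as each example arrives can be pictured as a running average. The sketch below is an assumption made for illustration: the class name `OnlineErrorRate` and the incremental-mean update are hypothetical and are not taken from Gama or from the application.

```python
class OnlineErrorRate:
    """Running misclassification rate at one node, updated one example at a time."""

    def __init__(self) -> None:
        self.n = 0       # number of examples seen so far
        self.rate = 0.0  # current error rate in [0, 1]

    def update(self, predicted, actual) -> float:
        mistake = 1.0 if predicted != actual else 0.0
        self.n += 1
        # Incremental mean: rate_n = rate_{n-1} + (mistake - rate_{n-1}) / n
        self.rate += (mistake - self.rate) / self.n
        return self.rate


# Hypothetical stream of (predicted, actual) labels.
tracker = OnlineErrorRate()
for predicted, actual in [("a", "a"), ("a", "b"), ("b", "b"), ("a", "b")]:
    print(tracker.update(predicted, actual))
```

Each call moves the stored rate a small step toward the latest outcome, so the value changes gradually rather than being recomputed from scratch.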

As per Claim 11, Claim 11 is a method claim corresponding to information processing device Claim 1.  Claim 11 is rejected for the same reasons as Claim 1.

As per Claim 12, Claim 12 is a computer program claim corresponding to information processing device Claim 1.  The information processing device is inherently running a computer program.  Claim 12 is rejected for the same reasons as Claim 1.

As per Claim 13, Claim 13 is an IC chip claim corresponding to information processing device Claim 1.  The information processing device inherently comprises an IC chip and an input terminal.  Claim 13 is rejected for the same reasons as Claim 1.

As per Claim 14, Claim 14 is a learned model claim corresponding to information processing device Claim 1.  The learned model is understood to be either computer hardware, which is an information processing device, or a computer program, which an information processing device is inherently running.  Claim 14 is rejected for the same reasons as Claim 1.




Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Gama as further graphically evidenced by GamaTutorial in view of Woods et al. (“Combination of multiple classifiers using local accuracy estimates”; hereinafter Woods).
As per Claim 2, Gama teaches the information processing device according to claim 1.  Gama teaches wherein the reliability index comprises first errors each generated at a corresponding input node among the input nodes based on a difference between an output corresponding to the input data and a prediction output based on learned data included in the state spaces (Recall that Gama, Bottom of page 1361, discloses:  “All decision nodes contain naive Bayes to detect changes in the class distribution of the examples that traverse the node, that correspond to detect shifts in different regions of the instance space.”  Here, Gama discloses the “instance space”, and thus discloses learned data that is included in the state spaces.  Gama, Section 3.2, discloses:  “The UFFT algorithm maintains, at each node of all decision trees, a naive-Bayes classifier. Those classifiers were constructed using the sufficient statistics needed to evaluate the splitting criteria when that node was a leaf. When the leaf becomes a node the naive-Bayes classifier will classify the examples that traverse the node. The basic idea of the drift detection method is to control this online error-rate. If the distribution of the examples that traverse a node is stationary, the error rate of naive-Bayes decreases. If there is a change on the distribution of the examples the naive-Bayes error will increase.”  Here, Gama discloses first errors (“error-rate”) each generated at a corresponding input node among the input nodes (“at each node of all decision trees”).  This is based on a difference between an output corresponding to the input data and a prediction output, as it is an error rate based on the output of a classifier, which provides a prediction output.)
wherein the output-node specification unit specifies, as the output node, a node which is among the input nodes (Gama, Section 3.1.4, discloses:  “To classify an unlabeled example, the example traverses the tree from the root to a leaf. It follows the path established, at each decision node, by the splitting test at the appropriate attribute-value. The leaf reached classifies the example.”  Here, Gama establishes an output node (“the leaf reached”) that is a basis of the generation of a prediction output (“classifies the examples”).  Thus, Gama discloses that the output node is a leaf node.  Gama, Section 3.2, discloses:  “If the distribution of the examples that traverse a node is stationary, the error rate of naive-Bayes decreases. If there is a change on the distribution of the examples the naive-Bayes error will increase [17]. When the system detect an increase of the naive-Bayes error in a given node, an indication of a change in the distribution of the examples, this suggest that the splitting-test that has been installed at this node is no longer appropriate. In such cases, all the subtree rooted at that node is pruned, and the node becomes a leaf.”  Here, Gama discloses that leaf nodes can be destroyed and created from internal nodes (input nodes) based on the reliability index (“error-rate”).  Since the output node must be a leaf node, then Gama discloses that an input node may become an output node, and thus Gama discloses specifies, as the output node, a node which is among the input nodes.)
However, Gama does not teach and for which a corresponding first error among the first errors is minimal.
Woods teaches and for which a corresponding first error among the first errors is minimal.  (Note above that Gama discloses that each node comprises a Naïve Bayes classifier (“naive-Bayes error in a given node”).  Gama, Abstract discloses:  “Decision nodes and leaves contain naive-Bayes classifiers playing different roles during the induction process. Naive-Bayes in leaves are used to classify test examples. Naive-Bayes in inner nodes play two different roles. They can be used as multivariate splitting-tests if chosen by the splitting criteria, and used to detect changes in the class-distribution of the examples that traverse the node.”  Here, Gama discloses Naive-Bayes classifiers in inner nodes.  This is further illustrated in GamaTutorial Page 19:

[Figure: media_image2.png (GamaTutorial, Page 19)]

Now, let us consider the teachings of Gama up to, but not including, the pruning based on the highest error.  Note that Gama states that the Naïve-Bayes classifiers “monitor” the Naïve Bayes error.  “Monitoring” is useful in itself, and does not necessarily require pruning, and thus Gama does not explicitly “teach away” from potentially using these errors for other purposes.
Now let us consider Woods.  Based on Gama, a plurality of Naïve Bayes Classifiers with corresponding error-rates (first error) are disclosed.  Woods, Section 3.2, Last Sentence of Paragraph 1, discloses:  “When the individual classifiers disagree, local accuracy is estimated for each classifier, and the decision of the classifier with the highest local accuracy estimate is selected.”  A “highest accuracy” also means “lowest error”.  Thus, Woods discloses specifies a classifier for which a corresponding first error among the first errors is minimal.  Thus, in combination with Gama, Woods discloses specifies, as the output node, a node which is among the input nodes and for which a corresponding first error among the first errors is minimal.)
Gama and Woods are analogous art because they are both in the field of endeavor of machine learning.
Therefore, it would have been obvious to a person having ordinary skill in the art, before the effective filing date of the invention, to combine the decision tree with a classifier at each node of Gama, with the choosing of the classifier with the highest accuracy of Woods.  One would have been motivated to do so in order to improve the accuracy of the prediction (Woods, Conclusion:  “We have shown that even if all the individual classifiers have been optimized, dynamic classifier selection by local accuracy is still capable of improving overall performance significantly. By contrast, simple voting techniques, and even a recently proposed CMC algorithm, were not able to show any significant improvement when the individual classifiers were sufficiently optimized. At times, some of the other CMC algorithms actually hurt performance. The proposed DCS-LA algorithm was always capable of improving performance.”)
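For illustration only, the combination relied upon above (a per-node error rate as in Gama, with selection of the lowest-error classifier as in Woods) can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from either reference; `Node`, its `error` field, and `select_output_node` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    error: float  # hypothetical field: running error rate of this node's classifier


def select_output_node(path: List[Node]) -> Node:
    """Return the node on the root-to-leaf path whose error is smallest.

    Highest accuracy is treated as lowest error, as in the rejection above.
    """
    if not path:
        raise ValueError("path must contain at least one node")
    return min(path, key=lambda node: node.error)


path = [Node("root", 0.30), Node("inner", 0.12), Node("leaf", 0.20)]
chosen = select_output_node(path)
print(chosen.name)
```

In this sketch an inner node can be specified as the output node when its error is lower than that of the leaf.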

As per Claim 4, Gama teaches the information processing device according to claim 1.  Recall above that Gama Section 3 discloses learning a predetermined set of pieces of to-be-learned (and subsequently, learned) data (“training”).  Recall also that Gama, Bottom of page 1361, discloses state spaces (“instance space”) which is also illustrated in GamaTutorial, and thus discloses learned data included in the state spaces. Recall that Gama, Section 3.1.5.1, discloses:  “Each tree in the forest makes a prediction”, and thus prediction output and therefore prediction output based on learned data included in the state spaces.  Recall that Gama, Section 3.2, discloses:  “The UFFT algorithm maintains, at each node of all decision trees, a naive-Bayes classifier. Those classifiers were constructed using the sufficient statistics needed to evaluate the splitting criteria when that node was a leaf. When the leaf becomes a node the naive-Bayes classifier will classify the examples that traverse the node. The basic idea of the drift detection method is to control this online error-rate. If the distribution of the examples that traverse a node is stationary, the error rate of naive-Bayes decreases. If there is a change on the distribution of the examples the naive-Bayes error will increase.”  Here, Gama discloses a tree, and thus that data corresponds to the corresponding input node.  Gama here also discloses that “error-rate” are each generated at a corresponding input node among the input nodes (“at each node of all decision trees”).  This is based on a difference between an output corresponding to the input data and a prediction output, as it is an error rate based on the output of a classifier, which provides a prediction output.
Gama also discloses an end prediction error at an end node among the input nodes.  An “end node” may be any node.  One could calculate these errors in any order they like, and the last node may be the “end node”, and thus the error is the “end prediction error”.  
Gama, Section 3.2, discloses:  “If the distribution of the examples that traverse a node is stationary, the error rate of naive-Bayes decreases. If there is a change on the distribution of the examples the naive-Bayes error will increase [17].”  Here, Gama discloses that the error rate increases or decreases, and thus it is being gradually updated, and so this algorithm is performed iteratively.  Thus, the second iteration may produce a second error.  Since the error calculation is the same calculation that produces the first error, this discloses second error based on first errors.
However, Gama does not teach corresponding error is minimal.
Woods teaches corresponding error is minimal.  (Woods, Section 3.2, Last Sentence of Paragraph 1, discloses:  “When the individual classifiers disagree, local accuracy is estimated for each classifier, and the decision of the classifier with the highest local accuracy estimate is selected.”  A “highest accuracy” also means “lowest error”.  Thus, Woods discloses specifies a classifier for which a corresponding first error among the first errors is minimal.)
This concept, when combined with Gama, results in calculating a second error for a node at an input node which is among the input nodes and for which a corresponding first error among the first errors is minimal.
The combination of Woods and Gama then teaches wherein the output-node specification unit makes a comparison in a magnitude relation for the end prediction error and the second error, and specifies, as the output node, the input node for which the corresponding first error is minimal when the second error is smaller than the end prediction error, otherwise, specifies, as the output node, the end node among the input nodes.  (As shown above, Gama discloses an end error, first error, and second error.  Woods discloses choosing the smallest error, which comprises makes a comparison in a magnitude relation (a complicated way of saying “comparing”).  Choosing the smallest error suggests specifies, as the output node, the input node for which the corresponding first error is minimal when the second error is smaller than the end prediction error, because if the second error is the second iteration of error corresponding to the given node, and that node’s first error is minimal, then the second error would also be minimal, which means less than all other nodes, including the end node.  When the end error is smaller than the second error, then it specifies, as the output node, the end node.  Again, this amounts to choosing the lowest error as suggested by Woods, since in that case the end error is the smaller of the two.)
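For illustration only, the comparison recited in Claim 4 as read above can be sketched as follows. This is a minimal sketch under stated assumptions: nodes are represented by their position on the path, the last position is taken to be the end node, and the function name and parameters are hypothetical.

```python
from typing import Sequence


def specify_output_node(first_errors: Sequence[float],
                        second_error: float,
                        end_prediction_error: float) -> int:
    """Return the path index of the node specified as the output node.

    first_errors[i] is the first error at the i-th node on the path; the
    last entry is assumed (for illustration) to belong to the end node.
    """
    if not first_errors:
        raise ValueError("first_errors must not be empty")
    end_index = len(first_errors) - 1
    min_index = min(range(len(first_errors)), key=lambda i: first_errors[i])
    # Magnitude comparison between the second error and the end prediction error.
    if second_error < end_prediction_error:
        return min_index
    return end_index


print(specify_output_node([0.30, 0.10, 0.20], second_error=0.15, end_prediction_error=0.20))
print(specify_output_node([0.30, 0.10, 0.20], second_error=0.25, end_prediction_error=0.20))
```

The first call selects the minimal-first-error node because the second error is the smaller value; the second call falls back to the end node.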

As per Claim 6, Gama teaches the information processing device according to claim 5.  Recall above that Gama Section 3 discloses learning a predetermined set of pieces of to-be-learned (and subsequently, learned) data (“training”).  Recall also that Gama, Bottom of page 1361, discloses state spaces (“instance space”) which is also illustrated in GamaTutorial, and thus discloses learned data included in the state spaces. Recall that Gama, Section 3.1.5.1, discloses:  “Each tree in the forest makes a prediction”, and thus prediction output and therefore prediction output based on learned data included in the state spaces.  Recall that Gama, Section 3.2, discloses:  “The UFFT algorithm maintains, at each node of all decision trees, a naive-Bayes classifier. Those classifiers were constructed using the sufficient statistics needed to evaluate the splitting criteria when that node was a leaf. When the leaf becomes a node the naive-Bayes classifier will classify the examples that traverse the node. The basic idea of the drift detection method is to control this online error-rate. If the distribution of the examples that traverse a node is stationary, the error rate of naive-Bayes decreases. If there is a change on the distribution of the examples the naive-Bayes error will increase.”  Here, Gama discloses a tree, and thus that data corresponds to the corresponding input node.  Gama here also discloses that “error-rate” are each generated at a corresponding input node among the input nodes (“at each node of all decision trees”).  This is based on a difference between an output corresponding to the input data and a prediction output, as it is an error rate based on the output of a classifier, which provides a prediction output.
Gama also discloses first errors and third errors.  Gama’s algorithm traverses the tree, and thus on any tree with over 3 layers there are at least first errors and third errors.  The first error has a corresponding input node and if the third error is performed on a node 2 levels down from the first error node, then that discloses third errors each generated at a corresponding input node on a layer among the layers that is lower than the corresponding input node.
However, Gama does not teach corresponding error is minimal.
Woods teaches corresponding error is minimal.  (Woods, Section 3.2, Last Sentence of Paragraph 1, discloses:  “When the individual classifiers disagree, local accuracy is estimated for each classifier, and the decision of the classifier with the highest local accuracy estimate is selected.”  A “highest accuracy” also means “lowest error”.  Thus, Woods discloses specifies a classifier for which a corresponding first error among the first errors is minimal.)
This concept, when combined with Gama, results in specifies, as the output node, a node which is among the input nodes and for which a condition that the corresponding first error is smaller than the corresponding third error is satisfied, since in this case one is choosing the node with the smallest error.

As per Claim 7, Gama teaches the information processing device according to claim 5.  Recall above that Gama Section 3 discloses learning a predetermined set of pieces of to-be-learned (and subsequently, learned) data (“training”).  Recall also that Gama, Bottom of page 1361, discloses state spaces (“instance space”) which is also illustrated in GamaTutorial, and thus discloses learned data included in the state spaces. Recall that Gama, Section 3.1.5.1, discloses:  “Each tree in the forest makes a prediction”, and thus prediction output and therefore prediction output based on learned data included in the state spaces.  Recall that Gama, Section 3.2, discloses:  “The UFFT algorithm maintains, at each node of all decision trees, a naive-Bayes classifier. Those classifiers were constructed using the sufficient statistics needed to evaluate the splitting criteria when that node was a leaf. When the leaf becomes a node the naive-Bayes classifier will classify the examples that traverse the node. The basic idea of the drift detection method is to control this online error-rate. If the distribution of the examples that traverse a node is stationary, the error rate of naive-Bayes decreases. If there is a change on the distribution of the examples the naive-Bayes error will increase.”  Here, Gama discloses a tree, and thus that data corresponds to the corresponding input node.  Gama here also discloses that “error-rate” are each generated at a corresponding input node among the input nodes (“at each node of all decision trees”).  This is based on a difference between an output corresponding to the input data and a prediction output, as it is an error rate based on the output of a classifier, which provides a prediction output.
Gama also discloses first errors and fourth errors and fifth errors.  Gama’s algorithm traverses the tree, and thus on any tree with over 5 layers there are at least first errors and fourth errors and fifth errors.  The first error has a corresponding input node and if the fourth and fifth errors are performed on nodes 3 and 4 levels down from the first error node, then that discloses fourth errors each generated at a corresponding input node on a layer among the layers that is lower than the corresponding input node and fifth errors each generated at a 
However, Gama does not teach corresponding error is minimal.
Woods teaches corresponding error is minimal.  (Woods, Section 3.2, Last Sentence of Paragraph 1, discloses:  “When the individual classifiers disagree, local accuracy is estimated for each classifier, and the decision of the classifier with the highest local accuracy estimate is selected.”  A “highest accuracy” also means “lowest error”.  Thus, Woods discloses specifies a classifier for which a corresponding first error among the first errors is minimal.)
This concept, when combined with Gama, results in specifies, as a node of interest, a node which is among the input nodes and for which a condition that the corresponding fourth error is smaller than the corresponding fifth error is satisfied, since in this case one is choosing the node with the smallest error.  In this case that the fourth error is smaller than the fifth error, then one determines, as a node of interest, a node which is among the input nodes and for which a corresponding first error is smaller than any other first error among first errors at nodes that are among the input nodes and that are lower than or same as the node.  Again, choosing the smallest first error is suggested by Woods.  Otherwise, if the fourth error is not smaller than the fifth error, proceed to a node among the input nodes that is located on a lower layer among the layers until the condition in which the corresponding fourth error is smaller than the corresponding fifth error is satisfied.  Iterating until the fourth error is smaller, is also suggested by Woods, as we are looking for the smallest error.  When the condition of the fourth error being smaller than the fifth error is satisfied, the output-node specification unit specifies the node of interest as the output node.  Again, here, we are choosing the smallest error.  
While in contrast, when the condition is not satisfied, the output-node specification unit causes the comparison for the corresponding fourth error and the corresponding fifth error to sequentially proceed to a node among the input nodes that is located on a lower layer among the layers until the condition in which the corresponding fourth error is smaller than the corresponding fifth error is satisfied, and wherein, when there does not exist any node for which the condition that the corresponding fourth error is smaller than the corresponding fifth error is satisfied until an arrival at a node among the input nodes that is one layer higher than the end node, the output-node specification unit specifies the end node as the output node.  Here, if the smallest error is not found, then it defaults to the end node.  Gama discloses this as they state in the Abstract:  “Naive-Bayes in leaves are used to classify test examples”, here Gama discloses the leaf, or end node, being used for the output.)

As per Claim 8, the combination of Gama and Woods teaches the information processing device according to claim 1.  Gama teaches a reliable node specification unit and reliability index acquired by the reliability index acquisition unit and reliability from among the input nodes corresponding to the input node (Recall that Gama, Section 3.2, discloses:  “The UFFT algorithm maintains, at each node of all decision trees, a naive-Bayes classifier. Those classifiers were constructed using the sufficient statistics needed to evaluate the splitting criteria when that node was a leaf. When the leaf becomes a node the naive-Bayes classifier will classify the examples that traverse the node. The basic idea of the drift detection method is to control this online error-rate. If the distribution of the examples that traverse a node is stationary, the error rate of naive-Bayes decreases. If there is a change on the distribution of the examples the naive-Bayes error will increase.”  Here, Gama discloses an “error-rate”, and thus a reliable node specification unit and reliability index acquired by the reliability index acquisition unit.  Gama also here discloses reliability from among the input nodes corresponding to the input node (“classify the examples that traverse the node”)).
However, Gama does not teach selects a highly reliable node having highest reliability from among the input nodes corresponding to the input node
Woods teaches selects a highly reliable node having highest reliability from among the input nodes corresponding to the input node (Woods, Section 3.2, Last Sentence of Paragraph 1, discloses:  “When the individual classifiers disagree, local accuracy is estimated for each classifier, and the decision of the classifier with the highest local accuracy estimate is selected.”  A “highest accuracy” also means “lowest error”.  Thus, Woods discloses selects a highly reliable node having highest reliability from among the input nodes corresponding to the input node)
Gama teaches a calculation possibility determination unit that determines whether or not a node among the input nodes that is located on a layer among the layers that is one layer lower than the highly reliable node is a node for which appropriate calculation is possible (Recall that Gama discloses a classifier in each node in Gama, Abstract: “Decision nodes and leaves contain naive-Bayes classifiers playing different roles during the induction process. Naive-Bayes in leaves are used to classify test examples. Naive-Bayes in inner nodes play two different roles. They can be used as multivariate splitting-tests if chosen by the splitting criteria, and used to detect changes in the class-distribution of the examples that traverse the node.”  This is on multiple layers (“decision nodes and leaves”), which includes a node among .  An “appropriate calculation” is not a specific term and may be very broadly interpreted.  Examiner is interpreting this “appropriate calculation is possible” as “a calculation where the reliability is lower (error is higher) is the result”.  If this is not the result (reliability is higher), then obviously it is not possible that the calculation results in lower reliability.)
However, Gama does not teach and a selective output-node specification unit that specifies the highly reliable node as the output node that is the basis of the generation of the prediction output when the node that is located on the layer one layer lower than the highly reliable node is the node for which the appropriate calculation is possible, and specifies the node that is located on the layer one layer lower than the highly reliable node as the output node that is the basis of the generation of the prediction output when, in contrast, the node that is located on the layer one layer lower than the highly reliable node is not the node for which the appropriate calculation is possible.  
Woods teaches and a selective output-node specification unit that specifies the highly reliable node as the output node that is the basis of the generation of the prediction output when the node that is located on the layer one layer lower than the highly reliable node is the node for which the appropriate calculation is possible, and specifies the node that is located on the layer one layer lower than the highly reliable node as the output node that is the basis of the generation of the prediction output when, in contrast, the node that is located on the layer one layer lower than the highly reliable node is not the node for which the appropriate calculation is possible.  (Recall above, that the “appropriate calculation” was interpreted by Examiner as where the node that is located on the layer one layer lower than the highly reliable node had lower reliability.  Also recall that Gama teaches nodes and layers.  Recall that Woods, Section 3.2, Last Sentence of Paragraph 1, discloses:  “When the individual classifiers disagree, local accuracy is estimated for each classifier, and the decision of the classifier with the highest local accuracy estimate is selected.”  A “highest accuracy” also means “lowest error”.  Thus, if the node that is located on the layer one layer lower than the highly reliable node, is the node for which the appropriate calculation is possible, then that node one layer lower is not as reliable as the highly reliable node.  Therefore, the highest accuracy node, as suggested by Woods is the highly reliable node, and thus is disclosed that specifies the highly reliable node as the output node that is the basis of the generation of the prediction output.  
Otherwise, the node that is located on the layer one layer lower than the highly reliable node is more reliable, and thus in this case the node that is located on the layer one layer lower than the highly reliable node is not the node for which the appropriate calculation is possible and Woods would choose the lower node as the highest accuracy node, and thus is disclosed specifies the node that is located on the layer one layer lower than the highly reliable node as the output node that is the basis of the generation of the prediction output.)

As per Claim 9, the combination of Gama and Woods teaches the information processing device according to claim 8.  Gama teaches wherein, in the determination by the calculation possibility determination unit on the possibility of the appropriate calculation, when a total number of pieces of to-be-learned data corresponding to the node that is located on the layer one layer lower than the [highly reliable] node is larger than or equal to two, it is determined that the appropriate calculation is possible, and when the total number of the pieces of to-be-learned data corresponding to the node that is located on the layer one layer lower than the [highly reliable] node is one, it is determined that the appropriate calculation is impossible.  (Recall that Gama discloses a classifier in each node in Gama, Abstract: “Decision nodes and leaves contain naive-Bayes classifiers playing different roles during the induction process. Naive-Bayes in leaves are used to classify test examples. Naive-Bayes in inner nodes play two different roles. They can be used as multivariate splitting-tests if chosen by the splitting criteria, and used to detect changes in the class-distribution of the examples that traverse the node.”  Thus, Gama teaches nodes and layers, including node that is located on the layer one layer lower than the [highly reliable] node.  Gama also discloses in Section 3 Last Paragraph:  “To detect concept drift we maintain, at each inner node, a naive-Bayes classifier trained with the examples that traverse the node”.  Here, Gama discloses “trained with the examples”, with “examples” in the plural form.  This implies that the training, and thus reliability/error calculation, must involve two or more pieces of data.  Thus, Gama discloses for a total number of pieces of to-be-learned data that is larger than or equal to two, it is determined that the appropriate calculation is possible, whereas for one piece of data, it is determined that the appropriate calculation is impossible.)
However, Gama does not teach highly reliable node.
Woods teaches highly reliable node (Woods, Section 3.2, Last Sentence of Paragraph 1, discloses:  “When the individual classifiers disagree, local accuracy is estimated for each classifier, and the decision of the classifier with the highest local accuracy estimate is selected.”  A “highest accuracy” also means “lowest error”.  Thus, Woods discloses selects a highly reliable node having highest reliability from among the input nodes corresponding to the input node).
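For illustration only, the claim language of Claims 8 and 9 quoted above can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Gama or Woods: layers are represented as positions on the path, `sample_counts[i]` is a hypothetical count of to-be-learned data at the i-th node, and the handling of a count of zero and of a highly reliable node with no lower layer are assumptions not addressed in the claims.

```python
from typing import Sequence


def calculation_possible(sample_count: int) -> bool:
    """Claim 9 rule: two or more pieces of data means possible; one means impossible.

    Treating a count of zero as impossible is an assumption for illustration.
    """
    return sample_count >= 2


def select_output(reliable_index: int, sample_counts: Sequence[int]) -> int:
    """Return the path index of the output node per the Claim 8 language."""
    lower_index = reliable_index + 1
    if lower_index >= len(sample_counts):
        # Assumption: with no lower layer, keep the highly reliable node.
        return reliable_index
    if calculation_possible(sample_counts[lower_index]):
        return reliable_index
    return lower_index


print(select_output(1, [10, 5, 3]))
print(select_output(1, [10, 5, 1]))
```

The first call keeps the highly reliable node because the node one layer lower holds three pieces of data; the second call specifies the lower node because it holds only one.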

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Gama as further graphically evidenced by GamaTutorial with Woods in view of Liu et al. (“FP-ELM: An online sequential learning algorithm for dealing with concept drift”; hereinafter Liu).
As per Claim 3, the combination of Gama and Woods teaches the information processing device according to claim 2.  Note from 112(b) rejections that Examiner is interpreting “the predetermined learning processing” as “the learning a predetermined set of pieces of to-be-learned data”.  Gama teaches first error having been already obtained through the learning a predetermined set of pieces of to-be-learned data and prediction output based on the learned data included in the state spaces that corresponds to the corresponding input node.  Recall above that Gama Section 3.2 discloses a first error (“error-rate”) and Gama Section 3 discloses learning a predetermined set of pieces of to-be-learned data (“training”).  Recall also that Gama, Bottom of page 1361, discloses state spaces (“instance space”) which is also illustrated in GamaTutorial.  Recall that Gama, Section 3.1.5.1, discloses:  “Each tree in the forest makes a prediction”, and thus prediction output.
However, Gama does not teach wherein the each first error is updated by performing a weighting addition using a forgetting coefficient a (0 < a < 1) on the each first error and an absolute value of a difference between the output corresponding to the input data and a prediction output.
Liu teaches wherein the each first error is updated by performing a weighting addition using a forgetting coefficient a (0 < a < 1) on the each first error and an absolute value of a difference between the output corresponding to the input data and a prediction output.  (Liu, Pg 323, Right Column above Eq 5, discloses:  “ELM is to minimize the training error [equation image: media_image3.png]”.  Liu, Pg 324 Top right, discloses: “To pay more attention to the new data chunk, we give a forgetting parameter α1 (0<α1<1) to the old data chunk ℵ0. Formally, the new output weight matrix β(1) is the solution to minimize [equation image: media_image4.png]”
Here, L1 is a first error as it is to be “minimized”. It comprises a difference between the output corresponding to the input data and a prediction output [equation image: media_image3.png].  Note that the fact that the [equation image: media_image5.png] is squared, makes the sign irrelevant, as x^2 = (-x)^2 = |x|^2, where |x| is the absolute value of x.  Thus ||H1B – T1||^2 = ||  |(H1B-T1)|  ||^2.  Therefore, here is disclosed an absolute value of a difference between the output corresponding to the input data and a prediction output.  Also note that alpha1 has been described as a forgetting coefficient (“forgetting parameter”) and (0<α1<1).  Finally, note that the loss function, first error L, is updated by performing a weighting addition, as [equation image: media_image6.png] is an addition operation, and the first term in this addition is weighted by the forgetting coefficient alpha1.)
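For illustration only, a weighting addition with a forgetting coefficient a (0 < a < 1) on a prior error and an absolute difference can be sketched as follows. This is a minimal sketch under stated assumptions, not Liu's formulation and not the applicant's: weighting the new term by (1 - a) is an assumption made here for illustration, and the function and parameter names are hypothetical.

```python
def update_first_error(prev_error: float, output: float, prediction: float, a: float) -> float:
    """Blend the previous error with the newest absolute error.

    The old error is weighted by the forgetting coefficient a; the (1 - a)
    weight on the new term is an assumption for illustration.
    """
    if not 0.0 < a < 1.0:
        raise ValueError("forgetting coefficient a must satisfy 0 < a < 1")
    return a * prev_error + (1.0 - a) * abs(output - prediction)


# The sign of the difference does not matter once the absolute value is taken.
e1 = update_first_error(0.5, output=1.0, prediction=0.0, a=0.5)
e2 = update_first_error(0.5, output=0.0, prediction=1.0, a=0.5)
print(e1, e2)
```

Both calls give the same result, which mirrors the observation above that squaring (or taking the absolute value of) the difference makes its sign irrelevant.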
Gama and Liu are analogous art because they are both in the field of endeavor of machine learning.
Therefore, it would have been obvious to a person having ordinary skill in the art, before the effective filing date of the invention, to combine the decision tree with a classifier at each 

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Lee et al. (US 2005/0144147 A1) discloses a decision tree with evaluation of the reliability of the tree nodes.
Domingos et al. ("Mining High-Speed Data Streams"), cited by the inventors’ paper (Numakura et al.’s “FAD learning: Separate Learning for Three Accelerations -Learning for Dynamics of Boat through Motor Babbling”), lays the foundation for VFDT, a decision tree algorithm for data streams.
Last ("Online classification of nonstationary data streams"), also cited by Numakura et al. above, discloses on Page 9 Figure 1, mapping input attributes to nodes and layers in a state space.
Anagnostopoulos et al. ("Information-Theoretic Data Discarding for Dynamic Trees on Data Streams") discloses in Section 5, applying a forgetting factor Lambda in [0,1] to decision trees.
Zhang ("Flexible and Approximate Computation through State-Space Reduction") discloses applying decision trees to a state space.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEONARD A SIEGER whose telephone number is (571) 272-9710.  The examiner can normally be reached on M-F 8:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached at (571) 272-9767.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR 





/L.A.S./Examiner, Art Unit 2126   
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126