DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. Regarding Claims 1, 8, and 15, the phrase “proper subsets of the base classes” is a relative term which renders the claim indefinite. The phrase “proper subsets” is not defined by the claim in such a way as to give bounds to what qualifies as “proper”, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. For these reasons, Claims 1, 8, and 15 are indefinite. Claims 2-7, 9-14, and 16-20 are rejected by virtue of their dependency on Claims 1, 8, and 15.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5-10, 12-17, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Meyerson et al (US 2019/0130257), in view of Josang et al (“Uncertainty Characteristics of Subjective Opinions”, 21st International Conference On Information Fusion, (10 July 2018), 1998-2005).
Regarding Claim 1, Meyerson teaches a method (Fig. 1, Fig. 9) for classifying an object ([0126], Fig. 1, model 101 with encoder 102 and numerous decoders, training data, a trainer, and an initializer, [0131], encoder 102 can include a classification component, [0158], each decoder includes a classification component), the method comprising: 
receiving data of an object to be classified ([0127], Fig. 1, encoder 102 is a processor that receives information characterizing input data and generates an alternative representation and/or characterization of the input data, [0153], encoder 102 produces an output which is fed as input to each of the decoders); and 
determining, using a neural network ([0127-0131], Fig. 1, encoder 102 is a neural network, encoder 102 comprises learnable components, parameters, and hyperparameters, an activation component, and a classification component, [0154-0158], Fig. 1, each decoder is a neural network, each decoder comprises learnable components, parameters, and hyperparameters, an activation component, and a classification component), a classification of the object including an indication of the probabilities of classes ([0154], Fig. 1, each decoder is a processor that receives, from the encoder 102, information characterizing input data (such as the encoding) and generates an alternative representation and/or characterization of the input data, such as classification scores).
Meyerson fails to teach wherein the classification is a hyper-opinion classification of the object including an indication of the probabilities of base classes and composite classes that are “or” combinations of proper subsets of the base classes.
In the same field of endeavor, Josang teaches wherein the classification is a hyper-opinion classification of the object including an indication of the probabilities of base classes and composite classes that are “or” combinations of proper subsets of the base classes (Sec. III.C.1-2, a hyper-opinion representation allows multiple choices to be represented under a specific opinion value, where belief mass may be assigned to a composite value consisting of a set of singleton values, where any overlap between composite beliefs is ignored, and where the domain is composed of mutually disjoint values).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the characterization of data using one or more neural network models for classification, as taught in Meyerson, to further include implementing hyper-opinion classification over composite classes of data, as taught in Josang, in order to allow for multiple choices to be represented under a specific opinion value, while considering and mitigating the vagueness and uncertainty associated with the opinion value. (See Josang Sec. III.C.1-2)
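The composite values described in Josang Sec. III.C.1-2 can be illustrated with a minimal sketch. It assumes, for illustration only, that a composite value is a set of at least two singleton values that does not span the entire domain; the function name `composite_values` and the class labels are hypothetical and appear in neither reference.

```python
from itertools import combinations


def composite_values(singletons):
    """Enumerate composite values over a domain of singleton values.

    Assumption for this sketch: a composite value contains at least two
    singletons and fewer than all of them, so the full domain is excluded.
    """
    domain = list(singletons)
    composites = []
    # Sizes 2 .. len(domain) - 1: larger than a singleton, smaller than the domain.
    for size in range(2, len(domain)):
        for members in combinations(domain, size):
            composites.append(frozenset(members))
    return composites


# Hypothetical base classes; each composite reads as "car or truck", etc.
base_classes = ["car", "truck", "bus"]
composites = composite_values(base_classes)
```

With three hypothetical base classes, the sketch yields the three two-member sets; a hyper-opinion may assign belief mass to any singleton or to any of these composites.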
Regarding Claim 2, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claim 1 above. The combination, particularly Meyerson, further teaches wherein the neural network is trained using a cost function that includes one or more of an entropy component, a penalty for selecting uncertainty over the composite classes, or a least squares component that includes a hyper parameter indicating cost for choosing a composite class of the composite classes ([0127-0129, 0154-0156], Fig. 1, encoder 102/each decoder is a neural network comprising learnable components, parameters, and hyperparameters that can be trained by backpropagating errors using an optimization algorithm, [0131, 0158], encoder 102/each decoder includes a classification component, [0094], cross-entropy loss is used for all classification tasks).
Regarding Claim 3, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claim 2 above. The combination, particularly Meyerson, further teaches wherein the cost function includes two or more of an entropy component, a penalty for selecting uncertainty over the composite classes, or a least squares component that includes a hyper parameter indicating cost for choosing a composite class of the composite classes ([0127-0129, 0154-0156], Fig. 1, encoder 102/each decoder is a neural network comprising learnable components, parameters, and hyperparameters that can be trained by backpropagating errors using an optimization algorithm, [0131, 0158], encoder 102/each decoder includes a classification component, [0094], cross-entropy loss is used for all classification tasks).
Regarding Claim 5, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claim 1 above. The combination, particularly Meyerson, further teaches wherein a target vector for training the neural network includes dependence of the composite on the base classes ([0114], supermodule refers to a sequence, arrangement, composition, and/or cascades of one or more modules, [0117], representing the hyperparameter values for each of the supermodules as vectors, embedding the vectors in a vector space, and clustering the vectors using a clustering algorithm such as Bayesian, K-means, or K-medoids algorithms, [0127-0129, 0154-0156], Fig. 1, encoder 102/each decoder is a neural network comprising learnable components, parameters, and hyperparameters that can be trained).
Regarding Claim 6, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claim 1 above. The combination, particularly Meyerson, further teaches wherein the neural network includes an output layer that implements a soft+ function to determine the classification ([0130, 0157], encoder 102/each decoder includes an activation component that applies a non-linearity function, such as a sigmoid function, rectified linear units (ReLUs), hyperbolic tangent function, absolute of hyperbolic tangent function, leaky ReLUs (LReLUs), and parametrized ReLUs).
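For reference, a smooth non-linearity of the general kind listed above can be sketched as follows. The sketch assumes that the claimed "soft+ function" corresponds to the commonly known softplus function, log(1 + e^x); that correspondence is an assumption made for illustration and is not a finding drawn from the claims or from Meyerson.

```python
import math


def softplus(x: float) -> float:
    """Softplus non-linearity, log(1 + exp(x)), in a numerically stable form.

    Assumption for this sketch: "soft+" in the claim denotes softplus.
    The identity log(1 + e^x) = max(x, 0) + log(1 + e^(-|x|)) avoids
    overflow for large positive inputs.
    """
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Applied elementwise to hypothetical output-layer pre-activations.
pre_activations = [-5.0, 0.0, 2.5]
outputs = [softplus(z) for z in pre_activations]
```

Like a ReLU, the function is near zero for strongly negative inputs and near the identity for strongly positive inputs, but it is smooth and strictly positive throughout.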
Regarding Claim 7, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claim 1 above. While Meyerson teaches wherein training the neural network includes backpropagating error based on the classification accuracy ([0129, 0156], Fig. 1, encoder 102/each decoder comprises learnable components, parameters, and hyperparameters that can be trained by backpropagating errors using an optimization algorithm), Meyerson fails to teach projecting composite class probabilities and combining the projected composite class probabilities with base class probabilities to determine classification accuracy.
In the same field of endeavor, Josang teaches projecting composite class probabilities and combining the projected composite class probabilities with base class probabilities to determine classification accuracy (Sec. III.C.2, a hyper-opinion can be projected into a multinomial opinion, excluding any overlap between composite set beliefs, where approximation by projection removes information from the representation of opinions, specifically removing the vagueness of hyper-opinions at the cost of estimation accuracy in the probability of a singleton opinion).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the characterization of data using one or more neural network models for classification, as taught in Meyerson, to further include implementing hyper-opinion classification over composite classes of data, where said hyper-opinion is further projected over base multinomial opinions, as taught in Josang, in order to allow for multiple choices to be represented under a specific opinion value, while allowing an opinion to be viewed without the veil of vagueness and facilitating a more direct and intuitive interpretation of the opinion. (See Josang Sec. III.C.1-2)
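The projection relied upon above can be sketched as follows. The sketch assumes the projected-probability formula commonly given in the subjective logic literature, P(x) = sum over x_j containing x of [a(x)/a(x_j)] b(x_j) + a(x) u, where b is belief mass, u is uncertainty mass, and a is the base rate; it is not quoted from Josang Sec. III.C.2, and the function name, class labels, and numeric values are hypothetical.

```python
def project_hyper_opinion(belief, uncertainty, base_rate):
    """Project a hyper-opinion onto singleton (base class) probabilities.

    belief:      dict mapping frozenset of classes -> belief mass
    uncertainty: uncertainty mass u (belief masses plus u sum to 1)
    base_rate:   dict mapping each base class -> prior probability
    """
    projected = {}
    for cls, a_cls in base_rate.items():
        # Share of the uncertainty mass, in proportion to the base rate.
        prob = a_cls * uncertainty
        for subset, mass in belief.items():
            if cls in subset:
                # Base rate of a set is the sum of its members' base rates.
                a_subset = sum(base_rate[m] for m in subset)
                # A composite's mass is split among members by relative base rate.
                prob += (a_cls / a_subset) * mass
        projected[cls] = prob
    return projected


# Hypothetical hyper-opinion over three base classes.
base_rate = {"car": 1 / 3, "truck": 1 / 3, "bus": 1 / 3}
belief = {
    frozenset({"car"}): 0.4,
    frozenset({"car", "truck"}): 0.3,  # composite: "car or truck"
    frozenset({"bus"}): 0.1,
}
probabilities = project_hyper_opinion(belief, 0.2, base_rate)
```

In this hypothetical case the mass held by the composite "car or truck" is divided between its two members, which is the step that discards the vagueness carried by the composite value.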
Regarding Claim 8, Meyerson teaches a non-transitory machine-readable medium including instructions that, when executed by a machine ([0187-0189], Fig. 9, system including memory and one or more processors), cause the machine (Fig. 1, Fig. 9) to perform operations for classifying an object ([0126], Fig. 1, model 101 with encoder 102 and numerous decoders, training data, a trainer, and an initializer, [0131], encoder 102 can include a classification component, [0158], each decoder includes a classification component), the operations comprising: 
receiving data of an object to be classified ([0127], Fig. 1, encoder 102 is a processor that receives information characterizing input data and generates an alternative representation and/or characterization of the input data, [0153], encoder 102 produces an output which is fed as input to each of the decoders); and 
determining, using a neural network ([0127-0131], Fig. 1, encoder 102 is a neural network, encoder 102 comprises learnable components, parameters, and hyperparameters, an activation component, and a classification component, [0154-0158], Fig. 1, each decoder is a neural network, each decoder comprises learnable components, parameters, and hyperparameters, an activation component, and a classification component), a classification of the object including an indication of the probabilities of classes ([0154], Fig. 1, each decoder is a processor that receives, from the encoder 102, information characterizing input data (such as the encoding) and generates an alternative representation and/or characterization of the input data, such as classification scores).
Meyerson fails to teach wherein the classification is a hyper-opinion classification of the object including an indication of the probabilities of base classes and composite classes that are “or” combinations of proper subsets of the base classes.
In the same field of endeavor, Josang teaches wherein the classification is a hyper-opinion classification of the object including an indication of the probabilities of base classes and composite classes that are “or” combinations of proper subsets of the base classes (Sec. III.C.1-2, a hyper-opinion representation allows multiple choices to be represented under a specific opinion value, where belief mass may be assigned to a composite value consisting of a set of singleton values, where any overlap between composite beliefs is ignored, and where the domain is composed of mutually disjoint values).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the characterization of data using one or more neural network models for classification, as taught in Meyerson, to further include implementing hyper-opinion classification over composite classes of data, as taught in Josang, in order to allow for multiple choices to be represented under a specific opinion value, while considering and mitigating the vagueness and uncertainty associated with the opinion value. (See Josang Sec. III.C.1-2)
Regarding Claim 9, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claim 8 above. The combination, particularly Meyerson, further teaches wherein the neural network is trained using a cost function that includes one or more of an entropy component, a penalty for selecting uncertainty over the composite classes, or a least squares component that includes a hyper parameter indicating cost for choosing a composite class of the composite classes ([0127-0129, 0154-0156], Fig. 1, encoder 102/each decoder is a neural network comprising learnable components, parameters, and hyperparameters that can be trained by backpropagating errors using an optimization algorithm, [0131, 0158], encoder 102/each decoder includes a classification component, [0094], cross-entropy loss is used for all classification tasks).
Regarding Claim 10, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claim 9 above. The combination, particularly Meyerson, further teaches wherein the cost function includes two or more of an entropy component, a penalty for selecting uncertainty over the composite classes, or a least squares component that includes a hyper parameter indicating cost for choosing a composite class of the composite classes ([0127-0129, 0154-0156], Fig. 1, encoder 102/each decoder is a neural network comprising learnable components, parameters, and hyperparameters that can be trained by backpropagating errors using an optimization algorithm, [0131, 0158], encoder 102/each decoder includes a classification component, [0094], cross-entropy loss is used for all classification tasks).
Regarding Claim 12, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claim 8 above. The combination, particularly Meyerson, further teaches wherein a target vector for training the neural network includes dependence of the composite on the base classes ([0114], supermodule refers to a sequence, arrangement, composition, and/or cascades of one or more modules, [0117], representing the hyperparameter values for each of the supermodules as vectors, embedding the vectors in a vector space, and clustering the vectors using a clustering algorithm such as Bayesian, K-means, or K-medoids algorithms, [0127-0129, 0154-0156], Fig. 1, encoder 102/each decoder is a neural network comprising learnable components, parameters, and hyperparameters that can be trained).
Regarding Claim 13, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claim 8 above. The combination, particularly Meyerson, further teaches wherein the neural network includes an output layer that implements a soft+ function to determine the classification ([0130, 0157], encoder 102/each decoder includes an activation component that applies a non-linearity function, such as a sigmoid function, rectified linear units (ReLUs), hyperbolic tangent function, absolute of hyperbolic tangent function, leaky ReLUs (LReLUs), and parametrized ReLUs).
Regarding Claim 14, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claim 8 above. While Meyerson teaches wherein training the neural network includes backpropagating error based on the classification accuracy ([0129, 0156], Fig. 1, encoder 102/each decoder comprises learnable components, parameters, and hyperparameters that can be trained by backpropagating errors using an optimization algorithm), Meyerson fails to teach projecting composite class probabilities and combining the projected composite class probabilities with base class probabilities to determine classification accuracy.
In the same field of endeavor, Josang teaches projecting composite class probabilities and combining the projected composite class probabilities with base class probabilities to determine classification accuracy (Sec. III.C.2, a hyper-opinion can be projected into a multinomial opinion, excluding any overlap between composite set beliefs, where approximation by projection removes information from the representation of opinions, specifically removing the vagueness of hyper-opinions at the cost of estimation accuracy in the probability of a singleton opinion).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the characterization of data using one or more neural network models for classification, as taught in Meyerson, to further include implementing hyper-opinion classification over composite classes of data, where said hyper-opinion is further projected over base multinomial opinions, as taught in Josang, in order to allow for multiple choices to be represented under a specific opinion value, while allowing an opinion to be viewed without the veil of vagueness and facilitating a more direct and intuitive interpretation of the opinion. (See Josang Sec. III.C.1-2)
Regarding Claim 15, Meyerson teaches a system (Fig. 1, Fig. 9) for classifying an object ([0126], Fig. 1, model 101 with encoder 102 and numerous decoders, training data, a trainer, and an initializer, [0131], encoder 102 can include a classification component, [0158], each decoder includes a classification component), the system comprising: 
a memory ([0187-0189], Fig. 9, system including memory and one or more processors) including data specifying parameters of a neural network stored thereon ([0127-0131], Fig. 1, encoder 102 is a neural network, encoder 102 comprises learnable components, parameters, and hyperparameters, an activation component, and a classification component, [0154-0158], Fig. 1, each decoder is a neural network, each decoder comprises learnable components, parameters, and hyperparameters, an activation component, and a classification component); 
processing circuitry configured to: receive data of an object to be classified ([0127], Fig. 1, encoder 102 is a processor that receives information characterizing input data and generates an alternative representation and/or characterization of the input data, [0153], encoder 102 produces an output which is fed as input to each of the decoders); and 
determine, by executing the neural network on the received data, a classification of the object including an indication of the probabilities of classes ([0154], Fig. 1, each decoder is a processor that receives, from the encoder 102, information characterizing input data (such as the encoding) and generates an alternative representation and/or characterization of the input data, such as classification scores).
Meyerson fails to teach wherein the classification is a hyper-opinion classification of the object including an indication of the probabilities of base classes and composite classes that are “or” combinations of proper subsets of the base classes.
In the same field of endeavor, Josang teaches wherein the classification is a hyper-opinion classification of the object including an indication of the probabilities of base classes and composite classes that are “or” combinations of proper subsets of the base classes (Sec. III.C.1-2, a hyper-opinion representation allows multiple choices to be represented under a specific opinion value, where belief mass may be assigned to a composite value consisting of a set of singleton values, where any overlap between composite beliefs is ignored, and where the domain is composed of mutually disjoint values).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the characterization of data using one or more neural network models for classification, as taught in Meyerson, to further include implementing hyper-opinion classification over composite classes of data, as taught in Josang, in order to allow for multiple choices to be represented under a specific opinion value, while considering and mitigating the vagueness and uncertainty associated with the opinion value. (See Josang Sec. III.C.1-2) 
Regarding Claim 16, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claim 15 above. The combination, particularly Meyerson, further teaches wherein the neural network parameters are trained using a cost function that includes one or more of an entropy component, a penalty for selecting uncertainty over the composite classes, or a least squares component that includes a hyper parameter indicating cost for choosing a composite class of the composite classes ([0127-0129, 0154-0156], Fig. 1, encoder 102/each decoder is a neural network comprising learnable components, parameters, and hyperparameters that can be trained by backpropagating errors using an optimization algorithm, [0131, 0158], encoder 102/each decoder includes a classification component, [0094], cross-entropy loss is used for all classification tasks).
Regarding Claim 17, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claim 16 above. The combination, particularly Meyerson, further teaches wherein the cost function includes two or more of an entropy component, a penalty for selecting uncertainty over the composite classes, or a least squares component that includes a hyper parameter indicating cost for choosing a composite class of the composite classes ([0127-0129, 0154-0156], Fig. 1, encoder 102/each decoder is a neural network comprising learnable components, parameters, and hyperparameters that can be trained by backpropagating errors using an optimization algorithm, [0131, 0158], encoder 102/each decoder includes a classification component, [0094], cross-entropy loss is used for all classification tasks).
Regarding Claim 19, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claim 15 above. The combination, particularly Meyerson, further teaches wherein a target vector for training the neural network includes dependence of the composite on the base classes ([0114], supermodule refers to a sequence, arrangement, composition, and/or cascades of one or more modules, [0117], representing the hyperparameter values for each of the supermodules as vectors, embedding the vectors in a vector space, and clustering the vectors using a clustering algorithm such as Bayesian, K-means, or K-medoids algorithms, [0127-0129, 0154-0156], Fig. 1, encoder 102/each decoder is a neural network comprising learnable components, parameters, and hyperparameters that can be trained).
Regarding Claim 20, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claim 15 above. While Meyerson teaches wherein training the neural network includes backpropagating error based on the classification accuracy ([0129, 0156], Fig. 1, encoder 102/each decoder comprises learnable components, parameters, and hyperparameters that can be trained by backpropagating errors using an optimization algorithm), Meyerson fails to teach projecting composite class probabilities and combining the projected composite class probabilities with base class probabilities to determine classification accuracy.
In the same field of endeavor, Josang teaches projecting composite class probabilities and combining the projected composite class probabilities with base class probabilities to determine classification accuracy (Sec. III.C.2, a hyper-opinion can be projected into a multinomial opinion, excluding any overlap between composite set beliefs, where approximation by projection removes information from the representation of opinions, specifically removing the vagueness of hyper-opinions at the cost of estimation accuracy in the probability of a singleton opinion).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the characterization of data using one or more neural network models for classification, as taught in Meyerson, to further include implementing hyper-opinion classification over composite classes of data, where said hyper-opinion is further projected over base multinomial opinions, as taught in Josang, in order to allow for multiple choices to be represented under a specific opinion value, while allowing an opinion to be viewed without the veil of vagueness and facilitating a more direct and intuitive interpretation of the opinion. (See Josang Sec. III.C.1-2)

Claims 4, 11, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Meyerson et al (US 2019/0130257), in view of Josang et al (“Uncertainty Characteristics of Subjective Opinions”, 21st International Conference On Information Fusion, (10 July 2018), 1998-2005), and further in view of Dasgupta et al (US 2020/0143252).
Regarding Claims 4, 11, and 18, Meyerson, as modified by Josang, teaches all aspects of the claimed invention as disclosed in Claims 3, 10, and 17 above. Meyerson further teaches wherein the cost function includes: an entropy component and a least squares component that includes a hyper parameter indicating cost for choosing a composite class of the composite classes ([0127-0129, 0154-0156], Fig. 1, encoder 102/each decoder is a neural network comprising learnable components, parameters, and hyperparameters that can be trained by backpropagating errors using an optimization algorithm, [0131, 0158], encoder 102/each decoder includes a classification component, [0094], cross-entropy loss is used for all classification tasks).
The combination fails to teach wherein the cost function includes a penalty for selecting uncertainty over the composite classes.
In the same field of endeavor, Dasgupta teaches wherein the cost function includes a penalty for selecting uncertainty over the composite classes ([0060], a penalty term to the cost may be introduced, [0095], the orthogonality of the set of embeddings is optimized based on a cost function, wherein the cost function includes a penalty term).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the characterization of data using one or more neural network models for classification where the neural networks are trained in accordance with one or more cost functions, as taught in Meyerson, modified by Josang, to further include a penalty among the cost calculations, as taught in Dasgupta, in order to reduce complexity of deep learning while still being able to forecast and characterize uncertainty simultaneously. (See Dasgupta [0002-0006])

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: Kadayam Viswanathan et al (US 2021/0073631) discloses neural network systems and related machine learning methods that use a dual neural network architecture to determine uncertainties associated with predicted output data ([0007-0016]).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARGARET G MASTRODONATO whose telephone number is (571) 270-7803. The examiner can normally be reached M-F, 9:00 AM-6:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Charles Appiah, can be reached at (571) 272-7904. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MARGARET G MASTRODONATO/Primary Examiner, Art Unit 2641