DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 04/20/2022 has been entered.

Response to Arguments
Applicant’s arguments, see page 11 of the reply, filed 04/20/2022, with respect to the rejection of claims 8-10, 12, and 22-24 under 35 U.S.C. 112(b) have been fully considered, but are not found persuasive for the following reasons. Applicant, at page 11 of the reply, indicates that FIG. 1 and corresponding description in the specification show the structures related to claim 8. However, FIG. 1 and the corresponding description, as well as FIG. 5 and its corresponding description, merely provide a black box illustration of the means-plus-function limitations and a corresponding restatement of the functions performed. Merely restating a function associated with a means-plus-function limitation is insufficient to provide the corresponding structure for definiteness. See MPEP 2181(II)(B) and 2181(III). Therefore, the disclosure doesn’t provide the description necessary to support the limitations invoking 35 U.S.C. 112(f).
Applicant’s arguments, see pages 12-16 of the reply, filed 04/20/2022, with respect to the rejection of claims 1-3, 5-10, 12, 15-17, and 19-25 under 35 U.S.C. 101 have been fully considered and are persuasive.  Therefore, the rejections have been withdrawn. 
Applicant’s arguments, see page 17 of the reply, filed 04/20/2022, with respect to the rejection of claims 1, 5, and 21 under 35 U.S.C. 102(a)(1) as being anticipated by Rossi have been fully considered and are persuasive. Specifically, Applicant’s argument that Rossi doesn’t teach “normalizing and aggregating a respective set of similarity values associated with a respective input label” as recited in independent claims 1, 8 and 15 is found persuasive. However, a rejection is made in view of Singh et al., “Score Normalization and Aggregation for Active Learning in Multi-label Classification”.
Applicant’s argument at pages 16-17 that Rossi doesn’t teach “sampling the set of training vectors to select a subset of training vectors that represent the set of input labels” has been fully considered, but is not found persuasive for the following reasons. Applicant, at page 17 of the reply, asserts that Rossi provides mere recitation of using sampling. This is incorrect. Rossi not only indicates that sampling can be performed, but indicates how this sampling occurs by stating “In particular, given a new test instance x we can sample a small fraction of training instances denoted by Ds via an arbitrary distribution F and use this smaller set for predicting labels for x.” Rossi at section 3.2. That is, Rossi states that sampling is used and, when sampling is performed, the sampled training set Ds, not the entire training set D, is used in the prediction of labels. Therefore, Rossi teaches “sampling the set of training vectors to select a subset of training vectors that represent the set of input labels” as recited in claims 1, 8, and 15. Further, as noted above, Rossi’s sampled training set Ds is used in the label prediction, which comprises the determining of similarity values. Rossi at section 3, 1st paragraph. Therefore, Rossi also teaches “determining a similarity value between a respective input label and a corresponding feature label in the subset of training vectors to generate a set of similarity values for the input label” as recited in independent claims 1, 8, and 15.
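For illustration only, the sampling-then-prediction sequence discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Rossi or from the application: uniform sampling stands in for the "arbitrary distribution F", a Gaussian (RBF) kernel stands in for the similarity function, and the names sample_training_subset, rbf_similarity, and label_scores are hypothetical.

```python
import math
import random


def rbf_similarity(a, b, gamma=1.0):
    """Assumed similarity function: Gaussian kernel on squared distance."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)


def sample_training_subset(training_set, fraction, rng):
    """Draw a subset Ds from the full training set D.

    Uniform sampling without replacement is an assumption; the quoted
    passage only says the distribution is arbitrary.
    """
    size = max(1, int(len(training_set) * fraction))
    return rng.sample(training_set, size)


def label_scores(x, subset):
    """Accumulate, per label, the similarity between x and each sampled
    training vector that carries that label."""
    scores = {}
    for vector, labels in subset:
        similarity = rbf_similarity(x, vector)
        for label in labels:
            scores[label] = scores.get(label, 0.0) + similarity
    return scores
```

In this sketch only the sampled subset, not the full training set, is passed to label_scores, which is the point the quoted passage from Rossi section 3.2 makes.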

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
“a similarity calculation logic block configured to…” and “a label ranking logic block configured to…” in claim 8; and
“a label size prediction logic block configured to…” in claim 12.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 8-10, 12, and 22-24 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claim limitations “a similarity calculation logic block” and “a label ranking logic block” in claim 8,  and “a label size prediction logic block” in claim 12 each invoke 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. In particular, the specification is devoid of any structure that performs the cited functions of the “similarity calculation logic block”, “label ranking logic block”, and “label size prediction logic block” and merely restates the functions. Therefore, the claim is indefinite and is rejected under 35 U.S.C. 112(b) or pre-AIA  35 U.S.C. 112, second paragraph.
Applicant may:
(a)        Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph; 
(b)        Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(c)        Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).
If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: 
(a)        Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or 
(b)        Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.
Claims 9, 10, and 22-24 depend from claim 8 and are rejected for incorporating the defects noted above for claim 8.
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 8-10, 12, and 22-24 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. As described above regarding the rejection of claims 8-10, 12, and 22-24 under 35 U.S.C. 112(b), the disclosure doesn’t provide adequate structure to perform the claimed functions of the “similarity calculation logic block” and “label ranking logic block” in claim 8 and “label size prediction logic block” in claim 12. The specification doesn’t demonstrate that applicant has made an invention that achieves the claimed functions because the invention isn’t described with sufficient detail such that one of ordinary skill in the art can reasonably conclude that the inventor had possession of the claimed invention. Claims 9, 10, and 22-24 depend from claim 8 and are rejected for incorporating the defects of claim 8 indicated above.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1, 2, 3, 5, and 21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Rossi et al., “Similarity-based Multi-label Learning”, (herein Rossi) in view of Singh et al., “Score Normalization and Aggregation for Active Learning in Multi-label Classification” (herein Singh).

Regarding claim 1, Rossi teaches a method for facilitating multi-label classification for a machine learning mechanism, the method comprising: 
storing, by a computer system, a set of training vectors, wherein a respective vector represents an object, wherein a respective vector is associated with one or more feature labels that belong to a label set, and wherein a respective feature label corresponds to a feature of the object [Storing a training set, wherein a vector xi represents an instance (i.e. object) and is associated with a label set Yi, wherein each label represents a class (i.e. feature) of the instance. Rossi at section 2, 1st paragraph; section 3.2.];
receiving, by the computer system, an input vector representing a human-recognizable input object, wherein the input vector is associated with a set of input labels, and wherein a respective input label corresponds to an input feature of the input object [Receiving unseen instance xi associated with a set of labels. Rossi at section 2, 1st paragraph; section 3, 1st paragraph.] and 
classifying the input object using the machine learning mechanism, independently of supervision, on the computer system [Predicting/classifying the set of labels for the unseen instance xi. Rossi at section 2, 1st paragraph; section 3, 1st paragraph.];
wherein the machine learning mechanism performs the classification based on the set of input labels by: 
sampling the set of training vectors to select a subset of training vectors that represent the set of input labels, thereby reducing computational complexity for subsequent operations [A fraction of the training instances is sampled via an arbitrary distribution and this subset Ds is used in predicting labels. Rossi at section 3.2, 2nd paragraph];
determining a similarity value between a respective input label and a corresponding feature label in the subset of training vectors to generate a set of similarity values for the input label [A similarity function is used to determine the similarity between a label k and a corresponding label in the subset of training vectors Ds to generate a similarity value fk(xi). See Rossi at section 3, 1st paragraph], wherein the set of similarity values indicates a likelihood of an input feature corresponding to the input label being associated with the input object [Similarity fk(xi) is the confidence, or likelihood, of the label being associated with the unseen input instance. See Rossi at section 3, 1st paragraph; section 2, 1st paragraph]; and
learning one or more input labels for the input object based on the corresponding sets of similarity values [Label set Yi is predicted for input xi based on the similarity values f(xi). Rossi at section 3, last paragraph], 
wherein the learning of the one or more input labels are directed to an unseen instance for the machine learning mechanism [The prediction/classification of the set of labels is for the unseen instance xi. Rossi at section 2, 1st paragraph; section 3, 1st paragraph],
thereby allowing the computer system to determine one or more input features of the input object based on the learning of the one or more input labels [The Similarity-based Multi-label Learning allows determining various features of the input based on the predicted label set Yi, such as determining scene features (i.e. desert, mountains, etc) for an input scene based on the predicted scene labels. See Rossi at section 3, 1st paragraph; section 4.2].
Rossi doesn’t teach normalizing and aggregating a respective set of similarity values associated with a respective input label such that the learning is based on the corresponding normalized and aggregated set of similarity values. In the same field of multi-label learning, Singh teaches normalizing and aggregating a respective set of classification scores associated with a respective input label such that the learning is based on the corresponding normalized and aggregated set of classification scores [Singh at section III(A), 1st paragraph; Algorithm 1]. Normalizing and aggregating scores in multi-label learning facilitates the selection of maximally informative examples that lead to good classification with minimal effort [See Singh at section II(A), 2nd paragraph; section III, 1st and 4th paragraphs]. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the multi-label classification of Rossi so that a respective set of classification scores (i.e. Rossi’s similarity values) are normalized and aggregated, as taught by Singh, such that the classification is performed by normalizing and aggregating a respective set of similarity values associated with a respective input label and the learning is based on the corresponding normalized and aggregated set of similarity values in order to facilitate the selection of maximally informative examples that lead to good classification with minimal effort.
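For illustration only, one way a per-label set of similarity values could be normalized and then aggregated is sketched below. This is a sketch under stated assumptions, not a reproduction of Singh's Algorithm 1 or of the claimed method: scaling by the largest observed value and averaging are assumed choices, and the name normalize_and_aggregate and the example labels are hypothetical.

```python
def normalize_and_aggregate(similarity_sets):
    """Map each label to one score derived from its set of similarity values.

    similarity_sets: dict of label -> list of raw similarity values.
    Assumed normalization: divide by the largest value seen across all labels.
    Assumed aggregation: arithmetic mean of the normalized values.
    """
    all_values = [v for values in similarity_sets.values() for v in values]
    peak = max(all_values, default=0.0)
    aggregated = {}
    for label, values in similarity_sets.items():
        if not values or peak <= 0.0:
            # No evidence for this label in the sampled subset.
            aggregated[label] = 0.0
            continue
        normalized = [v / peak for v in values]
        aggregated[label] = sum(normalized) / len(normalized)
    return aggregated


def rank_labels(aggregated):
    """Order labels from highest to lowest aggregated score."""
    return sorted(aggregated, key=aggregated.get, reverse=True)
```

Under these assumptions every aggregated score falls between 0 and 1, so scores for different labels can be compared or ranked directly.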

Regarding claim 2, Rossi, as modified, teaches the method of claim 1, wherein prior to learning the input, the method further comprises associating the respective normalized and aggregated set of similarity values to the corresponding feature label in the subset of training vectors [The normalized and aggregated scores (i.e. similarity scores) are used to select informative examples for classifiers and are therefore associated with corresponding feature labels in the training subset prior to learning the input. See Singh at 4th Page, left column, 3rd full paragraph; See also rejection of claim 1 above.].

Regarding claim 3, Rossi, as modified, teaches the method of claim 1, further comprising ranking the input labels based on corresponding normalized and aggregated sets of similarity values [The normalizing and aggregating used in the combination includes ranking the input labels based on corresponding normalized and aggregated sets of scores, wherein the scores in the combination are similarity values. See claim 1 rejection above; Singh at section III(A)(3) “Ranking Score Normalization” and Algorithm 1].  

Regarding claim 5, Rossi and Singh teach the method of claim 1, further comprising predicting a number of features associated with the object based on the corresponding sets of similarity values [The size of xi is predicted (i.e. number of features) based on the similarity values. Rossi at section 3.1, 1st and 2nd paragraphs].

Regarding claim 21, Rossi and Singh teach the method of claim 1, wherein the machine learning mechanism performs the classification based on the set of input labels further by learning a threshold function based on the set of training vectors; and wherein learning the one or more input labels further comprises applying the threshold function to the corresponding set of similarity values of a respective input label [The label set is inferred/predicted by learning a threshold function. Rossi at section 3.1 on page 4; Equation 13].
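For illustration only, learning a threshold from training data and applying it to a label's score can be sketched as follows. This is a sketch under stated assumptions and does not reproduce Rossi's Equation 13: a single scalar threshold chosen to minimize label errors on the training examples is an assumed simplification, and the names learn_threshold and apply_threshold are hypothetical.

```python
def apply_threshold(scores, threshold):
    """Keep every label whose score meets or exceeds the threshold."""
    return {label for label, score in scores.items() if score >= threshold}


def learn_threshold(scored_examples):
    """Pick the scalar threshold with the fewest label errors on training data.

    scored_examples: list of (scores, true_labels) pairs, where scores is a
    dict of label -> score and true_labels is a set of labels.
    Candidate thresholds are the observed scores (an assumption); ties keep
    the smallest candidate.
    """
    candidates = sorted({s for scores, _ in scored_examples for s in scores.values()})
    best_threshold, best_errors = 0.0, None
    for candidate in candidates:
        errors = 0
        for scores, true_labels in scored_examples:
            predicted = apply_threshold(scores, candidate)
            # Symmetric difference counts both missed and spurious labels.
            errors += len(predicted ^ true_labels)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold
```

The learned value is then passed to apply_threshold together with the scores of an unseen instance to obtain its predicted label set.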


Claims 6 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Rossi and Singh and, further, in view of Tsoumakas et al., “Effective and Efficient Multilabel Classification in Domains with Large Numbers of Labels” (herein Tsoumakas).

Regarding claim 6, Rossi and Singh teach the method of claim 1. Rossi and Singh don’t teach that the feature labels are organized in one or more hierarchies. In the same field of multi-label learning, Tsoumakas teaches hierarchical multi-label classification using feature labels that are organized into one or more hierarchies [The labels are organized into a hierarchy. Tsoumakas at section 2, 1st – 2nd paragraphs; Fig. 1]. Using feature labels that are organized into one or more hierarchies facilitates classification that has improved performance and is computationally efficient [Tsoumakas at Abstract]. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the multi-label classification of Rossi and Singh to use the hierarchical multi-label classification using feature labels that are organized in one or more hierarchies, as taught by Tsoumakas, such that the modified classifiers in the hierarchy are those of Rossi and Singh, in order to facilitate multi-label learning that has improved performance and is computationally efficient.

Regarding claim 7, Rossi, as modified, teaches the method of claim 1, wherein subsequent to a high-level feature label being determined to be associated with the input vector, the method further comprises a second round of similarity-based determination process to determine one or more sub-labels corresponding to the high-level feature label [The hierarchical classification uses an iterative process (i.e. comprising a second round) to determine a subset of labels corresponding to the high-level feature label. Tsoumakas at section 2, 3rd – 5th paragraphs; Fig. 1. The classifiers are Rossi’s similarity-based classifier. See claim 6 rejection above.]. 
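For illustration only, a two-round determination over a label hierarchy can be sketched as follows. This is a sketch under stated assumptions and is not taken from Tsoumakas or Rossi: the hierarchy is assumed to be two levels deep, a single threshold is assumed for both rounds, and the names classify_hierarchically and score_fn are hypothetical.

```python
def classify_hierarchically(x, hierarchy, score_fn, threshold):
    """Run a first round over high-level labels and a second round over
    the sub-labels of each high-level label that was accepted.

    hierarchy: dict of high-level label -> list of sub-labels.
    score_fn:  callable (x, label) -> score; stands in for whatever
               similarity-based scorer is used (assumed interface).
    """
    selected = {}
    for high_level_label, sub_labels in hierarchy.items():
        # First round: skip the whole branch if the parent is rejected.
        if score_fn(x, high_level_label) < threshold:
            continue
        # Second round: score only the sub-labels under the accepted parent.
        selected[high_level_label] = [
            sub_label
            for sub_label in sub_labels
            if score_fn(x, sub_label) >= threshold
        ]
    return selected
```

Because rejected branches are never scored in the second round, fewer label comparisons are made than when every label in the hierarchy is scored.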


Claims 8-10, 12, 15-17, 19, 22, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Rossi in view of Chidlovskii, U.S. Patent Application Publication No. 2011/0302111, (herein Chidlovskii) and, further, in view of Singh.

Regarding claim 8, Rossi teaches a system for facilitating multi-label classification for a machine learning mechanism, the system comprising: 
storage circuitry configurable to store a set of training vectors, wherein a respective vector represents an object, wherein a respective vector is associated with one or more feature labels that belong to a label set, and wherein a respective feature label corresponds to a feature of the object [Storing a training set, wherein a vector xi represents an instance (i.e. object) and is associated with a label set Yi, wherein each label represents a class (i.e. feature) of the instance. Rossi at section 2, 1st paragraph; section 3.2.]; 
a similarity calculation logic block configured to receive an input vector representing a human-recognizable input object, wherein the input vector is associated with a set of input labels, wherein a respective input label corresponds to an input feature of the input object [Receiving unseen instance xi associated with a set of labels. Rossi at section 2, 1st paragraph; section 3, 1st paragraph.]; and 
a label ranking logic block configured to classify the input object using the machine learning mechanism, independently of supervision [Predicting/classifying the set of labels for the unseen instance xi. Rossi at section 2, 1st paragraph; section 3, 1st paragraph.], 
wherein the machine learning mechanism performs the classification based on the set of input labels by: 
sampling the set of training vectors to select a subset of training vectors that represent the set of input labels, thereby reducing computational complexity for subsequent operations [A fraction of the training instances is sampled via an arbitrary distribution and this subset Ds is used in predicting labels. Rossi at section 3.2, 2nd paragraph]; 
determining a similarity value between a respective input label and a corresponding feature label in the subset of training vectors to generate a set of similarity values for the input label [A similarity function is used to determine the similarity between a label k and a corresponding label in the subset of training vectors Ds to generate a similarity value fk(xi). See Rossi at section 3, 1st paragraph], wherein the set of similarity values indicates a likelihood of an input feature corresponding to the input label being associated with the input object [Similarity fk(xi) is the confidence, or likelihood, of the label being associated with the unseen input instance. See Rossi at section 3, 1st paragraph; section 2, 1st paragraph]; 
learning one or more input labels for the input object based on the corresponding sets of similarity values [Label set Yi is predicted for input xi based on the similarity values f(xi). Rossi at section 3, last paragraph], 
wherein the learning of the one or more input labels are directed to an unseen instance for the machine learning mechanism [The prediction/classification of the set of labels is for the unseen instance xi. Rossi at section 2, 1st paragraph; section 3, 1st paragraph], 
thereby allowing the compute system to determine one or more input features of the input object based on the learning of the one or more input labels [The Similarity-based Multi-label Learning allows determining various features of the input based on the predicted label set Yi, such as determining scene features (i.e. desert, mountains, etc) for an input scene based on the predicted scene labels. See Rossi at section 3, 1st paragraph; section 4.2].
Rossi doesn’t teach: processing circuitry; and normalizing and aggregating a respective set of similarity values associated with a respective input label such that the learning is based on the corresponding normalized and aggregated set of similarity values. In the same field of multi-label learning, Chidlovskii teaches processing circuitry for embodying the classification system [Processing device 10. Chidlovskii at paragraph 28.]. A person of ordinary skill in the art would have recognized that a classification system requires processing circuitry to implement/execute the classification algorithm. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the system of Rossi to further include processing circuitry, as taught by Chidlovskii, in order to implement/execute the classification algorithm.
Rossi and Chidlovskii, as combined, don’t teach: normalizing and aggregating a respective set of similarity values associated with a respective input label such that the learning is based on the corresponding normalized and aggregated set of similarity values. In the same field of multi-label learning, Singh teaches normalizing and aggregating a respective set of classification scores associated with a respective input label such that the learning is based on the corresponding normalized and aggregated set of classification scores [Singh at section III(A), 1st paragraph; Algorithm 1]. Normalizing and aggregating scores in multi-label learning facilitates the selection of maximally informative examples that lead to good classification with minimal effort [See Singh at section II(A), 2nd paragraph; section III, 1st and 4th paragraphs]. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the multi-label classification of Rossi so that a respective set of classification scores (i.e. Rossi’s similarity values) are normalized and aggregated, as taught by Singh, such that the classification is performed by normalizing and aggregating a respective set of similarity values associated with a respective input label and the learning is based on the corresponding normalized and aggregated set of similarity values in order to facilitate the selection of maximally informative examples that lead to good classification with minimal effort.

Regarding claim 9, Rossi, as modified, teaches the system of claim 8, wherein the label ranking logic block is further configured to associate the respective normalized and aggregated set of similarity values to the corresponding feature label in the subset of training vectors [The normalized and aggregated scores (i.e. similarity scores) are used to select informative examples for classifiers and are therefore associated with corresponding feature labels in the training subset prior to learning the input. See Singh at 4th Page, left column, 3rd full paragraph; See also rejection of claim 8 above.].

Regarding claim 10, Rossi, as modified, teaches the system of claim 8, wherein while determining the input labels, the label ranking logic block is further configured to rank the input labels based on corresponding normalized and aggregated set of similarity values [The normalizing and aggregating used in the combination includes ranking the input labels based on corresponding normalized and aggregated sets of scores, wherein the scores in the combination are similarity values. See claim 8 rejection above; Singh at section III(A)(3) “Ranking Score Normalization” and Algorithm 1].

Regarding claim 12, Rossi, as modified, teaches the system of claim 8, further comprising a label size prediction logic block configured to predict a number of features associated with the object based on the corresponding sets of similarity values [The size of xi is predicted (i.e. number of features) based on the similarity values. Rossi at section 3.1, 1st and 2nd paragraphs].

Regarding claim 15, Rossi teaches a method for facilitating multi-label classification for a machine learning mechanism, the method comprising: 
storing, by a computer system, a set of training vectors, wherein a respective vector represents an object, wherein a respective vector is associated with one or more feature labels that belong to a label set, and wherein a respective feature label corresponds to a feature of the object [Storing a training set, wherein a vector xi represents an instance (i.e., object) and is associated with a label set Yi, with each label representing a class (i.e., feature) of the instance. Rossi at section 2, 1st paragraph; section 3.2.];
receiving, by the computer system, an input vector representing a human-recognizable input object, wherein the input vector is associated with a set of input labels, and wherein a respective input label corresponds to an input feature of the input object [Receiving unseen instance xi associated with a set of labels. Rossi at section 2, 1st paragraph; section 3, 1st paragraph.]; and
classifying the input object using the machine learning mechanism, independently of supervision, on the computer system [Predicting/classifying the set of labels for the unseen instance xi. Rossi at section 2, 1st paragraph; section 3, 1st paragraph.];
wherein the machine learning mechanism performs the classification based on the set of input labels by: 
sampling the set of training vectors to select a subset of training vectors that represent the set of input labels, thereby reducing computational complexity for subsequent operations [A fraction of the training instances is sampled via an arbitrary distribution and this subset Ds is used in predicting labels. Rossi at section 3.2, 2nd paragraph];
determining a similarity value between a respective input label and a corresponding feature label in the subset of training vectors to generate a set of similarity values for the input label [A similarity function is used to determine the similarity between a label k and a corresponding label in the subset of training vectors Ds to generate a similarity value fk(xi). See Rossi at section 3, 1st paragraph], wherein the set of similarity values indicates a likelihood of an input feature corresponding to the input label being associated with the input object [Similarity fk(xi) is the confidence, or likelihood, of the label being associated with the unseen input instance. See Rossi at section 3, 1st paragraph; section 2, 1st paragraph]; and
learning one or more input labels for the input object based on the corresponding sets of similarity values [Label set Yi is predicted for input xi based on the similarity values f(xi). Rossi at section 3, last paragraph], 
wherein the learning of the one or more input labels are directed to an unseen instance for the machine learning mechanism [The prediction/classification of the set of labels is for the unseen instance xi. Rossi at section 2, 1st paragraph; section 3, 1st paragraph],
thereby allowing the computer system to determine one or more input features of the input object based on the learning of the one or more input labels [The Similarity-based Multi-label Learning allows determining various features of the input based on the predicted label set Yi, such as determining scene features (i.e. desert, mountains, etc) for an input scene based on the predicted scene labels. See Rossi at section 3, 1st paragraph; section 4.2].
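
For illustration only, the sequence of sampling, similarity determination, and label learning mapped above can be sketched as follows. This is a minimal sketch and not Rossi's implementation: the Gaussian (RBF) similarity, the use of a sum to combine similarity values, the fixed threshold, and every function name are assumptions introduced here.

```python
import math
import random


def rbf_similarity(a, b, gamma=1.0):
    """Assumed similarity function: Gaussian kernel on two feature vectors."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)


def sample_subset(training, fraction, seed=0):
    """Sample a fraction of the training set (the subset Ds) to cut later cost."""
    rng = random.Random(seed)
    size = max(1, round(fraction * len(training)))
    return rng.sample(training, size)


def label_similarity_sets(x, subset, labels):
    """For each label k, collect similarities between x and subset vectors carrying k."""
    return {
        k: [rbf_similarity(x, vector) for vector, label_set in subset if k in label_set]
        for k in labels
    }


def predict_labels(x, subset, labels, threshold):
    """Keep every label whose summed similarity reaches the (assumed) threshold."""
    sets = label_similarity_sets(x, subset, labels)
    return {k for k, values in sets.items() if sum(values) >= threshold}


# Hypothetical toy data: two "desert" scenes and two "mountains" scenes.
training = [
    ((0.0, 0.0), {"desert"}),
    ((0.1, 0.0), {"desert"}),
    ((5.0, 5.0), {"mountains"}),
    ((5.1, 5.0), {"mountains"}),
]
subset = sample_subset(training, fraction=1.0)
predicted = predict_labels((0.05, 0.0), subset, {"desert", "mountains"}, threshold=0.5)
```

In this toy setup the unseen vector lies next to the two "desert" instances, so only that label clears the threshold.
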
Rossi doesn’t teach: a non-transitory computer-readable storage medium storing instructions which when executed by a computer system cause the computer system to perform the above method; and normalizing and aggregating a respective set of similarity values associated with a respective input label such that the learning is based the corresponding normalized and aggregated set of similarity values. In the same field of multi-label learning, Chidlovskii teaches a non-transitory computer-readable storage medium storing instructions which when executed by a computer system cause the computer system to perform a method of multi-label learning [Processing device 10. Chidlovskii at paragraph 28.]. A person of ordinary skill in the art would have recognized that computer implemented classification requires stored instructions to implement/execute the classification algorithm. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the system of Rossi to further include a non-transitory computer-readable storage medium storing instructions which when executed by a computer system cause the computer system to perform the method of multi-label learning, as taught by Chidlovskii, in order to implement/execute the classification algorithm.
Rossi and Chidlovskii, as combined, don’t teach: normalizing and aggregating a respective set of similarity values associated with a respective input label such that the learning is based on the corresponding normalized and aggregated set of similarity values. In the same field of multi-label learning, Singh teaches normalizing and aggregating a respective set of classification scores associated with a respective input label such that the learning is based on the corresponding normalized and aggregated set of classification scores [Singh at section III(A), 1st paragraph; Algorithm 1]. Normalizing and aggregating scores in multi-label learning facilitates the selection of maximally informative examples that lead to good classification with minimal effort [See Singh at section II(A), 2nd paragraph; section III, 1st and 4th paragraphs]. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the multi-label classification of Rossi so that a respective set of classification scores (i.e., Rossi’s similarity values) is normalized and aggregated, as taught by Singh, such that the classification is performed by normalizing and aggregating a respective set of similarity values associated with a respective input label and the learning is based on the corresponding normalized and aggregated set of similarity values, in order to facilitate the selection of maximally informative examples that lead to good classification with minimal effort.
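
By way of illustration, normalizing and aggregating per-label sets of similarity values, and then ranking the labels on the result, might look like the sketch below. The specific scheme (dividing each value by the grand total, then summing per label) is an assumption chosen for simplicity; it is not asserted to be the scheme of Singh's Algorithm 1, and the function names are hypothetical.

```python
def normalize_and_aggregate(similarity_sets):
    """Map each label to one score: values scaled by the grand total, then summed.

    similarity_sets: dict mapping a label to its list of similarity values.
    The resulting per-label scores sum to 1 whenever the grand total is nonzero.
    """
    grand_total = sum(sum(values) for values in similarity_sets.values())
    if grand_total == 0:
        # No evidence for any label; avoid dividing by zero.
        return {label: 0.0 for label in similarity_sets}
    return {
        label: sum(value / grand_total for value in values)
        for label, values in similarity_sets.items()
    }


def rank_labels(aggregated):
    """Order labels from highest to lowest normalized, aggregated score."""
    return sorted(aggregated, key=aggregated.get, reverse=True)


# Hypothetical similarity sets for two labels.
scores = normalize_and_aggregate({"desert": [0.9, 0.7], "mountains": [0.2, 0.2]})
ranking = rank_labels(scores)
```

Because every label's score is expressed on the same scale, the ranking step can compare labels directly.
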

Regarding claim 16, Rossi, as modified, teaches the non-transitory computer-readable storage medium of claim 15, wherein prior to learning the input labels, the method further comprises associating the respective normalized and aggregated set of similarity values to the corresponding feature label in the subset of training vectors [The normalized and aggregated scores (i.e., similarity scores) are used to select informative examples for classifiers and are therefore associated with corresponding feature labels in the training subset prior to learning the input labels. See Singh at 4th page, left column, 3rd full paragraph; see also rejection of claim 15 above.].

Regarding claim 17, Rossi, as modified, teaches the non-transitory computer-readable storage medium of claim 16, wherein the method further comprises ranking the input labels based on corresponding normalized and aggregated sets of similarity values [The normalizing and aggregating used in the combination includes ranking the input labels based on corresponding normalized and aggregated sets of scores, wherein the scores in the combination are similarity values. See claim 15 rejection above; Singh at section III(A)(3) “Ranking Score Normalization” and Algorithm 1].

Regarding claim 19, Rossi, as modified, teaches the non-transitory computer-readable storage medium of claim 15, wherein the method further comprises predicting a number of features associated with the object based on the corresponding sets of similarity values [The size of xi is predicted (i.e. number of features) based on the similarity values. Rossi at section 3.1, 1st and 2nd paragraphs].
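
As a hedged illustration of predicting a number of features from similarity values, the sketch below estimates a label count as a similarity-weighted average of the label counts of training instances and then keeps that many top-scoring labels. The weighting scheme and the helper names are assumptions made here; they are not taken from Rossi section 3.1.

```python
def predict_label_count(similarities, training_label_counts):
    """Estimate how many labels apply, weighting each training count by similarity.

    similarities: similarity of the input to each training instance.
    training_label_counts: number of labels carried by each training instance.
    """
    total = sum(similarities)
    if total == 0:
        return 0
    weighted = sum(s * n for s, n in zip(similarities, training_label_counts))
    return round(weighted / total)


def top_k_labels(label_scores, k):
    """Return the k labels with the highest scores, best first."""
    ranked = sorted(label_scores, key=label_scores.get, reverse=True)
    return ranked[:k]


# Hypothetical values: the input resembles a two-label instance far more than a one-label one.
count = predict_label_count([0.9, 0.1], [2, 1])
chosen = top_k_labels({"a": 0.5, "b": 0.3, "c": 0.2}, count)
```
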

Regarding claim 22, Rossi, as modified, teaches the system of claim 8, wherein the machine learning mechanism performs the classification based on the set of input labels further by learning a threshold function based on the set of training vectors; and wherein the label ranking logic block is further configured to learn the one or more input labels by applying the threshold function to the corresponding set of similarity values of a respective input label [The label set is inferred/predicted by learning a threshold function. Rossi at section 3.1 on page 4; Equation 13].

Regarding claim 25, Rossi, as modified, teaches the non-transitory computer-readable storage medium of claim 15, wherein the machine learning mechanism performs the classification based on the set of input labels further by learning a threshold function based on the set of training vectors; and wherein the learning the one or more input labels further comprises applying the threshold function to the corresponding set of similarity values of a respective input label [The label set is inferred/predicted by learning a threshold function. Rossi at section 3.1 on page 4; Equation 13].
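
To illustrate what learning a threshold function from training vectors and applying it to similarity values can mean in practice, the sketch below picks, from a list of candidate thresholds, the one that produces the fewest label errors on training data. This is an assumed, simplified stand-in and is not represented to be Rossi's Equation 13; the names are hypothetical.

```python
def apply_threshold(label_scores, threshold):
    """Keep every label whose score is at least the threshold."""
    return {label for label, score in label_scores.items() if score >= threshold}


def learn_threshold(training_scores, training_labels, candidates):
    """Choose the candidate threshold with the fewest label errors on training data.

    training_scores: list of dicts, each mapping label -> score for one instance.
    training_labels: list of true label sets, aligned with training_scores.
    """
    def errors(threshold):
        total = 0
        for scores, truth in zip(training_scores, training_labels):
            predicted = apply_threshold(scores, threshold)
            # Symmetric difference counts both missed and spurious labels.
            total += len(predicted ^ truth)
        return total

    return min(candidates, key=errors)


# Hypothetical training scores and ground-truth label sets.
threshold = learn_threshold(
    [{"a": 0.8, "b": 0.3}, {"a": 0.2, "b": 0.7}],
    [{"a"}, {"b"}],
    candidates=[0.1, 0.5, 0.9],
)
labels = apply_threshold({"a": 0.6, "b": 0.4}, threshold)
```
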


Claims 20, 23, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Rossi, Chidlovskii, and Singh, and further in view of Tsoumakas.

Regarding claim 20, Rossi, as modified, teaches the non-transitory computer-readable storage medium of claim 15. Rossi, as modified, doesn’t teach that the feature labels are organized in one or more hierarchies. In the same field of multi-label learning, Tsoumakas teaches hierarchical multi-label classification using feature labels that are organized into one or more hierarchies [The labels are organized into a hierarchy. Tsoumakas at section 2, 1st – 2nd paragraphs; Fig. 1]. Using feature labels that are organized into one or more hierarchies facilitates classification that has improved performance and is computationally efficient [Tsoumakas at Abstract]. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the multi-label classification of Rossi and Singh to use the hierarchical multi-label classification using feature labels that are organized in one or more hierarchies, as taught by Tsoumakas, such that the modified classifiers in the hierarchy are those of Rossi (as previously modified), in order to facilitate multi-label learning that has improved performance and is computationally efficient.

Regarding claim 23, Rossi, as modified, teaches the system of claim 8. Rossi, as modified, doesn’t teach that the feature labels are organized in one or more hierarchies. In the same field of multi-label learning, Tsoumakas teaches hierarchical multi-label classification using feature labels that are organized into one or more hierarchies [The labels are organized into a hierarchy. Tsoumakas at section 2, 1st – 2nd paragraphs; Fig. 1]. Using feature labels that are organized into one or more hierarchies facilitates classification that has improved performance and is computationally efficient [Tsoumakas at Abstract]. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the multi-label classification of Rossi and Singh to use the hierarchical multi-label classification using feature labels that are organized in one or more hierarchies, as taught by Tsoumakas, such that the modified classifiers in the hierarchy are those of Rossi (as previously modified), in order to facilitate multi-label learning that has improved performance and is computationally efficient.

Regarding claim 24, Rossi, as modified, teaches the system of claim 23, wherein, subsequent to a high-level feature label being determined to be associated with the input vector, the similarity calculation logic block and label ranking logic block are configured to perform second round of similarity-based determination process to determine one or more sub-labels corresponding to the high-level feature label [The hierarchical classification uses an iterative process (i.e. comprising a second round) to determine a subset of labels corresponding to the high-level feature label. Tsoumakas at section 2, 3rd – 5th paragraphs; Fig. 1. The classifiers are Rossi’s similarity-based classifier. See claim 8 rejection above.].
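
The two-round determination discussed for claim 24 can be pictured with the following sketch, in which sub-labels are scored only after their high-level label has been accepted. It is a simplified illustration under stated assumptions: a two-level hierarchy, a single shared threshold, and a caller-supplied scoring function standing in for the similarity-based classifier. It does not reproduce Tsoumakas's method, and all names are hypothetical.

```python
def classify_hierarchically(score_fn, hierarchy, threshold):
    """Run a second scoring round over sub-labels of each accepted high-level label.

    score_fn: callable mapping a label to its similarity-based score for the input.
    hierarchy: dict mapping each high-level label to a list of its sub-labels.
    Returns a dict of accepted high-level labels to their accepted sub-labels.
    """
    accepted = {}
    for parent, children in hierarchy.items():
        # First round: decide whether the high-level label applies.
        if score_fn(parent) >= threshold:
            # Second round: score only the sub-labels under the accepted parent.
            accepted[parent] = [c for c in children if score_fn(c) >= threshold]
    return accepted


# Hypothetical scores for one input scene.
scores = {"outdoor": 0.9, "indoor": 0.1, "desert": 0.8, "mountains": 0.2, "kitchen": 0.9}
hierarchy = {"outdoor": ["desert", "mountains"], "indoor": ["kitchen"]}
result = classify_hierarchically(scores.get, hierarchy, threshold=0.5)
```

Sub-labels under a rejected high-level label are never scored, which is where a hierarchical arrangement can save computation.
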

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BENJAMIN P GEIB whose telephone number is (571)272-8628. The examiner can normally be reached Monday - Friday 8:30 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, ALEXEY SHMATOV, can be reached on (571)270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/BENJAMIN P GEIB/Primary Examiner, Art Unit 2123