Detailed Action
This action is in response to Applicant's communications filed 14 September 2022.  
Claims 1, 10, and 15 were amended.  No claims were cancelled. No claims were withdrawn.  No claims were added.  Therefore, claims 1 and 5-15 are pending in this Application. 
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments/Arguments
Applicant's arguments, filed 14 September 2022, regarding the rejections of claims 1 and 5-15 under 35 USC 102 and 103 have been fully considered but are moot because the arguments do not apply to any of the references being used in the current rejection.
Applicant’s arguments, filed 14 September 2022, with respect to the rejections of claims 1 and 5-15 under 35 USC 102 and 35 USC 103 concern the newly amended claims and are addressed in the current rejection.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim(s) 1, 5, and 10-15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Marszalek et al. (Semantic Hierarchies for Visual Object Recognition, hereinafter "Marszalek") in view of Farabet et al. (Learning Hierarchical Features for Scene Labeling, hereinafter "Farabet").

Regarding Claim 1,
Marszalek teaches an information processing apparatus comprising: 
an acquisition section configured to acquire a semantic network (Figure 2, WordNet subgraphs; "Knowledge can be modeled by ontologies. For example, lexical semantic networks are used to model human psycholinguistic knowledge. One of the most popular semantic networks for English language is WordNet [6]. It groups words into sets of synonyms and records different semantic relations between them. This allows to infer, for example, that a car is a wheeled vehicle and that a motorcycle is also a wheeled vehicle, thus both should incorporate a wheel." sec. 1, p. 1)
including information indicating a relationship between a plurality of nodes ("Thus, synsets model concepts and are represented with nodes in the semantic graph. Between synsets semantic relationships are defined. Between nouns, antonymy (opposition in meaning), hypernymy/hyponymy (superterm/subterm) and holonymy/meronymy (is a part of/contains) relationships are possible. A synset can also create a domain (a topical class), to which other synsets are linked. Semantic relations are represented with directed edges (links) in the semantic graph." sec. 2.2, p. 3), 
acquire identification information of data, and acquire a label corresponding to each node of the plurality of nodes forming the semantic network (Figure 2; "As we focus on detection and need strong reasoning for training the hierarchic classifier, we first extract from the WordNet the synsets that correspond to the class labels and then follow the hypernymy and meronymy links to obtain the subgraph." sec. 2.2, p. 3); and 
a learning section configured to learn a classification model that classifies the data into the label corresponding to each node ("In order to explain the construction of the semantic hierarchic classifier, we first discuss a model framework in which a discriminative SVM classifier (cf. subsection 2.1) is associated with each edge of the obtained semantic graph (cf. subsection 2.2)." sec. 2.3, p. 4), on a basis of the semantic network, the identification information, and a plurality of labels that have been acquired by the acquisition section (Figure 2; "As we focus on detection and need strong reasoning for training the hierarchic classifier, we first extract from the WordNet the synsets that correspond to the class labels and then follow the hypernymy and meronymy links to obtain the subgraph." sec. 2.2, p. 3), 
the classification model classifying of the data using a learning criterion as to whether a relationship between the plurality of the labels included in a classification result of the data conforms to the relationship between the plurality of nodes in the semantic network ("We train a given Bi|A classifier associated with the Bi->A hypernymy or meronymy edge by training a binary SVM classifier with P = supp(Bi), N = supp(A) - supp(Bi) (Eq. 2), where P is the set of positive training exemplars and N is the set of negative ones. Given a test sample and knowing that it represents the A concept, we can then consider descending through hyponymy and holonymy links to Bi. We do so, when the detector associated with the Bi->A link returns a positive answer. For instance, if we know that a test image satisfies the organism concept, we can check whether it satisfies the person concept by running the person|organism classifier distinguishing between people and other organisms like animals." sec. 2.3, p. 4). 
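For illustration only, the per-edge training sets described in the quoted passage of Marszalek (P = supp(Bi) and N = supp(A) - supp(Bi)) can be sketched as follows. This is a minimal sketch and not code from either reference: the toy hierarchy `PARENT`, the annotations `LABELS`, and the helper names are hypothetical, and supp(X) is assumed here to mean the set of samples labeled with concept X or any of its descendants.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy hierarchy: child concept -> parent concept (one edge each).
PARENT: Dict[str, str] = {
    "person": "organism",
    "animal": "organism",
    "dog": "animal",
    "cat": "animal",
}

# Hypothetical annotations: sample identifier -> class label.
LABELS: Dict[str, str] = {
    "img1": "person",
    "img2": "dog",
    "img3": "cat",
    "img4": "person",
}


def ancestors(concept: str) -> List[str]:
    """Return the concept followed by every concept reachable via parent links."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def supp(concept: str) -> Set[str]:
    """Samples whose label is the concept itself or one of its descendants."""
    return {s for s, label in LABELS.items() if concept in ancestors(label)}


def edge_training_sets(child: str) -> Tuple[Set[str], Set[str]]:
    """Build P = supp(Bi) and N = supp(A) - supp(Bi) for the edge Bi -> A."""
    parent = PARENT[child]
    positives = supp(child)
    negatives = supp(parent) - positives
    return positives, negatives


positives, negatives = edge_training_sets("person")
print(sorted(positives))  # ['img1', 'img4']
print(sorted(negatives))  # ['img2', 'img3']
```

Under these assumptions, the person|organism edge is trained on people as positives and on the remaining organisms as negatives, which mirrors the example given in the quoted passage. Fitting an actual SVM on these sets is omitted.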

Marszalek does not explicitly teach wherein the acquisition section and the learning section are each implemented via at least one processor.
Farabet teaches wherein the acquisition section and the learning section ("This paper presents a scene parsing system that relies on deep learning methods to approach both questions. The main idea is to use a convolutional network (ConvNet) [27] operating on a large input window to produce label hypotheses for each pixel location. The convolutional net is fed with raw image pixels (after band-pass filtering and contrast normalization), and trained in supervised mode from fully labeled images to produce a category for each pixel location. ConvNets are composed of multiple stages, each of which contains a filter bank module, a nonlinearity, and a spatial pooling module. With end-to-end training, ConvNets can automatically learn hierarchical feature representations." sec. 1, p. 1915) are each implemented via at least one processor ("Implementations on multicore machines, general-purpose GPUs, digital signal processors, or specialized architectures implemented on FPGAs is straightforward." p. 1927).
Marszalek and Farabet are analogous art because both are directed to semantic models for visual object recognition. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the semantic classifier of Marszalek with the semantic classifier of Farabet.  The modification would have been obvious because one of ordinary skill in the art would be motivated to improve accuracy and speed, as suggested by Farabet ("The system yields record accuracies... while being an order of magnitude faster than competing approaches" Abstract, p. 1915).

Regarding Claim 5,
The Marszalek/Farabet combination teaches the information processing apparatus of claim 1.  Marszalek further teaches wherein the learning section performs learning on a basis of a feedback to output information regarding a learning result ("We train a given Bi|A classifier associated with the Bi->A hypernymy or meronymy edge by training a binary SVM classifier with P = supp(Bi), N = supp(A) - supp(Bi) (Eq. 2), where P is the set of positive training exemplars and N is the set of negative ones. Given a test sample and knowing that it represents the A concept, we can then consider descending through hyponymy and holonymy links to Bi. We do so, when the detector associated with the Bi->A link returns a positive answer. For instance, if we know that a test image satisfies the organism concept, we can check whether it satisfies the person concept by running the person|organism classifier distinguishing between people and other organisms like animals." sec. 2.3, p. 4; training the SVM classifier using positive and negative training examples teaches learning using feedback).

Regarding Claim 10,
The Marszalek/Farabet combination teaches the information processing apparatus of claim 5.  Farabet further teaches wherein the classification model is mounted by a neural network ("a convolutional network (ConvNet)" sec. 1, p. 1915), wherein the output information includes output values of one or more units included in the neural network ("ConvNets [26], [27] are trainable architectures composed of multiple stages. The input and output of each stage are sets of arrays called feature maps." sec. 3.1, p. 1918), and
wherein the one or more units included in the neural network are each implemented via at least one processor ("Implementations on multicore machines, general-purpose GPUs, digital signal processors, or specialized architectures implemented on FPGAs is straightforward." p. 1927).
Marszalek and Farabet are analogous art because both are directed to semantic models for visual object recognition. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the semantic classifier of Marszalek with the semantic classifier of Farabet.  The modification would have been obvious because one of ordinary skill in the art would be motivated to improve accuracy and speed, as suggested by Farabet ("The system yields record accuracies... while being an order of magnitude faster than competing approaches" Abstract, p. 1915).

Regarding Claim 11,
The Marszalek/Farabet combination teaches the information processing apparatus of claim 10.  Farabet further teaches wherein the output information includes a clustering result of the output values ("We used the gPb hierarchies of Arbelaez et al., which are computed using spectral clustering to produce semantically consistent contours of objects." sec. 5.3, p. 1924). 
Marszalek and Farabet are analogous art because both are directed to semantic models for visual object recognition. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the semantic classifier of Marszalek with the semantic classifier of Farabet.  The modification would have been obvious because one of ordinary skill in the art would be motivated to improve accuracy and speed, as suggested by Farabet ("The system yields record accuracies... while being an order of magnitude faster than competing approaches" Abstract, p. 1915).

Regarding Claim 12,
The Marszalek/Farabet combination teaches the information processing apparatus of claim 10.  Farabet further teaches wherein the one or more units correspond to a plurality of units constituting an intermediate layer ("ConvNets [26], [27] are trainable architectures composed of multiple stages. The input and output of each stage are sets of arrays called feature maps. For example, if the input is a color image, each feature map would be a two-dimensional array containing a color channel of the input image (for an audio input, each feature map would be a one-dimensional array, and for a video or volumetric image, it would be a three-dimensional array). At the output, each feature map represents a particular feature extracted at all locations on the input. Each stage is composed of three layers: a filter bank layer, a nonlinearity layer, and a feature pooling layer. A typical ConvNet is composed of one, two, or three such three-layer stages, followed by a classification module. Because they are trainable, arbitrary input modalities can be modeled beyond natural images." sec. 3.1, p. 1918; "we use a three-stage ConvNet. The first two layers of the network are composed of a bank of filters of size 7 x 7 followed by tanh units and 2 x 2 max-pooling operations. The last layer is a simple filter bank." sec. 5.1, p. 1923).
Marszalek and Farabet are analogous art because both are directed to semantic models for visual object recognition. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the semantic classifier of Marszalek with the semantic classifier of Farabet.  The modification would have been obvious because one of ordinary skill in the art would be motivated to improve accuracy and speed, as suggested by Farabet ("The system yields record accuracies... while being an order of magnitude faster than competing approaches" Abstract, p. 1915).
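For illustration only, one three-layer stage of the kind described in the quoted passages of Farabet (a filter bank layer, a nonlinearity layer, and a feature pooling layer) can be sketched as follows. This is a simplified sketch under stated assumptions and not code from the reference: it handles a single input feature map, uses small hypothetical kernels rather than the 7 x 7 filters quoted above, and all function names are hypothetical. The tanh units and 2 x 2 max pooling follow the quoted description.

```python
import math
from typing import List

Matrix = List[List[float]]


def filter_bank(image: Matrix, kernel: Matrix) -> Matrix:
    """Valid 2-D cross-correlation of one input map with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def nonlinearity(feature_map: Matrix) -> Matrix:
    """Pointwise tanh, as in the quoted 'tanh units'."""
    return [[math.tanh(v) for v in row] for row in feature_map]


def max_pool(feature_map: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping size x size max pooling; incomplete edge windows are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def stage(image: Matrix, kernels: List[Matrix]) -> List[Matrix]:
    """Apply filter bank, nonlinearity, and pooling; one output map per kernel."""
    return [max_pool(nonlinearity(filter_bank(image, k))) for k in kernels]


# Hypothetical 5 x 5 input and two 2 x 2 kernels.
image = [[float(r * 5 + c) / 25.0 for c in range(5)] for r in range(5)]
kernels = [[[1.0, 0.0], [0.0, -1.0]], [[0.5, 0.5], [0.5, 0.5]]]
maps = stage(image, kernels)
print(len(maps), len(maps[0]), len(maps[0][0]))  # 2 2 2
```

Each entry of an output map corresponds to the output value of one unit of the stage, which is the sense in which the feature maps are relied upon above.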

Regarding Claim 13,
The Marszalek/Farabet combination teaches the information processing apparatus of claim 10.  Farabet further teaches wherein the one or more units correspond to one unit of an intermediate layer ("ConvNets [26], [27] are trainable architectures composed of multiple stages. The input and output of each stage are sets of arrays called feature maps. For example, if the input is a color image, each feature map would be a two-dimensional array containing a color channel of the input image (for an audio input, each feature map would be a one-dimensional array, and for a video or volumetric image, it would be a three-dimensional array). At the output, each feature map represents a particular feature extracted at all locations on the input. Each stage is composed of three layers: a filter bank layer, a nonlinearity layer, and a feature pooling layer. A typical ConvNet is composed of one, two, or three such three-layer stages, followed by a classification module. Because they are trainable, arbitrary input modalities can be modeled beyond natural images." sec. 3.1, p. 1918; "we use a three-stage ConvNet. The first two layers of the network are composed of a bank of filters of size 7 x 7 followed by tanh units and 2 x 2 max-pooling operations. The last layer is a simple filter bank." sec. 5.1, p. 1923).
Marszalek and Farabet are analogous art because both are directed to semantic models for visual object recognition. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the semantic classifier of Marszalek with the semantic classifier of Farabet.  The modification would have been obvious because one of ordinary skill in the art would be motivated to improve accuracy and speed, as suggested by Farabet ("The system yields record accuracies... while being an order of magnitude faster than competing approaches" Abstract, p. 1915).

Regarding Claim 14,
The Marszalek/Farabet combination teaches the information processing apparatus of claim 5.  Farabet further teaches wherein the output information includes a co-occurrence histogram of the label ("A classifier is then applied to all the aggregated feature grids to produce a histogram of categories, the entropy of which measures the “impurity” of the segment. Each pixel is then labeled by the minimally impure node above it, which is the segment that best “explains” the pixel." Figure 4, p. 1920).
Marszalek and Farabet are analogous art because both are directed to semantic models for visual object recognition. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the semantic classifier of Marszalek with the semantic classifier of Farabet.  The modification would have been obvious because one of ordinary skill in the art would be motivated to improve accuracy and speed, as suggested by Farabet ("The system yields record accuracies... while being an order of magnitude faster than competing approaches" Abstract, p. 1915).
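For illustration only, the histogram of categories and its entropy-based "impurity" described in the quoted passage of Farabet can be sketched as follows. This is a minimal sketch under stated assumptions and not code from the reference: the histogram is built here by counting hypothetical per-pixel category predictions within each segment, whereas the reference obtains it from a classifier applied to aggregated features. All names and data are hypothetical, and each segment is assumed to be non-empty.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def category_histogram(labels: Sequence[str]) -> Dict[str, float]:
    """Normalized histogram of categories; assumes at least one label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def impurity(histogram: Dict[str, float]) -> float:
    """Shannon entropy (in nats) of a normalized histogram."""
    return -sum(p * math.log(p) for p in histogram.values() if p > 0)


def purest_segment(segments: Dict[str, List[str]]) -> str:
    """Name of the segment whose category histogram has minimal entropy."""
    return min(
        segments,
        key=lambda name: impurity(category_histogram(segments[name])),
    )


# Hypothetical candidate segments covering the same pixel.
segments = {
    "segment_a": ["sky", "sky", "tree", "road"],
    "segment_b": ["sky", "sky", "sky", "tree"],
}
print(purest_segment(segments))  # segment_b
```

A segment dominated by one category has low entropy, so under these assumptions it would be selected as the segment that best "explains" the pixel.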

Regarding Claim(s) 15,
Claim(s) 15 recite(s) a method executed by a processor (Farabet: "processors" p. 1927) corresponding to the steps recited in claim(s) 1, respectively.  The Marszalek/Farabet combination teaches the limitations of claim(s) 15 as set forth above in connection with claim(s) 1.  Therefore, claim(s) 15 is/are rejected under the same rationale as respective claim(s) 1.

Claim(s) 6-9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Marszalek et al. (Semantic Hierarchies for Visual Object Recognition, hereinafter "Marszalek") in view of Farabet et al. (Learning Hierarchical Features for Scene Labeling, hereinafter "Farabet") and further in view of Mottaghi et al. (The Role of Context for Object Detection and Semantic Segmentation in the Wild, hereinafter "Mottaghi").

Regarding Claim 6,
The Marszalek/Farabet combination teaches the information processing apparatus according to claim 5.  The Marszalek/Farabet combination does not explicitly teach wherein the output information includes information that proposes an input of the semantic network that is new.
Mottaghi teaches wherein the output information includes information that proposes an input of the semantic network that is new ("Our dataset contains pixel-wise labels for the 10,103 trainval images of the PASCAL VOC 2010 detection challenge (Fig. 1 shows example labels). There are 540 categories in the dataset, divided into three types: (i) objects, (ii) stuff and (iii) hybrids. Objects are classes that are defined by shape." sec. 3, p. 2; "We provided the annotators with an initial set of 80 carefully chosen labels and asked them to include more classes if a region did not fit into any of these classes." sec. 3, p. 2; adding more classes teaches information that proposes an input of the semantic network that is new).
Marszalek and Mottaghi are analogous art because they are both directed to image classification. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the image classifier of the Marszalek/Farabet combination with the feedback of Mottaghi.  The modification would have been obvious because one of ordinary skill in the art would be motivated to improve object detection, as suggested by Mottaghi ("We show that this contextual reasoning significantly helps in detecting objects at all scales" Abstract, p. 355).

Regarding Claim 7,
The Marszalek/Farabet/Mottaghi combination teaches the information processing apparatus according to claim 6.  Mottaghi further teaches wherein the output information includes information that proposes the semantic network that is new ("Our dataset contains pixel-wise labels for the 10,103 trainval images of the PASCAL VOC 2010 detection challenge (Fig. 1 shows example labels). There are 540 categories in the dataset, divided into three types: (i) objects, (ii) stuff and (iii) hybrids. Objects are classes that are defined by shape." sec. 3, p. 2; "We provided the annotators with an initial set of 80 carefully chosen labels and asked them to include more classes if a region did not fit into any of these classes." sec. 3, p. 2; adding more classes indicates that the semantic network is new or incomplete).
Marszalek and Mottaghi are analogous art because they are both directed to image classification. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the image classifier of the Marszalek/Farabet combination with the feedback of Mottaghi.  The modification would have been obvious because one of ordinary skill in the art would be motivated to improve object detection, as suggested by Mottaghi ("We show that this contextual reasoning significantly helps in detecting objects at all scales" Abstract, p. 355).

Regarding Claim 8,
The Marszalek/Farabet/Mottaghi combination teaches the information processing apparatus according to claim 7.  Mottaghi further teaches wherein the output information includes information indicating the semantic network inferred from another label associated with other data ("Our dataset contains pixel-wise labels for the 10,103 trainval images of the PASCAL VOC 2010 detection challenge (Fig. 1 shows example labels). There are 540 categories in the dataset, divided into three types: (i) objects, (ii) stuff and (iii) hybrids. Objects are classes that are defined by shape." sec. 3, p. 2; "We provided the annotators with an initial set of 80 carefully chosen labels and asked them to include more classes if a region did not fit into any of these classes." sec. 3, p. 2; adding more classes indicates that the semantic network is new or incomplete).
Marszalek and Mottaghi are analogous art because they are both directed to image classification. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the image classifier of the Marszalek/Farabet combination with the feedback of Mottaghi.  The modification would have been obvious because one of ordinary skill in the art would be motivated to improve object detection, as suggested by Mottaghi ("We show that this contextual reasoning significantly helps in detecting objects at all scales" Abstract, p. 355).

Regarding Claim 9,
The Marszalek/Farabet combination teaches the information processing apparatus according to claim 5.  The Marszalek/Farabet combination does not explicitly teach wherein the output information includes information that proposes association of the label that is new with the data.
Mottaghi teaches wherein the output information includes information that proposes association of the label that is new with the data ("Our dataset contains pixel-wise labels for the 10,103 trainval images of the PASCAL VOC 2010 detection challenge (Fig. 1 shows example labels). There are 540 categories in the dataset, divided into three types: (i) objects, (ii) stuff and (iii) hybrids. Objects are classes that are defined by shape." sec. 3, p. 2; "We provided the annotators with an initial set of 80 carefully chosen labels and asked them to include more classes if a region did not fit into any of these classes." sec. 3, p. 2; adding more classes outside the 80 chosen labels teaches information that proposes associations of new labels).
Marszalek and Mottaghi are analogous art because they are both directed to image classification. It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the image classifier of the Marszalek/Farabet combination with the feedback of Mottaghi.  The modification would have been obvious because one of ordinary skill in the art would be motivated to improve object detection, as suggested by Mottaghi ("We show that this contextual reasoning significantly helps in detecting objects at all scales" Abstract, p. 355).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
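For illustration only, the date arithmetic described in the preceding paragraph can be sketched as follows. This is a minimal sketch and not an official calculation: the function names and the example dates are hypothetical, month arithmetic is assumed to clip to the last day of a shorter month, and weekend or holiday adjustments and extension fees are not modeled.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(start: date, months: int) -> date:
    """Same day-of-month N months later, clipped to the end of a shorter month."""
    index = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(index, 12)
    last_day = calendar.monthrange(year, month_index + 1)[1]
    return date(year, month_index + 1, min(start.day, last_day))


def shortened_period_end(
    mailing: date,
    first_reply: Optional[date] = None,
    advisory: Optional[date] = None,
) -> date:
    """End of the shortened statutory period under the paragraph above."""
    three_months = add_months(mailing, 3)
    six_months = add_months(mailing, 6)
    replied_within_two = (
        first_reply is not None and first_reply <= add_months(mailing, 2)
    )
    if replied_within_two and advisory is not None and advisory > three_months:
        # Period runs to the advisory action date, never past six months.
        return min(advisory, six_months)
    return three_months


# Hypothetical mailing date; the actual mailing date is not stated here.
mailing = date(2022, 10, 3)
print(shortened_period_end(mailing))  # 2023-01-03
print(shortened_period_end(mailing, date(2022, 11, 15), date(2023, 1, 20)))  # 2023-01-20
```

The second call models a first reply filed within two months followed by an advisory action mailed after the three-month date, in which case the period is assumed to end on the advisory action's mailing date.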
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHARLES C KUO, whose telephone number is (571)270-7477. The examiner can normally be reached M-F, 9:00 a.m. - 6:00 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo, can be reached at (571) 272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/CHARLES C KUO/Examiner, Art Unit 2126  
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126