DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Applicant’s response, filed 23 December 2021, to the last office action has been entered and made of record. 
The cancellation of claims 6-7, 16-17, and 26-27 is acknowledged and made of record.
The amendments to the claims are acknowledged; they are supported by the original disclosure, and no new matter is added.
The amendments to the claims that address the rejections under 35 U.S.C. § 112(a) or pre-AIA 35 U.S.C. § 112, first paragraph, of the previous Office action have overcome the respective rejections, and the rejections have been withdrawn.

Response to Arguments
Applicant’s arguments, see pp. 9-11 of Applicant’s reply, filed 23 December 2021, with respect to amended independent claims 1, 11, and 21 and corresponding dependent claims 3-4, 9-10, 13-14, 19-20, 23-24, and 29-30, which incorporate the amended subject matter, have been fully considered and are persuasive. The rejections of 24 September 2021 have been withdrawn.

Allowable Subject Matter
Claims 1, 3-4, 9-11, 13-14, 19-21, 23-24, and 29-30 are allowed.
The following is an examiner’s statement of reasons for allowance: 
Regarding the subject matter of the amended independent claims 1, 11, and 21, the prior art of record, alone or in combination, fails to fairly teach or suggest, combined with the other recited claimed subject matter, the following limitations:
“operating at least two parallel, independent processing pipelines on a whole image to generate independent results, wherein the at least two parallel, independent processing pipelines includes both an entity processing pipeline and a whole image processing pipeline, wherein the entity processing pipeline operates on the whole image and uses a convolutional neural network (CNN) which scans the whole image to identify a number and type of entities in the whole image, resulting in an entity feature space, and wherein the whole image processing pipeline uses a CNN to extract visual features from the whole image, resulting in a visual feature space;
fusing the independent results of the entity and whole image processing pipelines to generate a fused scene class, such that in fusing the independent results to generate the fused scene class, two classifiers are trained separately for each of the visual feature space and entity features space to generate independent class probability distributions over scene types, with the independent class probability distributions being multiplied and renormalized to generate the fused scene class”.

The previously cited Lin, Wang, He, and Shen references were cited as suggesting a system for assessing images using a deep convolutional neural network (DCNN) that uses two independent columns, taking a global view and a fine-grained view as inputs, to classify a scene feature for a corresponding image (see Lin Fig. 3, [0025], [0033]-[0034], and [0053]). Lin taught that at least one layer of the first and second columns is merged into a fully connected layer, and that the fully connected layer is jointly trained to classify at least one image feature, where the feature may be a scene of the image (see Lin [0029], [0046], and [0053]). Lin further taught that a probability of each input being assigned to a class is determined for a particular feature, and that the results are averaged to determine the highest class to be selected (see Lin [0051]).
However, the combined teachings of Lin, Wang, He, and Shen do not fairly teach, alone or in combination, that the independent results of the two columns of the DCNN are fused by separately training two classifiers, one for each of an equivalent to the visual feature space and the entity features space, to generate independent class probability distributions over scene types, with the independent class probability distributions being multiplied and renormalized to generate a fused scene class.
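For illustration only, the distinction drawn above between averaging class probabilities and multiplying and renormalizing them can be sketched as follows. This is a minimal sketch, not the applicant's or Lin's implementation: the function names, the three scene types, and the probability values are all hypothetical, and each input list is assumed to be a class probability distribution produced by a separately trained classifier.

```python
def fuse_by_product(p_visual, p_entity):
    """Multiply two class distributions element-wise, then renormalize to sum to 1."""
    products = [a * b for a, b in zip(p_visual, p_entity)]
    total = sum(products)
    if total == 0:
        raise ValueError("the two distributions share no class with nonzero probability")
    return [p / total for p in products]


def fuse_by_average(p_visual, p_entity):
    """Average two class distributions element-wise (shown only for contrast)."""
    return [(a + b) / 2 for a, b in zip(p_visual, p_entity)]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical distributions over three scene types (made-up numbers).
p_visual = [0.90, 0.10, 0.00]  # from a classifier on the visual feature space
p_entity = [0.01, 0.50, 0.49]  # from a classifier on the entity feature space

product = fuse_by_product(p_visual, p_entity)
average = fuse_by_average(p_visual, p_entity)

print(argmax(product))  # 1
print(argmax(average))  # 0
```

With these made-up inputs the two rules select different classes: the product penalizes a class that either classifier considers unlikely, whereas the average does not.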

Further search and consideration of the prior art failed to yield a fair teaching, alone or in combination, of the noted combination of claimed subject matter.
Li et al. (CN 102622607, cited in the IDS dated 11 January 2022) is pertinent in teaching a remote sensing image classification method based on multi-feature fusion, where support vector machine classifiers are trained upon extracted visual word bag features, color histogram features, and textural features of a training set of remote sensing images, and are used to classify visual word bag features, color histogram features, and textural features of a test image, and the resulting classification probabilities are combined with a weighted synthesis method to obtain the final remote sensing image classification result (see Li Abstract and [0079]-[0083]). While Li suggests the fusing of resulting classification probabilities using a weighted synthesis method, Li fails to teach that the features used for classification are from an equivalent to the claimed at least two parallel, independent processing pipelines on a whole image, which use corresponding convolutional neural networks to scan the whole image to identify a number and type of entities in the whole image, resulting in an entity feature space, and to extract visual features from the whole image, resulting in a visual feature space. Li likewise fails to teach that two classifiers are trained separately for each of the visual feature space and entity features space to generate independent class probability distributions over scene types, with the independent class probability distributions being multiplied and renormalized to generate the fused scene class.
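For illustration only, a weighted combination of per-feature classification probabilities might be sketched as follows. The exact weighting formula of Li is not reproduced in this action, so the sketch assumes a normalized weighted sum; the function name, the weights, and the probability values are hypothetical.

```python
def weighted_synthesis(distributions, weights):
    """Combine class distributions as a weighted sum, normalized by the total weight."""
    if len(distributions) != len(weights):
        raise ValueError("one weight is required per distribution")
    total_weight = sum(weights)
    num_classes = len(distributions[0])
    return [
        sum(w * dist[i] for w, dist in zip(weights, distributions)) / total_weight
        for i in range(num_classes)
    ]


# Hypothetical two-class probabilities from three per-feature classifiers.
word_bag = [0.8, 0.2]
color_histogram = [0.4, 0.6]
texture = [0.5, 0.5]

fused = weighted_synthesis([word_bag, color_histogram, texture], [0.5, 0.3, 0.2])
print([round(p, 2) for p in fused])  # [0.62, 0.38]
```

Under this assumed formula the fused result is additive in the input probabilities, which differs from an element-wise product followed by renormalization.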
Dixit et al. (“Scene Classification with Semantic Fisher Vectors”) is pertinent in teaching a scene classification method which uses a convolutional neural network to extract a bag of semantics features from an image, which are used to form a semantic Fisher vector representation for representing the scene of the image (see Dixit Fig. 3 and sect. 4. Semantic FV embedding), and further suggests combining the results with the results of a Places CNN trained on a large dataset of scene images as a simple concatenation for scene classification (see Dixit sect. 6.4 The Places CNN and sect. 7.4. Comparisons to the state of the art). As Dixit discloses that the results are combined as a simple concatenation, Dixit fails to fairly teach, alone or in combination, that the results from the Places CNN and the semantic Fisher vector based scene classification are multiplied and renormalized.
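For illustration only, a simple concatenation can be sketched as joining two vectors end to end. The sketch assumes the concatenation operates on feature vectors that a single downstream classifier would then consume; the function name, vector lengths, and values are hypothetical and are not taken from Dixit.

```python
def concatenate_features(semantic_fv, places_features):
    """Join two feature vectors end to end into one joint descriptor."""
    return list(semantic_fv) + list(places_features)


# Hypothetical low-dimensional stand-ins for the two representations.
semantic_fv = [0.2, 0.7, 0.1]
places_features = [0.9, 0.4]

joint = concatenate_features(semantic_fv, places_features)
print(len(joint))  # 5
```

Under this assumption the two inputs keep their original values side by side in one longer vector, and no per-pipeline class probability distributions are multiplied together.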
Xiong et al. (“Recognize Complex Events from Static Images by Fusing Deep Channels”) is pertinent in teaching an event recognition framework which uses an upper channel deep convolutional neural network to capture the visual appearance of an image and a lower channel deep convolutional network to detect and capture interactions among humans and objects to output a fused representation (see Xiong Fig. 3, sect. 4.1 Model Visual Appearance with CNN, sect. 4.4 Detect Characterize Objects, and sect. 4.5 Channel Fusion). While Xiong suggests that the visual appearance channel and the detection channel are integrated to derive a fused representation, Xiong fails to further teach, alone or in combination, that the fused representation is generated by multiplying and renormalizing the results of the two channels.

Claims 3-4, 9-10, 13-14, 19-20, 23-24, and 29-30 depend from independent claims 1, 11, and 21, incorporate the allowable subject matter of the respective independent claims from which they depend, and are therefore allowed.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TIMOTHY WING HO CHOI whose telephone number is (571)270-3814. The examiner can normally be reached 9:00 AM to 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, VINCENT RUDOLPH, can be reached on (571) 272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/TIMOTHY CHOI/Examiner, Art Unit 2661

/VINCENT RUDOLPH/Supervisory Patent Examiner, Art Unit 2661