Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 12/2/21 have been fully considered but they are not persuasive. 
In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references.  See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).
Applicant argues that none of the cited references, alone or in any combination, teaches, discloses, nor suggests a training or detection method for human recognition in a top-view image and then activating one or more lights of a lighting system based on a positive detection of the human in the top-view image, as now recited in amended claims 1 and 9.
In response to applicant's argument that Kwatra, Yuh, Savvides, and McClure are not concerned with human recognition in a top-view image, the fact that applicant has recognized another advantage which would flow naturally from following the suggestion of the prior art cannot be the basis for patentability when the differences would otherwise be obvious.  See Ex parte Obiaya, 227 USPQ 58, 60 (Bd. Pat. App. & Inter. 1985).
“The test for obviousness is not whether the features of a secondary reference may be bodily incorporated into the structure of the primary reference.... Rather, the test is what the combined teachings of those references would have suggested to those of ordinary skill in the art.” In re Keller, 642 F.2d 413, 425, 208 USPQ 871, 881 (CCPA 1981). See also In re Sneed, 710 
Applicant argues Kwatra particularly lacks the disclosure pertaining to top-view images, and further lacks the disclosure pertaining to identifying a person in such an image.
The Examiner respectfully submits that An-Ti teaches a training method for object recognition (e.g., see the training method illustrated in at least figure 1), the training method comprising: providing at least one top-view training image (e.g., see the top view outlined in at least figure 2b); aligning at least one training object present in the training image along a pre-set direction (this is met in at least I. Introduction, 3rd paragraph, "we rotate each radially oriented bounding-box containing a human body or non-human objects"); labelling at least one training object from the at least one training image using a pre-defined labelling scheme (this limitation is met by at least section 2.5 - Collection of Training Samples).
Kwatra teaches generating (i.e., identifying) a bounding contour that comprises the positive training object based on the labelling of the at least one positive training object (e.g., see at least “the computing device determines an initial set of bounding boxes for the image based on the plurality of segments.”-figures 5-9, abstract, col. 1 line 40 - col. 2 line 64).
Applicant maintains that Yuh is not relevant prior art for identifying a person in a top-view image. Yuh discloses in Paragraph [0082] that reliable landmarks such as the attachment of the falx to the skull, the dorsum sella, the orbits or globes, or other skull or facial landmarks that are rarely altered or displaced by the presence of pathological intracranial conditions, can be used to aid in registration. Similarly, paragraph [0085] of Yuh discloses that "spatial information 
In response to Applicant's argument that Yuh is non-analogous art and that the Examiner cannot pick statements out of their proper context (In re Pagliaro, 657 F.2d 1219, 1225; 210 U.S.P.Q. 888, 892 (C.C.P.A. 1981)), it has been held that a prior art reference must either be in the field of applicant’s endeavor or, if not, then be reasonably pertinent to the particular problem with which the applicant was concerned, in order to be relied upon as a basis for rejection of the claimed invention.  See In re Oetiker, 977 F.2d 1443, 24 USPQ2d 1443 (Fed. Cir. 1992).  In this case, Yuh teaches wherein the at least one training object comprises two or more body landmarks (e.g., see defined landmarks in at least 0082 and 0085); selecting the two or more body landmarks (e.g., see at least 0087-0091 – features of interest); determining an aspect ratio based on the two or more body landmarks (i.e., “anchor boxes” of multiple scales); wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour (e.g., Anchor boxes are a set of predefined bounding boxes of a certain height and width. These boxes are defined to capture the scale and aspect ratio of specific object classes you want 
Applicant further submits that Savvides discloses a method for facial landmark localization, which is contrary to Chiang because Savvides focuses on identifying parts of the face. Savvides particularly lacks the teaching or suggestion of identifying a person in a top-view image.
The Examiner submits that a reference may be said to teach away when a person of ordinary skill, upon reading the reference, would be discouraged from following the path set out in the reference, or would be led in a direction divergent from the path that was taken by the applicant. Ricoh Co., Ltd. v. Quanta Computer, Inc., 550 F.3d 1325, 1332 (Fed. Cir. 2008) (quoting Kahn, 441 F.3d at 990). A reference does not teach away if it merely expresses a general preference for an alternative invention from amongst options available to the ordinarily skilled artisan, and the reference does not discredit or discourage investigation into the invention claimed. In re Fulton, 391 F.3d 1195, 1201 (Fed. Cir. 2004).
Applicant further submits that McClure teaches an image processing sensor system, but McClure particularly lacks the teaching or disclosure directed to identifying a person or a top of the head or anything of this nature in a top-view image.
McClure teaches activating and/or deactivating one or more lights of a lighting system based on the result of the object recognition (i.e., 0018 – “system elements includes a digital image capture and processing device that is able to capture digital images from the field of view, programmable circuitry that is configured to process and analyze the captured images locally in cooperation with data stored in or available to the device, and provide an output signal based on 
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include activating and/or deactivating one or more lights of a lighting system based on the result of the object recognition for the purpose of improving the selectivity and specificity of detection objects and events.
Applicant argues that the issue at hand in the present Application is to be able to identify a person from a top-view image, determine that a person is present in the image, and then activate a lighting system when such a person is present in the image(s). However, Applicant respectfully submits that several of the cited documents have nothing to do with identifying a person in a top-view image. Such references generally disclose object recognition, but do nothing to help one skilled in the art to identifying when a person has entered a room based on a top-view image. The Examiner cherry-picks from each particular reference to read on a given limitation in Applicant’s claims without considering the reference as a whole. It is respectfully submitted that the Examiner may not pick and choose from the reference only those portions which support the Examiner’s position. In re Pagliaro, 657 F.2d 1219, 1225; 210 U.S.P.Q. 888, 892 (C.C.P.A. 1981).  Moreover, Applicant further submits that one skilled in the art would not consider a prior art reference directed to brain imagery (Yuh) for purposes of determining whether a person is present in room based on a top-view image. Nor would one skilled in the art combine Yuh with any of the cited references to arrive at Applicant’s invention. “It is impermissible to use the claimed invention as an instruction manual or ‘template’ to piece together the teachings of the prior art so that the claimed invention is rendered obvious.” In re 
In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references.  See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).
Applicant’s arguments with respect to claim(s) have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Assuming arguendo, as argued by the Applicant, that several of the cited documents have nothing to do with identifying a person in a top-view image, and that such references generally disclose object recognition but do nothing to help one skilled in the art identify when a person has entered a room based on a top-view image, the Examiner, based on an updated search to assist the Applicant, respectfully submits that Tiwari WO/2014009920 teaches a novel human occupancy detection system method for use in lighting/HVAC control. The system uses the vision based algorithm to detect human head-top circles in top view videos captured by ceiling mounted camera (e.g., see at least the abstract). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to try identifying when a person has entered a room based on a top-view image for the purpose of determining human occupancy as suggested by at least Tiwari. Furthermore, the cited art is directed towards object detection, which includes human objects. Thus, the Applicant’s argument that references disclosing object recognition do nothing to help one skilled in the art identify when a person has entered a room based on a top-view image is untenable.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-5, 9-12, 15 and 18-19 is/are rejected under 35 U.S.C. 103 as being unpatentable over An-Ti CHIANG et al.; Human Detection in Fish-Eye Images Using HOG-Based Detectors Over Rotated Windows; 2014 IEEE International Conference on Multimedia and Expo (ICME); July 14, 2014; pages 1-6, hereinafter, ‘An-Ti’ in view of Kwatra et al. US Patent No.: 9,483,701, hereinafter, ‘Kwatra’ and further in view of Yuh et al. WO 2017/106645 A1, hereinafter, ‘Yuh’ and further in view of Savvides et al. US Patent No.: 10,121,055 B1, hereinafter, ‘Savvides’ and further in view of McClure et al. US Patent Pub. No.: 2010/0214408, hereinafter, ‘McClure’ and further in view of Tiwari WO/2014009920, hereinafter, ‘Tiwari’.
 	Consider Claims 1 and 20, An-Ti teaches a training method for object recognition (e.g., see the training method illustrated in at least figure 1), the training method comprising: 
providing at least one top-view training image (e.g., see the top view outlined in at least figure 2b); 
aligning at least one training object present in the training image along a pre-set direction (this is met in at least I. Introduction, 3rd paragraph, "we rotate each radially oriented bounding-box containing a human body or non-human objects"); labelling at least one training object from the at least one training image using a pre-defined labelling scheme (this limitation is met by at least section 2.5 - Collection of Training Samples); extracting at least one feature vector for describing the content of the at least one labelled training object and at least one feature vector for describing at least one background scene (this limitation is met by at least figure 1 and 2.5 about positive and negative samples); and training a classifier model based on the extracted feature vectors (e.g., this limitation is met by at least figure 1 and 2.3 HOG+SVM for Each Detection Window).
  	However, An-Ti does not specifically teach generating a bounding contour that comprises the positive training object based on the labelling of the at least one positive training object(i.e., identifying the object and then determining the bounding contour).
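 	Kwatra teaches generating (i.e., identifying) a bounding contour that comprises the positive training object based on the labelling of the at least one positive training object (e.g., see at least “the computing device determines an initial set of bounding boxes for the image based on the plurality of segments.”-figures 5-9, abstract, col. 1 line 40 - col. 2 line 64).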
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include generating  (i.e., identifying) a bounding contour that comprises the positive training object based on the labelling of the at least one positive training object for the purpose of accurately detecting objects as suggested by Kwatra.
 	However, An-Ti does not specifically teach wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour.
 	In analogous art, Yuh teaches wherein the at least one training object comprises two or more body landmarks (e.g., see defined landmarks in at least 0082 and 0085); selecting the two or more body landmarks (e.g., see at least 0087-0091 – features of interest); determining an aspect ratio based on the two or more body landmarks (i.e., “anchor boxes” of multiple scales); wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour (e.g., Anchor boxes are a set of predefined bounding boxes of a certain height and width. These boxes are defined to capture the scale and aspect ratio of specific object classes you want to detect and are typically chosen based on object sizes in your training datasets).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include wherein the at least one training object comprises two or more 
 	However, An-Ti, Kwatra and Yuh do not appear to teach wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio of the at least one training object based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour.
 	In analogous art, Savvides teaches wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio of the at least one training object based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour (i.e., Savvides teaches in at least col. 7 lines 25-30- “During the training stage, we construct landmark, expression, and pose specific local appearance (texture) models for each landmark, including the seed landmarks. A crop of a fixed size is generated around the ground truth landmark locations and resized to a fixed size”. col. 15 line 47 – col. 16  line 14 – “For initializing the algorithms on the LFPW and COFW datasets, we used the bounding box initializations provided along with the datasets that were obtained using a face detector. Since our approach requires a bounding box that matches the aspect ratio of our training crops (a square crop that encloses most of the facial region), we converted the provided bounding boxes into square regions and also expanded the widths and heights of the regions by a factor of 1.5 to enclose the face. We must point out that our method is insensitive to the facial crop (since we train our algorithm without using a face detector to generate the training crops) and that we only carry out this normalization in order to ensure that the crop is a square region that has sufficient padding to allow for local texture crops to be generated and evaluated by our classifiers. Also, our initial search regions for the seed landmark candidates were fixed based on this choice of crop region, but can be changed if the crop region changes”. )
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio of the at least one training object based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour for the purpose of improved facial landmark localization and overcoming prior art methods of processing facial images.
 	However, An-Ti, Kwatra, Yuh and Savvides do not appear to teach activating and/or deactivating one or more lights of a lighting system based on the result of the object recognition.
 	In analogous art, McClure teaches activating and/or deactivating one or more lights of a lighting system based on the result of the object recognition (i.e., 0018 – “system elements includes a digital image capture and processing device that is able to capture digital images from the field of view, programmable circuitry that is configured to process and analyze the captured images locally in cooperation with data stored in or available to the device, and provide an output signal based on the outcome of the analysis. The output signal may, for example, conditionally activate a remote device, such as a lock, a light,…”).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include activating and/or deactivating one or more lights of a lighting 
 	However, assuming arguendo that several of the cited documents have nothing to do with identifying a person in a top-view image, and that such references generally disclose object recognition but do nothing to help one skilled in the art identify when a person has entered a room based on a top-view image, Tiwari teaches a novel human occupancy detection system method for use in lighting/HVAC control. The system uses the vision based algorithm to detect human head-top circles in top view videos captured by ceiling mounted camera (e.g., see at least the abstract).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to try identifying when a person has entered a room based on a top-view image for the purpose of determining human occupancy as suggested by at least Tiwari. 
 	Consider Claim 2, An-Ti teaches the training method according to claim 1 comprising a distortion correction step after providing the at least one top-view training image and before labelling the at least one training object from the at least one training image (e.g., see distortion correction and unwrapping as known from the prior art but having known disadvantages, see section 1, Introduction, end of first paragraph – the prior art “first Warp the fish-eye view to normal view”, i.e., the distortion correction would occur before subsequent steps).
 	Consider Claim 3, An-Ti teaches wherein aligning the training object present in the training image along the pre-set direction comprises unwrapping the training image (e.g., see distortion correction and unwrapping as known from the prior art but having known disadvantages, see section 1, Introduction, end of first paragraph – the prior art “first Warp the fish-eye view to normal view”, i.e., the distortion correction would occur before subsequent steps).
 	Consider Claim 4, An-Ti teaches wherein aligning the training object present in the training image along the pre-set direction comprises rotating the at least one training object (e.g., see at least section 2.2).
 	Consider Claim 5, An-Ti teaches wherein the labelled training object is resized to a standard window size (e.g., see at least section 1 "resampling all resulting images to the same size").
 	Consider Claim 9, An-Ti teaches a detection method for object recognition, the detection method comprising: providing at least one top-view test image (e.g., see the top view outlined in at least figure 2b); applying a test window on the at least one test image (e.g., this limitation is met by at least 2.3 HOG+SVM for Each Detection Window); extracting at least one feature vector for describing the content of the test window (e.g., this limitation is met by at least figure 1 and 2.3 HOG+SVM for Each Detection Window); applying the classifier model trained by a training method for object recognition on the at least one feature vector (e.g., this limitation is met by at least figure 1 and 2.3 HOG+SVM for Each Detection Window) comprising: providing the at least one top-view training image (e.g., see the top view outlined in at least figure 2b); aligning at least one training object present in the training image along a pre-set direction (this is met in at least I. Introduction, 3rd paragraph, "we rotate each radially oriented bounding-box containing a human body or non-human objects"); labelling the at least one training object from the at least one training image using a pre-defined labelling scheme (this limitation is met by at least section 2.5 - Collection of Training Samples); extracting the at least one feature vector for describing the content of the at least one labelled training object and the at least one feature vector for describing the at least one background scene (this limitation is met by at least figure 1 and 2.5 about positive and negative samples); and training the classifier model based on the extracted feature vectors (e.g., this limitation is met by at least figure 1 and 2.3 HOG+SVM for Each Detection Window).
 	However, An-Ti, Kwatra and Yuh do not appear to teach wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio of the at least one training object based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour.
 	In analogous art, Savvides teaches wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio of the at least one training object based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour (i.e., Savvides teaches in at least col. 7 lines 25-30- “During the training stage, we construct landmark, expression, and pose specific local appearance (texture) models for each landmark, including the seed landmarks. A crop of a fixed size is generated around the ground truth landmark locations and resized to a fixed size”. col. 15 line 47 – col. 16  line 14 – “For initializing the algorithms on the LFPW and COFW datasets, we used the bounding box initializations provided along with the datasets that were obtained using a face detector. Since our approach requires a bounding box that matches the aspect ratio of our training crops (a square crop that encloses most of the facial region), we converted the provided bounding boxes into square regions and also expanded the widths and heights of the regions by a factor of 1.5 to enclose the face. We must point out that our method is insensitive to the facial crop (since we train our algorithm without using a face detector to generate the training crops) and that we only carry out this normalization in order to ensure that the crop is a square region that has sufficient padding to allow for local texture crops to be generated and evaluated by our classifiers. Also, our initial search regions for the seed landmark candidates were fixed based on this choice of crop region, but can be changed if the crop region changes”. )
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio of the at least one training object based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour for the purpose of improved facial landmark localization and overcoming prior art methods of processing facial images.
 	However, An-Ti, Kwatra, Yuh and Savvides do not appear to teach activating and/or deactivating one or more lights of a lighting system based on the result of the object recognition.
 	In analogous art, McClure teaches activating and/or deactivating one or more lights of a lighting system based on the result of the object recognition (i.e., 0018 – “system elements includes a digital image capture and processing device that is able to capture digital images from the field of view, programmable circuitry that is configured to process and analyze the captured images locally in cooperation with data stored in or available to the device, and provide an output signal based on the outcome of the analysis. The output signal may, for example, conditionally activate a remote device, such as a lock, a light,…”).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include activating and/or deactivating one or more lights of a lighting 
 	However, assuming arguendo that several of the cited documents have nothing to do with identifying a person in a top-view image, and that such references generally disclose object recognition but do nothing to help one skilled in the art identify when a person has entered a room based on a top-view image, Tiwari teaches a novel human occupancy detection system method for use in lighting/HVAC control. The system uses the vision based algorithm to detect human head-top circles in top view videos captured by ceiling mounted camera (e.g., see at least the abstract).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to try identifying when a person has entered a room based on a top-view image for the purpose of determining human occupancy as suggested by at least Tiwari. 
 	Consider Claim 10, An-Ti teaches wherein applying the test window on the at least one test image; extracting the at least one feature vector for describing the content of the test window; and applying the classifier model trained by the training method for object recognition are repeated for different orientation angles of the test image provided in providing the at least one top-view test image (e.g., this is met by at least section 2.2.).
 	Consider Claim 11, An-Ti teaches wherein ROI samples resulting from applying a test window on the at least one test image are varied by resizing to different pre-selected sizes prior to extracting at least one feature vector for describing the content of the test window (e.g., this is met by multiple detection window size before feature extraction, see section 2.2, 2nd paragraph. Extrapolating feature vectors to resize the ROI samples is a well-known alternative).
 	Consider Claim 12, An-Ti teaches wherein ROI samples resulting from applying the test window on the at least one test image are varied by resizing to different pre-selected sizes, feature vectors are extracted by extracting the at least one feature vector for describing the content of the test window from the varied ROI samples, and further feature vectors are calculated by extrapolation from these extracted feature vectors (e.g., this is met by multiple detection window size before feature extraction, see section 2.2, 2nd paragraph. Extrapolating feature vectors to resize the ROI samples is a well-known alternative).
  	Consider Claim 15, An-Ti teaches surveillance system comprising at least one vision-based camera sensor, wherein the surveillance system is adapted to perform the detection method according to claim 9(e.g., see at least the abstract).
 	Consider Claim 18, An-Ti, Kwatra and Yuh teach the claimed invention except wherein the aspect ratio is determined in real-time.
 	In analogous art, Savvides teaches wherein the aspect ratio is determined in real-time (e.g.,  col. 6 lines 27-34 – “the term “image” as used herein shall mean a physical image, a physical data file, stored on a storage media or stored on online storage, containing one or more images in any format, an image in any format obtained directly from an image sensor, such as a camera, in real time or otherwise, a scanned image, a single image or a video containing a series of image frames” ).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include wherein the aspect ratio is determined in real-time for the purpose of improved facial landmark localization and overcoming prior art methods of processing facial images.
 	Consider Claim 19, An-Ti, Kwatra and Yuh teach the claimed invention except wherein the aspect ratio corresponds to a real body’s aspect ratio of the at least one training object.
 	In analogous art, Savvides teaches wherein the aspect ratio corresponds to a real body’s aspect ratio of the at least one training object (i.e., the face – see at least the abstract).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include wherein the aspect ratio corresponds to a real body’s aspect ratio of the at least one training object for the purpose of improved facial landmark localization and overcoming prior art methods of processing facial images.
Claims 6-8 and 13 is/are rejected under 35 U.S.C. 103 as being unpatentable over An-Ti CHIANG et al.; Human Detection in Fish-Eye Images Using HOG-Based Detectors Over Rotated Windows; 2014 IEEE International Conference on Multimedia and Expo (ICME); July 14, 2014; pages 1-6, hereinafter, ‘An-Ti’ in view of Kwatra et al. US Patent No.: 9,483,701, hereinafter, ‘Kwatra’ and further in view of Yuh et al. WO 2017/106645 A1, hereinafter, ‘Yuh’ and further in view of Savvides et al. US Patent No.: 10,121,055 B1, hereinafter, ‘Savvides’ and further in view of ARTHUR D. COSTEA et al.; Obstacle Localization and Recognition for Autonomous Forklifts Using Omnidirectional Stereovision; 2015 IEEE Intelligent Vehicles Symposium (IV); June 28, 2015; pages 531-536, hereinafter, ‘Arthur’.
  	Consider Claims 6-7 and 13, An-Ti teaches the claimed invention except wherein extracting the at least one feature vector for describing the content of the at least one labelled training object and the at least one feature vector for describing the at least one background scene comprises extracting the at least one feature vector according to an Aggregated Channel Feature (ACF) scheme, and (claim 7) wherein the ACF scheme is a Grid ACF scheme.
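 	In analogous art, Arthur teaches wherein extracting the at least one feature vector for describing the content of the at least one labelled training object and the at least one feature vector for describing the at least one background scene comprises extracting the at least one feature vector according to an Aggregated Channel Feature (ACF) scheme, and (claim 7) wherein the ACF scheme is a Grid ACF scheme (e.g., see at least section 3B, wherein features are extracted according to an ACF scheme on a grid basis: "8 aggregated channels are obtained by computing an average for 4x4 pixel cells").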
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include wherein extracting the at least one feature vector for describing the content of the at least one labelled training object and the at least one feature vector for describing the at least one background scene comprises extracting the at least one feature vector according to an Aggregated Channel Feature (ACF) scheme, and (claim 7) wherein the ACF scheme is a Grid ACF scheme, for the purpose of improving image recognition.
 	 Consider Claim 8, An-Ti teaches the claimed invention except wherein the classifier model is a decision tree model.
 	In analogous art, Arthur teaches wherein the classifier model is a decision tree model (see at least the decision tree model in section 3B; a Random Forest model is a well-known equivalent classifier model).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include wherein the classifier model is a decision tree model, for the purpose of improving image recognition.
Claims 16-17 is/are rejected under 35 U.S.C. 103 as being unpatentable over An-Ti CHIANG et al.; Human Detection in Fish-Eye Images Using HOG-Based Detectors Over Rotated Windows; 2014 IEEE International Conference on Multimedia and Expo (ICME); July 14, 2014; pages 1-6, hereinafter, ‘An-Ti’, in view of Kwatra et al. US Patent No.: 9,483,701, hereinafter, ‘Kwatra’, and further in view of Yuh et al. WO 2017/106645 A1, hereinafter, ‘Yuh’, in view of Savvides et al. US Patent No.: 10,121,055 B1, hereinafter, ‘Savvides’, and further in view of McClure et al. US Patent Pub. No.: 2010/0214408, hereinafter, ‘McClure’, and further in view of Tiwari WO/2014009920, further in view of Kajiya et al. US Patent No.: 5,999,189, hereinafter, ‘Kajiya’.
 	Consider Claims 16 and 17,  An-Ti in view of Kwatra teaches the claimed invention except wherein the bounding contour is rotated along the pre-set direction (e.g., a vertical direction as outlined in claim).
 	In analogous art, Kajiya teaches wherein the bounding contour is rotated along the pre-set direction (e.g., see at least figures 15 and 16 where the bounding box is rotated to enclose the object ).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include rotating a bounding contour for the purpose of capturing a particular image as suggested by Kajiya. The Examiner further submits that the vertical direction is one of a finite number of identified, predictable directions that one of ordinary skill in the art would have found obvious to try in order to capture a particular image based on the size and shape of the object of interest.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
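A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.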

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-5, 9-12 and 14-15 is/are rejected under 35 U.S.C. 103 as being unpatentable over An-Ti CHIANG et al.; Human Detection in Fish-Eye Images Using HOG-Based Detectors Over Rotated Windows; 2014 IEEE International Conference on Multimedia and Expo (ICME); July 14, 2014; pages 1-6, hereinafter, ‘An-Ti’, in view of Kwatra et al. US Patent No.: 9,483,701, hereinafter, ‘Kwatra’, and further in view of Yuh et al. WO 2017/106645 A1, hereinafter, ‘Yuh’, and further in view of Rodrigues- Serrano et al. US Patent Pub. No.: 2014/02700350, hereinafter, ‘Rodrigues- Serrano’, and further in view of McClure et al. US Patent Pub. No.: 2010/0214408, hereinafter, ‘McClure’, and further in view of Tiwari WO/2014009920.
 	Consider Claim 1, An-Ti teaches a training method for object recognition (e.g., see the training method illustrated in at least figure 1), the training method comprising:
providing at least one top-view training image (e.g., see the top view outlined in at least figure 2b);
aligning at least one training object present in the training image along a pre-set direction (this is met in at least I. Introduction, 3rd paragraph, "we rotate each radially oriented bounding-box containing a human body or non-human objects");
labelling at least one training object from the at least one training image using a pre-defined labelling scheme (this limitation is met by at least section 2.5 - Collection of Training Samples);
extracting at least one feature vector for describing the content of the at least one labelled training object and at least one feature vector for describing at least one background scene (this limitation is met by at least figure 1 and section 2.5, regarding positive and negative samples); and
training a classifier model based on the extracted feature vectors (e.g., this limitation is met by at least figure 1 and section 2.3, HOG+SVM for Each Detection Window).
  	However, An-Ti does not specifically teach generating a bounding contour that comprises the positive training object based on the labelling of the at least one positive training object(i.e., identifying the object and then determining the bounding contour).
 	In analogous art, Kwatra teaches generating (i.e., identifying) a bounding contour that comprises the positive training object based on the labelling of the at least one positive training object (e.g., see at least “the computing device determines an initial set of bounding boxes for the image based on the plurality of segments.”, figures 5-9, abstract, col. 1 line 40 - col. 2 line 64).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include generating (i.e., identifying) a bounding contour that comprises the positive training object based on the labelling of the at least one positive training object for the purpose of accurately detecting objects as suggested by Kwatra.
 	However, An-Ti does not specifically teach wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour.
 	In analogous art, Yuh teaches wherein the at least one training object comprises two or more body landmarks (e.g., see defined landmarks in at least, 0082 and  0085 ); selecting the two or more body landmarks (e.g., see at least 0087 -0091 –features of interest); determining an aspect ratio based on the two or more body landmarks (i.e., “anchor boxes” of multiple scales); wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour (e.g., Anchor boxes are a set of predefined bounding boxes of a certain height and width. These boxes are defined to capture the scale and aspect ratio of specific object classes you want to detect and are typically chosen based on object sizes in your training datasets ).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour for the purpose of improving object detection as taught by Yuh.
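 	However, An-Ti, Kwatra and Yuh do not appear to teach wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio of the at least one training object based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour.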
 	In analogous art, Rodrigues- Serrano teaches wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio of the at least one training object based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour (i.e., Rodrigues- Serrano teaches in at least [0075] - “the images in the training set may have been manually annotated with localization information 60, such as coordinates of a bounding box enclosing the object of interest… In some embodiments, the aspect ratio of the rectangle may be fixed. In other embodiments, the aspect ratio of the bounding box may vary between training images.”. )
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio of the at least one training object based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour for the purpose of improved object localization.
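 	For illustration only, the recited relationship between body landmarks, aspect ratio and bounding contour may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Yuh, Rodrigues- Serrano or the application: the landmark coordinates, the function names `aspect_ratio_from_landmarks` and `box_from_aspect_ratio`, and the definition of the ratio as the width of the landmark extent divided by its height are all hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def aspect_ratio_from_landmarks(landmarks: List[Point]) -> float:
    """Width / height of the extent spanned by the landmarks (assumed definition)."""
    if len(landmarks) < 2:
        raise ValueError("at least two landmarks are required")
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    if height == 0:
        raise ValueError("landmarks must span a non-zero height")
    return width / height


def box_from_aspect_ratio(center: Point, height: float, ratio: float) -> Box:
    """Axis-aligned box of the given height whose width is ratio * height."""
    cx, cy = center
    half_w = ratio * height / 2.0
    half_h = height / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


# Hypothetical landmarks in pixel coordinates: two shoulders and the feet.
landmarks = [(10.0, 20.0), (30.0, 20.0), (20.0, 100.0)]
ratio = aspect_ratio_from_landmarks(landmarks)  # 20 / 80
box = box_from_aspect_ratio((20.0, 60.0), 80.0, ratio)
```

In this sketch the aspect ratio is fixed first and the box is generated from it afterwards, which mirrors the recited ordering in which the bounding contour is based on an aspect ratio determined prior to generating the contour.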
 	However, An-Ti, Kwatra, Yuh and Rodrigues- Serrano do not appear to teach activating and/or deactivating one or more lights of a lighting system based on the result of the object recognition.
 	In analogous art, McClure teaches activating and/or deactivating one or more lights of a lighting system based on the result of the object recognition (i.e., 0018 – “system elements includes a digital image capture and processing device that is able to capture digital images from the field of view, programmable circuitry that is configured to process and analyze the captured images locally in cooperation with data stored in or available to the device, and provide an output signal based on the outcome of the analysis. The output signal may, for example, conditionally activate a remote device, such as a lock, a light,…”).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include activating and/or deactivating one or more lights of a lighting system based on the result of the object recognition for the purpose of improving the selectivity and specificity of detecting objects and events.
 	However, assuming arguendo that the several cited documents have nothing to do with identifying a person in a top-view image, i.e., that such references generally disclose object recognition but do nothing to help one skilled in the art identify when a person has entered a room based on a top-view image, the Examiner notes the following.
 	Tiwari teaches a novel human occupancy detection system and method for use in lighting/HVAC control. The system uses a vision-based algorithm to detect human head-top circles in top-view videos captured by a ceiling-mounted camera (e.g., see at least the abstract).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to try identifying when a person has entered a room based on a top-view image for the purpose of determining human occupancy as suggested by at least Tiwari.

 	Consider Claim 2, An-Ti teaches the training method according to claim 1 comprising a distortion correction step after providing the at least one top-view training image and before labelling the at least one training object from the at least one training image (e.g., see distortion correction and unwrapping as known from the prior art but having known disadvantages, section 1, Introduction, end of first paragraph: the prior art “first Warp the fish-eye view to normal view”, i.e., the distortion correction would occur before subsequent steps).
 	Consider Claim 3, An-Ti teaches wherein aligning the training object present in the training image along the pre-set direction comprises unwrapping the training image (e.g., see distortion correction and unwrapping as known from the prior art but having known disadvantages, section 1, Introduction, end of first paragraph: the prior art “first Warp the fish-eye view to normal view”, i.e., the distortion correction would occur before subsequent steps).
 	Consider Claim 4, An-Ti teaches wherein aligning the training object present in the training image along the pre-set direction comprises rotating the at least one training object (e.g., see at least section 2.2).
 	Consider Claim 5, An-Ti teaches wherein the labelled training object is resized to a standard window size (e.g., see at least section 1 "resampling all resulting images to the same size").
 	Consider Claim 9, An-Ti teaches a detection method for object recognition, the detection method comprising:
providing at least one top-view test image (e.g., see the top view outlined in at least figure 2b);
applying a test window on the at least one test image (e.g., this limitation is met by at least section 2.3, HOG+SVM for Each Detection Window);
extracting at least one feature vector for describing the content of the test window (e.g., this limitation is met by at least figure 1 and section 2.3, HOG+SVM for Each Detection Window);
applying the classifier model trained by a training method for object recognition on the at least one feature vector (e.g., this limitation is met by at least figure 1 and section 2.3, HOG+SVM for Each Detection Window) comprising:
providing the at least one top-view training image (e.g., see the top view outlined in at least figure 2b);
aligning at least one training object present in the training image along a pre-set direction (this is met in at least I. Introduction, 3rd paragraph, "we rotate each radially oriented bounding-box containing a human body or non-human objects");
labelling the at least one training object from the at least one training image using a pre-defined labelling scheme (this limitation is met by at least section 2.5 - Collection of Training Samples);
extracting the at least one feature vector for describing the content of the at least one labelled training object and the at least one feature vector for describing at least one background scene (this limitation is met by at least figure 1 and section 2.5, regarding positive and negative samples); and
training the classifier model based on the extracted feature vectors (e.g., this limitation is met by at least figure 1 and section 2.3, HOG+SVM for Each Detection Window).
 	However, An-Ti, Kwatra and Yuh do not appear to teach wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio of the at least one training object based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour.
 	In analogous art, Rodrigues- Serrano teaches wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio of the at least one training object based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour (i.e., Rodrigues- Serrano teaches in at least [0075] - “the images in the training set may have been manually annotated with localization information 60, such as coordinates of a bounding box enclosing the object of interest… In some embodiments, the aspect ratio of the rectangle may be fixed. In other embodiments, the aspect ratio of the bounding box may vary between training images.”).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include wherein the at least one training object comprises two or more body landmarks; selecting the two or more body landmarks; determining an aspect ratio of the at least one training object based on the two or more body landmarks; wherein the bounding contour is based on the aspect ratio prior to generating the bounding contour for the purpose of improved object localization.
 	However, An-Ti, Kwatra, Yuh and Rodrigues- Serrano do not appear to teach activating and/or deactivating one or more lights of a lighting system based on the result of the object recognition.
 	In analogous art, McClure teaches activating and/or deactivating one or more lights of a lighting system based on the result of the object recognition (i.e., 0018 – “system elements includes a digital image capture and processing device that is able to capture digital images from the field of view, programmable circuitry that is configured to process and analyze the captured images locally in cooperation with data stored in or available to the device, and provide an output signal based on the outcome of the analysis. The output signal may, for example, conditionally activate a remote device, such as a lock, a light,…”).
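 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include activating and/or deactivating one or more lights of a lighting system based on the result of the object recognition for the purpose of improving the selectivity and specificity of detecting objects and events.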
 	However, assuming arguendo that the several cited documents have nothing to do with identifying a person in a top-view image, i.e., that such references generally disclose object recognition but do nothing to help one skilled in the art identify when a person has entered a room based on a top-view image, the Examiner notes the following.
 	Tiwari teaches a novel human occupancy detection system and method for use in lighting/HVAC control. The system uses a vision-based algorithm to detect human head-top circles in top-view videos captured by a ceiling-mounted camera (e.g., see at least the abstract).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to try identifying when a person has entered a room based on a top-view image for the purpose of determining human occupancy as suggested by at least Tiwari.
 	Consider Claim 10, An-Ti teaches wherein applying the test window on the at least one test image; extracting the at least one feature vector for describing the content of the test window; and applying the classifier model trained by the training method for object recognition are repeated for different orientation angles of the test image provided in providing the at least one top-view test image (e.g., this is met by at least section 2.2.).
 	Consider Claim 11, An-Ti teaches wherein ROI samples resulting from applying a test window on the at least one test image are varied by resizing to different pre-selected sizes prior to extracting at least one feature vector for describing the content of the test window (e.g., this is met by multiple detection window size before feature extraction, see section 2.2, 2nd paragraph. Extrapolating feature vectors to resize the ROI samples is a well-known alternative).
 	Consider Claim 12, An-Ti teaches wherein ROI samples resulting from applying the test window on the at least one test image are varied by resizing to different pre-selected sizes, feature vectors are extracted by extracting the at least one feature vector for describing the content of the test window from the varied ROI samples, and further feature vectors are calculated by extrapolation from these extracted feature vectors (e.g., this is met by multiple detection window sizes before feature extraction, see section 2.2, 2nd paragraph. Extrapolating feature vectors to resize the ROI samples is a well-known alternative).
 	Consider Claim 15, An-Ti teaches a surveillance system comprising at least one vision-based camera sensor, wherein the surveillance system is adapted to perform the detection method according to claim 9 (e.g., see at least the abstract).
Claims 6-8 and 13 is/are rejected under 35 U.S.C. 103 as being unpatentable over An-Ti CHIANG et al.; Human Detection in Fish-Eye Images Using HOG-Based Detectors Over Rotated Windows; 2014 IEEE International Conference on Multimedia and Expo (ICME); July 14, 2014; pages 1-6, hereinafter, ‘An-Ti’, in view of Kwatra et al. US Patent No.: 9,483,701, hereinafter, ‘Kwatra’, and further in view of Yuh et al. WO 2017/106645 A1, hereinafter, ‘Yuh’, in view of Rodrigues- Serrano et al. US Patent Pub. No.: 2014/02700350, hereinafter, ‘Rodrigues- Serrano’, and further in view of McClure et al. US Patent Pub. No.: 2010/0214408, hereinafter, ‘McClure’, and further in view of ARTHUR D. COSTEA et al.; Obstacle Localization and Recognition for Autonomous Forklifts Using Omnidirectional Stereovision; 2015 IEEE Intelligent Vehicles Symposium (IV); June 28, 2015; pages 531-536, hereinafter, ‘Arthur’.
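 	Consider Claims 6-7 and 13, An-Ti teaches the claimed invention except wherein extracting the at least one feature vector for describing the content of the at least one labelled training object and the at least one feature vector for describing the at least one background scene comprises extracting the at least one feature vector according to an Aggregated Channel Feature (ACF) scheme, and (claim 7) wherein the ACF scheme is a Grid ACF scheme.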
 	In analogous art, Arthur teaches wherein extracting the at least one feature vector for describing the content of the at least one labelled training object and the at least one feature vector for describing the at least one background scene comprises extracting the at least one feature vector according to an Aggregated Channel Feature (ACF) scheme, and (claim 7) wherein the ACF scheme is a Grid ACF scheme (e.g., see at least section 3B, wherein features are extracted according to an ACF scheme on a grid basis: "8 aggregated channels are obtained by computing an average for 4x4 pixel cells").
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include wherein extracting the at least one feature vector for describing the content of the at least one labelled training object and the at least one feature vector for describing the at least one background scene comprises extracting the at least one feature vector according to an Aggregated Channel Feature (ACF) scheme, and (claim 7) wherein the ACF scheme is a Grid ACF scheme, for the purpose of improving image recognition.
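 	For illustration only, the grid-based aggregation quoted from Arthur ("computing an average for 4x4 pixel cells") may be sketched for a single channel as follows. This is a minimal sketch under stated assumptions: the function name `aggregate_channel`, the list-of-lists channel representation and the sample values are hypothetical, and the computation of the individual channels themselves is not shown.

```python
from typing import List

Channel = List[List[float]]


def aggregate_channel(channel: Channel, cell: int = 4) -> Channel:
    """Average a 2-D channel over non-overlapping cell x cell pixel blocks."""
    rows = len(channel)
    cols = len(channel[0]) if rows else 0
    if rows == 0 or cols == 0 or rows % cell or cols % cell:
        raise ValueError("channel dimensions must be non-zero multiples of the cell size")
    aggregated: Channel = []
    for r in range(0, rows, cell):
        out_row = []
        for c in range(0, cols, cell):
            # Sum every pixel in the current cell, then divide by the cell area.
            total = sum(channel[r + i][c + j] for i in range(cell) for j in range(cell))
            out_row.append(total / (cell * cell))
        aggregated.append(out_row)
    return aggregated


# Hypothetical 4x8 channel: the left 4x4 cell is all 1.0, the right cell all 3.0.
channel = [[1.0] * 4 + [3.0] * 4 for _ in range(4)]
aggregated = aggregate_channel(channel)
```

Applying the same averaging to each of several channels and concatenating the results would yield one feature vector per window; that concatenation step is an assumption of this sketch rather than a statement of how Arthur or the application implements it.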
 	 Consider Claim 8, An-Ti teaches the claimed invention except wherein the classifier model is a decision tree model.
 	In analogous art, Arthur teaches wherein the classifier model is a decision tree model (see at least the decision tree model in section 3B; a Random Forest model is a well-known equivalent classifier model).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include wherein the classifier model is a decision tree model, for the purpose of improving image recognition.
Claims 16-17 is/are rejected under 35 U.S.C. 103 as being unpatentable over An-Ti CHIANG et al.; Human Detection in Fish-Eye Images Using HOG-Based Detectors Over Rotated Windows; 2014 IEEE International Conference on Multimedia and Expo (ICME); July 14, 2014; pages 1-6, hereinafter, ‘An-Ti’, in view of Kwatra et al. US Patent No.: 9,483,701, hereinafter, ‘Kwatra’, and further in view of Yuh et al. WO 2017/106645 A1, hereinafter, ‘Yuh’, in view of Rodrigues- Serrano et al. US Patent Pub. No.: 2014/02700350, hereinafter, ‘Rodrigues- Serrano’, and further in view of Kajiya et al. US Patent No.: 5,999,189, hereinafter, ‘Kajiya’.
 	Consider Claims 16 and 17,  An-Ti in view of Kwatra teaches the claimed invention except wherein the bounding contour is rotated along the pre-set direction (e.g., a vertical direction as outlined in claim).
 	In analogous art, Kajiya teaches wherein the bounding contour is rotated along the pre-set direction (e.g., see at least figures 15 and 16 where the bounding box is rotated to enclose the object ).
 	Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date to include rotating a bounding contour for the purpose of capturing a particular image as suggested by Kajiya. The Examiner further submits that the vertical direction is one of a finite number of identified, predictable directions that one of ordinary skill in the art would have found obvious to try in order to capture a particular image based on the size and shape of the object of interest.
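 	For illustration only, rotating a radially oriented bounding box so that it lies along a vertical pre-set direction may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Kajiya, An-Ti or the application: the function names `angle_to_vertical` and `rotate_points`, the sample coordinates, and the choice of the +y axis as the "vertical" direction are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def angle_to_vertical(image_center: Point, object_center: Point) -> float:
    """Angle in radians that rotates the radial direction onto the +y axis."""
    dx = object_center[0] - image_center[0]
    dy = object_center[1] - image_center[1]
    if dx == 0 and dy == 0:
        raise ValueError("object lies at the image center; radial direction is undefined")
    return math.pi / 2.0 - math.atan2(dy, dx)


def rotate_points(points: List[Point], pivot: Point, angle: float) -> List[Point]:
    """Rotate points counter-clockwise about the pivot by the given angle."""
    px, py = pivot
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [
        (px + (x - px) * cos_a - (y - py) * sin_a,
         py + (x - px) * sin_a + (y - py) * cos_a)
        for x, y in points
    ]


# Hypothetical box lying along the +x radial direction of a top-view image.
image_center = (0.0, 0.0)
corners = [(1.0, -0.5), (3.0, -0.5), (3.0, 0.5), (1.0, 0.5)]
angle = angle_to_vertical(image_center, (2.0, 0.0))
aligned = rotate_points(corners, image_center, angle)
```

After the rotation the long side of the box runs along the y axis, so a detector trained on upright objects could be applied to it; whether image coordinates treat +y as up or down is left as an assumption of the sketch.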
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHARLES TERRELL SHEDRICK whose telephone number is (571)272-8621. The examiner can normally be reached 8A-5P.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lester G Kincaid can be reached on 571 272 7922. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/CHARLES T SHEDRICK/Primary Examiner, Art Unit 2646