DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Allowable Subject Matter
Claims 1, 5-20, and 22-24 are allowed.
REASONS FOR ALLOWANCE
The following is an examiner’s statement of reasons for allowance: The claimed invention includes an object-tracking system comprising: a camera configured to capture an image of a surrounding environment in accordance with a first camera configuration, wherein the camera is moveable within a local environment and configured to adopt a second camera configuration; an inertial measurement unit (IMU) associated with the camera, wherein the IMU is configured to generate inertial data representing at least one of an angular velocity or linear acceleration of the camera; and a computer that is operatively coupled with the camera and the inertial measurement unit (IMU), wherein the computer is configured to: process the image from the camera by a graphical processing unit (GPU), wherein the GPU is configured to use feature extraction to process the image, and wherein the GPU is configured to provide an object detector with a preprocessed image, detect a moveable object within the image using a bounding box and a detection algorithm, wherein the detection algorithm is selected from a library of object detection algorithms, and wherein the selection is based on a type of object being detected and the surrounding environment, estimate a current position of the moveable object, estimate a current position of the camera relative to the current position of the 
Further, the claimed invention includes a positioning system, comprising: a camera, wherein the camera is oriented in accordance with a current pan, tilt, and/or zoom (PTZ) configuration, and wherein the camera is configured to capture an image while oriented in accordance with the current PTZ configuration; a processor configured to process the image using a computer vision technique via a graphical processing unit (GPU), wherein the GPU is configured to use feature extraction to process the image, and wherein the GPU is configured to provide an object detector with a preprocessed image; a controller configured to receive a current PTZ configuration from the camera, develop a new PTZ configuration, and communicate the new PTZ configuration to the camera; a detector configured to detect a moveable object within the image, wherein the moveable object is detected using a bounding box and a detection algorithm selected from a library of object detection algorithms, wherein the selection is based on a type of object being detected and a surrounding environment, and wherein the detector is configured to deactivate a detection algorithm if it is no longer compatible with the type of object being detected; and a state estimator configured to store a current estimated position of a user and calculate a new estimated position of the user based on the type of object, an estimated location of the moveable object, and a stored map of an environment, 
Lastly, the claimed invention includes a method for visually localizing an individual, the method comprising the steps of: capturing an image containing an object via a camera using a first pan, tilt, and/or zoom (PTZ) configuration, wherein the camera is associated with the individual and moveable within a local environment; processing the image to determine an appropriate detection algorithm based on a characteristic of the object and a surrounding environment; selecting the appropriate detection algorithm from a library of detection algorithms; detecting the object within the image using the detection algorithm, wherein the detection algorithm circumscribes the object with a bounding box, wherein the detection algorithm is selected from a library of object detection algorithms, and wherein the selection is based on a type of object being detected and the surrounding environment; and wherein the determining whether the object is moving or stationary; in response to determining the object is stationary: estimating a position of the object in relation to one of a user or other objects, wherein the position is estimated using a Kalman filter and inertial measurements from an inertial measurement unit (IMU), and storing the position of the object in a map memory; determining a second PTZ configuration; and orientating the camera in accordance with the second PTZ configuration; and generating a real-time map of the local environment in a GPS-denied environment, wherein the real-time map reflects a current position of the camera and the position of the object.
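For illustration only, the recited selection of a detection algorithm from a library, based on the type of object and the surrounding environment, can be pictured as a keyed lookup. The following is a minimal Python sketch under stated assumptions and is not taken from the application or from any cited reference; the detector functions, the key strings, and the fixed bounding boxes they return are all hypothetical placeholders.

```python
from typing import Callable, Dict, Tuple

# Hypothetical types: a bounding box is (x, y, width, height) in pixels,
# and a detector maps an image (any object here) to one bounding box.
BoundingBox = Tuple[int, int, int, int]
Detector = Callable[[object], BoundingBox]


def detect_person_indoor(image: object) -> BoundingBox:
    # Placeholder detector; a real one would analyze the image.
    return (10, 20, 50, 120)


def detect_vehicle_outdoor(image: object) -> BoundingBox:
    # Placeholder detector; a real one would analyze the image.
    return (40, 60, 200, 90)


# Assumed library layout: keyed by (object type, surrounding environment).
DETECTOR_LIBRARY: Dict[Tuple[str, str], Detector] = {
    ("person", "indoor"): detect_person_indoor,
    ("vehicle", "outdoor"): detect_vehicle_outdoor,
}


def select_detector(object_type: str, environment: str) -> Detector:
    """Return the library entry matching the object type and environment."""
    try:
        return DETECTOR_LIBRARY[(object_type, environment)]
    except KeyError:
        raise LookupError(
            f"no detector for {object_type!r} in {environment!r}"
        ) from None


if __name__ == "__main__":
    detector = select_detector("person", "indoor")
    print(detector(None))
```

A dictionary keyed on both criteria is only one possible arrangement; the claims do not specify how the library is organized.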

Tracking unit 50 performs several functions: it controls video decoder 58 and captures video frames acquired by camera 22; it registers video frames taken at different times to remove the effects of camera motion; it performs a video content analysis to detect target objects which are in motion within the FOV of camera 22; it calculates the relative direction, speed and size of the detected target objects; it sends direction and speed commands to camera 22; it performs all serial communications associated with the above functions; and it controls the operation of the status indicators 70, 72 and relay 74 [See Sablak, 0043]. Tracking unit 50 does not require the two images which are used to determine the motion of the POI to 
The process then returns to block 84 where the first flag will no longer be true and the process will proceed to block 108 where a single new image will be grabbed and overwrite image I2 in the buffer [See Sablak, 0084]. The tilt value of camera 22 for new image I2 is then obtained at block 110 from the integral controller of camera 22 for later calculation of the desired focal length [See Sablak, 0084]. The new image is then subsampled at block 112 and corners are detected and a list of such corners created for the subsampled images at block 114 [See Sablak, 0084]. The warping and alignment process described above is then performed at block 116 to align images I1 and I2 [See Sablak, 0084]. At block 118, the image difference of the two aligned images is then calculated to determine if a moving object is included in the images [See Sablak, 0084]. If a moving target object is present in the images, the centroid of the target object is determined at block 120 [See Sablak, 0084]. At block 122 images I1 and I2 and the data associated therewith are swapped as described above with respect to block 98 [See Sablak, 0084]. At block 124 the size of the detected target object, i.e., the Blob_Size, is compared to a threshold value and, if the target object is not large enough, or if no target object has been 
Sablak fails to explicitly disclose detecting a moveable object within the image using a bounding box and a detection algorithm, wherein the detection algorithm is selected from a library of object detection algorithms, and wherein the selection is based on a type of object being detected and the surrounding environment.
Roumeliotis et al. (Hereafter, “Roumeliotis”) [US 2016/0327395 A1] discloses inverse filtering and square root inverse filtering techniques for optimizing the performance of a vision-aided inertial navigation system (VINS) [See Roumeliotis, Abstract]. In one example, instead of keeping all features in the system's state vector as SLAM features, which can be inefficient when the number of features per frame is large or their track length is short, an estimator of the VINS may classify the features into either SLAM or MSCKF features [See Roumeliotis, Abstract]. The SLAM features are used for SLAM-based state estimation, while the MSCKF features are used to further constrain the poses in the sliding window [See Roumeliotis, Abstract]. In one example, a square root inverse sliding window filter (SQRT-ISWF) is used for state estimation [See Roumeliotis, Abstract].
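For illustration only, the classification of features into SLAM or MSCKF features can be sketched as follows. The Abstract does not state the classification criterion; this minimal Python sketch assumes, purely hypothetically, that features with long tracks are kept as SLAM features up to a fixed budget and that all remaining features are treated as MSCKF features. The function name, parameters, and thresholds are not drawn from Roumeliotis.

```python
from typing import Dict, List, Tuple


def classify_features(
    track_lengths: Dict[str, int],
    min_slam_track: int = 5,
    max_slam: int = 2,
) -> Tuple[List[str], List[str]]:
    """Split feature ids into (slam, msckf) lists.

    Assumed rule (hypothetical): a feature becomes a SLAM feature when it
    has been tracked for at least `min_slam_track` frames and the SLAM
    budget `max_slam` is not yet exhausted; otherwise it is an MSCKF feature.
    """
    # Longest tracks first; ties broken by feature id for determinism.
    ranked = sorted(track_lengths.items(), key=lambda kv: (-kv[1], kv[0]))
    slam: List[str] = []
    msckf: List[str] = []
    for feature_id, length in ranked:
        if length >= min_slam_track and len(slam) < max_slam:
            slam.append(feature_id)
        else:
            msckf.append(feature_id)
    return slam, msckf


if __name__ == "__main__":
    print(classify_features({"a": 10, "b": 3, "c": 7, "d": 6}))
```

Capping the number of SLAM features reflects the efficiency concern noted in the Abstract, namely that keeping every feature in the state vector is costly when many features are observed per frame.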
FIG. 1 is a block diagram illustrating a vision-aided inertial navigation system (VINS) 10 that navigates an environment 2 having a plurality of features 15 using one or more image sources and inertial measurement units (IMUs) [See Roumeliotis, 0019]. That is, as described herein, VINS 10 may perform simultaneous localization and mapping (SLAM) by constructing a 




FIG. 2 illustrates an example implementation of VINS 10 in further detail [See Roumeliotis, 0022]. Image source 12 of VINS 10 images an environment in which VINS 10 operates so as to produce image data 14 [See Roumeliotis, 0022]. That is, image source 12 generates image data 14 that captures a number of features visible in the environment [See Roumeliotis, 0022]. Image source 12 may be, for example, one or more cameras that capture 2D or 3D images, a laser scanner or other optical device that produces a stream of 1D image data, a depth sensor that produces image data indicative of ranges for features within the environment, a stereo vision system having multiple cameras to produce 3D information, a Doppler radar and the like [See Roumeliotis, 0022]. In this way, image data 14 provides exteroceptive information as to the external environment in which VINS 10 operates [See 
Feature extraction and tracking module 12 extracts features 15 from image data 14 acquired by image source 12 and stores information describing the features in feature database 25 [See Roumeliotis, 0026]. Feature extraction and tracking module 12 may, for example, perform corner and edge detection to identify features and track features 15 across images using, for example, the Kanade-Lucas-Tomasi (KLT) techniques described in Bruce D. Lucas and Takeo Kanade, An iterative image registration technique with an application to stereo vision, In Proc. of the International Joint Conference on Artificial Intelligence, pages 674-679, Vancouver, British Columbia, Aug. 24-28 1981, the entire content of which is incorporated herein by reference [See Roumeliotis, 0026]. Outlier rejection module 13 provides robust outlier rejection of measurements from image source 12 and IMU 16 [See Roumeliotis, 0027]. For example, outlier rejection module 13 may apply a Mahalanobis distance test to the feature measurements to identify and reject outliers [See Roumeliotis, 0027]. As one example, outlier rejection module 13 may apply a 2-Point Random sample consensus (RANSAC) technique described in Laurent Kneip, Margarita Chli, and Roland Siegwart, Robust Real-Time Visual Odometry with a Single Camera and an IMU, In Proc. of the British Machine Vision Conference, pages 16.1-16.11, 
Roumeliotis fails to explicitly disclose detecting a moveable object within the image using a bounding box and a detection algorithm, wherein the detection algorithm is selected from a library of object detection algorithms, and wherein the selection is based on a type of object being detected and the surrounding environment.
Fahn et al. (Hereafter, “Fahn”) [US 2009/0128618 A1] discloses a system and method for automatically selecting an object from a field of view of a handheld image capture device [See Fahn, Abstract]. The system includes sensors configured to sense features of one or more objects in the field of view and a decision unit configured to automatically select one or more objects from the field of view based on the sensed features using a decision algorithm that is based on a decision structure, wherein the decision structure receives and prioritizes inputs from the sensors [See Fahn, Abstract]. The system may also optionally include an object movement detecting module configured to detect movement of objects, and a manual selection unit configured to provide user priorities; if included, the information from these elements may also be used by the decision unit to automatically select the object or objects [See Fahn, Abstract]. FIG. 1 shows an imager with a multiple-axis actuating mechanism (hereinafter "MAAM imager") [See Fahn, 0034]. The MAAM imager shown in FIG. 1 is a single-imager handheld camera with a multiple-axis actuating mechanism (hereinafter a "single-imager MAAM camera") [See Fahn, 0034]. The single-imager MAAM camera 100 includes a camera body 120 and an actuated imager 110 [See Fahn, 0034]. In certain embodiments, the imager comprises an image sensor and lens, wherein the lens is positioned proximately to the 
The single-imager MAAM camera, such as that shown in FIGS. 2A and 2B may be used for an auto-centering purpose, e.g., centering an object of interest in the center of the captured image field [See Fahn, 0045]. In other embodiments, the object of interest could be centered in a particular zone or placed at the intersection of particular zones of the captured image field [See Fahn, 0045]. Assuming an object of interest is selected, the selected object may be centered automatically by a combination of the panning and the tilting motions of the actuated imager [See Fahn, 0045]. The auto-center feature will be described in detail in reference to FIG. 4 below [See Fahn, 0045]. The method and system for selecting an object of interest and centering the selected object automatically will be discussed in detail in reference to FIGS. 7 and 9 below [See Fahn, 0045]. Initially, the bicyclist 401, being located inside the wide field of 
FIG. 5 illustrates the auto-zoom capability according to some embodiments of the automatic image capture system [See Fahn, 0052]. Here, the auto-zoom capability will be described in reference to a dual-imager MAAM camera 300 (FIG. 3B) [See Fahn, 0052]. However, it will be understood that the auto-zoom capability may be implemented also with a single-imager MAAM camera 100 such as shown in FIGS. 1, 2A, 2B, and 2C [See Fahn, 0052]. An object of interest 501, such as a bicyclist in the illustration, may be moving or stationary [See Fahn, 0052]. For the purpose of illustration of the auto-zoom capability, the object of interest is assumed to be stationary [See Fahn, 0052]. This is because even if the object is moving in an absolute sense with respect to the background, the object remains stationary in a relative sense within an image field 550 and 560 of the actuated imager 310 due to the auto-center process as discussed above in reference to FIG. 4 [See Fahn, 0052]. As previously described, the dual-imager MAAM camera 300 includes the stationary imager 330 and the actuated imager 310 [See Fahn, 0052]. Here, the actuated imager 310, in addition to having the pan DOF and the tilt 
Fahn fails to explicitly disclose detecting a moveable object within the image using a bounding box and a detection algorithm, wherein the detection algorithm is selected from a library of object detection algorithms, and wherein the selection is based on a type of object being detected and the surrounding environment.
Ju et al. (Hereafter, “Ju”) [US 2019/0156138 A1] discloses a method, a system, and a computer-readable recording medium for image-based object tracking [See Ju, Abstract]. The method includes the following steps [See Ju, Abstract]. A video stream including a plurality of images is received [See Ju, Abstract]. The video stream is generated through photographing an enclosed space by an image capturing device, and a moving range of a plurality of observed objects is limited to the enclosed space [See Ju, Abstract]. A plurality of moving objects are detected from the video stream, and frames associated with each of the moving objects are generated for the images [See Ju, Abstract]. The images include a current image and a previous image [See Ju, Abstract]. By analyzing a position projecting relationship between current frames in the current image and previous frames in the previous image, a linking relationship between the current frames in the current image and the observed objects is established [See Ju, Abstract]. The observed objects in the enclosed space are tracked according to the established linking relationship [See Ju, Abstract].
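For illustration only, linking frames in a current image to objects observed in a previous image can be sketched as follows. The Abstract does not specify how the position projecting relationship is computed; this minimal Python sketch substitutes a common stand-in, greedy matching by intersection-over-union (IoU), which is an assumption and not Ju's disclosed method. The function names, the box format (x1, y1, x2, y2), and the threshold are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_frames(
    previous: Dict[str, Box], current: List[Box], min_iou: float = 0.3
) -> Dict[int, str]:
    """Map each current frame index to the best-overlapping previous object.

    Greedy, one-to-one: an object id is assigned to at most one current
    frame, and frames with no overlap of at least `min_iou` stay unlinked.
    """
    links: Dict[int, str] = {}
    used = set()
    for index, box in enumerate(current):
        best_id, best_score = None, min_iou
        for object_id, previous_box in previous.items():
            if object_id in used:
                continue
            score = iou(box, previous_box)
            if score >= best_score:
                best_id, best_score = object_id, score
        if best_id is not None:
            links[index] = best_id
            used.add(best_id)
    return links


if __name__ == "__main__":
    prev = {"A": (0, 0, 10, 10), "B": (20, 20, 30, 30)}
    curr = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
    print(link_frames(prev, curr))
```

In this sketch a current frame with no sufficiently overlapping previous frame is simply left unlinked; how Ju handles such cases is not stated in the Abstract.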

Ju fails to explicitly disclose detecting a moveable object within the image using a bounding box and a detection algorithm, wherein the detection algorithm is selected from a library of object detection algorithms, and wherein the selection is based on a type of object being detected and the surrounding environment.
Thus, none of the references found anticipates the claimed subject matter, nor would the references, if combined, render the claimed invention obvious.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the 
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Kaitlin A Retallick whose telephone number is (571) 270-3841. The examiner can normally be reached Monday-Friday 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chris Kelley, can be reached at (571) 272-7331. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KAITLIN A RETALLICK/
Primary Examiner, Art Unit 2482