DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Preliminary Amendment
This Office Action is responsive to communications filed on 09/14/2020. Claims 1-21 are pending in the instant application. Claims 1, 11 and 21 are independent. An Office Action on the merits follows below.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 09/14/2020 and 06/22/2021 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) because the claim limitations use a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.
Such claim limitations are: 
in claim 11: “an input module”; “a crowd estimation technique integration module”
Although not expressly detailed here, for the sake of brevity, each instance of the limitation “module” in the subsequent dependent claims has been interpreted under 35 U.S.C. 112(f).
Because these claim limitations are being interpreted under 35 U.S.C. 112(f) they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) applicant may: (1) amend the claim limitations to avoid them being interpreted under 35 U.S.C. 112(f) (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid them being interpreted under 35 U.S.C. 112(f).
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 2, 5, 7, 10, 11, 12, 15, 17 and 21 are rejected under 35 U.S.C. 102(a)(1) and/or (a)(2) as being anticipated by Kilambi et al. (US 20080118106 A1).
Regarding Claim 1: Kilambi discloses a method for crowd estimation (Refer to para [021]; “An example of the present subject matter includes a method to estimate the number of people in a scene or an environment. The crowd can include an individual or a dense group of people moving together. Each individual or group is separately tracked as an entity using, for example, an extended Kalman filter based tracker.”) comprising:
performance modeling of each of a plurality of crowd estimation techniques based on an accuracy thereof at different crowd levels and/or at different locations (Refer to para [154]; “To improve robustness and stability, an example of the present subject matter uses extensions to a Kalman filter tracker based on the history of estimates. Various methods, such as those based on a heuristic training or based on shape models, can be used for estimating crowd size. Motion trajectories of these crowds can be generated for further data analysis. The system can be configured to count and track people in the presence of occlusions, group merges, and splits. The system is substantially view-point invariant as it uses data from camera calibration methods.”);
receiving an image of a crowd (Refer to para [047]; “The segmented image is transformed into world coordinates through projection using the camera calibration information. The camera is calibrated to allow extraction of three dimensional (3D) information from a two dimensional (2D) image taken by that camera. The camera may be virtual or real.”);
selecting one or more of the plurality of crowd estimation techniques in response to the performance modeling of the one or more of the plurality of crowd estimation techniques and an estimated crowd level and/or an estimated location (Refer to para [035-038]; “All blobs whose area (area of polygon which is the intersection of the projected blob onto ground and head planes) exceeds an area threshold are classified either as a group or as a large object (i.e., a bus or a car). According to one example, the area threshold is selected as some value less than the area corresponding to two individuals in the real world. For blobs larger than the area threshold, the extended Kalman filter (EKF) tracker is initialized by observing the velocity of blobs for a small number of frames. If the velocity of a blob remains above a particular velocity threshold, then it is assumed that the blob corresponds to a vehicle and the system does not estimate a count for that region. The remaining large blobs are classified as groups and the counting algorithm is applied and a group tracker is initialized. For all blobs whose area is less than the area threshold, a comparison is made as to height and width. A blob is treated as an individual person if the blob height is greater than the width. For each such blob, the present subject matter initializes an EKF tracker and tracks the blob for a minimum number of frames. If the blob can be reliably tracked for a minimum number of frames, then the blob corresponds to an individual. All other blobs correspond to noise and are thus discarded. In one example, it is assumed that individuals are taller than wider and so all blobs whose width is greater than the height either correspond to a group or noise. For these blobs, if they can be reliably tracked using a Kalman filter for a minimum number of frames, then determine if the blob is a group, otherwise the blob is discarded as noise.”); and
estimating a crowd count of the crowd in the received image in accordance with the selected one or more of the plurality of crowd estimation techniques (Refer to para [060-064 and 070]; “The counting procedure is applied to groups of individuals. Initially, the present subject matter determines if the object being tracked represents an individual or a group using a tracking method. For each foreground region, determine if the area exceeds 2K. If not, then initialize a single person tracker for this region and assume that it corresponds to one person. If the area exceeds 2K, then switch to group tracking mode (described in the tracking section) and assume that the tracked object represents a group. As such, the count for the group in the current frame is estimated to be Count=Area/K. Update the estimate of the group tracker if the blob is already being tracked, otherwise initialize a group tracker with this Count as the initial estimate of the count for the group. Sum all Counts, including all individuals and groups in the frame, to find the number of people in the scene. This can be done in real-time on a frame-by-frame basis. A probabilistic approach may be used to provide an estimate for counts of a group based on shape probabilities.”).
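
For illustration only, the area-based counting heuristic quoted above from Kilambi para [060-064] can be sketched as follows. This is a minimal sketch under stated assumptions, not Kilambi's implementation: the function name, the list-of-areas input and the choice of a plain floating-point sum are hypothetical, and the sketch omits the tracking, vehicle and noise handling described in para [035-038].

```python
def estimate_scene_count(blob_areas, k):
    """Sum per-blob counts for one frame.

    blob_areas: projected foreground-region areas (hypothetical input format).
    k: the per-person area constant "K" named in the quoted passage.
    A region whose area does not exceed 2K is assumed to be one person;
    a larger region is treated as a group with Count = Area / K.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    total = 0.0
    for area in blob_areas:
        if area > 2 * k:
            total += area / k  # group: Count = Area / K
        else:
            total += 1.0       # individual: assumed to be one person
    return total
```

Under these assumptions, a frame with one small region and one region of area 5K yields a count of 1 + 5 = 6.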

Regarding Claim 2: Kilambi discloses receiving the image of the crowd (Refer to para [133-136] and Figure 4, wherein the Examiner considers a “per frame count” to equate with a still image analysis) determining a region of interest within the image of the crowd (Refer to para [137]; “Table II shows the average count for the method based on the modal estimate of all the per frame counts over the lifetime of the blob from 3 different video sequences. In some cases, the shape-based method outperforms the heuristic approach for larger groups. In cases where groups of 2 or 3 people are miscounted, the images show that the people are not really moving together but appear together in a single blob. The ellipse fitting method may be vulnerable to this error as reflected in the tables. Overestimates are noted when the groups are far from the camera or near the horizon. As the distance from the camera increases, the per pixel error increases, (i.e., the distance between two neighboring pixels is greater). This type of error can be reduced by establishing a region-of-interest in which data beyond the region is not considered in the calculation.”) and estimating one or both of the crowd level of the crowd in the region of interest within the image of the crowd or the location of the crowd in the region of interest within the image of the crowd (Refer to para [138 and 143]; “Estimates further away from a camera are weighted so that they have a lesser influence on the count than the estimates made closer to the camera.”).

Regarding Claim 5: Kilambi discloses assigning a real-time confidence value to each of the plurality of crowd estimation techniques in accordance with the performance modeling thereof (Refer to para [019 and 022]; “An example of the present system can be configured to monitor a crowded urban environment and to monitor groups of people in real-time. The number of people in a scene can be counted and tracked. Using prior knowledge obtained from the scene and camera calibration data, the system learns the parameters for estimation. This information can be used to estimate the count of people in the scene in real-time. The present subject matter operates on an image having a foreground region that has been segmented through a background estimation technique. The segmented foreground region can be generated in real-time using various methods, including for example, mixtures of Gaussians.”).

Regarding Claim 7: Kilambi discloses combining crowd count estimation results from the multiple ones of the plurality of crowd estimation techniques to estimate the crowd count of the crowd in the received image (Refer to para [126]; “The following routine addresses dynamic occlusions. When two groups (or a group and an individual) merge, the shape models, as well as the motion models assumed to estimate the count, may not be valid. Rather than treating the combined object as one object, they are tracked as two separate objects corresponding to the original objects using only the predicted values of the Kalman filter. For a merge involving a group, the history of estimates for the group is maintained without updating it. Thus, the modal estimate before the merge is used as the count for the group. The count for the merged group then becomes the sum of counts of all groups and individuals that were part of the merger. During this time, only the age since the last update list is incremented. When this exceeds a threshold (for example, 30 frames), then the estimate is deleted or removed from the list. If all estimates have been deleted and the merged group still has not split, that is interpreted to mean that the merged group is now moving as one group and a new estimate of the count is calculated and the tracker is initialized again.”).

Regarding Claim 10: Kilambi discloses measuring a crowd level in a foreground of the image of the crowd to provide the estimated crowd level utilized in the selecting step (Refer to para [023]; “Using the segmented foreground region of the image, the present system identifies regions corresponding to humans based on known characteristics as to human shape and motion.”).

Regarding Claim 11: Kilambi discloses a system for crowd estimation (Refer to para [021]; “An example of the present subject matter includes a method to estimate the number of people in a scene or an environment. The crowd can include an individual or a dense group of people moving together. Each individual or group is separately tracked as an entity using, for example, an extended Kalman filter based tracker.”) comprising:
a plurality of performance modeling modules configured to model performance of each of a plurality of crowd estimation techniques based on an accuracy thereof at different crowd levels and/or at different locations (Refer to para [154]; “To improve robustness and stability, an example of the present subject matter uses extensions to a Kalman filter tracker based on the history of estimates. Various methods, such as those based on a heuristic training or based on shape models, can be used for estimating crowd size. Motion trajectories of these crowds can be generated for further data analysis. The system can be configured to count and track people in the presence of occlusions, group merges, and splits. The system is substantially view-point invariant as it uses data from camera calibration methods.”);
an input module configured to receive an image of a crowd (Refer to para [047]; “The segmented image is transformed into world coordinates through projection using the camera calibration information. The camera is calibrated to allow extraction of three dimensional (3D) information from a two dimensional (2D) image taken by that camera. The camera may be virtual or real.”); and
a crowd estimation technique integration module configured to select one or more of the plurality of crowd estimation techniques in response to modeling the performance of the one or more of the plurality of crowd estimation techniques (Refer to para [035-038]; “All blobs whose area (area of polygon which is the intersection of the projected blob onto ground and head planes) exceeds an area threshold are classified either as a group or as a large object (i.e., a bus or a car). According to one example, the area threshold is selected as some value less than the area corresponding to two individuals in the real world. For blobs larger than the area threshold, the extended Kalman filter (EKF) tracker is initialized by observing the velocity of blobs for a small number of frames. If the velocity of a blob remains above a particular velocity threshold, then it is assumed that the blob corresponds to a vehicle and the system does not estimate a count for that region. The remaining large blobs are classified as groups and the counting algorithm is applied and a group tracker is initialized. For all blobs whose area is less than the area threshold, a comparison is made as to height and width. A blob is treated as an individual person if the blob height is greater than the width. For each such blob, the present subject matter initializes an EKF tracker and tracks the blob for a minimum number of frames. If the blob can be reliably tracked for a minimum number of frames, then the blob corresponds to an individual. All other blobs correspond to noise and are thus discarded. In one example, it is assumed that individuals are taller than wider and so all blobs whose width is greater than the height either correspond to a group or noise. For these blobs, if they can be reliably tracked using a Kalman filter for a minimum number of frames, then determine if the blob is a group, otherwise the blob is discarded as noise.”) and an estimated crowd level and/or an estimated location and estimate a crowd count of the crowd in the received image in accordance with the selected one or more of the plurality of crowd estimation techniques (Refer to para [060-064 and 070]; “The counting procedure is applied to groups of individuals. Initially, the present subject matter determines if the object being tracked represents an individual or a group using a tracking method. For each foreground region, determine if the area exceeds 2K. If not, then initialize a single person tracker for this region and assume that it corresponds to one person. If the area exceeds 2K, then switch to group tracking mode (described in the tracking section) and assume that the tracked object represents a group. As such, the count for the group in the current frame is estimated to be Count=Area/K. Update the estimate of the group tracker if the blob is already being tracked, otherwise initialize a group tracker with this Count as the initial estimate of the count for the group. Sum all Counts, including all individuals and groups in the frame, to find the number of people in the scene. This can be done in real-time on a frame-by-frame basis. A probabilistic approach may be used to provide an estimate for counts of a group based on shape probabilities.”).

Regarding Claim 12: Kilambi discloses the input module receives the image of the crowd (Refer to para [133-136] and Figure 4, wherein the Examiner considers a “per frame count” to equate with a still image analysis) and determines a region of interest within the image of the crowd (Refer to para [137]; “Table II shows the average count for the method based on the modal estimate of all the per frame counts over the lifetime of the blob from 3 different video sequences. In some cases, the shape-based method outperforms the heuristic approach for larger groups. In cases where groups of 2 or 3 people are miscounted, the images show that the people are not really moving together but appear together in a single blob. The ellipse fitting method may be vulnerable to this error as reflected in the tables. Overestimates are noted when the groups are far from the camera or near the horizon. As the distance from the camera increases, the per pixel error increases, (i.e., the distance between two neighboring pixels is greater). This type of error can be reduced by establishing a region-of-interest in which data beyond the region is not considered in the calculation.”) and wherein the crowd estimation technique integration module estimates one or both of the crowd level of the crowd in the region of interest within the image of the crowd or the location of the crowd in the region of interest within the image of the crowd (Refer to para [138 and 143]; “Estimates further away from a camera are weighted so that they have a lesser influence on the count than the estimates made closer to the camera.”).

Regarding Claim 15: Kilambi discloses the plurality of performance modeling modules further assigns a real-time confidence value to each of the plurality of crowd estimation techniques in accordance with modeling the performance thereof (Refer to para [019 and 022]; “An example of the present system can be configured to monitor a crowded urban environment and to monitor groups of people in real-time. The number of people in a scene can be counted and tracked. Using prior knowledge obtained from the scene and camera calibration data, the system learns the parameters for estimation. This information can be used to estimate the count of people in the scene in real-time. The present subject matter operates on an image having a foreground region that has been segmented through a background estimation technique. The segmented foreground region can be generated in real-time using various methods, including for example, mixtures of Gaussians.”).

Regarding Claim 17: Kilambi discloses the crowd estimation technique integration module selects multiple ones of the plurality of crowd estimation techniques and combines crowd estimation results from the multiple ones of the plurality of crowd estimation techniques to estimate the crowd count of the crowd in the received image (Refer to para [126]; “The following routine addresses dynamic occlusions. When two groups (or a group and an individual) merge, the shape models, as well as the motion models assumed to estimate the count, may not be valid. Rather than treating the combined object as one object, they are tracked as two separate objects corresponding to the original objects using only the predicted values of the Kalman filter. For a merge involving a group, the history of estimates for the group is maintained without updating it. Thus, the modal estimate before the merge is used as the count for the group. The count for the merged group then becomes the sum of counts of all groups and individuals that were part of the merger. During this time, only the age since the last update list is incremented. When this exceeds a threshold (for example, 30 frames), then the estimate is deleted or removed from the list. If all estimates have been deleted and the merged group still has not split, that is interpreted to mean that the merged group is now moving as one group and a new estimate of the count is calculated and the tracker is initialized again.”).

Regarding Claim 21: Kilambi discloses a non-transitory computer readable medium storing a program for causing a computer to perform a method (Refer to para [021 and 148]; “Computer 300 includes memory (not shown) for storage of instructions or data. The memory can be internal or removable storage (or memory) of various types.”) the method comprising:
performance modeling of each of a plurality of crowd estimation techniques based on an accuracy thereof at different crowd levels and/or at different locations (Refer to para [154]; “To improve robustness and stability, an example of the present subject matter uses extensions to a Kalman filter tracker based on the history of estimates. Various methods, such as those based on a heuristic training or based on shape models, can be used for estimating crowd size. Motion trajectories of these crowds can be generated for further data analysis. The system can be configured to count and track people in the presence of occlusions, group merges, and splits. The system is substantially view-point invariant as it uses data from camera calibration methods.”);
receiving an image of a crowd (Refer to para [047]; “The segmented image is transformed into world coordinates through projection using the camera calibration information. The camera is calibrated to allow extraction of three dimensional (3D) information from a two dimensional (2D) image taken by that camera. The camera may be virtual or real.”);
selecting one or more of the plurality of crowd estimation techniques in response to the performance modeling of the one or more of the plurality of crowd estimation techniques and an estimated crowd level and/or an estimated location (Refer to para [035-038]; “All blobs whose area (area of polygon which is the intersection of the projected blob onto ground and head planes) exceeds an area threshold are classified either as a group or as a large object (i.e., a bus or a car). According to one example, the area threshold is selected as some value less than the area corresponding to two individuals in the real world. For blobs larger than the area threshold, the extended Kalman filter (EKF) tracker is initialized by observing the velocity of blobs for a small number of frames. If the velocity of a blob remains above a particular velocity threshold, then it is assumed that the blob corresponds to a vehicle and the system does not estimate a count for that region. The remaining large blobs are classified as groups and the counting algorithm is applied and a group tracker is initialized. For all blobs whose area is less than the area threshold, a comparison is made as to height and width. A blob is treated as an individual person if the blob height is greater than the width. For each such blob, the present subject matter initializes an EKF tracker and tracks the blob for a minimum number of frames. If the blob can be reliably tracked for a minimum number of frames, then the blob corresponds to an individual. All other blobs correspond to noise and are thus discarded. In one example, it is assumed that individuals are taller than wider and so all blobs whose width is greater than the height either correspond to a group or noise. For these blobs, if they can be reliably tracked using a Kalman filter for a minimum number of frames, then determine if the blob is a group, otherwise the blob is discarded as noise.”); and
estimating a crowd count of the crowd in the received image in accordance with the selected one or more of the plurality of crowd estimation techniques (Refer to para [060-064 and 070]; “The counting procedure is applied to groups of individuals. Initially, the present subject matter determines if the object being tracked represents an individual or a group using a tracking method. For each foreground region, determine if the area exceeds 2K. If not, then initialize a single person tracker for this region and assume that it corresponds to one person. If the area exceeds 2K, then switch to group tracking mode (described in the tracking section) and assume that the tracked object represents a group. As such, the count for the group in the current frame is estimated to be Count=Area/K. Update the estimate of the group tracker if the blob is already being tracked, otherwise initialize a group tracker with this Count as the initial estimate of the count for the group. Sum all Counts, including all individuals and groups in the frame, to find the number of people in the scene. This can be done in real-time on a frame-by-frame basis. A probabilistic approach may be used to provide an estimate for counts of a group based on shape probabilities.”).
Allowable Subject Matter
Claims 3, 4, 6, 8, 9, 13, 14, 16, 18 and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The prior art either singly or in combination fails to expressly disclose “…a confidence value observer coupled to the crowd estimation technique integration module configured to remove one of the plurality of crowd estimation techniques from selection when the real-time confidence value of the one of the plurality of crowd estimation techniques falls below a confidence value threshold.” The prior art also fails to expressly disclose: “…dynamically combining the crowd count estimation results from the multiple ones of the plurality of crowd estimation techniques in accordance with the real-time confidence value of the multiple ones of the plurality of crowd estimation techniques to estimate the crowd count of the crowd in the received image.”
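
For illustration only, the two quoted limitations (removing a technique whose real-time confidence value falls below a threshold, and combining the remaining results in accordance with their confidence values) might be sketched as follows. This is a hypothetical sketch and is not drawn from the application or from the prior art of record: the function name, the parallel-list inputs and the use of a confidence-weighted average as the combining rule are all assumptions.

```python
def combine_counts(estimates, confidences, threshold):
    """Combine per-technique crowd counts (hypothetical illustration).

    estimates: crowd counts, one per estimation technique.
    confidences: real-time confidence values, same order as estimates.
    threshold: techniques with confidence below this value are removed.
    Returns a confidence-weighted average of the remaining estimates
    (an assumed combining rule), or None if nothing usable remains.
    """
    kept = [(count, conf) for count, conf in zip(estimates, confidences)
            if conf >= threshold]
    total_conf = sum(conf for _, conf in kept)
    if not kept or total_conf == 0:
        return None
    return sum(count * conf for count, conf in kept) / total_conf
```

Under these assumptions, estimates of 10, 20 and 100 with confidence values 0.5, 0.5 and 0.1 and a threshold of 0.3 drop the third technique and give (10 × 0.5 + 20 × 0.5) / 1.0 = 15.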
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Related Applications: 17/042,474 and 17/042,465 (Examiner notes there are pending Double Patenting rejections in Application 17/042,474)
US 8358806 B2
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MIA M THOMAS whose telephone number is (571)270-1583. The examiner can normally be reached M-Th 8:30am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Edward (Ed) Urban, can be reached at 571-272-7899. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MIA M THOMAS/
Primary Examiner
Art Unit 2665