DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Regarding claims 1, 7 and 16, the phrase "and/or from slightly there above" renders the claims indefinite because it is unclear whether the limitation(s) following the phrase are part of the claimed invention.  See MPEP § 2173.05(d).  First, the term “and/or” is not permitted in a claim limitation; the limitation must recite either “and” or “or”, not both.  Second, the meaning of the term “slightly there above” is unclear.  Further clarification is necessary.
Claims 2-6 and 8-15 depend from independent claims 1 and 7 and are therefore rejected under 35 U.S.C. 112(b) as being indefinite.
Claims 1-16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite in that they fail to point out what is included or excluded by the claim language. Claims 1, 7 and 16 recite the term “essentially” or “substantially” as part of the limitation “forward-looking angle considered eye level of people.”  These claims are omnibus type claims.
Claims 2-6 and 8-15 depend from independent claims 1 and 7 and are therefore rejected under 35 U.S.C. 112(b) as being indefinite.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-16 are rejected under 35 U.S.C. 103 as being unpatentable over Goldner et al., US2016/0283798 A1, in view of Hicks et al., US2019/0387185 A1, and further in view of McDonald et al., WO2020/153971 A1.
Regarding claim 1, Goldner teaches A method performed by a people distinguishing system for in an image distinguishing human beings in a crowd (Fig. 1, par. 0023; a video surveillance and analytics system 100 for detecting change, according to embodiments of the invention. A video camera 101 may monitor or survey a crowded scene or background 107, such as a train station, office lobby, waiting room, or other settings with numerous humans or people.), said method comprising: identifying by detecting objects, one or more detected objects as human beings in an image derived from a thermal camera adapted to capture a scene in an essentially forward-looking angle considered eye level of people and/or from slightly there above (par. 0023; video camera 101 may be a thermal video camera which detects or records infrared thermal energy.); identifying at least a first grouping of adjoining pixels in said image, not comprised in the one or more detected human beings, having an intensity within a predeterminable intensity range (par. 0033; using this coarse estimation and the fact that most human heads have nearly elliptical shape, the Hough transform may be performed to detect moving heads, by trying different possible head sizes (e.g., a refinement of the estimated head sizes).); determining a grouping pixel area of said at least first grouping in said image (par. 0029; local head size, or the size of a typical human head at different positions in a video frame, may be estimated or calculated based on the pixels with optical flow found in step 214.); and determining for at least a first vertical position (yexpected) of said grouping pixel area in said image, based on head size reference data, an expected pixel area (xexpected) size and form of a human head at said at least first vertical position (yexpected), wherein said head size reference data is represented by data indicative of a relationship between vertical positions (y) in the image and corresponding expected pixel area (x) i.e. size and form of a human head (par. 0034; It may require knowing a typical human size in pixels at any place in the frame, or in other words, knowing the scene geometry parameters horizon h and scale s. In addition, the slanted candidate human may be rotated before operating the detection, according to the vanishing point V, such that it becomes vertical in the frame. Given a candidate sample of human object based on the HOG descriptor, the Dalal & Triggs algorithm may use a trained SVM (support vector machine) model to calculate the matching score (detection score) between the sample and the trained model.).
Goldner fails to teach the following recited limitations.  However, Hicks teaches identifying by classifying objects, one or more detected objects classified as human beings in an image derived from a thermal camera adapted to capture a scene in an essentially forward-looking angle considered eye level of people and/or from slightly there above (par. 0021; a sensor, such as a lidar or visible light camera detects a person-sized shape at a particular location in the sensor's field of regard. Post-processing classifies the object as a pedestrian. The system then queries the thermal camera for the probability of a genuine pedestrian existing at the corresponding location in the field of regard of the thermal sensor of the camera.).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Goldner’s teachings with Hicks’s teachings in order to predict the behavior of objects (Hicks, par. 0003).
Goldner and Hicks failed to teach the remaining recited limitations.  However, McDonald teaches comparing at least a portion resembling to at least some extent a human head of said grouping pixel area with said expected head pixel area (xexpected) for said at least first vertical position (yexpected) (par. 0035; the computing system may attempt to iteratively match the face detection with one of the person detection for each face detection until an approved association is found, and vice versa. As an example, for each face detection, the computing system can identify any person detection in the threshold distance (e.g., based on the comparison of their corresponding position), and can attempt to detect the face and the identified one of the person detection associated (e.g., from the closest to the start).); and determining that said at least first grouping comprises at least a first overlapping human being, when at least a first comparison resulting from said comparing exceeds a predeterminable conformity threshold (par. 0037; a face detection can be matched with a person detection if their respective bounding shapes (e.g., bounding boxes, semantic segmentations, etc.) have an amount of overlap that exceeds a threshold. For example, the amount of overlap can be measured as a percentage of the entire face bounding shape.).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Goldner’s and Hicks’s teachings with McDonald’s teachings in order to perform face hallucination and/or screening (McDonald, par. 0001).
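
For illustration only, the overlap test quoted above from McDonald, par. 0037 (overlap measured as a percentage of the entire face bounding shape and compared against a threshold) may be sketched as follows. The box format (x0, y0, x1, y1), the function names overlap_fraction and is_match, and the default threshold of 0.5 are assumptions made for this sketch; none of them is taken from McDonald or from the claims.

```python
def overlap_fraction(face, person):
    """Fraction of the face box area that lies inside the person box.

    Boxes are (x0, y0, x1, y1) tuples with x0 < x1 and y0 < y1.
    This box format is an assumption of the sketch.
    """
    fx0, fy0, fx1, fy1 = face
    px0, py0, px1, py1 = person
    # Width and height of the intersection rectangle (zero if disjoint).
    inter_w = max(0.0, min(fx1, px1) - max(fx0, px0))
    inter_h = max(0.0, min(fy1, py1) - max(fy0, py0))
    face_area = (fx1 - fx0) * (fy1 - fy0)
    if face_area <= 0:
        return 0.0
    return (inter_w * inter_h) / face_area


def is_match(face, person, threshold=0.5):
    """True when the overlap fraction exceeds the (hypothetical) threshold."""
    return overlap_fraction(face, person) > threshold
```

In this sketch a face box that lies entirely inside a person box yields a fraction of 1.0 and is matched, while disjoint boxes yield 0.0 and are not.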

Regarding claims 2 and 8, Goldner, Hicks and McDonald teach all the limitations in claims 1 and 7.  McDonald further teaches estimating a number of overlapping human beings in said at least first grouping based on number of comparisons exceeding said conformity threshold (par. 0037; if the face detection and the corresponding boundary shape (e.g., boundary frame, semantic segmentation and so on) detected by the person has an overlap amount exceeding the threshold value, the face detection can be matched with the person detection. For example, the overlapping amount can be measured as the percentage of the whole face boundary shape).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Goldner’s and Hicks’s teachings with McDonald’s teachings in order to perform face hallucination and/or screening (McDonald, par. 0001).

Regarding claims 3 and 9, Goldner, Hicks and McDonald teach all the limitations in claims 2 and 7.  McDonald further teaches estimating a total number of people in said image by adding said overlapping number of human beings to the one or more detected human beings (par. 0044; There are several different ways in which the increasing or updating of confidence values can be performed. An additional model could be trained to handle confidence value updating. For example, the model can take as inputs the face and person detection confidence values and a corresponding association score. The model can have a target output of 1 if the face is a true positive and a target output of 0 if the face is false positive. The updated confidence value for the face can be the output of this additional model.).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Goldner’s and Hicks’s teachings with McDonald’s teachings in order to perform face hallucination and/or screening (McDonald, par. 0001). 
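
By way of a non-limiting sketch, the counting recited in claims 2-3 and 8-9 (the number of comparisons exceeding the conformity threshold, added to the number of detected human beings) may be expressed as below. The function names and the sample values are hypothetical and are not drawn from the application or from the cited references.

```python
def count_overlapping(conformity_scores, threshold):
    """Number of comparisons whose score exceeds the conformity threshold."""
    return sum(1 for score in conformity_scores if score > threshold)


def estimate_total_people(num_detected, conformity_scores, threshold):
    """Detected human beings plus estimated overlapping human beings."""
    return num_detected + count_overlapping(conformity_scores, threshold)


# Hypothetical example: 3 detected people, 3 head comparisons in a grouping.
total = estimate_total_people(3, [0.9, 0.4, 0.8], threshold=0.7)
```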

Regarding claims 4 and 10, Goldner, Hicks and McDonald teach all the limitations in claims 1 and 7.  Goldner further teaches wherein said identifying at least a first grouping comprises said intensity range being based on an intensity measure of at least one of the one or more detected human beings (par. 0028; The KLT algorithm may use spatial intensity information to direct the search for the positions between two frames that yield the best match. The algorithm may examine fewer potential matches between the images than with traditional feature tracking techniques.).
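
As a minimal sketch of the claim language addressed above (a grouping of adjoining pixels, not comprised in the detected human beings, whose intensity falls within a range based on an intensity measure of a detected human being), the following may be considered. The use of the mean intensity plus or minus a margin, 4-connectivity, the nested-list image format, and all function names are assumptions of this sketch and are not taken from the application, Goldner, Hicks or McDonald.

```python
from collections import deque


def intensity_range_from_detection(image, detected_mask, margin):
    """Range centred on the mean intensity of already-detected pixels."""
    values = [image[r][c]
              for r in range(len(image))
              for c in range(len(image[0]))
              if detected_mask[r][c]]
    if not values:
        raise ValueError("detected_mask marks no pixels")
    mean = sum(values) / len(values)
    return mean - margin, mean + margin


def find_groupings(image, detected_mask, lo, hi):
    """4-connected groups of in-range pixels outside the detected mask."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]

    def eligible(r, c):
        return (not seen[r][c] and not detected_mask[r][c]
                and lo <= image[r][c] <= hi)

    groupings = []
    for r in range(rows):
        for c in range(cols):
            if not eligible(r, c):
                continue
            seen[r][c] = True
            queue, group = deque([(r, c)]), []
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                group.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and eligible(ny, nx):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            groupings.append(group)
    return groupings
```

Under these assumptions the grouping pixel area is simply the number of pixels in a returned group, i.e. len(group).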

Regarding claims 5 and 11, Goldner, Hicks and McDonald teach all the limitations in claims 1 and 7.  Hicks further teaches wherein said head size reference data is based on mapping, for two or more of the detected human beings, a respective vertical position (y1, y2) and pixel area (x1, x2) of a head of the detected human being in said image (par. 0103; the system can be used to map the distance to a number of points within the field of regard. Each of these depth-mapped points may be referred to as a pixel or a voxel. A collection of pixels captured in succession (which may be referred to as a depth map, a point cloud, or a frame) may be rendered as an image or may be analyzed to identify or detect objects or to determine a shape or distance of objects within the FOR.).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Goldner’s teachings with Hicks’s teachings in order to predict the behavior of objects (Hicks, par. 0003).

Regarding claims 6 and 12, Goldner, Hicks and McDonald teach all the limitations in claims 5 and 11.  McDonald further teaches wherein said head size reference data is based on interpolation from said mapping (par. 0121;  the body pose landmarks of the person detection may include only landmarks associated with a torso or limbs of the body and generating the hallucinated face pose landmarks can include projecting a body map (e.g., that includes information about typical spacing between various body landmarks and/or facial landmarks) onto the torso or limb landmarks to identify the hallucinated face pose landmarks.).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Goldner’s and Hicks’s teachings with McDonald’s teachings in order to perform face hallucination and/or screening (McDonald, par. 0001).
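
The claim language of claims 5-6 and 11-12 (head size reference data obtained by mapping vertical positions y1, y2 to head pixel areas x1, x2 and interpolating from that mapping) can be illustrated with the hedged sketch below. Linear interpolation, the min/max area ratio used as a conformity measure, and the function names are assumptions chosen for this sketch; the application and the cited references are not asserted to use them.

```python
def expected_head_area(y, samples):
    """Linearly interpolate the expected head pixel area at vertical position y.

    samples: (y_i, x_i) pairs mapped from detected human beings, where y_i is
    a vertical image position and x_i the head pixel area observed there.
    Positions outside the sampled range are linearly extrapolated.
    """
    pts = sorted(samples)
    if len(pts) < 2:
        raise ValueError("need at least two (y, area) samples")
    # Pick the first segment whose upper end is at or beyond y; if none,
    # the loop variables keep the last segment (extrapolation).
    for (ya, xa), (yb, xb) in zip(pts, pts[1:]):
        if y <= yb:
            break
    if ya == yb:
        raise ValueError("samples must have distinct vertical positions")
    t = (y - ya) / (yb - ya)
    return xa + t * (xb - xa)


def conformity(observed_area, expected_area):
    """Ratio in [0, 1]; 1.0 means the observed area equals the expected area."""
    if observed_area <= 0 or expected_area <= 0:
        return 0.0
    return min(observed_area, expected_area) / max(observed_area, expected_area)
```

A conformity value computed this way could then be compared against a predeterminable conformity threshold as recited in claims 1, 7 and 16.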

Regarding claim 7, Goldner teaches A people distinguishing system for distinguishing human beings in a crowd shown in an image (Fig. 1, par. 0023; a video surveillance and analytics system 100 for detecting change, according to embodiments of the invention. A video camera 101 may monitor or survey a crowded scene or background 107, such as a train station, office lobby, waiting room, or other settings with numerous humans or people.), said people distinguishing system comprising: a human identifying unit for identifying by detecting objects, one or more detected objects as human beings in an image derived from a thermal camera adapted to capture a scene in an essentially forward-looking angle considered eye level of people and/or from slightly there above (par. 0023; video camera 101 may be a thermal video camera which detects or records infrared thermal energy.); a grouping identifying unit for identifying at least a first grouping of adjoining pixels in said image, not comprised in the one or more detected human beings, having an intensity within an predeterminable intensity range (par. 0033; using this coarse estimation and the fact that most human heads have nearly elliptical shape, the Hough transform may be performed to detect moving heads, by trying different possible head sizes (e.g., a refinement of the estimated head sizes).); an area determining unit adapted for determining a grouping pixel area of said at least first grouping in said image (par. 0029; local head size, or the size of a typical human head at different positions in a video frame, may be estimated or calculated based on the pixels with optical flow found in step 214.); an expected area determining unit for determining for at least a first vertical position (yexpected) of said grouping pixel area in said image, based on head size reference data, an expected pixel area (xexpected) size and form of a human head at said at least first vertical position (yexpected), wherein said head size reference data is represented by data indicative of a relationship between vertical positions (y) in the image and corresponding expected pixel area (x) i.e. size and form of a human head (par. 0034; It may require knowing a typical human size in pixels at any place in the frame, or in other words, knowing the scene geometry parameters horizon h and scale s. In addition, the slanted candidate human may be rotated before operating the detection, according to the vanishing point V, such that it becomes vertical in the frame. Given a candidate sample of human object based on the HOG descriptor, the Dalal & Triggs algorithm may use a trained SVM (support vector machine) model to calculate the matching score (detection score) between the sample and the trained model.).
Goldner fails to teach the following recited limitations.  However, Hicks teaches a human identifying unit for identifying by classifying objects, one or more detected objects classified as human beings in an image derived from a thermal camera adapted to capture a scene in an essentially forward-looking angle considered eye level of people and/or from slightly there above (par. 0021; a sensor, such as a lidar or visible light camera detects a person-sized shape at a particular location in the sensor's field of regard. Post-processing classifies the object as a pedestrian. The system then queries the thermal camera for the probability of a genuine pedestrian existing at the corresponding location in the field of regard of the thermal sensor of the camera.).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Goldner’s teachings with Hicks’s teachings in order to predict the behavior of objects (Hicks, par. 0003).
Goldner and Hicks failed to teach the remaining recited limitations.  However, McDonald teaches a comparing unit for comparing at least a portion resembling to at least some extent a human head of said grouping pixel area with said expected head pixel area (xexpected) for said at least first vertical position (yexpected) (par. 0035; the computing system may attempt to iteratively match the face detection with one of the person detection for each face detection until an approved association is found, and vice versa. As an example, for each face detection, the computing system can identify any person detection in the threshold distance (e.g., based on the comparison of their corresponding position), and can attempt to detect the face and the identified one of the person detection associated (e.g., from the closest to the start).); and a conformity determining unit for determining that said at least first grouping comprises at least a first overlapping human being, when at least a first comparison resulting from said comparing exceeds a predeterminable conformity threshold (par. 0037; a face detection can be matched with a person detection if their respective bounding shapes (e.g., bounding boxes, semantic segmentations, etc.) have an amount of overlap that exceeds a threshold. For example, the amount of overlap can be measured as a percentage of the entire face bounding shape.).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Goldner’s and Hicks’s teachings with McDonald’s teachings in order to perform face hallucination and/or screening (McDonald, par. 0001).

Regarding claim 13, Goldner, Hicks and McDonald teach all the limitations in claim 7.  Goldner further teaches A thermal camera comprising a people distinguishing system (par. 0023; video camera 101 may be a thermal video camera.).

Regarding claim 14, Goldner, Hicks and McDonald teach all the limitations in claim 1.  Goldner further teaches A non-transitory computer program product comprising a memory that stores a computer program containing computer program code arranged to cause a computer or a processor to execute the steps of a method (par. 0063; an article such as a non-transitory computer or processor readable medium, or a computer or processor non-transitory storage medium, such as for example a memory when executed by a processor or controller, carry out methods.).

Regarding claim 15, Goldner, Hicks and McDonald teach all the limitations in claim 14.  Goldner further teaches wherein the memory is a non-volatile memory (par. 0063; a USB flash memory).

Regarding claim 16, Goldner teaches A people distinguishing system for distinguishing human beings in a crowd shown in an image (Fig. 1, par. 0023; a video surveillance and analytics system 100 for detecting change, according to embodiments of the invention. A video camera 101 may monitor or survey a crowded scene or background 107, such as a train station, office lobby, waiting room, or other settings with numerous humans or people.), said people distinguishing system comprising: circuitry configured to identify via by detection of objects, one or more detected objects as human beings in an image derived from a thermal camera adapted to capture a scene in a substantially forward-looking angle considered eye level of people and/or from slightly there above (par. 0023; video camera 101 may be a thermal video camera which detects or records infrared thermal energy.); identify at least a first grouping of adjoining pixels in said image, not comprised in the one or more detected human beings, having an intensity within an predeterminable intensity range (par. 0033; using this coarse estimation and the fact that most human heads have nearly elliptical shape, the Hough transform may be performed to detect moving heads, by trying different possible head sizes (e.g., a refinement of the estimated head sizes).); determine a grouping pixel area of said at least first grouping in said image (par. 0029; local head size, or the size of a typical human head at different positions in a video frame, may be estimated or calculated based on the pixels with optical flow found in step 214.); determine for at least a first vertical position (yexpected) of said grouping pixel area in said image, based on head size reference data, an expected pixel area (xexpected) size and form of a human head at said at least first vertical position (yexpected), wherein said head size reference data is represented by data indicative of a relationship between vertical positions (y) in the image and corresponding expected pixel area (x) i.e. size and form of a human head (par. 0034; It may require knowing a typical human size in pixels at any place in the frame, or in other words, knowing the scene geometry parameters horizon h and scale s. In addition, the slanted candidate human may be rotated before operating the detection, according to the vanishing point V, such that it becomes vertical in the frame. Given a candidate sample of human object based on the HOG descriptor, the Dalal & Triggs algorithm may use a trained SVM (support vector machine) model to calculate the matching score (detection score) between the sample and the trained model.).
Goldner fails to teach the following recited limitations.  However, Hicks teaches circuitry configured to identify via by classification of objects, one or more detected objects classified as human beings in the image derived from a thermal camera adapted to capture a scene in a substantially forward-looking angle considered eye level of people and/or from slightly there above (par. 0021; a sensor, such as a lidar or visible light camera detects a person-sized shape at a particular location in the sensor's field of regard. Post-processing classifies the object as a pedestrian. The system then queries the thermal camera for the probability of a genuine pedestrian existing at the corresponding location in the field of regard of the thermal sensor of the camera.).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Goldner’s teachings with Hicks’s teachings in order to predict the behavior of objects (Hicks, par. 0003). 
Goldner and Hicks failed to teach the remaining recited limitations.  However, McDonald teaches compare at least a portion resembling to at least some extent a human head of said grouping pixel area with said expected head pixel area (xexpected) for said at least first vertical position (yexpected) (par. 0035; the computing system may attempt to iteratively match the face detection with one of the person detection for each face detection until an approved association is found, and vice versa. As an example, for each face detection, the computing system can identify any person detection in the threshold distance (e.g., based on the comparison of their corresponding position), and can attempt to detect the face and the identified one of the person detection associated (e.g., from the closest to the start).); and determine that said at least first grouping comprises at least a first overlapping human being, when at least a first comparison resulting from said comparing exceeds a predeterminable conformity threshold (par. 0037; a face detection can be matched with a person detection if their respective bounding shapes (e.g., bounding boxes, semantic segmentations, etc.) have an amount of overlap that exceeds a threshold. For example, the amount of overlap can be measured as a percentage of the entire face bounding shape.).  Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Goldner’s and Hicks’s teachings with McDonald’s teachings in order to perform face hallucination and/or screening (McDonald, par. 0001). 

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AYODEJI O AYOTUNDE whose telephone number is (571) 270-7983. The examiner can normally be reached Monday - Friday, 7:00am-3:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Yuwen Pan, can be reached at 571-272-7855. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/AYODEJI O AYOTUNDE/
Primary Examiner, Art Unit 2649