Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Allowable Subject Matter
The following is an examiner’s statement of reasons for allowance: The present invention relates to the field of videoconferencing, and in particular to a technique for group framing by a videoconferencing endpoint.
The prior art of record, alone or in combination, fails to teach or suggest the following elements of independent claims 1, 8, and 21 in combination with the other elements of those claims.
For example, independent claim 1 has claim limitations such as: receiving video data from a camera of a videoconferencing endpoint; performing face detection on the video data; saving detected faces for a first threshold time period; postprocessing the saved detected faces during the first threshold time period, wherein postprocessing the saved detected faces comprises: performing a first type of motion detection on regions around the saved detected faces; performing upper body detection on regions around the saved detected faces responsive to not detecting motion; and discarding saved detected faces responsive to the first type of motion detection and upper body detection detecting neither motion nor an upper body in the regions around the saved detected faces.
Independent claim 8 has claim limitations such as: a camera, disposed in the housing; a processing unit, disposed in the housing and coupled to the camera; a memory, disposed in the housing and coupled to the processing unit and the camera, in which are stored instructions for performing face detection and upper body detection, comprising instructions that when executed cause the processing unit to: receive video data from the camera corresponding to participants of a videoconference; perform face detection on the video data; save detected faces for a first threshold time period; postprocess the saved detected faces during the first threshold time period, wherein the instructions to cause the processing unit to postprocess the saved detected faces comprise instructions that when executed cause the processing unit to: perform a first type of motion detection on regions around the saved detected faces; perform upper body detection on regions around the saved detected faces responsive to not detecting motion; and discard saved detected faces responsive to the first type of motion detection and upper body detection detecting neither motion nor an upper body in the regions around the saved detected faces; and frame a group of participants of the videoconference based on the saved detected faces.
Independent claim 21 has claim limitations such as: receiving video data from a camera of a videoconferencing endpoint; performing face detection on the video data; saving detected faces for a first threshold time period; postprocessing the saved detected faces during the first threshold time period, wherein postprocessing the saved detected faces comprises: performing a first type of motion detection on regions around the saved detected faces; performing upper body detection on regions around the saved detected faces responsive to not detecting motion; and discarding saved detected faces responsive to the first type of motion detection and upper body detection detecting neither motion nor an upper body in the regions around the saved detected faces; and framing the group of participants based on the saved detected faces.
For the above reasons, independent claims 1, 8, and 21 and dependent claims 2-7, 9-14, and 22-27 are allowable.
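For illustration only, the postprocessing limitations recited above can be read as the following decision logic. This is a minimal sketch and not the applicant's implementation: the names `SavedFace`, `postprocess_saved_faces`, `detect_motion`, and `detect_upper_body` are hypothetical, the detectors are supplied by the caller, and the handling of faces older than the threshold is an assumption that the claims do not address.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical region layout: (x, y, width, height) in pixels.
Region = Tuple[int, int, int, int]


@dataclass
class SavedFace:
    region: Region   # region around the saved detected face
    saved_at: float  # time the face was saved, in seconds


def postprocess_saved_faces(
    faces: List[SavedFace],
    now: float,
    threshold_s: float,
    detect_motion: Callable[[Region], bool],
    detect_upper_body: Callable[[Region], bool],
) -> List[SavedFace]:
    """Return the saved faces that survive postprocessing."""
    kept = []
    for face in faces:
        # Assumption: faces saved longer ago than the threshold are dropped;
        # the claim language does not say what happens after the period ends.
        if now - face.saved_at > threshold_s:
            continue
        # First check: motion in the region around the saved face.
        if detect_motion(face.region):
            kept.append(face)
            continue
        # Upper body detection runs only when no motion was detected.
        if detect_upper_body(face.region):
            kept.append(face)
        # Neither motion nor an upper body: the saved face is discarded.
    return kept
```

The faces returned by such a routine would then be the ones used for framing the group of participants.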
Claims 1-14, 21-27 (now renumbered 1-21) are allowed.
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
In one embodiment, a video conference endpoint may detect one or more participants within a field of view of a camera of the video conference endpoint. The video conference endpoint may determine one or more alternative framings of an output of the camera of the video conference endpoint based on the detected one or more participants. The video conference endpoint may send the output of the camera of the video conference endpoint to one or more far-end video conference endpoints participating in a video conference with the video conference endpoint. The video conference endpoint may send data descriptive of the one or more alternative framings of the output of the camera to the far-end video conference endpoints. The far-end video conference endpoints may utilize the data to display one of the one or more alternative framings.
	--(US 2015/0296178A1) to Aarrestad et al. discloses use of face and motion detection for best view framing in video conference endpoint which teaches: A video conference endpoint detects faces at associated face positions in video frames capturing a scene. The endpoint frames the video frames to a view of the scene encompassing all of the detected faces. The endpoint detects that a previously detected face is no longer detected. In response, 
	--(US 2019/0058833A1) to Goonetilleke et al. discloses dynamic calibration of detection system for active areas of interest within video data which teaches: Techniques for calibrating a video detection system are disclosed. A videoconferencing system may include a sensor configured to capture sequential frames of video image data during a videoconference. A processing subsystem may be configured to, using a multidimensional filter, generate data indicative of active areas of interest detected within respective ones of a plurality of video frames captured by the sensor, wherein the multidimensional filter is configured to identify active areas of interest (AAOIs) based at least in part upon a programmable density threshold. The processing subsystem may further: determine, based on the data indicative of AAOIs, that the videoconferencing system is in an unstable state; perform a calibration routine to identify one or more threshold values that reduce system instability; and apply the one or more threshold values to the multidimensional filter during further generation of data indicative of active areas of interest.
[0093] As noted above, the process of AAOI detection begins at the pixel level based on frames received from an image sensor. Pixels are processed to determine whether they satisfy certain pixel-level filters, such as a motion filter (which may be determined by detecting a change in color of a pixel relative to one or more previous frames), a color filter (e.g., a skin tone filter representing a palette of colors likely to represent human skin operate to account for artifacts such as shadows or reflections; to apply a priori knowledge about the video context, such as the location of walls, windows, doors, furniture, or other features; to identify and distinguish different types of motion, such as the motion of participants vs. the motion of artifacts; or any other suitable types of filters or heuristic algorithms.
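As an illustration of the pixel-level motion filter described in paragraph [0093], which detects a change in color of a pixel relative to a previous frame, the following minimal sketch flags changed pixels between two frames. It is not taken from the Goonetilleke reference: the function name `motion_mask`, the sum-of-absolute-differences metric, and the default threshold of 30 are all assumptions chosen for the example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0-255
Frame = List[List[Pixel]]     # rows of pixels


def motion_mask(previous: Frame, current: Frame, threshold: int = 30) -> List[List[bool]]:
    """Flag pixels whose color changed by more than `threshold` since the previous frame."""
    mask = []
    for prev_row, curr_row in zip(previous, current):
        mask_row = []
        for (pr, pg, pb), (cr, cg, cb) in zip(prev_row, curr_row):
            # Assumed metric: sum of absolute per-channel differences.
            change = abs(cr - pr) + abs(cg - pg) + abs(cb - pb)
            mask_row.append(change > threshold)
        mask.append(mask_row)
    return mask
```

A mask of this kind would feed the later filtering stages that the paragraph mentions, such as distinguishing participant motion from artifacts.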

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MELUR RAMAKRISHNAIA H, whose telephone number is (571) 272-8098. The examiner can normally be reached on a flexible schedule.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc M Nguyen, can be reached on 27503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

/MELUR RAMAKRISHNAIA H/Primary Examiner, Art Unit 2651