DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Allowable Subject Matter
Claims 1-20 are allowed.
The following is an examiner's statement of reasons for allowance:
Independent claim 1 recites the uniquely distinct features for: “...receive a video stream, a plurality of audio signals and audiovisual metadata that defines a spatial relationship between images of said video stream and said plurality of audio signals that serve as basis for a spatial audio signal; determine presence of at least a first sound source and a second sound source depicted in an image of the video stream, wherein respective sounds originating from the first and second sound sources are to be represented in said spatial audio signal by a single directional sound component; determine a first zoom factor threshold for zooming said image of the video stream into a corresponding image of a video signal based at least in part on respective positions of said first and second sound sources in said image of the video stream in dependence of said audiovisual metadata; and zoom said image of the video stream into said corresponding image of the video signal in accordance with the first zoom factor threshold.”
Independent claim 14 recites the uniquely distinct features for: “...receiving a video stream, a plurality of audio signals and audiovisual metadata that defines a spatial relationship between images of the video stream and said plurality of audio signals that serve as basis for a spatial audio signal; determining presence of at least a first sound source and a second sound source depicted in an image of the video stream, wherein respective sounds originating from the first and second sound sources are to be represented in said spatial audio signal by a single directional sound component; determining a first zoom factor threshold for zooming said image of the video stream into a corresponding image of a video signal based at least in part on respective positions of said first and second sound sources in said image of the video stream in dependence of said audiovisual metadata; and zooming said image of the video stream into said corresponding image of the video signal in accordance with the first zoom factor threshold.”
Independent claim 20 recites the uniquely distinct features for: “...receiving a video stream, a plurality of audio signals and audiovisual metadata that defines a spatial relationship between images of the video stream and said plurality of audio signals that serve as basis for a spatial audio signal; determining presence of at least a first sound source and a second sound source depicted in an image of the video stream, wherein respective sounds originating from the first and second sound sources are to be represented in said spatial audio signal by a single directional sound component; determining a first zoom factor threshold for zooming said image of the video stream into a corresponding image of a video signal based at least in part on respective positions of said first and second sound sources in said image of the video stream in dependence of said audiovisual metadata; and zooming said image of the video stream into said corresponding image of the video signal in accordance with the first zoom factor threshold.”
The closest prior art, Cutler (US 2007/0019066 A1), teaches a method and system for producing normalized images of conference participants so that the participants appear to be approximately the same size when the images are displayed. In one embodiment, the normalizing
HABETS et al. (US 2017/0078819 A1) teaches concepts for achieving spatial sound recording and reproduction such that the recreated acoustical image may, e.g., be consistent with a desired spatial image, which is, for example, determined by the user at the far-end side or by a video-image. The proposed approach uses a microphone array at the near-end side, which allows the captured sound to be decomposed into direct sound components and a diffuse sound component. The extracted sound components are then transmitted to the far-end side. The consistent spatial sound reproduction may, e.g., be realized by a weighted sum of the extracted
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOSE M MESA whose telephone number is (571) 270-1706. The examiner can normally be reached Monday-Friday, 8:30 AM-6:00 PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Thai Tran, can be reached at 571-272-7382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 
1/5/2022
/JOSE M. MESA/
Examiner
Art Unit 2484




/THAI Q TRAN/
Supervisory Patent Examiner, Art Unit 2484