DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
This Office Action is responsive to communications filed on 08/31/2021. Claims 1-20 remain pending in the instant application. Claims 1 and 11 are still independent. A complete response to applicant's remarks follows below.
Response to Arguments
Applicant's arguments filed 08/31/2021 have been fully considered but they are not persuasive:
At page 3, applicant argues: However, Zass merely discloses in paragraph [108] that “obtaining spatial information associated with the audio data”, which comprises “location information related to the location of one or more sound sources associated with sounds present in the audio data”. Even though Zass mentions in paragraph [108] that “Some examples of location information may include information associated with one or more of: direction”, Zass fails to teach or disclose such “direction of the sound sources” can be used as a basis to capture “an image in the direction”. Therefore, Applicant respectfully submits Zass fails to teach or disclose the claimed features.
The Examiner respectfully disagrees. The “location information” includes the direction of audio/sound. Zass, as mentioned above, already discloses “Some examples of location information may include information associated with one or more of: direction”. More specifically, Zass discloses “obtaining spatial information (652) may comprise obtaining spatial information associated with the audio data.” One of ordinary skill in the art can easily ascertain that “information associated with the audio data” would include the direction of the one or more sound sources associated with sounds present in the audio data. Therefore, the Examiner maintains that the rejection over Zass is proper, and it will be maintained.
At page 3, applicant argues: “… regarding the features about “update of cluster identification” in claim 1, the Examiner cited paragraphs [119], [076] and [155-156] of Zass. However, as recited in paragraph [119] of Zass, “One or more portions of the audio data may be identified as associated a group of a plurality of speakers, for example where the group of a plurality of speakers does not include the wearer”. Applicant respectfully submits it is known from the above that what Zass identifies is “audio data” instead of “images” and therefore the cited paragraph of Zass fails to teach or disclose the features about “image identification” as recited in claim 1.”
The Examiner respectfully disagrees. As noted, Zass discloses at para [076]: “For example, visual data captured using image sensors 371 may be analyzed to identify one or more of: low level visual features; objects; faces; persons; events; visual triggers; and so forth. In another example, visual data captured using image sensors 371 may be applied to an inference model.” Zass does in fact teach obtaining information about visual data, including faces and persons.
Similarly at para [110], Zass further discloses “In some examples, a speaker location in 2D image and/or 2D video may be detected using detection algorithms, for example by face detection algorithms, by algorithms that detect lips movements, etc., and location information may be calculated, for example: a direction may be calculated based on the based on the speaker location in the 2D image and/or 2D video and/or the capturing parameters; a distance may be calculated based on the based on the speaker location in the 2D image and/or 2D video and/or the capturing parameters; and so on.” More fully at para [111], Zass goes on to 
At page 3, applicant argues: For at least the above reasons, Applicant respectfully submits that Zass fails to teach or suggest the features as highlighted in claim 1 above. Accordingly, withdrawal of the anticipation rejection on claim 1 is respectfully requested.
The Examiner respectfully disagrees. The Examiner has replied to each of applicant's arguments above, and applicant's arguments are not persuasive. The Examiner maintains that the rejection over Zass is proper, and it will be maintained.
With respect to claims 2-10 and 12-20, these claims are still objected to as being dependent upon a rejected base claim, and this objection will be maintained.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1 and 11 are rejected under 35 U.S.C. 102(a)(1) and/or (a)(2) as being anticipated by Zass (US 20180020285 A1).
Regarding Claim 1: Zass discloses a face recognition method (Refer to para [013]; “a system and a method for capturing and processing audio data from the environment of a person are provided.”) comprising: a sound receiving device, configured to detect a sound source (Refer to para [074]; “one or more audio outputting units 351 may be configured to output audio to a user, for example through a headset, through one or more audio speakers, and so forth. In some embodiments, the one or more visual outputting units 352 may be configured to output visual information to a user, for example through a display screen, through an augmented reality display system, through a printer, through LED indicators, and so forth.”) adapted to an electronic device having a sound receiving device (Refer to para [074]; “one or more audio outputting units 351 may be configured to output audio to a user, for example through a headset, through one or more audio speakers, and so forth. In some embodiments, the one or more visual outputting units 352 may be configured to output visual information to a user, for example through a display screen, through an augmented reality display system, through a printer, through LED indicators, and so forth.”) and an image capturing device, wherein the method (Refer to para [076]; “…the one or more image sensors 371 may be configured to capture visual data. Some possible examples of image sensors 371 may include: CCD sensors; CMOS sensors; stills image sensors; video image sensors; 2D image sensors; 3D image sensors; and so forth. Some possible examples of visual data may include: still images; video clips; continuous video; 2D images; 2D videos; 3D images; 3D videos; microwave images; terahertz images; ultraviolet images; infrared images; x-ray images; gamma ray images; visible light images; microwave videos; terahertz videos; ultraviolet videos; infrared videos; visible light videos; x-ray videos; gamma ray videos; and so forth. 
In some cases, visual data captured using image sensors 371 may be stored in memory, for example in memory units 320.”) comprises the following steps: detecting a direction of a sound source by using the sound receiving device to capture an image in the direction by using the image capturing device (Refer to para [108]; “obtaining spatial information (652) may comprise obtaining spatial information associated with the audio data. In some examples, the obtained spatial 
Regarding Claim 11: Zass discloses a face recognition apparatus (Refer to para [013]; “a system and a method for capturing and processing audio data from the environment of a person are provided.”) comprising: a sound receiving device, configured to detect a sound source (Refer to para [074]; “one or more audio outputting units 351 may be configured to output audio to a user, for example through a headset, through one or more audio speakers, and so forth. In some embodiments, the one or more visual outputting units 352 may be configured to output visual information to a user, for example through a display screen, through an augmented reality display system, through a printer, through LED indicators, and so forth.”) an image capturing device, configured to capture an image (Refer to para [076]; “…the one or more image sensors 371 may be configured to capture visual data. Some possible examples of image sensors 371 may include: CCD sensors; CMOS sensors; stills image sensors; video image sensors; 2D image sensors; 3D image sensors; and so forth. Some possible examples of visual data may include: still images; video clips; continuous video; 2D images; 2D videos; 3D images; 3D videos; microwave images; terahertz images; ultraviolet images; infrared images; x-ray images; gamma ray images; visible light images; microwave videos; terahertz videos; ultraviolet videos; infrared videos; visible light videos; x-ray videos; gamma ray videos; and so forth. In some cases, visual data captured using image sensors 371 may be stored in memory, for example in memory units 320.”) a processor, coupled to the sound receiving device and the image capturing device (Refer to para [070]; “one or more processing units 330 may be configured to execute software programs, for example software programs stored in the one or more memory units 320, software programs received through the one or more communication modules 340, and so forth. 
Some possible implementation examples of processing units 330 may comprise: one or more single core processors; one or more multicore processors; one or more controllers; one or more application .
Allowable Subject Matter
Claims 2-10 and 12-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
MacMillan (US 20170295318 A1) discloses “Other techniques for identifying relevant sub-frames do not necessarily depend on location data associated with the target and instead identify sub-frames relevant to a particular target based on the spherical media content (e.g., visual and/or audio content) itself. FIG. 8 illustrates an embodiment of a process for generating an output video relevant to a particular target based on audio/video processing. The video server 240 stores 802 a plurality of spherical videos. The video server 240 performs 804 image and/or video processing to automatically identify a target feature that meets specified audio and/or visual criteria. For example, in one embodiment, a facial recognition algorithm is performed on the spherical content to identify and track a particular target face. Alternatively, rather than tracking one particular face, the video server 240 may track regions in the spherical video where faces are generally present. In yet another embodiment, an object recognition and/or object tracking algorithm is performed to identify a region of the spherical video containing one or more particular objects. In yet another embodiment, a motion analysis may be performed to identify a region of motion having some particular characteristics that may be indicative of an activity of interest. For example, a motion thresholding may be applied to locate objects traveling according to a motion 
Hahn (US 20200219135 A1) discloses “information analyzing unit may track the user from the image, recognizes the user's facial expression, estimates the user's sex, and estimates the user's age by using an AI algorithm including an object tracking algorithm, a facial expression recognition algorithm, a sex estimating algorithm, and an age estimating algorithm.”
US 20200066294 A1
US 20170337438 A1
US 9633270 B1
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MIA M THOMAS whose telephone number is (571)270-1583. The examiner can normally be reached M-Th 8:30am-4:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Edward (Ed) Urban, can be reached at 571-272-7899. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

MIA M. THOMAS
Primary Examiner
Art Unit 2665



/MIA M THOMAS/
Primary Examiner
Art Unit 2665