DETAILED ACTION

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
This communication is responsive to the applicant's amendment dated 04/15/2022.  The applicant(s) amended claims 1 and 8-9, canceled claims 2-3, and added new claims 10-11 (see the amendment: pages 2-5).
The examiner withdrew the previous claim rejection under 35 USC 112(b), because the applicant amended the corresponding claim(s).
	
Response to Arguments
Applicant's arguments filed on 04/15/2022 with respect to the claim rejections under 35 USC 102 and 103 have been fully considered but are moot in view of the new ground(s) of rejection, since the amended claims introduce new issues and/or change the scope of the claims. Accordingly, the response to the applicant’s arguments (see Remarks: pages 7-11) is provided in the new claim rejections below, which rest on the new ground(s) necessitated by the amendment.
Claim Rejections - 35 USC § 103
Claims 1, 5 and 9-11 are rejected under 35 U.S.C. 103 as being unpatentable over FRITZ et al. (US 2019/0371304) hereinafter referenced as FRITZ in view of MURAI (US 2007/0120966). 
As per claim 9, FRITZ discloses ‘audio message extraction’ (title) providing ‘voice-enabled communication device’ including ‘one or more processors’ (p(paragraph)42-p43), comprising:
receiving (or ‘capture’) audio (‘audio signal’, ‘audio input data’) [when a plurality of users are present in a same area] (Note: the bracketed portion is a newly amended limitation), (p16, p18); 
specifying (read on ‘determine’ or ‘speaker identification’ for) a speaker (or the ‘member/user speaking’) [among the plurality of users] on the basis of the received audio (same above), (p17-p18); 
determining (or ‘detecting’), on the basis of the received audio (same above), whether or not a specified word (‘a wakeword’) for starting a reception of a predetermined command (read on ‘voice command or spoken request’, ‘request or command’, or ‘audio command’ in a broad sense, wherein the spoken ‘request may be any question, inquiry, instruction, phrase, or other set of one or more words/sounds’) is included in the audio (Fig. 4A, p17-p18, p22, p42); 
specifying (read on ‘recognize’ and/or ‘analyzing’), when (‘after’) it is determined that the specified word is included in the audio, a command (or ‘instruction’) on the basis of a command keyword (read on ‘word(s)’ or ‘phrase’ of ‘the user speaking’ as recognized ‘request or command’ or ‘audio command’) which is included in the audio and follows (‘after’) the specified word (wakeword), (Fig. 4A, p17-p18, p42); 
specifying (‘determine’ or ‘analyzing’), [among the plurality of users] on the basis of the content of the specified command (same above), a target user (read on ‘intended recipient’) with respect to which the command (or ‘instruction’) is to be executed (p18, p42); and 
executing (or ‘send’ or ‘transmitted’) the specified command (such as ‘send a message to Bob’ and/or ‘executing’ related ‘messaging application’ to ‘listen to’ or ‘view’ an ‘audio version’ or ‘text version’ of the ‘message’/ ‘content’ by using a related mechanism on ‘electronic device’) with respect to the specified target user (‘target recipient’), (Fig. 4B, p18, p32-p33). 
It is noted that FRITZ does not expressly disclose the step of receiving/capturing the audio “when a plurality of users are present in a same area” and the step of specifying the speaker “among the plurality of users.”  However, the same/similar concept/feature is well known in the art as evidenced by MURAI, who discloses ‘speaker predicting apparatus, speaker predicting method, and program product for predicting speaker’ (title), comprising ‘collect (receive)’ and ‘detects (specifies) a person (participant, or speaker) who is delivering a speech out of a plurality of persons (participants, or users)’ in a ‘conference room (the same area)’, using ‘camera(s)’ and ‘microphone(s)’ with ‘speaker recognition technique’ (Fig. 1, abstract, p20-p22, p26-p27, p36, p39-p41).  Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of FRITZ and MURAI by providing a mechanism of collecting/receiving audio/speech among multiple persons/users present in a room (such as a conference room) and detecting/specifying a speaker/person delivering speech among the multiple persons, as claimed, for the purpose (motivation) of improving accuracy of speaker recognition based on sound source in speaking and corresponding image and/or enhancing robustness of speaker recognition (MURAI: p49).

As per claim 1, it recites a system (apparatus). The rejection is based on the same reasons described for claim 9, because the apparatus and method claims are related as apparatus and method of using the same, with each claimed element's function corresponding to the claimed method step, wherein the combined teachings disclosed by FRITZ in view of MURAI also read on the limitations of claim 1.
As per claim 5 (depending on claim 1), even though FRITZ in view of MURAI further discloses “including a first terminal device (read on one of ‘102’, ‘124’, ‘126’ and ‘128’) corresponding to a first user (read on one of ‘users’, ‘recipients’ or ‘members’) and a second terminal device (read on another one of ‘102’, ‘124’, ‘126’ and ‘128’) corresponding to a second user (read on another one of ‘users’, ‘recipients’ or ‘members’) which are connected to each other via a network (‘104’), wherein the command executor displays (‘page displaying’ or ‘view’), when the target user specifier specifies the first user as the target user (such as a ‘recipient’ named ‘Bob’) with respect to which a first command (such as ‘send a message to Bob’) is to be executed (same above), a first content (read on ‘message content’, such as text version of ‘How are you doing?’ on ‘454’, in light of the specification: Fig. 4, p99) corresponding to the first command on the first terminal device (same above)”, (FRITZ: Figs. 1, 4B, p17-p19, p21, p33), FRITZ in view of MURAI does not expressly teach “the command executor displays, when the target user specifier specifies the first user and the second user as the target users with respect to which a second command is to be executed, a second content corresponding to the second command on the first terminal device and the second terminal device.”  However, it is noted that in the above teachings FRITZ uses multiple networked devices for displaying/viewing the text version of the message, and that FRITZ further discloses providing (or transmitting) the ‘text version’ of the ‘message’ (or ‘message payload’, i.e. content) for access by ‘target recipient(s)’ or multiple recipients (p36, p39). 
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the separate teachings of FRITZ (in view of MURAI) by providing a mechanism of displaying a message content on different devices (such as ‘124’ and ‘126’) for different target recipients (users) respectively based on a command (such as a command associated with two recipients), as claimed, because the implementation would be within the capability of the skilled person in the art and the result would be predictable.  
As per claim 10 (depending on claim 1), FRITZ in view of MURAI further discloses “wherein the speaker specifier specifies the speaker among the plurality of users (same above) by comparing (determining, identifying and recognizing) the audio received by the audio receiver with audio (such as ‘biometrics’ based on ‘sound of voice’ as ‘speaker identification’) of each of the plurality of users (‘members’ or ‘multiple users’) registered (or stored) in advance in a storage unit (‘a speaker identification module’ and/or ‘a user profile module’)” (FRITZ: p17, p35, p57; MURAI: p27, p39-p41). It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to recognize that the combined teachings of FRITZ and MURAI, including ‘automatically determine an identity of the speaker’ (as ‘one of members’) by ‘analyzing the speaker’s voice’ and ‘associated profile’ with ‘biometrics’ (i.e., ‘speaker identification based on sound of voice’), ‘voice recognition’, and/or ‘detects’/‘specifies the speaker with the use of speaker recognition technique’, would inherently include the characteristic/feature as claimed.
As per claim 11 (depending on claim 1), FRITZ in view of MURAI further discloses “a camera (such as a ‘102’ or ‘106’) is installed in each area (such as in room ‘100’), and the speaker specifier specifies a direction (such as a direction toward ‘participant 2’, which is assumed as a speaker) from which the audio was received by the audio receiver (same as stated for claim 3), on the basis of a direction that a microphone (such as ‘104’ corresponding to ‘participant 2’) of the audio processing device (such as ‘110’ combined with ‘102’, ‘104’, ‘106’) collected the audio, and specifies the speaker among the plurality of users on the basis of an image captured by the camera (such as ‘102’ or ‘106’ corresponding to ‘participant 2’) in the direction”, (MURAI: Figs. 1-2, ‘102’ and ‘106’, p20-p27, p52; ‘camera 106 whose image-capturing direction or the like is changeable according to a control signal may be provided to capture a scene in the conference room 100’; also FRITZ: p17, p40). Given the above teachings, it would have been obvious before the effective filing date of the claimed invention to combine the separate teachings of FRITZ and MURAI by providing a mechanism (such as ‘110’ combined with ‘102’, ‘104’ and ‘106’) of controlling/specifying a direction (such as a direction toward the speaking participant, i.e. a speaker) from which the audio/voice was received by an ASR based on the direction in which the associated microphone collected the audio/voice, and detecting/specifying a speaker among multiple participants/users based on an image captured by a camera in the direction, as claimed, for the purpose (motivation) of improving accuracy of speaker recognition based on sound source in speaking and corresponding image and/or enhancing robustness of speaker recognition (MURAI: p49).

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over FRITZ in view of MURAI as applied to claim 1, and further in view of HIMMELSTEIN (US 2009/0070434). 
As per claim 4 (depending on claim 1), even though FRITZ in view of MURAI further discloses “the target user specifier specifies, when the command causes predetermined content (such as ‘message content’ in light of the specification: Fig. 4, p99) to be displayed and the content is set with a user browsing (such as ‘page displaying’ or ‘view’ with ‘web browser’)” and “the target user as a user (‘recipient’) to browse the content” (FRITZ: p33, p71), FRITZ in view of MURAI does not expressly disclose that the content is set with a user browsing “permission” and that the target user is a user (‘recipient’) having “permission” to browse the content.  However, the same/similar concept/feature is well known in the art as evidenced by HIMMELSTEIN, who discloses ‘system and method for efficiently accessing internet resources’ (title) for ‘providing access to information’, comprising ‘receiving a selection of information to be transmitted to the target user; receiving authorization to transmit the information; transmitting the information to the target user’ (p15), and to ‘access website or invoke an e-mail application’ for which ‘user has permission to access’ (p40). Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of FRITZ, MURAI and HIMMELSTEIN by providing a mechanism of displaying (or viewing, accessing) a message content that is set with a user browsing (accessing/viewing) permission/authorization (such as using a browser), wherein the target user (or recipient) has permission/authorization to access and view (browse) the content, for the purpose (motivation) of effectively accessing related information and/or offering an improved way of accessing the internet for general uses (HIMMELSTEIN: abstract, p2 and p43). 

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over FRITZ in view of MURAI, as applied to claim 1, and further in view of RATHOD et al. (US 2018/0351895). 
As per claim 6 (depending on claim 1), even though FRITZ in view of MURAI further discloses that “each target terminal device (‘124’, ‘126’, or ‘128’) with respect to which the command specified by the command specifier is to be executed (same as stated for claim 9), wherein the command executor (read on a mechanism of ‘executing’ as stated for claim 9) is provided in each of the terminal devices, a first command executor of a first terminal device (same as stated for claim 9, such as one of ‘124’, ‘126’ and ‘128’ as described for ‘450’) executes…the first command” (same above), and “a second command executor of a second terminal device executes (such as another one of ‘124’, ‘126’ and ‘128’ as described for ‘450’) …the second command (same as or similar to the first command but used in the another device)” (FRITZ: Figs 1, 4B, p19, p21, p33), FRITZ in view of MURAI does not expressly disclose “a command storage area” for the target terminal devices, and the first and second commands being “registered in a first command storage area” corresponding to the first and second terminal devices respectively.  
However, the same/similar concept/feature is well known in the art as evidenced by RATHOD, who discloses ‘in the event of selection of message, invoking camera to enabling to capture media and relating, attaching, integrating, overlay message with/on/in captured media and send to message sender’ (title), comprising detecting and recognizing ‘objects in photo or video as per message associated instruction based on object recognition, face and body detection, voice recognition, optical character recognition technologies’, employing ‘object recognition’, ‘voice recognition’ and ‘pattern matching technologies’ to ‘match instruction or task message associated keywords with recognized objects in received instruction or task message…’ (p62 and p125), and comprising ‘database’ for storing and/or providing ‘registered user's profile, accounts, logged activities, indexes, messages or tasks or instructions or requests for sending to target recipients…’ (p88-p91).  It is also noted that FRITZ discloses ‘backend server’ including ‘modules’ to ‘store’ ‘instructions and/or commands’, ‘user information’ and/or ‘user profile module’ (p57, p59), wherein the ‘user profile can be associated with multiple devices, and a separate device or communications profile’ which ‘can have a 1:1 relationship with a user profile or a 1:many relationship’ (p19).  Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of FRITZ, MURAI and RATHOD by providing a mechanism (such as a database in a server, read on a command storage area, in light of the specification: Fig. 2) of storing/providing registered user data (or information or profiles) including user/recipient related instructions (or commands) and corresponding devices for respective executions, as claimed, because the implementation based on the combination of the above teachings would be within the capability of the skilled person in the art and the result would be predictable.

Claims 7-8 are rejected under 35 U.S.C. 103 as being unpatentable over FRITZ in view of MURAI as applied to claim 1, and further in view of WANG et al. (US 2019/0251961) hereinafter referenced as WANG. 
As per claim 7 (depending on claim 1), even though FRITZ further discloses a situation in which “a first user (such as a ‘user’/‘speaker’) is specified (‘determined’ or ‘identified’) by the speaker specifier (such as by using ‘speaker identification’ based on ‘sound of voice’ other than ‘wakeword’, ‘facial recognition’, ‘fingerprint scanner’, ‘fingerprint ID’, and/or ‘a combination thereof’) as the speaker of the audio received by the audio receiver (such as microphone)” (p16-p17) and “audio determinator (such as ‘wakeword detector’ with ‘list of wakewords’) determines (or ‘detect’, ‘indicate’) the audio spoken by the first user does not include the specified word” (by generating ‘a true/false output’), (FRITZ: p47-p49), FRITZ in view of MURAI does not expressly disclose “an audio transmitter that transmits the audio received by the audio receiver to a second user” in the above situation.  However, the same/similar concept/feature is well known in the art as evidenced by WANG, who discloses ‘transcription of audio communication to identify command to device’ (title), comprising recognizing ‘voice commands’ during ‘a telephone or video conferencing call,’ ‘providing a subsequent user interface for a user to acknowledge or disregard actions’ based on ‘a potential voice command received during the call’, and determining whether the ‘audio’ captured by ‘microphone (receiver)’ is ‘just regular speech for which a command should not be executed’ and is ‘used as to conduct the call’ (wherein a mechanism of conducting the call is read on the transmitter) (p10-p11, p39, p46).  Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to recognize that a captured audio piece determined/detected as regular speech (i.e. not including keyword/wakeword or command word/phrase) during a telephone conferencing call would be used to conduct the call by being output/transmitted to the other party's device (i.e. second device), and to combine the teachings of FRITZ, MURAI and WANG by providing a mechanism of transmitting (outputting) an audio piece (captured by microphone/receiver) detected/determined as regular speech (i.e. not keyword/wakeword or command word/phrase), as used to conduct the call, to the other party's device during a telephone conferencing call, as claimed, for the purpose (motivation) of offering adequate solutions of permitting voice command recognition during a telephone or conferencing call (WANG: p1, p10). 
As per claim 8, it recites a system (apparatus). The rejection is based on the same reasons described for claim 9, because the apparatus and method claims are related as apparatus and method of using the same, with each claimed element's function corresponding to the claimed method step, wherein the combined teachings of FRITZ, MURAI and WANG also read on or satisfy the additional limitations regarding ‘conferencing’ (same/similar as stated for claim 7) and ‘audio process device and a display device placed in each area’ (FRITZ: Figs. 1, 4B, p17-p19, p21, p33; MURAI: Fig. 1, p18-p21; WANG: Figs. 2, 4-5, p10, p32, wherein the obviousness/motivation analysis is the same as or similar to that described for claims 9 and/or 7; Note: any additional limitation(s) recited in the preamble but lacking connection with the claim body could also be interpreted as intended use or field of use without being given patentable weight).
 
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 


Any inquiry concerning this communication or earlier communications from the examiner should be directed to QI HAN whose telephone number is (571)272-7604.  The examiner can normally be reached on 9-19:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir can be reached on 571-272-7799.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

QH/qh
June 9, 2022
/QI HAN/Primary Examiner, Art Unit 2659