DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 2, 3 and 9 (Cancelled).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5, 7-8 and 10-12 are rejected under 35 U.S.C. 103 as being unpatentable over Goyal in view of Moon et al (US 2021/0084378) or Lin et al (US 2020/0098354).
Claims 1, 11 and 12. Goyal teaches an information processing device, a method and a non-transitory storage medium comprising: 
an image acquirer that acquires a captured image captured by an imager; (Goyal: endpoint 103 may include camera 204 (e.g., an HD camera) for acquiring video images of the participant location (e.g., of participant 214), [0036]);
(Goyal: endpoints 103 that are currently identified as an N talker (e.g., with a detected human voice) being displayed in composite video image 405, [0076]);
a display target identifier that identifies a display target corresponding to the utterer identified by the utterer identifier from the captured image acquired by the image acquirer; (Goyal: Video images acquired by camera 204 may be displayed locally on display 201 and may also be encoded and transmitted to other videoconferencing endpoints 103 in the videoconference, [0036-0038]); 
a display processor that displays display information corresponding to the display target identified by the display target identifier, on a first display. (Statistics and other information may also be displayed in the video images 455 of composite video image 405. For example, participant location, codec type, user identifiers, etc. may be displayed in respective video images 455 of composite video image 405, [0067, 0076]); and 
(i)	Moon teaches “a voice receiver that receives a voice” (a voice signal receiver obtains an utterance, i.e., a command, and a keyword associated with the plurality of contents is extracted from the utterance, [0009]); “wherein when a preliminarily registered keyword is included in an utterance content corresponding to the voice received by the voice receiver, the display target identifier identifies the display target based on the keyword” (the keyword analysis module 314 may determine the intent of the user by matching the meaning of the grasped word to the intent. In addition, the pre-stored keyword matched with a word among the extracted words may be determined as a keyword of the utterance, [0041, 0055-0056, 0067]; and displayed through the display 130, [0060]).
(ii)	Lin teaches “a voice receiver that receives a voice” (audio receiver, [0111]); “wherein when a preliminarily registered keyword is included in an utterance content corresponding to the voice received by the voice receiver, the display target identifier identifies the display target based on the keyword” (The voice control application may determine, based on the comparing, whether a first utterance of the plurality of utterances (e.g., the plurality of utterances in speech 112 or speech 116) matches the keyword. For example, the voice control application may compare the text of each utterance of the plurality of utterances to the keyword and may determine whether the text of any utterance of the plurality of utterances matches the keyword. When the voice control application determines that text of an utterance of the plurality of utterances matches the keyword. For example, the voice control application may determine that the keyword is a name associated with voice activated device 106 or voice assistant 102 (e.g., "Tom"). The voice control application may compare an utterance of the user ("Tom") to the keyword and may determine whether the utterance matches the keyword. For example, when the voice control application determines that a transcription of the voice input (e.g., the voice input corresponding to speech 112 or speech 116) comprises a word (e.g., utterance) matching the keyword, that one of user 110 or user 114 spoke the keyword, [0070]. Please note that the uttered keyword is compared/matched with the keyword in the memory, [0069], and displayed on display 300, [0099-0101]).

Therefore, it would have been obvious to the ordinary artisan before the effective filing date to incorporate the teaching of Moon or Lin into the teaching of Goyal for the purpose of precisely matching an uttered keyword with a pre-stored keyword by matching the meaning of the grasped word to the intent.
Claim 5. The information processing device according to claim 1, wherein if the display target identified by the display target identifier is a person different from the utterer, the display processor displays an image of the utterer included in the captured image and an image of the person, on the first display in a side-by-side manner. (Goyal: The master endpoint 103a may send composite video image 405 to other endpoints 103 in the videoconference, [0042]. See fig. 7a, 455f for side-by-side display).
Claim 7. The information processing device according to claim 1, wherein if the display target identified by the display target identifier is a display screen of a second display, the display processor displays a display content displayed on the display screen, on the first display, based on display data corresponding to the display content. (Goyal sends video images to other endpoints; Kato furthermore teaches the transceiving of content data, such as image data and/or sound data, from another terminal, [0058]. See also [0060] and fig. 2).
Claim 8. The information processing device according to claim 5, wherein the display processor further displays identification information corresponding to the display (Goyal: participant location, codec type, user identifiers, etc. may be displayed in respective video images 455 of composite video image 405, [0067]).
Claim 10. The information processing device according to claim 1, wherein the display processor continuously displays the display information on the first display since the display information is displayed on the first display until a predetermined time period passes or a display target different from the display target is identified by the display target identifier. (Goyal: Video images acquired by camera 204 may be displayed locally on display 201 and may also be encoded and transmitted to other videoconferencing endpoints 103 in the videoconference, [0036]).

Claims 4, 6 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Goyal in view of Moon or Lin further in view of Johnson et al (US 2015/0109191).
	Claim 4. The information processing device according to claim 1, wherein the display target identifier identifies a line-of-sight direction of the utterer, based on the captured image, and identifies the display target from the captured image, based on the identified line-of-sight direction and the keyword.
Goyal teaches everything in claim 4 but the line-of-sight direction and the keyword.
The examiner maps the line-of-sight direction to a line of eye gaze. (Please note that Moon teaches the keyword, as demonstrated in claim 1.)

Claim 6. The information processing device according to claim 1, wherein if the display target identified by the display target identifier is an object, the display processor displays an image of the object included in the captured image, on the first display and does not display an image of the utterer included in the captured image, on the first display. (Goyal does not teach that the display target identified by the display target identifier is an object. Johnson teaches that the target is object 332 in Fig. 3B within the gaze direction 344. Goyal further teaches, “the master endpoint may request the endpoints not being displayed to stop sending video to help conserve the resources on the master endpoint. In some embodiments, the master endpoint may ignore (e.g., not decode) video streams from endpoints that are not being displayed,” [0006, 0042]).
Therefore, it would have been obvious to the ordinary artisan before the effective filing date to incorporate the teaching of Johnson into the teaching of Goyal for the purpose of activating or deactivating the voice based on the gaze direction being within the range of voice-activation gaze directions or within the range of social-cue directions for effective communication or bandwidth saving.
Claim 13. (New) The information processing device according to claim 4, wherein the display target identifier identifies the display target by preferentially using the utterance content rather than the line-of-sight direction. (Moon: to obtain an utterance for making a request for at least one content to the contents providing service through the voice signal receiver, [0007, 0024]).

Claims 14 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Goyal in view of Moon or Lin further in view of Johnson et al (US 2015/0109191) and further in view of Yasuda et al (US 2018/0046335).
Claim 14. (New) The combined teaching of Goyal in view of Moon or Lin and Johnson teaches everything in claim 14 except the features within the double square brackets: “wherein the display target identifier determines [[priority]] of the line-of-sight direction and the utterance content depending on [[a time during which the line-of-sight direction is directed]].”
Yasuda teaches, “As a method that performs the determination of selection of an object through only a line of sight, when a line-of-sight position is included in a region corresponding to an object on a display screen for a predetermined given period of time,” [0047]; “However, when the method that performs the determination of selection of an object through only a line of sight is employed… an erroneous detection may occur,” [0048]; “a selected object is determined by a particular utterance of speech or a particular gesture,” [0049]. Here, the examiner maps the selection of a content, i.e., an object, by utterance to having a high priority.

“The display of the guidance objects GO1 to GO14 allows the user to direct the line of sight previously to a position at which an object associated with a desired process is displayed, before the selection start operation is performed on an operation device. Thus, the display of the guidance objects GO1 to GO14 allows the user to select an object in a short time,” [0071].

Therefore, it would have been obvious to the ordinary artisan before the effective filing date to incorporate the teaching of Yasuda into the teaching of Goyal for the purpose of selecting the best activating operation over the others, depending on the amount of time available, to ensure the reliability of content presented by either line of sight or by utterance of an object.
Response to Arguments
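Applicant’s arguments with respect to the current claim(s) have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.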
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PHUNG-HOANG J. NGUYEN whose telephone number is (571)270-1949.  The examiner can normally be reached on Reg. Sched. 6:00-3:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/PHUNG-HOANG J NGUYEN/
Primary Examiner, Art Unit 2651