DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5 are rejected under 35 U.S.C. 103 as being unpatentable over Gomez et al., US Patent Publication 2012/0317484 in view of Yeung et al., US Patent Publication 2020/0236278.
Regarding independent claim 1, Gomez et al. teaches a video display apparatus (shown in figures 1-3) comprising: 
a video display unit configured to display a video image in a field of view of a user (paragraph 0038 explains how the eyeglasses include display elements that display a video in the field of view of the user that is given in paragraphs 0025 and 0029 to be video streams); 
an image pickup unit (video camera 120 of figure 2 and paragraphs 0025 and 0035) configured to take a video image of an outside state (point-of-view video feed described in paragraphs 0025 and 0035); 
a switching unit configured to selectively perform switching as to whether to display a first option or a second video image on the video display unit (paragraph 0030 describes the switching of modes), the second video image being the video image taken by the image pickup unit (paragraph 0030 describes the default content as being the point-of-view video feed); and 
a switching control unit (input selection module 616 of figure 6) configured to control the switching unit (as described in paragraph 0059), wherein 
the switching control unit switches the video image displayed on the video display unit when the switching control unit performs image recognition for the video image taken by the image pickup unit (paragraph 0069 explains that “system may select an input source for the multimode input field based on implicit information extracted from input data from the various possible input sources. This implicit information may correspond to certain data patterns in the input data.” and paragraph 0076 explains how input selection is based upon detection of a sequence of frames in the video of the video camera to monitor data including fixation on an object, meaning that the object is recognized from image recognition) and recognizes that an object with a predetermined identification pattern registered in advance (paragraph 0029 explains how an object registered in advance, such as a face to recognize, may be used so that an instruction may be provided, which performs object recognition on image or video content enclosed in the multimode input field, and further performs an image-based search on any object that is detected) is included in the video image taken by the image pickup unit (paragraph 0076 explains that an input source may be selected based on the pattern of objects in the video camera where the identification pattern is the same object recognized as being repeatedly in focus) and the switching control unit has detected that a user's hand has made a predetermined gesture (paragraphs 0105-0106 describe the gesture input that is received by the input selection module 616 of figure 6 to indicate the input source to be used).
While Gomez et al. teaches the switching of input content based on the gesture in paragraphs 0105-0106, Gomez et al. does not explicitly teach a switching of video content based on the gesture. Gomez et al. does not teach using gestures to switch between a first video image or a second video image on the video display unit, the first video image being a video image of a certain content. Yeung et al. teaches using gestures to switch between a first video image or a second video image on the video display unit, the first video image being a video image of a certain content (paragraph 0038 recites “User interface 70 uses this input to determine which client stream should be the active stream (switch between videos 74)” where the input is given earlier to be a touch or gesture input and the videos are of a certain content).
It would have been obvious to one of ordinary skill before the effective filing date to include the concept of switching video displays based on a gesture as taught by Yeung et al. in the system of Gomez et al. The rationale to combine would be to allow a viewer to control the video playback of all of the streams (paragraphs 0034 and 0038 of Yeung et al.).
Regarding claim 2, Gomez et al. teaches the video display apparatus according to Claim 1, wherein, when the switching control unit has detected a gesture made by the user within a predetermined range in the video image taken by the image pickup unit, the switching control unit switches the video image displayed in the video display unit (paragraphs 0105-0106 describe the gesture input that is received by the input selection module 616 of figure 6 to indicate the input source to be used that must be within the range to be recognized by the device).
Regarding claim 3, Gomez et al. teaches the video display apparatus according to Claim 1, wherein 
when the switching control unit has detected the user's hand within a shooting range of the image pickup unit (gesture to trigger a viewfinder mode as given in paragraph 0029), the switching control unit displays the second video image while superimposing it on the first video image (paragraph 0029 describes the superimposing or overlaying of the content over the video display in the HMD), and 
wherein when the switching control unit has detected a gesture made by the user within a predetermined range in the video image taken by the image pickup unit (gesture recognized within the range as given in paragraphs 0105-0106), the switching control unit displays the second video image in the video display unit (paragraphs 0105-0106 describe the gesture input that is received by the input selection module 616 of figure 6 to indicate the input source to be used that is switched to the appropriate input source content).  
Regarding independent claim 4, Gomez et al. teaches a method for controlling a video display apparatus (shown in figures 1-3) comprising: 
displaying a video image in a field of view of a user (paragraph 0038 explains how the eyeglasses include display elements that display a video in the field of view of the user that is given in paragraphs 0025 and 0029 to be video streams); 
taking an image of an outside state by using an image pickup unit (point-of-view video feed taken by video camera 120 of figure 2 as described in paragraphs 0025 and 0035); and 
switching (paragraph 0030 describes the switching of modes), when an image recognition for the video image taken by the image pickup unit is performed (paragraph 0069 explains that “system may select an input source for the multimode input field based on implicit information extracted from input data from the various possible input sources. This implicit information may correspond to certain data patterns in the input data.” and paragraph 0076 explains how input selection is based upon detection of a sequence of frames in the video of the video camera to monitor data including fixation on an object, meaning that the object is recognized from image recognition) and an object with a predetermined identification pattern registered in advance (paragraph 0029 explains how an object registered in advance, such as a face to recognize, may be used so that an instruction may be provided, which performs object recognition on image or video content enclosed in the multimode input field, and further performs an image-based search on any object that is detected) is included in the video image taken by the image pickup unit (paragraph 0076 explains that an input source may be selected based on the pattern of objects in the video camera where the identification pattern is the same object recognized as being repeatedly in focus) and it is detected that a user's hand has made a predetermined gesture (paragraphs 0105-0106 describe the gesture input that is received by the input selection module 616 of figure 6 to indicate the input source to be used), a video image to be displayed in the field of view of the user to one of a first content and a second video image (paragraph 0030 describes the switching of modes), and the second video image being the video image taken by the image pickup unit (paragraph 0030 describes the default content as being the point-of-view video feed).  
While Gomez et al. teaches the switching of input content based on the gesture in paragraphs 0105-0106, Gomez et al. does not explicitly teach a switching of video content based on the gesture. Gomez et al. does not teach a video image to be displayed in the field of view of the user to one of a first video image and a second video image that corresponds to the detected gesture, the first video image being a video image of a certain content. Yeung et al. teaches a video image to be displayed in the field of view of the user to one of a first video image and a second video image that corresponds to the detected gesture, the first video image being a video image of a certain content (paragraph 0038 recites “User interface 70 uses this input to determine which client stream should be the active stream (switch between videos 74)” where the input is given earlier to be a touch or gesture input and the videos are of a certain content).
It would have been obvious to one of ordinary skill before the effective filing date to include the concept of switching video displays based on a gesture as taught by Yeung et al. in the system of Gomez et al. The rationale to combine would be to allow a viewer to control the video playback of all of the streams (paragraphs 0034 and 0038 of Yeung et al.).
Regarding independent claim 5, Gomez et al. teaches a non-transitory computer readable medium storing a program for causing a computer provided in a video display apparatus to perform processes (as described in paragraphs 0007-0008 and 0034) comprising: 
displaying a video image in a field of view of a user (paragraph 0038 explains how the eyeglasses include display elements that display a video in the field of view of the user that is given in paragraphs 0025 and 0029 to be video streams); 
taking an image of an outside state by using an image pickup unit (point-of-view video feed taken by video camera 120 of figure 2 as described in paragraphs 0025 and 0035); and 
switching (paragraph 0030 describes the switching of modes), when an image recognition for the video image taken by the image pickup unit is performed (paragraph 0069 explains that “system may select an input source for the multimode input field based on implicit information extracted from input data from the various possible input sources. This implicit information may correspond to certain data patterns in the input data.” and paragraph 0076 explains how input selection is based upon detection of a sequence of frames in the video of the video camera to monitor data including fixation on an object, meaning that the object is recognized from image recognition) and an object with a predetermined identification pattern registered in advance (paragraph 0029 explains how an object registered in advance, such as a face to recognize, may be used so that an instruction may be provided, which performs object recognition on image or video content enclosed in the multimode input field, and further performs an image-based search on any object that is detected) is included in the video image taken by the image pickup unit (paragraph 0076 explains that an input source may be selected based on the pattern of objects in the video camera where the identification pattern is the same object being recognized as repeatedly in focus) and it is detected that a user's hand has made a predetermined gesture (paragraphs 0105-0106 describe the gesture input that is received by the input selection module 616 of figure 6 to indicate the input source to be used), a video image to be displayed in the field of view of the user to one of a first content and a second video image (paragraph 0030 describes the switching of modes), and the second video image being the video image taken by the image pickup unit (paragraph 0030 describes the default content as being the point-of-view video feed).
While Gomez et al. teaches the switching of input content based on the gesture in paragraphs 0105-0106, Gomez et al. does not explicitly teach a switching of video content based on the gesture. Gomez et al. does not teach a video image to be displayed in the field of view of the user to one of a first video image and a second video image that corresponds to the detected gesture, the first video image being a video image of a certain content. Yeung et al. teaches a video image to be displayed in the field of view of the user to one of a first video image and a second video image that corresponds to the detected gesture, the first video image being a video image of a certain content (paragraph 0038 recites “User interface 70 uses this input to determine which client stream should be the active stream (switch between videos 74)” where the input is given earlier to be a touch or gesture input and the videos are of a certain content).
It would have been obvious to one of ordinary skill before the effective filing date to include the concept of switching video displays based on a gesture as taught by Yeung et al. in the system of Gomez et al. The rationale to combine would be to allow a viewer to control the video playback of all of the streams (paragraphs 0034 and 0038 of Yeung et al.).
Response to Arguments
Applicant's arguments filed 7/28/22 have been fully considered but they are not persuasive. 
Applicant contends that neither of the references teaches switching based on both the detection of a predetermined input pattern and the detection of a gesture in order to prevent switching that is unintended by the user. The examiner disagrees. Gomez et al. teaches switching based on the detection of a predetermined input pattern, as given in paragraphs 0029, 0069, and 0076, and switching of input content based on the gesture in paragraphs 0105-0106. Yeung et al. teaches switching based on a gesture, as given in paragraph 0038. Since gesture control is mentioned in Gomez et al. but not fully explained, the teachings of Yeung et al. could obviously be incorporated into the system without impermissible hindsight. Thus, the combination of references renders obvious switching based on both the detection of a predetermined input pattern and the detection of a gesture. While the purpose of preventing switching that is unintended by the user is not explicitly stated, the same purpose is served by the system.
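For illustration only, the two-condition switching discussed above may be sketched as follows. This is a minimal sketch and is not taken from the application, Gomez et al., or Yeung et al.; the names FrameObservation, should_switch, and next_source, and the source labels "content" and "camera", are hypothetical. It assumes that some upstream image recognition step has already reduced each camera frame to two boolean results.

```python
from dataclasses import dataclass


@dataclass
class FrameObservation:
    """Hypothetical per-frame recognition results (assumed inputs)."""
    pattern_recognized: bool  # object with the registered identification pattern is in the frame
    gesture_detected: bool    # user's hand has made the predetermined gesture


def should_switch(obs: FrameObservation) -> bool:
    # Switching requires BOTH conditions; either one alone is not enough.
    return obs.pattern_recognized and obs.gesture_detected


def next_source(current: str, obs: FrameObservation) -> str:
    """Toggle between the first video image ("content") and the camera feed ("camera")."""
    if should_switch(obs):
        return "camera" if current == "content" else "content"
    return current


if __name__ == "__main__":
    # A gesture without the registered pattern leaves the display unchanged.
    print(next_source("content", FrameObservation(False, True)))
    # Pattern and gesture together switch the display.
    print(next_source("content", FrameObservation(True, True)))
```

In this sketch, requiring both results before toggling is what keeps a stray gesture, or the mere presence of the registered object, from changing the displayed image on its own.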
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The references cited in the attached notice of references cited each teach similar material with video content switched based on a detected gesture.

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PARUL H GUPTA whose telephone number is (571)272-5260. The examiner can normally be reached Monday through Friday, from 10 AM to 7 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ke Xiao, can be reached at 571-272-7776. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PARUL H GUPTA/Primary Examiner, Art Unit 2627