DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

In the response to this Office Action, the Examiner respectfully requests that support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line numbers in the specification and/or drawing figure(s). This will assist the Examiner in prosecuting this application.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 04/05/2022 has been entered.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees.  A nonstatutory double patenting rejection is appropriate where the claims at issue are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the reference application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO internet Web site contains terminal disclaimer forms which may be used.  Please visit http://www.uspto.gov/forms/.  The filing date of the application will determine what form should be used.  A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission.  For more information about eTerminal Disclaimers, refer to http://www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.  

Claim 1 is rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent 9,958,947 B2. Although the claims at issue are not identical, they are not patentably distinct from each other.
The following is an example for comparing claim 1 of this application and claim 1 of U.S. Patent 9,958,947 B2:
Claim 1 of this application:
A method comprising: determining a position, in content output on a first output device, that a user is ocularly focused on; determining, based on the determined position, a confidence value; and causing output, on a second output device different from the first output device, during a time period in which the content is output on the first output device, of the content and a visual indication, based on the confidence value, of the determined position.

Claim 1 of U.S. Patent 9,958,947 B2:
1. A method comprising:
causing, by a computing device, display of an image via a first display device during a time period in which the image is displayed via a second display device;
determining a position, in the image displayed via the first display device, that a user is ocularly focused on;
determining a confidence value associated with the determined position; and
sending information configured to cause display, via the second display device, of a visual indication of the position, wherein a location of the visual indication is based on the determined position, in the image displayed via the first display device, that the user is ocularly focused on, and wherein the visual indication is based on the confidence value.


The limitations of claim 1 of the current application are broader and are therefore anticipated by those found in claim 1 of U.S. Patent 9,958,947 B2.
Claim 1 is similarly rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent 10,394,336 B2. Although the claims at issue are not identical, they are not patentably distinct from each other.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4-5, 7-8, 21-23, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Patent Application Publication 2006/0109237 A1 to Morita et al. (hereinafter "Morita") in view of U.S. Patent 8,914,472 B1 to Lee et al. (hereinafter "Mendis"), and further in view of U.S. Patent Application Publication 2013/0154918 A1 to Vaught et al. (hereinafter "Vaught").
Regarding Claim 1, Morita teaches a method comprising: determining a position, in content output on a first output device, that a user is ocularly focused on and causing display of the content and a visual indication of the determined position (Figs. 1-3; Para. 37-67 of Morita; eye or gaze tracking capability of the headset 210 and/or processor 230 may be used to control a display device, such as the display device 110. For example, the processor 230 detects when a user is looking at a certain button, option or feature on a display and selects or activates the button, option or feature for the user… control of the display device 110 may be shared by multiple users and common information displayed on display device 110 so that multiple users may be accommodated at approximately the same time… a gaze tracking system may be used to indicate a location on the display device 110. That is, a gaze tracking system, such as one using the motion tracking device 120, may determine a focus of the user's gaze. The tracking system may also determine a “dwell time” or length of time that the user focuses on a location. If a user focuses on a location for at least a certain period of time, the tracking system may position a cursor at that location on the display device 110… gazing at a certain location for a certain length of time generates a “roll-over” or overlay of supporting information at the display device 110).
Morita does not explicitly disclose determining, based on the determined position, a confidence value; and causing output, on a second output device different from the first output device, during a time period in which the content is output on the first output device, of the content and a visual indication, based on the confidence value, of the determined position.
However, Mendis teaches causing output, on a second output device different from a first output device, during a time period in which the content is output on the first output device, of the content and a visual indication of a determined position (Figs. 3A-3E; Col. 13, ln. 40 to Col. 14, ln. 59 of Mendis; during the experience-sharing session, the novice's HMD 312 can transmit a video, or any other media content, in-real time to the expert's HMD 302. Accordingly, while looking at the HMD 302, the expert can see not only his own hand performing a task, but also a real-time overlayed video showing the novice's hand performing the task… In an experience-sharing session, media content generated at a first sharing device can be presented to a second sharing device according to the second sharing device's perspective… server system 320 can also receive information in real-time that is indicative of a perspective of the novice's HMD 312. For example, the information can include gyroscopic information that indicates how the novice's HMD 312 is oriented, location information that indicates where the novice's HMD 312 is located, gaze information that indicates a gaze direction of the novice's eye(s)
Fig. 5 of Mendis; At block 506, during the observation phase, the server system can receive second media content from the second HMD and send the second media content in real-time to the first HMD. The second media content can include a point-of-view video recorded at the second HMD…. during the observation phase, the server system can receive information indicative of one or more sensors of the second HMD. The server system can send the information in real-time to the first HMD… the information can include a location of the second HMD).
 Therefore, at the time when the invention was filed, it would have been obvious to a person of ordinary skill in the art to include causing output, via a second output device different from the first output device, of the content using the teachings of Mendis in order to modify the method taught by Morita. The motivation to combine these analogous arts would have been for facilitating an experience-sharing session in real-time between a first head-mountable display (HMD) and a second HMD (Abstract of Mendis).
The combination of Morita and Mendis does not explicitly disclose determining, based on the determined position, a confidence value; and causing output of a visual indication, based on the confidence value, of the determined position.
However, Vaught teaches determining, based on a determined position, a confidence value; and causing display of output and a visual indication, based on the confidence value, of the determined position (Figs. 2-4; Para. 47 of Vaught; confidence values are assigned to the first estimated gaze determined by gaze target component 304 and/or the enhanced estimated user eye gaze determined by estimation component 308;
Para. 33, 38 of Vaught; HMD 200 can communicate wirelessly with external devices… acquired data is transmitted wirelessly to a separate device for processing… when gaze line 402 is determined with a higher level of confidence, gaze target area 406 is smaller, and when gaze line 402 is determined with a lower level of confidence, gaze target area 406 is larger to account for the lower confidence in gaze line 402).
 Therefore, at the time when the invention was filed, it would have been obvious to a person of ordinary skill in the art to include determining, based on the determined position, a confidence value; and causing output of a visual indication, based on the confidence value, of the determined position using the teachings of Vaught in order to modify the method taught by the combination of Morita and Mendis. The motivation to combine these analogous arts would have been to provide an enhanced user eye gaze estimation by narrowing a database of eye information and corresponding known gaze lines to a subset of the eye information having gaze lines corresponding to a gaze target area (Abstract of Vaught).

Regarding Claim 2, the combination of Morita, Mendis, and Vaught teaches that the determining the position comprises: determining, based on data generated by a sensor physically attached to the user, the position (Figs. 1-3; Para. 45-67 of Morita; eye or gaze tracking capability of the headset 210 and/or processor 230 may be used to control a display device, such as the display device 110).

Regarding Claim 4, the combination of Morita, Mendis, and Vaught teaches determining, based on the confidence value, to alter the determined position (Figs. 3-4; Para. 47 of Vaught; confidence values are assigned to the first estimated gaze determined by gaze target component 304 and/or the enhanced estimated user eye gaze determined by estimation component 308
Para. 38 of Vaught; when gaze line 402 is determined with a higher level of confidence, gaze target area 406 is smaller, and when gaze line 402 is determined with a lower level of confidence, gaze target area 406 is larger to account for the lower confidence in gaze line 402).

Regarding Claim 5, the combination of Morita, Mendis, and Vaught teaches that the second output device comprises an augmented reality output device (Figs. 1-3; Para. 37-67 of Morita; video switchboxes and/or voice commands may be used with image-guided surgery to switch displays so that only image-guided surgery information is viewed… relevant clinical information germane to a current “focus” or location (e.g., cursor location) on the display device 110 may be displayed on the display device 110. For example, if a radiologist identifies an unknown mass in a patient's kidney and focuses his or her gaze on the kidney, overlays may pop-up on the display device 110 to provide additional information and support to the radiologist. Information may include computer-aided detection (CAD) findings/region of interest (ROI), prothrombin time (PT) values, Creatinine levels, Gleason Score, family history, etc.).

Regarding Claim 7, the combination of Morita, Mendis, and Vaught teaches that a size, a shape, or a color of the visual indication is based on the confidence value (Figs. 3-4; Para. 47 of Vaught; confidence values are assigned to the first estimated gaze determined by gaze target component 304 and/or the enhanced estimated user eye gaze determined by estimation component 308
Para. 38 of Vaught; when gaze line 402 is determined with a higher level of confidence, gaze target area 406 is smaller, and when gaze line 402 is determined with a lower level of confidence, gaze target area 406 is larger to account for the lower confidence in gaze line 402).

Regarding Claim 8, the combination of Morita, Mendis, and Vaught teaches determining, based on the determined position and an object identified in the content, overlay information; and wherein the causing output further comprises causing output, on the second output device, of the overlay information (Figs. 3A-3E; Col. 13, ln. 40 to Col. 14, ln. 59 of Mendis; during the experience-sharing session, the novice's HMD 312 can transmit a video, or any other media content, in-real time to the expert's HMD 302. Accordingly, while looking at the HMD 302, the expert can see not only his own hand performing a task, but also a real-time overlayed video showing the novice's hand performing the task… In an experience-sharing session, media content generated at a first sharing device can be presented to a second sharing device according to the second sharing device's perspective… server system 320 can also receive information in real-time that is indicative of a perspective of the novice's HMD 312. For example, the information can include gyroscopic information that indicates how the novice's HMD 312 is oriented, location information that indicates where the novice's HMD 312 is located, gaze information that indicates a gaze direction of the novice's eye(s)).

Regarding Claim 21, the combination of Morita, Mendis, and Vaught teaches that the second output device is remote from the first output device (Figs. 3A-3E; Col. 13, ln. 40 to Col. 14, ln. 59 of Mendis; during the experience-sharing session, the novice's HMD 312 can transmit a video, or any other media content, in-real time to the expert's HMD 302. Accordingly, while looking at the HMD 302, the expert can see not only his own hand performing a task, but also a real-time overlayed video showing the novice's hand performing the task).

Regarding Claim 22, the combination of Morita, Mendis, and Vaught teaches that each of the first output device and the second output device comprises one or more of a projector, a monitor or television display, a virtual reality headset, or an optical head-mounted display (Figs. 3A-3E; Col. 13, ln. 40 to Col. 14, ln. 59 of Mendis; during the experience-sharing session, the novice's HMD 312 can transmit a video, or any other media content, in-real time to the expert's HMD 302. Accordingly, while looking at the HMD 302, the expert can see not only his own hand performing a task, but also a real-time overlayed video showing the novice's hand performing the task).

Regarding Claim 23, the combination of Morita, Mendis, and Vaught teaches that the causing output comprises sending, to the second output device, the content and the visual indication of the determined position (Figs. 3A-3E; Col. 13, ln. 40 to Col. 14, ln. 59 of Mendis; during the experience-sharing session, the novice's HMD 312 can transmit a video, or any other media content, in-real time to the expert's HMD 302. Accordingly, while looking at the HMD 302, the expert can see not only his own hand performing a task, but also a real-time overlayed video showing the novice's hand performing the task… In an experience-sharing session, media content generated at a first sharing device can be presented to a second sharing device according to the second sharing device's perspective… server system 320 can also receive information in real-time that is indicative of a perspective of the novice's HMD 312. For example, the information can include gyroscopic information that indicates how the novice's HMD 312 is oriented, location information that indicates where the novice's HMD 312 is located, gaze information that indicates a gaze direction of the novice's eye(s)).

Regarding Claim 25, the combination of Morita, Mendis, and Vaught teaches causing output, on the second output device, of a visual indication of the confidence value (Figs. 3A-3E; Col. 13, ln. 40 to Col. 14, ln. 59 of Mendis; during the experience-sharing session, the novice's HMD 312 can transmit a video, or any other media content, in-real time to the expert's HMD 302. Accordingly, while looking at the HMD 302, the expert can see not only his own hand performing a task, but also a real-time overlayed video showing the novice's hand performing the task… In an experience-sharing session, media content generated at a first sharing device can be presented to a second sharing device according to the second sharing device's perspective… server system 320 can also receive information in real-time that is indicative of a perspective of the novice's HMD 312. For example, the information can include gyroscopic information that indicates how the novice's HMD 312 is oriented, location information that indicates where the novice's HMD 312 is located, gaze information that indicates a gaze direction of the novice's eye(s)
Figs. 3-4; Para. 47 of Vaught; confidence values are assigned to the first estimated gaze determined by gaze target component 304 and/or the enhanced estimated user eye gaze determined by estimation component 308… Para. 38 of Vaught; when gaze line 402 is determined with a higher level of confidence, gaze target area 406 is smaller, and when gaze line 402 is determined with a lower level of confidence, gaze target area 406 is larger to account for the lower confidence in gaze line 402).

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Morita in view of Mendis and Vaught, and further in view of U.S. Patent Application Publication 2012/0133754 A1 to Lee et al. (hereinafter "Lee").
Regarding Claim 3, the combination of Morita, Mendis, and Vaught does not explicitly disclose that the determining the position comprises: determining a physical distance between the user and the first output device; and determining, based on the physical distance, the position.
However, Lee teaches determining a physical distance between a user and a first output device, and determining a position based on the physical distance (Figs. 1-4; Claim 12 of Lee; remote gaze tracking method, comprising: acquiring an entire image using a visible ray, the entire image including a facial region of a user; detecting the facial region from the acquired entire image; acquiring, from the detected facial region, a face width, a distance between eyes, and a distance between an eye and a screen; acquiring an enlarged eye image corresponding to a face of the user, using at least one of the acquired face width, the acquired distance between the eyes, and the acquired distance between the eye and the screen; and tracking an eye gaze of the user, using the acquired entire image).
Therefore, at the time when the invention was filed, it would have been obvious to a person of ordinary skill in the art to include that the determining the position comprises: determining a physical distance between the user and the first output device; and determining, based on the physical distance, the position using the teachings of Lee in order to modify the method taught by the combination of Morita, Mendis, and Vaught. The motivation to combine these analogous arts would have been to enable control of an Internet Protocol Television (IPTV) and content using information on an eye gaze of a viewer in an IPTV environment (Para. 2 of Lee).

Claims 6 and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Morita in view of Mendis and Vaught, and further in view of U.S. Patent Application Publication 2013/0083173 A1 to Geisner et al. (hereinafter "Geisner").
Regarding Claim 6, the combination of Morita, Mendis, and Vaught does not explicitly disclose that the content comprises an image of a geographic environment, the determined position comprises a first geographic location of the geographic environment, and the visual indication is further based on a second geographic location of the second output device.
However, Geisner teaches that content comprises an image of a geographic environment, determined position comprises a first geographic location of the geographic environment, and visual indication is further based on a second geographic location of the second output device (Para. 63-95 of Geisner; environment may be identified by location data which may be used as an index to search in location indexed image and pre-generated 3D map databases 324 or in Internet accessible images 326 for a map or image related data which may be used to generate a map. For example, location data such as GPS data from a GPS transceiver 144 on the display device 2 may identify the location of the user… An example of image related data which may be used to generate a map is meta data associated with any matched image data, from which objects and their positions within a coordinate system for the environment can be identified. For example, a relative position of one or more objects in image data from the outward facing cameras 113 of the user's personal A/V apparatus 8 can be determined with respect to one or more GPS tracked objects in the location from which other relative positions of real and virtual objects can be identified… User location and tracking module 412 keeps track of various users which are utilizing the system. Users can be identified by unique user identifiers, location and other elements… The local copy on the personal A/V apparatus 8 may send device data identifying a user position within the location or environment as well as image and depth data of its field of view and sensor data indicating head position and orientation and gaze direction… The selection of virtual data for display may also be based on gaze direction detected
Figs. 11-12B; Para. 120-125 of Geisner; In step 712, user input is received by the first personal A/V apparatus 8 from a first user indicating a position relative to the first user at which to display 3D virtual data representing a second user. Some examples of a position relative to the first user is likely standing or sitting next to the first user. 3D virtual data can be generated of the second user based on captured image and preferably depth data as well. In step 714, the virtual spectator application 188 transforms the 3D virtual data representing the second user and a current action being performed by the second user to be at the position relative to the first user…  In step 716, the 3D virtual data representing the second user and the current action being performed by the second user are displayed in the near-eye, augmented reality display responsive to the position relative to the first user being within a display field of view of the near-eye, augmented reality display of the first user).
Therefore, at the time when the invention was filed, it would have been obvious to a person of ordinary skill in the art to include that the content comprises an image of a geographic environment, the determined position comprises a first geographic location of the geographic environment, and the visual indication is further based on a second geographic location of the second output device using the teachings of Geisner in order to modify the method taught by the combination of Morita, Mendis, and Vaught. The motivation to combine these analogous arts would have been to provide a personalized experience for a user in relation to an event being viewed from a location remote from the location where the event is occurring (Para. 27 of Geisner).

Regarding Claim 24, the combination of Morita, Mendis, and Vaught does not explicitly disclose that the determining the position is based on detection of an ocular gesture of the user, and the ocular gesture comprises a predetermined blink sequence.
However, Geisner teaches that determining position is based on detection of an ocular gesture of a user, and the ocular gesture comprises a predetermined blink sequence (Para. 79-95 of Geisner; gesture recognition engine 193 can identify actions performed by a user indicating a control or command to an executing application. The action may be performed by a body part of a user, e.g. a hand or finger, but also an eye blink sequence of an eye can be a gesture... gesture recognition engine 193 can identify the throw gesture based on image and depth data from the front facing cameras 113 on his display device 2 or from captured data from the environment depth cameras or from another display device's capture data, or a combination of these… natural user interface also includes the eye tracking software 196 and the eye tracking system 134 in that other user physical actions of a body part such as an eye can be interpreted as commands or requests. The eye tracking software 196 can identify gaze direction or a point of gaze based on pupil position and eye movements, e.g. a blink sequence, indicating a command or request).
Therefore, at the time when the invention was filed, it would have been obvious to a person of ordinary skill in the art to include that the determining the position is based on detection of an ocular gesture of the user, and the ocular gesture comprises a predetermined blink sequence using the teachings of Geisner in order to modify the method taught by the combination of Morita, Mendis, and Vaught. The motivation to combine these analogous arts would have been to provide a personalized experience for a user in relation to an event being viewed from a location remote from the location where the event is occurring (Para. 27 of Geisner).

Claims 9 and 11-15 are rejected under 35 U.S.C. 103 as being unpatentable over Geisner in view of Mendis and further in view of Vaught. 
Regarding Claim 9, Geisner teaches a method comprising: receiving, by a second output device and from a first output device, content output on the first output device, that a user is ocularly focused on (Figs. 3-14; Para. 63-95 of Geisner; environment may be identified by location data which may be used as an index to search in location indexed image and pre-generated 3D map databases 324 or in Internet accessible images 326 for a map or image related data which may be used to generate a map. For example, location data such as GPS data from a GPS transceiver 144 on the display device 2 may identify the location of the user… An example of image related data which may be used to generate a map is meta data associated with any matched image data, from which objects and their positions within a coordinate system for the environment can be identified. For example, a relative position of one or more objects in image data from the outward facing cameras 113 of the user's personal A/V apparatus 8 can be determined with respect to one or more GPS tracked objects in the location from which other relative positions of real and virtual objects can be identified… User location and tracking module 412 keeps track of various users which are utilizing the system. Users can be identified by unique user identifiers, location and other elements
Figs. 11-12B; Para. 120-125 of Geisner; In step 712, user input is received by the first personal A/V apparatus 8 from a first user indicating a position relative to the first user at which to display 3D virtual data representing a second user. Some examples of a position relative to the first user is likely standing or sitting next to the first user. 3D virtual data can be generated of the second user based on captured image and preferably depth data as well. In step 714, the virtual spectator application 188 transforms the 3D virtual data representing the second user and a current action being performed by the second user to be at the position relative to the first user…  In step 716, the 3D virtual data representing the second user and the current action being performed by the second user are displayed in the near-eye, augmented reality display responsive to the position relative to the first user being within a display field of view of the near-eye, augmented reality display of the first user … Joe indicates a side position relative to his head at which to view a virtual representation of his friend Greg 24 who is actually at the game. Greg who is also wearing a personal A/V apparatus 8 comprising a companion processing module 4, display device 2 and wire 6 also has entered user input indicating a side position to his left, a seat near him at which to see Joe 18… In FIG. 12A, Greg's virtual object 24 is seen through Joe's display device 2 as he turns his head to the right side by the processing of the embodiment in FIG. 11. In FIG. 12B, the processing of FIG. 11 displays Joe 18 projected at a left side position relative to Greg 24 who is actually at the game when Greg's head orientation data from inertial sensors 132 indicate he is looking to his left).
Geisner does not explicitly disclose during a time period in which the content is output on the first output device, causing output, on the second output device, of the content and an overlay, on the content, comprising a visual indication of the position, wherein the visual indication is based on a confidence value of the position.
However, Mendis teaches during a time period in which content is output on a first output device, causing output, on a second output device, of the content and an overlay, on the content, comprising a visual indication of a position (Figs. 3A-3E; Col. 13, ln. 40 to Col. 14, ln. 59 of Mendis; during the experience-sharing session, the novice's HMD 312 can transmit a video, or any other media content, in-real time to the expert's HMD 302. Accordingly, while looking at the HMD 302, the expert can see not only his own hand performing a task, but also a real-time overlayed video showing the novice's hand performing the task… In an experience-sharing session, media content generated at a first sharing device can be presented to a second sharing device according to the second sharing device's perspective… server system 320 can also receive information in real-time that is indicative of a perspective of the novice's HMD 312. For example, the information can include gyroscopic information that indicates how the novice's HMD 312 is oriented, location information that indicates where the novice's HMD 312 is located, gaze information that indicates a gaze direction of the novice's eye(s)
Fig. 5 of Mendis; At block 506, during the observation phase, the server system can receive second media content from the second HMD and send the second media content in real-time to the first HMD. The second media content can include a point-of-view video recorded at the second HMD…. during the observation phase, the server system can receive information indicative of one or more sensors of the second HMD. The server system can send the information in real-time to the first HMD… the information can include a location of the second HMD).
Therefore, at the time when the invention was filed, it would have been obvious to a person of ordinary skill in the art to include during a time period in which the content is output on the first output device, causing output, on the second output device, of the content and an overlay, on the content, comprising a visual indication of the position using the teachings of Mendis in order to modify the method taught by Geisner. The motivation to combine these analogous arts would have been for facilitating an experience-sharing session in real-time between a first head-mountable display (HMD) and a second HMD (Abstract of Mendis).
The combination of Geisner and Mendis does not explicitly disclose that the visual indication is based on a confidence value of the position.
However, Vaught teaches that a visual indication is based on a confidence value of a position (Figs. 2-4; Para. 47 of Vaught; confidence values are assigned to the first estimated gaze determined by gaze target component 304 and/or the enhanced estimated user eye gaze determined by estimation component 308
Para. 33, 38 of Vaught; HMD 200 can communicate wirelessly with external devices… acquired data is transmitted wirelessly to a separate device for processing… when gaze line 402 is determined with a higher level of confidence, gaze target area 406 is smaller, and when gaze line 402 is determined with a lower level of confidence, gaze target area 406 is larger to account for the lower confidence in gaze line 402).
Therefore, at the time when the invention was filed, it would have been obvious to a person of ordinary skill in the art to include that the visual indication is based on a confidence value of the position using the teachings of Vaught in order to modify the method taught by the combination of Geisner and Mendis. The motivation to combine these analogous arts would have been to provide an enhanced user eye gaze estimation by narrowing a database of eye information and corresponding known gaze lines to a subset of the eye information having gaze lines corresponding to a gaze target area (Abstract of Vaught).

Regarding Claim 11, the combination of Geisner, Mendis, and Vaught teaches causing output, by the second output device, of an updated position and updated content; and sending, to the first output device, updated information received from a second user (Figs. 3B-3E; Col. 13, ln. 40 to Col. 14, ln. 59 of Mendis; during the experience-sharing session, the novice's HMD 312 can transmit a video, or any other media content, in real-time to the expert's HMD 302. Accordingly, while looking at the HMD 302, the expert can see not only his own hand performing a task, but also a real-time overlayed video showing the novice's hand performing the task
Fig. 5 of Mendis; At block 506, during the observation phase, the server system can receive second media content from the second HMD and send the second media content in real-time to the first HMD. The second media content can include a point-of-view video recorded at the second HMD…. during the observation phase, the server system can receive information indicative of one or more sensors of the second HMD. The server system can send the information in real-time to the first HMD. For example, the information can include an orientation of the second HMD. As another example, the information can include a location of the second HMD).

Regarding Claim 12, the combination of Geisner, Mendis, and Vaught teaches that a size or a shape of the visual indication of the position is based on the confidence value associated with the position (Figs. 3-4; Para. 47 of Vaught; confidence values are assigned to the first estimated gaze determined by gaze target component 304 and/or the enhanced estimated user eye gaze determined by estimation component 308
Para. 38 of Vaught; when gaze line 402 is determined with a higher level of confidence, gaze target area 406 is smaller, and when gaze line 402 is determined with a lower level of confidence, gaze target area 406 is larger to account for the lower confidence in gaze line 402).

Regarding Claim 13, the combination of Geisner, Mendis, and Vaught teaches that the content comprises a video image of an environment, and the visual indication indicates a geographic location of an object in the environment (Figs. 3-14; Para. 63-95 of Geisner; environment may be identified by location data which may be used as an index to search in location indexed image and pre-generated 3D map databases 324 or in Internet accessible images 326 for a map or image related data which may be used to generate a map. For example, location data such as GPS data from a GPS transceiver 144 on the display device 2 may identify the location of the user… An example of image related data which may be used to generate a map is meta data associated with any matched image data, from which objects and their positions within a coordinate system for the environment can be identified. For example, a relative position of one or more objects in image data from the outward facing cameras 113 of the user's personal A/V apparatus 8 can be determined with respect to one or more GPS tracked objects in the location from which other relative positions of real and virtual objects can be identified… User location and tracking module 412 keeps track of various users which are utilizing the system. Users can be identified by unique user identifiers, location and other elements
Para. 100-125 of Geisner; In step 602, the virtual event data provider system 404 receives in real time one or more positions of one or more event objects, like a guitarist playing on stage at a concert or the football in a game, participating in the event occurring at a first location. With the aid of a scene mapping engine, the virtual spectator application 188 in step 604 maps the one or more 3D space positions or position volumes of the one or more event objects in the first 3D coordinate system for the first location to a second 3D coordinate system for a second location remote from the first location… In step 606, based on image and depth data captured by capture devices 113 of a respective personal A/V apparatus 8 and transmitted to the virtual event data provider 404, a scene mapping engine 306 of the provider 404 determines a display field of view of a near-eye, augmented reality display of the respective personal A/V apparatus 8 at the second location 608… In step 712, user input is received by the first personal A/V apparatus 8 from a first user indicating a position relative to the first user at which to display 3D virtual data representing a second user. Some examples of a position relative to the first user is likely standing or sitting next to the first user. 3D virtual data can be generated of the second user based on captured image and preferably depth data as well. In step 714, the virtual spectator application 188 transforms the 3D virtual data representing the second user and a current action being performed by the second user to be at the position relative to the first user…  In step 716, the 3D virtual data representing the second user and the current action being performed by the second user are displayed in the near-eye, augmented reality display responsive to the position relative to the first user being within a display field of view of the near-eye, augmented reality display of the first user).

Regarding Claim 14, the combination of Geisner, Mendis, and Vaught teaches that the content comprises an image of an environment, and the visual indication indicates a geographic location of an object in the environment (Figs. 3-14; Para. 100-125 of Geisner; In step 602, the virtual event data provider system 404 receives in real time one or more positions of one or more event objects, like a guitarist playing on stage at a concert or the football in a game, participating in the event occurring at a first location. With the aid of a scene mapping engine, the virtual spectator application 188 in step 604 maps the one or more 3D space positions or position volumes of the one or more event objects in the first 3D coordinate system for the first location to a second 3D coordinate system for a second location remote from the first location… In step 606, based on image and depth data captured by capture devices 113 of a respective personal A/V apparatus 8 and transmitted to the virtual event data provider 404, a scene mapping engine 306 of the provider 404 determines a display field of view of a near-eye, augmented reality display of the respective personal A/V apparatus 8 at the second location 608… In step 712, user input is received by the first personal A/V apparatus 8 from a first user indicating a position relative to the first user at which to display 3D virtual data representing a second user. Some examples of a position relative to the first user is likely standing or sitting next to the first user. 3D virtual data can be generated of the second user based on captured image and preferably depth data as well. 
In step 714, the virtual spectator application 188 transforms the 3D virtual data representing the second user and a current action being performed by the second user to be at the position relative to the first user…  In step 716, the 3D virtual data representing the second user and the current action being performed by the second user are displayed in the near-eye, augmented reality display responsive to the position relative to the first user being within a display field of view of the near-eye, augmented reality display of the first user
Para. 63-95 of Geisner; environment may be identified by location data which may be used as an index to search in location indexed image and pre-generated 3D map databases 324 or in Internet accessible images 326 for a map or image related data which may be used to generate a map. For example, location data such as GPS data from a GPS transceiver 144 on the display device 2 may identify the location of the user… An example of image related data which may be used to generate a map is meta data associated with any matched image data, from which objects and their positions within a coordinate system for the environment can be identified. For example, a relative position of one or more objects in image data from the outward facing cameras 113 of the user's personal A/V apparatus 8 can be determined with respect to one or more GPS tracked objects in the location from which other relative positions of real and virtual objects can be identified… User location and tracking module 412 keeps track of various users which are utilizing the system. Users can be identified by unique user identifiers, location and other elements… The local copy on the personal A/V apparatus 8 may send device data identifying a user position within the location or environment as well as image and depth data of its field of view and sensor data indicating head position and orientation and gaze direction).

Regarding Claim 15, the combination of Geisner, Mendis, and Vaught teaches receiving, from a second user, overlay information (Figs. 3-14; Para. 100-125 of Geisner; In step 602, the virtual event data provider system 404 receives in real time one or more positions of one or more event objects, like a guitarist playing on stage at a concert or the football in a game, participating in the event occurring at a first location. With the aid of a scene mapping engine, the virtual spectator application 188 in step 604 maps the one or more 3D space positions or position volumes of the one or more event objects in the first 3D coordinate system for the first location to a second 3D coordinate system for a second location remote from the first location… In step 606, based on image and depth data captured by capture devices 113 of a respective personal A/V apparatus 8 and transmitted to the virtual event data provider 404, a scene mapping engine 306 of the provider 404 determines a display field of view of a near-eye, augmented reality display of the respective personal A/V apparatus 8 at the second location 608… In step 712, user input is received by the first personal A/V apparatus 8 from a first user indicating a position relative to the first user at which to display 3D virtual data representing a second user. Some examples of a position relative to the first user is likely standing or sitting next to the first user. 3D virtual data can be generated of the second user based on captured image and preferably depth data as well. 
In step 714, the virtual spectator application 188 transforms the 3D virtual data representing the second user and a current action being performed by the second user to be at the position relative to the first user…  In step 716, the 3D virtual data representing the second user and the current action being performed by the second user are displayed in the near-eye, augmented reality display responsive to the position relative to the first user being within a display field of view of the near-eye, augmented reality display of the first user
Figs. 3B-3E; Col. 13, ln. 40 to Col. 14, ln. 59 of Mendis; during the experience-sharing session, the novice's HMD 312 can transmit a video, or any other media content, in real-time to the expert's HMD 302. Accordingly, while looking at the HMD 302, the expert can see not only his own hand performing a task, but also a real-time overlayed video showing the novice's hand performing the task).

Regarding Claim 26, the combination of Geisner, Mendis, and Vaught teaches sending, to the first output device, information indicating a location of the second output device (Fig. 5 of Mendis; At block 506, during the observation phase, the server system can receive second media content from the second HMD and send the second media content in real-time to the first HMD. The second media content can include a point-of-view video recorded at the second HMD…. during the observation phase, the server system can receive information indicative of one or more sensors of the second HMD. The server system can send the information in real-time to the first HMD… the information can include a location of the second HMD).

Claims 16-18, 20, and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Geisner in view of Mendis.
Regarding Claim 16, Geisner teaches a method comprising: receiving, via a first output device, an indication of a position that a user is ocularly focused on; determining, based on the position and a geographic environment associated with the user, a first geographic location of the position; and causing output, on a second output device, of a visual indication of the first geographic location of the position (Figs. 3-14; Para. 63-95 of Geisner; environment may be identified by location data which may be used as an index to search in location indexed image and pre-generated 3D map databases 324 or in Internet accessible images 326 for a map or image related data which may be used to generate a map. For example, location data such as GPS data from a GPS transceiver 144 on the display device 2 may identify the location of the user… An example of image related data which may be used to generate a map is meta data associated with any matched image data, from which objects and their positions within a coordinate system for the environment can be identified. For example, a relative position of one or more objects in image data from the outward facing cameras 113 of the user's personal A/V apparatus 8 can be determined with respect to one or more GPS tracked objects in the location from which other relative positions of real and virtual objects can be identified… User location and tracking module 412 keeps track of various users which are utilizing the system. Users can be identified by unique user identifiers, location and other elements
Figs. 11-12B; Para. 120-125 of Geisner; In step 712, user input is received by the first personal A/V apparatus 8 from a first user indicating a position relative to the first user at which to display 3D virtual data representing a second user. Some examples of a position relative to the first user is likely standing or sitting next to the first user. 3D virtual data can be generated of the second user based on captured image and preferably depth data as well. In step 714, the virtual spectator application 188 transforms the 3D virtual data representing the second user and a current action being performed by the second user to be at the position relative to the first user…  In step 716, the 3D virtual data representing the second user and the current action being performed by the second user are displayed in the near-eye, augmented reality display responsive to the position relative to the first user being within a display field of view of the near-eye, augmented reality display of the first user … Joe indicates a side position relative to his head at which to view a virtual representation of his friend Greg 24 who is actually at the game. Greg who is also wearing a personal A/V apparatus 8 comprising a companion processing module 4, display device 2 and wire 6 also has entered user input indicating a side position to his left, a seat near him at which to see Joe 18… In FIG. 12A, Greg's virtual object 24 is seen through Joe's display device 2 as he turns his head to the right side by the processing of the embodiment in FIG. 11. In FIG. 12B, the processing of FIG. 11 displays Joe 18 projected at a left side position relative to Greg 24 who is actually at the game when Greg's head orientation data from inertial sensors 132 indicate he is looking to his left).
Geisner does not explicitly disclose causing output, on a second output device, of a visual indication of the first geographic location of the position from a perspective of the second output device.
However, Mendis teaches causing output, on a second output device, of a visual indication of a first geographic location of a position from a perspective of a second output device (Figs. 3A-3E; Col. 13, ln. 40 to Col. 14, ln. 59 of Mendis; during the experience-sharing session, the novice's HMD 312 can transmit a video, or any other media content, in real-time to the expert's HMD 302. Accordingly, while looking at the HMD 302, the expert can see not only his own hand performing a task, but also a real-time overlayed video showing the novice's hand performing the task… In an experience-sharing session, media content generated at a first sharing device can be presented to a second sharing device according to the second sharing device's perspective… server system 320 can also receive information in real-time that is indicative of a perspective of the novice's HMD 312. For example, the information can include gyroscopic information that indicates how the novice's HMD 312 is oriented, location information that indicates where the novice's HMD 312 is located, gaze information that indicates a gaze direction of the novice's eye(s)
Fig. 5 of Mendis; At block 506, during the observation phase, the server system can receive second media content from the second HMD and send the second media content in real-time to the first HMD. The second media content can include a point-of-view video recorded at the second HMD…. during the observation phase, the server system can receive information indicative of one or more sensors of the second HMD. The server system can send the information in real-time to the first HMD… the information can include a location of the second HMD).
Therefore, at the time when the invention was filed, it would have been obvious to a person of ordinary skill in the art to include causing output, on a second output device, of a visual indication of the first geographic location of the position from a perspective of the second output device using the teachings of Mendis in order to modify the method taught by Geisner. The motivation to combine these analogous arts would have been for facilitating an experience-sharing session in real-time between a first head-mountable display (HMD) and a second HMD (Abstract of Mendis).

Regarding Claim 17, the combination of Geisner and Mendis teaches that the second output device comprises an augmented reality output device, and wherein the output comprises an augmented reality overlay displaying a graphical indication of the position that the user is ocularly focused on (Figs. 11-12B; Para. 120-125 of Geisner; In step 712, user input is received by the first personal A/V apparatus 8 from a first user indicating a position relative to the first user at which to display 3D virtual data representing a second user. Some examples of a position relative to the first user is likely standing or sitting next to the first user. 3D virtual data can be generated of the second user based on captured image and preferably depth data as well. In step 714, the virtual spectator application 188 transforms the 3D virtual data representing the second user and a current action being performed by the second user to be at the position relative to the first user… In step 716, the 3D virtual data representing the second user and the current action being performed by the second user are displayed in the near-eye, augmented reality display responsive to the position relative to the first user being within a display field of view of the near-eye, augmented reality display of the first user … Joe indicates a side position relative to his head at which to view a virtual representation of his friend Greg 24 who is actually at the game. Greg who is also wearing a personal A/V apparatus 8 comprising a companion processing module 4, display device 2 and wire 6 also has entered user input indicating a side position to his left, a seat near him at which to see Joe 18… In FIG. 12A, Greg's virtual object 24 is seen through Joe's display device 2 as he turns his head to the right side by the processing of the embodiment in FIG. 11. In FIG. 12B, the processing of FIG. 11 displays Joe 18 projected at a left side position relative to Greg 24 who is actually at the game when Greg's head orientation data from inertial sensors 132 indicate he is looking to his left).

Regarding Claim 18, the combination of Geisner and Mendis teaches determining overlay information associated with the first geographic location of the position; and sending, to the second output device, the overlay information (Figs. 11-12B; Para. 120-125 of Geisner; In step 712, user input is received by the first personal A/V apparatus 8 from a first user indicating a position relative to the first user at which to display 3D virtual data representing a second user. Some examples of a position relative to the first user is likely standing or sitting next to the first user. 3D virtual data can be generated of the second user based on captured image and preferably depth data as well. In step 714, the virtual spectator application 188 transforms the 3D virtual data representing the second user and a current action being performed by the second user to be at the position relative to the first user… In step 716, the 3D virtual data representing the second user and the current action being performed by the second user are displayed in the near-eye, augmented reality display responsive to the position relative to the first user being within a display field of view of the near-eye, augmented reality display of the first user … Joe indicates a side position relative to his head at which to view a virtual representation of his friend Greg 24 who is actually at the game. Greg who is also wearing a personal A/V apparatus 8 comprising a companion processing module 4, display device 2 and wire 6 also has entered user input indicating a side position to his left, a seat near him at which to see Joe 18… In FIG. 12A, Greg's virtual object 24 is seen through Joe's display device 2 as he turns his head to the right side by the processing of the embodiment in FIG. 11. In FIG. 12B, the processing of FIG. 11 displays Joe 18 projected at a left side position relative to Greg 24 who is actually at the game when Greg's head orientation data from inertial sensors 132 indicate he is looking to his left).

Regarding Claim 20, the combination of Geisner and Mendis teaches determining, based on a plurality of positions of additional ocular foci in the geographic environment, a plurality of additional geographic locations associated with the plurality of positions; and causing output, on the second output device, of additional visual indications of the plurality of additional geographic locations (Figs. 3-14; Para. 63-95 of Geisner; environment may be identified by location data which may be used as an index to search in location indexed image and pre-generated 3D map databases 324 or in Internet accessible images 326 for a map or image related data which may be used to generate a map. For example, location data such as GPS data from a GPS transceiver 144 on the display device 2 may identify the location of the user… An example of image related data which may be used to generate a map is meta data associated with any matched image data, from which objects and their positions within a coordinate system for the environment can be identified. For example, a relative position of one or more objects in image data from the outward facing cameras 113 of the user's personal A/V apparatus 8 can be determined with respect to one or more GPS tracked objects in the location from which other relative positions of real and virtual objects can be identified… User location and tracking module 412 keeps track of various users which are utilizing the system. Users can be identified by unique user identifiers, location and other elements
Figs. 11-12B; Para. 120-125 of Geisner; In step 712, user input is received by the first personal A/V apparatus 8 from a first user indicating a position relative to the first user at which to display 3D virtual data representing a second user. Some examples of a position relative to the first user is likely standing or sitting next to the first user. 3D virtual data can be generated of the second user based on captured image and preferably depth data as well. In step 714, the virtual spectator application 188 transforms the 3D virtual data representing the second user and a current action being performed by the second user to be at the position relative to the first user…  In step 716, the 3D virtual data representing the second user and the current action being performed by the second user are displayed in the near-eye, augmented reality display responsive to the position relative to the first user being within a display field of view of the near-eye, augmented reality display of the first user … Joe indicates a side position relative to his head at which to view a virtual representation of his friend Greg 24 who is actually at the game. Greg who is also wearing a personal A/V apparatus 8 comprising a companion processing module 4, display device 2 and wire 6 also has entered user input indicating a side position to his left, a seat near him at which to see Joe 18… In FIG. 12A, Greg's virtual object 24 is seen through Joe's display device 2 as he turns his head to the right side by the processing of the embodiment in FIG. 11. In FIG. 12B, the processing of FIG. 11 displays Joe 18 projected at a left side position relative to Greg 24 who is actually at the game when Greg's head orientation data from inertial sensors 132 indicate he is looking to his left).

Regarding Claim 27, the combination of Geisner and Mendis teaches that the first geographic location of the position comprises a global positioning system (GPS) location of the position (Para. 63-95 of Geisner; environment may be identified by location data which may be used as an index to search in location indexed image and pre-generated 3D map databases 324 or in Internet accessible images 326 for a map or image related data which may be used to generate a map. For example, location data such as GPS data from a GPS transceiver 144 on the display device 2 may identify the location of the user… An example of image related data which may be used to generate a map is meta data associated with any matched image data, from which objects and their positions within a coordinate system for the environment can be identified. For example, a relative position of one or more objects in image data from the outward facing cameras 113 of the user's personal A/V apparatus 8 can be determined with respect to one or more GPS tracked objects in the location from which other relative positions of real and virtual objects can be identified… User location and tracking module 412 keeps track of various users which are utilizing the system. Users can be identified by unique user identifiers, location and other elements… The local copy on the personal A/V apparatus 8 may send device data identifying a user position within the location or environment as well as image and depth data of its field of view and sensor data indicating head position and orientation and gaze direction).

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Geisner in view of Mendis, and further in view of Vaught.
Regarding Claim 19, the combination of Geisner and Mendis does not explicitly disclose receiving a confidence value associated with the first geographic location of the position, and wherein the visual indication is based on the confidence value.
However, Vaught teaches receiving a confidence value associated with a first geographic location of a position, and wherein a visual indication is based on the confidence value (Figs. 3-4; Para. 47 of Vaught; confidence values are assigned to the first estimated gaze determined by gaze target component 304 and/or the enhanced estimated user eye gaze determined by estimation component 308
Para. 38 of Vaught; when gaze line 402 is determined with a higher level of confidence, gaze target area 406 is smaller, and when gaze line 402 is determined with a lower level of confidence, gaze target area 406 is larger to account for the lower confidence in gaze line 402).
Therefore, at the time when the invention was filed, it would have been obvious to a person of ordinary skill in the art to include receiving a confidence value associated with the first geographic location of the position, and wherein the visual indication is based on the confidence value using the teachings of Vaught in order to modify the method taught by the combination of Geisner and Mendis. The motivation to combine these analogous arts would have been to provide an enhanced user eye gaze estimation by narrowing a database of eye information and corresponding known gaze lines to a subset of the eye information having gaze lines corresponding to a gaze target area (Abstract of Vaught).

Response to Arguments
Applicant’s arguments with respect to the currently amended claims have been fully considered but are moot in view of the new grounds of rejection presented above.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ABHISHEK SARMA whose telephone number is (571)272-9887.  The examiner can normally be reached on Mon - Fri 8:00-5:00.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexander Eisen, can be reached at 571-272-7687.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/ABHISHEK SARMA/

Primary Examiner, Art Unit 2622