DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments filed 5/18/2021 with respect to claims 1-16 have been fully considered but are moot in view of the new ground(s) of rejection.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7, 9, 10 and 13-16 are rejected under 35 U.S.C. 103 as being unpatentable over Lacey et al. (PGPUB Document No. US 2019/0362557) in view of Fisher (PGPUB Document No. US 2015/0339589).
Regarding claim 16, Lacey teaches a system comprising: 
one or more processors (Lacey: 0499); 
and a non-transitory memory coupled to the processors comprising instructions executable by the processors, the processors operable when executing the instructions to (Lacey: 0117, 0506): 
display a virtual reality environment to a user (VR scene shown in FIG. 1 (Lacey: 0003, 0094, 0096)); 
determine a head pose of the user based on headset tracking data associated with a headset worn by the user (head pose inputs (Lacey: 0406) determined by head pose tracking systems (Lacey: 0420)); 
determine a hand pose of the user based on hand tracking data associated with a device held or worn by a hand of the user (“In one embodiment, a totem (e.g. a user input device), or an object such as a toy gun may be held by the user and tracked by the system. The system preferably will be configured to know that the user is holding the item and understand what kind of interaction the user is having with the item (e.g., if the totem or object is a gun, the system may be configured to understand location and orientation” (Lacey: 0157). Note that the totem corresponds to a user input device held by the user (“hand”), such as a toy gun, wherein the location and orientation (“pose”) of the device is tracked.); 
access scene information associated with the displayed virtual reality environment (the use of depth information to assist in hand tracking (Lacey: 0205), wherein the depth information is utilized to determine user interaction with virtual objects (e.g. moving virtual pages (Lacey: 0165)), which corresponds to said information being “associated with the displayed virtual reality environment” as claimed); 
and determine a predicted focal point of the user within the virtual reality environment by processing the head pose, the hand pose, and the scene information (the convergence of different modes of user input (head pose, eye pose, hand gesture) corresponds to the predicted focal point of the user (Lacey: 0408). Note that determining the hand pose comprises utilizing depth information (“scene information”) as stated above. FIG. 51 visually illustrates the converged focal point of the user, which corresponds to the common overlapping focal area of all modes of user input (Lacey: 0420-0422)) using a machine-learning model (“The system can apply machine learning techniques to learn the user's behavior patterns and convergence/divergence tendencies” (Lacey: 0445)).

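However, Lacey does not expressly teach that the machine-learning model is trained based on a plurality of previous predicted focal points of users compared to known focal points of the users, wherein the plurality of previous predicted focal points is generated by using the machine-learning model to process a plurality of previous headset tracking data, a plurality of previous hand tracking data, and a plurality of previous scene information.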

Fisher teaches, in a similar field of endeavor (gaze detection, Fisher: 0080), the concept of training a model by comparing its predictions against observed gaze data (“During training, the controller may compare the predicted saliency map to the gaze direction/history of the gaze direction/saliency map of the operator” (Fisher: 0103)). 
Therefore, applying the teachings of Fisher results in the machine-learning model of Lacey being trained based on a plurality of previous predicted focal points of users (predicted saliency map of Fisher) compared to known focal points of the users (gaze direction/history of the gaze direction/saliency map).
Further, the predicted focal points of the user are based on the teachings of Lacey, which rely on head pose, hand pose and scene information (see rejection above). Therefore, the previous predicted focal points of the machine learning training process (utilizing the teaching of Fisher) also rely on head pose, hand pose and scene information (corresponding to “plurality of previous predicted focal points is generated by using the machine-learning model to process a plurality of previous headset tracking data, a plurality of previous hand tracking data, and a plurality of previous scene information” as presently claimed).
Therefore, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the machine learning process of Lacey to utilize the teachings of Fisher, because this enables an improved method of training a machine learning model.

Claim 1 is a method claim corresponding to claim 16. The limitations of claim 1 are substantially similar to the limitations of claim 16. Therefore, it has been analyzed and rejected in a manner substantially similar to claim 16.

Regarding claim 2, the combined teachings as applied above teach the method of Claim 1, further comprising: adjusting an image presented to the user by the computing system based on the predicted focal point of the user within the rendered environment (visual feedback such as indicator 5911, wherein the form of the indicator is “adjusted” based on user input (Lacey: 0440-0443)).

Regarding claim 3, the combined teachings as applied above teach the method of Claim 1, wherein the head pose is determined with respect to the rendered environment (“Pose relative to the world” (Lacey: 0156). Further, head pose is determined with respect to the user interaction with a particular virtual object (Lacey: 0407-0408)).

Regarding claim 4, the combined teachings as applied above teach the method of Claim 1, wherein the hand pose is determined with respect to one of the rendered environment (hand pose estimated for determining an object (“real world environment”) grasped by the user) or the headset worn by the user.

Regarding claim 5, the combined teachings as applied above teach the method of Claim 1, wherein determining the hand pose of the user comprises: identifying a hand of the user based on one or more cameras coupled to the computing system analyzing a plurality of images comprising the hand of the user (recognizing gestures using images (Lacey: 0337, 0387) by acquiring images of the user’s hands (Lacey: 0205)).

Regarding claim 6, the combined teachings as applied above teach the method of Claim 1, wherein the hand tracking data is associated with a device held or worn by a hand of the user (“hand gesture, controller or totem input, etc.” (Lacey: 0420, 0345, element 3620 of FIG. 36)).

Regarding claim 7, the combined teachings as applied above teach the method of Claim 6, further comprising determining an action performed by the device held or worn by the hand of the user (“providing hand gesture input to target a particular virtual object” (Lacey: 0407)), and wherein determining the predicted focal point further comprises processing the action performed by the device held or worn by the hand of the user using the machine-learning model (the determined target based on the user gesture input).

Regarding claim 9, the combined teachings as applied above teach the method of Claim 1, wherein the scene information comprises semantic information of one or more elements within the rendered environment (“data associated with objects (e.g. location, semantic information, properties, etc.)” (Lacey: 0177)).

Regarding claim 10, the combined teachings as applied above teach the method of Claim 1, wherein the predicted focal point is a three-dimensional coordinate within the rendered environment (user input (convergence of multimodal inputs or focal point) defined within a 3D Cartesian coordinate system (Lacey: 0088)).

Regarding claim 13, the combined teachings as applied above teach the method of Claim 1, wherein the predicted focal point is determined without eye tracking sensors (multimodal user input that does not include eye gaze as disclosed in 0398 of Lacey).

Regarding claim 14, the combined teachings as applied above teach the method of Claim 1, wherein the rendered environment comprises one or more of an augmented reality environment, a virtual reality environment, or a mixed reality environment (“The display 220 can present AR/VR/MR content to a user” (Lacey: 0097)).

Claim 15 is a computer-readable non-transitory storage media claim corresponding to claim 16. The limitations of claim 15 are substantially similar to the limitations of claim 16. Therefore, it has been analyzed and rejected in a manner substantially similar to claim 16. Lacey teaches computer-readable non-transitory storage media (Lacey: 0508).

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Lacey in view of Fisher as applied to claim 1 above, and further in view of Mukkamala et al. (PGPUB Document No. US 2016/0019718).
Regarding claim 8, Lacey does not expressly teach but Mukkamala teaches the method of Claim 1, wherein the scene information includes color and depth data (Mukkamala teaches the concept of determining user interaction with virtual objects utilizing depth and color data. Note that, similar to the hand pose tracking of Lacey, Mukkamala discloses interactions with virtual objects using the user’s hand (Mukkamala: 0022, 0025)).
Therefore, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the teachings above to further determine user interaction with virtual objects as disclosed by Mukkamala, because this enables an added level of accuracy in identifying the user’s intention/input (“focal point”).

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Lacey in view of Fisher as applied to claim 1 above, and further in view of Alcaide et al. (PGPUB Document No. US 2020/0097076).
Regarding claim 12, Lacey does not expressly teach but Alcaide teaches the method of Claim 1, further comprising: generating a confidence map of one or more locations for the predicted focal point (Alcaide teaches the concept of using a gaze probability map predicting user’s gaze (Alcaide: 0149-0151). Therefore, the eye gaze input (part of the convergence determination process of Lacey) can utilize the teachings of Alcaide) using the machine-learning model (Alcaide: 0067), wherein the confidence map 
Therefore, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the teachings of Lacey to utilize the gaze determination teaching of Alcaide, because this is merely one of the many well-known methods for determining gaze direction. Further, the combined teachings yield predictable results.

Allowable Subject Matter
Claim 11 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Devona Faulk, can be reached at (571) 272-7515. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DAVID H CHU/
Primary Examiner, Art Unit 2616