Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
2.	This action is responsive to the communications filed on 6/23/2022. Claims 1-20 and 141-144 are pending in the case. No claim is amended. Claims 21-140 were previously cancelled. Claims 1, 11 and 18 are independent claims. Claims 1-20 and 141-144 are rejected.
Summary of claims

3.	Claims 1-20 and 141-144 are pending,
	No claim is amended,
	Claims 21-140 were previously cancelled,
	Claims 1, 11 and 18 are independent claims,
	Claims 1-20 and 141-144 are rejected.

Response to Arguments
4.	Regarding the § 103 rejections, Applicant’s arguments, see Remarks pp. 8-10, filed 6/23/2022, have been fully considered but are not persuasive.
Applicant argued that the cited references, including Valli and Ziman, do not teach the feature required by claim 1 of switching between at least three different viewing perspectives, including a first-person viewing perspective, a third-person viewing perspective, and a self-viewing perspective. Specifically, Applicant argued that Ziman discloses switching between only two viewing perspectives, a first-person view and a “God’s view,” and therefore does not teach switching between at least three different viewing perspectives. Examiner respectfully disagrees.
In response, “[T]he question under 35 U.S.C. § 103 is not merely what the references expressly teach but what they would have suggested to one of ordinary skill in the art at the time the invention was made.” Merck & Co., Inc. v. Biocraft Laboratories, Inc., 874 F.2d 804, 807–808 (Fed. Cir. 1989) (Emphasis added). In Valli, Fig. 5A is a schematic plan view illustrating an example 8-camera capture environment with square cells according to some embodiments. Specifically, Fig. 5A shows view lines of cameras 504-518 for capturing a user. Individual viewpoints are formed by capturing each user with an omni-camera setup (which captures views of a user from all directions around the user) and providing the views from the respective directions; for example, a user may have spatially faithful viewpoints only to his or her closest neighbors, captured inside and together with their backgrounds (Valli: Figs. 5A-5B and [0071]). Note that Valli may capture and provide views of a user from all directions, including the viewing direction as seen by a close neighbor or by the user himself/herself. For example, as shown in Fig. 5B, cameras 556 or 568 (which face the user) may capture and provide a view equivalent to the one obtained when the user takes a photo in the “self-mode” of a phone camera. In Valli’s example, at least 8 cameras capture and provide different views from 8 different directions.
Valli does not clearly disclose switching between one viewing perspective and another viewing perspective in response to input from the user via a client device. Ziman is cited to teach switching from one viewing perspective to another by user command (Ziman: [0119] the master visual control system is able to implement input commands from the user device (e.g., switching between a master camera’s “God’s view” to a first-person view); [0120] enable the user to view the environment when the user provides a command to switch vision from the God’s perspective (third-person view) to first-person view; [0128] switching between third-person view and first-person view at the user’s choice). The combination of Valli and Ziman therefore discloses switching between at least 8 different viewing perspectives at the user’s choice, and those 8 viewing perspectives include a self-viewing perspective. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention of Valli using the teachings of Ziman to include switching between one view and another at the user’s choice, such as by an input command from the user device. The modification would provide Valli’s system with the enhanced capability of allowing a user to change the viewing perspective, giving the user more flexibility to view the virtual environment from different directions.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 7-12, 14-20 are rejected under 35 U.S.C. 103 as being unpatentable over US 20200099891 to Valli in view of US 20210011607 to Ziman.

Regarding independent claim 1, Valli et al. teach:

1.   A system enabling interactions in virtual environments, comprising: one or more cloud server computers comprising at least one processor and memory storing data and instructions implementing a virtual environment platform comprising at least one virtual environment comprising a 3D modeled virtual space (Valli, FIGS. 6-8, 21; [0064] telepresence systems … mobility to include users' ability to move around and to move and orient renderings of their locally captured spaces with respect to other participants. [0073]-[0076] cloud servers, display tools 604, 606, 608, 610, system diagram 700, processor 730; [0080]; [0175] memory 2230/2232 … include a processor operative to perform instructions stored in a non-transitory computer-readable medium; [0054] Methods and systems for spatially faithful telepresence disclosed herein support a flexible system of adjustable geometric relationships between multiple meeting sites with multiple mobile participants (or users). Some embodiments of such methods and systems may be used for group conferencing and visitations inside a user-selected, photorealistic 3D-captured or 3D-modelled user environment. Some embodiments of such methods and systems may be used for social interaction and spatial exploration inside a unified virtual landscape with a dynamic unified geometry compiled from separate user spaces, which may enable proximity-based interactions (triggered by distances and/or directions between users or spaces), and is expanded by virtual 3D-modeled environments, 3D objects, and other digital information) configured for display (Valli, [0003] generating a two-dimensional perspective video of the shared virtual geometry from the perspective location of the viewing user);
wherein the one or more cloud server computers are configured to insert a user graphical representation generated from a live data feed of a user obtained by a camera at a three-dimensional coordinate position of the 3D modeled virtual space of the at least one virtual environment, update the user graphical representation in the 3D modeled virtual space of the at least one virtual environment, and enable real-time multi-user collaboration and interactions in the 3D modeled virtual space of the virtual environment (Valli, FIGS. 6-8, 21; [0051]-[0054] 3D virtual worlds; [0063]-[0066] real-time captured depth sensor data (color plus depth) may be bigger than video feeds from a video camera; [0070]-[0078] FIG. 5A shows view lines of cameras 504, 506, 508, 510, 512, 514, 516, 518 for capturing a user 502. Individual viewpoints are formed by capturing each user by an omni-camera setup (which captures views of a user from all directions around a user) and providing the views from respective directions. The views (remote person's faces) are shown by AR glasses over each of the cameras (remote person's eyes). FIG. 5B is a schematic perspective view illustrating an example square capture environment 550 with 8 cameras according to some embodiments … virtual conferencing environment; [0072] A camera capture setup defines the position of each local user. For some systems, a user is able to move together with the captured scene inside a tessellated virtual geometry; [0073] FIG. 6 is a system diagram illustrating an example set of interfaces for representing users in a virtual geometry according to some embodiments. FIG. 6 shows a system 600 for creating a virtual geometry for user interaction. 
Systems and methods disclosed herein may be implemented as a decentralized application where tools to manage a virtual constellation (or geometry) … sets the local coordinate system using the derived/given origin and orientation, and the real-world scale; [0076] a reconstruction and perspective processor 730 combines received calibrated sets of depth and texture into a 3D reconstruction of the local space in real world scale; [0090] Appearing in their virtual spatial positions, users are able to virtually visit and directionally view (up to 360°) participating sites. Conferencing supports a virtual cocktail party-type of interaction over network. Unlike people in a real-life cocktail party, participants are brought virtually to a space, as illustrated in FIG. 8. [0091]-[0094] Users' positions in each local space 902, 904, 906 may be captured by electronic means. Spatially faithful (or correct) perspective views of remote participants 910, 912, 914, 916, 918, 920, separated from their real backgrounds, may be formed and transmitted to each local participant, and positioned according to a unified geometry 908; [0147]; [0152]-[0154] A user position in a local space may be derived and updated 2008. [0185]-[0186]).
Valli pertains to systems and methods for managing user positions in a shared virtual geometry and providing a spatially faithful system (Valli, Abstract) and Valli teaches different features, applied in the mapping above, in relation to different exemplary embodiments.  It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, with the teachings of the various exemplary embodiments before them to modify the combination of features to tailor to the needs and goals at hand (Valli, [0211]).
Further, Valli discloses generating data in different viewing perspectives (Valli, FIGS. 5A-5B, [0071] Fig. 5A is a schematic plan view illustrating an example 8-camera capture environment with square cells according to some embodiments; Fig. 5A shows view lines of cameras 504-518 for capturing a user, individual viewpoints are formed by capturing each user by an omni-camera setup (which captures views of a user from all directions around a user) and providing the views from respective directions, for example, a user may have spatially faithful viewpoints only to his or her closest neighbors, captured inside and together with their backgrounds; FIGS. 6-8, 21; [0045] Views from remote places may be brought to a user’s eye-point, and immersion may be supported by enabling users to view a whole 360° panorama (although a sub-view may be relatively narrow at a time due to the restricted field-of-view of AR glasses); [0051]-[0054], [0070]-[0071] FIG. 3 is a schematic perspective view illustrating an example hexagon-shaped cell in a tessellated space 300 showing how a person located in the middle is seen by neighboring people according to some embodiments; [0076] video may be produced for 360° (full panorama) around each remote users eye-point in the local space), but does not clearly disclose switching between one viewing perspective and another viewing perspective in response to input from the user via a client device.
In an analogous art of presenting an interactive virtual environment, Ziman discloses: wherein a viewing perspective of the user accessing the at least one virtual environment is configured to switch between a first-person viewing perspective, a third-person viewing perspective, and a self-viewing perspective in response to input from the user via a client device as the user navigates the at least one virtual environment using the client device (Ziman: [0119] the master visual control system is able to implement input commands from the user device (e.g., switching between a master camera’s “God’s view” to a first-person view); [0120] enable the user to view the environment when the user provides a command to switch vision from the God’s perspective (third-person view) to first-person view; [0128] switching between third-person view and first-person view at the user’s choice).
Valli and Ziman are analogous art because they are in the same field of endeavor, presenting an interactive virtual environment. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention of Valli using the teachings of Ziman to include switching between one view and another at the user’s choice, such as by an input command from the user device. The modification would provide Valli’s system with the enhanced capability of allowing a user to change the viewing perspective, giving the user more flexibility to view the virtual environment from different directions.

Regarding dependent claim 2, Valli et al. teach:
2.   The system of claim 1, wherein the virtual environment is accessible by a client device via a downloadable client application or a web browser application (Valli, FIGS. 6-8, 21; [0075] application controller; [0183] The processor 2218 may further be coupled to other peripherals 2238, which may include one or more software and/or hardware modules that provide additional features, functionality and/or wired or wireless connectivity .. an Internet browser).

Regarding dependent claim 3, Valli et al. teach:
3.   The system of claim 1, wherein the user graphical representation comprises a user 3D virtual cutout with a removed background, or a user real-time 3D virtual cutout with a removed background, or a video with removed background, or video without a removed background (Valli, FIGS. 6-8, 21; [0051]-[0054], [0070]-[0071] a user may have spatially faithful viewpoints only to his or her closest neighbors, captured inside and together with their backgrounds; [0076] a reconstruction and perspective processor 730 combines received calibrated sets of depth and texture into a 3D reconstruction of the local space in real world scale, specifies an origin for the local space either by a rule or by user interaction, specifies an orientation for the local space either by a rule (e.g., compass North) or by user interaction, and sets the local coordinate system using the derived/given origin and orientation, and the real-world scale.  The background around the local user may be removed and made transparent.).


Regarding dependent claim 4, Valli-Ziman teach:
4.   The system of claim 1, wherein the viewing perspective is further configured to switch to a top viewing perspective (Valli, FIGS. 6-8, 21; [0045] Views from remote places may be brought to a user’s eye-point, and immersion may be supported by enabling users to view a whole 360° panorama (although a sub-view may be relatively narrow at a time due to the restricted field-of-view of AR glasses); [0051]-[0054], [0070]-[0071] FIG. 3 is a schematic perspective view illustrating an example hexagon-shaped cell in a tessellated space 300 showing how a person located in the middle is seen by neighboring people according to some embodiments; [0076] video may be produced for 360° (full panorama) around each remote users eye-point in the local space. For some embodiments, a local background of a perspective video of a user may be replaced with another background. The updated perspective video may be displayed for some embodiments. For some embodiments, a background may be updated based on the location of a user within a local space. For some embodiments, a perspective video may be a panoramic video, such as, for example, a video with a wide-angle or 360-degree view; examiner notes a 360 degree view corresponds to a view from around the point; examiner notes, e.g., [0045] teaching ‘enabling users’ (i.e., users may correspond to first person, third person); Ziman: [0119] the master visual control system is able to implement input commands from the user device (e.g., switching between a master camera’s “God’s view” to a first-person view); [0120] enable the user to view the environment when the user provides a command to switch vision from the God’s perspective (third-person view) to first-person view; [0128] switching between third-person view and first-person view at the user’s choice).

Regarding dependent claim 5, Valli et al. teach:
5.    The system of claim 1, wherein the viewing perspective is associated with the viewing perspective of the user graphical representation and a virtual camera, and wherein the virtual camera is updated automatically by tracking and analyzing user eye-and-head-tilting data, or head-rotation data, or a combination thereof (Valli, FIGS. 6-8, 21; [0051]-[0054], [0070]-[0071] FIG. 3 is a schematic perspective view illustrating an example hexagon-shaped cell in a tessellated space 300 showing how a person located in the middle is seen by neighboring people according to some embodiments, [0076] video may be produced for 360° (full panorama) around each remote users eye-point in the local space… a perspective video may be a panoramic video, such as, for example, a video with a wide-angle or 360-degree view; [0077] For some embodiments, a positioning and tracking component 720 positions and tracks users in a local space with respect to local and/or unified geometry or coordinate system using wearable or external components (e.g., by a positioning system of AR/VR glasses); examiner notes a 360 degree view corresponds to a view from around the point; [0086], [0147] real time).



Regarding dependent claim 7, Valli et al. teach:
7.   The system of claim 1, wherein the at least one virtual environment is a persistent virtual environment stored in persistent memory storage of the one or more cloud server computers (Valli, FIGS. 6-8, 21; [0051]-[0054], [0073] Systems and methods disclosed herein may be implemented as a decentralized application where tools to manage a virtual constellation (or geometry) with user representations 602 may be implemented in cloud servers, and user capture and display tools 604, 606, 608, 610 may be implemented at each local site, with each user site connected to the cloud via a network.  [0133] Tracking may occur outside a captured space, for example, if a user temporarily leaves a captured room to visit his or her kitchen or mailbox outside. If a user goes outside, tracking, for example, by GPS, enables a user to continue his or her collaboration session, which may have reduced modalities.).

Regarding dependent claim 8, Valli et al. teach:
8.   The system of claim 1, wherein the arrangement of the 3D modeled virtual space of the at least one virtual environment is associated with a contextual theme of the virtual environment related to one or more virtual environment verticals selected from the virtual environment platform. (Valli, FIGS. 6-8, 21; [0051]-[0054], [0052] 3D virtual worlds, for example Second Life and OpenQwaq (formerly known as Teleplace), are a way of interaction between people represented by avatars. Attempts have been made to bring naturalness to the interaction by making avatars and environments close to their real-world exemplars; [0073] FIG. 6 is a system diagram illustrating an example set of interfaces for representing users in a virtual geometry according to some embodiments. FIG. 6 shows a system 600 for creating a virtual geometry for user interaction. Systems and methods disclosed herein may be implemented as a decentralized application where tools to manage a virtual constellation (or geometry) … sets the local coordinate system using the derived/given origin and orientation, and the real-world scale; [0076] a reconstruction and perspective processor 730 combines received calibrated sets of depth and texture into a 3D reconstruction of the local space in real world scale; [0090] Appearing in their virtual spatial positions, users are able to virtually visit and directionally view (up to 360°) participating sites. Conferencing supports a virtual cocktail party-type of interaction over network. Unlike people in a real-life cocktail party, participants are brought virtually to a space, as illustrated in FIG. 8. [0091]-[0094] [0107]-[0109] local users at site 4 in their real context/environment; [0152], [0153] conferencing flowchart (examiner notes ‘conferencing theme’ (conferencing flowchart) v. exploration flowchart (exploration theme); [0162] automated processing chain contexts).

Regarding dependent claim 9, Valli-Ziman teach:
9.    The system of claim 8, wherein the 3D modeled virtual space of the at least one virtual environment comprises virtual objects with corresponding graphical representations (Valli, [0052] interaction between people represented by avatars), and wherein the virtual objects comprise virtual computers including virtual computing resources (Ziman, [0007] the virtual reality environment may include virtual objects, such as a virtual computer).

Regarding dependent claim 10, Valli et al. teach:
10.    The system of claim 1, wherein the virtual environment platform is configured to enable multi-casting or broadcasting of remote events to a plurality of instances of a virtual environment. (Valli, FIGS. 6-8, 21; [0051]-[0054], [0070]-[0071] FIG. 3 is a schematic perspective view illustrating an example hexagon-shaped cell in a tessellated space 300 showing how a person located in the middle is seen by neighboring people according to some embodiments, [0076] video may be produced for 360° (full panorama) around each remote users eye-point in the local space. For some embodiments, a local background of a perspective video of a user may be replaced with another background. The updated perspective video may be displayed for some embodiments. For some embodiments, a background may be updated based on the location of a user within a local space. For some embodiments, a perspective video may be a panoramic video, such as, for example, a video with a wide-angle or 360-degree view; examiner notes a 360 degree view corresponds to a view from around the point; [0086]-[0090] Appearing in their virtual spatial positions, users are able to virtually visit and directionally view (up to 360°) participating sites. Conferencing supports a virtual cocktail party-type of interaction over network. Unlike people in a real-life cocktail party, participants are brought virtually to a space, as illustrated in FIG. 8; [0091]-[0094]; [0147] real time; [0166] the communications system 100 may be a multiple access system and may employ one or more channel access schemes; [0161] broadcast; examiner notes ‘virtual cocktail party’ may correspond to remote event).

Regarding claim 11, Valli et al. teach:
11.   A method enabling interactions in virtual environments, comprising: providing a virtual environment platform comprising a virtual environment in memory of one or more cloud server computers comprising at least one processor (Valli, FIGS. 6-8, 21; [0064] telepresence systems … mobility to include users' ability to move around and to move and orient renderings of their locally captured spaces with respect to other participants. [0073]-[0076] cloud servers, display tools 604, 606, 608, 610, system diagram 700, processor 730; [0080]; [0175] memory 2230/2232 … include a processor operative to perform instructions stored in a non-transitory computer-readable medium), wherein the virtual environment comprises a 3D modeled virtual space (Valli, [0054] Methods and systems for spatially faithful telepresence disclosed herein support a flexible system of adjustable geometric relationships between multiple meeting sites with multiple mobile participants (or users). Some embodiments of such methods and systems may be used for group conferencing and visitations inside a user-selected, photorealistic 3D-captured or 3D-modelled user environment. Some embodiments of such methods and systems may be used for social interaction and spatial exploration inside a unified virtual landscape with a dynamic unified geometry compiled from separate user spaces, which may enable proximity-based interactions (triggered by distances and/or directions between users or spaces), and is expanded by virtual 3D-modeled environments, 3D objects, and other digital information) configured for display (Valli, [0003] generating a two-dimensional perspective video of the shared virtual geometry from the perspective location of the viewing user);
receiving a live data feed of a user captured by at least one camera from at least one client device; generating, from the live data feed, a user graphical representation; inserting the user graphical representation into a three-dimensional coordinate position of the 3D modeled virtual space of the virtual environment; updating, from the live data feed, the user graphical representation within the 3D modeled virtual space of the virtual environment; and processing data generated from interactions in the virtual environment, enabling real-time multi-user collaborations and interactions in the 3D modeled virtual space of the virtual environment (Valli, FIGS. 6-8, 21; [0051]-[0054] 3D virtual worlds; [0063]-[0066] real-time captured depth sensor data (color plus depth) may be bigger than video feeds from a video camera; [0070]-[0078] FIG. 5A shows view lines of cameras 504, 506, 508, 510, 512, 514, 516, 518 for capturing a user 502. Individual viewpoints are formed by capturing each user by an omni-camera setup (which captures views of a user from all directions around a user) and providing the views from respective directions. The views (remote person's faces) are shown by AR glasses over each of the cameras (remote person's eyes). FIG. 5B is a schematic perspective view illustrating an example square capture environment 550 with 8 cameras according to some embodiments .. virtual conferencing environment; [0072] A camera capture setup defines the position of each local user. For some systems, a user is able to move together with the captured scene inside a tessellated virtual geometry; [0073] FIG. 6 is a system diagram illustrating an example set of interfaces for representing users in a virtual geometry according to some embodiments. FIG. 6 shows a system 600 for creating a virtual geometry for user interaction. 
Systems and methods disclosed herein may be implemented as a decentralized application where tools to manage a virtual constellation (or geometry) … sets the local coordinate system using the derived/given origin and orientation, and the real-world scale; [0076] a reconstruction and perspective processor 730 combines received calibrated sets of depth and texture into a 3D reconstruction of the local space in real world scale; [0090] Appearing in their virtual spatial positions, users are able to virtually visit and directionally view (up to 360°) participating sites. Conferencing supports a virtual cocktail party-type of interaction over network. Unlike people in a real-life cocktail party, participants are brought virtually to a space, as illustrated in FIG. 8. [0091]-[0094] Users' positions in each local space 902, 904, 906 may be captured by electronic means. Spatially faithful (or correct) perspective views of remote participants 910, 912, 914, 916, 918, 920, separated from their real backgrounds, may be formed and transmitted to each local participant, and positioned according to a unified geometry 908; [0147]; [0152]-[0154] A user position in a local space may be derived and updated 2008. [0185]-[0186]).
Valli pertains to systems and methods for managing user positions in a shared virtual geometry and providing a spatially faithful system (Valli, Abstract) and Valli teaches different features, applied in the mapping above, in relation to different exemplary embodiments.  It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, with the teachings of the various exemplary embodiments before them to modify the combination of features to tailor to the needs and goals at hand (Valli, [0211]).
Further, Valli discloses generating data in different viewing perspectives (Valli, FIGS. 6-8, 21; [0045] Views from remote places may be brought to a user’s eye-point, and immersion may be supported by enabling users to view a whole 360° panorama (although a sub-view may be relatively narrow at a time due to the restricted field-of-view of AR glasses); [0051]-[0054], [0070]-[0071] FIG. 3 is a schematic perspective view illustrating an example hexagon-shaped cell in a tessellated space 300 showing how a person located in the middle is seen by neighboring people according to some embodiments; [0076] video may be produced for 360° (full panorama) around each remote users eye-point in the local space), but does not clearly disclose switching between one viewing perspective and another viewing perspective in response to input from the user via a client device.
In an analogous art of presenting an interactive virtual environment, Ziman discloses: wherein a viewing perspective of the user accessing the at least one virtual environment is configured to switch between a first-person viewing perspective, a third-person viewing perspective, and a self-viewing perspective in response to input from the user via a client device as the user navigates the at least one virtual environment using the client device (Ziman: [0119] the master visual control system is able to implement input commands from the user device (e.g., switching between a master camera’s “God’s view” to a first-person view); [0120] enable the user to view the environment when the user provides a command to switch vision from the God’s perspective (third-person view) to first-person view; [0128] switching between third-person view and first-person view at the user’s choice).
Valli and Ziman are analogous art because they are in the same field of endeavor, presenting an interactive virtual environment. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention of Valli using the teachings of Ziman to include switching between a third-person view and a first-person view at the user’s choice, such as by an input command from the user device. The modification would provide Valli’s system with the enhanced capability of allowing a user to change the viewing perspective, giving the user more flexibility in viewing the virtual environment.

Regarding dependent claim 12, Valli et al. teach:
12.   The method of claim 11, wherein the user graphical representation comprises a user 3D virtual cutout with a removed background, or a user real-time 3D virtual cutout with a removed background, or a video with removed background, or video without a removed background (Valli, FIGS. 6-8, 21; [0051]-[0054], [0070]-[0071] a user may have spatially faithful viewpoints only to his or her closest neighbors, captured inside and together with their backgrounds; [0076] a reconstruction and perspective processor 730 combines received calibrated sets of depth and texture into a 3D reconstruction of the local space in real world scale, specifies an origin for the local space either by a rule or by user interaction, specifies an orientation for the local space either by a rule (e.g., compass North) or by user interaction, and sets the local coordinate system using the derived/given origin and orientation, and the real-world scale.  The background around the local user may be removed and made transparent.).
Regarding dependent claim 14, Valli-Ziman teach:
14.    The method of claim 11, wherein the viewing perspective is further configured to switch to a top viewing perspective (Valli, FIGS. 6-8, 21; [0045] Views from remote places may be brought to a user’s eye-point, and immersion may be supported by enabling users to view a whole 360° panorama (although a sub-view may be relatively narrow at a time due to the restricted field-of-view of AR glasses); [0051]-[0054], [0070]-[0071] FIG. 3 is a schematic perspective view illustrating an example hexagon-shaped cell in a tessellated space 300 showing how a person located in the middle is seen by neighboring people according to some embodiments; [0076] video may be produced for 360° (full panorama) around each remote users eye-point in the local space. For some embodiments, a local background of a perspective video of a user may be replaced with another background. The updated perspective video may be displayed for some embodiments. For some embodiments, a background may be updated based on the location of a user within a local space. For some embodiments, a perspective video may be a panoramic video, such as, for example, a video with a wide-angle or 360-degree view; examiner notes a 360 degree view corresponds to a view from around the point; examiner notes, e.g., [0045] teaching ‘enabling users’ (i.e., users may correspond to first person, third person); Ziman: [0119] the master visual control system is able to implement input commands from the user device (e.g., switching between a master camera’s “God’s view” to a first-person view); [0120] enable the user to view the environment when the user provides a command to switch vision from the God’s perspective (third-person view) to first-person view; [0128] switching between third-person view and first-person view at the user’s choice).

Regarding dependent claim 15, Valli et al. teach:
15. The method of claim 11, wherein the virtual environment platform enables engaging in an ad hoc virtual communication by opening up an ad hoc communication channel between client devices, wherein multiple user graphical representations are presented as holding a conversation in the virtual environment.  (Valli, FIGS. 6-8, 21; [0051]-[0054], [0070]-[0071] FIG. 3 is a schematic perspective view illustrating an example hexagon-shaped cell in a tessellated space 300 showing how a person located in the middle is seen by neighboring people according to some embodiments, [0076] video may be produced for 360° (full panorama) around each remote users eye-point in the local space. For some embodiments, a local background of a perspective video of a user may be replaced with another background. The updated perspective video may be displayed for some embodiments. For some embodiments, a background may be updated based on the location of a user within a local space. For some embodiments, a perspective video may be a panoramic video, such as, for example, a video with a wide-angle or 360-degree view; examiner notes a 360 degree view corresponds to a view from around the point; [0086]-[0090] Appearing in their virtual spatial positions, users are able to virtually visit and directionally view (up to 360°) participating sites. Conferencing supports a virtual cocktail party-type of interaction over network. Unlike people in a real-life cocktail party, participants are brought virtually to a space, as illustrated in FIG. 8; [0091]-[0094]; [0147] real time; [0166] the communications system 100 may be a multiple access system and may employ one or more channel access schemes; [0161] broadcast; [0185]-[0186]; examiner notes ‘virtual cocktail party’ may correspond to graphical representation of holding a conversation).

Regarding dependent claim 16, Valli et al. teach:
16.  The method of claim 11, further comprising engaging one or more users in conversations by: transitioning the user graphical representation from a user 3D virtual cutout into a user real-time 3D virtual cutout, or video with a removed background, or a video without a removed background; and (Valli, FIGS. 6-8, 21; [0051]-[0054], [0070]-[0071] a user may have spatially faithful viewpoints only to his or her closest neighbors, captured inside and together with their backgrounds; [0076] a reconstruction and perspective processor 730 combines received calibrated sets of depth and texture into a 3D reconstruction of the local space in real world scale, specifies an origin for the local space either by a rule or by user interaction, specifies an orientation for the local space either by a rule (e.g., compass North) or by user interaction, and sets the local coordinate system using the derived/given origin and orientation, and the real-world scale.  The background around the local user may be removed and made transparent.);
opening up a peer-to-peer (P2P) communication channel between the user client devices, or opening up an indirect communication channel through the cloud server computer, wherein the conversation comprises sending and receiving real-time audio and video displayed from the user real-time 3D virtual cutout of participants or sending and receiving real-time audio played from the user 3D virtual cutout of participants. (Valli, FIGS. 6-8, 21; [0051]-[0054], [0070]-[0071] a user may have spatially faithful viewpoints only to his or her closest neighbors, captured inside and together with their backgrounds; [0076] a reconstruction and perspective processor 730 combines received calibrated sets of depth and texture into a 3D reconstruction of the local space in real world scale, specifies an origin for the local space either by a rule or by user interaction, specifies an orientation for the local space either by a rule (e.g., compass North) or by user interaction, and sets the local coordinate system using the derived/given origin and orientation, and the real-world scale.  The background around the local user may be removed and made transparent; [0106] For some embodiments, each remote user is captured by a virtual camera (using the formed 3D reconstruction) and displayed to local users (by showing views 2′ through 7′ on his or her AR glasses). The background may be a local user’s environment or another environment chosen by the local user. For FIG. 11, a number in a box without an apostrophe indicates a user is local to that meeting site. A number in a box with an apostrophe indicates an image of that user is used at that meeting site. The large circles in FIG. 11 indicate a physical view of a meeting site. The long lines connecting two squares in FIG. 11 are examples of connections between virtual cameras and displays (for users 1 and 3 for the example connections shown in FIG. 11), although, for example, any server-based or peer-to-peer delivery may be used.).


Regarding dependent claim 17, Valli et al. teach:
17.    The method of claim 11, further comprising embedding a clickable link redirecting to the virtual environment into one or more third party sources comprising third-party websites, applications or video-games. (Valli, FIGS. 6-8, 21; [0051]-[0054], [0070]-[0071]; [0173]-[0175] Internet 110 may include a global system of interconnected computer networks and devices that use common communication protocols, such as the transmission control protocol (TCP), user datagram protocol (UDP) and/or the internet protocol (IP) in the TCP/IP internet protocol suite. The networks 112 may include wired and/or wireless communications networks owned and/or operated by other service; [0183] The processor 2218 may further be coupled to other peripherals 2238, which may include one or more software and/or hardware modules that provide additional features, functionality and/or wired or wireless connectivity. For example, the peripherals 2238 may include an accelerometer, an e-compass, a satellite transceiver, a digital camera (for photographs or video), a universal serial bus (USB) port, a vibration device, a television transceiver, a hands-free headset, a Bluetooth® module, a frequency modulated (FM) radio unit, a digital music player, a media player, a video game player module, an Internet browser, a Virtual Reality and/or Augmented Reality (VR/AR) device, an activity tracker, and the like.).




Regarding claim 18, Valli et al. teach:
18.   A computer readable medium having stored thereon instructions configured to cause at least one server computer comprising a processor and memory to perform steps comprising:  
providing a virtual environment platform comprising a virtual environment in memory of one or more cloud server computers comprising at least one processor; (Valli, FIGS. 6-8, 21;  [0064] telepresence systems … mobility to include users' ability to move around and to move and orient renderings of their locally captured spaces with respect to other participants. [0073]-[0076] cloud servers, display tools 604, 606, 608, 610, system diagram 700, processor 730; [0080]; [0175] memory 2230/2232 … include a processor operative to perform instructions stored in a non-transitory computer-readable medium), wherein the virtual environment comprises a 3D modeled virtual space ([0054] of Valli teaches: Methods and systems for spatially faithful telepresence disclosed herein support a flexible system of adjustable geometric relationships between multiple meeting sites with multiple mobile participants (or users). Some embodiments of such methods and systems may be used for group conferencing and visitations inside a user-selected, photorealistic 3D-captured or 3D-modelled user environment. Some embodiments of such methods and systems may be used for social interaction and spatial exploration inside a unified virtual landscape with a dynamic unified geometry compiled from separate user spaces, which may enable proximity-based interactions (triggered by distances and/or directions between users or spaces), and is expanded by virtual 3D-modeled environments, 3D objects, and other digital information) configured for display (Valli, [0003] generating a two-dimensional perspective video of the shared virtual geometry from the perspective location of the viewing user);
receiving a live data feed of a user captured by at least one camera from at least one client device; generating, from the live data feed, a user graphical representation; inserting the user graphical representation into a three-dimensional coordinate position of the 3D modeled virtual space of the virtual environment; updating, from the live data feed, the user graphical representation within the 3D modeled virtual space of the virtual environment; and processing data generated from interactions in the virtual environment, enabling real-time multi-user collaborations and interactions in the 3D modeled virtual space of the virtual environment. (Valli, FIGS. 6-8, 21; [0051]-[0054] 3D virtual worlds; [0063]-[0066] real-time captured depth sensor data (color plus depth) may be bigger than video feeds from a video camera; [0070]-[0078] FIG. 5A shows view lines of cameras 504, 506, 508, 510, 512, 514, 516, 518 for capturing a user 502. Individual viewpoints are formed by capturing each user by an omni-camera setup (which captures views of a user from all directions around a user) and providing the views from respective directions. The views (remote person's faces) are shown by AR glasses over each of the cameras (remote person's eyes). FIG. 5B is a schematic perspective view illustrating an example square capture environment 550 with 8 cameras according to some embodiments .. virtual conferencing environment; [0072] A camera capture setup defines the position of each local user. For some systems, a user is able to move together with the captured scene inside a tessellated virtual geometry; [0073] FIG. 6 is a system diagram illustrating an example set of interfaces for representing users in a virtual geometry according to some embodiments. FIG. 6 shows a system 600 for creating a virtual geometry for user interaction. 
Systems and methods disclosed herein may be implemented as a decentralized application where tools to manage a virtual constellation (or geometry) … sets the local coordinate system using the derived/given origin and orientation, and the real-world scale; [0076] a reconstruction and perspective processor 730 combines received calibrated sets of depth and texture into a 3D reconstruction of the local space in real world scale; [0090] Appearing in their virtual spatial positions, users are able to virtually visit and directionally view (up to 360°) participating sites. Conferencing supports a virtual cocktail party-type of interaction over network. Unlike people in a real-life cocktail party, participants are brought virtually to a space, as illustrated in FIG. 8. [0091]-[0094] Users' positions in each local space 902, 904, 906 may be captured by electronic means. Spatially faithful (or correct) perspective views of remote participants 910, 912, 914, 916, 918, 920, separated from their real backgrounds, may be formed and transmitted to each local participant, and positioned according to a unified geometry 908; [0147]; [0152]-[0154] A user position in a local space may be derived and updated 2008. [0185]-[0186]).
Valli pertains to systems and methods for managing user positions in a shared virtual geometry and providing a spatially faithful system (Valli, Abstract) and Valli teaches different features, applied in the mapping above, in relation to different exemplary embodiments.  It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, with the teachings of the various exemplary embodiments before them to modify the combination of features to tailor to the needs and goals at hand (Valli, [0211]).
Further, Valli discloses generating data in different viewing perspectives (Valli, FIGS. 6-8, 21; [0045] Views from remote places may be brought to a user’s eye-point, and immersion may be supported by enabling users to view a whole 360° panorama (although a sub-view may be relatively narrow at a time due to the restricted field-of-view of AR glasses); [0051]-[0054], [0070]-[0071] FIG. 3 is a schematic perspective view illustrating an example hexagon-shaped cell in a tessellated space 300 showing how a person located in the middle is seen by neighboring people according to some embodiments; [0076] video may be produced for 360° (full panorama) around each remote users eye-point in the local space), but does not clearly disclose switching between one viewing perspective and another viewing perspective in response to input from the user via a client device. In an analogous art of presenting an interactive virtual environment, Ziman discloses: wherein a viewing perspective of the user accessing the at least one virtual environment is configured to switch between a first-person viewing perspective, a third-person viewing perspective, and a self-viewing perspective in response to input from the user via a client device as the user navigates the at least one virtual environment using the client device (Ziman: [0119] the master visual control system is able to implement input commands from the user device (e.g., switching between a master camera’s “God’s view” to a first-person view); [0120] enable the user to view the environment when the user provides a command to switch vision from the God’s perspective (third-person view) to first-person view; [0128] switching between third-person view and first-person view at the user’s choice).
Valli and Ziman are analogous arts because they are in the same field of endeavor, presenting an interactive virtual environment. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention of Valli using the teachings of Ziman to clearly include switching between third-person view and first-person view at the user’s choice, such as by an input command from the user device. It would provide Valli’s system with the enhanced capability of allowing a user to change the viewing perspective, so that the user has more flexibility in viewing the virtual environment.

Regarding dependent claim 19, Valli et al. teach:
19.   The computer-readable medium of claim 18, wherein the user graphical representation comprises a user 3D virtual cutout with a removed background, or a user real-time 3D virtual cutout with a removed background, or a video with removed background, or video without a removed background (Valli, FIGS. 6-8, 21; [0051]-[0054], [0070]-[0071] a user may have spatially faithful viewpoints only to his or her closest neighbors, captured inside and together with their backgrounds; [0076] a reconstruction and perspective processor 730 combines received calibrated sets of depth and texture into a 3D reconstruction of the local space in real world scale, specifies an origin for the local space either by a rule or by user interaction, specifies an orientation for the local space either by a rule (e.g., compass North) or by user interaction, and sets the local coordinate system using the derived/given origin and orientation, and the real-world scale.  The background around the local user may be removed and made transparent.).

Regarding dependent claim 20, Valli et al. teach:
20.  The computer-readable medium of claim 18, wherein the virtual environment platform enables engaging in an ad hoc virtual communication by opening up an ad hoc communication channel between client devices, wherein multiple user graphical representations are presented as holding a conversation in the virtual environment. (Valli, FIGS. 6-8, 21; [0051]-[0054], [0070]-[0071] FIG. 3 is a schematic perspective view illustrating an example hexagon-shaped cell in a tessellated space 300 showing how a person located in the middle is seen by neighboring people according to some embodiments, [0076] video may be produced for 360° (full panorama) around each remote users eye-point in the local space. For some embodiments, a local background of a perspective video of a user may be replaced with another background. The updated perspective video may be displayed for some embodiments. For some embodiments, a background may be updated based on the location of a user within a local space. For some embodiments, a perspective video may be a panoramic video, such as, for example, a video with a wide-angle or 360-degree view; examiner notes a 360 degree view corresponds to a view from around the point; [0086]-[0090] Appearing in their virtual spatial positions, users are able to virtually visit and directionally view (up to 360°) participating sites. Conferencing supports a virtual cocktail party-type of interaction over network. Unlike people in a real-life cocktail party, participants are brought virtually to a space, as illustrated in FIG. 8; [0091]-[0094]; [0147] real time; [0166] the communications system 100 may be a multiple access system and may employ one or more channel access schemes; [0161] broadcast; [0185]-[0186]; examiner notes ‘virtual cocktail party’ may correspond to graphical representation of holding a conversation).


Claims 6 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over US 20200099891 to Valli and Ziman as applied to claims 1 and 11, and further in view of US 20180352303 to Siddique.


Regarding dependent claim 6, Valli et al. teach:
6.   The system of claim 1, wherein updating of the user graphical representation within the 3D modeled virtual space of the at least one virtual environment (Valli, FIGS. 6-8, 21; [0051]-[0054], [0070]-[0071] FIG. 3 is a schematic perspective view illustrating an example hexagon-shaped cell in a tessellated space 300 showing how a person located in the middle is seen by neighboring people according to some embodiments, [0076] video may be produced for 360° (full panorama) around each remote users eye-point in the local space … a background may be updated based on the location of a user within a local space. For some embodiments, a perspective video may be a panoramic video, such as, for example, a video with a wide-angle or 360-degree view; examiner notes a 360 degree view corresponds to a view from around the point; [0086], [0147] real time; [0153]-[0154] A user position in a local space may be derived and updated 2008; examiner notes updating ‘user position’ may correspond to updating a user status), but Valli does not expressly disclose updating a user status to indicate the availability of the user to engage in an interaction. In an analogous art of user interfaces allowing multiple users to interact, Siddique discloses: comprises updating a user status to indicate the availability of the user to engage in an interaction (Siddique, [0122] the participant status area may be used to indicate the number of viewers or participants in the present interactive experience…any updated participant information may be provided).
Valli and Siddique are analogous arts because they are in the same field of endeavor, user interfaces allowing multiple users to interact. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention of Valli using the teachings of Siddique to clearly include providing the updated user status information. It would provide Valli’s system with the enhanced capability of displaying the updated user status information in the GUI.

As per Claim 13, it recites features that are substantially the same as those features claimed by Claim 6; thus the rationales for rejecting Claim 6 are incorporated herein.


Claims 141-144 are rejected under 35 U.S.C. 103 as being unpatentable over US 20200099891 to Valli and Ziman as applied to claim 1, and further in view of US 20200294317 to Segal.

Regarding dependent claim 141, Valli discloses that the background around the local user may be removed and made transparent (Valli, [0076]) but does not clearly disclose generating a graphical representation with a removed background. Segal discloses:
141.   The system of claim 1, wherein the user graphical representation comprises a virtual cutout created via a 3D virtual reconstruction process using the live data feed as input data to generate a 3D mesh or 3D point cloud of a user with removed background (Segal, [0024] use one or more deep learning capabilities during an on-line video conference to process audio-visual content from a device operated by the remotely located person substantially in real-time to extract the person from the background; [0027] the image can be processed such that background and other content is removed and the individuals are virtually extracted, using measurements, the extracted content (e.g., the people) can be placed in images of the physical environment; [0061] elements that were captured such as the background elements have been removed and the user has been extracted).
Valli and Segal are analogous arts because they are in the same field of endeavor, virtual conferencing systems. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention of Valli using the teachings of Segal to include using one or more deep learning capabilities during a video conference to process content to extract the person from the background, such that the background and other content may be removed. It would provide Valli’s system with the enhanced capability of providing a more realistic view of the user as if being physically present and located in the conference room (Segal [0061]).

Regarding dependent claim 142, Valli discloses: 
142.   The system of claim 141, wherein a receiving client device presents the virtual cutout using a polygonal structure as a frame to support the virtual cutout (Valli [0051]-[0054], [0070]-[0071] FIG. 3 is a schematic perspective view illustrating an example hexagon-shaped cell (polygonal structure) in a tessellated space 300 showing how a person located in the middle is seen by neighboring people according to some embodiments).

Regarding dependent claim 143, Valli-Segal discloses: 
143.   The system of claim 1, wherein the user graphical representation is generated by a process that includes background removal (Segal, [0024] use one or more deep learning capabilities during an on-line video conference to process audio-visual content from a device operated by the remotely located person substantially in real-time to extract the person from the background; [0027] the image can be processed such that background and other content is removed and the individuals are virtually extracted, using measurements, the extracted content (e.g., the people) can be placed in images of the physical environment; [0061] elements that were captured such as the background elements have been removed and the user has been extracted), and wherein the background removal employs image segmentation (Segal: support semantic segmentation) and neural networks (Segal [003] deep learning architectures, such as deep neural networks, deep belief networks, and recurrent neural networks).

Regarding dependent claim 144, Valli-Segal discloses: 
144.   The system of claim 143, wherein the image segmentation comprises instance segmentation or semantic segmentation (Segal: support semantic segmentation).



Conclusion
THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
	Any inquiry concerning this communication or earlier communications from the examiner should be directed to Hua Lu whose telephone number is 571-270-1410 and fax number is 571-270-2410.  The examiner can normally be reached on Mon-Fri 7:30 am to 5:00 pm EST. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matthew Ell can be reached on 571-270-3264.  The fax phone number for the organization where this application or proceeding is assigned is 703-273-8300.  
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/HUA LU/
Examiner, Art Unit 2171