Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claim 19 is objected to because of the following informalities: there appears to be a typographical error at the end of page 37, where a period is used and should be a colon. Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5, 8, 14, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Dine et al. (US 2020/0349735)(Hereinafter referred to as Dine) in view of Russell et al. (US 2020/0175759)(Hereinafter referred to as Russell).

Regarding claim 1, Dine teaches a merged reality system (In some implementations, a first electronic device including a first image sensor uses a processor to perform a method. The method involves obtaining a first set of keyframes based on images of a physical environment captured by the first image sensor. The method generates a mapping defining relative locations of keyframes of the first set of keyframes. The method receives a keyframe corresponding to an image of the physical environment captured at a second, different electronic device and localizes the received keyframe to the mapping. The method then receives an anchor from the second electronic device that defines a position of a virtual object relative to the keyframe. The method displays a CGR environment including the virtual object at a location based on the anchor and the mapping. See Abstract) comprising: 
at least one server comprising at least one processor and memory including a data store storing a persistent virtual world system comprising one or more virtual replicas of real world elements (A virtual vase placed on a real world table top on the second electronic device may appear to be placed on the table top on the first electronic device too. The incorporation of the same keyframe and anchor into the mappings on both devices may help ensure precise or more consistent positioning of the vase on the table in both CGR experiences. See paragraph [0009]), 
and a plurality of connected devices communicating through a network and comprising sensing mechanisms configured to capture real-world data as multi-source data from real-world elements (In accordance with some implementations, techniques for sharing virtual objects among the electronic device in a multiuser SLAM will now be described. FIGS. 5A-5U are diagrams that illustrate an example scenario where multiple users that each perform SLAM of a physical environment share virtual objects. See paragraph [0060])(Sensor data and keyframes from multiple user devices); 
wherein the real-world data is sent to the persistent virtual world system stored in the server to enrich said virtual replicas and synchronize the virtual replicas with corresponding real-world elements, and wherein the at least one server merges the real-world data and the virtual data into the persistent virtual world system in order to augment the real-world data with the virtual data (At block 610, the method 600 obtains a first set of keyframes (e.g., 1 or more keyframes) based on images of a physical environment captured by a first image sensor (e.g., camera) of a first electronic device. In some implementations, the keyframes include information from additional sensors at the first electronic device. In some implementations, the keyframes include feature data defining locations of features with respect to a first pose of the first image sensor. See paragraph [0104])(The method 600 can be performed at a mobile device, HMD, desktop, laptop, or server device. See paragraph [0103])(At block 660, the method 600 displays a CGR environment including the virtual object at a location based on the anchor and the mapping. In some implementations, the method 600 displays the CGR environment in a first 3D coordinate system at the first electronic device. In some implementations, the first electronic device displays the CGR environment on a display at the first electronic device. See paragraph [0109]), but is silent to the virtual replicas comprising virtual data and having self-computing capabilities and autonomous behavior.
Russell teaches utilizing machine learning to determine movement of dynamic objects within new augmented reality and virtual reality environments (Based on the synthetic training data, the machine learning model may determine the movement of a new dynamic object within new virtual environment. See Abstract).
Dine and Russell both teach placing objects in augmented/virtual reality environments, and Russell teaches that the movement of the dynamic objects can be based on machine learning models. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the system of Dine with the machine learning based dynamic object movement technique of Russell such that the objects could freely move around the space in a dynamic way.

Regarding claim 2, Dine in view of Russell teaches the system of claim 1, wherein the virtual replicas include logic and models input through a plurality of software platforms and software engines, and wherein the plurality of software platforms comprise Internet of Things platforms, machine learning platforms, big data platforms, simulation platforms, or spatial data streaming platforms, or a combination thereof, and wherein the plurality of software engines comprise artificial intelligence engines, simulation engines, 3D engines, or haptic engines, or a combination thereof (Russell; Based on the synthetic training data, the machine learning model may determine the movement of a new dynamic object within new virtual environment. See Abstract)(Russell; At 435, the computer system trains a machine learning model based on the synthetic training data. The machine learning model may comprise a plurality of machine learning models and algorithms.)(Russell; By simulating a dynamic object from different viewpoints within the same model it may allow more data points (i.e. more synthetic training data) for training the machine learning model. In one embodiment, by obtaining different viewpoints from within a model, the machine learning model may, upon receiving a new image with a new viewpoint, reference a similar viewpoint in a similar environment so as to better determine a composite dynamic object within the new image from the new viewpoint. See paragraph [0055]).

Regarding claim 3, Dine in view of Russell teaches The system of claim 2, wherein the models comprise one or more of a 3D model, a dynamic model, a geometric model, or a machine learning model, or a combination thereof (Russell; Based on the synthetic training data, the machine learning model may determine the movement of a new dynamic object within new virtual environment. See Abstract)(Russell; At 435, the computer system trains a machine learning model based on the synthetic training data. The machine learning model may comprise a plurality of machine learning models and algorithms.)(Russell; By simulating a dynamic object from different viewpoints within the same model it may allow more data points (i.e. more synthetic training data) for training the machine learning model. In one embodiment, by obtaining different viewpoints from within a model, the machine learning model may, upon receiving a new image with a new viewpoint, reference a similar viewpoint in a similar environment so as to better determine a composite dynamic object within the new image from the new viewpoint. See paragraph [0055]).

Regarding claim 5, Dine in view of Russell teaches The system of claim 3, wherein the machine learning model employs machine learning algorithms based on actual or simulated data that have been used as training data (Russell; Based on the synthetic training data, the machine learning model may determine the movement of a new dynamic object within new virtual environment. See Abstract)(Russell; At 435, the computer system trains a machine learning model based on the synthetic training data. The machine learning model may comprise a plurality of machine learning models and algorithms.)(Russell; By simulating a dynamic object from different viewpoints within the same model it may allow more data points (i.e. more synthetic training data) for training the machine learning model. In one embodiment, by obtaining different viewpoints from within a model, the machine learning model may, upon receiving a new image with a new viewpoint, reference a similar viewpoint in a similar environment so as to better determine a composite dynamic object within the new image from the new viewpoint. See paragraph [0055]).


Regarding claim 8, Dine in view of Russell teaches the system of claim 1, wherein the real-world data comprises real spatial data and the virtual data comprises virtual spatial data, and wherein combinations thereof by the at least one server enable augmenting the real spatial data with the virtual spatial data (Dine; FIG. 6 is a flowchart representation of a method 600 for representing virtual objects in a CGR experience at a first user (e.g., between users in a shared multiuser CGR experience) from the perspective of a different originating user. In some implementations, the method 600 is performed by an electronic device (e.g., FIGS. 1-3). The method 600 can be performed at a mobile device, HMD, desktop, laptop, or server device. See paragraph [0103])(Dine; In another example, the controller 110 is a remote server located outside of the physical environment 105 (e.g., a cloud server, central server, etc.). See paragraph [0023])(Dine; At block 660, the method 600 displays a CGR environment including the virtual object at a location based on the anchor and the mapping. In some implementations, the method 600 displays the CGR environment in a first 3D coordinate system at the first electronic device. In some implementations, the first electronic device displays the CGR environment on a display at the first electronic device. See paragraph [0109]).

Regarding claim 14, Dine teaches a method to generate a merged reality system (In some implementations, a first electronic device including a first image sensor uses a processor to perform a method. The method involves obtaining a first set of keyframes based on images of a physical environment captured by the first image sensor. The method generates a mapping defining relative locations of keyframes of the first set of keyframes. The method receives a keyframe corresponding to an image of the physical environment captured at a second, different electronic device and localizes the received keyframe to the mapping. The method then receives an anchor from the second electronic device that defines a position of a virtual object relative to the keyframe. The method displays a CGR environment including the virtual object at a location based on the anchor and the mapping. See Abstract), the method comprising: mapping real world objects into a virtual world, by generating virtual replicas of the real world objects (A virtual vase placed on a real world table top on the second electronic device may appear to be placed on the table top on the first electronic device too. The incorporation of the same keyframe and anchor into the mappings on both devices may help ensure precise or more consistent positioning of the vase on the table in both CGR experiences. See paragraph [0009]); 
adding models and real-world data related to the real world objects to the virtual replicas  (In accordance with some implementations, techniques for sharing virtual objects among the electronic device in a multiuser SLAM will now be described. FIGS. 5A-5U are diagrams that illustrate an example scenario where multiple users that each perform SLAM of a physical environment share virtual objects. See paragraph [0060])(Sensor data and keyframes from multiple user devices); 
merging the real-world data and virtual data; and augmenting the real-world data with the virtual data (At block 610, the method 600 obtains a first set of keyframes (e.g., 1 or more keyframes) based on images of a physical environment captured by a first image sensor (e.g., camera) of a first electronic device. In some implementations, the keyframes include information from additional sensors at the first electronic device. In some implementations, the keyframes include feature data defining locations of features with respect to a first pose of the first image sensor. See paragraph [0104])(The method 600 can be performed at a mobile device, HMD, desktop, laptop, or server device. See paragraph [0103])(At block 660, the method 600 displays a CGR environment including the virtual object at a location based on the anchor and the mapping. In some implementations, the method 600 displays the CGR environment in a first 3D coordinate system at the first electronic device. In some implementations, the first electronic device displays the CGR environment on a display at the first electronic device. See paragraph [0109]), but is silent to thereby providing self-computing capabilities and autonomous behavior to the virtual replicas.
Russell teaches utilizing machine learning to determine movement of dynamic objects within new augmented reality and virtual reality environments (Based on the synthetic training data, the machine learning model may determine the movement of a new dynamic object within new virtual environment. See Abstract).


Regarding claim 19, Dine teaches one or more non-transitory computer readable-media having stored thereon instructions configured to cause a computer system comprising memory and at least one processor to perform steps (The memory 220 comprises a non-transitory computer readable storage medium. In some implementations, the memory 220 or the non-transitory computer readable storage medium of the memory 220 stores the following programs, modules and data structures, or a subset thereof including an optional operating system 230 and computer-generated reality (CGR) module 240. See paragraph [0037])(To that end, as a non-limiting example, in some implementations the controller 110 includes one or more processing units 202 (e.g., microprocessors). See paragraph [0035])
comprising: 
mapping real world objects into a virtual world by generating virtual replicas of the real world objects in the virtual world (A virtual vase placed on a real world table top on the second electronic device may appear to be placed on the table top on the first electronic device too. The incorporation of the same keyframe and anchor into the mappings on both devices may help ensure precise or more consistent positioning of the vase on the table in both CGR experiences. See paragraph [0009]); 
adding models and real-world data related to the real world objects to the virtual replicas (In accordance with some implementations, techniques for sharing virtual objects among the electronic device in a multiuser SLAM will now be described. FIGS. 5A-5U are diagrams that illustrate an example scenario where multiple users that each perform SLAM of a physical environment share virtual objects. See paragraph [0060])(Sensor data and keyframes from multiple user devices); 
merging the real-world data and virtual data; and augmenting the real-world data with the virtual data (At block 610, the method 600 obtains a first set of keyframes (e.g., 1 or more keyframes) based on images of a physical environment captured by a first image sensor (e.g., camera) of a first electronic device. In some implementations, the keyframes include information from additional sensors at the first electronic device. In some implementations, the keyframes include feature data defining locations of features with respect to a first pose of the first image sensor. See paragraph [0104])(The method 600 can be performed at a mobile device, HMD, desktop, laptop, or server device. See paragraph [0103])( At block 660, the method 600 displays a CGR environment including the virtual object at a location based on the anchor and the mapping. In some implementations, the method 600 displays the CGR environment in a first 3D coordinate system at the first electronic device. In some implementations, the first electronic device displays the CGR environment on a display at the first electronic device. See paragraph [0109]), but is silent to thereby providing self-computing capabilities and autonomous behavior to the virtual replicas; 
Russell teaches utilizing machine learning to determine movement of dynamic objects within new augmented reality and virtual reality environments (Based on the synthetic training data, the machine learning model may determine the movement of a new dynamic object within new virtual environment. See Abstract).
Dine and Russell both teach placing objects in augmented/virtual reality environments, and Russell teaches that the movement of the dynamic objects can be based on machine learning models. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the system of Dine with the machine learning based dynamic object movement technique of Russell such that the objects could freely move around the space in a dynamic way.


Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Dine et al. (US 2020/0349735)(Hereinafter referred to as Dine) in view of Russell et al. (US 2020/0175759)(Hereinafter referred to as Russell) in view of Foley et al. (“KD-Tree Acceleration Structures for a GPU Raytracer”, ACM, 2005.)(Hereinafter referred to as Foley).

Regarding claim 4, Dine in view of Russell teaches the system of claim 3, but is silent to wherein the 3D model comprises a 3D data structure representing at least one 3D object, the 3D data structure comprising quadtrees, BSP trees, sparse voxel octrees, 3D arrays, kD trees, point clouds, wire-frames, boundary representations (B-Rep), constructive solid geometry trees (CSG Trees), bintrees, or hexagonal structures, or combinations thereof.
Foley teaches that a kd-tree data structure is a faster acceleration structure than a uniform grid (To date, most GPU-based raytracers have relied upon uniform grid acceleration structures. In contrast, the kd-tree has gained widespread use in CPU-based raytracers and is regarded as the best general-purpose acceleration structure. We demonstrate two kd-tree traversal algorithms suitable for GPU implementation and integrate them into a streaming raytracer. We show that for scenes with many objects at different scales, our kd-tree algorithms are up to 8 times faster than a uniform grid. See Abstract).
Dine in view of Russell and Foley teach of rendering graphics and Foley teaches that by utilizing a kd-tree acceleration structure an improvement in speed of up to 8 times can be made versus a uniform grid, therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of .
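For illustration only, the kd-tree data structure recited in claim 4 can be sketched as follows. This is a minimal sketch of a 3D point kd-tree with a nearest-neighbor query; it is not the GPU ray-traversal algorithm of Foley, and the names `Node`, `build`, `sq_dist`, and `nearest` are hypothetical and do not appear in any cited reference.

```python
from typing import List, NamedTuple, Optional, Tuple

Point = Tuple[float, float, float]


class Node(NamedTuple):
    """One kd-tree node: a splitting point, its split axis, and two subtrees."""
    point: Point
    axis: int
    left: Optional["Node"]
    right: Optional["Node"]


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Recursively split the points at the median, cycling through x, y, z."""
    if not points:
        return None
    axis = depth % 3
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build(pts[:mid], depth + 1),
                build(pts[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[Node], target: Point,
            best: Optional[Point] = None) -> Optional[Point]:
    """Return the stored point closest to target, pruning subtrees that
    cannot contain a closer point than the best one found so far."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, target) < sq_dist(best, target):
        best = node.point
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if diff * diff < sq_dist(best, target):
        best = nearest(far, target, best)
    return best
```

The pruning step in `nearest` is what lets a kd-tree skip whole regions of space, which is the general property that makes it useful as an acceleration structure.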

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Dine et al. (US 2020/0349735)(Hereinafter referred to as Dine) in view of Russell et al. (US 2020/0175759)(Hereinafter referred to as Russell) in view of DiVerdi et al. (“Level of Detail Interfaces”, IEEE, 2004.)(Hereinafter referred to as DiVerdi).


Regarding claim 6, Dine in view of Russell teaches the system of claim 2, but is silent to wherein the models consider a level of detail required by a specific scenario computation, wherein the level of detail adjusts complexity of a model representation depending on distance of the virtual replica from a viewer, object importance, viewpoint-relative speed or position, classification of individual viewers, or combinations thereof.
	DiVerdi teaches a technique in which the level of detail of the object changes based on the distance from the viewpoint to take into consideration diminished screen space (We present the novel level of detail interface based on the marriage of level of detail geometry and an adaptable user interface. Level of detail interfaces allow applications to parameterize their display of data and interface widgets with respect to distance from the camera, to best take advantage of diminished screen space in a 3D environment. See Abstract).
	Dine in view of Russell and DiVerdi teach of presenting graphical information to a user and DiVerdi teaches that the level of detail can be adjusted with respect to distance from the camera, therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to combine .
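For illustration only, distance-based level-of-detail selection of the kind described in the DiVerdi abstract can be sketched as follows. This is a minimal sketch under stated assumptions: the threshold values, the level numbering (0 is the most detailed), and the names `LOD_THRESHOLDS`, `LOWEST_DETAIL`, and `select_lod` are hypothetical and are not taken from DiVerdi or from the claims.

```python
import math
from typing import Sequence, Tuple

# Hypothetical thresholds: (maximum distance in scene units, level of detail).
# Level 0 is the most detailed representation; higher levels are coarser.
LOD_THRESHOLDS: Sequence[Tuple[float, int]] = ((10.0, 0), (50.0, 1), (200.0, 2))
LOWEST_DETAIL = 3  # used when the object is beyond every threshold


def select_lod(viewer_pos: Sequence[float],
               object_pos: Sequence[float],
               thresholds: Sequence[Tuple[float, int]] = LOD_THRESHOLDS) -> int:
    """Pick a level of detail from the viewer-to-object distance."""
    d = math.dist(viewer_pos, object_pos)
    for max_distance, level in thresholds:
        if d <= max_distance:
            return level
    return LOWEST_DETAIL
```

With these assumed thresholds, an object 5 units from the viewer is shown at level 0, while an object 500 units away falls back to the coarsest level.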
	

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Dine et al. (US 2020/0349735)(Hereinafter referred to as Dine) in view of Russell et al. (US 2020/0175759)(Hereinafter referred to as Russell) in view of Klas et al. (“VR is on the Edge: How to Deliver 360 Videos in Mobile Networks”, ACM, 2017)(Hereinafter referred to as Klas).

Regarding claim 13, Dine in view of Russell teaches the system of claim 1, but is silent to wherein the system employs a cloud to edge distributed computing infrastructure. 
However, Dine teaches of implementing a cloud based computing structure (Dine; FIG. 6 is a flowchart representation of a method 600 for representing virtual objects in a CGR experience at a first user (e.g., between users in a shared multiuser CGR experience) from the perspective of a different originating user. In some implementations, the method 600 is performed by an electronic device (e.g., FIGS. 1-3). The method 600 can be performed at a mobile device, HMD, desktop, laptop, or server device. See paragraph [0103])(Dine; In another example, the controller 110 is a remote server located outside of the physical environment 105 (e.g., a cloud server, central server, etc.). See paragraph [0023]). 
Klas teaches that distributed cloud-computing capabilities at the edge of a mobile network can reduce network congestion and response time to achieve efficient network operation for a better user experience (MEC provides distributed cloud-computing capabilities at the edge of the mobile network, within the Radio Access Network (RAN) and in close proximity to customers. The aim is to reduce network congestion and response time, achieve highly efficient network operation, and offer a better user experience. Leveraging IT technologies and APIs, MEC also allows mobile operators to open their network to authorized application developers and content providers, providing direct access to real-time information from the underlying radio transport (e.g. an API to the Radio Network Information Service which provides real-time details about the device’s radio access bearer). Moving workloads to the edge is then the argument for enabling highly responsive services, supported by smaller and slimmer devices, with improved user QoE (Quality of Experience) in 4G networks immediately, without waiting for 5G enhancements. See section 3.2).
Dine in view of Russell and Klas teach of providing virtual experiences to users of mobile devices and Klas teaches that by utilizing distributed cloud-computing at the edge of the mobile network the system can realize network efficiencies, therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine the system of Dine in view of Russell with the distributed cloud-computing capabilities of Klas such that the system could provide the user with a better, network efficient, experience.


Allowable Subject Matter
Claims 7, 9-12, 15-18 and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:  

The prior art of record alone or in combination is silent to the limitations “wherein the classification of individual viewers comprises artificial intelligence viewers or human viewers, and wherein the level of detail is further adjusted depending on a sub-classification of artificial intelligence viewer or of human viewer.” of claim 7 when read in light of the rest of the limitations in claim 7 and the claims from which claim 7 depends and thus claim 7 contains allowable subject matter.


The prior art of record alone or in combination is silent to the limitations “wherein the virtual spatial data represents a desired location input by a user via a user device, the desired location being different from a real location of the user, prompting the processor to create a copy of an avatar of the user in the desired location.” of claim 9 when read in light of the rest of the limitations in claim 9 and the claims from which claim 9 depends and thus claim 9 contains allowable subject matter.

The prior art of record alone or in combination is silent to the limitations “wherein connected virtual replicas create a virtual replica network.” of claim 10 when read in light of the rest of the limitations in claim 10 and the claims from which claim 10 depends and thus claim 10 contains allowable subject matter.
	Claims 11 and 12 contain allowable subject matter because they depend on a claim that contains allowable subject matter.

The prior art of record alone or in combination is silent to the limitations “further comprising: obtaining real spatial data and virtual spatial data; checking whether the real and virtual spatial data coincide; where the real spatial data and virtual spatial data do not coincide, creating a copy of a user avatar in a desired location; where the real spatial data and virtual spatial data coincide, merging the real and virtual spatial data; and augmenting the real spatial data with the virtual spatial data.” of claim 15 when read in light of the rest of the limitations in claim 15 and the claims from which claim 15 depends and thus claim 15 contains allowable subject matter.
	Claims 16-18 contain allowable subject matter because they depend on a claim containing allowable subject matter.



The prior art of record alone or in combination is silent to the limitations “further comprising: obtaining real spatial data corresponding to a real location and virtual spatial data corresponding to a desired location in the virtual world; checking whether the real and virtual spatial data coincide; where the real spatial data and virtual spatial data do not coincide, creating a copy of a user avatar in the desired location; where the real spatial data and virtual spatial data coincide, merging the real and virtual spatial data; and augmenting the real spatial data with the virtual spatial data.” of claim 20 when read in light of the rest of the limitations in claim 20 and the claims from which claim 20 depends and thus claim 20 contains allowable subject matter.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICHOLAS R WILSON whose telephone number is (571)272-0936.  The examiner can normally be reached on M-F 7:30-5:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at (571)272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.







/NICHOLAS R WILSON/Primary Examiner, Art Unit 2611