Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION

Status of Claims
Claims 1-20 are currently pending in this application.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on June 3, 2021 is hereby acknowledged. All references have been considered by the examiner. Initialed copies of the PTO-1449 are included in this correspondence.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1-3, 8 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Gausebeck (2019/0026957; IDS).

Regarding claim 1, Gausebeck teaches a method for generating a floorplan of an indoor scene (e.g., computer-implemented method for developing and training 2D-from-3D models; Gausebeck: [0027].  The 3D model generation component 118 can generate a 3D model or representation of the 3D model of an environment corresponding to a floorplan model of the environment, a dollhouse model of the environment (e.g., in implementations in which the environment comprises an interior of an architectural space, such as house), and the like. Gausebeck: [0077] L.14-20), comprising: 
determining a room classification of a room and a wall classification of a wall for the room from an input image of an indoor scene (e.g., A floorplan model generated by the 3D model generation component 118 can be a 3D floorplan model or a 2D floorplan model. A 3D floorplan model can comprise edges of each floor, wall, and ceiling as lines. Lines for floors, walls and ceilings can be dimensioned (e.g., annotated) with an associated size. In one or more embodiments, a 3D floorplan model can be navigated via a viewer on a remote device in 3D. In an aspect, subsections of the 3D floorplan model (e.g., rooms) can be associated with a textual data (e.g., a name). Gausebeck: [0080] L.1-10. It is obvious that a room is identified as an area enclosed by a plurality of walls, and hence by a plurality of lines enclosing that area); and
determining a floorplan based at least in part upon the room classification and the wall classification without constraining a total number of rooms in the indoor scene or a size of the room (e.g., Measurement data (e.g., square footage, etc.) associated with surfaces can also be determined based on the derived 3D data corresponding to the respective surfaces and associated with the respective surfaces. These measurements can be displayed in association with viewing and/or navigation of the 3D floorplan model. Calculation of area (e.g., square footage) can be determined for any identified surface or portion of a 3D model with a known boundary, for example, by summing areas of polygons comprising the identified surface or the portion of the 3D model. Displays of individual items (e.g., dimensions) and/or classes of items can be toggled in a floorplan via a viewer on a remote device (e.g., via a user interface on a remote client device). A 2D floorplan model can include surfaces (e.g., walls, floors, ceilings, etc), portals (e.g., door openings) and/or window openings associated with derived 3D data 116 used to generate a 3D model and projected to a flat 2D surface. In yet another aspect, a floorplan can be viewed at a plurality of different heights with respect to vertical surfaces (e.g., walls) via a viewer on a remote device. Gausebeck: [0080] L.10-30. It is obvious that a room is an area enclosed by walls having at least one door opening).
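For illustration only, the area calculation quoted above from Gausebeck [0080] ("summing areas of polygons comprising the identified surface") can be sketched as follows. This is a minimal sketch, not code from the reference: the function names `polygon_area` and `surface_area`, the 2D vertex representation, and the use of the shoelace formula are all assumptions made for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # assumed 2D vertex (x, y) in floor-plane units


def polygon_area(vertices: List[Point]) -> float:
    """Area of a simple polygon using the shoelace formula.

    Vertices are assumed to be listed in order around the boundary.
    """
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def surface_area(polygons: List[List[Point]]) -> float:
    """Sum the areas of the non-overlapping polygons that make up a surface."""
    return sum(polygon_area(p) for p in polygons)


# Hypothetical floor made of two adjacent rectangles (4 x 3 and 2 x 3 units).
floor = [
    [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)],
    [(4.0, 0.0), (6.0, 0.0), (6.0, 3.0), (4.0, 3.0)],
]
total = surface_area(floor)
```

The sketch assumes the polygons do not overlap; overlapping polygons would be counted twice.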

Regarding claim 2, Gausebeck teaches the method of claim 1, wherein determining the room classification of the room and the wall classification of the wall comprises: 
identifying the input image, wherein the input image comprises one image or a sequence of images from a three-dimensional scan of the indoor scene (e.g., Portions of the 3D model geometric data (e.g., the mesh) can include image data describing texture, color, intensity, and the like. For example, the geometric data can comprise data points of geometry in addition to comprising texture coordinates associated with the data points of geometry (e.g., texture coordinates that indicate how to apply texture data to geometric data). In various embodiments, received 2D image data 102 (or portions thereof) can be associated with portions of the mesh to associate visual data from the 2D image data 102 (e.g., texture data, color data, etc.) with the mesh. In this regard, the 3D model generation component 118 can generate 3D models based on 2D images and the 3D data respectively associated with the 2D images. In an aspect, data used to generate 3D models can be collected from scans (e.g. utilizing sensors) of real-world scenes, spaces (e.g. houses, office spaces, outdoor spaces, etc.), objects (e.g. furniture, decorations, goods, etc.), and the like. Data can also be generated based on computer implemented 3D modeling systems. Gausebeck: [0071]. In some implementations, a representation or rendering of a 3D model can be a 2D image or panorama associated with the 3D model from a specific perspective of a virtual camera located at a specific navigation position and orientation relative to the 3D model. Gausebeck: [0057] L.8-12); and
determining an input point cloud for the input image (e.g., In various embodiments, the 3D model generation component 118 can employ the derived 3D data 116 for respective images received by the computing device 104 to generate reconstructed 3D models of objects or environments included in the images. The 3D models described herein can include data representing positions, geometric shapes, curved surfaces, and the like. For example, a 3D model can include a collection of points represented by 3D coordinates, such as points in a 3D Euclidean space. The collection of points can be associated with each other (e.g. connected) by geometric entities. For example, a mesh comprising a series of triangles, lines, curved surfaces (e.g. non-uniform rational basis splines (NURBS)), quads, n-grams, or other geometric shapes can connect the collection of points. For example, a 3D model of an interior environment of building can comprise mesh data (e.g., a triangle mesh, a quad mesh, a parametric mesh, etc.), one or more texture-mapped meshes (e.g., one or more texture-mapped polygonal meshes, etc.), a point cloud, a set of point clouds, surfels and/or other data constructed by employing one or more 3D sensors.  Gausebeck: [0070] L.1-21).

Regarding claim 3, Gausebeck teaches the method of claim 2, wherein determining the room classification of the room and the wall classification of the wall further comprises: 
identifying a subset of the input point cloud (e.g., a representation or rendering of a 3D model can be the 3D model or a part of the 3D model generated from a specific navigation position and orientation of a virtual camera relative to the 3D model and generated using aligned sets or subsets of captured 3D data employed to generate the 3D model. Gausebeck: [0057] L.13-18.  A 3D model of an interior environment of building can comprise mesh data (e.g., a triangle mesh, a quad mesh, a parametric mesh, etc.), one or more texture-mapped meshes (e.g., one or more texture-mapped polygonal meshes, etc.), a point cloud, a set of point clouds, surfels and/or other data constructed by employing one or more 3D sensors.  Gausebeck: [0070] L.15-21); and 
training a deep network with at least a synthetic dataset (e.g., The 3D-from-2D model development module 3314 can further include model training component 3318, which can be configured to employ the training data to train and/or develop one or more 3D-from-2D neural network models included in the 3D-from-2D model database 3326. Gausebeck: [0245] L.13-18. In some implementations, the training data development component 3316 can employ a textured 3D mesh of a 3D space model included in the 3D model and alignment data 3304 to generate 2D images from camera positions where a real camera was never placed. For instance, the training data development component 3316 can use capture position/orientation information for respective images included in the indexed 2D image data 3306 to determine various virtual capture position/orientation combinations that are not represented by the captured 2D images. The training data development component 3316 can further generate synthetic images of the 3D model from these virtual capture positions/orientations. In some implementations, the training data development component 3316 can generate synthetic 2D images from various perspective of the 3D model that correspond to a sequence of images captured by a virtual camera in association with navigating the 3D space model, wherein the navigation assimilates a capture scenario as if a user were actually walking through the environment represented by the 3D model while holding a camera and capturing images along the way. Gausebeck: [0249] L.20-40).

Regarding claim 8, the claim is a system claim corresponding to method claim 1. The claim is similar in scope to claim 1 and is rejected under a similar rationale as claim 1.
Claim 8 further recites: 
a processor (e.g., The method can comprise receiving a panoramic image by a system comprising a processor, Gausebeck: [0032] L.3-4); and 
memory operatively coupled to the processor and storing a sequence of instructions which, when executed by the processor, causing the processor to perform a set of acts (e.g., a memory that stores computer executable components, and a processor that executes the computer executable components stored in the memory. Gausebeck: [0044] L.3-5. Aspects of systems, apparatuses or processes explained in this disclosure can constitute machine-executable components embodied within machine(s), e.g. embodied in one or more computer readable mediums (or media) associated with one or more machines. Such components, when executed by the one or more machines, e.g. computer(s), computing device(s), virtual machine(s), etc. can cause the machine(s) to perform the operations described. Gausebeck: [0061] L.6-13).

Regarding claim 15, the claim is a wearable extended reality device claim corresponding to method claim 1. The claim is similar in scope to claim 1 and is rejected under a similar rationale as claim 1.
Claim 15 further recites: 
an optical system having an array of micro-displays or micro-projectors to present digital contents to an eye of a user (e.g., the user device 130 can include but is not limited to: a desktop computer, a laptop computer, a mobile phone, a smartphone, a tablet personal computer (PC), a personal digital assistant (PDA), a heads-up display (HUD), a virtual reality (VR) headset, augmented reality (AR) headset or device, a standalone digital camera, or another type of wearable computing device. Gausebeck: [0062] L.40-49); 
a processor coupled to the optical system (e.g., a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a 3D data derivation component configured to employ one or more 3D-from-2D neural network models to derive 3D data for the 2D images. Gausebeck: [0044] L.4-6; the computer executable components can comprise a communication component configured to send the 2D images and the 3D data to an external device, Gausebeck: [0044] L.13-16; the communication component also be configured to receive the 3D model from the external device and device can render the 3D model via a display of the device. Gausebeck: [0044] L.20-23); and 
memory operatively coupled to the processor and storing a sequence of instructions which, when executed by the processor, causes the processor to perform a set of acts, the set of acts (e.g., a memory that stores computer executable components, and a processor that executes the computer executable components stored in the memory. Gausebeck: [0044] L.3-5. Aspects of systems, apparatuses or processes explained in this disclosure can constitute machine-executable components embodied within machine(s), e.g. embodied in one or more computer readable mediums (or media) associated with one or more machines. Such components, when executed by the one or more machines, e.g. computer(s), computing device(s), virtual machine(s), etc. can cause the machine(s) to perform the operations described. Gausebeck: [0061] L.6-13).

Claims 4-7, 9-10, 13-14, 16-17 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Gausebeck as applied to claims 3, 8 and 15 above, and further in view of Ebrahimi et al. (2019/0035099; IDS).

Regarding claim 4, Gausebeck teaches the method of claim 3, wherein determining the room classification of the room and the wall classification of the wall further comprises: 
generating, at a deep network, one or more room cluster labels for one or more vertices represented in the subset and a wall cluster label for the wall (see 4_1 below).
While Gausebeck does not explicitly teach this limitation, Ebrahimi teaches:
(4_1). generating, at a deep network, one or more room cluster labels for one or more vertices represented in the subset and a wall cluster label for the wall (e.g., In some embodiments, maps may be three dimensional maps, e.g., indicating the position of walls, furniture, doors, and the like in a room being mapped. In some embodiments, maps may be two dimensional maps, e.g., point clouds or polygons or finite ordered list indicating obstructions at a given height (or range of height, for instance from zero to 5 or 10 centimeters or less) above the floor. Ebrahimi: [0053] L.1-8. Nodes on the graph are identified in response to non-core corresponding depth vectors being within a threshold distance of a core depth vector in the graph, and in response to core depth vectors in the graph being reachable by other core depth vectors in the graph, where to depth vectors are reachable from one another if there is a path from one depth vector to the other depth vector where every link and the path is a core depth vector and is it within a threshold distance of one another. The set of nodes in each resulting graph, in some embodiments, may be designated as a cluster, and points excluded from the graphs may be designated as outliers that do not correspond to clusters. Ebrahimi: [0061] L.18-29. Some embodiments may then determine the centroid of each cluster in the spatial dimensions of an output depth vector for constructing floor plan maps. Ebrahimi: [0062] L.1-3).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Ebrahimi into the teaching of Gausebeck so that clustering is used for constructing a floor plan map.

Regarding claim 5, the combined teaching of Gausebeck and Ebrahimi teaches the method of claim 4, wherein generating the one or more room cluster labels and the wall cluster label comprises: 
performing a nested partitioning on a set of points to divide the set of points into a plurality of overlapping local regions based at least in part upon a distance metric pertaining to the indoor scene (e.g., using a standalone digital camera, a smartphone, or similar device with a camera, a user can walk around an environment and take 2D images at several points nearby along the way, capturing different perspectives of the environment. In another example implementation, related 2D images can include 2D images from nearby or overlapping perspectives captured by a single camera in association rotation of the camera about a fixed axis. In another implementation, related 2D images can include two or more images respectively captured by two or more cameras with partially overlapping fields-of-view or different perspective of an environment (e.g., captured by different cameras at or near the same time). Gausebeck: [0141] L.28-41. The image correlation component 920 can be configured to automatically classify respective frames of video included in a same video clip having less than a defined duration and/or associated with a defined range of movement based on the capture device motion data 904 (e.g., movement in a particular direction less than a threshold distance or degree of rotation) as being related. Gausebeck: [0151] L.24-32); and 
extracting a local feature that captures a geometric structure in the indoor scene at least by recursively performing semantic feature extraction on the nested partitioning of the set of points (e.g., The alignment process can involve iteratively aligning different point clouds from neighboring and overlapping images captured from different positions and orientations relative to an object or environment to generate a global alignment between the respective point clouds using correspondences in derived position information for the respective points. Visual feature information including correspondences in color data, texture data, luminosity data, etc. for respective points or pixels included in the point clouds can also be used (along with other sensor data if available) to generate the aligned data. Gausebeck: [0075] L.16-26).

Regarding claim 6, the combined teaching of Gausebeck and Ebrahimi teaches the method of claim 5, wherein generating the one or more room cluster labels and a wall cluster label comprises: 
abstracting the local feature into a higher-level feature or representation (e.g., Deep learning models can include one or more layers that learn using supervised learning (e.g., classification) and/or unsupervised learning (e.g., pattern analysis) manners. In some implementations, deep learning techniques for deriving 3D data from 2D images can learn using multiple levels of representations that correspond to different levels of abstraction, wherein the different levels form a hierarchy of concepts. Gausebeck: [0068] L.14-22); and 
adaptively weighing a plurality of local features at multiple, different scales or resolutions (e.g., There are many existing models for 3D-from-2D depth prediction based on deep convolutional neural networks. One approach is to use fully convolutional residual networks that directly predict depth values as regression outputs. Other models use multi-scale neural networks to separate overall scale prediction from prediction of the fine details. Gausebeck: [0069] L.1-7. In one or more implementations, the 3D-from-2D convolutional neural network accounts for weighted values applied to respective pixels based on their projected angular area during training. In this regard, the 3D-from-2D neural network model can include a model that was trained based on weighted values applied to respective pixels of projected panoramic images in association with deriving depth data for the respective pixels, wherein the weighted values varied based on an angular area of the respective pixels. For example, during training, the weighted values were decreased as the angular area of the respective pixels decreased. In addition, in some implementations, downstream convolutional layers of the convolutional layers that follow a preceding layer are configured to re-project a portion of the panoramic image processed by the preceding layer in association with deriving depth data for the panoramic image, resulting in generation of a re-projected version of the panoramic image for each of the downstream convolutional layers. Gausebeck: [0118] L.1-19).

Regarding claim 7, the combined teaching of Gausebeck and Ebrahimi teaches the method of claim 6, wherein generating the one or more room cluster labels and a wall cluster label comprises: 
combining the plurality of local features at the multiple, different scales or resolutions (e.g., In some embodiments, thresholding may be used in identifying the area of overlap wherein areas or objects of interest within an image may be identified using thresholding as different areas or objects have different ranges of pixel intensity. For example, an object captured in an image, the object having high range of intensity, can be separated from a background having low range of intensity by thresholding wherein all pixel intensities below a certain threshold are discarded or segmented, leaving only the pixels of interest. In some embodiments, a metric can be used to indicate how good of an overlap there is between the two sets of perceived depths. Ebrahimi: [0047] L.41-52); and 
assigning the one or more room cluster labels and the wall cluster label to a metric space for the indoor scene based at least in part upon the distance metric (e.g., Some embodiments may implement DB-SCAN on depths and related values like pixel intensity, e.g., in a vector space that includes both depths and pixel intensities corresponding to those depths, to determine a plurality of clusters, each corresponding to depth measurements of the same feature of an object. Some embodiments may execute a density-based clustering algorithm, like DBSCAN, to establish groups corresponding to the resulting clusters and exclude outliers. To cluster according to depth vectors and related values like intensity, some embodiments may iterate through each of the depth vectors and designate a depth vectors as a core depth vector if at least a threshold number of the other depth vectors are within a threshold distance in the vector space (which may be higher than three dimensional in cases where pixel intensity is included). Some embodiments may then iterate through each of the core depth vectors and create a graph of reachable depth vectors, where nodes on the graph are identified in response to non-core corresponding depth vectors being within a threshold distance of a core depth vector in the graph, and in response to core depth vectors in the graph being reachable by other core depth vectors in the graph, where to depth vectors are reachable from one another if there is a path from one depth vector to the other depth vector where every link and the path is a core depth vector and is it within a threshold distance of one another. The set of nodes in each resulting graph, in some embodiments, may be designated as a cluster, and points excluded from the graphs may be designated as outliers that do not correspond to clusters. Ebrahimi: [0061]).
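For illustration only, the density-based clustering described in the quoted passage of Ebrahimi [0061] can be sketched as below. This is a simplified DBSCAN-style sketch under stated assumptions, not the reference's implementation: the function name `dbscan`, the parameters `eps` and `min_pts`, and the (x, y, intensity) vector layout are hypothetical. Following the quoted wording, a vector is treated as a core vector when at least `min_pts` of the other vectors lie within `eps` of it.

```python
import math
from typing import List, Optional, Sequence

NOISE = -1  # label for outliers that belong to no cluster


def dbscan(points: Sequence[Sequence[float]], eps: float, min_pts: int) -> List[int]:
    """Return a cluster label per point; NOISE marks outliers."""
    n = len(points)
    labels: List[Optional[int]] = [None] * n

    def neighbors(i: int) -> List[int]:
        # Indices of the OTHER points within eps of point i (Euclidean distance).
        return [j for j in range(n)
                if j != i and math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        stack = list(nbrs)
        while stack:
            j = stack.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # j is a core point, so keep expanding
                stack.extend(j_nbrs)
        cluster += 1
    return [NOISE if lab is None else lab for lab in labels]


# Hypothetical depth vectors: (x, y, pixel intensity).
vectors = [
    (0.0, 0.0, 0.1), (0.1, 0.0, 0.1), (0.0, 0.1, 0.1),
    (5.0, 5.0, 0.9), (5.1, 5.0, 0.9), (5.0, 5.1, 0.9),
    (20.0, 20.0, 0.5),
]
labels = dbscan(vectors, eps=0.5, min_pts=2)
```

In this example the first three vectors form one cluster, the next three form a second cluster, and the isolated last vector is marked as an outlier.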

Regarding claim 9, Gausebeck teaches the system of claim 8, the memory comprising the sequence of instructions which, when executed by the processor, causes the processor to perform determining the floorplan further comprises instructions which, when executed by the processor, causes the processor to perform the set of acts that further comprises: 
generating a shape for the room using at least the room classification and the wall classification, wherein the room classification comprises a room cluster label assigned to or associated with the room, and the wall classification comprises one or more wall cluster labels assigned to or associated with one or more walls of the room, and the one or more walls comprise the wall (see 9_1 below); and 
generating the floorplan at least by aggregating or integrating an estimated room perimeter relative to a global coordinate system based at least in part upon the shape, wherein the shape comprises a polygon of a DeepPerimeter type (see 9_2 below).
While Gausebeck does not explicitly teach these limitations, Ebrahimi teaches:
(9_1). generating a shape for the room using at least the room classification and the wall classification, wherein the room classification comprises a room cluster label assigned to or associated with the room, and the wall classification comprises one or more wall cluster labels assigned to or associated with one or more walls of the room, and the one or more walls comprise the wall (e.g., In some embodiments, maps may be three dimensional maps, e.g., indicating the position of walls, furniture, doors, and the like in a room being mapped. In some embodiments, maps may be two dimensional maps, e.g., point clouds or polygons or finite ordered list indicating obstructions at a given height (or range of height, for instance from zero to 5 or 10 centimeters or less) above the floor. Ebrahimi: [0053] L.1-8. Nodes on the graph are identified in response to non-core corresponding depth vectors being within a threshold distance of a core depth vector in the graph, and in response to core depth vectors in the graph being reachable by other core depth vectors in the graph, where to depth vectors are reachable from one another if there is a path from one depth vector to the other depth vector where every link and the path is a core depth vector and is it within a threshold distance of one another. The set of nodes in each resulting graph, in some embodiments, may be designated as a cluster, and points excluded from the graphs may be designated as outliers that do not correspond to clusters. Ebrahimi: [0061] L.18-29. Some embodiments may then determine the centroid of each cluster in the spatial dimensions of an output depth vector for constructing floor plan maps. Ebrahimi: [0062] L.1-3);
(9_2). generating the floorplan at least by aggregating or integrating an estimated room perimeter relative to a global coordinate system based at least in part upon the shape, wherein the shape comprises a polygon of a DeepPerimeter type (e.g., Since the overlapping depths from the first and second fields of view within the area of overlap do not necessarily have the exact same values and a range of tolerance between their values is allowed, the overlapping depths from the first and second fields of view are used to calculate new depths for the overlapping area using a moving average or another suitable mathematical convolution. This is expected to improve the accuracy of the depths as they are calculated from the combination of two separate sets of measurements. The newly calculated depths are used as the depths for the overlapping area, substituting for the depths from the first and second fields of view within the area of overlap. The new depths are then used as ground truth values to adjust all other perceived depths outside the overlapping area. Once all depths are adjusted, a first segment of the floor plan is complete. This method may be repeated such that the camera perceives depths (or pixel intensities indicative of depth) within consecutively overlapping fields of view as it moves, and the control system identifies the area of overlap and combines overlapping depths to construct a floor plan of the environment. Ebrahimi: [0034] L.55-75. The resulting floor plan may be encoded in various forms. For instance, some embodiments may construct a point cloud of two dimensional or three dimensional points by transforming each of the vectors into a vector space with a shared origin, e.g., based on the above-described displacement vectors, in some cases with displacement vectors refined based on measured depths. 
Or some embodiments may represent maps with a set of polygons that model detected surfaces, e.g., by calculating a convex hull over measured vectors within a threshold area, like a tiling polygon. Polygons are expected to afford faster interrogation of maps during navigation and consume less memory than point clouds at the expense of greater computational load when mapping. Vectors need not be labeled as “vectors” in program code to constitute vectors, which is not to suggest that other mathematical constructs are so limited. In some embodiments, vectors may be encoded as tuples of scalars, as entries in a relational database, as attributes of an object, etc. Similarly, it should be emphasized that images need not be displayed or explicitly labeled as such to constitute images. Moreover, sensors may undergo some movement while capturing a given image, and the “pose” of a sensor corresponding to a depth image may, in some cases, be a range of poses over which the depth image is captured. Ebrahimi: [0052]).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Ebrahimi into the teaching of Gausebeck so that overlapping features are detected and labeled to determine a floor plan.
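For illustration only, the polygon representation quoted above from Ebrahimi [0052] ("calculating a convex hull over measured vectors") can be sketched as follows. This is a minimal sketch, not the reference's implementation: it assumes 2D points and uses Andrew's monotone chain algorithm, which the reference does not name; the function names `cross` and `convex_hull` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # assumed 2D measured point (x, y)


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))  # sort by x, then y; drop duplicates
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Pop points that would make a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


# Hypothetical measured points: four room corners plus interior and edge points.
measured = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = convex_hull(measured)
```

A convex hull cannot represent concave room outlines such as L-shaped rooms, so this sketch illustrates only the convex case described in the quoted passage.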

Regarding claim 10, the combined teaching of Gausebeck and Ebrahimi teaches the system of claim 9, the memory comprising the sequence of instructions which, when executed by the processor, causes the processor to perform generating the shape further comprises instructions which, when executed by the processor, causes the processor to perform the set of acts that further comprises: 
performing a deep estimation on an RGB (red green blue) frame of the input image of the indoor scene (e.g., The standard models 114 can also include one or more models that perform 3D-from-2D depth estimation using non-parameter algorithms. Non-parameter algorithms learn depth from a single RGB image, relying on the assumption that the similarities between regions in the RGB images imply similar depth cues. After clustering the training dataset based on global features, these models first search the candidate RGB-D of the input RGB image in the feature space, then, the candidate pairs are warped and fused to obtain the final depth.  Gausebeck: [0067] L.9-18); and 
generating a depth map and a wall segmentation mask at least by using a multi-view depth estimation network and a segmentation module, wherein the segmentation module is based at least in part upon a PSPNet (Pyramid scene parsing network) and a ResNet (residual network) (e.g., There are many existing models for 3D-from-2D depth prediction based on deep convolutional neural networks. One approach is to use fully convolutional residual networks that directly predict depth values as regression outputs. Other models use multi-scale neural networks to separate overall scale prediction from prediction of the fine details. Some models refine the results by incorporating fully-connected layers, adding conditional random field (CRF) elements to the network, or predicting additional outputs such as normal vectors and combining those with the initial depth predictions to produce refined depth predictions. Gausebeck: [0069]).

Regarding claim 13, the combined teaching of Gausebeck and Ebrahimi teaches the system of claim 9, the memory comprising the sequence of instructions which, when executed by the processor, causes the processor to perform generating the floorplan further comprises instructions which, when executed by the processor, causes the processor to perform the set of acts that further comprises: 
identifying a room instance and a wall instance from a scan of the indoor environment (e.g., A floorplan model generated by the 3D model generation component 118 can be a 3D floorplan model or a 2D floorplan model. A 3D floorplan model can comprise edges of each floor, wall, and ceiling as lines. Lines for floors, walls and ceilings can be dimensioned (e.g., annotated) with an associated size. In one or more embodiments, a 3D floorplan model can be navigated via a viewer on a remote device in 3D. In an aspect, subsections of the 3D floorplan model (e.g., rooms) can be associated with a textual data (e.g., a name). Gausebeck: [0080] L.1-10); and 
estimating a closed perimeter for the room instance (e.g., Measurement data (e.g., square footage, etc.) associated with surfaces can also be determined based on the derived 3D data corresponding to the respective surfaces and associated with the respective surfaces. These measurements can be displayed in association with viewing and/or navigation of the 3D floorplan model. Calculation of area (e.g., square footage) can be determined for any identified surface or portion of a 3D model with a known boundary, for example, by summing areas of polygons comprising the identified surface or the portion of the 3D model. Displays of individual items (e.g., dimensions) and/or classes of items can be toggled in a floorplan via a viewer on a remote device (e.g., via a user interface on a remote client device). A 2D floorplan model can include surfaces (e.g., walls, floors, ceilings, etc), portals (e.g., door openings) and/or window openings associated with derived 3D data 116 used to generate a 3D model and projected to a flat 2D surface. In yet another aspect, a floorplan can be viewed at a plurality of different heights with respect to vertical surfaces (e.g., walls) via a viewer on a remote device. Gausebeck: [0080] L.10-30).
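Illustrative note (not drawn from Gausebeck): the area calculation quoted above, summing the areas of the polygons that make up an identified surface, can be sketched as follows. The sketch assumes each polygon is a triangle given as three 3D vertex tuples; the function names and the sample floor are hypothetical.

```python
import math

def triangle_area(a, b, c):
    """Area of a 3D triangle: half the magnitude of the cross product of two edges."""
    ux, uy, uz = b[0] - a[0], b[1] - a[1], b[2] - a[2]
    vx, vy, vz = c[0] - a[0], c[1] - a[1], c[2] - a[2]
    cx = uy * vz - uz * vy
    cy = uz * vx - ux * vz
    cz = ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def surface_area(triangles):
    """Sum the areas of all triangles that make up one identified surface."""
    return sum(triangle_area(*t) for t in triangles)

# Hypothetical 4 x 3 rectangular floor split into two triangles.
floor = [
    ((0, 0, 0), (4, 0, 0), (4, 3, 0)),
    ((0, 0, 0), (4, 3, 0), (0, 3, 0)),
]
floor_area = surface_area(floor)
```

A surface with a known boundary that is meshed into triangles can be measured this way regardless of its orientation in 3D.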

Regarding claim 14, the combined teaching of Gausebeck and Ebrahimi teaches the system of claim 13, the memory comprising the sequence of instructions which, when executed by the processor, causes the processor to perform generating the floorplan further comprises instructions which, when executed by the processor, causes the processor to perform the set of acts that further comprises: 
predicting a number of clusters at least by using a voting architecture (e.g., These standard models 114 can include various types of 3D-from-2D prediction models configured to receive a single 2D image as input and process the 2D image using one or more machine learning techniques to infer or predict 3D/depth data for 2D image. The machine learning techniques can include for example, supervised learning techniques, unsupervised learning techniques, semi-supervised learning techniques, decision tree learning techniques, association rule learning techniques, artificial neural network techniques, inductive logic programming techniques, support vector machine techniques, clustering techniques, Bayesian network techniques, reinforcement learning techniques, representation learning techniques, and the like.  Gausebeck: [0066] L.13-25); and 
extracting a plurality of features at least by performing room or wall regression that computes the plurality of features at one or more scales (e.g., There are many existing models for 3D-from-2D depth prediction based on deep convolutional neural networks. One approach is to use fully convolutional residual networks that directly predict depth values as regression outputs. Other models use multi-scale neural networks to separate overall scale prediction from prediction of the fine details. Some models refine the results by incorporating fully-connected layers, adding conditional random field (CRF) elements to the network, or predicting additional outputs such as normal vectors and combining those with the initial depth predictions to produce refined depth predictions. Gausebeck: [0069]).

Regarding claims 16, 17 and 20, the claims are wearable extended reality device claims of system claims 9, 10 and 13 respectively.  The claims are similar in scope to claims 9, 10 and 13 respectively and they are rejected under similar rationale as claims 9, 10 and 13 respectively.

Claims 11-12 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Gausebeck and Ebrahimi as applied to claims 10 and 17, and further in view of Steinbrucker et al. (2019/0197777; IDS).

Regarding claim 11, the combined teaching of Gausebeck and Ebrahimi teaches the system of claim 10, the memory comprising the sequence of instructions which, when executed by the processor, causes the processor to perform generating the shape further comprises instructions which, when executed by the processor, causes the processor to perform the set of acts that further comprises: 
extracting a wall point cloud at least by fusing one or more mask depth images with pose trajectory using a marching cube algorithm (see 11_1 below); 
isolating a depth prediction corresponding to the wall point cloud at least by training a deep segmentation network (see 11_2 below); and 
projecting the depth prediction to a three-dimensional (3D) point cloud (see 11_3 below).
While the combined teaching of Gausebeck and Ebrahimi does not explicitly teach the above limitations, Steinbrucker teaches:
(11_1). extracting a wall point cloud at least by fusing one or more mask depth images with pose trajectory using a marching cube algorithm (e.g., Some embodiments relate to a method of operating a computing system to generate a three-dimensional (3D) representation of a portion of a scene. The method includes receiving a query from an application requesting a planar geometry representation; searching a plane data store for plane data corresponding to the query; generating a rasterized plane mask from the plane data corresponding to the query, the rasterized plane mask comprising a plurality of plane coverage points; generating the 3D representation of the portion of the scene based at least in part on the rasterized plane mask according to the requested planar geometry representation; and sending the generated 3D representation of the portion of the scene to the application. Steinbrucker: [0037]. FIG. 21 shows a plane extraction system 1300, according to some embodiments. The plane extraction system 1300 may include depth fusion 1304, which may receive multiple depth maps 1302. The multiple depth maps 1302 may be created by one or more users wearing depth sensors, and/or downloaded from local/remote memories. The multiple depth maps 1302 may represent multiple views of a same surface. There may be differences between the multiple depth maps, which may be reconciled by the depth fusion 1304. Steinbrucker: [0207]. In some embodiments, the depth fusion 1304 may generate SDFs 1306 based, at least in part, on the method 600. Mesh bricks 1308 may be extracted from the SDFs 1306 by, for example, applying a marching cube algorithm over corresponding bricks (e.g., bricks [0000]-[0015] in FIG. 23). Plane extraction 1310 may detect planar surfaces in the mesh bricks 1308 and extract planes based at least in part on the mesh bricks 1308. Steinbrucker: [0208] L.1-8);
(11_2). isolating a depth prediction corresponding to the wall point cloud at least by training a deep segmentation network (e.g., in some embodiments, a mesh simplification method may include mesh block segmentation, pre-simplification, mesh planarization, and post-simplification. Steinbrucker: [0234] L.1-4. Depth data from the depth camera and/or image data from the visual camera may be processed to extract points representing the real objects in the physical world. Images from the visual camera, such as a stereoscopic camera, may be processed to compute a three-dimensional (3D) reconstruction of the physical world. In some embodiments, depth data may be generated from the images from the visual cameras, for example, using deep learning techniques. Steinbrucker: [0358] L.3-);
(11_3). projecting the depth prediction to a three-dimensional (3D) point cloud (e.g., The stitching component 508 can further apply pixel color data to the depth map or point cloud by projecting the color data from the respective 2D images onto the depth map or point cloud. This can involve casting rays out from the color cameras along each captured pixel towards the interesting portion of the depth map or point cloud to colorize the depth map or point cloud. Gausebeck: [0112] L.8-14. These 3D reconstruction data may be in any suitable formats including meshes, point clouds, voxels, and the like. Steinbrucker: [0275] L.12-14).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Steinbrucker into the combined teaching of Gausebeck and Ebrahimi so the marching cube algorithm is applied to determine surfaces of walls and floors to facilitate the determination of a floor plan.
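Illustrative note (not drawn from any cited reference): the claimed step of projecting a depth prediction to a 3D point cloud is commonly realized by back-projecting each pixel through a pinhole camera model. The sketch below assumes hypothetical intrinsics fx, fy, cx, cy and an optional binary wall mask; none of these names or values come from Gausebeck, Ebrahimi or Steinbrucker.

```python
def depth_to_points(depth, fx, fy, cx, cy, mask=None):
    """Back-project a depth image (rows of depth values) into camera-frame 3D points.

    Pixels with non-positive depth are skipped. If a mask is given, only pixels
    whose mask entry is truthy (e.g., wall pixels) are kept.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            if mask is not None and not mask[v][u]:
                continue
            x = (u - cx) * z / fx  # horizontal offset scaled by depth
            y = (v - cy) * z / fy  # vertical offset scaled by depth
            points.append((x, y, z))
    return points

# Hypothetical 2 x 2 depth image with one invalid pixel and a wall mask.
depth = [[2.0, 0.0], [2.0, 2.0]]
wall_mask = [[1, 0], [0, 1]]
all_points = depth_to_points(depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
wall_points = depth_to_points(depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0, mask=wall_mask)
```

Applying the mask before back-projection is one simple way to isolate the depth values that correspond to walls; the points would still need to be transformed by the camera pose to place them in a common world frame.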

Regarding claim 12, the combined teaching of Gausebeck, Ebrahimi and Steinbrucker teaches the system of claim 11, the memory comprising the sequence of instructions which, when executed by the processor, causes the processor to perform generating the shape further comprises instructions which, when executed by the processor, causes the processor to perform the set of acts that further comprises: 
clustering the 3D point cloud into a plurality of clusters at least by detecting, at the deep segmentation network, one or more points that belong to a same plane instance (e.g., A planarization operation may be performed on mesh block 3202 to generate mesh block 3203. The planarization operation may include detecting planar areas in mesh 3202 based on, for example, the plane (or primitive) normals of the faces. Values of plane normals x1, y1, z1 of a first face 3212 and plane normals x2, y2, z2 of a second face 3214 may be compared. The comparison result of the plane normals of the first and second faces may indicate angles between the plane normals (e.g., angles between x1 and x2). When the comparison result is within a threshold value, it may be determined that the first and second planes are on a same planar area. In the illustrated example, planes 3212, 3214, 3216, and 3218 may be determined as on a first planar area corresponding to plane 3228; planes 3220, 3222, 3224, and 3226 may be determined as on a second same planar area corresponding to plane 3230.  Steinbrucker: [0256]. The blocks, for example, may be formatted as mesh blocks, in which features of objects in the physical world, such as corners, become points in the mesh block, or are used as points to create a mesh block. Connections between points in the mesh may indicate groups of points on the same surface of a physical object. Steinbrucker: [0303] L.21-27); and 
translating the plurality of clusters into a set of planes that forms a perimeter layout for the floorplan (e.g., In some embodiments, the 3D reconstruction 4908 may be stored as a mesh, with groups of points defining vertices of triangles that represent surfaces. In some embodiments, the 3D reconstruction 4908 may be generated using other techniques such as room layout detection system, and/or object detection. In some embodiments, a number of techniques may be used together to generate the 3D reconstruction 4908. For example, object detection may be used for known physical objects in the physical world, 3D modeling may be used for unknown physical objects in the physical world, and room layout detection system may also be used to identify the boundaries in the physical world such as walls and floors.  Steinbrucker: [0376] L.4-17).
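Illustrative note (a simplified sketch, not Steinbrucker's implementation): the planarization test quoted from Steinbrucker [0256] compares the plane normals of faces and treats faces as lying on the same planar area when the angle between their normals is within a threshold. The greedy grouping below, the 5 degree threshold and the sample normals are assumptions made for illustration.

```python
import math

def angle_between(n1, n2):
    """Angle in degrees between two non-zero normal vectors."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos_angle))

def group_by_normal(normals, threshold_deg=5.0):
    """Greedily group face indices whose normals lie within threshold_deg
    of the first face in an existing group."""
    groups = []
    for i, n in enumerate(normals):
        for group in groups:
            if angle_between(normals[group[0]], n) <= threshold_deg:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups

# Hypothetical face normals: two nearly horizontal faces, two nearly vertical faces.
normals = [(0, 0, 1), (0, 0.01, 1), (1, 0, 0), (0.99, 0.02, 0)]
planar_groups = group_by_normal(normals)
```

Comparing normals alone cannot separate parallel surfaces at different offsets (for example, two opposite walls facing the same direction), so a fuller treatment would also compare plane offsets or face adjacency.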

Regarding claims 18-19, the claims are wearable extended reality device claims of system claims 11-12 respectively.  The claims are similar in scope to claims 11-12 respectively and they are rejected under similar rationale as claims 11-12 respectively.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
a).	Mehr (2018/0330184) teaches that “A computer-implemented method for determining an architectural layout. The method comprises providing a cycle of points that represents a planar cross section of a cycle of walls, and, assigned to each respective point, a respective first datum that represents a direction normal to the cycle of points at the respective point. The method also comprises minimizing a Markov Random Field energy thereby assigning, to each respective point, a respective one of the set of second data. The method also comprises identifying maximal sets of consecutive points to which a same second datum is assigned, and a cycle of vertices bounding a cycle of segments which represents the architectural layout. Such a method constitutes an improved solution for determining an architectural layout.” (Mehr: Abstract).
b).	Tang (2021/0225090) teaches that “Various implementations disclosed herein include devices, systems, and methods that generate floorplans and measurements using a three-dimensional (3D) representation of a physical environment generated based on sensor data.” (Tang: Abstract).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SING-WAI WU whose telephone number is (571)270-5850. The examiner can normally be reached 9:00am - 5:30pm (Central Time).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at 571-272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SING-WAI WU/Primary Examiner, Art Unit 2611