Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION

Status of Claims
Claims 1-20 are currently pending in this application.

Response to Amendments
The applicant amended independent claims 1, 8 and 15 with features similar to “determining a wall classification of a wall in a room at least by classifying a first subset of entities into a first cluster from an input image of an indoor scene and a room classification of the room at least by classifying a second subset of entities into a second cluster based at least in part upon the first cluster” and “determining a floorplan based at least in part upon the room classification and the wall classification with an unrestricted maximum number of clusters for rooms or walls in the indoor scene, wherein the floorplan comprises respective representations of a plurality of rooms or walls comprising a structure having a spatial extent that is smaller than a filtering size threshold below which structural information is filtered out”.
The applicant amended dependent claims 2, 9 and 16 with features similar to “the floorplan comprises a respective representation for the structure having the spatial extent that is smaller than the filtering size threshold”.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1-3, 8 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Gausebeck (2019/0026957; IDS) in view of Mehr et al. (2018/0330184) and further in view of Bell et al. (2016/0055268).

Regarding claim 1, Gausebeck teaches a method for generating a floorplan of an indoor scene (e.g., computer-implemented method for developing and training 2D-from-3D models; Gausebeck: [0027].  The 3D model generation component 118 can generate a 3D model or representation of the 3D model of an environment corresponding to a floorplan model of the environment, a dollhouse model of the environment (e.g., in implementations in which the environment comprises an interior of an architectural space, such as house), and the like. Gausebeck: [0077] L.14-20), comprising: 
determining a wall classification of a wall in a room at least by classifying a first subset of entities into a first cluster from an input image of an indoor scene and a room classification of the room at least by classifying a second subset of entities into a second cluster based at least in part upon the first cluster (e.g., A floorplan model generated by the 3D model generation component 118 can be a 3D floorplan model or a 2D floorplan model. A 3D floorplan model can comprise edges of each floor, wall, and ceiling as lines. Lines for floors, walls and ceilings can be dimensioned (e.g., annotated) with an associated size. In one or more embodiments, a 3D floorplan model can be navigated via a viewer on a remote device in 3D. In an aspect, subsections of the 3D floorplan model (e.g., rooms) can be associated with a textual data (e.g., a name). Gausebeck: [0080] L.1-10.  It would have been obvious that a room is identified as an area enclosed by a plurality of walls, and hence by a plurality of lines enclosing that area.  A room is thus classified using at least one classified wall.  See 1_1 below); and 
determining a floorplan based at least in part upon the room classification and the wall classification with an unrestricted maximum number of clusters for rooms or walls in the indoor scene (e.g., Measurement data (e.g., square footage, etc.) associated with surfaces can also be determined based on the derived 3D data corresponding to the respective surfaces and associated with the respective surfaces. These measurements can be displayed in association with viewing and/or navigation of the 3D floorplan model. Calculation of area (e.g., square footage) can be determined for any identified surface or portion of a 3D model with a known boundary, for example, by summing areas of polygons comprising the identified surface or the portion of the 3D model. Displays of individual items (e.g., dimensions) and/or classes of items can be toggled in a floorplan via a viewer on a remote device (e.g., via a user interface on a remote client device). A 2D floorplan model can include surfaces (e.g., walls, floors, ceilings, etc), portals (e.g., door openings) and/or window openings associated with derived 3D data 116 used to generate a 3D model and projected to a flat 2D surface. In yet another aspect, a floorplan can be viewed at a plurality of different heights with respect to vertical surfaces (e.g., walls) via a viewer on a remote device. Gausebeck: [0080] L.10-30.  It is obvious that a room is an area enclosed with walls with at least one door opening), wherein the floorplan comprises respective representations of a plurality of rooms or walls comprising a structure having a spatial extent that is smaller than a filtering size threshold below which structural information is filtered out (see 1_2 below).
While Gausebeck does not explicitly teach the following limitation, Mehr teaches:
(1_1). determining a wall classification of a wall in a room at least by classifying a first subset of entities into a first cluster from an input image of an indoor scene (e.g., solutions that aim in specific at extracting the layout of a 3D reconstructed indoor scene also exist. Such a determination of an architectural layout has many applications in Scene Modeling, Scene Understanding, Augmented Reality, and all domains where it is necessary to precisely build an accurate 3D indoor scene or to get accurate measurements. Mehr: [0027] L.1-7. It is therefore provided a computer-implemented method for determining an architectural layout. The method comprises providing a cycle of points that represents a planar cross section of a cycle of walls. The method also comprises providing, assigned to each respective point, a respective first datum that represents a direction normal to the cycle of points at the respective point. Mehr: [0041] L.1-7) and a room classification of the room at least by classifying a second subset of entities into a second cluster based at least in part upon the first cluster (e.g., solutions that aim in specific at extracting the layout of a 3D reconstructed indoor scene also exist. Such a determination of an architectural layout has many applications in Scene Modeling, Scene Understanding, Augmented Reality, and all domains where it is necessary to precisely build an accurate 3D indoor scene or to get accurate measurements. Two main families of methods exist in the state of the art: methods based on learning algorithms to cluster the walls, ceiling and floor, and methods based on pure geometric hypotheses to extract the layout (for instance, the room is a cuboid, the walls are vertical and their projection is a line on the floor plan, etc).  Mehr: [0027]. This paper aims at identifying the different rooms in the final layout of the reconstructed indoor scene. 
It identifies the walls despite the occlusions, and projects them on the floor plan, in order to build a cell graph which will be used to cluster the different rooms of the indoor scene.  Mehr: [0036] L.1-5. The providing of the cycle of points comprises providing a 3D point cloud representing a room that includes the cycle of walls, identifying the projection plane and the 3D point cloud representing the cycle of walls from the 3D point cloud representing the room, projecting the 3D point cloud representing the cycle of walls on the projection plane, determining the 2D point cloud, and determining the concave hull.  Mehr: [0054].  The identifying of the projection plane and of the 3D point cloud representing the cycle of walls comprises detecting planes in the 3D point cloud representing the room with a random sample consensus algorithm.  Mehr: [0055].  Therefore, identifying (classifying) a second cluster of 3D points that represents a room including the cycle of walls is based on at least a first cluster of 3D points that represents one of the walls in the cycle);
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Mehr into the teaching of Gausebeck so that a 3D point cloud can be identified (classified) as a wall with a cycle of points (Mehr: [0052] L.) and as a room with a cycle of walls (Mehr: [0054] L.2-3).
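For illustration only, the plane-detection step cited from Mehr ([0055]) could be sketched as below. This is a minimal sketch under stated assumptions, not Mehr's implementation: the function names fit_plane and ransac_plane, the distance threshold, the iteration count and the fixed seed are all hypothetical, and the sketch finds only the single best-supported plane in a small point set.

```python
import random


def fit_plane(p, q, r):
    """Return (unit_normal, d) for the plane n . x = d through p, q, r.

    Returns None when the three points are (nearly) collinear.
    """
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    # The cross product of two in-plane vectors gives the plane normal.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = n[0] * p[0] + n[1] * p[1] + n[2] * p[2]
    return n, d


def ransac_plane(points, threshold=0.05, iterations=200, seed=0):
    """Random sample consensus: keep the plane with the most inliers.

    threshold, iterations and seed are illustrative (hypothetical) values.
    """
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = fit_plane(*rng.sample(points, 3))
        if plane is None:
            continue  # degenerate sample, try again
        n, d = plane
        # Inliers are points whose distance to the candidate plane is small.
        inliers = [
            i for i, x in enumerate(points)
            if abs(n[0] * x[0] + n[1] * x[1] + n[2] * x[2] - d) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers
```

In this sketch, the points lying within the distance threshold of the best plane would form the cluster of points for one wall; repeating the search on the remaining points would yield further planes.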
While the combined teaching of Gausebeck and Mehr does not explicitly teach the following limitation, Bell teaches:
(1_2). determining a floorplan based at least in part upon the room classification and the wall classification with an unrestricted maximum number of clusters for rooms or walls in the indoor scene, wherein the floorplan comprises respective representations of a plurality of rooms or walls comprising a structure having a spatial extent that is smaller than a filtering size threshold below which structural information is filtered out (e.g., Captured three-dimensional (3D) data associated with a 3D model of an architectural environment is received and at least a portion of the captured 3D data associated with a flat surface is identified. Bell: Abstract L.2-5. Additionally, the first identification component 202 can associate an identified flat surface and/or an identified non-flat surface with a particular identifier and/or a particular score. For example, an identified flat surface can be assigned an identifier value associated with a wall, an identifier value associated with a floor, or an identifier value associated with a ceiling based on a calculated score. The first identification component 202 can assign an identifier value to an identified flat surface based on one or more criteria. For example, an identified flat surface can be assigned an identifier associated with a wall in response to a determination that a normal vector associated with the identified flat surface is approximately horizontal, that a size (e.g., a surface area, etc.) of the identified flat surface satisfies a particular threshold level, that a height associated with the identified flat surface satisfies a particular threshold level, that a bottom boundary of the identified flat surface corresponds to (e.g., matches) a height or boundary of an identified floor, that the identified flat surface correspond to a size and near-opposite orientation of an identified wall, etc. 
In another example, an identified flat surface can be assigned an identifier associated with a floor in response to a determination that a normal vector associated with the identified flat surface is approximately vertical, that a size (e.g., a surface area, etc.) of the identified flat surface satisfies a particular threshold level, that the identified flat surface is associated with a lowest average height with respect to other identified near-horizontal planes in an identified room and/or a 3D model, that the identified flat surface correspond to a size and near-opposite orientation of an identified ceiling, etc. In yet another example, an identified surface can be assigned an identifier associated with a ceiling in response to a determination that a normal vector associated with the identified surface is pointed in a range of downward directions, that a size (e.g., a surface area, etc.) of the identified surface satisfies a particular threshold level, that a boundary of the identified surface corresponds to (e.g., matches) a boundary of an identified wall, that an identified surface forms part of an outer shell of an identified room, that an identified surface forms part of a convex hull surrounding an identified room, that an identified surface forms part of a surface that is associated with a certain amount of (e.g., a small amount of) or no captured 3D data in a particular region (e.g., a region associated with a particular size) behind the identified surface, that the identified surface correspond to a size and near-opposite orientation of an identified floor, etc. In an aspect, a largest unidentified plane in a given room and/or in an entire 3D model can be iteratively identified as a wall, floor or ceiling (e.g., a surface) until a certain threshold associated with an enclosing of a captured volume is reached.  Bell: [0058].  
Therefore, a surface (associated with captured 3D data; a surface is an area of a cycle of points as disclosed in Mehr) has to meet a threshold to qualify as a wall (Wall: Surface > T_wall), and a captured volume (that includes a cycle of walls as disclosed in Mehr) has to meet a threshold to qualify as a room (Room: captured volume > T_room).  The threshold thus serves as the filtering (discrimination) criterion that decides, absent other criteria, whether a surface qualifies as a wall or a captured volume qualifies as a room.  The unrestricted (non-limiting) maximum number is taken as the threshold value, and the spatial extent is interpreted as the captured volume of 3D data with its associated surface).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Bell into the combined teaching of Gausebeck and Mehr so that 3D data (e.g., 3D-reconstructed data) can be processed to facilitate semantic understanding of the 3D data (Bell: [0025] L.1-3).
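As a rough illustration of the criteria quoted from Bell ([0058]), the sketch below labels a flat surface from its normal vector and surface area. It is only a sketch under assumptions: the function name classify_surface, the area threshold, the angular tolerance, the z-up convention and the "unclassified" fallback are hypothetical and do not come from Bell, and the other quoted criteria (height, boundary matching, pairing with an opposite surface) are omitted.

```python
import math


def classify_surface(normal, area, min_area=1.0, tol_deg=15.0):
    """Label a flat surface as 'wall', 'floor', 'ceiling' or 'unclassified'.

    Assumes z is up and that normals point into the room, so a floor normal
    points up and a ceiling normal points down. min_area and tol_deg are
    illustrative (hypothetical) values.
    """
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("normal must be non-zero")
    # Size criterion: a surface below the area threshold is not labeled.
    if area < min_area:
        return "unclassified"
    # Angle between the surface normal and the up direction (+z), in degrees.
    cos_up = max(-1.0, min(1.0, nz / length))
    angle = math.degrees(math.acos(cos_up))
    if angle <= tol_deg:
        return "floor"      # normal approximately vertical, pointing up
    if angle >= 180.0 - tol_deg:
        return "ceiling"    # normal within a range of downward directions
    if abs(angle - 90.0) <= tol_deg:
        return "wall"       # normal approximately horizontal
    return "unclassified"
```

For example, under these assumptions a 10 square-unit surface with normal (1, 0, 0) would be labeled a wall, while the same surface with an area of 0.5 would be left unclassified because it falls below the size threshold.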

Regarding claim 2, the combined teaching of Gausebeck, Mehr and Bell teaches the method of claim 1, wherein determining the room classification of the room and the wall classification of the wall comprises: 
identifying the input image, wherein the input image comprises one image or a sequence of images from a three-dimensional scan of the indoor scene (e.g., Portions of the 3D model geometric data (e.g., the mesh) can include image data describing texture, color, intensity, and the like. For example, the geometric data can comprise data points of geometry in addition to comprising texture coordinates associated with the data points of geometry (e.g., texture coordinates that indicate how to apply texture data to geometric data). In various embodiments, received 2D image data 102 (or portions thereof) can be associated with portions of the mesh to associate visual data from the 2D image data 102 (e.g., texture data, color data, etc.) with the mesh. In this regard, the 3D model generation component 118 can generate 3D models based and 2D images and the 3D data respectively associated with the 2D images. In an aspect, data used to generate 3D models can be collected from scans (e.g. utilizing sensors) of real-world scenes, spaces (e.g. houses, office spaces, outdoor spaces, etc.), objects (e.g. furniture, decorations, goods, etc.), and the like. Data can also be generated based on computer implemented 3D modeling systems. Gausebeck: [0071]. in some implementations, a representation or rendering of a 3D model can be a 2D image or panorama associated with the 3D model from a specific perspective of a virtual camera located at a specific navigation position and orientation relative to the 3D model. Gausebeck: [0057] L.8-12); and 
determining an input point cloud for the input image (e.g., In various embodiments, the 3D model generation component 118 can employ the derived 3D data 116 for respective images received by the computing device 104 to generate reconstructed 3D models of objects or environments included in the images. The 3D models described herein can include data representing positions, geometric shapes, curved surfaces, and the like. For example, a 3D model can include a collection of points represented by 3D coordinates, such as points in a 3D Euclidean space. The collection of points can be associated with each other (e.g. connected) by geometric entities. For example, a mesh comprising a series of triangles, lines, curved surfaces (e.g. non-uniform rational basis splines (NURBS)), quads, n-grams, or other geometric shapes can connect the collection of points. For example, a 3D model of an interior environment of building can comprise mesh data (e.g., a triangle mesh, a quad mesh, a parametric mesh, etc.), one or more texture-mapped meshes (e.g., one or more texture-mapped polygonal meshes, etc.), a point cloud, a set of point clouds, surfels and/or other data constructed by employing one or more 3D sensors.  Gausebeck: [0070] L.1-21), wherein the floorplan comprises a respective representation for the structure having the spatial extent that is smaller than the filtering size threshold (e.g., In FIG. 6, the environment 600 includes a floor portion 602, a wall portion 603, other wall portions 604, 606 and 608 (e.g., a wall portion 604, wall portions 606 and a wall portion 608), a window opening 610 and an object 612. Bell: [0087] L.6-10. A particular portion of captured 3D data (e.g., a section of a 3D model) that corresponds to an object and/or a surface can be refined once the object and/or the surface are segmented. 
The second identification component 204 can implement one or more algorithms for refining surface boundaries and/or determining whether a portion of mesh data corresponds to a particular surface. In an aspect, the second identification component 204 can reclassify a flat surface as corresponding to an object based on an evaluation of one or more characteristics associated with the flat surface. For example, the second identification component 204 can reclassify a flat surface as corresponding to an object in response to a determination that a size of the flat surface (e.g., a surface area of the flat surface) is below a threshold level, that the flat surface does not correspond to an outer shell and/or a convex hull of an identified room, and/or that a visual appearance of the flat surface (e.g., a hue, a texture, etc.) matches an object, etc. Bell: [0061]).

Regarding claim 3, the combined teaching of Gausebeck, Mehr and Bell teaches the method of claim 2, wherein determining the room classification of the room and the wall classification of the wall further comprises: 
identifying a subset of the input point cloud (e.g., a representation or rendering of a 3D model can be the 3D model or a part of the 3D model generated from a specific navigation position and orientation of a virtual camera relative to the 3D model and generated using aligned sets or subsets of captured 3D data employed to generate the 3D model. Gausebeck: [0057] L.13-18.  A 3D model of an interior environment of building can comprise mesh data (e.g., a triangle mesh, a quad mesh, a parametric mesh, etc.), one or more texture-mapped meshes (e.g., one or more texture-mapped polygonal meshes, etc.), a point cloud, a set of point clouds, surfels and/or other data constructed by employing one or more 3D sensors.  Gausebeck: [0070] L.15-21); and 
training a deep network with at least a synthetic dataset (e.g., The 3D-from-2D model development module 3314 can further include model training component 3318, which can be configured to employ the training data to train and/or develop one or more 3D-from-2D neural network models included in the 3D-from-2D model database 3326. Gausebeck: [0245] L.13-18. In some implementations, the training data development component 3316 can employ a textured 3D mesh of a 3D space model included in the 3D model and alignment data 3304 to generate 2D images from camera positions where a real camera was never placed. For instance, the training data development component 3316 can use capture position/orientation information for respective images included in the indexed 2D image data 3306 to determine various virtual capture position/orientation combinations that are not represented by the captured 2D images. The training data development component 3316 can further generate synthetic images of the 3D model from these virtual capture positions/orientations. In some implementations, the training data development component 3316 can generate synthetic 2D images from various perspective of the 3D model that correspond to a sequence of images captured by a virtual camera in association with navigating the 3D space model, wherein the navigation assimilates a capture scenario as if a user were actually walking through the environment represented by the 3D model while holding a camera and capturing images along the way. Gausebeck: [0249] L.20-40).

Regarding claim 8, the claim is a system claim corresponding to method claim 1.  The claim is similar in scope to claim 1 and is rejected under a similar rationale.
Claim 8 further recites: 
a processor (e.g., The method can comprise receiving a panoramic image by a system comprising a processor, Gausebeck: [0032] L.3-4); and 
memory operatively coupled to the processor and storing a sequence of instructions which, when executed by the processor, causing the processor to perform a set of acts (e.g., a memory that stores computer executable components, and a processor that executes the computer executable components stored in the memory. Gausebeck: [0044] L.3-5. Aspects of systems, apparatuses or processes explained in this disclosure can constitute machine-executable components embodied within machine(s), e.g. embodied in one or more computer readable mediums (or media) associated with one or more machines. Such components, when executed by the one or more machines, e.g. computer(s), computing device(s), virtual machine(s), etc. can cause the machine(s) to perform the operations described. Gausebeck: [0061] L.6-13).

Regarding claim 15, the claim is a wearable extended reality device claim corresponding to method claim 1.  The claim is similar in scope to claim 1 and is rejected under a similar rationale.
Claim 15 further recites: 
an optical system having an array of micro-displays or micro-projectors to present digital contents to an eye of a user (e.g., the user device 130 can include but is not limited to: a desktop computer, a laptop computer, a mobile phone, a smartphone, a tablet personal computer (PC), a personal digital assistant (PDA), a heads-up display (HUD), a virtual reality (VR) headset, augmented reality (AR) headset or device, a standalone digital camera, or another type of wearable computing device. Gausebeck: [0062] L.40-49); 
a processor coupled to the optical system (e.g., a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a 3D data derivation component configured to employ one or more 3D-from-2D neural network models to derive 3D data for the 2D images. Gausebeck: [0044] L.4-6; the computer executable components can comprise a communication component configured to send the 2D images and the 3D data to an external device, Gausebeck: [0044] L.13-16; the communication component also be configured to receive the 3D model from the external device and device can render the 3D model via a display of the device. Gausebeck: [0044] L.20-23); and 
memory operatively coupled to the processor and storing a sequence of instructions which, when executed by the processor, causes the processor to perform a set of acts, the set of acts (e.g., a memory that stores computer executable components, and a processor that executes the computer executable components stored in the memory. Gausebeck: [0044] L.3-5. Aspects of systems, apparatuses or processes explained in this disclosure can constitute machine-executable components embodied within machine(s), e.g. embodied in one or more computer readable mediums (or media) associated with one or more machines. Such components, when executed by the one or more machines, e.g. computer(s), computing device(s), virtual machine(s), etc. can cause the machine(s) to perform the operations described. Gausebeck: [0061] L.6-13).

Claims 4-7, 9-10, 13-14, 16-17 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Gausebeck in view of Mehr and Bell as applied to claims 3, 8 and 15 above, and further in view of Ebrahimi et al. (2019/0035099; IDS).

Regarding claim 4, the combined teaching of Gausebeck, Mehr and Bell teaches the method of claim 3, wherein determining the room classification of the room and the wall classification of the wall further comprises: 
generating, at a deep network, one or more room cluster labels for one or more vertices represented in the subset and a wall cluster label for the wall (see 4_1 below).
While the combined teaching of Gausebeck, Mehr and Bell does not explicitly teach the following limitation, Ebrahimi teaches:
(4_1). generating, at a deep network, one or more room cluster labels for one or more vertices represented in the subset and a wall cluster label for the wall (e.g., In some embodiments, maps may be three dimensional maps, e.g., indicating the position of walls, furniture, doors, and the like in a room being mapped. In some embodiments, maps may be two dimensional maps, e.g., point clouds or polygons or finite ordered list indicating obstructions at a given height (or range of height, for instance from zero to 5 or 10 centimeters or less) above the floor. Ebrahimi: [0053] L.1-8. Nodes on the graph are identified in response to non-core corresponding depth vectors being within a threshold distance of a core depth vector in the graph, and in response to core depth vectors in the graph being reachable by other core depth vectors in the graph, where to depth vectors are reachable from one another if there is a path from one depth vector to the other depth vector where every link and the path is a core depth vector and is it within a threshold distance of one another. The set of nodes in each resulting graph, in some embodiments, may be designated as a cluster, and points excluded from the graphs may be designated as outliers that do not correspond to clusters. Ebrahimi: [0061] L.18-29. Some embodiments may then determine the centroid of each cluster in the spatial dimensions of an output depth vector for constructing floor plan maps. Ebrahimi: [0062] L.1-3).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Ebrahimi into the combined teaching of Gausebeck, Mehr and Bell so that clustering can be used to construct a floor plan map.

Regarding claim 5, the combined teaching of Gausebeck, Mehr, Bell and Ebrahimi teaches the method of claim 4, wherein generating the one or more room cluster labels and the wall cluster label comprises: 
performing a nested partitioning on a set of points to divide the set of points into a plurality of overlapping local regions based at least in part upon a distance metric pertaining to the indoor scene (e.g., using a standalone digital camera, a smartphone, or similar device with a camera, a user can walk around an environment and take 2D images at several points nearby along the way, capturing different perspectives of the environment. In another example implementation, related 2D images can include 2D images from nearby or overlapping perspectives captured by a single camera in association rotation of the camera about a fixed axis. In another implementation, related 2D images can include two or more images respectively captured by two or more cameras with partially overlapping fields-of-view or different perspective of an environment (e.g., captured by different cameras at or near the same time). Gausebeck: [0141] L.28-41. The image correlation component 920 can be configured to automatically classify respective frames of video included in a same video clip having less than a defined duration and/or associated with a defined range of movement based on the capture device motion data 904 (e.g., movement in a particular direction less than a threshold distance or degree of rotation) as being related. Gausebeck: [0151] L.24-32); and 
extracting a local feature that captures a geometric structure in the indoor scene at least by recursively performing semantic feature extraction on the nested partitioning of the set of points (e.g., The alignment process can involve iteratively aligning different point clouds from neighboring and overlapping images captured from different positions and orientations relative to an object or environment to generate a global alignment between the respective point clouds using correspondences in derived position information for the respective points. Visual feature information including correspondences in color data, texture data, luminosity data, etc. for respective points or pixels included in the point clouds can also be used (along with other sensor data if available) to generate the aligned data. Gausebeck: [0075] L.16-26).

Regarding claim 6, the combined teaching of Gausebeck, Mehr, Bell and Ebrahimi teaches the method of claim 5, wherein generating the one or more room cluster labels and a wall cluster label comprises: 
abstracting the local feature into a higher-level feature or representation (e.g., Deep learning models can include one or more layers that learn using supervised learning (e.g., classification) and/or unsupervised learning (e.g., pattern analysis) manners. In some implementations, deep learning techniques for deriving 3D data from 2D images can learn using multiple levels of representations that correspond to different levels of abstraction, wherein the different levels form a hierarchy of concepts. Gausebeck: [0068] L.14-22); and 
adaptively weighing a plurality of local features at multiple, different scales or resolutions (e.g., There are many existing models for 3D-from-2D depth prediction based on deep convolutional neural networks. One approach is to use fully convolutional residual networks that directly predict depth values as regression outputs. Other models use multi-scale neural networks to separate overall scale prediction from prediction of the fine details. Gausebeck: [0069] L.1-7. In one or more implementations, the 3D-from-2D convolutional neural network accounts for weighted values applied to respective pixels based on their projected angular area during training. In this regard, the 3D-from-2D neural network model can include a model that was trained based on weighted values applied to respective pixels of projected panoramic images in association with deriving depth data for the respective pixels, wherein the weighted values varied based on an angular area of the respective pixels. For example, during training, the weighted values were decreased as the angular area of the respective pixels decreased. In addition, in some implementations, downstream convolutional layers of the convolutional layers that follow a preceding layer are configured to re-project a portion of the panoramic image processed by the preceding layer in association with deriving depth data for the panoramic image, resulting in generation of a re-projected version of the panoramic image for each of the downstream convolutional layers. Gausebeck: [0118] L.1-19).

Regarding claim 7, the combined teaching of Gausebeck, Mehr, Bell and Ebrahimi teaches the method of claim 6, wherein generating the one or more room cluster labels and a wall cluster label comprises: 
combining the plurality of local features at the multiple, different scales or resolutions (e.g., In some embodiments, thresholding may be used in identifying the area of overlap wherein areas or objects of interest within an image may be identified using thresholding as different areas or objects have different ranges of pixel intensity. For example, an object captured in an image, the object having high range of intensity, can be separated from a background having low range of intensity by thresholding wherein all pixel intensities below a certain threshold are discarded or segmented, leaving only the pixels of interest. In some embodiments, a metric can be used to indicate how good of an overlap there is between the two sets of perceived depths. Ebrahimi: [0047] L.41-52); and 
assigning the one or more room cluster labels and the wall cluster label to a metric space for the indoor scene based at least in part upon the distance metric (e.g., Some embodiments may implement DB-SCAN on depths and related values like pixel intensity, e.g., in a vector space that includes both depths and pixel intensities corresponding to those depths, to determine a plurality of clusters, each corresponding to depth measurements of the same feature of an object. Some embodiments may execute a density-based clustering algorithm, like DBSCAN, to establish groups corresponding to the resulting clusters and exclude outliers. To cluster according to depth vectors and related values like intensity, some embodiments may iterate through each of the depth vectors and designate a depth vectors as a core depth vector if at least a threshold number of the other depth vectors are within a threshold distance in the vector space (which may be higher than three dimensional in cases where pixel intensity is included). Some embodiments may then iterate through each of the core depth vectors and create a graph of reachable depth vectors, where nodes on the graph are identified in response to non-core corresponding depth vectors being within a threshold distance of a core depth vector in the graph, and in response to core depth vectors in the graph being reachable by other core depth vectors in the graph, where to depth vectors are reachable from one another if there is a path from one depth vector to the other depth vector where every link and the path is a core depth vector and is it within a threshold distance of one another. The set of nodes in each resulting graph, in some embodiments, may be designated as a cluster, and points excluded from the graphs may be designated as outliers that do not correspond to clusters. Ebrahimi: [0061]).
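For illustration only, and not as part of the cited disclosure, the density-based clustering described in Ebrahimi [0061] may be sketched as follows. The sketch assumes each sample is a hypothetical (depth, intensity) tuple and follows the passage in counting only the other vectors when deciding whether a vector is a core vector. The function name, parameter names and sample values are assumptions introduced for this example.

```python
import math

def dbscan(points, eps, min_others):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for an outlier."""
    n = len(points)
    # Neighbours of each point: the OTHER points within the threshold distance.
    neighbours = [
        [j for j in range(n)
         if j != i and math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # A core vector has at least `min_others` other vectors within `eps`.
    core = [len(nb) >= min_others for nb in neighbours]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if not core[i] or labels[i] != -1:
            continue
        # Grow a new cluster (graph of reachable vectors) from this core vector.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()  # p is always a core vector
            for q in neighbours[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if core[q]:
                        stack.append(q)  # only core vectors extend the graph
        cluster += 1
    return labels

# Hypothetical (depth, intensity) vectors: two dense groups and one stray sample.
samples = [(1.00, 0.20), (1.02, 0.21), (1.01, 0.19),
           (3.00, 0.80), (3.01, 0.82), (2.99, 0.81),
           (9.00, 0.50)]
labels = dbscan(samples, eps=0.1, min_others=2)
```

In this sketch the number of clusters is not fixed in advance; it follows from the two thresholds, and samples that are not reachable from any core vector keep the label -1.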

Regarding claim 9, the combined teaching of Gausebeck, Mehr and Bell teaches the system of claim 8, the memory comprising the sequence of instructions which, when executed by the processor, causes the processor to perform determining the floorplan further comprises instructions which, when executed by the processor, causes the processor to perform the set of acts that further comprises: 
generating a shape for the room using at least the room classification and the wall classification, wherein the room classification comprises a room cluster label assigned to or associated with the room, and the wall classification comprises one or more wall cluster labels assigned to or associated with one or more walls of the room, and the one or more walls comprise the wall (see 9_1 below); and 
generating the floorplan at least by aggregating or integrating an estimated room perimeter relative to a global coordinate system based at least in part upon the shape, wherein the shape comprises a polygon of a DeepPerimeter type (see 9_2 below).
While the combined teaching of Gausebeck, Mehr and Bell does not explicitly teach the following limitations, Ebrahimi teaches:
(9_1). generating a shape for the room using at least the room classification and the wall classification, wherein the room classification comprises a room cluster label assigned to or associated with the room, and the wall classification comprises one or more wall cluster labels assigned to or associated with one or more walls of the room, and the one or more walls comprise the wall (e.g., In some embodiments, maps may be three dimensional maps, e.g., indicating the position of walls, furniture, doors, and the like in a room being mapped. In some embodiments, maps may be two dimensional maps, e.g., point clouds or polygons or finite ordered list indicating obstructions at a given height (or range of height, for instance from zero to 5 or 10 centimeters or less) above the floor. Ebrahimi: [0053] L.1-8. Nodes on the graph are identified in response to non-core corresponding depth vectors being within a threshold distance of a core depth vector in the graph, and in response to core depth vectors in the graph being reachable by other core depth vectors in the graph, where to depth vectors are reachable from one another if there is a path from one depth vector to the other depth vector where every link and the path is a core depth vector and is it within a threshold distance of one another. The set of nodes in each resulting graph, in some embodiments, may be designated as a cluster, and points excluded from the graphs may be designated as outliers that do not correspond to clusters. Ebrahimi: [0061] L.18-29. Some embodiments may then determine the centroid of each cluster in the spatial dimensions of an output depth vector for constructing floor plan maps. Ebrahimi: [0062] L.1-3);
(9_2). generating the floorplan at least by aggregating or integrating an estimated room perimeter relative to a global coordinate system based at least in part upon the shape, wherein the shape comprises a polygon of a DeepPerimeter type (e.g., Since the overlapping depths from the first and second fields of view within the area of overlap do not necessarily have the exact same values and a range of tolerance between their values is allowed, the overlapping depths from the first and second fields of view are used to calculate new depths for the overlapping area using a moving average or another suitable mathematical convolution. This is expected to improve the accuracy of the depths as they are calculated from the combination of two separate sets of measurements. The newly calculated depths are used as the depths for the overlapping area, substituting for the depths from the first and second fields of view within the area of overlap. The new depths are then used as ground truth values to adjust all other perceived depths outside the overlapping area. Once all depths are adjusted, a first segment of the floor plan is complete. This method may be repeated such that the camera perceives depths (or pixel intensities indicative of depth) within consecutively overlapping fields of view as it moves, and the control system identifies the area of overlap and combines overlapping depths to construct a floor plan of the environment. Ebrahimi: [0034] L.55-75. The resulting floor plan may be encoded in various forms. For instance, some embodiments may construct a point cloud of two dimensional or three dimensional points by transforming each of the vectors into a vector space with a shared origin, e.g., based on the above-described displacement vectors, in some cases with displacement vectors refined based on measured depths. Or some embodiments may represent maps with a set of polygons that model detected surfaces, e.g., by calculating a convex hull over measured vectors within a threshold area, like a tiling polygon. Polygons are expected to afford faster interrogation of maps during navigation and consume less memory than point clouds at the expense of greater computational load when mapping. Vectors need not be labeled as “vectors” in program code to constitute vectors, which is not to suggest that other mathematical constructs are so limited. In some embodiments, vectors may be encoded as tuples of scalars, as entries in a relational database, as attributes of an object, etc. Similarly, it should be emphasized that images need not be displayed or explicitly labeled as such to constitute images. Moreover, sensors may undergo some movement while capturing a given image, and the “pose” of a sensor corresponding to a depth image may, in some cases, be a range of poses over which the depth image is captured. Ebrahimi: [0052]).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Ebrahimi into the combined teaching of Gausebeck, Mehr and Bell so that overlapping features are detected and labeled to determine a floor plan.
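For illustration only, the polygon encoding mentioned in Ebrahimi [0052] (a convex hull calculated over measured vectors) may be sketched as follows. The sketch uses Andrew's monotone chain algorithm, which is one common way to compute a 2D convex hull and is not stated in the reference; the function names and the sample points are assumptions. It illustrates the cited passage only and is not a representation of the claimed polygon of a DeepPerimeter type.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return the hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last vertex while it would create a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]

# Hypothetical 2D measurements: four room corners plus interior and edge points.
measured = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2), (2, 0)]
hull = convex_hull(measured)
```

The resulting polygon stores four vertices in place of seven measured points, which is consistent with the reference's remark that polygons consume less memory than point clouds.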

Regarding claim 10, the combined teaching of Gausebeck, Mehr, Bell and Ebrahimi teaches the system of claim 9, the memory comprising the sequence of instructions which, when executed by the processor, causes the processor to perform generating the shape further comprises instructions which, when executed by the processor, causes the processor to perform the set of acts that further comprises: 
performing a deep estimation on an RGB (red green blue) frame of the input image of the indoor scene (e.g., The standard models 114 can also include one or more models that perform 3D-from-2D depth estimation using non-parameter algorithms. Non-parameter algorithms learn depth from a single RGB image, relying on the assumption that the similarities between regions in the RGB images imply similar depth cues. After clustering the training dataset based on global features, these models first search the candidate RGB-D of the input RGB image in the feature space, then, the candidate pairs are warped and fused to obtain the final depth.  Gausebeck: [0067] L.9-18); and 
generating a depth map and a wall segmentation mask at least by using a multi-view depth estimation network and a segmentation module, wherein the segmentation module is based at least in part upon a PSPNet (Pyramid scene parsing network) and a ResNet (residual network) (e.g., There are many existing models for 3D-from-2D depth prediction based on deep convolutional neural networks. One approach is to use fully convolutional residual networks that directly predict depth values as regression outputs. Other models use multi-scale neural networks to separate overall scale prediction from prediction of the fine details. Some models refine the results by incorporating fully-connected layers, adding conditional random field (CRF) elements to the network, or predicting additional outputs such as normal vectors and combining those with the initial depth predictions to produce refined depth predictions. Gausebeck: [0069]).

Regarding claim 13, the combined teaching of Gausebeck, Mehr, Bell and Ebrahimi teaches the system of claim 9, the memory comprising the sequence of instructions which, when executed by the processor, causes the processor to perform generating the floorplan further comprises instructions which, when executed by the processor, causes the processor to perform the set of acts that further comprises: 
identifying a room instance and a wall instance from a scan of the indoor environment (e.g., A floorplan model generated by the 3D model generation component 118 can be a 3D floorplan model or a 2D floorplan model. A 3D floorplan model can comprise edges of each floor, wall, and ceiling as lines. Lines for floors, walls and ceilings can be dimensioned (e.g., annotated) with an associated size. In one or more embodiments, a 3D floorplan model can be navigated via a viewer on a remote device in 3D. In an aspect, subsections of the 3D floorplan model (e.g., rooms) can be associated with a textual data (e.g., a name). Gausebeck: [0080] L.1-10); and 
estimating a closed perimeter for the room instance (e.g., Measurement data (e.g., square footage, etc.) associated with surfaces can also be determined based on the derived 3D data corresponding to the respective surfaces and associated with the respective surfaces. These measurements can be displayed in association with viewing and/or navigation of the 3D floorplan model. Calculation of area (e.g., square footage) can be determined for any identified surface or portion of a 3D model with a known boundary, for example, by summing areas of polygons comprising the identified surface or the portion of the 3D model. Displays of individual items (e.g., dimensions) and/or classes of items can be toggled in a floorplan via a viewer on a remote device (e.g., via a user interface on a remote client device). A 2D floorplan model can include surfaces (e.g., walls, floors, ceilings, etc), portals (e.g., door openings) and/or window openings associated with derived 3D data 116 used to generate a 3D model and projected to a flat 2D surface. In yet another aspect, a floorplan can be viewed at a plurality of different heights with respect to vertical surfaces (e.g., walls) via a viewer on a remote device. Gausebeck: [0080] L.10-30).
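For illustration only, the area calculation described in Gausebeck [0080] (summing the areas of the polygons that make up an identified surface with a known boundary) may be sketched as follows. The sketch assumes planar polygons given as ordered 2D vertices and uses the shoelace formula, which the reference does not name; the function names and sample coordinates are assumptions.

```python
def polygon_area(vertices):
    """Area of a simple polygon given as ordered (x, y) vertices (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the perimeter
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def surface_area(polygons):
    """Sum the areas of the polygons that make up one identified surface."""
    return sum(polygon_area(p) for p in polygons)

# Hypothetical floor made of two adjoining rectangles (units are arbitrary).
floor = [
    [(0, 0), (4, 0), (4, 3), (0, 3)],
    [(4, 0), (6, 0), (6, 3), (4, 3)],
]
total = surface_area(floor)
```

The calculation presupposes that each polygon has a closed boundary, which is why the vertex index wraps from the last vertex back to the first.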

Regarding claim 14, the combined teaching of Gausebeck, Mehr, Bell and Ebrahimi teaches the system of claim 13, the memory comprising the sequence of instructions which, when executed by the processor, causes the processor to perform generating the floorplan further comprises instructions which, when executed by the processor, causes the processor to perform the set of acts that further comprises: 
predicting a number of clusters at least by using a voting architecture (e.g., These standard models 114 can include various types of 3D-from-2D prediction models configured to receive a single 2D image as input and process the 2D image using one or more machine learning techniques to infer or predict 3D/depth data for 2D image. The machine learning techniques can include for example, supervised learning techniques, unsupervised learning techniques, semi-supervised learning techniques, decision tree learning techniques, association rule learning techniques, artificial neural network techniques, inductive logic programming techniques, support vector machine techniques, clustering techniques, Bayesian network techniques, reinforcement learning techniques, representation learning techniques, and the like.  Gausebeck: [0066] L.13-25); and 
extracting a plurality of features at least by performing room or wall regression that computes the plurality of features at one or more scales (e.g., There are many existing models for 3D-from-2D depth prediction based on deep convolutional neural networks. One approach is to use fully convolutional residual networks that directly predict depth values as regression outputs. Other models use multi-scale neural networks to separate overall scale prediction from prediction of the fine details. Some models refine the results by incorporating fully-connected layers, adding conditional random field (CRF) elements to the network, or predicting additional outputs such as normal vectors and combining those with the initial depth predictions to produce refined depth predictions. Gausebeck: [0069]).

Regarding claims 16, 17 and 20, the claims are wearable extended reality device claims corresponding to system claims 9, 10 and 13, respectively.  The claims are similar in scope to claims 9, 10 and 13, respectively, and are rejected under a similar rationale.

Claims 11-12 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Gausebeck, Mehr, Bell and Ebrahimi as applied to claims 10 and 17 above, and further in view of Steinbrucker et al. (2019/0197777; IDS).

Regarding claim 11, the combined teaching of Gausebeck, Mehr, Bell and Ebrahimi teaches the system of claim 10, the memory comprising the sequence of instructions which, when executed by the processor, causes the processor to perform generating the shape further comprises instructions which, when executed by the processor, causes the processor to perform the set of acts that further comprises: 
extracting a wall point cloud at least by fusing one or more mask depth images with pose trajectory using a marching cube algorithm (see 11_1 below); 
isolating a depth prediction corresponding to the wall point cloud at least by training a deep segmentation network (see 11_2 below); and 
projecting the depth prediction to a three-dimensional (3D) point cloud (see 11_3 below).
While the combined teaching of Gausebeck, Mehr, Bell and Ebrahimi does not explicitly teach the following limitations, Steinbrucker teaches:
(11_1). extracting a wall point cloud at least by fusing one or more mask depth images with pose trajectory using a marching cube algorithm (e.g., Some embodiments relate to a method of operating a computing system to generate a three-dimensional (3D) representation of a portion of a scene. The method includes receiving a query from an application requesting a planar geometry representation; searching a plane data store for plane data corresponding to the query; generating a rasterized plane mask from the plane data corresponding to the query, the rasterized plane mask comprising a plurality of plane coverage points; generating the 3D representation of the portion of the scene based at least in part on the rasterized plane mask according to the requested planar geometry representation; and sending the generated 3D representation of the portion of the scene to the application. Steinbrucker: [0037]. FIG. 21 shows a plane extraction system 1300, according to some embodiments. The plane extraction system 1300 may include depth fusion 1304, which may receive multiple depth maps 1302. The multiple depth maps 1302 may be created by one or more users wearing depth sensors, and/or downloaded from local/remote memories. The multiple depth maps 1302 may represent multiple views of a same surface. There may be differences between the multiple depth maps, which may be reconciled by the depth fusion 1304. Steinbrucker: [0207]. In some embodiments, the depth fusion 1304 may generate SDFs 1306 based, at least in part, on the method 600. Mesh bricks 1308 may be extracted from the SDFs 1306 by, for example, applying a marching cube algorithm over corresponding bricks (e.g., bricks [0000]-[0015] in FIG. 23). Plane extraction 1310 may detect planar surfaces in the mesh bricks 1308 and extract planes based at least in part on the mesh bricks 1308. Steinbrucker: [0208] L.1-8);
(11_2). isolating a depth prediction corresponding to the wall point cloud at least by training a deep segmentation network (e.g., in some embodiments, a mesh simplification method may include mesh block segmentation, pre-simplification, mesh planarization, and post-simplification. Steinbrucker: [0234] L.1-4. Depth data from the depth camera and/or image data from the visual camera may be processed to extract points representing the real objects in the physical world. Images from the visual camera, such as a stereoscopic camera, may be processed to compute a three-dimensional (3D) reconstruction of the physical world. In some embodiments, depth data may be generated from the images from the visual cameras, for example, using deep learning techniques. Steinbrucker: [0358] L.3-);
(11_3). projecting the depth prediction to a three-dimensional (3D) point cloud (e.g., The stitching component 508 can further apply pixel color data to the depth map or point cloud by projecting the color data from the respective 2D images onto the depth map or point cloud. This can involve casting rays out from the color cameras along each captured pixel towards the interesting portion of the depth map or point cloud to colorize the depth map or point cloud. Gausebeck: [0112] L.8-14. These 3D reconstruction data may be in any suitable formats including meshes, point clouds, voxels, and the like. Steinbrucker: [0275] L.12-14).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Steinbrucker into the combined teaching of Gausebeck, Mehr, Bell and Ebrahimi so the marching cube algorithm is applied to determine surfaces of walls and floors to facilitate the determination of a floor plan.
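For illustration only, the step of projecting a depth prediction to a 3D point cloud may be sketched as follows. The sketch assumes a pinhole camera model with hypothetical intrinsics (fx, fy, cx, cy) and an optional binary wall mask; none of these names or values come from the claims or the cited references, and camera pose is ignored.

```python
def backproject(depth, fx, fy, cx, cy, mask=None):
    """Convert a depth image (rows of depths) to a list of (X, Y, Z) points.

    Assumes a pinhole model: X = (u - cx) * Z / fx and Y = (v - cy) * Z / fy.
    Pixels with non-positive depth, or outside the optional mask, are skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # no valid depth prediction at this pixel
            if mask is not None and not mask[v][u]:
                continue  # keep only pixels the mask marks as wall
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points

# Hypothetical 2x2 depth prediction and wall mask.
depth = [[2.0, 2.0],
         [0.0, 4.0]]
wall_mask = [[1, 0],
             [0, 1]]
all_points = backproject(depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
wall_points = backproject(depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0, mask=wall_mask)
```

Applying the mask before back-projection is one way to isolate the depth values that belong to walls; the points would still need to be transformed by the camera pose to share a global coordinate system.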

Regarding claim 12, the combined teaching of Gausebeck, Mehr, Bell, Ebrahimi and Steinbrucker teaches the system of claim 11, the memory comprising the sequence of instructions which, when executed by the processor, causes the processor to perform generating the shape further comprises instructions which, when executed by the processor, causes the processor to perform the set of acts that further comprises: 
clustering the 3D point cloud into a plurality of clusters at least by detecting, at the deep segmentation network, one or more points that belong to a same plane instance (e.g., A planarization operation may be performed on mesh block 3202 to generate mesh block 3203. The planarization operation may include detecting planar areas in mesh 3202 based on, for example, the plane (or primitive) normals of the faces. Values of plane normals x1, y1, z1 of a first face 3212 and plane normals x2, y2, z2 of a second face 3214 may be compared. The comparison result of the plane normals of the first and second faces may indicate angles between the plane normals (e.g., angles between x1 and x2). When the comparison result is within a threshold value, it may be determined that the first and second planes are on a same planar area. In the illustrated example, planes 3212, 3214, 3216, and 3218 may be determined as on a first planar area corresponding to plane 3228; planes 3220, 3222, 3224, and 3226 may be determined as on a second same planar area corresponding to plane 3230.  Steinbrucker: [0256]. The blocks, for example, may be formatted as mesh blocks, in which features of objects in the physical world, such as corners, become points in the mesh block, or are used as points to create a mesh block. Connections between points in the mesh may indicate groups of points on the same surface of a physical object. Steinbrucker: [0303] L.21-27); and 
translating the plurality of clusters into a set of planes that forms a perimeter layout for the floorplan (e.g., In some embodiments, the 3D reconstruction 4908 may be stored as a mesh, with groups of points defining vertices of triangles that represent surfaces. In some embodiments, the 3D reconstruction 4908 may be generated using other techniques such as room layout detection system, and/or object detection. In some embodiments, a number of techniques may be used together to generate the 3D reconstruction 4908. For example, object detection may be used for known physical objects in the physical world, 3D modeling may be used for unknown physical objects in the physical world, and room layout detection system may also be used to identify the boundaries in the physical world such as walls and floors.  Steinbrucker: [0376] L.4-17).
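For illustration only, the normal comparison described in Steinbrucker [0256] (faces whose plane normals differ by less than a threshold are treated as lying on the same planar area) may be sketched as follows. The sketch groups faces greedily by the angle between their normals and deliberately ignores plane offset and mesh connectivity, which a fuller planarization would likely also consider. The function names, the 5-degree threshold and the sample normals are assumptions.

```python
import math

def angle_between(n1, n2):
    """Angle in degrees between two non-zero 3D normal vectors."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos))

def group_by_normal(normals, max_angle_deg):
    """Greedily group face indices whose normals are within the angle threshold."""
    groups = []  # each entry: (representative normal, [face indices])
    for i, n in enumerate(normals):
        for rep, members in groups:
            if angle_between(rep, n) <= max_angle_deg:
                members.append(i)
                break
        else:
            # No existing planar area matches: start a new one with this face.
            groups.append((n, [i]))
    return [members for _, members in groups]

# Hypothetical face normals: three roughly upward, two roughly along +x.
normals = [(0, 0, 1), (0, 0.02, 1), (1, 0, 0), (0.99, 0.01, 0), (0, 0, 1)]
planar_areas = group_by_normal(normals, max_angle_deg=5.0)
```

Each returned group lists the faces assigned to one planar area, analogous to the faces of the reference's example being assigned to planes 3228 and 3230.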

Regarding claims 18-19, the claims are wearable extended reality device claims corresponding to system claims 11-12, respectively.  The claims are similar in scope to claims 11-12, respectively, and are rejected under a similar rationale.

Response to Arguments
Applicant’s arguments filed on September 27, 2022 have been fully considered and are persuasive in view of the amendments.  Therefore, the previous rejection has been withdrawn.  However, upon further consideration, new grounds of rejection are made in view of the additional references Mehr (2018/0330184) and Bell (2016/0055268).
R1.	The examiner applied the reference of Mehr to teach identification (classification) of a wall and a room with “Now, solutions that aim in specific at extracting the layout of a 3D reconstructed indoor scene also exist. Such a determination of an architectural layout has many applications in Scene Modeling, Scene Understanding, Augmented Reality, and all domains where it is necessary to precisely build an accurate 3D indoor scene or to get accurate measurements. Two main families of methods exist in the state of the art: methods based on learning algorithms to cluster the walls, ceiling and floor, and methods based on pure geometric hypotheses to extract the layout (for instance, the room is a cuboid, the walls are vertical and their projection is a line on the floor plan, etc).” (Mehr: [0027]) and “The providing of the cycle of points comprises providing a 3D point cloud representing a room that includes the cycle of walls, identifying the projection plane and the 3D point cloud representing the cycle of walls from the 3D point cloud representing the room, projecting the 3D point cloud representing the cycle of walls on the projection plane, determining the 2D point cloud, and determining the concave hull; the identifying of the projection plane and of the 3D point cloud representing the cycle of walls comprises detecting planes in the 3D point cloud representing the room with a random sample consensus algorithm” (Mehr: [0054]-[0055]).
R2.	The examiner applied the reference of Bell to teach threshold values on the surface area and volume of 3D data as criteria for a wall and a room, respectively, with “Captured three-dimensional (3D) data associated with a 3D model of an architectural environment is received and at least a portion of the captured 3D data associated with a flat surface is identified” (Bell: Abstract L.2-5) and “Additionally, the first identification component 202 can associate an identified flat surface and/or an identified non-flat surface with a particular identifier and/or a particular score. For example, an identified flat surface can be assigned an identifier value associated with a wall, an identifier value associated with a floor, or an identifier value associated with a ceiling based on a calculated score. The first identification component 202 can assign an identifier value to an identified flat surface based on one or more criteria. For example, an identified flat surface can be assigned an identifier associated with a wall in response to a determination that a normal vector associated with the identified flat surface is approximately horizontal, that a size (e.g., a surface area, etc.) of the identified flat surface satisfies a particular threshold level, … In an aspect, a largest unidentified plane in a given room and/or in an entire 3D model can be iteratively identified as a wall, floor or ceiling (e.g., a surface) until a certain threshold associated with an enclosing of a captured volume is reached.” (Bell: [0058]).
For details, please see the rejections of the independent claims above.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
                                                                                                                                            
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SING-WAI WU whose telephone number is (571)270-5850. The examiner can normally be reached 9:00am - 5:30pm (Central Time).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at 571-272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SING-WAI WU/Primary Examiner, Art Unit 2611