DETAILED ACTION
Notice of Pre-AIA or AIA Status
Claims 1-16 were cancelled by preliminary amendment.
Claims 17-36 are pending in this application. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 17-36 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-9 of U.S. Patent No. US 10,373,372 B2. Although the claims at issue are not identical, they are not patentably distinct from each other because they are both directed towards methods for object recognition by comparing reconstructed 3D point clouds of known objects.
Claims 17-36 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-16 of U.S. Patent No. US 10,922,526 B2. Although the claims at issue are not identical, they are not patentably distinct from each other because they are both directed towards methods for object recognition by comparing reconstructed 3D point clouds of known objects, and the claims of the instant application are broader in scope and are anticipated by the previously patented narrower claims. 

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 17, 27 and 35 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Ho et al. (US Patent US 10,650,278 B1, hereby referred to as “Ho”).
Claims 17, 27 and 35.
Ho teaches: 
17. (New) A computer-implemented method of 3D object recognition, the method comprising: / 27. (New) A system comprising: / 35. A non-transitory computer-readable medium containing instructions which, when executed by a processor, cause the processor to: (Ho: abstract, column 3 lines 63-67, column 4 lines 1-44, FIG. 1 is a block diagram of a system 100 for semantic labeling of point clouds using images. For example, the system 100 may implement the process 200 of FIG. 2. The system 100 takes as input a three dimensional point cloud 102 of data based, at least in part on, lidar sensor data reflecting objects in a space (e.g., the vicinity of segment of road). For example, the point cloud 102 may be determined by applying bundle adjustment processing (e.g., using a SLAM (Simultaneous Localization And Mapping) algorithm) to a set of lidar sensor scans taken at different times and/or locations within the space.)
27. a first computing system and a second computing system, wherein the first computing system comprises at least one camera, at least one processor, and memory storing a plurality of executable instructions which, when executed by the at least one processor of the first computing system, cause the first computing system to: (Ho: column 3 lines 63-67, column 4 lines 1-44, FIG. 1 is a block diagram of a system 100 for semantic labeling of point clouds using images. For example, the system 100 may implement the process 200 of FIG. 2. The system 100 takes as input a three dimensional point cloud 102 of data based, at least in part on, lidar sensor data reflecting objects in a space (e.g., the vicinity of segment of road). The point cloud 102 may include data associated with points in the space, such as lidar intensity and/or geometric features of collections of nearby points (e.g., a normal or spin). In some implementations, the point cloud 102 may include static/moving labels that indicate whether a point reflects a static object or a moving object. For example, static/moving labels for points of the point cloud 102 may be determined by implementing the process 500 of FIG. 5.)
(Ho: column 3 lines 63-67, column 4 lines 1-44, FIG. 1, The system 100 also takes as input a set of two dimensional images 104 (e.g., greyscale images or color images) that include views of objects in the space. For example, the set of images 104 may be captured with one or more cameras or other image sensors (e.g., an array of cameras) operating in the same space as the lidar sensor. An image from the set of images 104 may be associated with a location and orientation of the image sensor (e.g., a camera) used to capture the image and/or a time when the image was captured. In some implementations, the point cloud 102 and the images 104 are based on data captured with sensors (e.g., lidar sensors, image sensors, global positioning system, etc.))
18. receiving a plurality of pictures of the 3D object; and constructing, from the plurality of pictures, the 3D point cloud. / 27. receive the plurality of pictures; (Ho: column 3 lines 63-67, column 4 lines 1-44, FIG. 1, The point cloud 102 and the set of images 104 are passed to the image selection module 106, which is configured to select a subset of the set of images 104 that provides multiple views of each of the points in the point cloud 102 while attempting to reduce the total number of images that will be processed by the downstream modules of the system 100. For example, image selection module 106 may implement the process 300 of FIG. 3. Once image selection module 106 has identified the subset of the set of images 104 that will be processed an image 108 from the subset may be passed to the 3D-2D projection module 110, along with the point cloud 102, for processing. For example, the image 108 may be similar to the image 1300 of FIG. 13. Selecting and processing multiple images captured from different locations with different views of objects reflected in the point cloud 102 may help to aggregate information to account for occlusion in some of the images.)
17. receiving a 3D point cloud corresponding to a 3D object; / 27. determine, based on the plurality of pictures, a 3D point cloud corresponding to the 3D object; / 35. receive a 3D point cloud corresponding to a 3D object; (Ho: column 4 lines 27-67, FIG. 1, The point cloud 102 and the set of images 104 are passed to the image selection module 106, which is configured to select a subset of the set of images 104 that provides multiple views of each of the points in the point cloud 102 while attempting to reduce the total number of images that will be processed by the downstream modules of the system 100. For example, image selection module 106 may implement the process 300 of FIG. 3. Once image selection module 106 has identified the subset of the set of images 104 that will be processed an image 108 from the subset may be passed to the 3D-2D projection module 110, along with the point cloud 102, for processing. For example, the image 108 may be similar to the image 1300 of FIG. 13. Selecting and processing multiple images captured from different locations with different views of objects reflected in the point cloud 102 may help to aggregate information to account for occlusion in some of the images.)
17. splitting the 3D point cloud into a plurality of 3D primitives; / 27. split the 3D point cloud into a plurality of 3D primitives; / 35. split the 3D point cloud into a plurality of 3D primitives; (Ho: column 3 lines 52-62, The predictions projected back to the point cloud may be improved by analyzing three dimensional clusters of points together. For example, a labeled point cloud may be segmented into clusters using a hierarchical segmentation running on a graphical processing unit (GPU). The point cloud may be represented as a graph split into connected components before applying hierarchical segmentation based on the Felzenszwalb algorithm to each connected component. The label predictions for the resulting clusters may be input to a three dimensional convolution neural network to determine a label prediction for the cluster as a whole, which may be propagated to the points in the cluster of the point cloud. Column 4 lines 44-66, The 3D-2D projection module 110 may determine a projection of points from the point cloud 102 onto the image 108. The position and orientation of an image sensor when it was used to capture the image 108 may be correlated (e.g., using a bundle adjustment algorithm such as SLAM) with a position and orientation in the point cloud 102 model of the space.)
17. determining a connectivity graph describing a spatial connectivity of the plurality of 3D primitives forming the 3D object; / 27. determine a connectivity graph describing a spatial connectivity of the plurality of 3D primitives forming the 3D object; / 35. determine a connectivity graph describing a spatial connectivity of the plurality of 3D primitives forming the 3D object; (Ho: column 12 lines 3-55, Figures 5-6, 11-12, column 12 lines 21-38, FIG. 6 is a flowchart of an example process 600 for three dimensional segmentation of a point cloud into clusters. The process 600 includes determining 610 a graph based on a semantic labeled point cloud, wherein nodes of the graph are points from the semantic labeled point cloud and edges of the graph connect nodes with respective points that satisfy a pairwise criteria; identifying 620 one or more connected components of the graph; and determining 630 clusters of points from the semantic labeled point cloud by performing a hierarchical segmentation of each of the one or more connected components of the graph. For example, the process 600 may be implemented with a graphical processing unit (GPU) to exploit the highly parallel nature of the calculations. For example, the process 600 may be implemented by the system 100 of FIG. 1. For example, the process 600 may be implemented by the vehicle controller 1100 of FIG. 11. For example, the process 600 may be implemented by the computing system 1200 of FIG. 12)
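For illustration only, the graph construction and connected-component steps quoted above (steps 610 and 620 of process 600) can be sketched as follows. This is a minimal sketch and not Ho's implementation: Ho refers only to "a pairwise criteria", so the criterion used here (same semantic label and a distance no greater than `max_gap`) is an assumption, as are the function and parameter names. The hierarchical segmentation of step 630 is omitted.

```python
from itertools import combinations
from math import dist


def connected_components(points, labels, max_gap=1.0):
    """Group point indices into connected components of a proximity graph.

    Nodes are point indices. An edge joins two points when they share a
    semantic label and lie within max_gap of each other. This pairwise
    criterion is an assumption made for illustration; it is not taken from Ho.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the component root, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the components of every pair that satisfies the criterion.
    for i, j in combinations(range(len(points)), 2):
        if labels[i] == labels[j] and dist(points[i], points[j]) <= max_gap:
            parent[find(i)] = find(j)

    # Collect the indices that ended up under each root.
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

The all-pairs loop is quadratic in the number of points and is used here only for brevity; a spatial index would be the usual choice for large point clouds.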
17. performing a 3D match search in a 3D database based on the plurality of 3D primitives and the connectivity graph; / 27. perform a 3D match search in a 3D database based on the plurality of 3D primitives and the connectivity graph; / 35. perform a 3D match search in a 3D database based on the plurality of 3D primitives and the connectivity graph; (Ho: column 12 lines 3-20, The process 500 includes assigning 510 indications of moving likelihood to respective points of the point cloud based on how frequently the respective points are detected in lidar scans captured at different times; and applying 520 a fully connected conditional random field to the indications of moving likelihood for points in the point cloud to obtain moving labels for respective points of the point cloud. The moving labels may be binary indications of whether or not a respective point of the point cloud corresponds to a moving object (e.g., moving vs. static). The moving labels may be included in an augmented image as one of one or more channels of data from the point cloud. For example, the process 500 may be implemented by the system 100 of FIG. 1. For example, the process 500 may be implemented by the vehicle controller 1100 of FIG. 11. For example, the process 500 may be implemented by the computing system 1200 of FIG. 12)
17. and outputting an identifier of the 3D object./ 27. and output an identifier of the 3D object. / 35. and output an identifier of the 3D object. (Ho: column 6 lines 12-28, The 3D CNN classification module 150 includes a three dimensional convolutional neural network that takes a three dimensional array of predictions for a cluster (e.g., based on the 3D semantic priors for the cluster) as input and outputs a label prediction for the cluster as a whole. The 3D cluster label predictions 152 that result from processing the clusters of the labeled point cloud 132 with the 3D CNN classification module 150 may be used to update 3D semantic priors of the labeled point cloud 132)

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 17-35 are rejected under 35 U.S.C. 103 as being unpatentable over Yuan et al. (US Patent US 9,424,461 B1, hereby referred to as “Yuan”), in view of Ho et al. (US Patent US 10,650,278 B1, hereby referred to as “Ho”). 
Claims 17, 27 and 35. (Original) 
Ho teaches: 
17. (New) A computer-implemented method of 3D object recognition, the method comprising: / 27. (New) A system comprising: / 35. A non-transitory computer-readable medium containing instructions which, when executed by a processor, cause the processor to: (Ho: abstract, column 3 lines 63-67, column 4 lines 1-44, FIG. 1 is a block diagram of a system 100 for semantic labeling of point clouds using images. For example, the system 100 may implement the process 200 of FIG. 2. The system 100 takes as input a three dimensional point cloud 102 of data based, at least in part on, lidar sensor data reflecting objects in a space (e.g., the vicinity of segment of road). For example, the point cloud 102 may be determined by applying bundle adjustment processing (e.g., using a SLAM (Simultaneous Localization And Mapping) algorithm) to a set of lidar sensor scans taken at different times and/or locations within the space.)
27. a first computing system and a second computing system, wherein the first computing system comprises at least one camera, at least one processor, and memory storing a plurality of executable instructions which, when executed by the at least one processor of the first computing system, cause the first computing system to: (Ho: column 3 lines 63-67, column 4 lines 1-44, FIG. 1 is a block diagram of a system 100 for semantic labeling of point clouds using images. For example, the system 100 may implement the process 200 of FIG. 2. The system 100 takes as input a three dimensional point cloud 102 of data based, at least in part on, lidar sensor data reflecting objects in a space (e.g., the vicinity of segment of road). The point cloud 102 may include data associated with points in the space, such as lidar intensity and/or geometric features of collections of nearby points (e.g., a normal or spin). In some implementations, the point cloud 102 may include static/moving labels that indicate whether a point reflects a static object or a moving object. For example, static/moving labels for points of the point cloud 102 may be determined by implementing the process 500 of FIG. 5.)
27. capture, by the at least one camera, a plurality of pictures of a 3D object; and send, to the second computing system, the plurality of pictures, and wherein the second computing system comprises at least one processor and memory storing a plurality of executable instructions which, when executed by the at least one processor of the (Ho: column 3 lines 63-67, column 4 lines 1-44, FIG. 1, The system 100 also takes as input a set of two dimensional images 104 (e.g., greyscale images or color images) that include views of objects in the space. For example, the set of images 104 may be captured with one or more cameras or other image sensors (e.g., an array of cameras) operating in the same space as the lidar sensor. An image from the set of images 104 may be associated with a location and orientation of the image sensor (e.g., a camera) used to capture the image and/or a time when the image was captured. In some implementations, the point cloud 102 and the images 104 are based on data captured with sensors (e.g., lidar sensors, image sensors, global positioning system, etc.))
18. receiving a plurality of pictures of the 3D object; and constructing, from the plurality of pictures, the 3D point cloud. / 27. receive the plurality of pictures; (Ho: column 3 lines 63-67, column 4 lines 1-44, FIG. 1, The point cloud 102 and the set of images 104 are passed to the image selection module 106, which is configured to select a subset of the set of images 104 that provides multiple views of each of the points in the point cloud 102 while attempting to reduce the total number of images that will be processed by the downstream modules of the system 100. For example, image selection module 106 may implement the process 300 of FIG. 3. Once image selection module 106 has identified the subset of the set of images 104 that will be processed an image 108 from the subset may be passed to the 3D-2D projection module 110, along with the point cloud 102, for processing. For example, the image 108 may be similar to the image 1300 of FIG. 13. Selecting and processing multiple images captured from different locations with different views of objects reflected in the point cloud 102 may help to aggregate information to account for occlusion in some of the images.)
17. receiving a 3D point cloud corresponding to a 3D object; / 27. determine, based on the plurality of pictures, a 3D point cloud corresponding to the 3D object; / 35. receive a 3D point cloud corresponding to a 3D object; (Ho: column 4 lines 27-67, FIG. 1, The point cloud 102 and the set of images 104 are passed to the image selection module 106, which is configured to select a subset of the set of images 104 that provides multiple views of each of the points in the point cloud 102 while attempting to reduce the total number of images that will be processed by the downstream modules of the system 100. For example, image selection module 106 may implement the process 300 of FIG. 3. Once image selection module 106 has identified the subset of the set of images 104 that will be processed an image 108 from the subset may be passed to the 3D-2D projection module 110, along with the point cloud 102, for processing. For example, the image 108 may be similar to the image 1300 of FIG. 13. Selecting and processing multiple images captured from different locations with different views of objects reflected in the point cloud 102 may help to aggregate information to account for occlusion in some of the images.)
17. splitting the 3D point cloud into a plurality of 3D primitives; / 27. split the 3D point cloud into a plurality of 3D primitives; / 35. split the 3D point cloud into a plurality of 3D primitives; (Ho: column 3 lines 52-62, The predictions projected back to the point cloud may be improved by analyzing three dimensional clusters of points together. For example, a labeled point cloud may be segmented into clusters using a hierarchical segmentation running on a graphical processing unit (GPU). The point cloud may be represented as a graph split into connected components before applying hierarchical segmentation based on the Felzenszwalb algorithm to each connected component. The label predictions for the resulting clusters may be input to a three dimensional convolution neural network to determine a label prediction for the cluster as a whole, which may be propagated to the points in the cluster of the point cloud. Column 4 lines 44-66, The 3D-2D projection module 110 may determine a projection of points from the point cloud 102 onto the image 108. The position and orientation of an image sensor when it was used to capture the image 108 may be correlated (e.g., using a bundle adjustment algorithm such as SLAM) with a position and orientation in the point cloud 102 model of the space.)
17. determining a connectivity graph describing a spatial connectivity of the plurality of 3D primitives forming the 3D object; / 27. determine a connectivity graph describing a spatial connectivity of the plurality of 3D primitives forming the 3D object; / 35. determine a connectivity graph describing a spatial connectivity of the plurality of 3D primitives forming the 3D object; (Ho: column 12 lines 3-55, Figures 5-6, 11-12, column 12 lines 21-38, FIG. 6 is a flowchart of an example process 600 for three dimensional segmentation of a point cloud into clusters. The process 600 includes determining 610 a graph based on a semantic labeled point cloud, wherein nodes of the graph are points from the semantic labeled point cloud and edges of the graph connect nodes with respective points that satisfy a pairwise criteria; identifying 620 one or more connected components of the graph; and determining 630 clusters of points from the semantic labeled point cloud by performing a hierarchical segmentation of each of the one or more connected components of the graph. For example, the process 600 may be implemented with a graphical processing unit (GPU) to exploit the highly parallel nature of the calculations. For example, the process 600 may be implemented by the system 100 of FIG. 1. For example, the process 600 may be implemented by the vehicle controller 1100 of FIG. 11. For example, the process 600 may be implemented by the computing system 1200 of FIG. 12)
17. performing a 3D match search in a 3D database based on the plurality of 3D primitives and the connectivity graph; / 27. perform a 3D match search in a 3D database based on the plurality of 3D primitives and the connectivity graph; / 35. perform a 3D match search in a 3D database based on the plurality of 3D primitives and the connectivity graph; (Ho: column 12 lines 3-20, The process 500 includes assigning 510 indications of moving likelihood to respective points of the point cloud based on how frequently the respective points are detected in lidar scans captured at different times; and applying 520 a fully connected conditional random field to the indications of moving likelihood for points in the point cloud to obtain moving labels for respective points of the point cloud. The moving labels may be binary indications of whether or not a respective point of the point cloud corresponds to a moving object (e.g., moving vs. static). The moving labels may be included in an augmented image as one of one or more channels of data from the point cloud. For example, the process 500 may be implemented by the system 100 of FIG. 1. For example, the process 500 may be implemented by the vehicle controller 1100 of FIG. 11. For example, the process 500 may be implemented by the computing system 1200 of FIG. 12)
17. and outputting an identifier of the 3D object./ 27. and output an identifier of the 3D object. / 35. and output an identifier of the 3D object. (Ho: column 6 lines 12-28, The 3D CNN classification module 150 includes a three dimensional convolutional neural network that takes a three dimensional array of predictions for a cluster (e.g., based on the 3D semantic priors for the cluster) as input and outputs a label prediction for the cluster as a whole. The 3D cluster label predictions 152 that result from processing the clusters of the labeled point cloud 132 with the 3D CNN classification module 150 may be used to update 3D semantic priors of the labeled point cloud 132)
To the extent that Ho does not explicitly teach 3D primitives,
Yuan teaches:
17. (New) A computer-implemented method of 3D object recognition, the method comprising: / 27. (New) A system comprising: / 35. A non-transitory computer-readable medium containing instructions which, when executed by a processor, cause the processor to: (Yuan: abstract, Figures 8 and 9, columns 20-21)
27. a first computing system and a second computing system, wherein the first computing system comprises at least one camera, at least one processor, and memory storing a plurality of executable instructions which, when executed by the at least one processor of the first computing system, cause the first computing system to: (Yuan: Figure 8 column 19 lines 26-67, Figure 9 column 20 lines 34-67, column 21 lines 1-62, Each server typically will include an operating system that provides executable program instructions for the general administration and operation of that server and typically will include computer-readable medium storing instructions that, when executed by a processor of the server, allow the server to perform its intended functions. Suitable implementations for the operating system and general functionality of the servers are known or commercially available and are readily implemented by persons having ordinary skill in the art, particularly in light of the disclosure herein)
27. capture, by the at least one camera, a plurality of pictures of a 3D object; and send, to the second computing system, the plurality of pictures, and wherein the second (Yuan: Figure 8 column 19 lines 26-67, FIG. 8 illustrates front and back views of an example electronic computing device 800 that can be used in accordance with various embodiments. In this example, the computing device 800 has a display screen 802 (e.g., an LCD element) operable to display information or image content to one or more users or viewers of the device. The display screen of some embodiments displays information to the viewers facing the display screen (e.g., on the same side of the computing device as the display screen). The computing device in this example can include one or more imaging elements, in this example including two image capture elements 804 on the front of the device and at least one image capture element 810 on the back of the device. Each image capture element 804 and 810 may be, for example, a camera, a charge-coupled device (CCD), a motion detection sensor or an infrared sensor, or other image capturing technology)
18. receiving a plurality of pictures of the 3D object; and constructing, from the plurality of pictures, the 3D model. / 27. receive the plurality of pictures; (Yuan: column 4 lines 12-20, As described, users of computing devices (e.g., mobile phones, tablet computers, etc.) desire to point their device at an object and retrieve relevant information (e.g., pricing information, user reviews, links to purchase the object, etc.) associated with the object. FIGS. 1(a), 1(b), 1(c), 1(d), 1(e), and 1(f) illustrate example images of objects that can be captured and analyzed to retrieve relevant information in accordance with various embodiments.)
(Yuan: column 17 lines 3-20, In this example, the request is received to a network interface layer 708 of the content provider 706. The network interface layer can include any appropriate components known or used to receive requests from across a network, such as may include one or more application programming interfaces (APIs) or other such interfaces for receiving such requests. The network interface layer 708 might be owned and operated by the provider, or leveraged by the provider as part of a shared resource or "cloud" offering. The network interface layer can receive and analyze the request, and cause at least a portion of the information in the request to be directed to an appropriate system or service, such as an image analysis service 710 as illustrated in FIG. 7. An image analysis service in this example includes components operable to receive image information about an object, analyze the image information, and return information relating to people, products, places, or things that are determined to match objects in that image information.)
17. representing the 3D model into a plurality of 3D primitives; / 27. represent the 3D model into a plurality of 3D primitives; / 35. represent the 3D model into a plurality of 3D primitives; (Yuan: column 8 lines 15-28, A catalog 3D model can include a high-resolution 3D triangle mesh scan of an object made under controlled conditions and using professional equipment. The 3D triangle mesh can contain scale information, detailed texture maps, and is accurate for all possible viewpoints of the object. In addition to this information, the catalog 3D model can also include metadata describing the location, configuration and range of motion of the points of articulation in the catalog object. In accordance with an embodiment, the simplified 3D model and the 3D mesh based object model can be captured by multi-view stereo camera setup and computed by a multi-view stereo reconstruction process. Alternatively, the object model can be captured by or one/multiple moving cameras moving around the object.)
17. determining a connectivity graph describing a spatial connectivity of the plurality of 3D primitives forming the 3D object; / 27. determine a connectivity graph describing a spatial connectivity of the plurality of 3D primitives forming the 3D object; / 35. determine a connectivity graph describing a spatial connectivity of the plurality of 3D primitives forming the 3D object;  (Examiner Note: a 3D triangle mesh based object model [connectivity graph] of the object may not change its shape as a whole ("rigid") and/or each of its components is rigid itself and connected through a set of joints ("articulated") [3D primitives form the 3D object], Yuan: column 8 lines 30-42, The device 3D model can include a low-resolution 3D triangle mesh representation of one or more points of view of an object made by a customer using a mobile device. Device 3D models can be generated through the use of stereo cameras, structured light, scanning laser range finding, light field technology, or any other technology suitable for implementation on a mobile device. 3D reconstruction algorithms such as dense depth map estimation and registration and triangulation followed by bundle adjustment can be applied to generate the device 3D models. Similar to the catalog 3D model, the 3D triangle mesh contains scale information as well as detailed texture maps. Unlike the catalog 3D model, there is no metadata provided about the articulations in the target object.)
(Yuan: column 13 lines 7-37, Figure 5, As described, catalog simplified models and associated texture maps can be used in the image matching/object identification process. Accordingly, an offline catalog object intake process is performed to generate one or more simplified models. Such a process can include generating, for each new catalog object that is either a single-piece rigid body, or is itself a collection of rigid bodies connected by one or more articulated joints, a high definition 3D scan of the object to create a triangle mesh of the catalog 3D model. This can be accomplished using, for example, any one of off-the-shelf 3D scanning product. Along with the 3D triangle mesh, images of the object are taken and can be used by the 3D scanning product to produce the texture map for the object. Using the catalog intake software tool, the catalog intake operator can identify the parts of the product that are articulated and can input the information about the range associated with each point of articulation.)
17. and outputting an identifier of the 3D object. / 27. and output an identifier of the 3D object. / 35. and output an identifier of the 3D object. (Yuan: column 17 lines 47-62, The image analysis service 710 can receive information from each contacted identification service 714 as to whether one or more matches could be found with at least a threshold level of confidence, for example, and can receive any appropriate information for a located potential match. The information from each identification service can be analyzed and/or processed by one or more applications of the image analysis service, such as to determine data useful in obtaining information for each of the potential matches to provide to the user. For example, an image analysis service might receive bar codes, product identifiers, or any other types of data from the identification service(s), and might process that data to be provided to a service such as an information aggregator service 716 that is capable of locating descriptions or other content related to the located potential matches.)
It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify Ho’s semantic labeling of point clouds with the teachings of Yuan for 3D object recognition using geometric primitives, as they are both directed towards object recognition in image data.  The determination of obviousness is predicated upon the following findings: One skilled in the art would have been motivated to modify Ho in this manner in order to improve the overall accuracy of object recognition by leveraging the use of 3D primitives. Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained above using known engineering design, interface and/or programming techniques, without changing a “fundamental” operating principle of Ho, while the teaching of Yuan continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of improving the overall accuracy of object recognition to leverage the use of geometric primitives as it was a known technique that could easily be applied for 3D object classification. It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.

Consider Claims 18-19 and 28.
The combination of Ho and Yuan teaches:

18. (New) The method of claim 17, further comprising: receiving a plurality of pictures of the 3D object; and constructing, from the plurality of pictures, the 3D point cloud. (Yuan: column 9 lines 9-30, For example, the offline catalog object intake process may generate all the possible features from both single view and multiple views, from multi-modal features and from a lengthy video sequence. The device side feature may be from just a single view and from a small number of video frames. Therefore, this initial set of features may be large. Accordingly, in accordance with an embodiment, a feature selection and conversion algorithm can be applied to generate a compact feature vector that is adapted to large-scale database search and can be used in object recognition process. For example, a distinctive score can be computed for each feature and only those features with a high distinctive score (e.g., a distinctive score above a predetermined distinctive score threshold) are used for object recognition. The distinctive score can be determined by one or more factors for each type of features, which can include image gradient magnitude of local image descriptors; curvature and length of object contour segments; sparseness of global object descriptors; number of vertices, triangles of the 3D mesh; connectedness of the object topological map, etc. The selected features can be further converted into more compact forms for lower space and computation complexity.)
19. (New) The method of claim 18, wherein the plurality of pictures were extracted from a video of the 3D object. (Yuan: column 8 lines 50-60 In accordance with an embodiment, temporal features from video sequences can include single-view and multi-view video sequences can be captured for the object. Additional features can be extracted from video sequences such as feature point trajectories captured by object tracking; relative depth/parallax info from the feature correspondences and/or optical flows; correlation between object patches which reflects how differently the object looks from multiple viewpoints. Column 16 lines 44-65, Figure 7 In this example, a user is able to capture one or more types of information using at least one computing device 702. For example, a user can cause a device to capture image ( or video) information, and can send at least a portion of that image ( or video) information across at least one appropriate network 704 to attempt to obtain information for one or more objects, persons, or occurrences within a field of view of the device.)
28. (New) The system of claim 27, wherein the instructions, when executed by the at least one processor of the first computing system, cause the first computing system to: record a video of the 3D object; and extract the plurality of pictures from the video. (Yuan: column 8 lines 50-60 In accordance with an embodiment, temporal features from video sequences can include single-view and multi-view video sequences can be captured for the object. Additional features can be extracted from video sequences such as feature point trajectories captured by object tracking; relative depth/parallax info from the feature correspondences and/or optical flows; correlation between object patches which reflects how differently the object looks from multiple viewpoints. Column 16 lines 44-65, Figure 7 In this example, a user is able to capture one or more types of information using at least one computing device 702. For example, a user can cause a device to capture image ( or video) information, and can send at least a portion of that image ( or video) information across at least one appropriate network 704 to attempt to obtain information for one or more objects, persons, or occurrences within a field of view of the device.)


Consider Claim 20. The combination of Ho and Yuan teaches: The method of claim 18, wherein constructing the 3D point cloud comprises: extracting a plurality of key points from the plurality of pictures; defining a plurality of 3D vertices of the 3D object, wherein a vertex of the plurality of 3D vertices corresponds in 3D to a key point of the plurality of key points; and deriving the 3D point cloud based on the 3D vertices. (Yuan: Column 9 lines 9-30, For example, the offline catalog object intake process may generate all the possible features from both single view and multiple views, from multi-modal features and from a lengthy video sequence. The device side feature may be from just a single view and from a small number of video frames. Therefore, this initial set of features may be large. Accordingly, in accordance with an embodiment, a feature selection and conversion algorithm can be applied to generate a compact feature vector that is adapted to large-scale database search and can be used in object recognition process. For example, a distinctive score can be computed for each feature and only those features with a high distinctive score ( e.g., a distinctive score above a predetermined distinctive score threshold) are used for object recognition. The distinctive score can be determined by one or more factors for each type of features, which can include image gradient magnitude of local image descriptors; curvature and length of object contour segments; sparseness of global object descriptors; number of vertices, triangles of the 3D mesh; connectedness of the object topological map, etc. The selected features can be further converted into more compact forms for lower space and computation complexity)


Consider Claim 21. (New) The combination of Ho and Yuan teaches: The method of claim 18, wherein constructing the 3D point cloud comprises: extracting a plurality of key points from the  (Ho: column 12 lines 3-55, Figures 5-6, 11-12, column 12 lines 21-38, FIG. 6 is a flowchart of an example process 600 for three dimensional segmentation of a point cloud into clusters. The process 600 includes determining 610 a graph based on a semantic labeled point cloud, wherein nodes of the graph are points from the semantic labeled point cloud and edges of the graph connect nodes with respective points that satisfy a pairwise criteria; identifying 620 one or more connected components of the graph; and determining 630 clusters of points from the semantic labeled point cloud by performing a hierarchical segmentation of each of the one or more connected components of the graph. For example, the process 600 may be implemented with a graphical processing unit (GPU) to exploit the highly parallel nature of the calculations. For example, the process 600 may be implemented by the system 100 of FIG. 1. For example, the process 600 may be implemented by the vehicle controller 1100 of FIG. 11. For example, the process 600 may be implemented by the computing system 1200 of FIG. 12)

Consider Claims 22, 29 and 36. 
The combination of Ho and Yuan teaches: 
22. (New) The method of claim 17, wherein performing the 3D match search comprises comparing the plurality of 3D primitives to one or more 2D normal vectors. / 29. (New) The system of claim 27, wherein the instructions that cause the second computing system to perform the 3D match search comprise instructions that cause the second computing system to compare the plurality of 3D primitives to one or more 2D normal vectors. / 36. (New) The non-transitory computer-readable medium of claim 35,  (Yuan: column 3 lines 35-63, Embodiments also can allow for additional information to be captured and/or provided, such as by utilizing stereoscopic imaging with a stereo matching process, or by capturing and analyzing multiple frames using a multi-frame matching process. The compact combined visual feature vector can be compared to one or more stored vectors of a set of stored vectors, where each of the set of stored vectors corresponds to a respective type of object. A matching stored vector having a respective similarity score that at least meets a matching threshold can be determined, and based at least in part on the matching stored vector, at least one respective type of object represented in the image information can be identified. Other approaches can be used as well for object recognition and/or image tracking. For example, the image information can be processed by the electronic device to generate a 3D model ( e.g., a wire frame model) and a 2D representation of the object represented in the image information. The 3D model and the 2D representation of the object can be transmitted to a remote server. 
The remote server includes an object recognition, image matching, or other such image analysis service that can match the 3D model generated by the electronic device (also referred to as a device 3D model) to at least one 3D model of a set of 3D models accessible by the remote server (also referred to as simplifying 3D models), by comparing the device 3D model against at least a portion of the set of simplifying 3D models, including possible articulation configurations for each of the simplifying 3D models.)

Consider Claim 23. (New) The combination of Ho and Yuan teaches: The method of claim 17, wherein performing the 3D match search comprises comparing the plurality of 3D primitives and the connectivity graph with 3D primitives and connectivity graphs of known 3D objects stored in the 3D database. (Yuan: column 11 lines 40-60, Upon determining at least one match, as illustrated in FIG. 4(b), the image analysis service projects or maps the 2D image information onto the device 3D model, producing a set of texture maps for the device 3D model. Using point source ray projection or other similar projection, for example, the texture map 440 for the device 3D model can be projected onto the simplifying 3D model(s) 442 that were found to have a high degree of saliency, each of which can be stored in a texture map database 444 or included in some other database. In accordance with various embodiments, the texture map 440 for the device 3D model can be project onto one or more generic simplifying models, such as a sphere, cube, cylinder, among others, instead of the determined catalog-based simplified model. Accordingly, in this example, the textures maps for the device 3D model and the texture maps for the catalog based simplifying 3D model 444 can be projected or mapped onto the one of the generic simplifying models, and the mapping to the generic simplifying model can be used as the basis for matching. Ho: column 3 lines 35-50, Data from multiple lidar scans taken at different times and/or from different locations may be used to generate ( e.g., using a bundle adjustment process) the point cloud. Information about whether objects reflected in the point cloud are moving may be available by comparing lidar scans from different times. For example, a probability that a point in the point cloud corresponds to moving or static (i.e., not moving) object may be determined based on intersection tests. 
A fully connected CRF may be applied to these motion probabilities (or other indications) to determine motion labels for points of the point cloud. These motion labels in the point cloud may be propagated (e.g., a channel of projected data) to an augmented image that is input to the two dimensional convolutional neural network and used for semantic segmentation to assist in distinguishing certain classes of objects that can be static or moving. column 12 lines 3-55, Figures 5-6, 11-12, column 12 lines 21-38, FIG. 6 is a flowchart of an example process 600 for three dimensional segmentation of a point cloud into clusters. The process 600 includes determining 610 a graph based on a semantic labeled point cloud, wherein nodes of the graph are points from the semantic labeled point cloud and edges of the graph connect nodes with respective points that satisfy a pairwise criteria; identifying 620 one or more connected components of the graph; and determining 630 clusters of points from the semantic labeled point cloud by performing a hierarchical segmentation of each of the one or more connected components of the graph. For example, the process 600 may be implemented with a graphical processing unit (GPU) to exploit the highly parallel nature of the calculations. For example, the process 600 may be implemented by the system 100 of FIG. 1. For example, the process 600 may be implemented by the vehicle controller 1100 of FIG. 11. For example, the process 600 may be implemented by the computing system 1200 of FIG. 12.)

Consider Claim 24. (New) The combination of Ho and Yuan teaches: The method of claim 17, wherein performing the 3D match search comprises performing, using a machine learning algorithm, the 3D match search. (Ho: column 3 lines 10-34, The same process (e.g., including a 3D to 2D projection) may be used to generate augmented images for training and for inference with a two dimensional convolutional neural network used to generate label predictions based on the augmented images. For example, a training point cloud may include points labeled with ground truth labels. These points may be projected onto training images and used with the associated ground truth labels for the projected points to train the two dimensional convolution neural network for semantic segmentation. Using the same process for training and inference may assure the same types of variations from the projection process are experienced in training and inference and thus improve the performance of the two dimensional convolution neural network for semantic segmentation.)

Consider Claim 25. (New) The combination of Ho and Yuan teaches: The method of claim 17, wherein the plurality of 3D primitives comprise planes, spheres, cylinders, cubes, and torus. (Yuan: column 11 lines 40-60, Upon determining at least one match, as illustrated in FIG. 4(b), the image analysis service projects or maps the 2D image information onto the device 3D model, producing a set of texture maps for the device 3D model. Using point source ray projection or other similar projection, for example, the texture map 440 for the device 3D model can be projected onto the simplifying 3D model(s) 442 that were found to have a high degree of saliency, each of which can be stored in a texture map database 444 or included in some other database. In accordance with various embodiments, the texture map 440 for the device 3D model can be project onto one or more generic simplifying models, such as a sphere, cube, cylinder, among others, instead of the determined catalog-based simplified model. Accordingly, in this example, the textures maps for the device 3D model and the texture maps for the catalog based simplifying 3D model 444 can be projected or mapped onto the one of the generic simplifying models, and the mapping to the generic simplifying model can be used as the basis for matching.)

Consider Claim 26. (New) The combination of Ho and Yuan teaches: The method of claim 17, further comprising storing, in the 3D database, the plurality of 3D primitives and the connectivity graph. (Yuan: column 11 lines 40-60, Upon determining at least one match, as illustrated in FIG. 4(b), the image analysis service projects or maps the 2D image information onto the device 3D model, producing a set of texture maps for the device 3D model. Using point source ray projection or other similar projection, for example, the texture map 440 for the device 3D model can be projected onto the simplifying 3D model(s) 442 that were found to have a high degree of saliency, each of which can be stored in a texture map database 444 or included in some other database. In accordance with various embodiments, the texture map 440 for the device 3D model can be project onto one or more generic simplifying models, such as a sphere, cube, cylinder, among others, instead of the determined catalog-based simplified model. Accordingly, in this example, the textures maps for the device 3D model and the texture maps for the catalog based simplifying 3D model 444 can be projected or mapped onto the one of the generic simplifying models, and the mapping to the generic simplifying model can be used as the basis for matching.)

Consider Claims 30-31. 
The combination of Ho and Yuan teaches:
30. (New) The system of claim 27, wherein the instructions, when executed by the at least one processor of the first computing system, cause the first computing system to: receive user input corresponding to the 3D object; and send, to the second computing system, the user input.
31. (New) The system of claim 30, wherein the instructions that cause the second computing system to perform the 3D match search comprise instructions that cause the second computing system to perform the 3D match search based on the user input. (Yuan: column 13 lines 7-37, Figure 5, As described, catalog simplified models and associated texture maps can be used in the image matching/object identification process. Accordingly, an offline catalog object intake process is performed to generate one or more simplified models. Such a process can include generating, for each new catalog object that is either a single-piece rigid body, or is itself a collection of rigid bodies connected by one or more articulated joints, a high definition 3D scan of the object to create a triangle mesh of the catalog 3D model. This can be accomplished using, for example, any one of off-the-shelf 3D scanning product. Along with the 3D triangle mesh, images of the object are taken and can be used by the 3D scanning product to produce the texture map for the object. Using the catalog intake software tool, the catalog intake operator can identify the parts of the product that are articulated and can input the information about the range associated with each point of articulation.) (Ho: column 6 lines 12-28, The 3D CNN classification module 150 includes a three dimensional convolutional neural network that takes a three dimensional array of predictions for a cluster (e.g., based on the 3D semantic priors for the cluster) as input and outputs a label prediction for the cluster as a whole. The 3D cluster label predictions 152 that result from processing the clusters of the labeled point cloud 132 with the 3D CNN classification module 150 may be used to update 3D semantic priors of the labeled point cloud 132)

Consider Claims 32-33. 
The combination of Ho and Yuan teaches: 
32. (New) The system of claim 30, wherein the instructions that cause the second computing system to perform the 3D match search comprise instructions that cause the second computing system to compare the plurality of 3D primitives and the connectivity graph with 3D primitives and connectivity graphs of known 3D objects stored in the 3D database.
(Ho: column 12 lines 3-55, Figures 5-6, 11-12, column 12 lines 21-38, FIG. 6 is a flowchart of an example process 600 for three dimensional segmentation of a point cloud into clusters. The process 600 includes determining 610 a graph based on a semantic labeled point cloud, wherein nodes of the graph are points from the semantic labeled point cloud and edges of the graph connect nodes with respective points that satisfy a pairwise criteria; identifying 620 one or more connected components of the graph; and determining 630 clusters of points from the semantic labeled point cloud by performing a hierarchical segmentation of each of the one or more connected components of the graph. For example, the process 600 may be implemented with a graphical processing unit (GPU) to exploit the highly parallel nature of the calculations. For example, the process 600 may be implemented by the system 100 of FIG. 1. For example, the process 600 may be implemented by the vehicle controller 1100 of FIG. 11. For example, the process 600 may be implemented by the computing system 1200 of FIG. 12. Yuan: column 11 lines 40-60, Upon determining at least one match, as illustrated in FIG. 4(b), the image analysis service projects or maps the 2D image information onto the device 3D model, producing a set of texture maps for the device 3D model. Using point source ray projection or other similar projection, for example, the texture map 440 for the device 3D model can be projected onto the simplifying 3D model(s) 442 that were found to have a high degree of saliency, each of which can be stored in a texture map database 444 or included in some other database. In accordance with various embodiments, the texture map 440 for the device 3D model can be project onto one or more generic simplifying models, such as a sphere, cube, cylinder, among others, instead of the determined catalog-based simplified model.)

Consider Claim 34. (New) The combination of Ho and Yuan teaches: The system of claim 27, wherein the instructions, when executed by the at least one processor of the second computing system, cause the second computing system to send the identifier to the first computing system. (Yuan: column 17 lines 47-62, The image analysis service 710 can receive information from each contacted identification service 714 as to whether one or more matches could be found with at least a threshold level of confidence, for example, and can receive any appropriate information for a located potential match. The information from each identification service can be analyzed and/or processed by one or more applications of the image analysis service, such as to determine data useful in obtaining information for each of the potential matches to provide to the user. For example, an image analysis service might receive bar codes, product identifiers, or any other types of data from the identification service(s), and might process that data to be provided to a service such as an information aggregator service 716 that is capable of locating descriptions or other content related to the located potential matches.) (Ho: column 6 lines 12-28, The 3D CNN classification module 150 includes a three dimensional convolutional neural network that takes a three dimensional array of predictions for a cluster (e.g., based on the 3D semantic priors for the cluster) as input and outputs a label prediction for the cluster as a whole. The 3D cluster label predictions 152 that result from processing the clusters of the labeled point cloud 132 with the 3D CNN classification module 150 may be used to update 3D semantic priors of the labeled point cloud 132.)


Conclusion
The prior art made of record in form PTO-892 and not relied upon is considered pertinent to applicant's disclosure. 
Younas; Sohail et al.	US 20180276887 A1	Medial Axis Extraction for Complex 3D Objects
Schneemann; Dirk	US 20180190377 A1	MODELING AND LEARNING CHARACTER TRAITS AND MEDICAL CONDITION BASED ON 3D FACIAL FEATURES
Saha; Arindam et al.	US 20170200307 A1	CONSTRUCTING A 3D STRUCTURE
Berger; Ulrich et al.	US 10839530 B1	Moving point detection
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TAHMINA ANSARI whose telephone number is 571-270-3379.  The examiner can normally be reached on IFP Flex - Monday through Friday 9 to 5.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, SUMATI LEFKOWITZ can be reached on 571-272-3638.  The fax phone numbers for the organization where this application or proceeding is assigned are 571-273-8300 for regular communications and 571-273-8300 for After Final communications. TC 2600’s customer service number is 571-272-2600.
Any inquiry of a general nature or relating to the status of this application or proceeding should be directed to the receptionist whose telephone number is 571-272-2600.



2662
/Tahmina Ansari/

September 29, 2021
/TAHMINA N ANSARI/Primary Examiner, Art Unit 2662