DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Applicant Response to Official Action
The response filed on 1/26/2022 has been entered and made of record.
Acknowledgment 
Claims 1, 4, 6-8, 11, 14, and 16-18, amended on 1/26/2022, are acknowledged by the examiner.  
Response to Arguments
Applicant’s arguments with respect to claims 1, 11, and their dependent claims have been considered but they are moot in view of the new grounds of rejection necessitated by amendments initiated by the applicant.  Examiner addresses the main arguments of the Applicant as below.
Regarding the 35 U.S.C. 112(a) rejections, the amendment filed on 1/26/2022 addresses the issue.  As a result the previous 35 U.S.C. 112(a) rejections are withdrawn.
Regarding the 35 U.S.C. 103 rejections, the Applicant amended the independent claims and then presented several arguments.
First, the Applicant argued that “Lucas discloses "determining when a normalized color difference (such as a sum of absolute difference (SAD) score) with respect to the original images meets a criteria, such as when it does or does not exceed a threshold in two or more views." Lucas, col. 9, line 66 through col. 10, line 2, emphasis added. Lucas further discloses "determining the L1 (normalized) color difference (SAD) score between the rendered image and the original image for each perspective or camera." Lucas, col. 19, lines 49-51. Comparing a normalized color difference of an original image to a rendered image does not teach or suggest identifying normal score based on a comparison of the sub-scores, where the sub-scores represent angles between (i) a normal vector of the first point, which is perpendicular to a surface of the 3D point cloud, and (i) a unit vector of each projection plane of multiple projection planes that surround the point cloud, as recited in Claim 1.” [para. 2 on page 13 of the Remarks]. Regarding the Huang reference, the Applicant argued “this does not disclose or suggest the above-emphasized elements of Claim 1. Claim 1 recites identifying identify a number of sub-scores, where the sub-scores for the first point represent angles between (i) a normal vector of the first point, which is perpendicular to a surface of the 3D point cloud, and (ii) a unit vector of each projection plane of multiple projection planes that surround the point cloud.” [para. 4 on page 13 of the Remarks].
Examiner respectfully disagrees. These arguments are based on a new limitation, “sub-scores”. It is noted that the amended claims 1 and 11 define “the sub-scores for the first point represent angles between (i) a normal vector, which is perpendicular to a surface of the 3D point cloud, of the first point, and (ii) a unit vector of each projection plane of multiple projection planes that surround the point cloud”.
However, paragraph [0080] of the original specification defines the sub-scores as follows: “multiple sub-scores are assigned for each point with respect to each projection plane, where the score is the inner product of the normal vector of the one point and the unit vector of each plane” [para. 0080; Fig. 4D]. Therefore, the amended claims contradict the description in the specification. Hence the argument related to “the sub-scores for the first point represent angles between (i) a normal vector, which is perpendicular to a surface of the 3D point cloud, of the first point, and (ii) a unit vector of each projection plane of multiple projection planes that surround the point cloud” is not persuasive.
Second, the Applicant argued that “Lucas fails to teach or suggest an encoding device that identifies "identify normal scores for each of the points of the 3D point cloud, the normal scores indicating a proximity between a normal vector of a point perpendicular to a surface of the 3D point cloud to a normal vector of each projection plane of multiple projection planes." [para. 2 on page 13 of the Remarks].
Examiner respectfully disagrees. It is noted that in the amendment, the Applicant canceled the following limitation: “the normal scores indicating a proximity between a normal vector of a point perpendicular to a surface of the 3D point cloud to a normal vector of each projection plane of multiple projection planes." As a result, this argument is no longer relevant.
Third, the Applicant argued that the cited references do not teach the following limitation: “Simply encoding a vectors indicating the surface direction of the point cloud does not teach or suggest "identify a normal score for each of the points of the 3D point cloud, based on a comparison of the sub-scores associated with each respective point of the 3D point cloud" where a sub-score "represent angles between (i) a normal vector of the first point, which is perpendicular to a surface of the 3D point."
Examiner respectfully disagrees. The argument is also based on the “sub-scores”. It is noted that the original specification mentions “sub-scores” only three times, all in paragraph [0080]: “multiple sub-scores are assigned for each point with respect to each projection plane, where the score is the inner product of the normal vector of the one point and the unit vector of each plane. For instance, if there are six projection planes (as shown in diagram 430), the indices of projected planes could be integer numbers from 0 to 5. It is noted that the each of the multiple sub-scores relates the angle of the normal vector of the point with the angle of each projection plane, such that the projection plane that is selected for a particular point is based on projection plane whose angle is closest to the normal vector of the point. The plane having the largest sub-score is specified as the initial cluster index of that point” [para. 0080; Fig. 4D]. As shown in the citation above, nowhere does the specification describe comparing the sub-scores. Hence the argument related to “identify a normal score for each of the points of the 3D point cloud, based on a comparison of the sub-scores associated with each respective point of the 3D point cloud" where a sub-score "represent angles between (i) a normal vector of the first point, which is perpendicular to a surface of the 3D point” is not persuasive.
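For illustration only, the computation quoted from paragraph [0080] above can be sketched as follows. This is a minimal sketch under stated assumptions, not the Applicant's implementation: the function names `sub_scores` and `initial_cluster_index`, the six axis-aligned unit vectors in `PLANE_UNIT_VECTORS`, and their ordering from 0 to 5 are hypothetical and do not appear in the specification.

```python
# Hypothetical sketch of the sub-score computation quoted from para. [0080].
# The plane ordering and every name below are assumptions for illustration.

# Assumed unit vectors for six projection planes, indexed 0 to 5.
PLANE_UNIT_VECTORS = [
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
    (-1.0, 0.0, 0.0),
    (0.0, -1.0, 0.0),
    (0.0, 0.0, -1.0),
]


def sub_scores(normal, planes=PLANE_UNIT_VECTORS):
    """Return one sub-score per plane: the inner product of the point's
    normal vector with that plane's unit vector."""
    return [sum(n * u for n, u in zip(normal, plane)) for plane in planes]


def initial_cluster_index(normal, planes=PLANE_UNIT_VECTORS):
    """Return the index of the plane having the largest sub-score."""
    scores = sub_scores(normal, planes)
    return max(range(len(scores)), key=lambda i: scores[i])


# Example: a unit normal pointing mostly along +z.
example_normal = (0.0, 0.6, 0.8)
example_scores = sub_scores(example_normal)
example_index = initial_cluster_index(example_normal)
```

With the assumed plane ordering, the example normal receives its largest sub-score for the plane at index 2, which would then serve as the initial cluster index of that point.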

Claim Rejection – 35 U.S.C. § 112
The following is a quotation of 35 U.S.C. 112(a): 
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention. 
The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112: 
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 1-20 are rejected under 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph, for introducing new matter. Amended independent claims 1 and 11 include the following claim limitation: “the sub-scores for the first point represent angles between (i) a normal vector, which is perpendicular to a surface of the 3D point cloud, of the first point, and (ii) a unit vector of each projection plane of multiple projection planes that surround the point cloud”. It is noted that paragraph [0080] of the original specification defines the sub-scores as follows: “multiple sub-scores are assigned for each point with respect to each projection plane, where the score is the inner product of the normal vector of the one point and the unit vector of each plane” [para. 0080; Fig. 4D]. Hence the claim limitation “the sub-scores for the first point represent angles between (i) a normal vector, which is perpendicular to a surface of the 3D point cloud, of the first point, and (ii) a unit vector of each projection plane of multiple projection planes that surround the point cloud” is new matter, which is not described in the application as originally filed. Therefore, claims 1, 11, and their dependent claims are rejected under 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph. The new matter is required to be canceled from the claims (Please see MPEP 608.04).
Claims 1-20 are rejected under 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph, for introducing new matter. Amended independent claims 1 and 11 include the following claim limitation: “based on a comparison of the sub-scores associated with each respective point of the 3D point cloud”. It is noted that the original specification mentions “sub-scores” only three times, all in paragraph [0080]: “multiple sub-scores are assigned for each point with respect to each projection plane, where the score is the inner product of the normal vector of the one point and the unit vector of each plane. For instance, if there are six projection planes (as shown in diagram 430), the indices of projected planes could be integer numbers from 0 to 5. It is noted that the each of the multiple sub-scores relates the angle of the normal vector of the point with the angle of each projection plane, such that the projection plane that is selected for a particular point is based on projection plane whose angle is closest to the normal vector of the point. The plane having the largest sub-score is specified as the initial cluster index of that point” [para. 0080; Fig. 4D]. There is no description related to comparing the sub-scores in the specification. Hence the claim limitation “based on a comparison of the sub-scores associated with each respective point of the 3D point cloud” is new matter, which is not described in the application as originally filed. Therefore, claims 1, 11, and their dependent claims are rejected under 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph. The new matter is required to be canceled from the claims (Please see MPEP 608.04).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under pre-AIA 35 U.S.C. 103(a) are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims under pre-AIA 35 U.S.C. 103(a), the examiner presumes that the subject matter of the various claims was commonly owned at the time any inventions covered therein were made absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was not commonly owned at the time a later invention was made in order for the examiner to consider the applicability of pre-AIA 35 U.S.C. 

Claims 1-4, 6-7, 9, 11-14, 16-17 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Lucas (US Patent 10,867,430 B2), (“Lucas”), in view of Huang et al. (US Patent 10,062,207 B1), (“Huang”), in view of Boyce et al. (US Patent 10,893,299 B1), (“Boyce”).

Regarding claim 1, Lucas meets the claim limitations, as follows:
An encoding device ((i.e. encoder) [Lucas: col 24, line 51]; (i.e. image processing system) [Lucas: col 24, line 46]) for point cloud encoding (i.e. forming a point cloud using the image data) [Lucas: col 30, line 2], the encoding device ((i.e. encoder) [Lucas: col 24, line 51]; (i.e. image processing system) [Lucas: col 24, line 46]) comprising: a processor ((i.e. the image processing system 2900 may have one or more processors) [Lucas: col 24, line 46-47; Figs. 29-30] (i.e. graphics and/or video functionality may be integrated within a chipset. Alternatively, a discrete graphics and/or video processor may be used. As still another implementation, the graphics and/or video functions may be provided by a general purpose processor, including a multi-core processor. In further implementations, the functions may be implemented in a consumer electronics device.) [Lucas: col 26, line 17-24; Figs. 29-30]) configured to: segment an area including points (i.e. performing a rigorous yet computationally efficient initial segmentation process that provides good quality initial segmentation masks. Such initial segmentation combines the results of chroma-key segmentation, background subtraction, and neural network object detection. The combined result is refined by a boundary segmentation method such as active contours or graph cut algorithm) [Lucas: col 6, line 7-14] representing a three-dimensional (3D) point cloud into multiple voxels ((i.e. The method may comprise rendering of an individual local point volume at the point cloud and including 3D particles positioned within the individual local point volume and defining other local point volumes) [Lucas: col 30, line 20-24 – Note: point volumes are also known as voxels]; (i.e. Candidate pixel locations indicating landmarks from the segmentation are then tested to form an initial point cloud) [Lucas: col 6, line 14-16]; (i.e. 
wherein individual local point volumes are formed of at least one particle on the point cloud defining a volume having fixed real world dimensions relative to at least one object in the multiple images and that remains fixed from image to image of different perspectives; and providing an expanded and filtered point cloud to be used to generate images) [Lucas: col 30, line 7-13]), 
identify a number of sub-scores for each of the points of the 3D point cloud including a first point, the sub-scores for the first point represent angles between (i) a normal vector, which is perpendicular to a surface of the 3D point cloud, of the first point, and (ii) a unit vector of each projection plane of multiple projection planes that surround the point cloud ((i.e. The 2D patch image representations for a projection angle can be created by projecting only those points for which a projection angle has the closest normal. In other words, the 2D patch image representation is taken for the points that maximize the dot product of the point normal and the plane normal) [Boyce: col 30, line 37-42] – Note: The application specification defines the sub-scores as “sub-scores are assigned for each point with respect to each projection plane, where the score is the inner product of the normal vector of the one point and the unit vector of each plane” [para. 0080; Fig. 4D]. Boyce discloses creating 2D patch image representations from the points that maximize the dot product of the point normal and the plane normal); 
identify a normal score ((i.e. feature identifying algorithm that generates scores, such with a Shi-Tomasi Eigenvalue-based "corner" score. The process retains the points with such a score that is above a threshold and are maximal among samples within each grid cell. See for example, Shi, J., et al., "Good features to track", Cornell University (1993)) [Lucas: col 12, line 6-11]; (i.e. the match may be performed by determining when a normalized color difference (such as a sum of absolute difference (SAD) score) with respect to the original images) [Lucas: col 9, line 65-68]; (i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54]) for each of the points of the 3D point cloud ((i.e. feature identifying algorithm that generates scores, such with a Shi-Tomasi Eigenvalue-based "corner" score. The process retains the points with such a score that is above a threshold and are maximal among samples within each grid cell. See for example, Shi, J., et al., "Good features to track", Cornell University (1993)) [Lucas: col 12, line 6-11]; (i.e. A bracket search is used again to assign a depth estimate near the pivot point to the neighbor point, and by one form, by determining the CENSUS score within the bracket. These neighbor points then become 3D particles that populate the latest expanded point cloud, and each such neighbor point then becomes its own LPV) [Lucas: col 8, line 52-58]; (i.e. Stereo-matching confidence scores based on local region descriptors for image data (also referred to herein as representations) are then used to select the best depth estimate for the point being analyzed. 
By one form, this involves an initial selection by using a gradient histogram-type of local region descriptor such as a DAISY score) [Lucas: col 6, line 21-28]) based on a comparison of the sub-scores associated with each respective point of the 3D point cloud ((i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54]; (i.e. The 2D patch image representations for a projection angle can be created by projecting only those points for which a projection angle has the closest normal. In other words, the 2D patch image representation is taken for the points that maximize the dot product of the point normal and the plane normal) [Boyce: col 30, line 37-42]; (i.e. According to one embodiment, normals processing mechanism 2010 explicitly codes surface normals with point cloud geometry to provide for accurate geometric reconstructions of objects during rendering. In such an embodiment, normals processing mechanism 2010 may implement the surface normals to perform subjective artifact reduction while rendering the objects) [Boyce: col 36, line 27-33]), 
identify a smoothing score for each of the multiple voxels ((i.e. Process 400 may include "smooth point cloud with shrink wrapping" 498. Starting with the visual hull represented as a mesh, the method may shrink wrap (see Dale, A. M., "Cortical surface-based analysis: I. Segmentation and surface reconstruction", Neuroimage 9.2, pp. 179-194 (1999)) the point cloud by moving mesh vertexes closer to the original point cloud, subject to regularization so that the resultant point cloud is smooth. The topology of the mesh is discarded because triangle quality tends to be poor when vertices are spaced close together) [Lucas: col 20, line 53-62]; (i.e. the point is then refined using a non-parametric intensity-based confidence score, such as a CENSUS score) [Lucas: col 6, line 28-30]; (i.e. A bracket search is used again to assign a depth estimate near the pivot point to the neighbor point, and by one form, by determining the CENSUS score within the bracket. These neighbor points then become 3D particles that populate the latest expanded point cloud, and each such neighbor point then becomes its own LPV) [Lucas: col 8, line 52-58]; (i.e. the CENSUS pixel area that is being applied as explained below. This approach minimizes the amount of work that has to be done by artists to clean up resulting point clouds for film and/or video production which may be at the expense of more computation time. The method is biased towards erring on the side of false positives instead of false negatives by having such a robust seeding and expansion of points image by image such that corresponding points on different images could each have its own candidate point in the point cloud resulting in some redundancy. 
This is ultimately more efficient because it is easier for artists to manually remove extraneous points (which is a relatively easier 2D task) than to complete missing structures by sculpting (which is a relatively more difficult 3D task)) [Lucas: col 6, line 37-51]) that include at least one of the points of the 3D point cloud ((i.e. wherein individual local point volumes are formed of at least one particle on the point cloud defining a volume having fixed real world dimensions relative to at least one object in the multiple images and that remains fixed from image to image of different perspectives) [Lucas: col 9, line 24-28]; (i.e. filtering iterations are performed for each or individual expansion iterations. The filtering is performed by setting a local point volume (LPV) (i.e. 3D point) at each 2D sample location with an estimated depth determined by ray-casting, mentioned previously, that was the center pixel of a patch window used for computing a CENSUS score) [Lucas: col 6, line 52-57]), wherein the smoothing score is based on a number of points in a voxel that correspond to different projection planes of the multiple projection planes (i.e. A stereo matching technique is then applied to perform depth estimation for seeds in the initial point cloud. Rays are traced from the camera center of a first view or image, through a seed point on the first image. Ray positions are projected onto the view of another overlapping second camera, or second view or image, to determine a linear range or bracket of potential depth estimates along the ray and within the second view or image) [Lucas: col 8, line 4-11],group each point of the 3D point cloud ((i.e. Process 400 may include "generate point cloud by combining the LPVs") [Lucas: col 18, line 55-56]; (i.e. constructing the point cloud by combining the local point volumes) [Lucas: col 30, line 17-18]; (i.e. 
a collection of points or pixel locations from a single image that are assigned depth values is referred to herein as a depth map of a single image, while a 3D object formed by combining multiple images from different perspectives is represented as a point cloud comprised of an unstructured collection of points with associated colors and normal directions.) [Lucas: col 5, line 18-24]; (i.e. rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 47-48]; (i.e. one stereo technique uses a patch to determine if points on one image match points on another image rather than the more conventional scanline matching. The patches are better for capturing similar pixel data on images of two different perspectives) [Lucas: col 5, line 28-32]) to one of the multiple projection planes ((i.e. A stereo matching technique is then applied to perform depth estimation for seeds in the initial point cloud. Rays are traced from the camera center of a first view or image, through a seed point on the first image. Ray positions are projected onto the view of another overlapping second camera, or second view or image, to determine a linear range or bracket of potential depth estimates along the ray and within the second view or image. By one form, the process is repeated for all pairs of cameras with overlapping fields of view, although other alternatives could be used) [Lucas: col 8, line 4-14]; (i.e. a set of points contained within the silhouette of the visual hull when reprojected into individual camera views) [Lucas: col 12, line 14-16]; (i.e. the LPV or sphere will have its 1 cm diameter cover that pixel length of 5, 10, 100, and so on. 
Spheres are the selected shape due to software rendering efficiency so that their projection onto any camera image is not less than a pixel in width and when rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 42-48]) based on the normal score (i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54]) and the smoothing score (i.e. Stereo-matching confidence scores are then used to select the best depth estimate for the seed point being analyzed. By one form, and as mentioned above, this involves an initial selection by using the DAISY score, while the depth estimate is then refined using the CENSUS score.) [Lucas: col 8, line 15-20] to generate refined patches that represent the 3D point cloud ((i.e. Stereo-matching confidence scores are then used to select the best depth estimate for the seed point being analyzed. By one form, and as mentioned above, this involves an initial selection by using the DAISY score, while the depth estimate is then refined using the CENSUS score.) [Lucas: col 8, line 15-20]; (i.e. To improve localization of landmarks, the highest score of a CENSUS metric using a 7x7 pixel patch with a bracketed line search is performed where the local CENSUS search bracket 1032 and 1106 are respectively shown on FIGS. 10 and 11. CENSUS is another stereo-matching descriptor and refers to a non-parametric intensity-based image data representation over a certain pixel area referred to as a CENSUS transform that summarizes local image structure by providing a bit string (or in other words, transforms image data into a representation). 
The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel) [Lucas: col 14, line 32-45]), generate two-dimensional (2D) frames (i.e. provide an expanded and filtered point cloud to be used to generate images) [Lucas: col 10, line 3-5] that include pixels that represent the refined patches ((i.e. provide an expanded and filtered point cloud to be used to generate images" 316. The final point cloud then may be provided first for post-processing to refine the points, which may include traditional space carving, as described below, and then for modeling, display, or analysis as needed depending on the application and as described below as well) [Lucas: col 10, line 3-9]; (i.e. To improve localization of landmarks, the highest score of a CENSUS metric using a 7x7 pixel patch with a bracketed line search is performed where the local CENSUS search bracket 1032 and 1106 are respectively shown on FIGS. 10 and 11. CENSUS is another stereo-matching descriptor and refers to a non-parametric intensity-based image data representation over a certain pixel area referred to as a CENSUS transform that summarizes local image structure by providing a bit string (or in other words, transforms image data into a representation). The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel) [Lucas: col 14, line 32-45]), and encode the 2D frames to generate a bitstream; and a communication interface (i.e. An analog or digital interface may be used to communicatively couple graphics subsystem 3015 and display 3020. 
For example, the interface may be any of a High-Definition Multimedia Interface, Display Port, wireless HDMI, and/or wireless HD compliant techniques. Graphics subsystem 3015 may be integrated into processor 3010 or chipset 3005) [Lucas: col 26, line 6-12; Figs. 29-30] operably coupled to the processor ((i.e. In various implementations, content services device(s) 3030 may include a cable television box, personal computer, network, telephone, Internet enabled devices or appliance capable of delivering digital information and/or content, and any other similar device capable of unidirectionally or bidirectionally communicating content between content providers and platform 3002 and/display 3020, via network 3060 or directly. It will be appreciated that the content may be communicated unidirectionally and/or bidirectionally to and from any one of the components in system 3000 and a content provider via network 3060) [Lucas: col 26, line 60 – col. 27, line 3]; (i.e. processors 2920 may be communicatively coupled to both the image device 2902 and the logic modules 2904 for operating those components) [Lucas: col 25, line 1-3; Figs. 29-30]), the communication interface configured to transmit the bitstream ((i.e. logic modules 2904 may communicate remotely with, or otherwise may be communicatively coupled to, the imaging device 2902 for further processing of the image data) [Lucas: col 23, line 15-18]; (i.e. Program logic may allow platform 3002 to stream content to media adaptors or other content services device(s) 3030 or content delivery device(s)) [Lucas: col 27, line 40-43]).  
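For reference, the CENSUS transform as described in the cited Lucas passage (a bit string marking which neighboring pixels are darker than a pivot pixel) can be sketched as follows. This is a minimal sketch under stated assumptions, not Lucas's implementation: the function names, the square window, and the use of a Hamming distance to compare two bit strings are assumptions added for illustration.

```python
# Hypothetical sketch of a CENSUS-style transform, based only on the
# description quoted from Lucas. Names and window shape are assumptions.

def census_bits(image, row, col, radius=1):
    """Return a list of bits for the square window centered on the pivot
    pixel at (row, col): 1 where a neighbor's intensity is less than the
    pivot's intensity, 0 otherwise. The pivot itself is skipped.
    The caller must keep the window inside the image bounds."""
    pivot = image[row][col]
    bits = []
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            if dr == 0 and dc == 0:
                continue
            bits.append(1 if image[row + dr][col + dc] < pivot else 0)
    return bits


def hamming_distance(bits_a, bits_b):
    """Count positions where two bit strings differ (assumed comparison;
    a smaller distance suggests more similar local structure)."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


# Two 3x3 grayscale patches that differ only in the bottom-right pixel.
patch_a = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
patch_b = [[10, 20, 30], [40, 50, 60], [70, 80, 40]]

bits_a = census_bits(patch_a, 1, 1)
bits_b = census_bits(patch_b, 1, 1)
distance = hamming_distance(bits_a, bits_b)
```

A radius of 3 would correspond to the 7x7 pixel patch mentioned in the quoted passage; the 3x3 window here is used only to keep the example small.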
Lucas does not explicitly disclose the following claim limitations (Emphasis added).
An encoding device for point cloud encoding, the encoding device comprising: a processor configured to: segment an area including points representing a three-dimensional (3D) point cloud into multiple voxels,
identify a number of sub-scores for each of the points of the 3D point cloud including a first point, the sub-scores for the first point represent angles between (i) a normal vector, which is perpendicular to a surface of the 3D point cloud, of the first point, and (ii) a unit vector of each projection plane of multiple projection planes that surround the point cloud, identify a normal score for each of the points of the 3D point cloud, based on a comparison of the sub-scores associated with each respective point of the 3D point cloud, identify a smoothing score for each of the multiple voxels that include at least one of the points of the 3D point cloud, wherein the smoothing score is based on a number of points in a voxel that correspond to different projection planes of the multiple projection planes, group each point of the 3D point cloud to one of the multiple projection planes based on the normal score and the smoothing score to generate refined patches that represent the 3D point cloud, generate two-dimensional (2D) frames that include pixels that represent the refined patches, and encode the 2D frames to generate a bitstream; and a communication interface operably coupled to the processor, the communication interface configured to transmit the bitstream. 
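As an aid to reading the voxel-related limitations above, the quantity "a number of points in a voxel that correspond to different projection planes" can be sketched as a per-voxel tally. This is a minimal sketch under stated assumptions: the claim does not specify a formula, and the function names, the uniform cubic grid, and the `voxel_size` parameter are hypothetical.

```python
# Hypothetical sketch: tally, for each voxel, how many of its points are
# currently assigned to each projection plane. Grid layout and names are
# assumptions; the claim language does not specify them.
from collections import Counter, defaultdict


def voxel_key(point, voxel_size):
    """Map a 3D point to the integer index of the cubic voxel containing it."""
    return tuple(int(c // voxel_size) for c in point)


def voxel_plane_counts(points, plane_indices, voxel_size):
    """Return {voxel index: Counter({plane index: number of points})}."""
    counts = defaultdict(Counter)
    for point, plane in zip(points, plane_indices):
        counts[voxel_key(point, voxel_size)][plane] += 1
    return counts


# Four points; the first three fall in voxel (0, 0, 0), the last in (1, 0, 0).
points = [(0.1, 0.2, 0.3), (0.4, 0.4, 0.4), (0.9, 0.9, 0.9), (1.5, 0.2, 0.2)]
plane_indices = [2, 2, 0, 1]
counts = voxel_plane_counts(points, plane_indices, voxel_size=1.0)
```

In this example, voxel (0, 0, 0) holds two points assigned to plane 2 and one point assigned to plane 0. How such tallies would be turned into a smoothing score is left open here, since the quoted claim language does not say.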
However, in the same field of endeavor Huang further discloses the claim limitations and the deficient claim limitations, as follows:
(i.e. a confidence score fs based on smoothness analysis) [Huang: col 8, line 31] is based on a number of points in a voxel that correspond to different projection planes of the multiple projection planes ((i.e. Third, each of the iso-points is performed with a confidence score calculation. Specifically, different methods firstly are used to calculate a missing rate of point cloud data at the point (i.e., a confidence score fg based on distance field gradient analysis) and a possibility of the point belonging to a detail part (i.e., a confidence score fs based on smoothness analysis). After that, the two confidence scores are multiplied to be mixed as a final confidence score fk=fg*fs) [Huang: col 8, line 26-34]; (i.e. For the confidence score fs based on smooth analysis: the confidence score of smooth analysis for each sk is calculated by the following method. The higher the confidence score of the smooth analysis, the more irregular the local point cloud distribution is, which is more possible to be a detail area that needs more scanning. nk is a normal vector of sk, Qk represents K numbers of adjacent points of sk (i.e., the nearest K numbers of already-scanned points q adjacent thereto, and the number of K in the illustrated embodiment generally is 100), hk indicates a distance from the farthest kth point to sk.) [Huang: col 4, line 57-67]),
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Lucas with Huang to program the system to calculate normal scores and perform smoothness analysis.
Therefore, the combination of Lucas with Huang will enable the system to calculate a missing rate of point cloud data at a certain point [Huang: col 4, line 37-41]. 
Lucas and Huang do not explicitly disclose the following claim limitations (Emphasis added).
identify normal scores for each of the points of the 3D point cloud, the normal scores indicating a proximity between a normal vector of a point perpendicular to a surface of the 3D point cloud to a normal vector of each projection plane of multiple projection planes,
….
encode the 2D frames to generate a bitstream;
However, in the same field of endeavor Boyce further discloses the claim limitations and the deficient claim limitations, as follows:
the normal scores indicating a proximity between a normal vector of a point perpendicular to a surface of the 3D point cloud ((i.e. wherein encoding the surface normals data with the point cloud geometry data comprises encoding a vector perpendicular to each point on a surface of an object to provide directions associated with the object) [Boyce: col 40, line 5-8]; (i.e. At processing block 2280, the point cloud data objects are rendered using the surface normal data. Thus, the perpendicular vectors associated with object points are implemented to provide surface directions for each object during rendering. Accordingly, implementation of the surface normals provides enhanced reconstruction of the original geometry of objects in the point cloud data) [Boyce: col 37, line 9-15]) to a normal vector of each projection plane of multiple projection planes (i.e. In the encoding system 1900 of FIG. 19A, a point cloud frame is projected onto several two-dimensional (2D) planes, each 2D plane corresponding to a projection angle. The projection planes can be similar to the projection planes 1512 of FIG. 15B. In some implementations, six projection angles are used in the PCC standard test model, with the projection angles corresponding to angles pointing to the centers of six faces of a rectangular solid that bound the object represented by the point cloud data. While six projection angles are described, other number of angles could possibly be used in different implementations) [Boyce: col 30, line 24-35],
planes ((i.e. The multiple projection planes 1512 can be used to generate multiple two-dimensional (2D) projections, each projection associated with a projection plane) [Boyce: col 27, line 59-62]; (i.e. In the encoding system 1900 of FIG. 19A, a point cloud frame is projected onto several two-dimensional (2D) planes, each 2D plane corresponding to a projection angle. The projection planes can be similar to the projection planes 1512 of FIG. 15B. In some implementations, six projection angles are used in the PCC standard test model, with the projection angles corresponding to angles pointing to the centers of six faces of a rectangular solid that bound the object represented by the point cloud data. While six projection angles are described, other number of angles could possibly be used in different implementations) [Boyce: col 30, line 24-35]),
….
encode the 2D frames to generate a bitstream (i.e. The apparatus includes one or more processors to encode surface normals data with point cloud geometry data included in the video bit stream data for reconstruction of objects within the video bit stream data based on the surface normals data) [Boyce: Abstract]; (i.e. one embodiment, encoder 2400 performs HEVC coding on the video bit stream data. In such an embodiment, normals encoder 2406 uses the high level syntax to indicate that the video bit stream includes normal data. For instance, normals encoder 2106 may encode the 3-dimensional (X, Y, Z) normals data within 3 color components ( e.g., Red, Green, Blue (RGB), or YUV) having a fixed bit depth ( e.g., 8, 10, 12, 16).) [Boyce: col 37, line 36-44; Figs. 24A-B];
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Lucas and Huang with Boyce to encode the image data into the bitstream.
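For illustration only, the smoothing-score limitation of claim 1 ("based on a number of points in a voxel that correspond to different projection planes") may be sketched as a per-voxel tally. This is a minimal sketch under stated assumptions and does not characterize the implementation of the Applicant or of any cited reference: the function names, the cubic voxel grid, and the `voxel_size` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def voxel_key(point, voxel_size):
    """Map a 3D point to the integer index of the cubic voxel containing it."""
    return tuple(int(c // voxel_size) for c in point)


def smoothing_scores(points, plane_ids, voxel_size=1.0):
    """For each occupied voxel, count the points assigned to each projection plane.

    points    : iterable of (x, y, z) coordinates
    plane_ids : projection-plane index currently assigned to each point
    Returns a dict mapping voxel index -> Counter {plane index: point count}.
    """
    scores = defaultdict(Counter)
    for point, plane in zip(points, plane_ids):
        scores[voxel_key(point, voxel_size)][plane] += 1
    return scores


if __name__ == "__main__":
    pts = [(0.1, 0.2, 0.3), (0.9, 0.5, 0.5), (1.5, 0.1, 0.1)]
    planes = [0, 2, 2]
    for voxel, counts in smoothing_scores(pts, planes).items():
        print(voxel, dict(counts))
```

Only voxels that contain at least one point appear in the result, which mirrors the claim language "voxels that include at least one of the points."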


Regarding claim 2, Lucas meets the claim limitations as set forth in claim 1. Lucas further meets the claim limitations as follows.
The encoding device of Claim 1 ((i.e. encoder) [Lucas: col 24, line 51]; (i.e. image processing system) [Lucas: col 24, line 46]), wherein the processor (i.e. the image processing system 2900 may have one or more processors) [Lucas: col 24, line 46-47; Figs. 29-30] is further configured to: identify neighboring voxels that are proximate to each of the multiple voxels ((i.e. identified using a nearest-neighbor lookup to first connect neighboring points that are within a specified distance) [Lucas: col 20, line 28-30]; (i.e. this operation may
include first expanding or growing an initial point cloud (or at least the points that could be used to form an initial point cloud when the initial point cloud is not actually generated) by providing depth estimates to points neighboring a pivot point that already has a depth estimate determined by the ray-casting process. By one form, neighboring points are the directly adjacent pixel locations to a current, center, or other key pixel (or pixel location), and by one example, is the adjacent upper, lower, left and right pixels relative to the pivot pixel. Many other variations are contemplated such as including the diagonal pixel locations and/or any other pattern that includes pixel locations within a certain range or distance from the pivot pixel.) [Lucas: col 8, line 35-48]) that include at least one of the points of the 3D point cloud ((i.e. The point cloud may be severely oversampled in some regions because the expansion phase will add points in overlapping regions. Thus, process 400 may include "generate a visual hull" 497. To reduce the point cloud density, a visual hull is generated (see for visual hall, Kutulakos, K. N., et al. cited above) of the reconstruction using depth maps rendered from the point cloud using traditional space carving, and providing the advantages mentioned above that compensate for the stereo techniques) [Lucas: col 20, line 34-43] ; (i.e. The expansion may be performed by analyzing each image in 2D, image by image, determining which points from the initial or latest point cloud have neighbor points that still need a depth estimate, and then analyzing those points. A bracket search is used again to assign a depth estimate near the pivot point to the neighbor point, and by one form, by determining the CENSUS score within the bracket. 
These neighbor points then become 3D particles that populate the latest expanded point cloud, and each such neighbor point then becomes its own LPV) [Lucas: col 8, line 48-58]), wherein each of the neighboring voxels that are identified include at least one of the points of the 3D point cloud ((i.e. identified using a nearest-neighbor lookup to first connect neighboring points that are within a specified distance) [Lucas: col 20, line 28-30]; (i.e. It also will be understood that the LPVs of the point cloud also are formed by the neighbor particles added during the expansion iterations when used) [Lucas: col 9, line 50-52]); and generate, for the 3D point cloud (i.e. The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel. By one example, the bit stream is merely a count of the number of neighbor pixels with an intensity less than the key pixel. Thus, the CENSUS is a characterization, descriptor, or representation of image data for comparison purposes between one image and another image, and the bit stream may be a string of 1s and 0s where 1s indicate a pixel intensity less than the key pixel. Therefore, each pixel on an image can have a CENSUS score that indicates the difference between such a CENSUS of a reference pixel on one image compared to a CENSUS of a corresponding current or candidate pixel on another image. The CENSUS score may be determined by hamming distance between the two bit strings of corresponding pixel location patches on two different images. See Hirschmuller, et al., "Evaluation of stereo matching costs on images with radiometric differences", IEEE Transactions on Pattern Analysis & Machine Intelligence, pp. 
1582-1599 (2008); and Zabih, R., et al., "Non-parametric local transforms for computing visual correspondence", European conference on computer vision, Springer, Berlin, Heidelberg, pp. 151-158, (1994)).) [Lucas: col 14, line 41-64], a plurality of initial patches ((i.e. the spatial support for the patch expands, and a distinct maximum CENSUS score is more likely to be observed at the lower resolutions, albeit possibly at the expense of less accuracy in depth) [Lucas: col 15, line 51-55]; (i.e. performing a rigorous yet computationally efficient initial segmentation process that provides good quality initial segmentation masks. Such initial segmentation combines the results of chroma-key segmentation, background subtraction, and neural network object detection. The combined result is refined by a boundary segmentation method such as active contours or graph cut algorithm) [Lucas: col 6, line 7-14]; (i.e. Stereo-matching confidence scores are then used to select the best depth estimate for the seed point being analyzed. By one form, and as mentioned above, this involves an initial selection by using the DAISY score, while the depth estimate is then refined using the CENSUS score.) [Lucas: col 8, line 15-20]; (i.e. To improve localization of landmarks, the highest score of a CENSUS metric using a 7x7 pixel patch with a bracketed line search is performed where the local CENSUS search bracket 1032 and 1106 are respectively shown on FIGS. 10 and 11. CENSUS is another stereo-matching descriptor and refers to a non-parametric intensity-based image data representation over a certain pixel area referred to as a CENSUS transform that summarizes local image structure by providing a bit string (or in other words, transforms image data into a representation). 
The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel) [Lucas: col 14, line 32-45]).
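For illustration only, the CENSUS transform and CENSUS score quoted from Lucas above (a bit string in which a 1 marks a neighbor pixel whose intensity is less than that of the pivot pixel, compared between images by Hamming distance) may be sketched as follows. This is a minimal sketch assuming a grayscale image stored as a list of rows; the function names and the `radius` parameter are hypothetical and are not taken from Lucas.

```python
def census_bits(image, row, col, radius=1):
    """Return the CENSUS bit string for the pivot pixel at (row, col).

    A bit is 1 when the neighbor's intensity is less than the pivot's.
    The whole neighborhood must lie inside the image.
    """
    height, width = len(image), len(image[0])
    if not (radius <= row < height - radius and radius <= col < width - radius):
        raise ValueError("neighborhood falls outside the image")
    pivot = image[row][col]
    bits = []
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            if dr == 0 and dc == 0:
                continue  # skip the pivot pixel itself
            bits.append(1 if image[row + dr][col + dc] < pivot else 0)
    return bits


def hamming_distance(bits_a, bits_b):
    """Count the positions at which two equal-length bit strings differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("bit strings must have equal length")
    return sum(a != b for a, b in zip(bits_a, bits_b))


if __name__ == "__main__":
    reference = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    candidate = [[9, 2, 3], [4, 5, 6], [7, 8, 1]]
    score = hamming_distance(census_bits(reference, 1, 1),
                             census_bits(candidate, 1, 1))
    print(score)
```

A smaller Hamming distance indicates that the two pixel neighborhoods have more similar local intensity structure.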

Regarding claim 3, Lucas meets the claim limitations as set forth in claim 2. Lucas further meets the claim limitations as follows.
The encoding device of Claim 2 ((i.e. encoder) [Lucas: col 24, line 51]; (i.e. image processing system) [Lucas: col 24, line 46]), wherein a quantity of the neighboring voxels that are proximate to one voxel of the multiple voxels is limited (i.e. neighboring points that are within a specified distance) [Lucas: col 20, line 29-30] based on at least one of: a predefined distance from the one voxel of the multiple voxels (i.e. These are identified using a nearest-neighbor lookup to first connect neighboring points that are within a specified distance. Small clusters are then removed based on the spatial extent and number of points in the connected component) [Lucas: col 20, line 29-31], and a predefined quantity of points inside the neighboring voxels (i.e. The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel. By one example, the bit stream is merely a count of the number of neighbor pixels with an intensity less than the key pixel) [Lucas: col 14, line 41-47].

Regarding claim 4, Lucas meets the claim limitations as set forth in claim 2. Lucas further meets the claim limitations as follows.
The encoding device of Claim 2 ((i.e. encoder) [Lucas: col 24, line 51]; (i.e. image processing system) [Lucas: col 24, line 46]), wherein: the refined patches are generated based on the plurality of initial patches ((i.e. Referring to FIGS. 1-2, stereo methods perform well on scenes with highly textured surfaces even though segmentation masks are not necessarily highly accurate. For instance, one stereo technique uses a patch to determine if points on one image match points on another image rather than the more conventional scanline matching. The patches are better for capturing similar pixel data on images of two different perspectives) [Lucas: col 5, line 25-32]; (i.e. The CENSUS score may be determined by hamming distance between the two bit strings of corresponding pixel location patches on two different images) [Lucas: col 14, line 55-58]); and to generate (i.e. generate) [Lucas: col 18, line 55] the plurality of initial patches (i.e. the pixel-limitations of the patches that fix the pivot points (the center of the pixel patches) on object locations) [Lucas: col 24, line 46], the processor is configured to (i.e. the image processing system 2900 may have one or more processors) [Lucas: col 24, line 46-47; Figs. 29-30]: identify the normal vector for each of the points of the 3D point cloud (i.e. a 3D object formed by combining multiple images from different perspectives is
represented as a point cloud comprised of an unstructured collection of points with associated colors and normal directions) [Lucas: col 5, line 25-32] that is perpendicular to an external surface of the 3D point cloud, and generate the initial patches by grouping each of the points of the 3D point cloud ((i.e. Process 400 may include "generate point cloud by combining the LPVs") [Lucas: col 18, line 55-56]; (i.e. constructing the point cloud by combining the local point volumes) [Lucas: col 30, line 17-18]; (i.e. a collection of points or pixel locations from a single image that are assigned depth values is referred to herein as a depth map of a single image, while a 3D object formed by combining multiple images from different perspectives is represented as a point cloud comprised of an unstructured collection of points with associated colors and normal directions.) [Lucas: col 5, line 18-24]; (i.e. rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 47-48]; (i.e. one stereo technique uses a patch to determine if points on one image match points on another image rather than the more conventional scanline matching. The patches are better for capturing similar pixel data on images of two different perspectives) [Lucas: col 5, line 28-32]) to one of the multiple projection planes ((i.e. A stereo matching technique is then applied to perform depth estimation for seeds in the initial point cloud. Rays are traced from the camera center of a first view or image, through a seed point on the first image. Ray positions are projected onto the view of another overlapping second camera, or second view or image, to determine a linear range or bracket of potential depth estimates along the ray and within the second view or image. By one form, the process is repeated for all pairs of cameras with overlapping fields of view, although other alternatives could be used) [Lucas: col 8, line 4-14]; (i.e. 
a set of points contained within the silhouette of the visual hull when reprojected into individual camera views) [Lucas: col 12, line 14-16]; (i.e. the LPV or sphere will have its 1 cm diameter cover that pixel length of 5, 10, 100, and so on. Spheres are the selected shape due to software rendering efficiency so that their projection onto any camera image is not less than a pixel in width and when rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 42-48]; (i.e. Stereo-matching confidence scores are then used to select the best depth estimate for the seed point being analyzed. By one form, and as mentioned above, this involves an initial selection by using the DAISY score, while the depth estimate is then refined using the CENSUS score.) [Lucas: col 8, line 15-20]; (i.e. To improve localization of landmarks, the highest score of a CENSUS metric using a 7x7 pixel patch with a bracketed line search is performed where the local CENSUS search bracket 1032 and 1106 are respectively shown on FIGS. 10 and 11. CENSUS is another stereo-matching descriptor and refers to a non-parametric intensity-based image data representation over a certain pixel area referred to as a CENSUS transform that summarizes local image structure by providing a bit string (or in other words, transforms image data into a representation). The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel) [Lucas: col 14, line 32-45]) that the normal vector is directed towards (i.e. a 3D object formed by combining multiple images from different perspectives is represented as a point cloud comprised of an unstructured collection of points with associated colors and normal directions) [Lucas: col 5, line 25-32].
Lucas and Huang do not explicitly disclose the following claim limitations (Emphasis added).
The encoding device of Claim 2, wherein: the refined patches are generated based on the plurality of initial patches; and to generate the plurality of initial patches, the processor is configured to: identify a normal vector for each of the points of the 3D point cloud that is perpendicular to an external surface of the 3D point cloud, and generate the initial patches by grouping each of the points of the 3D point cloud to one of the multiple projection planes that the normal vector is directed towards.  

identify a normal vector for each of the points of the 3D point cloud (i.e. The 2D patch image representations for a projection angle can be created by projecting only those points for which a projection angle has the closest normal. In other words, the 2D patch image representation is taken for the points that maximize the dot product of the point normal and the plane normal) [Boyce: col 30, line 37-42] that is perpendicular to an external surface of the 3D point cloud ((i.e. wherein encoding the surface normals data with the point cloud geometry data comprises encoding a vector perpendicular to each point on a surface of an object to provide directions associated with the object) [Boyce: col 40, line 5-8]; (i.e. At processing block 2280, the point cloud data objects are rendered using the surface normal data. Thus, the perpendicular vectors associated with object points are implemented to provide surface directions for each object during rendering. Accordingly, implementation of the surface normals provides enhanced reconstruction of the original geometry of objects in the point cloud data) [Boyce: col 37, line 9-15]), and generate the initial patches by grouping each of the points of the 3D point cloud to one of the multiple projection planes that the normal vector is directed towards (i.e. Texture and depth 2D image patch representations are formed at each projection angle. The 2D patch image representations for a projection angle can be created by projecting only those points for which a projection angle has the closest normal. In other words, the 2D patch image representation is taken for the points that maximize the dot product of the point normal and the plane normal. Texture patches from the separate projections are combined into a single texture image, which is referred to as the geometry image. Metadata to represent the patches and how they were packed into a frame are described in the occupancy map and auxiliary patch info. The occupancy map metadata includes an indication of which image sample positions are empty (e.g., do not contain corresponding point cloud information). The auxiliary patch info indicates the projection plane to which a patch belongs and can be used to determine a projection plane associated with a given sample position) [Boyce: col 30, line 36-52].
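For illustration only, the grouping described in the Boyce passage quoted above (selecting, for each point, the projection plane that maximizes the dot product of the point normal and the plane normal) may be sketched as follows. This is a minimal sketch under stated assumptions: the six axis-aligned plane normals, their ordering, and all function names are hypothetical and are not taken from Boyce or from the claims. For unit vectors the dot product equals the cosine of the angle between them, so the largest dot product corresponds to the smallest angle.

```python
import math

# Hypothetical unit normals for six projection planes that bound the object.
PLANE_NORMALS = [
    (1, 0, 0), (0, 1, 0), (0, 0, 1),
    (-1, 0, 0), (0, -1, 0), (0, 0, -1),
]


def normalize(vector):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in vector))
    if length == 0:
        raise ValueError("zero-length normal vector")
    return tuple(c / length for c in vector)


def sub_scores(point_normal):
    """Dot product of the unit point normal with each plane normal."""
    unit = normalize(point_normal)
    return [sum(a * b for a, b in zip(unit, plane)) for plane in PLANE_NORMALS]


def closest_plane(point_normal):
    """Index of the plane whose normal maximizes the dot product."""
    scores = sub_scores(point_normal)
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    for normal in [(0.2, 0.1, 0.9), (-3.0, 0.5, 0.0)]:
        print(normal, "->", closest_plane(normal))
```

Grouping points by the returned index yields one candidate set of points per projection plane, corresponding to the initial patches discussed above.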
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Lucas and Huang with Boyce to program the system to compute the normal vector which is perpendicular to the external surface of the 3D point cloud.
Therefore, the combination of Lucas and Huang with Boyce will enable the system to calculate a conventional normal vector as defined in the art.

Regarding claim 6, Lucas meets the claim limitations as set forth in claim 5. Lucas further meets the claim limitations as follows.
The encoding device of Claim 5 ((i.e. encoder) [Lucas: col 24, line 51]; (i.e. image processing system) [Lucas: col 24, line 46]), wherein the set of smoothing scores for the one voxel ((i.e. Process 400 may include "smooth point cloud with shrink wrapping" 498. Starting with the visual hull represented as a mesh, the method may shrink wrap (see Dale, A. M., "Cortical surface-based analysis: I. Segmentation and surface reconstruction", Neuroimage 9.2, pp. 179-194 (1999)) the point cloud by moving mesh vertexes closer to the original point cloud, subject to regularization so that the resultant point cloud is smooth. The topology of the mesh is discarded because triangle quality tends to be poor when vertices are spaced close together) [Lucas: col 20, line 53-62]; (i.e. the point is then refined using a non-parametric intensity-based confidence score, such as a CENSUS score) [Lucas: col 6, line 28-30]; (i.e. A bracket search is used again to assign a depth estimate near the pivot point to the neighbor point, and by one form, by determining the CENSUS score within the bracket. These neighbor points then become 3D particles that populate the latest expanded point cloud, and each such neighbor point then becomes its own LPV) [Lucas: col 8, line 52-58]; (i.e. the CENSUS pixel area that is being applied as explained below. This approach minimizes the amount of work that has to be done by artists to clean up resulting point clouds for film and/or video production which may be at the expense of more computation time. The method is biased towards erring on the side of false positives instead of false negatives by having such a robust seeding and expansion of points image by image such that corresponding points on different images could each have its own candidate point in the point cloud resulting in some redundancy. 
This is ultimately more efficient because it is easier for artists to manually remove extraneous points (which is a relatively easier 2D task) than to complete missing structures by sculpting (which is a relatively more difficult 3D task)) [Lucas: col 6, line 37-51]) indicates a number of points within a respective voxel (i.e. find potential 3D points using DAISY features and refined by CENSUS stereo matching) [Lucas: col 21, line 31-33] that are grouped to each of the projection planes ((i.e. Process 400 may include "generate point cloud by combining the LPVs") [Lucas: col 18, line 55-56]; (i.e. constructing the point cloud by combining the local point volumes) [Lucas: col 30, line 17-18]; (i.e. a collection of points or pixel locations from a single image that are assigned depth values is referred to herein as a depth map of a single image, while a 3D object formed by combining multiple images from different perspectives is represented as a point cloud comprised of an unstructured collection of points with associated colors and normal directions.) [Lucas: col 5, line 18-24]; (i.e. rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 47-48]; (i.e. one stereo technique uses a patch to determine if points on one image match points on another image rather than the more conventional scanline matching. The patches are better for capturing similar pixel data on images of two different perspectives) [Lucas: col 5, line 28-32]; (i.e. A stereo matching technique is then applied to perform depth estimation for seeds in the initial point cloud. Rays are traced from the camera center of a first view or image, through a seed point on the first image. Ray positions are projected onto the view of another overlapping second camera, or second view or image, to determine a linear range or bracket of potential depth estimates along the ray and within the second view or image. 
By one form, the process is repeated for all pairs of cameras with overlapping fields of view, although other alternatives could be used) [Lucas: col 8, line 4-14]; (i.e. a set of points contained within the silhouette of the visual hull when reprojected into individual camera views) [Lucas: col 12, line 14-16]; (i.e. the LPV or sphere will have its 1 cm diameter cover that pixel length of 5, 10, 100, and so on. Spheres are the selected shape due to software rendering efficiency so that their projection onto any camera image is not less than a pixel in width and when rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 42-48]) as indicated by the initial patches ((i.e. one stereo technique uses a patch to determine if points on one image match points on another image rather than the more conventional scanline matching. The patches are better for capturing similar pixel data on images of two different perspectives) [Lucas: col 5, line 28-32]; (i.e. To improve localization of landmarks, the highest score of a CENSUS metric using a 7x7 pixel patch with a bracketed line search is performed where the local CENSUS search bracket 1032 and 1106 are respectively shown on FIGS. 10 and 11. CENSUS is another stereo-matching descriptor and refers to a non-parametric intensity-based image data representation over a certain pixel area referred to as a CENSUS transform that summarizes local image structure by providing a bit string (or in other words, transforms image data into a representation). The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel) [Lucas: col 14, line 32-45]).

Regarding claim 7, Lucas meets the claim limitations as set forth in claim 1. Lucas further meets the claim limitations as follows.
The encoding device of Claim 1 ((i.e. encoder) [Lucas: col 24, line 51]; (i.e. image processing system) [Lucas: col 24, line 46]), wherein to identify the normal score ((i.e. feature identifying algorithm that generates scores, such with a Shi-Tomasi Eigenvalue-based "corner" score. The process retains the points with such a score that is above a threshold and are maximal among samples within each grid cell. See for example, Shi, J., et al., "Good features to track", Cornell University (1993)) [Lucas: col 12, line 6-11]; (i.e. the match may be performed by determining when a normalized color difference (such as a sum of absolute difference (SAD) score) with respect to the original images) [Lucas: col 9, line 65-68]; (i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54]), the processor (i.e. Graphics subsystem 3015 may be integrated into processor 3010 or chipset 3005) [Lucas: col 26, line 11-12; Figs. 29-30] is configured to: identify the normal vector of the first point of the 3D point cloud ((i.e. feature identifying algorithm that generates scores, such with a Shi-Tomasi Eigenvalue-based "corner" score. The process retains the points with such a score that is above a threshold and are maximal among samples within each grid cell. See for example, Shi, J., et al., "Good features to track", Cornell University (1993)) [Lucas: col 12, line 6-11]; (i.e. the match may be performed by determining when a normalized color difference (such as a sum of absolute difference (SAD) score) with respect to the original images) [Lucas: col 9, line 65-68]; (i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. 
Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54]), the normal vector is perpendicular to an external surface of the 3D point cloud; and compare (i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54] the normal vector of the first point to the unit vector of each of the multiple projection planes to identify the normal score of the first point ((i.e. comprehensive quality analysis to the existing point cloud, planning of the scanning path and automatic scanning and stitching to sections with low confidence scores) [Lucas: col 3, line 32-35]; (i.e. feature identifying algorithm that generates scores, such with a Shi-Tomasi Eigenvalue-based "corner" score. The process retains the points with such a score that is above a threshold and are maximal among samples within each grid cell. See for example, Shi, J., et al., "Good features to track", Cornell University (1993)) [Lucas: col 12, line 6-11]; (i.e. the match may be performed by determining when a normalized color difference (such as a sum of absolute difference (SAD) score) with respect to the original images) [Lucas: col 9, line 65-68]; (i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54]).
Lucas and Huang do not explicitly disclose the following claim limitations (Emphasis added).
The encoding device of Claim 1, wherein to identify the normal score, the processor is configured to: identify the normal vector of the first point of the 3D point cloud, the normal vector is perpendicular to an external surface of the 3D point cloud; and compare the normal vector of the first point to the unit vector of each of the multiple projection planes to identify the normal score of the first point.  
However, in the same field of endeavor Boyce further discloses the claim limitations and the deficient claim limitations, as follows:
the normal vector is perpendicular to an external surface of the 3D point cloud ((i.e. wherein encoding the surface normals data with the point cloud geometry data comprises encoding a vector perpendicular to each point on a surface of an object to provide directions associated with the object) [Boyce: col 40, line 5-8]; (i.e. At processing block 2280, the point cloud data objects are rendered using the surface normal data. Thus, the perpendicular vectors associated with object points are implemented to provide surface directions for each object during rendering. Accordingly, implementation of the surface normals provides enhanced reconstruction of the original geometry of objects in the point cloud data) [Boyce: col 37, line 9-15]);

Therefore, the combination of Lucas and Huang with Boyce will enable the system to calculate a conventional normal vector as defined in the art.

Regarding claim 9, Lucas meets the claim limitations as set forth in claim 8. Lucas further meets the claim limitations as follows.
The encoding device of Claim 8 ((i.e. encoder) [Lucas: col 24, line 51]; (i.e. image processing system) [Lucas: col 24, line 46]), wherein identifying the final score ((i.e. the final optima maximum CENSUS score point) [Lucas: col 15, line 8]; (i.e. Thereafter, process 300 may include "provide an expanded and filtered point cloud to be used to generate images" 316. The final point cloud then may be provided first for post-processing to refine the points, which may include traditional space carving, as described below, and then for modeling, display, or analysis as needed depending on the application and as described below as well) [Lucas: col 10, line 2-9]), the processor is further configured to (i.e. the image processing system 2900 may have one or more processors) [Lucas: col 24, line 46-47; Figs. 29-30]: compare a current iteration (i.e. This may include "compare LPV to corresponding points in 2D images" 482, which in tum, involves "render a visible LPV into a rendered 2D image including any other particles within the volume of the LPV" 482-1. Thus, a current LPV on the latest expanded point cloud and the particles of other LPVs within the volume of the current LPV are projected to rendered a 2D image, one for each camera ( or perspective or different view) of the multiple cameras.) [Lucas: col 18, line 67 – col. 19, line 7] for generating the refined patches to a predefined number of iterations (i.e. it encompasses a larger size in close-up images that are efficient for filtering using stereo-matching comparisons when colors and/or intensities at a single pixel patch on such close-ups) [Lucas: col 18, line 46-49]; and generate the 2D frames (i.e. provide an expanded and filtered point cloud to be used to generate images) [Lucas: col 10, line 3-5] that include the pixels ((i.e. provide an expanded and filtered point cloud to be used to generate images" 316. 
The final point cloud then may be provided first for post-processing to refine the points, which may include traditional space carving, as described below, and then for modeling, display, or analysis as needed depending on the application and as described below as well) [Lucas: col 10, line 3-9]; (i.e. To improve localization of landmarks, the highest score of a CENSUS metric using a 7x7 pixel patch with a bracketed line search is performed where the local CENSUS search bracket 1032 and 1106 are respectively shown on FIGS. 10 and 11. CENSUS is another stereo-matching descriptor and refers to a non-parametric intensity-based image data representation over a certain pixel area referred to as a CENSUS transform that summarizes local image structure by providing a bit string (or in other words, transforms image data into a representation). The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel) [Lucas: col 14, line 32-45]) when the current iteration matches the predefined number of iterations (i.e. Referring to FIGS. 15-19, example images are provided and formed by the present iterative expansion-filtering method, and the images show the clear increase in accuracy with the expansion-filtering iterations. An image 1500 shows a scene generated by using initial reconstruction seeds of an initial point cloud. An image 1600 is generated by using an expanded point cloud after a first expansion iteration, while an image 1700 is generated by using a filtered point cloud after a set of first filter iterations for the first expansion iteration. 
An image 1800 is generated by using an expanded point cloud after a second expansion iteration, while an image 1900, the best quality image so far, is generated by using a filtered point cloud after a set of second filter iterations for the second expansion iteration) [Lucas: col 17, line 10-23].  

Regarding claim 11, Lucas meets the claim limitations, as follows:
A method (i.e. a method) [Lucas: col 1, line 54] for point cloud encoding (i.e. forming a point cloud using the image data) [Lucas: col 30, line 2], the method (i.e. a method) [Lucas: col 1, line 54] comprising:segmenting an area including points (i.e. performing a rigorous yet computationally efficient initial segmentation process that provides good quality initial segmentation masks. Such initial segmentation combines the results of chroma-key segmentation, background subtraction, and neural network object detection. The combined result is refined by a boundary segmentation method such as active contours or graph cut algorithm) [Lucas: col 6, line 7-14] representing a three-dimensional (3D) point cloud into multiple voxels ((i.e. The method may comprise rendering of an individual local point volume at the point cloud and including 3D particles positioned within the individual local point volume and defining other local point volumes) [Lucas: col 30, line 20-24 – Note: point volumes are also known as voxels]; (i.e. Candidate pixel locations indicating landmarks from the segmentation are then tested to form an initial point cloud) [Lucas: col 6, line 14-16]; (i.e. wherein individual local point volumes are formed of at least one particle on the point cloud defining a volume having fixed real world dimensions relative to at least one object in the multiple images and that remains fixed from image to image of different perspectives; and providing an expanded and filtered point cloud to be used to generate images) [Lucas: col 30, line 7-13]),
identifying a normal score ((i.e. feature identifying algorithm that generates scores, such with a Shi-Tomasi Eigenvalue-based "corner" score. The process retains the points with such a score that is above a threshold and are maximal among samples within each grid cell. See for example, Shi, J., et al., "Good features to track", Cornell University (1993)) [Lucas: col 12, line 6-11]; (i.e. the match may be performed by determining when a normalized color difference (such as a sum of absolute difference (SAD) score) with respect to the original images) [Lucas: col 9, line 65-68]; (i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54]) for each of the points of the 3D point cloud ((i.e. feature identifying algorithm that generates scores, such with a Shi-Tomasi Eigenvalue-based "corner" score. The process retains the points with such a score that is above a threshold and are maximal among samples within each grid cell. See for example, Shi, J., et al., "Good features to track", Cornell University (1993)) [Lucas: col 12, line 6-11]; (i.e. A bracket search is used again to assign a depth estimate near the pivot point to the neighbor point, and by one form, by determining the CENSUS score within the bracket. These neighbor points then become 3D particles that populate the latest expanded point cloud, and each such neighbor point then becomes its own LPV) [Lucas: col 8, line 52-58]; (i.e. Stereo-matching confidence scores based on local region descriptors for image data (also referred to herein as representations) are then used to select the best depth estimate for the point being analyzed. 
By one form, this involves an initial selection by using a gradient histogram-type of local region descriptor such as a DAISY score) [Lucas: col 6, line 21-28]) based on a comparison of the sub-scores associated with each respective point of the 3D point cloud ((i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54]; (i.e. The 2D patch image representations for a projection angle can be created by projecting only those points for which a projection angle has the closest normal. In other words, the 2D patch image representation is taken for the points that maximize the dot product of the point normal and the plane normal) [Boyce: col 30, line 37-42]; (i.e. According to one embodiment, normals processing mechanism 2010 explicitly codes surface normals with point cloud geometry to provide for accurate geometric reconstructions of objects during rendering. In such an embodiment, normals processing mechanism 2010 may implement the surface normals to perform subjective artifact reduction while rendering the objects) [Boyce: col 36, line 27-33]), 
identifying a smoothing score for each of the multiple voxels ((i.e. Process 400 may include "smooth point cloud with shrink wrapping" 498. Starting with the visual hull represented as a mesh, the method may shrink wrap (see Dale, A. M., "Cortical surface-based analysis: I. Segmentation and surface reconstruction", Neuroimage 9.2, pp. 179-194 (1999)) the point cloud by moving mesh vertexes closer to the original point cloud, subject to regularization so that the resultant point cloud is smooth. The topology of the mesh is discarded because triangle quality tends to be poor when vertices are spaced close together) [Lucas: col 20, line 53-62]; (i.e. the point is then refined using a non-parametric intensity-based confidence score, such as a CENSUS score) [Lucas: col 6, line 28-30]; (i.e. A bracket search is used again to assign a depth estimate near the pivot point to the neighbor point, and by one form, by determining the CENSUS score within the bracket. These neighbor points then become 3D particles that populate the latest expanded point cloud, and each such neighbor point then becomes its own LPV) [Lucas: col 8, line 52-58]; (i.e. the CENSUS pixel area that is being applied as explained below. This approach minimizes the amount of work that has to be done by artists to clean up resulting point clouds for film and/or video production which may be at the expense of more computation time. The method is biased towards erring on the side of false positives instead of false negatives by having such a robust seeding and expansion of points image by image such that corresponding points on different images could each have its own candidate point in the point cloud resulting in some redundancy. 
This is ultimately more efficient because it is easier for artists to manually remove extraneous points (which is a relatively easier 2D task) than to complete missing structures by sculpting (which is a relatively more difficult 3D task)) [Lucas: col 6, line 37-51]) that include at least one of the points of the 3D point cloud ((i.e. wherein individual local point volumes are formed of at least one particle on the point cloud defining a volume having fixed real world dimensions relative to at least one object in the multiple images and that remains fixed from image to image of different perspectives) [Lucas: col 9, line 24-28]; (i.e. filtering iterations are performed for each or individual expansion iterations. The filtering is performed by setting a local point volume (LPV) (i.e. 3D point) at each 2D sample location with an estimated depth determined by ray-casting, mentioned previously, that was the center pixel of a patch window used for computing a CENSUS score) [Lucas: col 6, line 52-57]), wherein the smoothing score is based on a number of points in a voxel that correspond to different projection planes of the multiple projection planes (i.e. A stereo matching technique is then applied to perform depth estimation for seeds in the initial point cloud. Rays are traced from the camera center of a first view or image, through a seed point on the first image. Ray positions are projected onto the view of another overlapping second camera, or second view or image, to determine a linear range or bracket of potential depth estimates along the ray and within the second view or image) [Lucas: col 8, line 4-11], grouping each point of the 3D point cloud ((i.e. Process 400 may include "generate point cloud by combining the LPVs") [Lucas: col 18, line 55-56]; (i.e. constructing the point cloud by combining the local point volumes) [Lucas: col 30, line 17-18]; (i.e. 
a collection of points or pixel locations from a single image that are assigned depth values is referred to herein as a depth map of a single image, while a 3D object formed by combining multiple images from different perspectives is represented as a point cloud comprised of an unstructured collection of points with associated colors and normal directions.) [Lucas: col 5, line 18-24]; (i.e. rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 47-48]; (i.e. one stereo technique uses a patch to determine if points on one image match points on another image rather than the more conventional scanline matching. The patches are better for capturing similar pixel data on images of two different perspectives) [Lucas: col 5, line 28-32]) to one of multiple projection planes ((i.e. A stereo matching technique is then applied to perform depth estimation for seeds in the initial point cloud. Rays are traced from the camera center of a first view or image, through a seed point on the first image. Ray positions are projected onto the view of another overlapping second camera, or second view or image, to determine a linear range or bracket of potential depth estimates along the ray and within the second view or image. By one form, the process is repeated for all pairs of cameras with overlapping fields of view, although other alternatives could be used) [Lucas: col 8, line 4-14]; (i.e. a set of points contained within the silhouette of the visual hull when reprojected into individual camera views) [Lucas: col 12, line 14-16]; (i.e. the LPV or sphere will have its 1 cm diameter cover that pixel length of 5, 10, 100, and so on. Spheres are the selected shape due to software rendering efficiency so that their projection onto any camera image is not less than a pixel in width and when rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 42-48]) based on the normal score (i.e. 
determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54] and the smoothing score (i.e. Stereo-matching confidence scores are then used to select the best depth estimate for the seed point being analyzed. By one form, and as mentioned above, this involves an initial selection by using the DAISY score, while the depth estimate is then refined using the CENSUS score.) [Lucas: col 8, line 15-20] to generate refined patches that represent the 3D point cloud ((i.e. Stereo-matching confidence scores are then used to select the best depth estimate for the seed point being analyzed. By one form, and as mentioned above, this involves an initial selection by using the DAISY score, while the depth estimate is then refined using the CENSUS score.) [Lucas: col 8, line 15-20]; (i.e. To improve localization of landmarks, the highest score of a CENSUS metric using a 7x7 pixel patch with a bracketed line search is performed where the local CENSUS search bracket 1032 and 1106 are respectively shown on FIGS. 10 and 11. CENSUS is another stereo-matching descriptor and refers to a non-parametric intensity-based image data representation over a certain pixel area referred to as a CENSUS transform that summarizes local image structure by providing a bit string (or in other words, transforms image data into a representation). The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel) [Lucas: col 14, line 32-45]), generating two-dimensional (2D) frames (i.e. 
provide an expanded and filtered point cloud to be used to generate images) [Lucas: col 10, line 3-5] that include pixels that represent the refined patches ((i.e. provide an expanded and filtered point cloud to be used to generate images" 316. The final point cloud then may be provided first for post-processing to refine the points, which may include traditional space carving, as described below, and then for modeling, display, or analysis as needed depending on the application and as described below as well) [Lucas: col 10, line 3-9]; (i.e. To improve localization of landmarks, the highest score of a CENSUS metric using a 7x7 pixel patch with a bracketed line search is performed where the local CENSUS search bracket 1032 and 1106 are respectively shown on FIGS. 10 and 11. CENSUS is another stereo-matching descriptor and refers to a non-parametric intensity-based image data representation over a certain pixel area referred to as a CENSUS transform that summarizes local image structure by providing a bit string (or in other words, transforms image data into a representation). The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel) [Lucas: col 14, line 32-45]), and encoding the  2D frames to generate a bitstream; and transmitting the bitstream ((i.e. logic modules 2904 may communicate remotely with, or otherwise may be communicatively coupled to, the imaging device 2902 for further processing of the image data) [Lucas: col 23, line 15-18]; (i.e. Program logic may allow platform 3002 to stream content to media adaptors or other content services device(s) 3030 or content delivery device(s)) [Lucas: col 27, line 40-43]).  
Lucas does not explicitly disclose the following claim limitations (Emphasis added).
A method for point cloud encoding, the method comprising: segmenting an area including points representing a three-dimensional (3D) point cloud into multiple voxels; identify a number of sub-scores for each of the points of the 3D point cloud including a first point, the sub-scores for the first point represent angles between (i) a normal vector, which is perpendicular to a surface of the 3D point cloud, of the first point, and (ii) a unit vector of each projection plane of multiple projection planes that surround the point cloud; identifying a normal score for each of the points of the 3D point cloud, based on a comparison of the sub-scores associated with each respective point of the 3D point cloud; identifying a smoothing score for each of the multiple voxels that include at least one of the points of the 3D point cloud, wherein the smoothing score is based on a number of points in a voxel that correspond to different projection planes of the multiple projection planes; grouping each point of the 3D point cloud to one of multiple projection planes based on the normal score and the smoothing score to generate refined patches that represent the 3D point cloud; generating two-dimensional (2D) frames that include pixels that represent the refined patches; encoding the 2D frames to generate a bitstream; and transmitting the bitstream.
However, in the same field of endeavor Huang further discloses the claim limitations and the deficient claim limitations, as follows:
(i.e. a confidence score fs based on smoothness analysis) [Huang: col 8, line 31] is based on a number of points in a voxel that correspond to different projection planes of the multiple projection planes ((i.e. Third, each of the iso-points is performed with a confidence score calculation. Specifically, different methods firstly are used to calculate a missing rate of point cloud data at the point (i.e., a confidence score fg based on distance field gradient analysis) and a possibility of the point belonging to a detail part (i.e., a confidence score fs based on smoothness analysis). After that, the two confidence scores are multiplied to be mixed as a final confidence score fk = fg*fs) [Huang: col 8, line 26-34]; (i.e. For the confidence score fs based on smooth analysis: the confidence score of smooth analysis for each sk is calculated by the following method. The higher the confidence score of the smooth analysis, the more irregular the local point cloud distribution is, which is more possible to be a detail area that needs more scanning. nk is a normal vector of sk, Qk represents K numbers of adjacent points of sk (i.e., the nearest K numbers of already-scanned points q adjacent thereto, and the number of K in the illustrated embodiment generally is 100), hk indicates a distance from the farthest kth point to sk.) [Huang: col 4, line 57-67]),

Therefore, the combination of Lucas with Huang will enable the system to calculate a missing rate of point cloud data at a certain point [Huang: col 4, line 37-41]. 
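For illustration only, the claim language "a number of points in a voxel that correspond to different projection planes" can be read as a per-voxel tally of how many points are currently assigned to each projection plane. The following is a minimal sketch of that reading, not a computation taken from Lucas, Huang, or the application. The cubic voxel grid, the default voxel size, and the names voxel_key and plane_counts_per_voxel are assumptions introduced here.

```python
import math
from collections import Counter, defaultdict


def voxel_key(point, voxel_size):
    """Map a 3D point to the integer index of the cubic voxel containing it (assumed grid)."""
    return tuple(math.floor(c / voxel_size) for c in point)


def plane_counts_per_voxel(points, plane_indices, voxel_size=1.0):
    """For each occupied voxel, count how many of its points are assigned to each plane.

    points        -- iterable of (x, y, z) tuples
    plane_indices -- iterable of plane indices, one per point (hypothetical assignment)
    """
    counts = defaultdict(Counter)
    for point, plane in zip(points, plane_indices):
        counts[voxel_key(point, voxel_size)][plane] += 1
    return dict(counts)
```

Under this reading, a voxel whose points are spread over several planes yields a Counter with several keys, while a voxel whose points all map to one plane yields a single key.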
Lucas and Huang do not explicitly disclose the following claim limitations (Emphasis added).
identifying normal scores for each of the points of the 3D point cloud, the normal scores indicating a proximity between a normal vector of a point perpendicular to a surface of the 3D point cloud to a normal vector of each projection plane of multiple projection planes,
….
encoding the 2D frames to generate a bitstream;
However, in the same field of endeavor Boyce further discloses the claim limitations and the deficient claim limitations, as follows:
the normal scores indicating a proximity between a normal vector of a point perpendicular to a surface of the 3D point cloud ((i.e. The 2D patch image representations for a projection angle can be created by projecting only those points for which a projection angle has the closest normal. In other words, the 2D patch image representation is taken for the points that maximize the dot product of the point normal and the plane normal) [Boyce: col 30, line 37-42]; (i.e. wherein encoding the surface normals data with the point cloud geometry data comprises encoding a vector perpendicular to each point on a surface of an object to provide directions associated with the object) [Boyce: col 40, line 5-8]; (i.e. At processing block 2280, the point cloud data objects are rendered using the surface normal data. Thus, the perpendicular vectors associated with object points are implemented to provide surface directions for each object during rendering. Accordingly, implementation of the surface normals provides enhanced reconstruction of the original geometry of objects in the point cloud data) [Boyce: col 37, line 9-15]) to a normal vector of each projection plane of multiple projection planes (i.e. In the encoding system 1900 of FIG. 19A, a point cloud frame is projected onto several two-dimensional (2D) planes, each 2D plane corresponding to a projection angle. The projection planes can be similar to the projection planes 1512 of FIG. 15B. In some implementations, six projection angles are used in the PCC standard test model, with the projection angles corresponding to angles pointing to the centers of six faces of a rectangular solid that bound the object represented by the point cloud data. While six projection angles are described, other number of angles could possibly be used in different implementations) [Boyce: col 30, line 24-35],
((i.e. The multiple projection planes 1512 can be used to generate multiple two-dimensional (2D) projections, each projection associated with a projection plane) [Boyce: col 27, line 59-62]; (i.e. In the encoding system 1900 of FIG. 19A, a point cloud frame is projected onto several two-dimensional (2D) planes, each 2D plane corresponding to a projection angle. The projection planes can be similar to the projection planes 1512 of FIG. 15B. In some implementations, six projection angles are used in the PCC standard test model, with the projection angles corresponding to angles pointing to the centers of six faces of a rectangular solid that bound the object represented by the point cloud data. While six projection angles are described, other number of angles could possibly be used in different implementations) [Boyce: col 30, line 24-35]),
….
encoding the 2D frames to generate a bitstream (i.e. The apparatus includes one or more processors to encode surface normals data with point cloud geometry data included in the video bit stream data for reconstruction of objects within the video bit stream data based on the surface normals data) [Boyce: Abstract]; (i.e. one embodiment, encoder 2400 performs HEVC coding on the video bit stream data. In such an embodiment, normals encoder 2406 uses the high level syntax to indicate that the video bit stream includes normal data. For instance, normals encoder 2106 may encode the 3-dimensional (X, Y, Z) normals data within 3 color components ( e.g., Red, Green, Blue (RGB), or YUV) having a fixed bit depth ( e.g., 8, 10, 12, 16).) [Boyce: col 37, line 36-44; Figs. 24A-B];
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Lucas and Huang with Boyce to encode the image data into the bitstream.  
Therefore, the combination of Lucas and Huang with Boyce will enable the system to efficiently transmit compressed image data to display and/or storage devices [Boyce: col 28, line 41-49; col. 30, line 53-55].
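For illustration only, the plane selection described in the cited Boyce passage (projecting only those points that maximize the dot product of the point normal and the plane normal) can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Boyce or of the claimed invention. The six axis-aligned unit normals and the names PLANE_NORMALS, normalize, sub_scores, and best_plane are assumptions introduced here. For unit vectors the dot product equals the cosine of the angle between them, so a larger value corresponds to a smaller angle.

```python
import math

# Assumed: six axis-aligned unit normals, one per face of a bounding box.
PLANE_NORMALS = [
    (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0), (0.0, -1.0, 0.0),
    (0.0, 0.0, 1.0), (0.0, 0.0, -1.0),
]


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("zero-length normal")
    return tuple(c / length for c in v)


def sub_scores(point_normal):
    """Dot product of the unit point normal with each plane's unit normal."""
    n = normalize(point_normal)
    return [sum(a * b for a, b in zip(n, p)) for p in PLANE_NORMALS]


def best_plane(point_normal):
    """Return (plane index, score) for the plane whose normal is closest to the point normal."""
    scores = sub_scores(point_normal)
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]
```

In this sketch, a point whose normal is (0, 0, 2) is assigned to the plane with normal (0, 0, 1), since that plane gives the largest dot product after normalization.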

Regarding claim 12, Lucas meets the claim limitations as set forth in claim 11. Lucas further meets the claim limitations as follows.
The method of Claim 11 (i.e. a method) [Lucas: col 1, line 54], further comprising: identifying neighboring voxels that are proximate to each of the multiple voxels ((i.e. identified using a nearest-neighbor lookup to first connect neighboring points that are within a specified distance) [Lucas: col 20, line 28-30]; (i.e. this operation may
include first expanding or growing an initial point cloud (or at least the points that could be used to form an initial point cloud when the initial point cloud is not actually generated) by providing depth estimates to points neighboring a pivot point that already has a depth estimate determined by the ray-casting process. By one form, neighboring points are the directly adjacent pixel locations to a current, center, or other key pixel (or pixel location), and by one example, is the adjacent upper, lower, left and right pixels relative to the pivot pixel. Many other variations are contemplated such as including the diagonal pixel locations and/or any other pattern that includes pixel locations within a certain range or distance from the pivot pixel.) [Lucas: col 8, line 35-48]) that include at least one of the points of the 3D point cloud ((i.e. The point cloud may be severely oversampled in some regions because the expansion phase will add points in overlapping regions. Thus, process 400 may include "generate a visual hull" 497. To reduce the point cloud density, a visual hull is generated (see for visual hall, Kutulakos, K. N., et al. cited above) of the reconstruction using depth maps rendered from the point cloud using traditional space carving, and providing the advantages mentioned above that compensate for the stereo techniques) [Lucas: col 20, line 34-43] ; (i.e. The expansion may be performed by analyzing each image in 2D, image by image, determining which points from the initial or latest point cloud have neighbor points that still need a depth estimate, and then analyzing those points. A bracket search is used again to assign a depth estimate near the pivot point to the neighbor point, and by one form, by determining the CENSUS score within the bracket. 
These neighbor points then become 3D particles that populate the latest expanded point cloud, and each such neighbor point then becomes its own LPV) [Lucas: col 8, line 48-58]), wherein each of the neighboring voxels that are identified include at least one of the points of the 3D point cloud ((i.e. identified using a nearest-neighbor lookup to first connect neighboring points that are within a specified distance) [Lucas: col 20, line 28-30]; (i.e. It also will be understood that the LPVs of the point cloud also are formed by the neighbor particles added during the expansion iterations when used) [Lucas: col 9, line 50-52]); and generating, for the 3D point cloud (i.e. The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel. By one example, the bit stream is merely a count of the number of neighbor pixels with an intensity less than the key pixel. Thus, the CENSUS is a characterization, descriptor, or representation of image data for comparison purposes between one image and another image, and the bit stream may be a string of 1s and 0s where 1s indicate a pixel intensity less than the key pixel. Therefore, each pixel on an image can have a CENSUS score that indicates the difference between such a CENSUS of a reference pixel on one image compared to a CENSUS of a corresponding current or candidate pixel on another image. The CENSUS score may be determined by hamming distance between the two bit strings of corresponding pixel location patches on two different images. See Hirschmuller, et al., "Evaluation of stereo matching costs on images with radiometric differences", IEEE Transactions on Pattern Analysis & Machine Intelligence, pp. 
1582-1599 (2008); and Zabih, R., et al., "Non-parametric local transforms for computing visual correspondence", European conference on computer vision, Springer, Berlin, Heidelberg, pp. 151-158, (1994)).) [Lucas: col 14, line 41-64], a plurality of initial patches ((i.e. the spatial support for the patch expands, and a distinct maximum CENSUS score is more likely to be observed at the lower resolutions, albeit possibly at the expense of less accuracy in depth) [Lucas: col 15, line 51-55]; (i.e. performing a rigorous yet computationally efficient initial segmentation process that provides good quality initial segmentation masks. Such initial segmentation combines the results of chroma-key segmentation, background subtraction, and neural network object detection. The combined result is refined by a boundary segmentation method such as active contours or graph cut algorithm) [Lucas: col 6, line 7-14]; (i.e. Stereo-matching confidence scores are then used to select the best depth estimate for the seed point being analyzed. By one form, and as mentioned above, this involves an initial selection by using the DAISY score, while the depth estimate is then refined using the CENSUS score.) [Lucas: col 8, line 15-20]; (i.e. To improve localization of landmarks, the highest score of a CENSUS metric using a 7x7 pixel patch with a bracketed line search is performed where the local CENSUS search bracket 1032 and 1106 are respectively shown on FIGS. 10 and 11. CENSUS is another stereo-matching descriptor and refers to a non-parametric intensity-based image data representation over a certain pixel area referred to as a CENSUS transform that summarizes local image structure by providing a bit string (or in other words, transforms image data into a representation). 
The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel) [Lucas: col 14, line 32-45]).
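For illustration only, the CENSUS transform and Hamming-distance comparison described in the cited Lucas passages (a bit string in which 1s mark neighboring pixels whose intensity is less than that of the pivot pixel, compared between two images by Hamming distance) can be sketched as follows. This is a minimal sketch, not the implementation of Lucas. The list-of-lists image layout, the square window given by radius, and the names census_bits and hamming_distance are assumptions introduced here. A radius of 3 would correspond to the 7x7 pixel patch mentioned in the quotation.

```python
def census_bits(image, row, col, radius=1):
    """Bit string for the window centered on (row, col).

    A bit is 1 where the neighbor's intensity is less than the pivot pixel's.
    image is assumed to be a list of equal-length rows of intensities.
    """
    if (row - radius < 0 or col - radius < 0
            or row + radius >= len(image) or col + radius >= len(image[0])):
        raise IndexError("window extends outside the image")
    center = image[row][col]
    bits = []
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            if r == row and c == col:
                continue  # skip the pivot pixel itself
            bits.append(1 if image[r][c] < center else 0)
    return bits


def hamming_distance(bits_a, bits_b):
    """Number of positions at which two equal-length bit strings differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("bit strings must have the same length")
    return sum(a != b for a, b in zip(bits_a, bits_b))
```

A Hamming distance of zero means the two windows have identical CENSUS bit strings, and larger distances mean the local intensity orderings differ at more positions.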

Regarding claim 13, Lucas meets the claim limitations as set forth in claim 12. Lucas further meets the claim limitations as follows.
The method of Claim 11 (i.e. a method) [Lucas: col 1, line 54], wherein a quantity of the neighboring voxels that are proximate to one voxel of the multiple voxels is limited (i.e. neighboring points that are within a specified distance) [Lucas: col 20, line 29-30] based on at least one of: a predefined distance from the one voxel of the multiple voxels (i.e. These are identified using a nearest-neighbor lookup to first connect neighboring points that are within a specified distance. Small clusters are then removed based on the spatial extent and number of points in the connected component) [Lucas: col 20, line 29-31], and a predefined quantity of points inside the neighboring voxels (i.e. The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel. By one example, the bit stream is merely a count of the number of neighbor pixels with an intensity less than the key pixel) [Lucas: col 14, line 41-47].
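For illustration only, the cited passage of Lucas (connecting neighboring points that are within a specified distance and then removing small clusters) may be sketched as follows. This is a minimal sketch under stated assumptions, not the method of Lucas: the function names, the brute-force neighbor search, and the sample coordinates are hypothetical, and only the number of points in a cluster is tested here, whereas Lucas also refers to the spatial extent.

```python
import math
from collections import deque


def connected_clusters(points, max_distance):
    """Group point indices into clusters linked by hops of at most max_distance."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = min(unvisited)  # deterministic choice of the next seed point
        unvisited.remove(seed)
        queue = deque([seed])
        cluster = [seed]
        while queue:
            current = queue.popleft()
            # Brute-force lookup of unvisited points within the specified distance.
            near = [j for j in unvisited
                    if math.dist(points[current], points[j]) <= max_distance]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                cluster.append(j)
        clusters.append(sorted(cluster))
    return clusters


def drop_small_clusters(points, max_distance, min_points):
    """Keep only the points that belong to clusters with at least min_points members."""
    kept = []
    for cluster in connected_clusters(points, max_distance):
        if len(cluster) >= min_points:
            kept.extend(cluster)
    return [points[i] for i in sorted(kept)]


# Hypothetical 3D points: three close together and one isolated outlier.
sample_points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 10.0, 10.0)]
clusters = connected_clusters(sample_points, 0.6)
filtered = drop_small_clusters(sample_points, 0.6, 2)
```

The brute-force search is quadratic in the number of points; a spatial index would normally replace it for large point clouds.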

Regarding claim 14, Lucas meets the claim limitations as set forth in claim 12. Lucas further meets the claim limitations as follows.
The method of Claim 12 (i.e. a method) [Lucas: col 1, line 54], wherein: the refined patches are generated based on the plurality of initial patches ((i.e. Referring to FIGS. 1-2, stereo methods perform well on scenes with highly textured surfaces even though segmentation masks are not necessarily highly accurate. For instance, one stereo technique uses a patch to determine if points on one image match points on another image rather than the more conventional scanline matching. The patches are better for capturing similar pixel data on images of two different perspectives) [Lucas: col 5, line 25-32]; (i.e. The CENSUS score may be determined by hamming distance between the two bit strings of corresponding pixel location patches on two different images) [Lucas: col 14, line 55-58]); and generating (i.e. generate) [Lucas: col 18, line 55] the plurality of initial patches (i.e. the pixel-limitations of the patches that fix the pivot points (the center of the pixel patches) on object locations) [Lucas: col 24, line 46], the processor is configured to (i.e. the image processing system 2900 may have one or more processors) [Lucas: col 24, line 46-47; Figs. 29-30]: identify the normal vector for each of the points of the 3D point cloud (i.e. a 3D object formed by combining multiple images from different perspectives is
represented as a point cloud comprised of an unstructured collection of points with associated colors and normal directions) [Lucas: col 5, line 25-32] that is perpendicular to an external surface of the 3D point cloud, and generate the initial patches by grouping each of the points of the 3D point cloud ((i.e. Process 400 may include "generate point cloud by combining the LPVs") [Lucas: col 18, line 55-56]; (i.e. constructing the point cloud by combining the local point volumes) [Lucas: col 30, line 17-18]; (i.e. a collection of points or pixel locations from a single image that are assigned depth values is referred to herein as a depth map of a single image, while a 3D object formed by combining multiple images from different perspectives is represented as a point cloud comprised of an unstructured collection of points with associated colors and normal directions.) [Lucas: col 5, line 18-24]; (i.e. rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 47-48]; (i.e. one stereo technique uses a patch to determine if points on one image match points on another image rather than the more conventional scanline matching. The patches are better for capturing similar pixel data on images of two different perspectives) [Lucas: col 5, line 28-32]) to one of the multiple projection planes ((i.e. A stereo matching technique is then applied to perform depth estimation for seeds in the initial point cloud. Rays are traced from the camera center of a first view or image, through a seed point on the first image. Ray positions are projected onto the view of another overlapping second camera, or second view or image, to determine a linear range or bracket of potential depth estimates along the ray and within the second view or image. By one form, the process is repeated for all pairs of cameras with overlapping fields of view, although other alternatives could be used) [Lucas: col 8, line 4-14]; (i.e. 
a set of points contained within the silhouette of the visual hull when reprojected into individual camera views) [Lucas: col 12, line 14-16]; (i.e. the LPV or sphere will have its 1 cm diameter cover that pixel length of 5, 10, 100, and so on. Spheres are the selected shape due to software rendering efficiency so that their projection onto any camera image is not less than a pixel in width and when rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 42-48]; (i.e. Stereo-matching confidence scores are then used to select the best depth estimate for the seed point being analyzed. By one form, and as mentioned above, this involves an initial selection by using the DAISY score, while the depth estimate is then refined using the CENSUS score.) [Lucas: col 8, line 15-20]; (i.e. To improve localization of landmarks, the highest score of a CENSUS metric using a 7x7 pixel patch with a bracketed line search is performed where the local CENSUS search bracket 1032 and 1106 are respectively shown on FIGS. 10 and 11. CENSUS is another stereo-matching descriptor and refers to a non-parametric intensity-based image data representation over a certain pixel area referred to as a CENSUS transform that summarizes local image structure by providing a bit string (or in other words, transforms image data into a representation). The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel) [Lucas: col 14, line 32-45]) that the normal vector is directed towards (i.e. a 3D object formed by combining multiple images from different perspectives is represented as a point cloud comprised of an unstructured collection of points with associated colors and normal directions) [Lucas: col 5, line 25-32].
Lucas and Huang do not explicitly disclose the following claim limitations (Emphasis added).
The method of Claim 12, wherein: the refined patches are generated based on the plurality of initial patches; and generating the plurality of initial patches comprises:  identify the normal vector for each of the points of the 3D point cloud that is perpendicular to an external surface of the 3D point cloud, and generate the initial patches by grouping each of the points of the 3D point cloud to one of the multiple projection planes that the normal vector is directed towards.  
However, in the same field of endeavor Boyce further discloses the claim limitations and the deficient claim limitations, as follows:
identify a normal vector for each of the points of the 3D point cloud that is perpendicular to an external surface of the 3D point cloud ((i.e. wherein encoding the surface normals data with the point cloud geometry data comprises encoding a vector perpendicular to each point on a surface of an object to provide directions associated with the object) [Boyce: col 40, line 5-8]; (i.e. At processing block 2280, the point cloud data objects are rendered using the surface normal data. Thus, the perpendicular vectors associated with object points are implemented to provide surface directions for each object during rendering. Accordingly, implementation
of the surface normals provides enhanced reconstruction of the original geometry of objects in the point cloud data) [Boyce: col 37, line 9-15]), and generate the initial patches by grouping each of the points of the 3D point cloud to one of the multiple projection planes that the normal vector is directed towards (i.e. Texture and depth 2D image patch representations are formed at each projection angle. The 2D patch image representations for a projection angle can be created by projecting only those points for which a projection angle has the closest normal. In other words, the 2D patch image representation is taken for the points that maximize the dot product of the point normal and the plane normal. Texture patches from the separate projections are combined into a single texture image, which is referred to as the geometry image. Metadata to represent the patches and how they were packed into a frame are described in the occupancy map and auxiliary patch info. The occupancy map metadata includes an indication of which image sample positions are empty (e.g., do not contain corresponding point cloud information). The auxiliary patch info indicates the projection plane to which a patch belongs and can be used to determine a projection plane associated with a given sample position) [Boyce: col 30, line 36-52].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Lucas and Huang with Boyce to program the system to compute the normal vector, which is perpendicular to the external surface of the 3D point cloud.
Therefore, the combination of Lucas and Huang with Boyce will enable the system to calculate a conventional normal vector as defined in the art.
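For illustration only, the grouping cited from Boyce (projecting, for each projection angle, the points that maximize the dot product of the point normal and the plane normal) may be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Boyce or of the claimed invention: the six axis-aligned unit vectors in `PLANE_NORMALS`, the function names, and the sample points are hypothetical.

```python
import math

# Assumption: six axis-aligned projection planes surrounding the point cloud,
# each represented by its unit normal vector.
PLANE_NORMALS = [
    (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0), (0.0, -1.0, 0.0),
    (0.0, 0.0, 1.0), (0.0, 0.0, -1.0),
]


def normalize(vector):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(component * component for component in vector))
    if length == 0.0:
        raise ValueError("a zero vector has no direction")
    return tuple(component / length for component in vector)


def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def assign_plane(normal, plane_normals=PLANE_NORMALS):
    """Return the index of the plane whose normal has the largest dot product."""
    unit_normal = normalize(normal)
    scores = [dot(unit_normal, plane) for plane in plane_normals]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores


def group_points(points_with_normals):
    """Group points by the projection plane that each point normal faces most directly."""
    groups = {}
    for point, normal in points_with_normals:
        plane_index, _ = assign_plane(normal)
        groups.setdefault(plane_index, []).append(point)
    return groups


# Hypothetical points paired with (unnormalized) surface normals.
sample = [
    ((0, 0, 0), (0.0, 0.0, -2.0)),
    ((1, 1, 1), (0.1, 0.0, -1.0)),
    ((2, 2, 2), (3.0, 0.0, 0.0)),
]
groups = group_points(sample)
```

Because each plane normal is a unit vector and the point normal is normalized, each dot product is the cosine of the angle between the two vectors, so the maximum dot product corresponds to the smallest angle.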

Regarding claim 16, Lucas meets the claim limitations as set forth in claim 15. Lucas further meets the claim limitations as follows.
The method of Claim 15 (i.e. a method) [Lucas: col 1, line 54], wherein the set of smoothing scores for the one voxel ((i.e. Process 400 may include "smooth point cloud with shrink wrapping" 498. Starting with the visual hull represented as a mesh, the method may shrink wrap (see Dale, A. M., "Cortical surface-based analysis: I. Segmentation and surface reconstruction", Neuroimage 9.2, pp. 179-194 (1999)) the point cloud by moving mesh vertexes closer to the original point cloud, subject to regularization so that the resultant point cloud is smooth. The topology of the mesh is discarded because triangle quality tends to be poor when vertices are spaced close together) [Lucas: col 20, line 53-62]; (i.e. the point is then refined using a non-parametric intensity-based confidence score, such as a CENSUS score) [Lucas: col 6, line 28-30]; (i.e. A bracket search is used again to assign a depth estimate near the pivot point to the neighbor point, and by one form, by determining the CENSUS score within the bracket. These neighbor points then become 3D particles that populate the latest expanded point cloud, and each such neighbor point then becomes its own LPV) [Lucas: col 8, line 52-58]; (i.e. the CENSUS pixel area that is being applied as explained below. This approach minimizes the amount of work that has to be done by artists to clean up resulting point clouds for film and/or video production which may be at the expense of more computation time. The method is biased towards erring on the side of false positives instead of false negatives by having such a robust seeding and expansion of points image by image such that corresponding points on different images could each have its own candidate point in the point cloud resulting in some redundancy. 
This is ultimately more efficient because it is easier for artists to manually remove extraneous points (which is a relatively easier 2D task) than to complete missing structures by sculpting (which is a relatively more difficult 3D task)) [Lucas: col 6, line 37-51]) indicates a number of points within a respective voxel (i.e. find potential 3D points using DAISY features and refined by CENSUS stereo matching) [Lucas: col 21, line 31-33] that are grouped to each of the projection planes ((i.e. Process 400 may include "generate point cloud by combining the LPVs") [Lucas: col 18, line 55-56]; (i.e. constructing the point cloud by combining the local point volumes) [Lucas: col 30, line 17-18]; (i.e. a collection of points or pixel locations from a single image that are assigned depth values is referred to herein as a depth map of a single image, while a 3D object formed by combining multiple images from different perspectives is represented as a point cloud comprised of an unstructured collection of points with associated colors and normal directions.) [Lucas: col 5, line 18-24]; (i.e. rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 47-48]; (i.e. one stereo technique uses a patch to determine if points on one image match points on another image rather than the more conventional scanline matching. The patches are better for capturing similar pixel data on images of two different perspectives) [Lucas: col 5, line 28-32]; (i.e. A stereo matching technique is then applied to perform depth estimation for seeds in the initial point cloud. Rays are traced from the camera center of a first view or image, through a seed point on the first image. Ray positions are projected onto the view of another overlapping second camera, or second view or image, to determine a linear range or bracket of potential depth estimates along the ray and within the second view or image. 
By one form, the process is repeated for all pairs of cameras with overlapping fields of view, although other alternatives could be used) [Lucas: col 8, line 4-14]; (i.e. a set of points contained within the silhouette of the visual hull when reprojected into individual camera views) [Lucas: col 12, line 14-16]; (i.e. the LPV or sphere will have its 1 cm diameter cover that pixel length of 5, 10, 100, and so on. Spheres are the selected shape due to software rendering efficiency so that their projection onto any camera image is not less than a pixel in width and when rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 42-48]) as indicated by the initial patches ((i.e. one stereo technique uses a patch to determine if points on one image match points on another image rather than the more conventional scanline matching. The patches are better for capturing similar pixel data on images of two different perspectives) [Lucas: col 5, line 28-32]; (i.e. To improve localization of landmarks, the highest score of a CENSUS metric using a 7x7 pixel patch with a bracketed line search is performed where the local CENSUS search bracket 1032 and 1106 are respectively shown on FIGS. 10 and 11. CENSUS is another stereo-matching descriptor and refers to a non-parametric intensity-based image data representation over a certain pixel area referred to as a CENSUS transform that summarizes local image structure by providing a bit string (or in other words, transforms image data into a representation). The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel) [Lucas: col 14, line 32-45]).

Regarding claim 17, Lucas meets the claim limitations as set forth in claim 11. Lucas further meets the claim limitations as follows.
The method of Claim 11 (i.e. a method) [Lucas: col 1, line 54], wherein identifying the normal score ((i.e. feature identifying algorithm that generates scores, such with a Shi-Tomasi Eigenvalue-based "corner" score. The process retains the points with such a score that is above a threshold and are maximal among samples within each grid cell. See for example, Shi, J., et al., "Good features to track", Cornell University (1993)) [Lucas: col 12, line 6-11]; (i.e. the match may be performed by determining when a normalized color difference (such as a sum of absolute difference (SAD) score) with respect to the original images) [Lucas: col 9, line 65-68]; (i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54]), the processor (i.e. Graphics subsystem 3015 may be integrated into processor 3010 or chipset 3005) [Lucas: col 26, line 11-12; Figs. 29-30] comprises: identifying the normal vector of the first point of the 3D point cloud ((i.e. feature identifying algorithm that generates scores, such with a Shi-Tomasi Eigenvalue-based "corner" score. The process retains the points with such a score that is above a threshold and are maximal among samples within each grid cell. See for example, Shi, J., et al., "Good features to track", Cornell University (1993)) [Lucas: col 12, line 6-11]; (i.e. the match may be performed by determining when a normalized color difference (such as a sum of absolute difference (SAD) score) with respect to the original images) [Lucas: col 9, line 65-68]; (i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. 
Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54]), the normal vector is perpendicular to an external surface of the 3D point cloud; and comparing (i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54]  the normal vector of the first point to the unit vector of each of the multiple projection planes to identify the normal score of the first point ((i.e. comprehensive quality analysis to the existing point cloud, planning of the scanning path and automatic scanning and stitching to sections with low confidence scores) [Lucas: col 3, line 32-35]; (i.e. feature identifying algorithm that generates scores, such with a Shi-Tomasi Eigenvalue-based "corner" score. The process retains the points with such a score that is above a threshold and are maximal among samples within each grid cell. See for example, Shi, J., et al., "Good features to track", Cornell University (1993)) [Lucas: col 12, line 6-11]; (i.e. the match may be performed by determining when a normalized color difference (such as a sum of absolute difference (SAD) score) with respect to the original images) [Lucas: col 9, line 65-68]; (i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54]).
Lucas and Huang do not explicitly disclose the following claim limitations (Emphasis added).
The method of Claim 11, wherein identifying the normal score comprises: identifying the normal vector of the first point of the 3D point cloud, the normal vector is perpendicular to an external surface of the 3D point cloud; and comparing the normal vector of the first point to the unit vector of each of the multiple projection planes to identify the normal score of the first point.   
However, in the same field of endeavor Boyce further discloses the claim limitations and the deficient claim limitations, as follows:
the normal vector is perpendicular to an external surface of the 3D point cloud ((i.e. wherein encoding the surface normals data with the point cloud geometry data comprises encoding a vector perpendicular to each point on a surface of an object to provide directions associated with the object) [Boyce: col 40, line 5-8]; (i.e. At processing block 2280, the point cloud data objects are rendered using the surface normal data. Thus, the perpendicular vectors associated with object points are implemented to provide surface directions for each object during rendering. Accordingly, implementation of the surface normals provides enhanced reconstruction of the original geometry of objects in the point cloud data) [Boyce: col 37, line 9-15]);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Lucas and Huang with Boyce to program the system to compute the normal vector, which is perpendicular to the external surface of the 3D point cloud.
Therefore, the combination of Lucas and Huang with Boyce will enable the system to calculate a conventional normal vector as defined in the art.

Regarding claim 19, Lucas meets the claim limitations as set forth in claim 18. Lucas further meets the claim limitations as follows.
The method of Claim 18 (i.e. a method) [Lucas: col 1, line 54], further comprising after the final score is identified (i.e. the final optima maximum CENSUS score point) [Lucas: col 15, line 8], comparing a current iteration (i.e. This may include "compare LPV to corresponding points in 2D images" 482, which in turn, involves "render a visible LPV into a rendered 2D image including any other particles within the volume of the LPV" 482-1. Thus, a current LPV on the latest expanded point cloud and the particles of other LPVs within the volume of the current LPV are projected to rendered a 2D image, one for each camera (or perspective or different view) of the multiple cameras.) [Lucas: col 18, line 67 – col. 19, line 7] for generating the refined patches to a predefined number of iterations (i.e. it encompasses a larger size in close-up images that are efficient for filtering using stereo-matching comparisons when colors and/or intensities at a single pixel patch on such close-ups) [Lucas: col 18, line 46-49]; and generate the 2D frames (i.e. provide an expanded and filtered point cloud to be used to generate images) [Lucas: col 10, line 3-5] that include the pixels ((i.e. provide an expanded and filtered point cloud to be used to generate images" 316. The final point cloud then may be provided first for post-processing to refine the points, which may include traditional space carving, as described below, and then for modeling, display, or analysis as needed depending on the application and as described below as well) [Lucas: col 10, line 3-9]; (i.e. To improve localization of landmarks, the highest score of a CENSUS metric using a 7x7 pixel patch with a bracketed line search is performed where the local CENSUS search bracket 1032 and 1106 are respectively shown on FIGS. 10 and 11. 
CENSUS is another stereo-matching descriptor and refers to a non-parametric intensity-based image data representation over a certain pixel area referred to as a CENSUS transform that summarizes local image structure by providing a bit string (or in other words, transforms image data into a representation). The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel) [Lucas: col 14, line 32-45]) when the current iteration matches the predefined number of iterations (i.e. Referring to FIGS. 15-19, example images are provided and formed by the present iterative expansion-filtering method, and the images show the clear increase in accuracy with the expansion-filtering iterations. An image 1500 shows a scene generated by using initial reconstruction seeds of an initial point cloud. An image 1600 is generated by using an expanded point cloud after a first expansion iteration, while an image 1700 is generated by using a filtered point cloud after a set of first filter iterations for the first expansion iteration. An image 1800 is generated by using an expanded point cloud after a second expansion iteration, while an image 1900, the best quality image so far, is generated by using a filtered point cloud after a set of second filter iterations for the second expansion iteration) [Lucas: col 17, line 10-23].

Claims 5, 8, 10, 15, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Lucas (US Patent 10,867,430 B2), (“Lucas”), in view of Huang et al. (US Patent 10,062,207 B1), (“Huang”), in view of Boyce et al. (US Patent 10,893,299 B1), (“Boyce”), in view of Hickman et al. (US Patent 8,436,853 B1), (“Hickman”).

Regarding claim 5, Lucas, Huang, and Boyce meet the claim limitations as set forth in claim 2. Lucas further meets the claim limitations as follows.
The encoding device of Claim 2 ((i.e. encoder) [Lucas: col 24, line 51]; (i.e. image processing system) [Lucas: col 24, line 46]), wherein to identify the smoothing score ((i.e. feature identifying algorithm that generates scores) [Lucas: col 12, line 6-7]; (i.e. Process 400 may include "smooth point cloud with shrink wrapping" 498. Starting with the visual hull represented as a mesh, the method may shrink wrap (see Dale, A. M., "Cortical surface-based analysis: I. Segmentation and surface reconstruction", Neuroimage 9.2, pp. 179-194 (1999)) the point cloud by moving mesh vertexes closer to the original point cloud, subject to regularization so that the resultant point cloud is smooth. The topology of the mesh is discarded because triangle quality tends to be poor when vertices are spaced close together) [Lucas: col 20, line 53-62]; (i.e. the point is then refined using a non-parametric intensity-based confidence score, such as a CENSUS score) [Lucas: col 6, line 28-30]; (i.e. A bracket search is used again to assign a depth estimate near the pivot point to the neighbor point, and by one form, by determining the CENSUS score within the bracket. These neighbor points then become 3D particles that populate the latest expanded point cloud, and each such neighbor point then becomes its own LPV) [Lucas: col 8, line 52-58]; (i.e. the CENSUS pixel area that is being applied as explained below. This approach minimizes the amount of work that has to be done by artists to clean up resulting point clouds for film and/or video production which may be at the expense of more computation time. The method is biased towards erring on the side of false positives instead of false negatives by having such a robust seeding and expansion of points image by image such that corresponding points on different images could each have its own candidate point in the point cloud resulting in some redundancy. 
This is ultimately more efficient because it is easier for artists to manually remove extraneous points (which is a relatively easier 2D task) than to complete missing structures by sculpting (which is a relatively more difficult 3D task)) [Lucas: col 6, line 37-51]), the processor is configured to ((i.e. the image processing system 2900 may have one or more processors) [Lucas: col 24, line 46-47; Figs. 29-30] (i.e. graphics and/or video functionality may be integrated within a chipset. Alternatively, a discrete graphics and/or video processor may be used. As still another implementation, the graphics and/or video functions may be provided by a general purpose processor, including a multi-core processor. In further implementations, the functions may be implemented in a consumer electronics device.) [Lucas: col 26, line 17-24; Figs. 29-30]): identify a set of smoothing scores for each of the multiple voxels (i.e. feature identifying algorithm that generates scores) [Lucas: col 12, line 6-7]; (i.e. Process 400 may include "smooth point cloud with shrink wrapping" 498. Starting with the visual hull represented as a mesh, the method may shrink wrap (see Dale, A. M., "Cortical surface-based analysis: I. Segmentation and surface reconstruction", Neuroimage 9.2, pp. 179-194 (1999)) the point cloud by moving mesh vertexes closer to the original point cloud, subject to regularization so that the resultant point cloud is smooth. The topology of the mesh is discarded because triangle quality tends to be poor when vertices are spaced close together) [Lucas: col 20, line 53-62]; (i.e. identified using a nearest-neighbor lookup to first connect neighboring points that are within a specified distance. Small clusters are then removed based on the spatial extent and number of points in the connected component) [Lucas: col 20, line 28-32]) that include at least one of the points of the 3D point cloud ((i.e. Process 400 may include "smooth point cloud with shrink wrapping" 498. 
Starting with the visual hull represented as a mesh, the method may shrink wrap (see Dale, A. M., "Cortical surface-based analysis: I. Segmentation and surface reconstruction", Neuroimage 9.2, pp. 179-194 (1999)) the point cloud by moving mesh vertexes closer to the original point cloud, subject to regularization so that the resultant point cloud is smooth. The topology of the mesh is discarded because triangle quality tends to be poor when vertices are spaced close together) [Lucas: col 20, line 53-62]; (i.e. the point is then refined using a non-parametric intensity-based confidence score, such as a CENSUS score) [Lucas: col 6, line 28-30]; (i.e. A bracket search is used again to assign a depth estimate near the pivot point to the neighbor point, and by one form, by determining the CENSUS score within the bracket. These neighbor points then become 3D particles that populate the latest expanded point cloud, and each such neighbor point then becomes its own LPV) [Lucas: col 8, line 52-58]; (i.e. the CENSUS pixel area that is being applied as explained below. This approach minimizes the amount of work that has to be done by artists to clean up resulting point clouds for film and/or video production which may be at the expense of more computation time. The method is biased towards erring on the side of false positives instead of false negatives by having such a robust seeding and expansion of points image by image such that corresponding points on different images could each have its own candidate point in the point cloud resulting in some redundancy. This is ultimately more efficient because it is easier for artists to manually remove extraneous points (which is a relatively easier 2D task) than to complete missing structures by sculpting (which is a relatively more difficult 3D task)) [Lucas: col 6, line 37-51]); and identify the smoothing score for one voxel of the multiple voxels (i.e. 
feature identifying algorithm that generates scores) [Lucas: col 12, line 6-7]; (i.e. Process 400 may include "smooth point cloud with shrink wrapping" 498. Starting with the visual hull represented as a mesh, the method may shrink wrap (see Dale, A. M., "Cortical surface-based analysis: I. Segmentation and surface reconstruction", Neuroimage 9.2, pp. 179-194 (1999)) the point cloud by moving mesh vertexes closer to the original point cloud, subject to regularization so that the resultant point cloud is smooth. The topology of the mesh is discarded because triangle quality tends to be poor when vertices are spaced close together) [Lucas: col 20, line 53-62] by combining the set of smoothing scores of the one voxel with additional sets of smoothing scores of the neighboring voxels that are proximate to the one voxel ((i.e. Process 400 may include "generate point cloud by combining the LPVs") [Lucas: col 18, line 55-56]; (i.e. constructing the point cloud by combining the local point volumes) [Lucas: col 30, line 17-18]; (i.e. a collection of points or pixel locations from a single image that are assigned depth values is referred to herein as a depth map of a single image, while a 3D object formed by combining multiple images from different perspectives is represented as a point cloud comprised of an unstructured collection of points with associated colors and normal directions.) [Lucas: col 5, line 18-24]; (i.e. rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 47-48]).
In the same field of endeavor Hickman further discloses the claim limitations as follows:
identify a set of smoothing scores for each of the multiple voxels (i.e. another example, the model is in voxels and a marching cubes algorithm may be applied to convert the voxels into a mesh, which can undergo a smoothing operation to reduce jaggedness on surfaces of the 3D object data model caused by conversion by the marching cubes algorithm. An example smoothing operation may move individual triangle vertices to positions representing averages of connected neighborhood vertices to reduce angles between triangles in the mesh.  In one example, 3D object data model generation may further include application of a decimation operation to the smoothed mesh to eliminate data points) [Hickman: col 13, line 44-54] that include at least one of the points of the 3D point cloud (i.e. another example, the model is in voxels and a marching cubes algorithm may be applied to convert the voxels into a mesh, which can undergo a smoothing operation to reduce jaggedness on surfaces of the 3D object data model caused by conversion by the marching cubes algorithm. An example smoothing operation may move individual triangle vertices to positions representing averages of connected neighborhood vertices to reduce angles between triangles in the mesh) [Hickman: col 13, line 44-51]; and identify the smoothing score for one voxel of the multiple voxels (i.e. another example, the model is in voxels and a marching cubes algorithm may be applied to convert the voxels into a mesh, which can undergo a smoothing operation to reduce jaggedness on surfaces of the 3D object data model caused by conversion by the marching cubes algorithm. An example smoothing operation may move individual triangle vertices to positions representing averages of connected neighborhood vertices to reduce angles between triangles in the mesh.  
In one example, 3D object data model generation may further include application of a decimation operation to the smoothed mesh to eliminate data points) [Hickman: col 13, line 44-54] by combining the set of smoothing scores of the one voxel with additional sets of smoothing scores of the neighboring voxels that are proximate to the one voxel (i.e. In one example, 3D object data model generation may further include application of a decimation operation to the smoothed mesh to eliminate data points, which may improve processing speed. After the smoothing and decimation operations have been performed, an error value may be calculated based on differences between a resulting mesh and an original mesh or original data, and the error may be compared to an acceptable threshold value. The smoothing and decimation operations may be applied to the mesh once again based on a comparison of the error to the acceptable value. Last set of mesh data that satisfies the threshold may be stored as the 3D object data model.) [Hickman: col 13, line 53-63].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Lucas, Huang, and Boyce with Hickman to program the system to compute a smoothing score for each point.
Therefore, the combination of Lucas, Huang, and Boyce with Hickman will enable the system to create a deep fully convolutional network for analysis of 3D data in 3D environments [Park: col. 3, line 1-34]. 
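For illustration only, the claimed step of combining the set of smoothing scores of one voxel with the sets of smoothing scores of its neighboring voxels can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from the claims, Lucas, or Hickman: it assumes that a voxel's set of smoothing scores is a per-plane count of the points in that voxel, that "neighboring" means the 26 adjacent voxels, and that "combining" means an element-wise sum. The names `voxel_key`, `per_voxel_scores`, and `combined_scores` are hypothetical.

```python
from collections import defaultdict
from itertools import product


def voxel_key(point, voxel_size):
    """Map a 3D point to the integer grid coordinates of its voxel."""
    return tuple(int(c // voxel_size) for c in point)


def per_voxel_scores(points, plane_of_point, num_planes, voxel_size):
    """Assumed definition: for each occupied voxel, count how many of its
    points are currently associated with each projection plane."""
    scores = defaultdict(lambda: [0] * num_planes)
    for point, plane in zip(points, plane_of_point):
        scores[voxel_key(point, voxel_size)][plane] += 1
    return dict(scores)


def combined_scores(scores, voxel, num_planes):
    """Assumed combination: element-wise sum of the score set of the voxel
    itself and the score sets of its 26 adjacent voxels (empty ones add 0)."""
    total = [0] * num_planes
    for offset in product((-1, 0, 1), repeat=3):
        neighbour = tuple(v + o for v, o in zip(voxel, offset))
        for k, s in enumerate(scores.get(neighbour, ())):
            total[k] += s
    return total


points = [(0.2, 0.2, 0.2), (0.8, 0.1, 0.3), (1.5, 0.2, 0.2), (5.0, 5.0, 5.0)]
planes = [0, 1, 1, 2]
scores = per_voxel_scores(points, planes, num_planes=3, voxel_size=1.0)
print(combined_scores(scores, (0, 0, 0), num_planes=3))
```

In this sketch only occupied voxels carry a score set, which mirrors the claim language "voxels that include at least one of the points of the 3D point cloud".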

Regarding claim 8, Lucas, Huang, and Boyce meet the claim limitations as set forth in claim 1. Lucas further meets the claim limitations as follows.
The encoding device of Claim 1 ((i.e. encoder) [Lucas: col 24, line 51]; (i.e. image processing system) [Lucas: col 24, line 46]), wherein the processor is configured to (i.e. the image processing system 2900 may have one or more processors) [Lucas: col 24, line 46-47; Figs. 29-30]:
identify (i.e. feature identifying algorithm that generates scores) [Lucas: col 12, line 6-7] a final score ((i.e. the final optima maximum CENSUS score point) [Lucas: col 15, line 8]; (i.e. Thereafter, process 300 may include "provide an expanded and filtered point cloud to be used to generate images" 316. The final point cloud then may be provided first for post-processing to refine the points, which may include traditional space carving, as described below, and then for modeling, display, or analysis as needed depending on the application and as described below as well) [Lucas: col 10, line 2-9]) for each of the points of the 3D point cloud ((i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54] ; (i.e. A bracket search is used again to assign a depth estimate near the pivot point to the neighbor point, and by one form, by determining the CENSUS score within the bracket. These neighbor points then become 3D particles that populate the latest expanded point cloud, and each such neighbor point then becomes its own LPV) [Lucas: col 8, line 52-58]; (i.e. Stereo-matching confidence scores based on local region descriptors for image data (also referred to herein as representations) are then used to select the best depth estimate for the point being analyzed. By one form, this involves an initial selection by using a gradient histogram-type of local region descriptor such as a DAISY score) [Lucas: col 6, line 21-28]) with respect to the multiple projection planes ((i.e. A stereo matching technique is then applied to perform depth estimation for seeds in the initial point cloud. Rays are traced from the camera center of a first view or image, through a seed point on the first image. 
Ray positions are projected onto the view of another overlapping second camera, or second view or image, to determine a linear range or bracket of potential depth estimates along the ray and within the second view or image. By one form, the process is repeated for all pairs of cameras with overlapping fields of view, although other alternatives could be used) [Lucas: col 8, line 4-14]; (i.e. a set of points contained within the silhouette of the visual hull when reprojected into individual camera views) [Lucas: col 12, line 14-16]; (i.e. the LPV or sphere will have its 1 cm diameter cover that pixel length of 5, 10, 100, and so on. Spheres are the selected shape due to software rendering efficiency so that their projection onto any camera image is not less than a pixel in width and when rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 42-48]), wherein the final score ((i.e. the final optima maximum CENSUS score point) [Lucas: col 15, line 8]; (i.e. Thereafter, process 300 may include "provide an expanded and filtered point cloud to be used to generate images" 316. The final point cloud then may be provided first for post-processing to refine the points, which may include traditional space carving, as described below, and then for modeling, display, or analysis as needed depending on the application and as described below as well) [Lucas: col 10, line 2-9]) is based on a weighted combination of the normal score and the smoothing score and relates each of the points to a particular projection plane of the multiple projection planes ((i.e. a set of points contained within the silhouette of the visual hull when reprojected into individual camera views) [Lucas: col 12, line 14-16]; (i.e. This may include "compare LPV to corresponding points in 2D images" 482, which in turn, involves "render a visible LPV into a rendered 2D image including any other particles within the volume of the LPV" 482-1. 
Thus, a current LPV on the latest expanded point cloud and the particles of other LPVs within the volume of the current LPV are projected to rendered a 2D image, one for each camera ( or perspective or different view) of the multiple cameras.) [Lucas: col 18, line 67 – col. 19, line 7]); and group each point of the 3D point cloud ((i.e. Process 400 may include "generate point cloud by combining the LPVs") [Lucas: col 18, line 55-56]; (i.e. constructing the point cloud by combining the local point volumes) [Lucas: col 30, line 17-18]; (i.e. a collection of points or pixel locations from a single image that are assigned depth values is referred to herein as a depth map of a single image, while a 3D object formed by combining multiple images from different perspectives is represented as a point cloud comprised of an unstructured collection of points with associated colors and normal directions.) [Lucas: col 5, line 18-24]; (i.e. rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 47-48]; (i.e. one stereo technique uses a patch to determine if points on one image match points on another image rather than the more conventional scanline matching. The patches are better for capturing similar pixel data on images of two different perspectives) [Lucas: col 5, line 28-32]) to one of the multiple projection planes ((i.e. A stereo matching technique is then applied to perform depth estimation for seeds in the initial point cloud. Rays are traced from the camera center of a first view or image, through a seed point on the first image. Ray positions are projected onto the view of another overlapping second camera, or second view or image, to determine a linear range or bracket of potential depth estimates along the ray and within the second view or image. By one form, the process is repeated for all pairs of cameras with overlapping fields of view, although other alternatives could be used) [Lucas: col 8, line 4-14]; (i.e. 
a set of points contained within the silhouette of the visual hull when reprojected into individual camera views) [Lucas: col 12, line 14-16]; (i.e. the LPV or sphere will have its 1 cm diameter cover that pixel length of 5, 10, 100, and so on. Spheres are the selected shape due to software rendering efficiency so that their projection onto any camera image is not less than a pixel in width and when rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 42-48]; (i.e. Stereo-matching confidence scores are then used to select the best depth estimate for the seed point being analyzed. By one form, and as mentioned above, this involves an initial selection by using the DAISY score, while the depth estimate is then refined using the CENSUS score.) [Lucas: col 8, line 15-20]; (i.e. To improve localization of landmarks, the highest score of a CENSUS metric using a 7x7 pixel patch with a bracketed line search is performed where the local CENSUS search bracket 1032 and 1106 are respectively shown on FIGS. 10 and 11. CENSUS is another stereo-matching descriptor and refers to a non-parametric intensity-based image data representation over a certain pixel area referred to as a CENSUS transform that summarizes local image structure by providing a bit string (or in other words, transforms image data into a representation). The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel) [Lucas: col 14, line 32-45]) based on the final score ((i.e. the final optima maximum CENSUS score point) [Lucas: col 15, line 8]; (i.e. Thereafter, process 300 may include "provide an expanded and filtered point cloud to be used to generate images" 316. 
The final point cloud then may be provided first for post-processing to refine the points, which may include traditional space carving, as described below, and then for modeling, display, or analysis as needed depending on the application and as described below as well) [Lucas: col 10, line 2-9]), to generate the refined patches ((i.e. Referring to FIGS. 1-2, stereo methods perform well on scenes with highly textured surfaces even though segmentation masks are not necessarily highly accurate. For instance, one stereo technique uses a patch to determine if points on one image match points on another image rather than the more conventional scanline matching. The patches are better for capturing similar pixel data on images of two different perspectives) [Lucas: col 5, line 25-32]; (i.e. The CENSUS score may be determined by hamming distance between the two bit strings of corresponding pixel location patches on two different images) [Lucas: col 14, line 55-58]).  
Lucas, Huang, and Boyce do not explicitly disclose the following claim limitations (emphasis added).
The encoding device of Claim 1, wherein the processor is configured to: identify a final score for each of the points of the 3D point cloud with respect to the multiple projection planes, wherein the final score is based on a weighted combination of the normal score and the smoothing score and relates each of the points to a particular projection plane of the multiple projection planes; and group each point of the 3D point cloud to one of the multiple projection planes based on the final score, to generate the refined patches.
However, in the same field of endeavor Hickman further discloses the claim limitations and the deficient claim limitations, as follows:
wherein the final score is based on a weighted combination of the normal score and the smoothing score (i.e. the computing device may be configured to determine a respective score for each of the features (e.g., geometry, surface textures, color, shape, material, etc.) of the object and further may be configured to determine a single feature-based score based on respective scores of the features (e.g., a weighted average of the respective scores)) [Hickman: col 9, line 24-30].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Lucas, Huang, and Boyce with Hickman to program the system to implement the weighted sum method.
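For illustration only, the weighted combination recited in claim 8 can be sketched as below. This is a minimal sketch under stated assumptions, not the Applicant's implementation and not a method disclosed by Lucas or Hickman: it assumes six axis-aligned projection planes, takes the normal sub-score for each plane as the dot product of the point's normal vector with the plane's unit vector (a proxy for the angle between them), and groups the point to the plane with the highest final score. The weights and the names `normal_sub_scores`, `final_scores`, and `assign_plane` are hypothetical.

```python
# Assumed unit vectors of six axis-aligned projection planes.
PLANE_UNITS = [
    (1, 0, 0), (0, 1, 0), (0, 0, 1),
    (-1, 0, 0), (0, -1, 0), (0, 0, -1),
]


def normal_sub_scores(normal, plane_units=PLANE_UNITS):
    """One sub-score per plane: dot product of the point normal with the
    plane unit vector (larger means a smaller angle between the two)."""
    return [sum(n * u for n, u in zip(normal, unit)) for unit in plane_units]


def final_scores(normal_scores, smoothing_scores, w_normal=1.0, w_smooth=0.5):
    """Weighted combination of the normal score and the smoothing score,
    computed separately for every projection plane."""
    return [w_normal * n + w_smooth * s
            for n, s in zip(normal_scores, smoothing_scores)]


def assign_plane(scores):
    """Group the point to the plane index with the highest final score."""
    return max(range(len(scores)), key=scores.__getitem__)


normal = (0.6, 0.8, 0.0)                       # unit normal of one point
smoothing = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]     # assumed per-plane smoothing scores
print(assign_plane(final_scores(normal_sub_scores(normal), smoothing)))
```

With the assumed weights, the smoothing term moves the point from plane 1 (favored by the normal alone) to plane 0, which is the kind of refinement a combined final score permits.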


Regarding claim 10, Lucas, Huang, and Boyce meet the claim limitations as set forth in claim 1. Lucas further meets the claim limitations as follows.
The encoding device of Claim 1 ((i.e. encoder) [Lucas: col 24, line 51]; (i.e. image processing system) [Lucas: col 24, line 46]), wherein the processor (i.e. the image processing system 2900 may have one or more processors) [Lucas: col 24, line 46-47; Figs. 29-30] is further configured to: identify each voxel of the multiple voxels that include at least one of the points of the 3D point cloud (i.e. visual hull is the 3D shape of an object or objects in the captured images defined by outer boundary or silhouette of object(s) from each camera into 3D space. By one form, the space carving is performed on a voxel-level. The space carving is desirable because the space carving methods can infer shapes of occluded surfaces and surfaces with no or little observable texture) [Lucas: col 4, line 52-59]; and generate an index that identifies the points of the 3D point cloud that are within each of the multiple voxels.
Lucas, Huang, and Boyce do not explicitly disclose the following claim limitations (emphasis added).
The encoding device of Claim 1, wherein the processor is further configured to: identify each voxel of the multiple voxels that include at least one of the points of the 3D point cloud; and generate an index that identifies the points of the 3D point cloud that are within each of the multiple voxels.
However, in the same field of endeavor Hickman further discloses the claim limitations and the deficient claim limitations, as follows:
identifying each voxel of the multiple voxels (i.e. Identified structures can be used to generate 3D models that can be viewed, for example, using 3D Computer Aided Design (CAD) tools. In one example, a 3D geometric model in the form of a triangular surface mesh may be generated. In another example, the model is in voxels and a marching cubes algorithm may be applied to convert the voxels into a mesh, which can undergo a smoothing operation to reduce jaggedness on surfaces of the 3D object data model caused by conversion by the marching cubes algorithm. An example smoothing operation may move individual triangle vertices to positions representing averages of connected neighborhood vertices to reduce angles between triangles in the mesh) [Hickman: col 13, line 39-51] that include at least one of the points of the 3D point cloud ((i.e. The computing device may be configured to generate the 3D object data model of the object by estimating 3D coordinates of points on the object.) [Hickman: col 13, line 32-34]; (i.e. FIG. 6 illustrates conceptual example surface normals of points on a surface 602 of an object in relation to the image capture device 404, in accordance with an embodiment) [Hickman: col 12, line 28-30; Fig. 6]; (i.e. In one example, 3D object data model generation may further include application of a decimation operation to the smoothed mesh to eliminate data points) [Hickman: col 13, line 52-54]); and generating an index (i.e. perform texture resampling and also shape-based indexing. For example, for each object, the semantics and search index 114 may index or label components of the images (e.g., per pixel) as having a certain texture, color, shape, geometry, attribute, etc. The semantics and search index 114 may receive the 3D object data model file or the files comprising the 3D object data model from the model builder 110 or the object data model processor 112, and may be configured to label portions of the file or each file individually
with identifiers related to attributes of the file. In some examples, the semantics and search index 114 may be configured to provide annotations for aspects of the 3D object data models. For instance, an annotation may be provided to label or index aspects of color, texture, shape, appearance, description, function, etc., of an aspect of a 3D object data model. Annotations may be used to label any aspect of an image or 3D object data model, or to provide any type of information) [Hickman: col 5, line 5-23] that identifies the points of the 3D point cloud that are within each of the multiple voxels (i.e. The computing device may be configured to generate the 3D object data model of the object by estimating 3D coordinates of points on the object. The coordinates may be determined by measurements made in the respective images. Common points may be identified on each image. A line of sight (or ray) can be constructed from a camera location to a point on the object. Intersection of these rays (triangulation) may determine a 3D location or coordinates of the point. Identified structures can be used to generate 3D models that can be viewed, for example, using 3D Computer Aided Design (CAD) tools. In one example, a 3D geometric model in the form of a triangular surface mesh may be generated. In another example, the model is in voxels and a marching cubes algorithm may be applied to convert the voxels into a mesh, which can undergo a smoothing operation to reduce jaggedness on surfaces of the 3D object data model caused by conversion by the marching cubes algorithm. An example smoothing operation may move individual triangle vertices to positions representing averages of connected neighborhood vertices to reduce angles between triangles in the mesh) [Hickman: col 13, line 32-51].

Therefore, the combination of Lucas, Huang, and Boyce with Hickman will enable the system to create a deep fully convolutional network for analysis of 3D data in 3D environments [Park: col. 3, line 1-34]. 
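For illustration only, the index recited in claim 10, which identifies the points of the 3D point cloud that fall within each occupied voxel, can be sketched as below. This is a minimal sketch assuming a uniform grid of cubic voxels with a hypothetical edge length `voxel_size`; the function name `build_voxel_index` is likewise hypothetical and does not appear in the claims or in the cited references.

```python
from collections import defaultdict


def build_voxel_index(points, voxel_size):
    """Return a mapping from voxel grid coordinates to the list of indices
    of the points located inside that voxel. Only voxels that contain at
    least one point appear as keys."""
    index = defaultdict(list)
    for i, (x, y, z) in enumerate(points):
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        index[key].append(i)
    return dict(index)


points = [(0.2, 0.2, 0.2), (0.8, 0.1, 0.3), (1.5, 0.2, 0.2)]
voxel_index = build_voxel_index(points, voxel_size=1.0)
print(voxel_index)
```

Such a per-voxel index lets later per-voxel computations look up the member points directly instead of scanning the whole point cloud.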

Regarding claim 15, Lucas, Huang, and Boyce meet the claim limitations as set forth in claim 12. Lucas further meets the claim limitations as follows.
The method of Claim 12 (i.e. a method) [Lucas: col 1, line 54], wherein to identifying the smoothing score ((i.e. feature identifying algorithm that generates scores) [Lucas: col 12, line 6-7]; (i.e. Process 400 may include "smooth point cloud with shrink wrapping" 498. Starting with the visual hull represented as a mesh, the method may shrink wrap (see Dale, A. M., "Cortical surface-based analysis: I. Segmentation and surface reconstruction", Neuroimage 9.2, pp. 179-194 (1999)) the point cloud by moving mesh vertexes closer to the original point cloud, subject to regularization so that the resultant point cloud is smooth. The topology of the mesh is discarded because triangle quality tends to be poor when vertices are spaced close together) [Lucas: col 20, line 53-62]; (i.e. the point is then refined using a non-parametric intensity-based confidence score, such as a CENSUS score) [Lucas: col 6, line 28-30]; (i.e. A bracket search is used again to assign a depth estimate near the pivot point to the neighbor point, and by one form, by determining the CENSUS score within the bracket. These neighbor points then become 3D particles that populate the latest expanded point cloud, and each such neighbor point then becomes its own LPV) [Lucas: col 8, line 52-58]; (i.e. the CENSUS pixel area that is being applied as explained below. This approach minimizes the amount of work that has to be done by artists to clean up resulting point clouds for film and/or video production which may be at the expense of more computation time. The method is biased towards erring on the side of false positives instead of false negatives by having such a robust seeding and expansion of points image by image such that corresponding points on different images could each have its own candidate point in the point cloud resulting in some redundancy. 
This is ultimately more efficient because it is easier for artists to manually remove extraneous points (which is a relatively easier 2D task) than to complete missing structures by sculpting (which is a relatively more difficult 3D task)) [Lucas: col 6, line 37-51]) comprises: identifying a set of smoothing scores for each of the multiple voxels (i.e. feature identifying algorithm that generates scores) [Lucas: col 12, line 6-7]; (i.e. Process 400 may include "smooth point cloud with shrink wrapping" 498. Starting with the visual hull represented as a mesh, the method may shrink wrap (see Dale, A. M., "Cortical surface-based analysis: I. Segmentation and surface reconstruction", Neuroimage 9.2, pp. 179-194 (1999)) the point cloud by moving mesh vertexes closer to the original point cloud, subject to regularization so that the resultant point cloud is smooth. The topology of the mesh is discarded because triangle quality tends to be poor when vertices are spaced close together) [Lucas: col 20, line 53-62]; (i.e. identified using a nearest-neighbor lookup to first connect neighboring points that are within a specified distance. Small clusters are then removed based on the spatial extent and number of points in the connected component) [Lucas: col 20, line 28-32]) that include at least one of the points of the 3D point cloud ((i.e. Process 400 may include "smooth point cloud with shrink wrapping" 498. Starting with the visual hull represented as a mesh, the method may shrink wrap (see Dale, A. M., "Cortical surface-based analysis: I. Segmentation and surface reconstruction", Neuroimage 9.2, pp. 179-194 (1999)) the point cloud by moving mesh vertexes closer to the original point cloud, subject to regularization so that the resultant point cloud is smooth. The topology of the mesh is discarded because triangle quality tends to be poor when vertices are spaced close together) [Lucas: col 20, line 53-62]; (i.e. 
the point is then refined using a non-parametric intensity-based confidence score, such as a CENSUS score) [Lucas: col 6, line 28-30]; (i.e. A bracket search is used again to assign a depth estimate near the pivot point to the neighbor point, and by one form, by determining the CENSUS score within the bracket. These neighbor points then become 3D particles that populate the latest expanded point cloud, and each such neighbor point then becomes its own LPV) [Lucas: col 8, line 52-58]; (i.e. the CENSUS pixel area that is being applied as explained below. This approach minimizes the amount of work that has to be done by artists to clean up resulting point clouds for film and/or video production which may be at the expense of more computation time. The method is biased towards erring on the side of false positives instead of false negatives by having such a robust seeding and expansion of points image by image such that corresponding points on different images could each have its own candidate point in the point cloud resulting in some redundancy. This is ultimately more efficient because it is easier for artists to manually remove extraneous points (which is a relatively easier 2D task) than to complete missing structures by sculpting (which is a relatively more difficult 3D task)) [Lucas: col 6, line 37-51]); and identifying the smoothing score for one voxel of the multiple voxels (i.e. feature identifying algorithm that generates scores) [Lucas: col 12, line 6-7]; (i.e. Process 400 may include "smooth point cloud with shrink wrapping" 498. Starting with the visual hull represented as a mesh, the method may shrink wrap (see Dale, A. M., "Cortical surface-based analysis: I. Segmentation and surface reconstruction", Neuroimage 9.2, pp. 179-194 (1999)) the point cloud by moving mesh vertexes closer to the original point cloud, subject to regularization so that the resultant point cloud is smooth. 
The topology of the mesh is discarded because triangle quality tends to be poor when vertices are spaced close together) [Lucas: col 20, line 53-62] by combining the set of smoothing scores of the one voxel with additional sets of smoothing scores of the neighboring voxels that are proximate to the one voxel ((i.e. Process 400 may include "generate point cloud by combining the LPVs") [Lucas: col 18, line 55-56]; (i.e. constructing the point cloud by combining the local point volumes) [Lucas: col 30, line 17-18]; (i.e. a collection of points or pixel locations from a single image that are assigned depth values is referred to herein as a depth map of a single image, while a 3D object formed by combining multiple images from different perspectives is represented as a point cloud comprised of an unstructured collection of points with associated colors and normal directions.) [Lucas: col 5, line 18-24]; (i.e. rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 47-48]).
In the same field of endeavor Hickman further discloses the claim limitations as follows:
identifying a set of smoothing scores for each of the multiple voxels (i.e. another example, the model is in voxels and a marching cubes algorithm may be applied to convert the voxels into a mesh, which can undergo a smoothing operation to reduce jaggedness on surfaces of the 3D object data model caused by conversion by the marching cubes algorithm. An example smoothing operation may move individual triangle vertices to positions representing averages of connected neighborhood vertices to reduce angles between triangles in the mesh.  In one example, 3D object data model generation may further include application of a decimation operation to the smoothed mesh to eliminate data points) [Hickman: col 13, line 44-54] that include at least one of the points of the 3D point cloud (i.e. another example, the model is in voxels and a marching cubes algorithm may be applied to convert the voxels into a mesh, which can undergo a smoothing operation to reduce jaggedness on surfaces of the 3D object data model caused by conversion by the marching cubes algorithm. An example smoothing operation may move individual triangle vertices to positions representing averages of connected neighborhood vertices to reduce angles between triangles in the mesh) [Hickman: col 13, line 44-51]; and identifying the smoothing score for one voxel of the multiple voxels (i.e. another example, the model is in voxels and a marching cubes algorithm may be applied to convert the voxels into a mesh, which can undergo a smoothing operation to reduce jaggedness on surfaces of the 3D object data model caused by conversion by the marching cubes algorithm. An example smoothing operation may move individual triangle vertices to positions representing averages of connected neighborhood vertices to reduce angles between triangles in the mesh.  
In one example, 3D object data model generation may further include application of a decimation operation to the smoothed mesh to eliminate data points) [Hickman: col 13, line 44-54] by combining the set of smoothing scores of the one voxel with additional sets of smoothing scores of the neighboring voxels that are proximate to the one voxel (i.e. In one example, 3D object data model generation may further include application of a decimation operation to the smoothed mesh to eliminate data points, which may improve processing speed. After the smoothing and decimation operations have been performed, an error value may be calculated based on differences between a resulting mesh and an original mesh or original data, and the error may be compared to an acceptable threshold value. The smoothing and decimation operations may be applied to the mesh once again based on a comparison of the error to the acceptable value. Last set of mesh data that satisfies the threshold may be stored as the 3D object data model.) [Hickman: col 13, line 53-63].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Lucas, Huang, and Boyce with Hickman to program the system to compute a smoothing score for each point.
Therefore, the combination of Lucas, Huang, and Boyce with Hickman will enable the system to create a deep fully convolutional network for analysis of 3D data in 3D environments [Park: col. 3, line 1-34]. 
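For illustration only, the smoothing operation quoted from Hickman, which moves individual triangle vertices to positions representing averages of connected neighborhood vertices, can be sketched as a single pass over a triangle mesh. This is a minimal sketch, not Hickman's implementation: the mesh representation (a list of vertex coordinates plus a list of vertex-index triples) and the function name `smooth_vertices` are assumptions, and the decimation and error-threshold steps are omitted.

```python
def smooth_vertices(vertices, triangles):
    """One smoothing pass: each vertex that shares a triangle edge with other
    vertices is moved to the average position of those connected vertices.
    Vertices that belong to no triangle are left unchanged."""
    neighbours = {i: set() for i in range(len(vertices))}
    for a, b, c in triangles:
        neighbours[a].update((b, c))
        neighbours[b].update((a, c))
        neighbours[c].update((a, b))

    smoothed = []
    for i, vertex in enumerate(vertices):
        connected = neighbours[i]
        if not connected:
            smoothed.append(tuple(vertex))
            continue
        smoothed.append(tuple(
            sum(vertices[j][axis] for j in connected) / len(connected)
            for axis in range(3)
        ))
    return smoothed


vertices = [(0, 0, 0), (2, 0, 0), (0, 2, 0), (0, 0, 5)]
triangles = [(0, 1, 2)]
smoothed = smooth_vertices(vertices, triangles)
print(smoothed[0])
```

All new positions are computed from the original vertex positions, so the result does not depend on the order in which the vertices are visited.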

Regarding claim 18, Lucas, Huang, and Boyce meet the claim limitations as set forth in claim 11. Lucas further meets the claim limitations as follows.
The encoding device of Claim 1 ((i.e. encoder) [Lucas: col 24, line 51]; (i.e. image processing system) [Lucas: col 24, line 46]), further comprising:
identifying (i.e. feature identifying algorithm that generates scores) [Lucas: col 12, line 6-7] a final score ((i.e. the final optima maximum CENSUS score point) [Lucas: col 15, line 8]; (i.e. Thereafter, process 300 may include "provide an expanded and filtered point cloud to be used to generate images" 316. The final point cloud then may be provided first for post-processing to refine the points, which may include traditional space carving, as described below, and then for modeling, display, or analysis as needed depending on the application and as described below as well) [Lucas: col 10, line 2-9]) for each of the points of the 3D point cloud ((i.e. determining the L1 (normalized) color difference (SAD score) between the rendered image and the original image for each perspective or camera. Other options include a different metric for image comparison, such as L2 color difference, normalized correlation, normalize mutual information, etc .....) [Lucas: col 19, line 49-54] ; (i.e. A bracket search is used again to assign a depth estimate near the pivot point to the neighbor point, and by one form, by determining the CENSUS score within the bracket. These neighbor points then become 3D particles that populate the latest expanded point cloud, and each such neighbor point then becomes its own LPV) [Lucas: col 8, line 52-58]; (i.e. Stereo-matching confidence scores based on local region descriptors for image data (also referred to herein as representations) are then used to select the best depth estimate for the point being analyzed. By one form, this involves an initial selection by using a gradient histogram-type of local region descriptor such as a DAISY score) [Lucas: col 6, line 21-28]) with respect to the multiple projection planes ((i.e. A stereo matching technique is then applied to perform depth estimation for seeds in the initial point cloud. Rays are traced from the camera center of a first view or image, through a seed point on the first image. 
Ray positions are projected onto the view of another overlapping second camera, or second view or image, to determine a linear range or bracket of potential depth estimates along the ray and within the second view or image. By one form, the process is repeated for all pairs of cameras with overlapping fields of view, although other alternatives could be used) [Lucas: col 8, line 4-14]; (i.e. a set of points contained within the silhouette of the visual hull when reprojected into individual camera views) [Lucas: col 12, line 14-16]; (i.e. the LPV or sphere will have its 1 cm diameter cover that pixel length of 5, 10, 100, and so on. Spheres are the selected shape due to software rendering efficiency so that their projection onto any camera image is not less than a pixel in width and when rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 42-48]), wherein the final score ((i.e. the final optima maximum CENSUS score point) [Lucas: col 15, line 8]; (i.e. Thereafter, process 300 may include "provide an expanded and filtered point cloud to be used to generate images" 316. The final point cloud then may be provided first for post-processing to refine the points, which may include traditional space carving, as described below, and then for modeling, display, or analysis as needed depending on the application and as described below as well) [Lucas: col 10, line 2-9]) is based on a weighted combination of the normal score and the smoothing score and relates each of the points to a particular projection plane of the multiple projection planes ((i.e. a set of points contained within the silhouette of the visual hull when reprojected into individual camera views) [Lucas: col 12, line 14-16]; (i.e. This may include "compare LPV to corresponding points in 2D images" 482, which in turn, involves "render a visible LPV into a rendered 2D image including any other particles within the volume of the LPV" 482-1. 
Thus, a current LPV on the latest expanded point cloud and the particles of other LPVs within the volume of the current LPV are projected to rendered a 2D image, one for each camera ( or perspective or different view) of the multiple cameras.) [Lucas: col 18, line 67 – col. 19, line 7]); and grouping each point of the 3D point cloud ((i.e. Process 400 may include "generate point cloud by combining the LPVs") [Lucas: col 18, line 55-56]; (i.e. constructing the point cloud by combining the local point volumes) [Lucas: col 30, line 17-18]; (i.e. a collection of points or pixel locations from a single image that are assigned depth values is referred to herein as a depth map of a single image, while a 3D object formed by combining multiple images from different perspectives is represented as a point cloud comprised of an unstructured collection of points with associated colors and normal directions.) [Lucas: col 5, line 18-24]; (i.e. rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 47-48]; (i.e. one stereo technique uses a patch to determine if points on one image match points on another image rather than the more conventional scanline matching. The patches are better for capturing similar pixel data on images of two different perspectives) [Lucas: col 5, line 28-32]) to one of the multiple projection planes ((i.e. A stereo matching technique is then applied to perform depth estimation for seeds in the initial point cloud. Rays are traced from the camera center of a first view or image, through a seed point on the first image. Ray positions are projected onto the view of another overlapping second camera, or second view or image, to determine a linear range or bracket of potential depth estimates along the ray and within the second view or image. 
By one form, the process is repeated for all pairs of cameras with overlapping fields of view, although other alternatives could be used) [Lucas: col 8, line 4-14]; (i.e. a set of points contained within the silhouette of the visual hull when reprojected into individual camera views) [Lucas: col 12, line 14-16]; (i.e. the LPV or sphere will have its 1 cm diameter cover that pixel length of 5, 10, 100, and so on. Spheres are the selected shape due to software rendering efficiency so that their projection onto any camera image is not less than a pixel in width and when rendered in combination with neighboring LPV's, there are no gaps between the points) [Lucas: col 17, line 42-48]; (i.e. Stereo-matching confidence scores are then used to select the best depth estimate for the seed point being analyzed. By one form, and as mentioned above, this involves an initial selection by using the DAISY score, while the depth estimate is then refined using the CENSUS score.) [Lucas: col 8, line 15-20]; (i.e. To improve localization of landmarks, the highest score of a CENSUS metric using a 7x7 pixel patch with a bracketed line search is performed where the local CENSUS search bracket 1032 and 1106 are respectively shown on FIGS. 10 and 11. CENSUS is another stereo-matching descriptor and refers to a non-parametric intensity-based image data representation over a certain pixel area referred to as a CENSUS transform that summarizes local image structure by providing a bit string (or in other words, transforms image data into a representation). The CENSUS transform represents a set of neighboring pixels within some pixel diameter (such as all adjacent pixels) whose intensity is less than the intensity of a central or other key pixel referred to herein as a pivot pixel) [Lucas: col 14, line 32-45]) based on the final score ((i.e. the final optima maximum CENSUS score point) [Lucas: col 15, line 8]; (i.e. 
Thereafter, process 300 may include "provide an expanded and filtered point cloud to be used to generate images" 316. The final point cloud then may be provided first for post-processing to refine the points, which may include traditional space carving, as described below, and then for modeling, display, or analysis as needed depending on the application and as described below as well) [Lucas: col 10, line 2-9]), to generate the refined patches ((i.e. Referring to FIGS. 1-2, stereo methods perform well on scenes with highly textured surfaces even though segmentation masks are not necessarily highly accurate. For instance, one stereo technique uses a patch to determine if points on one image match points on another image rather than the more conventional scanline matching. The patches are better for capturing similar pixel data on images of two different perspectives) [Lucas: col 5, line 25-32]; (i.e. The CENSUS score may be determined by hamming distance between the two bit strings of corresponding pixel location patches on two different images) [Lucas: col 14, line 55-58]).  
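For illustration only, the CENSUS scoring quoted above from Lucas (a bit string marking neighboring pixels darker than a pivot pixel, compared across two images by Hamming distance) may be sketched as follows. The function names, the square-patch representation as nested lists, and the 3x3 example patches are assumptions made for this sketch and are not taken from Lucas, which describes a 7x7 pixel patch.

```python
def census_bits(patch):
    """Return one bit per non-center pixel: 1 if that pixel is darker than the center (pivot)."""
    size = len(patch)
    mid = size // 2
    pivot = patch[mid][mid]
    bits = []
    for r, row in enumerate(patch):
        for c, value in enumerate(row):
            if r == mid and c == mid:
                continue  # the pivot pixel itself contributes no bit
            bits.append(1 if value < pivot else 0)
    return bits


def hamming_distance(bits_a, bits_b):
    """Count the positions at which two equal-length bit strings differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


# Hypothetical 3x3 intensity patches from two views of the same location.
patch_view_1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
patch_view_2 = [[9, 2, 3], [4, 5, 6], [7, 8, 1]]
distance = hamming_distance(census_bits(patch_view_1), census_bits(patch_view_2))
```

A smaller distance indicates that the two patches have a more similar local structure; how the distance is converted into a score to be maximized is not specified in the quoted passages and is left open here.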
Lucas, Huang, and Boyce do not explicitly disclose the following claim limitations (Emphasis added).
The method of Claim 11, further comprising: identifying a final score for each of the points of the 3D point cloud with respect to the multiple projection planes, wherein the final score is based on a weighted combination of the normal score and the smoothing score and relates each of the points to a particular projection plane of the multiple projection planes; and grouping each point of the 3D point cloud to one of the multiple projection planes based on the final score, to generate the refined patches.
However, in the same field of endeavor Hickman further discloses the claim limitations and the deficient claim limitations, as follows:
wherein the final score is based on a weighted combination of the normal score and the smoothing score (i.e. the computing device may be configured to determine a respective score for each of the features (e.g., geometry, surface textures, color, shape, material, etc.) of the object and further may be configured to determine a single feature-based score based on respective scores of the features (e.g., a weighted average of the respective scores)) [Hickman: col 9, line 24-30] 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Lucas, Huang, and Boyce with Hickman to program the system to implement the weighted sum method.  
Therefore, the combination of Lucas, Huang, and Boyce with Hickman will enable the system to provide 3D visualization of the object for analysis of 3D data in 3D environments [Park: col. 12, line 60 – col. 13, line 10; col. 3, line 1-34].
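
For illustration only, the weighted combination recited in the claim limitation may be sketched as follows. The sketch is not drawn from Lucas, Huang, Boyce, or Hickman. The six axis-aligned projection planes, the use of the cosine of the angle (dot product) as the normal score, the neighbor-vote form of the smoothing score, and the default weight of 0.5 are all assumptions chosen to make the example concrete.

```python
import math

# Assumed unit normals of six axis-aligned projection planes surrounding the point cloud.
PLANE_NORMALS = [
    (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0),
    (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, -1.0),
]


def normal_scores(normal):
    """Sub-scores: cosine of the angle between the point normal and each plane normal."""
    length = math.sqrt(sum(c * c for c in normal))
    unit = tuple(c / length for c in normal)
    return [sum(u * p for u, p in zip(unit, plane)) for plane in PLANE_NORMALS]


def smoothing_scores(neighbor_planes):
    """Hypothetical smoothing score: fraction of neighboring points assigned to each plane."""
    counts = [0] * len(PLANE_NORMALS)
    for plane_index in neighbor_planes:
        counts[plane_index] += 1
    total = max(len(neighbor_planes), 1)  # avoid division by zero when there are no neighbors
    return [count / total for count in counts]


def assign_plane(normal, neighbor_planes, weight=0.5):
    """Group a point to the plane with the highest weighted final score."""
    final = [
        n + weight * s
        for n, s in zip(normal_scores(normal), smoothing_scores(neighbor_planes))
    ]
    return max(range(len(final)), key=final.__getitem__)
```

Under these assumptions, a point whose normal is nearly equidistant between two planes is grouped by its normal alone when it has no neighbors, and is pulled toward the plane of its neighbors once the smoothing term is weighted in.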

Regarding claim 20, Lucas, Huang, and Boyce meet the claim limitations as set forth in claim 11. Lucas further meets the claim limitations as follows.
The method of Claim 11 (i.e. a method) [Lucas: col 1, line 54], further comprising: identifying each voxel of the multiple voxels that include at least one of the points of the 3D point cloud (i.e. visual hull is the 3D shape of an object or objects in the captured images defined by outer boundary or silhouette of object(s) from each camera into 3D space. By one form, the space carving is performed on a voxel-level. The space carving is desirable because the space carving methods can infer shapes of occluded surfaces and surfaces with no or little observable texture) [Lucas: col 4, line 52-59]; and generating an index that identifies the points of the 3D point cloud that are within each of the multiple voxels.
Lucas, Huang, and Boyce do not explicitly disclose the following claim limitations (Emphasis added).
The method of Claim 11, further comprising: identifying each voxel of the multiple voxels that include at least one of the points of the 3D point cloud; and generating an index that identifies the points of the 3D point cloud that are within each of the multiple voxels.
However, in the same field of endeavor Hickman further discloses the claim limitations and the deficient claim limitations, as follows:
identifying each voxel of the multiple voxels (i.e. Identified structures can be used to generate 3D models that can be viewed, for example, using 3D Computer Aided Design (CAD) tools. In one example, a 3D geometric model in the form of a triangular surface mesh may be generated. In another example, the model is in voxels and a marching cubes algorithm may be applied to convert the voxels into a mesh, which can undergo a smoothing operation to reduce jaggedness on surfaces of the 3D object data model caused by conversion by the marching cubes algorithm. An example smoothing operation may move individual triangle vertices to positions representing averages of connected neighborhood vertices to reduce angles between triangles in the mesh) [Hickman: col 13, line 39-51] that include at least one of the points of the 3D point cloud ((i.e. The computing device may be configured to generate the 3D object data model of the object by estimating 3D coordinates of points on the object.) [Hickman: col 13, line 32-34]; (i.e. FIG. 6 illustrates conceptual example surface normals of points on a surface 602 of an object in relation to the image capture device 404, in accordance with an embodiment) [Hickman: col 12, line 28-30; Fig. 6]; (i.e. In one example, 3D object data model generation may further include application of a decimation operation to the smoothed mesh to eliminate data points) [Hickman: col 13, line 52-54]); and generating an index (i.e. perform texture resampling and also shape-based indexing. For example, for each object, the semantics and search index 114 may index or label components of the images (e.g., per pixel) as having a certain texture, color, shape, geometry, attribute, etc. The semantics and search index 114 may receive the 3D object data model file or the files comprising the 3D object data model from the model builder 110 or the object data model processor 112, and may be configured to label portions of the file or each file individually
with identifiers related to attributes of the file. In some examples, the semantics and search index 114 may be configured to provide annotations for aspects of the 3D object data models. For instance, an annotation may be provided to label or index aspects of color, texture, shape, appearance, description, function, etc., of an aspect of a 3D object data model. Annotations may be used to label any aspect of an image or 3D object data model, or to provide any type of information) [Hickman: col 5, line 5-23] that identifies the points of the 3D point cloud that are within each of the multiple voxels (i.e. The computing device may be configured to generate the 3D object data model of the object by estimating 3D coordinates of points on the object. The coordinates may be determined by measurements made in the respective images. Common points may be identified on each image. A line of sight (or ray) can be constructed from a camera location to a point on the object. Intersection of these rays (triangulation) may determine a 3D location or coordinates of the point. Identified structures can be used to generate 3D models that can be viewed, for example, using 3D Computer Aided Design (CAD) tools. In one example, a 3D geometric model in the form of a triangular surface mesh may be generated. In another example, the model is in voxels and a marching cubes algorithm may be applied to convert the voxels into a mesh, which can undergo a smoothing operation to reduce jaggedness on surfaces of the 3D object data model caused by conversion by the marching cubes algorithm. An example smoothing operation may move individual triangle vertices to positions representing averages of connected neighborhood vertices to reduce angles between triangles in the mesh) [Hickman: col 13, line 32-51].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Lucas, Huang, and Boyce with Hickman to program the system to implement the shape indexing algorithm.  
Therefore, the combination of Lucas, Huang, and Boyce with Hickman will enable the system to create a deep fully convolutional network for analysis of 3D data in 3D environments [Park: col. 3, line 1-34]. 
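
For illustration only, the claimed step of generating an index that identifies the points within each occupied voxel may be sketched as follows. The sketch is not drawn from any cited reference. The uniform cubic grid, the `voxel_size` parameter, and the dictionary keyed by integer voxel coordinates are assumptions made for this example.

```python
import math
from collections import defaultdict


def build_voxel_index(points, voxel_size):
    """Map each occupied voxel (integer grid coordinates) to the indices of the points inside it."""
    index = defaultdict(list)
    for point_id, (x, y, z) in enumerate(points):
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        index[key].append(point_id)
    return dict(index)  # only voxels containing at least one point appear as keys


# Hypothetical point cloud with four points and unit-sized voxels.
cloud = [(0.1, 0.2, 0.3), (0.9, 0.9, 0.9), (1.5, 0.0, 0.0), (-0.1, 0.0, 0.0)]
voxel_index = build_voxel_index(cloud, 1.0)
```

Because empty voxels never receive a key, the keys of the resulting dictionary correspond to the voxels that include at least one point, and each value lists the points that fall within that voxel.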

Reference Notice 
Additional prior art, included in the Notice of References Cited, made of record and not relied upon, is considered pertinent to applicant's disclosure.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Philip Dang, whose telephone number is (408) 918-7529.  The examiner can normally be reached Monday-Thursday between 8:30 am and 5:00 pm (PST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sath Perungavoor, can be reached at 571-272-7455.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications