Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 1, 8, and 15 recite “detecting correspondences across the set of depth images by using the descriptors; refining camera intrinsic and extrinsic parameters for each image of the set of depth images using bundle-adjustment”. The relationship between the detecting step and the refining step is unclear. The limitations do not recite how the detected correspondences are incorporated into the bundle adjustment, and it is therefore unclear how the detected correspondences are used to refine the camera parameters.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 8-12, and 15-19 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. (US 2010/0329358) in view of Li et al. (Pattern Recognition Letters, 2008 Elsevier) in view of Bae et al. (US 9,036,044) in view of Schmalstieg et al. (US 2014/0323148).
Regarding claim 1, Zhang et al. (hereinafter Zhang) discloses a method of rendering new images (Zhang, [0021], “The individual frames of the macro-frames are then synthesized into single images and shown in sequence on display 116”), comprising:
acquiring a set of images using one or more image cameras (Zhang, [0021], “the video cameras 104 each capture images or frames 105 at some given increment of time or rate”);
acquiring a set of depth images using one or more depth cameras (Zhang, [0029], “The depth map 200 may be obtained in various ways. For example, the depth map 200 may be obtained by a scanning laser, by analysis of one or more frames of the video cameras (stereoscopic type depth analysis), or other available depth-sensing means”);

converting the set of images acquired from the one or more image cameras into texture maps (Zhang, [0032], “The mesh model 228 is also projected 274 to the virtual rendering viewpoint 224 using multi-texture blending… For each vertex in the synthetic image being rendered, the vertex is projected to the nearby captured frames to locate the corresponding texture coordinate”); and
rendering new images from the geometric models and texture maps using a geometry-based rendering technique (Zhang, [0033], “Rendered image 292 is an example of a synthetic image rendered at the virtual viewpoint 224 using image data from frames 190”);
while Zhang teaches the set of depth images, Zhang does not expressly disclose “detecting points in the set of depth images under viewpoint and lighting variations”;
Li et al. (hereinafter Li) discloses detecting points in an image under viewpoint and lighting variations (Li, 3. Review of SIFT, [0002], “potential feature points are detected by searching over all scales and image locations”. In addition, in paragraph [0003], “This information allows candidate feature points to be rejected that have low contrast (and are therefore sensitive to noise) or are poorly localized along an edge. Sub-sample”. Low contrast reads on lighting variation. Fig. 2 illustrates viewpoint).
Li discloses generating a descriptor for each point based on its local neighborhood (Li, 3. Review of SIFT, [0005], “the feature descriptor is created by sampling the magnitudes and orientations of the image gradients in the 16x16 neighboring region around the point”).
Li further discloses detecting correspondences across the images by using the descriptors (Li, Fig. 2(c), “The matched elliptical region corresponding to the circle in (a) after an affine transformation”).

In addition, though Zhang teaches each image of the set of depth images, Zhang as modified by Li does not expressly disclose “refining camera intrinsic and extrinsic parameters for each image of the set of depth images using bundle-adjustment”;
Bae et al. (hereinafter Bae) discloses refining camera intrinsic and extrinsic parameters for images using bundle-adjustment (Bae, col. 1, ll. 13-23, “bundle adjustment can be used to identify and/or refine extrinsic camera parameters, such as position and orientation of the camera used to capture an image (i.e. the pose of the image). Bundle adjustment can also be used to identify and/or refine intrinsic camera parameters, such as principal point, focal length, lens distortion, skewness, etc.”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Bae’s bundle adjustment to refine camera intrinsic and extrinsic parameters in the multi-view video system, as taught by Zhang. The motivation for doing so would have been providing consistency in the connectivity among a set of two-dimensional images taken of a scene.
Furthermore, though Zhang teaches converting the depth images to the geometric models, Zhang as modified by Li and Bae does not expressly disclose “using a patch-based multi-view stereo for dense point cloud and a multi-view reconstruction”;
Schmalstieg et al. (hereinafter Schmalstieg) discloses using a patch-based multi-view stereo for dense point cloud and a multi-view reconstruction (Schmalstieg, [0055], “The WAL Server can use Patch-based Multi View Stereo algorithms to create denser surface point clouds from both the Server Map and the SLAM Map”).

Regarding claim 2, Zhang discloses rasterization (Zhang, [0032], “each pixel of the depth map corresponds to a vertex of the mesh model”).
Regarding claim 3, Zhang discloses ray tracing (Zhang, [0031], “Each such light ray is projected or traced 248 to the surface of the depth map 200 to obtain an intersection therewith”).
Regarding claim 4, Zhang discloses the scene object is static and one model and set of textures is converted for that scene object (Zhang, [0030], “in some embodiments described later, a mesh model 228 may also be modeled at the location 225 of the real-world scene or subject”. The subject can be static. Fig. 5).
Regarding claim 5, Zhang discloses the scene object is animated (Zhang, Fig. 4 shows the object is animated).
Regarding claim 8, Zhang discloses a system for creating a rendering of images (Zhang, [0023], “The system shown in FIG. 1 may be used for immersive tele-conferencing”) comprising:
one or more processors (Zhang, [0022], “The terminals 100, 112 may be ordinary desktop computers equipped with the necessary peripheral devices, memory, CPU, network interfaces, etc. The terminals 100, 112 may also be teleconference terminals, possibly equipped with digital signal processors”); and
a memory coupled with the one or more processors, the memory configured to store instructions that when executed by the one or more processors (Zhang, [0039], “This is also deemed to include at least volatile memory such as RAM and/or virtual memory storing information such as CPU instructions during execution of a program carrying out an embodiment”).
The remaining limitations are similar in scope to the method recited in claim 1 and therefore are rejected under the same rationale.
Regarding claims 9-12, claims 9-12 recite functions performed by a processor that are similar in scope to the method recited in claims 2-5 and therefore are rejected under the same rationale.
Regarding claim 15, Zhang discloses a non-transitory computer-readable storage medium having stored thereon instructions for causing at least one computer system to create a rendering of images (Zhang, [0039], “Embodiments and features discussed above can be realized in the form of information stored in volatile or non-volatile computer or device readable media…This is also deemed to include at least volatile memory such as RAM and/or virtual memory storing information such as CPU instructions during execution of a program carrying out an embodiment”).
The limitations are similar in scope to the method recited in claim 1 and therefore are rejected under the same rationale.
Regarding claims 16-19, claims 16-19 recite functions performed by a processor that are similar in scope to the method recited in claims 2-5 and therefore are rejected under the same rationale.

Claims 6, 13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. (US 2010/0329358) in view of Li et al. in view of Bae et al. (US 9,036,044) in view of Schmalstieg et al. (US 2014/0323148), as applied to claims 5, 12, and 15, in further view of Brown et al. (US 2012/0281873).
Regarding claim 6, though Zhang teaches the geometric model and texture map, Zhang as modified by Li, Bae, and Schmalstieg is silent with respect to “every frame of the animated scene object has a separate geometric model and texture map”;
Brown et al. (hereinafter Brown) discloses every frame of an animated scene object has a separate geometric model and texture map (Brown, [0033], “At 410 3D mesh models are initialized and used to populate the tracked objects with appropriate 3D models, for example, a walking person 3D model for a object person detected on a causeway...at 412 a real-time 3D projection of the object from the camera feed into the 3D environment generates an AVE that is both realistic and immersive by using the motion of the 2D object to drive motion of the 3D volumetric-based object model rendered with the texture of the 2D object projected thereon, and within a 3D context”).

Regarding claim 13, claim 13 recites a function performed by a processor that is similar in scope to the method recited in claim 6 and therefore is rejected under the same rationale.
Regarding claim 20, claim 20 recites a function performed by a processor that is similar in scope to the method recited in claim 6 and therefore is rejected under the same rationale.

Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. (US 2010/0329358) in view of Li et al. in view of Bae et al. (US 9,036,044) in view of Schmalstieg et al. (US 2014/0323148), as applied to claims 5 and 12, in view of Brown et al. (US 2012/0281873) in further view of Guenter et al. (US 6,072,496).
Regarding claim 7, while Zhang as modified by Li, Bae, Schmalstieg, and Brown, with the same motivation as in claim 6, teaches that every frame of the animated scene object has a separate geometric model, they do not expressly disclose “the animated scene object has a single texture map”;
Guenter et al. (hereinafter Guenter) discloses “an animated scene object has a single texture map” (Guenter, col. 5, ll. 38-40, “It then merges camera images from the multiple cameras into a single texture map per frame”).
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to incorporate Guenter’s creation of a single texture map by merging camera images from multiple cameras into the multi-view rendering system taught by Zhang as modified by Li, Bae, Schmalstieg, and Brown. The motivation for doing so would have been both the time varying 
Regarding claim 14, claim 14 recites a function performed by a processor that is similar in scope to the method recited in claim 7 and therefore is rejected under the same rationale.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KYLE ZHAI whose telephone number is (571) 270-3740. The examiner can normally be reached on 9AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ke Xiao, can be reached on (571) 272-7776. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/KYLE ZHAI/
Primary Examiner, Art Unit 2612