DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Priority

Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.


Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “first acquisition unit configured to acquire”, “second acquisition unit configured to acquire”, “generation unit configured to generate”, “coloring unit configured to generate”, “information processing apparatus…which performs”, “image processing apparatus…which generates”, “limiting unit configured to generate” and “output unit configured to output” in claims 1, 8, 12, 13 and 15.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.


Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-3, 5 and 14-18 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Coene et al. (US 2013/0016097) and Wetzel et al. (US 8,811,811).
Regarding claim 1, Coene et al. discloses an image processing system comprising: 
a first acquisition unit configured to acquire a foreground mask from a captured image acquired by capturing an object with an image capturing unit (“A first step used for object identification and positioning is the detection of foreground pixels in the camera input frames, i.e. pixels corresponding to foreground objects. The result of this step is a binary mask 45 for each input frame of each physical camera, which discriminates the foreground objects from the background” at paragraph 0072, line 1); 
a generation unit configured to generate shape data representing a three-dimensional shape of the object based on the foreground mask (“The above mentioned masks play a role in two parts in the creation of virtual viewpoints: first, they may be used to determine the location of the foreground objects in 3D space, e.g. where they are exactly located in the scene; secondly, they may be used to confine the area where the plane sweeping process looks for information” at paragraph 0011, line 1; see also abstract).
Coene et al. does not explicitly disclose that the image capturing unit is one whose exposure value is set relatively higher or lower than that of another image capturing unit, a second acquisition unit configured to acquire an inappropriate area mask by detecting an area whose exposure value is inappropriate in the captured image and the generation unit configured to generate shape data representing a three-dimensional shape of the object based on the inappropriate area mask.
Wetzel et al. teaches an image processing system comprising:
an image capturing unit whose exposure value is set relatively higher or lower than that of another image capturing unit (“One camera, for example, covers a dark brightness range, and thus records dark areas of an image, a second camera records a medium brightness range, and thus records the medium brightness areas of the image, and a third camera records a bright brightness range, and thus records the bright areas of the image” at col. 7, line 5);
a second acquisition unit configured to acquire an inappropriate area mask by detecting an area whose exposure value is inappropriate in the captured image (“The processor 130 may be configured to determine for each pixel of the plurality of pixels of the central image, whether said pixel is overexposed or underexposed, and, depending on whether said pixel of the central image is overexposed or underexposed” at col. 14, line 62; while this is not explicitly a mask, the extracted pixel areas could easily be used to form a mask to identify all the areas of under or over exposure); and
a generation unit configured to generate shape data representing a three-dimensional shape of the object based on the inappropriate area mask (“Finally, the cut out extension image 650 and the image of the central camera 660 are merged to form an HDR recording, possibly considering the camera response functions. All overexposed and underexposed areas which would impair merging have been removed, before, in step 655” at col. 20, line 14; as the under and over exposure areas are removed from the final image, this is analogous to using an overexposure/underexposure mask being combined with the object image data).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to eliminate the improperly exposed areas as taught by Wetzel et al. from the image data of Coene et al. to produce a better quality final output.
Regarding claim 2, the Coene et al. and Wetzel et al. combination discloses a system wherein the generation unit generates the shape data based on a limiting mask obtained as a logical sum of the foreground mask and the inappropriate area mask (“A first step used for object identification and positioning is the detection of foreground pixels in the camera input frames, i.e. pixels corresponding to foreground objects. The result of this step is a binary mask 45 for each input frame of each physical camera, which discriminates the foreground objects from the background. Such mask is illustrated in FIG. 11. The binary masks are constructed from the original camera input streams and the matching background streams obtained in the previous step, on a frame by frame basis. From the binary masks and the original camera frames, the foreground frames may then be constructed, as illustrated in FIG. 12” Coene et al. at paragraph 0072; “Finally, the cut out extension image 650 and the image of the central camera 660 are merged to form an HDR recording, possibly considering the camera response functions. All overexposed and underexposed areas which would impair merging have been removed, before, in step 655” Wetzel et al. at col. 20, line 14; as the under and over exposure areas are removed from the final image, this is analogous to using an overexposure/underexposure mask being combined with the object image data).
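A minimal sketch of the "logical sum" recited in claim 2, assuming both masks are binary arrays of equal size in which 1 marks a foreground pixel or an improperly exposed pixel, respectively. The function name `logical_sum` and the sample values are hypothetical and are not taken from Coene et al., Wetzel et al., or the application.

```python
def logical_sum(foreground_mask, inappropriate_area_mask):
    """Pixel-wise OR of two binary masks given as lists of rows (assumed layout)."""
    if len(foreground_mask) != len(inappropriate_area_mask):
        raise ValueError("masks must have the same number of rows")
    limiting_mask = []
    for fg_row, bad_row in zip(foreground_mask, inappropriate_area_mask):
        if len(fg_row) != len(bad_row):
            raise ValueError("mask rows must have the same length")
        # A pixel is set in the result when it is set in either input mask.
        limiting_mask.append([fg | bad for fg, bad in zip(fg_row, bad_row)])
    return limiting_mask


# Hypothetical 2x3 masks used only for illustration.
foreground = [[0, 1, 1],
              [0, 0, 1]]
inappropriate = [[0, 0, 1],
                 [1, 0, 0]]
print(logical_sum(foreground, inappropriate))
```

Under these assumptions the result keeps every foreground pixel and additionally sets every pixel flagged as improperly exposed.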
Regarding claim 3, the Coene et al. and Wetzel et al. combination discloses a system wherein the generation unit generates, in a case where the foreground mask corresponding to the area whose exposure value is inappropriate cannot be used as a silhouette mask for shape data generation, the shape data by using the limiting mask (“Finally, the cut out extension image 650 and the image of the central camera 660 are merged to form an HDR recording, possibly considering the camera response functions. All overexposed and underexposed areas which would impair merging have been removed, before, in step 655” Wetzel et al. at col. 20, line 14; areas that correspond to improper exposure are eliminated from the final output).
Regarding claim 5, the Coene et al. and Wetzel et al. combination discloses a system wherein the second acquisition unit acquires the inappropriate area mask based on a background image generated from the captured image (“The next step is the subtraction of the original input frames and the corresponding (equalized) background frames. For this step, absolute differences may be taken for each of the color components of the pixels. The resulting difference frames give a first indication of where the foreground objects may be located for each camera” Coene et al. at paragraph 0075; “In step 645, the needed image areas are cut out from the merged extension image 640 depending on a target perspective image 603 (the image recorded by the central camera) to obtain the relevant image areas 650. Moreover, in step 655, valid areas are cut out from the target perspective image 603 to obtain a target perspective image without overexposure and without underexposure 660” Wetzel et al. at col. 19, line 34; the foreground area is determined by the background image and then subjected to masking out) or an average image obtained by averaging each pixel of a background image during a predetermined period, which is generated from the captured image.
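The background subtraction quoted from Coene et al. at paragraph 0075 (absolute differences taken for each color component) can be sketched as follows. This is an illustrative sketch only: the function name `foreground_mask`, the use of the largest per-component difference, and the threshold that turns the difference frame into a binary mask are assumptions and are not stated in the quoted passage.

```python
def foreground_mask(frame, background, threshold):
    """Return a binary mask; 1 where a pixel differs from the background.

    frame and background are lists of rows of (R, G, B) tuples (assumed layout).
    """
    mask = []
    for frame_row, background_row in zip(frame, background):
        mask_row = []
        for pixel, background_pixel in zip(frame_row, background_row):
            # Absolute difference for each color component.
            differences = [abs(c - b) for c, b in zip(pixel, background_pixel)]
            # Assumption: a pixel counts as foreground when any component
            # differs from the background by more than the threshold.
            mask_row.append(1 if max(differences) > threshold else 0)
        mask.append(mask_row)
    return mask


# Hypothetical 2x2 frames used only for illustration.
background = [[(10, 10, 10), (10, 10, 10)],
              [(10, 10, 10), (10, 10, 10)]]
frame = [[(10, 12, 9), (200, 180, 90)],
         [(11, 10, 10), (10, 10, 60)]]
print(foreground_mask(frame, background, threshold=20))
```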
Regarding claim 14, the Coene et al. and Wetzel et al. combination discloses a system wherein 
images from a plurality of viewpoints captured by a plurality of the image capturing units are acquired as the captured images (“A virtual camera system in accordance with embodiments of the present invention comprises a plurality of physical cameras providing input images” Coene et al. at paragraph 0045, line 1; “One camera, for example, covers a dark brightness range, and thus records dark areas of an image, a second camera records a medium brightness range, and thus records the medium brightness areas of the image, and a third camera records a bright brightness range, and thus records the bright areas of the image” Wetzel et al. at col. 7, line 5) and
the image processing system generates a virtual viewpoint image by using the images from the plurality of viewpoints and the shape data generated by the generation unit (“create from the input images output images corresponding to virtual viewpoints for the virtual camera. Virtual camera views are generated from images captured by the real cameras” Coene et al. at paragraph 0045, line 4).
Regarding claim 15, the Coene et al. and Wetzel et al. combination discloses a system wherein an information processing apparatus comprising a limiting unit configured to generate a limiting mask that limits the foreground mask to a specific area based on the foreground mask acquired by the first acquisition unit and the inappropriate area mask acquired by the second acquisition unit has an output unit configured to output the limiting mask to the generation unit (“A first step used for object identification and positioning is the detection of foreground pixels in the camera input frames, i.e. pixels corresponding to foreground objects. The result of this step is a binary mask 45 for each input frame of each physical camera, which discriminates the foreground objects from the background. Such mask is illustrated in FIG. 11. The binary masks are constructed from the original camera input streams and the matching background streams obtained in the previous step, on a frame by frame basis. From the binary masks and the original camera frames, the foreground frames may then be constructed, as illustrated in FIG. 12” Coene et al. at paragraph 0072; “Finally, the cut out extension image 650 and the image of the central camera 660 are merged to form an HDR recording, possibly considering the camera response functions. All overexposed and underexposed areas which would impair merging have been removed, before, in step 655” Wetzel et al. at col. 20, line 14).
Regarding claim 16, the Coene et al. and Wetzel et al. combination discloses a system wherein the first acquisition unit generates the foreground mask (“A first step used for object identification and positioning is the detection of foreground pixels in the camera input frames, i.e. pixels corresponding to foreground objects. The result of this step is a binary mask 45 for each input frame of each physical camera, which discriminates the foreground objects from the background” Coene et al. at paragraph 0072, line 1) and the second acquisition unit generates the inappropriate area mask (“The processor 130 may be configured to determine for each pixel of the plurality of pixels of the central image, whether said pixel is overexposed or underexposed, and, depending on whether said pixel of the central image is overexposed or underexposed” Wetzel et al. at col. 14, line 62; while this is not explicitly a mask, the extracted pixel areas could easily be used to form a mask to identify all the areas of under or over exposure).
Regarding claim 17, Coene et al. discloses an image processing method comprising: 
acquiring a foreground mask from a captured image acquired by capturing an object with an image capturing unit (“A first step used for object identification and positioning is the detection of foreground pixels in the camera input frames, i.e. pixels corresponding to foreground objects. The result of this step is a binary mask 45 for each input frame of each physical camera, which discriminates the foreground objects from the background” at paragraph 0072, line 1); 
generating shape data representing a three-dimensional shape of the object based on the foreground mask (“The above mentioned masks play a role in two parts in the creation of virtual viewpoints: first, they may be used to determine the location of the foreground objects in 3D space, e.g. where they are exactly located in the scene; secondly, they may be used to confine the area where the plane sweeping process looks for information” at paragraph 0011, line 1; see also abstract).
Coene et al. does not explicitly disclose that the image capturing unit is one whose exposure value is set relatively higher or lower than that of another image capturing unit, acquiring an inappropriate area mask by detecting an area whose exposure value is inappropriate in the captured image and generating shape data representing a three-dimensional shape of the object based on the inappropriate area mask.
Wetzel et al. teaches an image processing method comprising:
an image capturing unit whose exposure value is set relatively higher or lower than that of another image capturing unit (“One camera, for example, covers a dark brightness range, and thus records dark areas of an image, a second camera records a medium brightness range, and thus records the medium brightness areas of the image, and a third camera records a bright brightness range, and thus records the bright areas of the image” at col. 7, line 5);
acquiring an inappropriate area mask by detecting an area whose exposure value is inappropriate in the captured image (“The processor 130 may be configured to determine for each pixel of the plurality of pixels of the central image, whether said pixel is overexposed or underexposed, and, depending on whether said pixel of the central image is overexposed or underexposed” at col. 14, line 62; while this is not explicitly a mask, the extracted pixel areas could easily be used to form a mask to identify all the areas of under or over exposure); and
generating shape data representing a three-dimensional shape of the object based on the inappropriate area mask (“Finally, the cut out extension image 650 and the image of the central camera 660 are merged to form an HDR recording, possibly considering the camera response functions. All overexposed and underexposed areas which would impair merging have been removed, before, in step 655” at col. 20, line 14; as the under and over exposure areas are removed from the final image, this is analogous to using an overexposure/underexposure mask being combined with the object image data).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to eliminate the improperly exposed areas as taught by Wetzel et al. from the image data of Coene et al. to produce a better quality final output.
Regarding claim 18, Coene et al. discloses a non-transitory computer readable storage medium storing a program for causing a computer to execute an image processing method (“The present invention also includes a computer program product which provides the functionality of any of the methods according to the present invention when executed on a computing device. Such computer program product can be tangibly embodied in a carrier medium carrying machine-readable code for execution by a programmable processor” at paragraph 0017, line 1), the image processing method comprises: 
acquiring a foreground mask from a captured image acquired by capturing an object with an image capturing unit (“A first step used for object identification and positioning is the detection of foreground pixels in the camera input frames, i.e. pixels corresponding to foreground objects. The result of this step is a binary mask 45 for each input frame of each physical camera, which discriminates the foreground objects from the background” at paragraph 0072, line 1); 
generating shape data representing a three-dimensional shape of the object based on the foreground mask (“The above mentioned masks play a role in two parts in the creation of virtual viewpoints: first, they may be used to determine the location of the foreground objects in 3D space, e.g. where they are exactly located in the scene; secondly, they may be used to confine the area where the plane sweeping process looks for information” at paragraph 0011, line 1; see also abstract).
Coene et al. does not explicitly disclose that the image capturing unit is one whose exposure value is set relatively higher or lower than that of another image capturing unit, acquiring an inappropriate area mask by detecting an area whose exposure value is inappropriate in the captured image and generating shape data representing a three-dimensional shape of the object based on the inappropriate area mask.
Wetzel et al. teaches a non-transitory computer readable storage medium storing a program for causing a computer to execute an image processing method (“Some embodiments according to embodiments comprise a non-transitory data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed” at col. 21, line 39), the image processing method comprises:
an image capturing unit whose exposure value is set relatively higher or lower than that of another image capturing unit (“One camera, for example, covers a dark brightness range, and thus records dark areas of an image, a second camera records a medium brightness range, and thus records the medium brightness areas of the image, and a third camera records a bright brightness range, and thus records the bright areas of the image” at col. 7, line 5);
acquiring an inappropriate area mask by detecting an area whose exposure value is inappropriate in the captured image (“The processor 130 may be configured to determine for each pixel of the plurality of pixels of the central image, whether said pixel is overexposed or underexposed, and, depending on whether said pixel of the central image is overexposed or underexposed” at col. 14, line 62; while this is not explicitly a mask, the extracted pixel areas could easily be used to form a mask to identify all the areas of under or over exposure); and
generating shape data representing a three-dimensional shape of the object based on the inappropriate area mask (“Finally, the cut out extension image 650 and the image of the central camera 660 are merged to form an HDR recording, possibly considering the camera response functions. All overexposed and underexposed areas which would impair merging have been removed, before, in step 655” at col. 20, line 14; as the under and over exposure areas are removed from the final image, this is analogous to using an overexposure/underexposure mask being combined with the object image data).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to eliminate the improperly exposed areas as taught by Wetzel et al. from the image data of Coene et al. to produce a better quality final output.

Claim(s) 6 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Coene et al. and Wetzel et al. as applied to claim 1 above, and further in view of Yamamoto (US 2021/0209783).
The Coene et al. and Wetzel et al. combination discloses the elements of claim 1 as described above.
The Coene et al. and Wetzel et al. combination does not explicitly disclose a portion whose pixel value is greater than an upper limit threshold value in the captured image as a first inappropriate area, which is acquired by performing image capturing with the image capturing unit whose exposure value is set relatively higher than the exposure value of the other image capturing unit.
Yamamoto teaches an image processing system wherein the second acquisition unit detects: 
a portion whose pixel value is greater than an upper limit threshold value in the captured image as a first inappropriate area, which is acquired by performing image capturing with the image capturing unit whose exposure value is set relatively higher than the exposure value of the other image capturing unit (“The long-time exposure mask image generation unit 33 determines whether or not a pixel value of the long-time exposure image is larger than a predetermined threshold value V1, and generates a long-time exposure mask image in which pixels larger than the threshold value V1 have been masked. By performing such a determination, it is possible to extract a saturated region and generate a mask image in which the region determined to be saturated has been masked” at paragraph 0094).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to utilize a threshold as taught by Yamamoto to determine the oversaturated image areas of the Coene et al. and Wetzel et al. combination such that they can be eliminated from the final image.
The Coene et al., Wetzel et al. and Yamamoto combination does not explicitly disclose a portion whose pixel value is less than a lower limit threshold value in the captured image as a second inappropriate area, which is acquired by performing image capturing with the image capturing unit whose exposure value is set relatively lower than the exposure value of the other image capturing unit.
However, underexposed image areas have pixel values below a threshold, as they correspond to darkened regions of the image (see Yamamoto at paragraph 0095). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to utilize a lower limit threshold to identify darkened regions so that underexposed image regions can be eliminated.
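The upper and lower limit thresholding discussed above can be sketched as follows, assuming a single-channel image stored as rows of integer pixel values. The function names `overexposure_mask` and `underexposure_mask` and the threshold values 240 and 16 are hypothetical; they are chosen only for illustration and do not correspond to the value V1 of Yamamoto or to any value in the application.

```python
def overexposure_mask(image, upper_limit):
    """Mark pixels whose value is greater than the upper limit threshold."""
    return [[1 if value > upper_limit else 0 for value in row] for row in image]


def underexposure_mask(image, lower_limit):
    """Mark pixels whose value is less than the lower limit threshold."""
    return [[1 if value < lower_limit else 0 for value in row] for row in image]


# Hypothetical 2x3 image with 8-bit pixel values.
image = [[0, 12, 128],
         [250, 255, 40]]
print(overexposure_mask(image, upper_limit=240))
print(underexposure_mask(image, lower_limit=16))
```

In this sketch a pixel set to 1 belongs to an inappropriate area; the two masks are computed independently so that each can be applied to the image from the correspondingly exposed camera.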

Claim(s) 8-10 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Coene et al. and Wetzel et al. as applied to claim 1 above, and further in view of Wurmlin et al. (US 2009/0315978).
Regarding claim 8, the Coene et al. and Wetzel et al. combination discloses the elements of claim 1 as described above.
The Coene et al. and Wetzel et al. combination does not explicitly disclose a coloring unit configured to generate color data to be assigned to the shape data based on a foreground texture extracted from the captured image or a limiting texture obtained as a logical sum of the foreground texture and the inappropriate area mask.
Wurmlin et al. teaches an image processing system comprising:
a coloring unit configured to generate color data to be assigned to the shape data based on a foreground texture extracted from the captured image (“Finally, the method determines all foreground areas (some set of pixels) in the color texture data of the video streams 121 by traversing all objects in each video stream and marking the pixels that in the color texture data 121 are labelled as foreground with a flag” at paragraph 0179, line 1) or a limiting texture obtained as a logical sum of the foreground texture and the inappropriate area mask.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to utilize the color texture unit as taught by Wurmlin et al. in the system of the Coene et al. and Wetzel et al. combination to be able to synthesize a view consistent with object color and texture.
Regarding claim 9, the Coene et al., Wetzel et al. and Wurmlin et al. combination discloses a system wherein the coloring unit generates the color data based on at least a pixel value of a pixel projected onto the captured image from a point on a surface of a foreground object represented by the shape data (“Rendering the objects from a virtual view using a particular 3D representation of the scene and using the object textures 126 and either fixed alpha values (from the cutout step 106) or view-dependent alpha values, taking into account angular, resolution and field-of-view similarity. Preferably, texture mapping is achieved using projective texturing” Wurmlin et al. at paragraph 0197, line 1).
Regarding claim 10, the Coene et al., Wetzel et al. and Wurmlin et al. combination discloses a system wherein the coloring unit generates the color data by preferentially using the foreground texture extracted from the captured image captured with the appropriate exposure value (“Finally, the method determines all foreground areas (some set of pixels) in the color texture data of the video streams 121 by traversing all objects in each video stream and marking the pixels that in the color texture data 121 are labelled as foreground with a flag” Wurmlin et al. at paragraph 0179, line 1; any improper exposure areas are eliminated via Wetzel et al., so the resulting object color texture is generated only for proper exposure areas).

Claim(s) 12 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Coene et al. and Wetzel et al. as applied to claim 1 above, and further in view of Kouperman et al. (US 9,445,081).
The Coene et al. and Wetzel et al. combination discloses a system comprising:
an image processing apparatus (processor of system) which generates data relating to the object based on the captured image for which the predetermined processing has been performed in the information processing apparatus (“The interpolation process may comprise several processing steps. The actual viewpoint interpolation for the virtual camera position may be preceded by two preprocessing steps, as illustrated in FIG. 4. These steps are color calibration 40 and image undistortion 41” Coene et al. at paragraph 0061), wherein 
the image processing apparatus has the generation unit (system processor performs the functions as described in claim 1 above).
The Coene et al. and Wetzel et al. combination does not explicitly disclose an information processing apparatus provided in the image capturing unit and which performs predetermined processing for the captured image acquired by the image capturing unit.
Kouperman et al. teaches an image processing system comprising:
an information processing apparatus provided in the image capturing unit and which performs predetermined processing for the captured image acquired by the image capturing unit (“Process 600 may include “compute depth maps for frames of individual cameras” 608. Preliminarily, this operation may include obtaining raw image data once the images are captured by the cameras. Thus, the cameras either may transmit the raw image data to another main processor or may perform pre-processing at least sufficient for depth map generation for each of the frames. The raw data may include RGB color space frames, but can also be HSV or any other color space that the cameras support. The transitions between one color space to another are trivial, so it is assumed that RGB data is used. This operation also may include obtaining metadata for each frame that, in one form, at least includes timestamps for each frame. The pre-processing may include demosaicing, noise reduction, pixel linearization, shading compensation, resolution reduction, vignette elimination, and/or 3A related operations including automatic white balance (AWB), automatic focus (AF), and/or automatic exposure (AE) modifications, and so forth” at col. 10, line 17).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to perform pre-processing in the camera as taught by Kouperman et al. in the system of the Coene et al. and Wetzel et al. combination as a way to distribute some of the processing needs to each individual camera instead of relying on a central processing unit to perform all the processing for all the cameras.
The Coene et al., Wetzel et al. and Kouperman et al. combination does not explicitly disclose that the information processing apparatus has the first acquisition unit and the second acquisition unit.
However, as demonstrated by Kouperman et al., processing may be accomplished in the camera unit prior to sending the image data to a central processing unit as described above.  Therefore, it would have been obvious to include the first and second acquisition units in the camera units to again distribute some of the processing needs to each individual camera instead of relying on a central processing unit to perform all the processing for all the cameras.

Claim(s) 7 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Coene et al., Wetzel et al. and Yamamoto as applied to claim 6 above, and further in view of Sobel et al. (US 2006/0028476).
The Coene et al., Wetzel et al. and Yamamoto combination discloses a system wherein the generation unit generates the shape data that takes a logical sum of a first exposure mask indicating the first inappropriate area or a second exposure mask indicating the second inappropriate area, and the foreground mask as a silhouette mask (“A first step used for object identification and positioning is the detection of foreground pixels in the camera input frames, i.e. pixels corresponding to foreground objects. The result of this step is a binary mask 45 for each input frame of each physical camera, which discriminates the foreground objects from the background. Such mask is illustrated in FIG. 11. The binary masks are constructed from the original camera input streams and the matching background streams obtained in the previous step, on a frame by frame basis. From the binary masks and the original camera frames, the foreground frames may then be constructed, as illustrated in FIG. 12” Coene et al. at paragraph 0072; “Finally, the cut out extension image 650 and the image of the central camera 660 are merged to form an HDR recording, possibly considering the camera response functions. All overexposed and underexposed areas which would impair merging have been removed, before, in step 655” Wetzel et al. at col. 20, line 14; as the under and over exposure areas are removed from the final image, this is analogous to using an overexposure/underexposure mask being combined with the object image data).
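By way of illustration only, the "logical sum" of an exposure mask and a foreground mask referred to above can be read as a pixel-wise OR of two binary masks. The following is a minimal sketch under that assumption; the function name, the mask values and the convention that 1 marks a foreground or inappropriately exposed pixel are hypothetical and are not taken from any of the cited references.

```python
def logical_sum(mask_a, mask_b):
    """Pixel-wise OR of two equally sized binary masks (lists of rows)."""
    same_shape = len(mask_a) == len(mask_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(mask_a, mask_b)
    )
    if not same_shape:
        raise ValueError("masks must have the same shape")
    return [
        [a | b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(mask_a, mask_b)
    ]


# Hypothetical 3x4 masks: 1 = foreground pixel / inappropriately exposed pixel.
foreground_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
exposure_mask = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

# A pixel is set in the result if it is set in either input mask.
silhouette_mask = logical_sum(foreground_mask, exposure_mask)
for row in silhouette_mask:
    print(row)
```

Under this reading, a pixel belongs to the silhouette mask if it is either detected as foreground or flagged by the exposure mask.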
The Coene et al., Wetzel et al. and Yamamoto combination does not explicitly disclose that the generation unit generates the shape data by a visual hull method.
Sobel et al. teaches an image processing system wherein the generation unit generates the shape data by a visual hull method that takes the foreground mask as a silhouette mask (“In another embodiment, further refinement of the visual hull of the object is possible from the additional virtual camera viewpoint of the object. That is, the silhouette contour from this virtual viewpoint (e.g., virtual viewpoint 270) can be used in a 3D visual-hull geometry reconstruction algorithm. This silhouette contour is taken as if from an additional virtual reference camera (e.g., from the rear of the object 260 in FIG. 2). The visual-hull improvement from this virtual reference camera will depend upon the consistency of its silhouette with respect to the silhouettes from other virtual and reference cameras having later data” at paragraph 0066).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to use a visual hull method as taught by Sobel et al. on the masked views of the Coene et al., Wetzel et al. and Yamamoto combination to generate the expected shape of the object for output.
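For illustration only, a visual hull method of the general kind referred to above keeps a candidate volume element only if it projects inside the silhouette mask of every view. The sketch below assumes two axis-aligned orthographic views of a small voxel grid; this camera model, the function name and the silhouette values are hypothetical simplifications and do not reproduce the algorithm of Sobel et al., which would operate on calibrated camera viewpoints.

```python
from itertools import product


def visual_hull(sil_xy, sil_xz, size):
    """Carve a size^3 voxel grid using two orthographic silhouettes.

    sil_xy[y][x] is the silhouette seen looking along the z axis.
    sil_xz[z][x] is the silhouette seen looking along the y axis.
    A voxel is kept only if it falls inside both silhouettes.
    """
    hull = set()
    for x, y, z in product(range(size), repeat=3):
        if sil_xy[y][x] and sil_xz[z][x]:
            hull.add((x, y, z))
    return hull


# Hypothetical 2x2 silhouette masks (1 = inside the silhouette).
sil_xy = [
    [0, 1],
    [0, 1],
]
sil_xz = [
    [0, 1],
    [0, 0],
]

hull = visual_hull(sil_xy, sil_xz, 2)
print(sorted(hull))
```

Each additional silhouette can only remove voxels from the hull, which is why a mask that wrongly omits part of the object in one view removes that part from the reconstructed shape.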

Claim(s) 11 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Coene et al., Wetzel et al. and Wurmlin et al. as applied to claim 10 above, and further in view of Watanabe et al. (US 2009/0022396).
The Coene et al., Wetzel et al. and Wurmlin et al. combination discloses a system wherein the coloring unit corrects a pixel value of the pixel in the captured image corresponding to the image capturing unit in accordance with the exposure value set to the image capturing unit (“Color calibration is generally considered a useful topic for view interpolation. In the color calibration step 40, the input images are altered to match better their photometric information. A color calibration process, e.g. an auto color calibration process, may for example be based on an algorithm described in "Color correction preprocessing for multiview coding", Colin Doutre, Panos Nasiopoulos, IEEE Trans on Circuits and Systems for Video Technology, Volume 19, Issue 9 (September 2009). Its results can be observed in FIG. 6” Coene et al. at paragraph 0063).
The Coene et al., Wetzel et al. and Wurmlin et al. combination does not explicitly disclose that the coloring unit generates the color data to which a pixel value is assigned, which is obtained based on the corrected pixel value of the pixel and a predetermined coefficient set in advance to the image capturing unit.
Watanabe et al. teaches an image processing system wherein the coloring unit corrects a pixel value of the pixel in the captured image corresponding to the image capturing unit in accordance with the exposure value set to the image capturing unit and generates the color data to which a pixel value is assigned, which is obtained based on the corrected pixel value of the pixel and a predetermined coefficient set in advance to the image capturing unit (“Furthermore, when executing a color correction process on the color information Ci that is to be corrected, post-correction color information Ci_new may be obtained by altering the color information Ci using a predetermined fluctuation amount DefCi, as illustrated by (Formula 11) [0217] Ci_new = Ci + DefCi × (Gi - 0.5) × β (11) [0218] Here, Ci_new is the post-correction value of the color information that is to be corrected [and] β is a predetermined positive constant.” at paragraphs 0216-0218).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to employ a color correction as taught by Watanabe et al. in the system of the Coene et al., Wetzel et al. and Wurmlin et al. combination to “improve the feeling of depth in a processed image” (Watanabe et al. at paragraph 0221, line 4).
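As a worked illustration of Formula 11 as quoted from Watanabe et al. above, read as Ci_new = Ci + DefCi × (Gi - 0.5) × β, the correction can be sketched as a single function. The function name and the numeric values are hypothetical. The quoted passage does not define Gi, so the sketch simply treats it as a per-pixel value and assumes, for the example only, that it lies between 0 and 1.

```python
def correct_color(ci, gi, def_ci, beta):
    """Apply Ci_new = Ci + DefCi * (Gi - 0.5) * beta.

    ci     : color information to be corrected
    gi     : per-pixel value (not defined in the quoted passage; assumed here)
    def_ci : predetermined fluctuation amount
    beta   : predetermined positive constant
    """
    if beta <= 0:
        raise ValueError("beta must be a positive constant")
    return ci + def_ci * (gi - 0.5) * beta


# Hypothetical values: the sign of (gi - 0.5) decides whether the color
# information is raised, left unchanged, or lowered.
raised = correct_color(ci=100.0, gi=0.75, def_ci=20.0, beta=2.0)
unchanged = correct_color(ci=100.0, gi=0.5, def_ci=20.0, beta=2.0)
lowered = correct_color(ci=100.0, gi=0.25, def_ci=20.0, beta=2.0)
print(raised, unchanged, lowered)
```

In this reading, β plays the role of the predetermined coefficient: it scales how strongly the fluctuation amount DefCi shifts the color information.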

Claim(s) 13 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Coene et al., Wetzel et al. and Wurmlin et al. as applied to claim 8 above, and further in view of Kouperman et al.
The Coene et al., Wetzel et al. and Wurmlin et al. combination discloses a system comprising:
an image processing apparatus (processor of system) which generates data relating to the object based on the captured image for which the predetermined processing has been performed in the information processing apparatus (“The interpolation process may comprise several processing steps. The actual viewpoint interpolation for the virtual camera position may be preceded by two preprocessing steps, as illustrated in FIG. 4. These steps are color calibration 40 and image undistortion 41” Coene et al. at paragraph 0061), wherein 
the image processing apparatus has the coloring unit (system processor performs the functions as described in claim 1 above).
The Coene et al., Wetzel et al. and Wurmlin et al. combination does not explicitly disclose an information processing apparatus provided in the image capturing unit and which performs predetermined processing for the captured image acquired by the image capturing unit.
Kouperman et al. teaches an image processing system comprising:
an information processing apparatus provided in the image capturing unit and which performs predetermined processing for the captured image acquired by the image capturing unit (“Process 600 may include “compute depth maps for frames of individual cameras” 608. Preliminarily, this operation may include obtaining raw image data once the images are captured by the cameras. Thus, the cameras either may transmit the raw image data to another main processor or may perform pre-processing at least sufficient for depth map generation for each of the frames. The raw data may include RGB color space frames, but can also be HSV or any other color space that the cameras support. The transitions between one color space to another are trivial, so it is assumed that RGB data is used. This operation also may include obtaining metadata for each frame that, in one form, at least includes timestamps for each frame. The pre-processing may include demosaicing, noise reduction, pixel linearization, shading compensation, resolution reduction, vignette elimination, and/or 3A related operations including automatic white balance (AWB), automatic focus (AF), and/or automatic exposure (AE) modifications, and so forth” at col. 10, line 17).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to perform pre-processing in the camera as taught by Kouperman et al. in the system of the Coene et al., Wetzel et al. and Wurmlin et al. combination as a way to distribute some of the processing needs to each individual camera instead of relying on a central processing unit to perform all the processing for all the cameras.
The Coene et al., Wetzel et al., Wurmlin et al. and Kouperman et al. combination does not explicitly disclose that the information processing apparatus has the first acquisition unit and the second acquisition unit.
However, as demonstrated by Kouperman et al., processing may be accomplished in the camera unit prior to sending the image data to a central processing unit as described above.  Therefore, it would have been obvious to include the first and second acquisition units in the camera units to again distribute some of the processing needs to each individual camera instead of relying on a central processing unit to perform all the processing for all the cameras.

Allowable Subject Matter

Claim 4 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:  as the prior art systems only describe eliminating overexposed/underexposed image regions, thereby rendering those foreground areas corresponding to those image regions ineligible for final object generation, the prior art does not describe that the generation unit generates, in a case where the foreground mask corresponding to the area whose exposure value is inappropriate can be used as a silhouette mask for shape data generation, the shape data by using the foreground mask as required by claim 4.


Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KATRINA R FUJITA, whose telephone number is (571) 270-1574. The examiner can normally be reached Monday - Friday, 9:30 am - 5:30 pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz, can be reached at (571) 272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KATRINA R FUJITA/Primary Examiner, Art Unit 2662