DETAILED ACTION

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 08/06/2021 has been entered.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
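A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.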

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 2, 4 – 8, 10 – 14, 17, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Wu (Publication: US 2005/0129325 A1) in view of Dal Mutto et al. (Publication: US 2019/0108396 A1) and IDA et al. (Publication: 2018/0091741 A1).

Regarding claim 1, Wu discloses an image processing apparatus comprising ([0013], [0040] - An image processing apparatus that is a personal computer with a computer readable storage medium storing a program that executes the following methods. Furthermore, a personal computer is known to have a computer readable storage medium storing a program.):
one or more memories storing instructions ([0013], [0040] - An image processing apparatus that is a personal computer with a computer readable storage medium storing a program that executes the following methods. Furthermore, a personal computer is known to have a computer readable storage medium storing a program.); and
one or more processors executing the instructions to ([0013], [0040] - An image processing apparatus that is a personal computer with a computer readable storage medium storing a program that executes the following methods. Furthermore, a personal computer is known to have a computer readable storage medium storing a program.):
[[specify an object based on an event]] where imaging by a plurality of image capturing apparatuses is performed, wherein the object is imaged by at least one of the plurality of image capturing apparatuses ([0035], [0042], [0050], [0058], [0062] to [0064], [0112], Fig. 3, Fig. 11 - As shown, at the site A, there are provided cameras 11a and 12a to image the user a as an object from different points of view, a display unit 5a to display an image of the user b, captured at the site B, to the user a, and an image processor 2a which generates a virtual viewpoint image on the basis of images Pa1 and Pa2 captured by the cameras 11a and 12a. [0035], [0042] – as shown in Fig. 4, the virtual viewpoint image is generated by the virtual viewpoint image generator from the input of the corrected images.
Pa1 and Pa2 are captured by the cameras and obtained by the correction unit and matching unit. FIG. 6 explains how to normalize the images Pa1 and Pa2 captured by the cameras 11a and 12a if there is a misalignment. As shown in FIG. 6, when the user a is imaged by the cameras 11a and 12a from different viewpoints with the optical axes through their respective optical centers C1 and C2 being aligned with a point M of the object (user a), images Pa1 and Pa2 (surface) are thus captured by the cameras 11a and 12a. A geometric normalization is effected to parallelize the normal-line directions k1 and k2 of the images Pa1 and Pa2 with each other, thereby producing normalized images Pm1 and Pm2. An area such as the ear portion will be referred to as an "occlusion region" hereafter. Since in an occlusion region, a corresponding point in an object's image in one of the normalized images is masked by the other normalized image, the conventional matching of the feature points with one another, that is, matches (a1, b1), (a2, b2), (a3, b3), (a4, b4) and (a5, b5).
The matching unit then receives the normalized images. The matching unit 29 makes a match only between points where feature points have been extracted in the normalized images Pm1 and Pm2. When the feature point sequences R1 and R2 existing on the epi-polar lines L1 and L1', respectively, are matched with each other in relation to the object, the feature point b1 on the epi-polar line L1'.);
obtain a model of object which is generated based on a plurality of captured images obtained by the plurality of image capturing apparatuses for generating a virtual viewpoint image ([0035], [0042], [0050], [0058], [0062] to [0064], [0112], Fig. 3, Fig. 11 - As shown, at the site A, there are provided cameras 11a and 12a to image the user a as an object from different points of view, a display unit 5a to display an image of the user b, captured at the site B, to the user a, and an image processor 2a which generates a virtual viewpoint image on the basis of images Pa1 and Pa2 captured by the cameras 11a and 12a. [0035], [0042] – as shown in Fig. 4, the virtual viewpoint image is generated by the virtual viewpoint image generator from the input of the corrected images.), the obtained model representing at least one of a distorted contour or a missing portion of the specified object ([0035], [0042], [0050], [0058], [0062] to [0064], [0112] - As shown in FIG. 6, when the user a is imaged by the cameras 11a and 12a from different viewpoints with the optical axes through their respective optical centers C1 and C2 being aligned with a point M of the object (user a), images Pa1 and Pa2 (surface) are thus captured by the cameras 11a and 12a. A geometric normalization is effected to parallelize the normal-line directions k1 and k2 of the images Pa1 and Pa2 with each other, thereby producing normalized images Pm1 and Pm2. An area such as the ear portion will be referred to as an "occlusion region" hereafter. 
Since in an occlusion region, a corresponding point in an object's image in one of the normalized images is masked by the other normalized image, the conventional matching of the feature points with one another, that is, matches (a1, b1), (a2, b2), (a3, b3), (a4, b4) and (a5, b5).); and modify the obtained model based on a reference model to correct the distorted contour or correct the missing portion of the specified object ([0035], [0042], [0050], [0058], [0062] to [0064],  [0112], Fig. 3, Fig. 11 - As shown, at the site A, there are provided cameras 11a and 12a to image the user a as an object from different points of view, a display unit 5a to display an image of the user b, captured at the site B, to the user a, and an image processor 2a which generates a virtual viewpoint image on the basis of images Pa1 and Pa2 captured by the cameras 11a and 12a. 
Pa1 and Pa2 are captured by the cameras and obtained by the correction unit and matching unit. FIG. 6 explains how to normalize the images Pa1 and Pa2 captured by the cameras 11a and 12a if there is a misalignment. As shown in FIG. 6, when the user a is imaged by the cameras 11a and 12a from different viewpoints with the optical axes through their respective optical centers C1 and C2 being aligned with a point M of the object (user a), images Pa1 and Pa2 (surface) are thus captured by the cameras 11a and 12a. A geometric normalization is effected to parallelize the normal-line directions k1 and k2 of the images Pa1 and Pa2 with each other, thereby producing normalized images Pm1 and Pm2. An area such as the ear portion will be referred to as an "occlusion region" hereafter. Since in an occlusion region, a corresponding point in an object's image in one of the normalized images is masked by the other normalized image, the conventional matching of the feature points with one another, that is, matches (a1, b1), (a2, b2), (a3, b3), (a4, b4) and (a5, b5). The matching unit then receives the normalized images. The matching unit 29 makes a match only between points where feature points have been extracted in the normalized images Pm1 and Pm2. When the feature point sequences R1 and R2 existing on the epi-polar lines L1 and L1', respectively, are matched with each other in relation to the object, the feature point b1 on the epi-polar line L1'; thus the "missing portion" can be corrected.);
wherein the reference model is generated before the imaging is performed ([0050], [0058], [0062] to [0064], [0086] - Pa1 and Pa2 are captured by the cameras and obtained by the correction unit and then the matching unit. FIG. 6 explains how to normalize the images Pa1 and Pa2 captured by the cameras 11a and 12a if there is a misalignment. As shown in FIG. 6, when the user a is imaged by the cameras 11a and 12a from different viewpoints with the optical axes through their respective optical centers C1 and C2 being aligned with a point M of the object (user a), images Pa1 and Pa2 (surface) are thus captured by the cameras 11a and 12a. A geometric normalization is effected to parallelize the normal-line directions k1 and k2 of the images Pa1 and Pa2 with each other, thereby producing normalized images Pm1 and Pm2. An area such as the ear portion will be referred to as an "occlusion region" hereafter. Since in an occlusion region, a corresponding point in an object's image in one of the normalized images is masked by the other normalized image, the conventional matching of the feature points with one another, that is, matches (a1, b1), (a2, b2), (a3, b3), (a4, b4) and (a5, b5).
The matching unit then receives the normalized images. The matching unit 29 makes a match only between points where feature points have been extracted in the normalized images Pm1 and Pm2. When the feature point sequences R1 and R2 existing on the epi-polar lines L1 and L1', respectively, are matched with each other in relation to the object, the feature point b1 on the epi-polar line L1'.
Furthermore, each of the occlusion cost functions dx(x, y) and dy(x, y) is generated based on the parallax information. As the distance from the cameras 11a and 12a to the user a as the object becomes longer (the angle becomes smaller), the probability of the occlusion region occurring will be lower. That is, the angle between the optical axes of the images is in a state where the value of the angle is smaller than an acceptable angle (the largest acceptable angle), so as to reduce the occlusion region.).
However, Wu does not disclose three-dimensional shape model; specify an object; the specified object; a three-dimensional shape model of object.
Dal Mutto discloses specify an object ([0073], [0077] to [0078] -  FIG. 1A depicts a shopping cart or basket 8 containing items 10 that have been selected for purchase at a combination supermarket and department store (sometimes referred to as a "hypermarket").
As shown in FIGS. 1B and 1C, the system includes a 3-D scanner 100, which is configured to capture images of an object 10 and a 3-D model generation system 200 configured to generate a 3-D model of the object from the images captured by the scanner 100 in operation 1200. An analysis agent 300 identifies the object based on the captured 3-D model in operation 1300 and, in some embodiments, may perform further analysis of the object based on the captured model.
The analysis results generated by the analysis agent 300, including an object identification are stored by an object tracking system 400 in operation 1400. In some embodiments of the present invention, the object tracking system 400 is configured to maintain information (e.g., lists) regarding the particular object that is scanned, such as the identity of the object and an association between the scanned object and a particular shopping cart (e.g., tracking that the scanned object is now in the shopping cart) and/or associated with a particular customer. The object tracking system 400 may also be used to control a display device 450, which may display information to a user. The shopping cart or basket 8 may include a display panel 450 that is configured to display the current list 6 of items that are detected to be the shopping cart. As another example, the display device 450 may be an end user computing device (e.g., a smartphone) of the user and the current list 6 may be displayed in a web page or an application running on the end user computing device.);
the specified object ([0073], [0077] to [0078] -  FIG. 1A depicts a shopping cart or basket 8 containing items 10 that have been selected for purchase at a combination supermarket and department store (sometimes referred to as a "hypermarket").
As shown in FIGS. 1B and 1C, the system includes a 3-D scanner 100, which is configured to capture images of an object 10 and a 3-D model generation system 200 configured to generate a 3-D model of the object from the images captured by the scanner 100 in operation 1200. An analysis agent 300 identifies the object based on the captured 3-D model in operation 1300 and, in some embodiments, may perform further analysis of the object based on the captured model.
The analysis results generated by the analysis agent 300, including an object identification are stored by an object tracking system 400 in operation 1400. In some embodiments of the present invention, the object tracking system 400 is configured to maintain information (e.g., lists) regarding the particular object that is scanned, such as the identity of the object and an association between the scanned object and a particular shopping cart (e.g., tracking that the scanned object is now in the shopping cart) and/or associated with a particular customer. The object tracking system 400 may also be used to control a display device 450, which may display information to a user. The shopping cart or basket 8 may include a display panel 450 that is configured to display the current list 6 of items that are detected to be the shopping cart. As another example, the display device 450 may be an end user computing device (e.g., a smartphone) of the user and the current list 6 may be displayed in a web page or an application running on the end user computing device.) ;
three-dimensional shape model ([0012], [0073], [0077] to [0078] - 3-D model. As shown in FIGS. 1B and 1C, the system includes a 3-D scanner 100, which is configured to capture images of an object 10 and a 3-D model generation system 200 configured to generate a 3-D model of the object from the images captured by the scanner 100 in operation 1200. An analysis agent 300 identifies the object based on the captured 3-D model in operation 1300 and, in some embodiments, may perform further analysis of the object based on the captured model.);
a three-dimensional shape model of object ([0012], [0073], [0077] to [0078] - 3-D model. As shown in FIGS. 1B and 1C, the system includes a 3-D scanner 100, which is configured to capture images of an object 10 and a 3-D model generation system 200 configured to generate a 3-D model of the object from the images captured by the scanner 100 in operation 1200. An analysis agent 300 identifies the object based on the captured 3-D model in operation 1300 and, in some embodiments, may perform further analysis of the object based on the captured model. );
three-dimensional shape model of the specified object ([0073], [0077] to [0078] - FIG. 1A depicts a shopping cart or basket 8 containing items 10 that have been selected for purchase at a combination supermarket and department store (sometimes referred to as a "hypermarket").
As shown in FIGS. 1B and 1C, the system includes a 3-D scanner 100, which is configured to capture images of an object 10 and a 3-D model generation system 200 configured to generate a 3-D model of the object from the images captured by the scanner 100 in operation 1200. An analysis agent 300 identifies the object based on the captured 3-D model in operation 1300 and, in some embodiments, may perform further analysis of the object based on the captured model.
The analysis results generated by the analysis agent 300, including an object identification are stored by an object tracking system 400 in operation 1400. In some embodiments of the present invention, the object tracking system 400 is configured to maintain information (e.g., lists) regarding the particular object that is scanned, such as the identity of the object and an association between the scanned object and a particular shopping cart (e.g., tracking that the scanned object is now in the shopping cart) and/or associated with a particular customer. The object tracking system 400 may also be used to control a display device 450, which may display information to a user. The shopping cart or basket 8 may include a display panel 450 that is configured to display the current list 6 of items that are detected to be the shopping cart. As another example, the display device 450 may be an end user computing device (e.g., a smartphone) of the user and the current list 6 may be displayed in a web page or an application running on the end user computing device.).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Wu with the features of specifying an object, the specified object, and a three-dimensional shape model of the object, as taught by Dal Mutto. The motivation for doing so is to track objects accurately, as taught by Dal Mutto in paragraph [0005].
However, Wu in view of Dal Mutto does not disclose specify an object based on an event; performed by the plurality of image capturing apparatuses in the event.
IDA discloses specify an object based on an event ([0049] to [0050] - the predetermined event includes a target object's passing across the video surveillance line, predetermined situations of the target object in the surveillance region (intruding, exiting, appearing, disappearing, fighting, staying, roving, falling down, standing up, crouching down, changing the moving direction, reverse traveling, shoplifting, detouring, damaging, carrying away, leaving behind, painting graffiti, and the like), and movement of the target object of a specific route defined by a line segment; a target object's passing in the direction specified by a user is detected as a predetermined event.);
performed by the plurality of image capturing apparatuses in the event ([0072] - two or more surveillance cameras 9 capture images at the surveillance position where the predetermined event is detected.).
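Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Wu in view of Dal Mutto with specifying an object based on an event, the imaging being performed by the plurality of image capturing apparatuses in the event, as taught by IDA. The motivation for doing so is that the operation can be performed efficiently, as taught by IDA in paragraph [0012].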

Regarding claim 2, Wu in view of Dal Mutto and IDA discloses all the limitations of claim 1, including the specified object and the reference model.
Wu discloses wherein, the object is selected from among models generated based on the plurality of captured images ([0035], [0042], [0050], [0058], [0062] to [0064],  [0112], Fig. 3, Fig. 11 - As shown, at the site A, there are provided cameras 11a and 12a to image the user a as an object from different points of view, a display unit 5a to display an image of the user b, captured at the site B, to the user a, and an image processor 2a which generates a virtual viewpoint image on the basis of images Pa1 and Pa2 captured by the cameras 11a and 12a. 
Pa1 and Pa2 are captured by the cameras and obtained by the correction unit and matching unit. FIG. 6 explains how to normalize the images Pa1 and Pa2 captured by the cameras 11a and 12a if there is a misalignment. As shown in FIG. 6, when the user a is imaged by the cameras 11a and 12a from different viewpoints with the optical axes through their respective optical centers C1 and C2 being aligned with a point M of the object (user a), images Pa1 and Pa2 (surface) are thus captured by the cameras 11a and 12a. A geometric normalization is effected to parallelize the normal-line directions k1 and k2 of the images Pa1 and Pa2 with each other, thereby producing normalized images Pm1 and Pm2. An area such as the ear portion will be referred to as an "occlusion region" hereafter. Since in an occlusion region, a corresponding point in an object's image in one of the normalized images is masked by the other normalized image, the conventional matching of the feature points with one another, that is, matches (a1, b1), (a2, b2), (a3, b3), (a4, b4) and (a5, b5).
The matching unit then receives the normalized images. The matching unit 29 makes a match only between points where feature points have been extracted in the normalized images Pm1 and Pm2. When the feature point sequences R1 and R2 existing on the epi-polar lines L1 and L1', respectively, are matched with each other in relation to the object, the feature point b1 on the epi-polar line L1'.).
Dal Mutto discloses three-dimensional shape models ([0005], [0033] - objects of 3-D models.);
the three-dimensional shape model of an object ([0005], [0033] - the object is identified from the objects of 3-D models.);
identifying the reference model from among three-dimensional shape models ([0005], [0033] - the object is identified from the objects of 3-D models.).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Wu in view of Dal Mutto to obtain the three-dimensional shape model of an object and to identify the reference model from among three-dimensional shape models, as taught by Dal Mutto. The motivation for doing so is to track objects accurately, as taught by Dal Mutto in paragraph [0005].

Regarding claim 4, Wu in view of Dal Mutto and IDA discloses all the limitations of claim 1, including the three-dimensional shape model, the obtained three-dimensional model, and the reference model.
Wu discloses wherein in a case where a surface of the reference model and a surface of the obtained model do not match each other in a state where a position of the obtained model and a position of the reference model are set to match with each other, the obtained model is corrected such that the surface of the reference model appears as the surface of the obtained model ([0050], [0058], [0062] to [0064] - Pa1 and Pa2 are captured by the cameras and obtained by the correction unit and matching unit. FIG. 6 explains how to normalize the images Pa1 and Pa2 captured by the cameras 11a and 12a if there is a misalignment. As shown in FIG. 6, when the user a is imaged by the cameras 11a and 12a from different viewpoints with the optical axes through their respective optical centers C1 and C2 being aligned with a point M of the object (user a), images Pa1 and Pa2 (surface) are thus captured by the cameras 11a and 12a. A geometric normalization is effected to parallelize the normal-line directions k1 and k2 of the images Pa1 and Pa2 with each other, thereby producing normalized images Pm1 and Pm2. An area such as the ear portion will be referred to as an "occlusion region" hereafter. Since in an occlusion region, a corresponding point in an object's image in one of the normalized images is masked by the other normalized image, the conventional matching of the feature points with one another, that is, matches (a1, b1), (a2, b2), (a3, b3), (a4, b4) and (a5, b5).
The matching unit then receives the normalized images. The matching unit 29 makes a match only between points where feature points have been extracted in the normalized images Pm1 and Pm2. When the feature point sequences R1 and R2 existing on the epi-polar lines L1 and L1', respectively, are matched with each other in relation to the object, the feature point b1 on the epi-polar line L1'.).

Regarding claim 5, Wu in view of Dal Mutto and IDA discloses all the limitations of claim 2, including the three-dimensional shape model, the shape feature, and the obtained three-dimensional shape model.
Wu discloses wherein the obtained shape model is corrected in a case where a shape feature does not meet a predetermined criterion ([0050], [0058], [0062] to [0064], [0086] - Pa1 and Pa2 are captured by the cameras and obtained by the correction unit and then the matching unit. FIG. 6 explains how to normalize the images Pa1 and Pa2 captured by the cameras 11a and 12a if there is a misalignment. As shown in FIG. 6, when the user a is imaged by the cameras 11a and 12a from different viewpoints with the optical axes through their respective optical centers C1 and C2 being aligned with a point M of the object (user a), images Pa1 and Pa2 (surface) are thus captured by the cameras 11a and 12a. A geometric normalization is effected to parallelize the normal-line directions k1 and k2 of the images Pa1 and Pa2 with each other, thereby producing normalized images Pm1 and Pm2. An area such as the ear portion will be referred to as an "occlusion region" hereafter. Since in an occlusion region, a corresponding point in an object's image in one of the normalized images is masked by the other normalized image, the conventional matching of the feature points with one another, that is, matches (a1, b1), (a2, b2), (a3, b3), (a4, b4) and (a5, b5).
The matching unit then receives the normalized images. The matching unit 29 makes a match only between points where feature points have been extracted in the normalized images Pm1 and Pm2. When the feature point sequences R1 and R2 existing on the epi-polar lines L1 and L1', respectively, are matched with each other in relation to the object, the feature point b1 on the epi-polar line L1'.
Furthermore, each of the occlusion cost functions dx(x, y) and dy(x, y) is generated based on the parallax information. As the distance from the cameras 11a and 12a to the user a as the object becomes longer (the angle becomes smaller), the probability of the occlusion region occurring will be lower. That is, the angle between the optical axes of the images is in a state where the value of the angle is smaller than an acceptable angle (the largest acceptable angle), so as to reduce the occlusion region.).

Regarding claim 6, Wu in view of Dal Mutto and IDA discloses all the limitations of claim 5, including the shape feature, the obtained three-dimensional shape model, and the plurality of captured images.
Wu discloses wherein the shape feature includes the number of the plurality of captured images used to generate the obtained model, and the criterion includes a state where the number of the captured images used to generate the obtained model is equal to or more than a predetermined number ([0050], [0058], [0062] to [0064], Figs. 5-7 and Fig. 9B - A number of images, more than one, are used to construct the corrected images. The images that are used to generate the corrected images are one from the left and another from the right.).

Regarding claim 7, Wu in view of Dal Mutto and IDA discloses all the limitations of claim 5, including the shape feature, the obtained three-dimensional shape model, and the plurality of captured images.
Wu discloses wherein the shape feature includes a largest angle between optical axes of the image capturing apparatuses that obtained the captured images used to generate the model, and the criterion includes a state where the largest angle is equal to or smaller than a predetermined value ([0050], [0058], [0062] to [0064], [0086] - The images that are used to generate the corrected images are one from the left and another from the right. Each of the occlusion cost functions dx(x, y) and dy(x, y) is generated based on the parallax information. As the distance from the cameras 11a and 12a to the user a as an object becomes shorter (the angle becomes larger), the probability of the occlusion region occurring will be higher. As the distance from the cameras 11a and 12a to the user a as the object becomes longer (the angle becomes smaller), the probability of the occlusion region occurring will be lower. That is, the angle between the optical axes of the images is in a state where the value of the angle is smaller than an acceptable angle (the largest acceptable angle), so as to reduce the occlusion region.).
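For illustration only, the criterion recited in claim 7 (largest angle between optical axes equal to or smaller than a predetermined value) can be sketched as follows. This is a minimal sketch and is not taken from Wu or from the application; the function names, the representation of each optical axis as a 3-D direction vector, and the threshold values are all assumptions made for this example.

```python
import math
from itertools import combinations


def angle_between_deg(u, v):
    """Angle in degrees between two non-zero 3-D direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


def largest_axis_angle(axes):
    """Largest pairwise angle among camera optical axes (needs >= 2 axes)."""
    return max(angle_between_deg(u, v) for u, v in combinations(axes, 2))


def meets_angle_criterion(axes, max_angle_deg):
    """True when the largest angle is equal to or smaller than the threshold.

    Both the helper name and the threshold are hypothetical.
    """
    return largest_axis_angle(axes) <= max_angle_deg
```

With two cameras whose axes are (0, 0, 1) and (0, 1, 1), the largest angle is about 45 degrees, so the criterion is met for a hypothetical threshold of 60 degrees and not met for a threshold of 30 degrees.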

Regarding claim 8, Wu in view of Dal Mutto and IDA discloses all the limitations of claim 5, including the shape feature, the obtained three-dimensional shape model, the reference three-dimensional shape model, and the plurality of captured images.
Wu discloses wherein the criterion includes a state where a difference between feature and another feature is equal to or smaller than a predetermined value ([0050], [0058], [0062] to [0064], [0086] - The images that are used to generate the corrected images are one from the left and another from the right. Each of the occlusion cost functions dx(x, y) and dy(x, y) is generated based on the parallax information. As the distance from the cameras 11a and 12a to the user a as an object becomes shorter (the angle becomes larger), the probability of the occlusion region occurring will be higher. As the distance from the cameras 11a and 12a to the user a as the object becomes longer (the angle becomes smaller), the probability of the occlusion region occurring will be lower. That is, the angle between the optical axes of the images is in a state where the value of the angle is smaller than an acceptable angle (the largest acceptable angle), so as to reduce the occlusion region.).
Dal Mutto discloses a difference between the shape feature of the three-dimensional shape model and a shape feature of the reference model ([0083], [0146] - Aspects of embodiments of the present invention are well suited for, but not limited to, circumstances in which the items to be identified may be characterized by their surface colors and geometry, including the size of the object (although there might be some variation between different instances of the same item or good). In many embodiments of the present invention, this type of color and shape information can be used to automate the identification of different items (e.g., identifying retail goods during checkout). The 3-D scanning is performed by aggregating information from a multitude of 3-D scanners 100 at different vantage-points. Therefore, a 3-D scanning system 99 may include one or more 3-D scanners or depth cameras 100.
The descriptor vector is used to query a database of objects which are associated with descriptors that were previously computed using the same technique. This database of objects constitutes a set of known objects, and a known object corresponding to the current object (e.g., the scanned object or "query object") can be identified by searching for the closest (e.g. most similar) descriptor in the multi-dimensional space of descriptors, with respect to the descriptor of the current object.).
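For illustration only, the closest-descriptor search described in the cited passage of Dal Mutto can be sketched as a nearest-neighbor lookup. This is a minimal sketch under stated assumptions and is not Dal Mutto's implementation: the function name, the dictionary used as the database of known objects, and the choice of Euclidean distance as the similarity measure are all assumptions, since the cited passage only refers to the "closest (e.g. most similar) descriptor".

```python
import math


def nearest_known_object(query_descriptor, known_descriptors):
    """Return (name, distance) of the known object closest to the query.

    known_descriptors is a hypothetical mapping from object name to a
    descriptor vector of the same length as query_descriptor. Euclidean
    distance is an assumed similarity measure.
    """
    best_name = None
    best_distance = math.inf
    for name, descriptor in known_descriptors.items():
        distance = math.dist(query_descriptor, descriptor)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name, best_distance
```

The returned distance corresponds to the "difference" between the shape feature of the scanned object and that of the identified known object, and it could be compared against a predetermined value in the manner recited in claim 8.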


Regarding claim 10, Wu in view of Dal Mutto and IDA discloses all the limitations of claim 1, including the three-dimensional shape model, and including that the one or more processors further execute the instructions.
Wu discloses to generate the virtual viewpoint image based on the corrected model ([0035], [0042] – as shown in Fig. 4, the virtual viewpoint image is generated by the virtual viewpoint image generator from the input of the corrected images, based on the images before being corrected, and the plurality of captured images.).

Regarding claim 11, Wu in view of Dal Mutto and IDA discloses all the limitations of claim 10, including a three-dimensional shape model, the reference model, the virtual viewpoint image, the third information, and including that the one or more processors further execute the instructions.
Wu discloses to obtain virtual viewpoint information for specifying a position of a virtual viewpoint and a view direction from the virtual viewpoint, obtain information, and generate the virtual viewpoint image based on the obtained virtual viewpoint information and the obtained information ([0035], [0042] – as shown in Fig. 4, the virtual viewpoint image is generated by the virtual viewpoint image generator from the input of the corrected images, the matching unit, the correction unit, and the plurality of captured images.
[0014], [0050], [0058], [0062] to [0064], [0086] - Pa1 and Pa2 are captured by the cameras and obtained by the correction unit and then the matching unit. FIG. 6 explains how to normalize the images Pa1 and Pa2 captured by the cameras 11a and 12a if there is a misalignment. As shown in FIG. 6, when the user a is imaged by the cameras 11a and 12a from different viewpoints with the optical axes through their respective optical centers C1 and C2 being aligned with a point M of the object (user a), images Pa1 and Pa2 are thus captured by the cameras 11a and 12a. A geometric normalization is effected to parallelize the normal-line directions k1 and k2 of the images Pa1 and Pa2 with each other, thereby producing normalized images Pm1 and Pm2. An area such as the ear portion will be referred to as an "occlusion region" hereafter. Since in an occlusion region, a corresponding point in an object's image in one of the normalized images is masked by the other normalized image, the conventional matching of the feature points with one another, that is, matches (a1, b1), (a2, b2), (a3, b3), (a4, b4) and (a5, b5).
Then the matching unit receives the normalized images. The matching unit 29 makes a match only between points at which feature points have been extracted in the normalized images Pm1 and Pm2. When the feature point sequences R1 and R2 existing on the epi-polar lines L1 and L1', respectively, are matched with each other in relation to the object, the feature point b1 on the epi-polar line L1'.
Determine a correspondence between at least one pixel position on a horizontal line in one of the corrected object images and at least one pixel position on the same horizontal line in another of the corrected object images.
Furthermore, each of the occlusion cost functions dx(x, y) and dy(x, y) is generated based on the parallax information. As the distance from the cameras 11a and 12a to the user a as the object becomes longer (the angle becomes smaller), the probability of the occlusion region occurring will be lower. That is, the angle between the optical axes of the images is in a state where the value of the angle is smaller than an acceptable angle (largest acceptable angle), so as to reduce the occlusion region.).
Dal Mutto discloses information containing at least color information on the reference model and the obtained color information ([0152] Once the query object has been classified, data about the identified class may be retrieved from, for example, a database of metadata about the objects. The retrieved class data may include the expected dimensions of objects of the given class (e.g., size, shape, color) and a reference 3-D model.).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Wu in view of Dal Mutto with information containing at least color information on the reference model and the obtained color information, as taught by Dal Mutto. The motivation for doing so is to track objects accurately, as taught by Dal Mutto in paragraph(s) [00005].

Regarding claim 12, Wu in view of Dal Mutto and IDA discloses all the limitations of claim 11, including the obtained three-dimensional shape model, the three-dimensional shape model, the reference model, and the virtual viewpoint image.
Wu discloses wherein the image is generated not using the model before being corrected but using the corrected model ([0035], [0041] to [0043] – as shown in Fig. 4, the virtual viewpoint image is generated by the virtual viewpoint image generator using the input of the corrected images, and not the input of the uncorrected images. The virtual viewpoint image generator uses the input of the corrected images, after the images have been corrected by the unit, to generate virtual viewpoint images.).

Regarding claim 13, Wu in view of Dal Mutto and IDA discloses all the limitations of claim 11, including color information and the reference three-dimensional shape model.
Dal Mutto discloses wherein the information includes ([0152] Once the query object has been classified, data about the identified class may be retrieved from, for example, a database of metadata about the objects. The retrieved class data may include the expected dimensions of objects of the given class (e.g., size, shape, color) and a reference 3-D model.); and
texture data of the reference model ([0152] Once the query object has been classified, data about the identified class may be retrieved from, for example, a database of metadata about the objects. The retrieved class data may include the expected dimensions of objects of the given class (e.g., size, shape, color) and a reference 3-D model.).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Wu in view of Dal Mutto such that the information includes; and texture data of the reference model, as taught by Dal Mutto. The motivation for doing so is to track objects accurately, as taught by Dal Mutto in paragraph(s) [00005].

Regarding claim 14, Wu in view of Dal Mutto and IDA discloses all the limitations of claim 10, including the three-dimensional shape model, the reference model, the virtual viewpoint image, and the obtained three-dimensional shape model.
Wu discloses wherein the virtual viewpoint image is generated based on the corrected model, the obtained model before being corrected, and the plurality of captured images ([0035], [0042] – as shown in Fig. 4, the virtual viewpoint image is generated by the virtual viewpoint image generator from the input of the corrected images, based on the captured images before being corrected, and the plurality of captured images.).

Regarding claim 17, Wu discloses an image processing method comprising: ([0035], [0042], [0050], [0058], [0062] to [0064], [0112], Fig. 3, Fig. 11 - As shown, at the site A, there are provided cameras 11a and 12a to image the user a as an object from different points of view, a display unit 5a to display an image of the user b, captured at the site B, to the user a, and an image processor 2a which generates a virtual viewpoint image on the basis of images Pa1 and Pa2 captured by the cameras 11a and 12a. [0035], [0042] – as shown in Fig. 4, the virtual viewpoint image is generated by the virtual viewpoint image generator from the input of the corrected images.).
For the remaining language, see the rejection of claim 1.

Regarding claim 18, Wu discloses a non-transitory computer readable storage medium storing a program which causes a computer to execute a method comprising ([0013], [0040] - An image processing apparatus that is a personal computer with a computer readable storage medium storing a program that executes the following methods. Furthermore, a personal computer is known to have a computer readable storage medium storing a program.):
For the remaining language, see the rejection of claim 1.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Wu (Publication: US 2005/0129325 A1) in view of Dal Mutto et al. (Publication: US 2019/0108396 A1), IDA ET AL. (PUBLICATION: 2018/0091741 A1), and Ikeda (Publication: 2014/0168464 A1).

Regarding claim 9, Wu in view of Dal Mutto and IDA discloses all the limitations of claim 5, including the specified object and the obtained three-dimensional shape model.
Wu discloses wherein the shape feature includes partial loss of the object in captured images used to generate the obtained model, and the criterion includes a state ([0050], [0058], [0062] to [0064], [0086] - The images that are used to generate the corrected images are one from the left and another from the right. Each of the occlusion cost functions dx(x, y) and dy(x, y) is generated based on the parallax information. As the distance from the cameras 11a and 12a to the user a as an object becomes shorter (the angle becomes larger), the probability of the occlusion region occurring will be higher. As the distance from the cameras 11a and 12a to the user a as the object becomes longer (the angle becomes smaller), the probability of the occlusion region occurring will be lower. That is, the angle between the optical axes of the images is in a state where the value of the angle is smaller than an acceptable angle (largest acceptable angle), so as to reduce the occlusion region.).
However, Wu in view of Dal Mutto and IDA does not disclose that the feature is a sum or an average of ratios of partial loss, or that the data is the sum or the average of the ratios of partial loss of the object.
Ikeda discloses that the feature is a sum or an average of ratios of partial loss ([0028], [0034] - When the highlight-detail loss frame average value Dave is 70% or smaller, that is to say, when no highlight-detail loss frame is extracted, or when a highlight-detail loss frame is extracted but the highlight-detail loss ratio is small, the tone modification amount is 0% and corresponds to the curve A that is the initial value. As the highlight-detail loss frame average value Dave increases, that is to say, as the highlight-detail loss ratio increases, the tone modification amount increases and ultimately corresponds to the curve B, and the frames are expressed more darkly with their tone being emphasized (i.e., with the slope of their input/output being steep). With this tone control, a high-luminance object is darkened such that the degree of the highlight-detail loss can be reduced, and the difference between bright and dark parts is increased. Although the entire image capturing screen is divided into a plurality of areas for calculation in the present embodiment, any area may be divided into a plurality of areas for calculation.);
the data is the sum or the average of the ratios of partial loss of the object ([0028], [0034] - When the highlight-detail loss frame average value Dave is 70% or smaller, that is to say, when no highlight-detail loss frame is extracted, or when a highlight-detail loss frame is extracted but the highlight-detail loss ratio is small, the tone modification amount is 0% and corresponds to the curve A that is the initial value. As the highlight-detail loss frame average value Dave increases, that is to say, as the highlight-detail loss ratio increases, the tone modification amount increases and ultimately corresponds to the curve B, and the frames are expressed more darkly with their tone being emphasized (i.e., with the slope of their input/output being steep). With this tone control, a high-luminance object is darkened such that the degree of the highlight-detail loss can be reduced, and the difference between bright and dark parts is increased. Although the entire image capturing screen is divided into a plurality of areas for calculation in the present embodiment, any area may be divided into a plurality of areas for calculation.).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify Wu in view of  Dal Mutto with feature is a sum or an average of ratios of partial loss; data is the sum or the average of the . 

Response to Arguments

Claim Rejection Under 35 U.S.C. 103
Applicant asserts “Wu, however, does not appear to disclose obtaining a three-dimensional shape model of the specified object which is generated based on a plurality of captured images obtained by the plurality of image capturing apparatuses for generating a virtual viewpoint image, the obtained three-dimensional shape model representing at least one of a distorted contour or a missing portion of the specified object; and modify the obtained three-dimensional shape model of the specified object based on a reference three-dimensional shape model of the specified object to correct the distorted contour or correct the missing portion of the specified object, as set forth in the above-discussed notable features of amended Claim 1.
Dal Mutto, however, also does not appear to disclose that the obtained three-dimensional shape model represents at least one of a distorted contour or a missing portion of the specified object; and modifying the obtained three-dimensional shape model of the specified object based on a reference three-dimensional shape model of the


In response to Applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references.  See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).
It is Wu in view of Dal Mutto that discloses the limitations above, and not Wu alone. See the rejection above for detail.


Regarding claims 2 and 4 - 15, the Applicant asserts that they are not obvious over the cited references based on their dependency from independent claims 16, 17, and 18, respectively. The examiner respectfully cannot concur with the Applicant, for the same reasons noted in the examiner’s response to the arguments asserted for claims 17 and 18, respectively.


Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant's disclosure.  

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Gregory Tryder can be reached on 571-270-7365.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MING WU/
Primary Examiner, Art Unit 2616