Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Notice to Applicants
This communication is in response to the Application filed on 1/29/2021.
Claims 1-17 are pending.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 10, 13, and 14 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Srinivasan et al. (U.S. Publication No. 2014/0254876) (hereafter, "Srinivasan").
Regarding claim 1, Srinivasan teaches a computer-implemented method comprising: obtaining first data comprising a first collection of coordinate data representing a position, or positions, of a first group of one or more objects detected in a first frame of a scene; obtaining second data comprising a second collection of coordinate data representing a position, or positions, of a second group of one or more objects detected in a second frame of the scene ([0034] The example audience measurement device 104 of FIG. 1 utilizes first and second image sensors 110 and 112 to capture a plurality of frame pairs of image data of the environment 100; [0051] the information passed to the face data tracker 504 for each frame includes any generated face rectangles and corresponding coordinate(s) within the frame; [0037] The example audience measurement device 104 also records a position of the detected face. In the illustrated example of FIG. 3, the recorded position 304 is defined by X-Y coordinates at a center of the face box 300 (e.g., the point 304 of FIG. 3) surrounding the face), wherein the second frame represents a different view of the scene than the first frame ([0034] The first image sensor 110 captures a first image within a first field of view and the second image sensor 112 simultaneously (e.g., within a margin of error) captures a second image within a second field of view); and determining whether any of the first group of objects correspond to any of the second group of objects in the scene based on the first collection of coordinate data and the second collection of coordinate data ([0058] The example checker 706 of FIG. 7 receives the face rectangles that have been designated by the false positive identifier 704 as false positives and determines whether any of those face rectangles have previously (e.g., in connection with a previous frame or set of frames) been verified as corresponding to a human face. The example checker 706 of FIG. 7 includes a location calculator 718 to determine coordinates at which a received false positive face rectangle is located. In the illustrated example, the location calculator 718 retrieves the location data from the face data tracker 504 which, as described above, receives location data in association with the detected face rectangles from the face detector 502. The example checker 706 of FIG. 7 also includes a prior frame retriever 720 that retrieves data from the frame database 510 of FIG. 5 using the coordinates of the received false positive face rectangle. The example frame database 510 of FIG. 5 includes historical data indicative of successful face detections and the locations (e.g., coordinates) of the successful face detections. The example prior frame retriever 720 queries the frame database 510 with the coordinates of the received false positive face rectangle to determine whether a face was detected at that location in a previous frame within a threshold amount of time (e.g., within the previous twelve frames)), wherein the first frame and the second frame are obtained from image data captured from a single camera position ([0033] The example audience measurement device 104 can be implemented in additional and/or alternative types of environments such as, for example, a room in a non-statistically selected household, a theater, a restaurant, a tavern, a retail location, an arena, etc.; [0034] The example audience measurement device 104 of FIG. 1 utilizes first and second image sensors 110 and 112 to capture a plurality of frame pairs of image data of the environment 100).
Regarding claim 10, Srinivasan teaches wherein determining whether any of the first group of objects correspond to any of the second group of objects, comprises: applying one or more object recognition algorithms to image data representing the first group of objects and to image data representing the second group of objects to identify at least some of the first group of objects and the second group of objects ([0064] when a face rectangle detected in connection with the second image sensor 405 is determined to be redundant to a face rectangle detected in connection with the first image sensor 404, an entry is added to the correlated face history 808; [0065] The example location calculator 810 of FIG. 8 determines first coordinates for the face rectangle detected in connection with the first image sensor 404 and second coordinates for the overlap face rectangle detected in connection with the second image sensor 405 (the face rectangle determined to be redundant); [0066] The example frame pair overlap eliminator 514 includes a searcher 812 to query the correlated face history 808 such that a face rectangle detected in connection with the second image sensor 405 in a current frame pair can be identified as a redundancy based on its location and the presence of another face rectangle detected in connection with the first image sensor 404 at the counterpart location of the first frame); and determining whether any of the first group of objects correspond to any of the second group of objects based on the first collection of coordinate data, the second collection of coordinate data, and the identifying of at least some of the first group of objects and the second group of objects ([0069] An example implementation of the example grouper 516 is illustrated in FIG. 9. The example grouper 516 of FIG. 9 includes a location calculator 900 to obtain the location of the face rectangles of the sets 506 and 508. As described above, location information of the face rectangles (e.g., coordinates of a center of the corresponding detected face rectangle as shown in FIG. 3) is stored in the example face data tracker 504 and, thus, the example location calculator 900 retrieves the location information from the data tracker 504 in the illustrated example. The example grouper 516 of FIG. 9 also includes a comparator 902 to compare the retrieved locations of the face rectangles. The example comparator 902 determines whether any of the face rectangles of the sets 506, 508 collected over the period of time defined by the interval tracker 500 are similarly located within a threshold).
With respect to claim 13, arguments analogous to those presented for claim 1 are applicable.
With respect to claim 14, arguments analogous to those presented for claims 1 and 10 are applicable.
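For illustration only, the coordinate-based correspondence determination discussed for claims 1 and 10 (comparing detected positions and treating detections that are similarly located within a threshold as the same object) might be sketched as follows. This is a minimal sketch, not code from Srinivasan or from the application: the function name `match_detections`, the greedy nearest-neighbour pairing, and the assumption that both coordinate collections are already expressed in one frame of reference are all hypothetical.

```python
from math import hypot


def match_detections(first, second, threshold):
    """Greedily pair (x, y) centers from `first` with centers from `second`.

    Hypothetical helper: assumes both lists are already in the same
    coordinate frame. Returns a list of (index_in_first, index_in_second).
    """
    matches = []
    used = set()  # indices of `second` that are already paired
    for i, (x1, y1) in enumerate(first):
        best_j, best_d = None, threshold
        for j, (x2, y2) in enumerate(second):
            if j in used:
                continue
            d = hypot(x2 - x1, y2 - y1)  # Euclidean distance between centers
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used.add(best_j)
            matches.append((i, best_j))
    return matches


if __name__ == "__main__":
    frame_a = [(10.0, 10.0), (50.0, 50.0)]
    frame_b = [(52.0, 49.0), (11.0, 12.0), (200.0, 200.0)]
    print(match_detections(frame_a, frame_b, threshold=5.0))
```

Detections in the second collection that have no counterpart within the threshold are simply left unpaired in this sketch.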

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 3, and 4 are rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan et al. (U.S. Publication No. 2014/0254876) (hereafter, "Srinivasan") in view of Xiong et al. (U.S. Publication No. 2021/0354299) (hereafter, "Xiong").
Regarding claim 2, Srinivasan teaches all the limitations of claim 1 above. Srinivasan teaches wherein determining whether any of the first group of objects correspond to any of the second group of objects comprises ([0058] determines whether any of those face rectangles have previously (e.g., in connection with a previous frame or set of frames) been verified as corresponding to a human face. The example checker 706 of FIG. 7 includes a location calculator 718 to determine coordinates at which a received false positive face rectangle is located … The example checker 706 of FIG. 7 also includes a prior frame retriever 720 that retrieves data from the frame database 510 of FIG. 5 using the coordinates of the received false positive face rectangle ... The example prior frame retriever 720 queries the frame database 510 with the coordinates of the received false positive face rectangle to determine whether a face was detected at that location in a previous frame within a threshold amount of time (e.g., within the previous twelve frames)).
Srinivasan does not expressly teach transforming at least one of the first collection of coordinate data and the second collection of coordinate data such that the first collection of coordinate data and the second collection of coordinate data correspond to a common coordinate system; and determining differences between the first collection of coordinate data and the second collection of coordinate data according to the common coordinate system.
However, Xiong teaches transforming at least one of the first collection of coordinate data and the second collection of coordinate data such that the first collection of coordinate data and the second collection of coordinate data correspond to a common coordinate system ([0047] Step 204: converting the first sensor data and the second sensor data to a same coordinate system to obtain corresponding first converted sensor data and second converted sensor data); and determining differences between the first collection of coordinate data and the second collection of coordinate data according to the common coordinate system ([0045] It is for the convenience of calibrating the external parameter based on the difference between the first sensor data and the second sensor data; [0048] In which, since the first sensor data collected by the first sensor is data in the coordinate system of the first sensor, while the second sensor data collected by the second sensor is data in the coordinate system of the second sensor, in order to facilitate comparison between the two, it needs to convert them to the same coordinate system).
It would have been obvious before the effective filing date of the claimed invention to one having ordinary skill in the art to modify the device and method of Srinivasan to incorporate the step/system of converting the first coordinate data and the second coordinate data to a same coordinate system and determining differences between the first coordinate data and the second coordinate data according to the same coordinate system taught by Xiong.
The suggestion/motivation for doing so would have been to improve the consistency of the first sensor and the second sensor in the coordinate system ([0029] The positional relationship parameter of the first sensor and the second sensor can be solved by only moving the calibration reference object to collect the N sets of the coordinate data, which not only simplifies the calibration, but also greatly reduces the deviation to improve the consistency of the first sensor and the second sensor in the coordinate system of the robot). Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predicted results. Therefore, it would have been obvious to combine Srinivasan with Xiong to obtain the invention as specified in claim 2.
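For illustration only, the two steps relied upon for claim 2 (converting coordinate data to a common coordinate system, then determining differences in that system) might be sketched as follows. This is a minimal sketch under stated assumptions, not code from Xiong or Srinivasan: it assumes a planar rigid transform with a known offset and rotation, and the names `to_common_frame`, `coordinate_differences`, `dx`, `dy`, and `yaw` are hypothetical.

```python
from math import cos, sin, hypot, pi


def to_common_frame(points, dx, dy, yaw):
    """Rotate each (x, y) point by `yaw` radians, then translate by (dx, dy).

    Hypothetical helper: the transform parameters are assumed to be known.
    """
    c, s = cos(yaw), sin(yaw)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]


def coordinate_differences(first, second):
    """Euclidean distance between index-aligned points in the same frame."""
    return [hypot(x1 - x2, y1 - y2) for (x1, y1), (x2, y2) in zip(first, second)]


if __name__ == "__main__":
    first = [(0.0, 1.0), (2.0, 2.0)]
    second_raw = [(1.0, 0.0), (2.0, -2.0)]
    # Assumed relationship: the second frame is rotated 90 degrees from the first.
    second_common = to_common_frame(second_raw, dx=0.0, dy=0.0, yaw=pi / 2)
    print(coordinate_differences(first, second_common))
```

Small differences after the conversion would suggest that the two collections describe the same positions; the threshold applied to those differences is left to the caller.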
Regarding claim 3, the combination of Srinivasan and Xiong teaches all the limitations of claim 2 above. Xiong teaches wherein the first data comprises first calibration data associated with at least one characteristic of the first frame, the second data comprises second calibration data associated with at least one characteristic of the second frame, and wherein the transforming of at least one of the first collection of coordinate data and the second collection of coordinate data includes using the first calibration data and the second calibration data ([0048] the first sensor data and the second sensor data can be uniformly converted to the coordinate system of the first sensor or the coordinate system of the second sensor so that both are in the same coordinate system and to obtain the first converted sensor data and the second converted sensor data. In another embodiment, the first sensor data and the second sensor data are respectively converted to the coordinate system of the robot to obtain the first converted sensor data and the second converted sensor data, respectively; [0049] Step 206: determining a first coordinate of a reference point of the calibration reference object based on the first converted sensor data, and determining a second coordinate of the reference point of the calibration reference object based on the second converted sensor data, and using the first coordinate and the second coordinate as a set of coordinate data). Motivation for this combination has been stated in claim 2.
Regarding claim 4, the combination of Srinivasan and Xiong teaches all the limitations of claim 3 above. Xiong teaches wherein transforming at least one of the first collection of coordinate data and the second collection of coordinate data comprises transforming the first collection of coordinate data and the second collection of coordinate data to a spherical coordinate system ([0053] for a three-dimensional space, which includes 6 degrees of freedom [x, y, z, roll, pitch, yaw], where “roll” represents rotating around the z axis, “pitch” represents rotating about the x-axis, and “yaw” represents rotating about the y-axis. Then, N should be greater than or equal to 6; [0065] it assumes that the data set of the first sensor is (xi,yi) and the data set of the second sensor is (xi′,yi′), and for the 2D navigation application of the robot, the parameter to be solved is (Δx, Δy, Δyaw)). Motivation for this combination has been stated in claim 2.
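For illustration only, a transformation of Cartesian coordinate data to a spherical coordinate system, as recited in claim 4, might be sketched as follows. This is a minimal sketch and is not drawn from Xiong or from the application: the function name `cartesian_to_spherical` is hypothetical, and the convention used (radius, polar angle measured from the +z axis, azimuth measured in the x-y plane) is one assumed choice among several.

```python
from math import sqrt, acos, atan2


def cartesian_to_spherical(x, y, z):
    """Return (r, theta, phi) for a Cartesian point.

    Assumed convention: theta is the polar angle from the +z axis and
    phi is the azimuth in the x-y plane, both in radians.
    """
    r = sqrt(x * x + y * y + z * z)
    if r == 0.0:
        return (0.0, 0.0, 0.0)  # angles are undefined at the origin; use zeros
    theta = acos(z / r)
    phi = atan2(y, x)
    return (r, theta, phi)


if __name__ == "__main__":
    print(cartesian_to_spherical(1.0, 0.0, 0.0))
    print(cartesian_to_spherical(0.0, 0.0, 2.0))
```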

Claims 5 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan et al. (U.S. Publication No. 2014/0254876) (hereafter, "Srinivasan") in view of Xiong et al. (U.S. Publication No. 2021/0354299) (hereafter, "Xiong") and further in view of Tao et al. (U.S. Publication No. 2014/0152647) (hereafter, "Tao").
Regarding claim 5, the combination of Srinivasan and Xiong teaches all the limitations of claim 2 above. The combination of Srinivasan and Xiong does not expressly teach comprising obtaining a set of depth estimations each depth estimation being associated with a respective object of the first group of objects and the second group of objects, wherein determining differences between the first collection of coordinate data and the second collection of coordinate data includes determining differences between respective depth estimations.
However, Tao teaches comprising obtaining a set of depth estimations each depth estimation being associated with a respective object of the first group of objects and the second group of objects, wherein determining differences between the first collection of coordinate data and the second collection of coordinate data includes determining differences between respective depth estimations ([0065] The position of the image objects 1002a-c, 1004a-c, and 1006a-c in the rows 1204a-c of the two-dimensional epipolar data structure 1202 corresponds to the depths of the respective objects 1002, 1004, 1006 within the imaged space; [0066] FIG. 13 is a modeling diagram depicting analysis of image data using a two-dimensional epipolar data structure 1202. The image manipulation application 116 can determine a depth of an object in an image, such as a pixel, by analyzing the data of the two-dimensional epipolar data structure ... The image manipulation application 116 can estimate a depth of each object based on a derivative of the function (i.e., the slope and direction of the line defined by the function); [0071] the image manipulation application 116 can use other suitable algorithms to estimate a depth of an object in an imaged spaced based on the relationship between objects in different rows of an epipolar data structure; [0059] The image manipulation application can use two-dimensional epipolar data structures to determine the displacement of objects between different perspectives of an image space; [0077] Determining the displacement in a single direction can improve the accuracy of a depth map determined using the displacement between image pairs).
It would have been obvious before the effective filing date of the claimed invention to one having ordinary skill in the art to modify the device and method of the combination of Srinivasan and Xiong to incorporate the step/system of obtaining a set of depth estimations each depth estimation being associated with a respective object of each group of objects and determining displacement of objects between different perspectives including determining depth map by using the displacement between image pairs taught by Tao.
The suggestion/motivation for doing so would have been to improve the accuracy of depth estimation ([0007] systems and methods are desirable for improving the accuracy of depth estimation; [0077] Determining the displacement in a single direction can improve the accuracy of a depth map determined using the displacement between image pairs). Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predicted results. Therefore, it would have been obvious to combine Srinivasan and Xiong with Tao to obtain the invention as specified in claim 5.
Regarding claim 6, the combination of Srinivasan, Xiong and Tao teaches all the limitations of claim 5 above. Tao teaches wherein the set of depth estimates are determined by applying one or more monocular depth estimation algorithms to image data representing the first frame and image data representing the second frame ([0171] the location space may be the physical space (e.g., the scene space). In one such example, the determined reference position is based on at least one point selected using a video stream that includes depth information, such as a video stream from a structured light imager or other depth camera (e.g., Microsoft Kinect). Such a video stream may be displayed on a touchscreen by, for example, mapping the depth value of each pixel to a corresponding color). Motivation for this combination has been stated in claim 5.

Claims 7, 8, 11, 16, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan et al. (U.S. Publication No. 2014/0254876) (hereafter, "Srinivasan") in view of KORNIENKO et al. (U.S. Publication No. 2021/0118104) (hereafter, "KORNIENKO").
Regarding claim 7, Srinivasan teaches all the limitations of claim 1 above. Srinivasan does not expressly teach wherein the first frame and second frame are each generated by applying at least one transformation to at least part of the image data captured from the single camera position to adjust a geometric distortion of a frame represented by the image data.
However, KORNIENKO teaches wherein the first frame and second frame are each generated by applying at least one transformation to at least part of the image data captured from the single camera position to adjust a geometric distortion of a frame represented by the image data ([0018] Each half of the image may be transformed using a stereographic projection to a flat plane to generate two transformed images. In this way, a panoramic view of the scene obtained by the image capture device may be created using the two transformed images. This allows a single camera to be used to capture the scene in the room, instead of a pair of opposed cameras; [0032] Transformation data 270 representing at least one transformation for application to the input image data to adjust a geometric distortion of the input frame may also be obtained).
It would have been obvious before the effective filing date of the claimed invention to one having ordinary skill in the art to modify the device and method of Srinivasan to incorporate the step/system of generating each frame by applying transformation to the input image data captured from the single camera position to adjust a geometric distortion of the input frame taught by KORNIENKO.
The suggestion/motivation for doing so would have been to reduce the memory and other hardware resources used to process the image data and to reduce artefacts ([0015] the input image data comprising a geometric distortion in at least part of the frame represented by the input image data may be streamed into temporary storage and processed in a suitable order to generate output image data in which the geometric distortion is adjusted. This may reduce the memory and other hardware resources which are used to process input image data representing input frames comprising geometric distortion as well as increasing the throughput of a corresponding image processing system; [0016] With this transformation, wide view angle images can be generated with reduced artefacts). Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predicted results. Therefore, it would have been obvious to combine Srinivasan and KORNIENKO to obtain the invention as specified in claim 7.
Regarding claim 8, the combination of Srinivasan and KORNIENKO teaches all the limitations of claim 7 above. Srinivasan teaches wherein the first frame and the second frame are generated at least in part from the same image data, the image data having been captured by a single camera ([0034] The example audience measurement device 104 of FIG. 1 utilizes first and second image sensors 110 and 112 to capture a plurality of frame pairs of image data of the environment 100. The first image sensor 110 captures a first image within a first field of view and the second image sensor 112 simultaneously (e.g., within a margin of error) captures a second image within a second field of view; [0036] A first region including the field of view of the first image sensor 110 is labeled with reference numeral 200 in FIG. 2. A second region including the field of view of the second image sensor 112 is labeled with reference numeral 202 in FIG. 2. An overlap region in which the first region 200 and the second region 202 intersect is labeled with reference numeral 204 in FIG. 2).
Regarding claim 11, the combination of Srinivasan and KORNIENKO teaches all the limitations of claim 7 above. Srinivasan teaches comprising, based on the determining of whether any of the first group of objects correspond to any of the second group of objects ([0058] determines whether any of those face rectangles have previously (e.g., in connection with a previous frame or set of frames) been verified as corresponding to a human face. The example checker 706 of FIG. 7 includes a location calculator 718 to determine coordinates at which a received false positive face rectangle is located … The example checker 706 of FIG. 7 also includes a prior frame retriever 720 that retrieves data from the frame database 510 of FIG. 5 using the coordinates of the received false positive face rectangle ... The example prior frame retriever 720 queries the frame database 510 with the coordinates of the received false positive face rectangle to determine whether a face was detected at that location in a previous frame within a threshold amount of time (e.g., within the previous twelve frames)), whereby to reduce a likelihood that any of the first group of objects correspond to any of the second group of objects ([0031] when examples disclosed herein group together face detections for the period of time using the individual frames, examples disclosed herein also eliminate redundant ones of the groups that fall in the overlap region. Thus, examples disclosed herein eliminate redundant face detections in individual frames collected over a period of time, as well as redundant face detection groups formed from for the period of time using the individual frames; [0063] If the comparator 806 determines that the overlap face rectangle detected in connection with the second image sensor 405 is sufficiently similar to one of the face rectangles detected in connection with the first image sensor 404 (e.g., within the threshold), the comparator 806 designates the overlap rectangle detected in connection with the second image sensor 405 as a redundant face rectangle and eliminates the face rectangle from the second set of face rectangles 508 of the data tracker 504 of FIG. 5; See Para. 0071).
Srinivasan does not expressly teach modifying the at least one transformation applied to the image data captured from the single camera position to adjust the geometric distortion of the frame to alter either of the first view of the scene and second view of the scene.
However, KORNIENKO teaches modifying the at least one transformation applied to the image data captured from the single camera position to adjust the geometric distortion of the frame to alter either of the first view of the scene and second view of the scene ([0018] Each half of the image may be transformed using a stereographic projection to a flat plane to generate two transformed images. In this way, a panoramic view of the scene obtained by the image capture device may be created using the two transformed images. This allows a single camera to be used to capture the scene in the room, instead of a pair of opposed cameras; [0032] Transformation data 270 representing at least one transformation for application to the input image data to adjust a geometric distortion of the input frame may also be obtained… The type of transformation which is used may also depend on a desired geometric distortion in the output frame).
It would have been obvious before the effective filing date of the claimed invention to one having ordinary skill in the art to modify the device and method of Srinivasan to incorporate the step/system of modifying transformation to the input image data captured from the single camera position to adjust a geometric distortion of the input frame taught by KORNIENKO.
The suggestion/motivation for doing so would have been to reduce the memory and other hardware resources used to process the image data and to reduce artefacts ([0015] the input image data comprising a geometric distortion in at least part of the frame represented by the input image data may be streamed into temporary storage and processed in a suitable order to generate output image data in which the geometric distortion is adjusted. This may reduce the memory and other hardware resources which are used to process input image data representing input frames comprising geometric distortion as well as increasing the throughput of a corresponding image processing system; [0016] With this transformation, wide view angle images can be generated with reduced artefacts). Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predicted results. Therefore, it would have been obvious to combine Srinivasan and KORNIENKO to obtain the invention as specified in claim 11.
With respect to claim 16, arguments analogous to those presented for claim 7 are applicable.
With respect to claim 17, arguments analogous to those presented for claim 11 are applicable.

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan et al. (U.S. Publication No. 2014/0254876) (hereafter, "Srinivasan") in view of KORNIENKO et al. (U.S. Publication No. 2021/0118104) (hereafter, "KORNIENKO") and further in view of Xiong et al. (U.S. Publication No. 2021/0354299) (hereafter, "Xiong").
Regarding claim 9, the combination of Srinivasan and KORNIENKO teaches all the limitations of claim 7 above. The combination of Srinivasan and KORNIENKO does not expressly teach wherein generating the first frame and the second frame comprises generating first calibration data associated with at least one characteristic of the first frame and generating second calibration data associated with at least one characteristic of the second frame.
However, Xiong teaches wherein generating the first frame and the second frame comprises generating first calibration data associated with at least one characteristic of the first frame and generating second calibration data associated with at least one characteristic of the second frame ([0048] the first sensor data and the second sensor data can be uniformly converted to the coordinate system of the first sensor or the coordinate system of the second sensor so that both are in the same coordinate system and to obtain the first converted sensor data and the second converted sensor data; [0049] Step 206: determining a first coordinate of a reference point of the calibration reference object based on the first converted sensor data, and determining a second coordinate of the reference point of the calibration reference object based on the second converted sensor data, and using the first coordinate and the second coordinate as a set of coordinate data).
It would have been obvious before the effective filing date of the claimed invention to one having ordinary skill in the art to modify the device and method of Srinivasan and KORNIENKO to incorporate the step/system of determining a first coordinate of a reference point of the calibration reference object based on the first converted sensor data and a second coordinate of the reference point of the calibration reference object based on the second converted sensor data taught by Xiong. 
The suggestion/motivation for doing so would have been to improve the consistency of the first sensor and the second sensor in the coordinate system ([0029] The positional relationship parameter of the first sensor and the second sensor can be solved by only moving the calibration reference object to collect the N sets of the coordinate data, which not only simplifies the calibration, but also greatly reduces the deviation to improve the consistency of the first sensor and the second sensor in the coordinate system of the robot). Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predicted results. Therefore, it would have been obvious to combine Srinivasan and KORNIENKO with Xiong to obtain the invention as specified in claim 9.

Claims 12 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan et al. (U.S. Publication No. 2014/0254876) (hereafter, "Srinivasan") in view of Yoshimura et al. (U.S. Publication No. 2021/0225080) (hereafter, "Yoshimura").
Regarding claim 12, Srinivasan teaches all the limitations of claim 1 above. Srinivasan does not expressly teach comprising obtaining scene data representing a layout of the scene, and, using the scene data, determine a third collection of coordinate data representing positions of objects in the layout of the scene from the first collection of coordinate data and the second collection of coordinate data.
However, Yoshimura teaches comprising obtaining scene data representing a layout of the scene, and, using the scene data, determine a third collection of coordinate data representing positions of objects in the layout of the scene from the first collection of coordinate data and the second collection of coordinate data ([0034] a description is given of processing for generating, in addition to the virtual viewpoint image, a layout that is a figure representing positions of objects based on images captured from a plurality of different viewpoints; [0039] The scene generation unit 204 generates scene data to be used for rendering of the virtual viewpoint image, based on the images and the camera parameters (viewpoint information) acquired by the data acquisition unit 203; [0040] The layout generation unit 205 generates a layout that is a figure representing positions of the objects in the scene data, based on the scene data generated by the scene generation unit 204; [0043] In step S301, the data acquisition unit 203 acquires, from the external server 111, data of the plurality of images captured by the plurality of imaging units disposed at the plurality of different positions and to be used for generation of the virtual viewpoint image).
It would have been obvious before the effective filing date of the claimed invention to one having ordinary skill in the art to modify the device and method of Srinivasan to incorporate the step/system of generating a layout which is a figure representing positions of the objects based on the scene data and images captured from a plurality of different viewpoints taught by Yoshimura.
The suggestion/motivation for doing so would have been to improve a viewer's comprehension of a scene ([0034] a description is given of processing for generating, in addition to the virtual viewpoint image, a layout that is a figure representing positions of objects based on images captured from a plurality of different viewpoints and for displaying the layout together with the virtual viewpoint image in order to improve viewer's comprehension of a scene). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Srinivasan and Yoshimura to obtain the invention as specified in claim 12.
Regarding claim 15, Srinivasan teaches all the limitations of claim 14 above. Srinivasan further teaches wherein the at least one memory further comprises computer-readable instructions which, when executed by the at least one processor, cause the at least one processor to: receive object location data comprising an indication of whether any of the first group of objects correspond to any of the second group of objects ([0058] determines whether any of those face rectangles have previously (e.g., in connection with a previous frame or set of frames) been verified as corresponding to a human face. The example checker 706 of FIG. 7 includes a location calculator 718 to determine coordinates at which a received false positive face rectangle is located … The example checker 706 of FIG. 7 also includes a prior frame retriever 720 that retrieves data from the frame database 510 of FIG. 5 using the coordinates of the received false positive face rectangle ... The example prior frame retriever 720 queries the frame database 510 with the coordinates of the received false positive face rectangle to determine whether a face was detected at that location in a previous frame within a threshold amount of time (e.g., within the previous twelve frames)).
Srinivasan does not expressly teach generate a mapping of objects in the scene based on the object location data and scene data representing a layout of the scene.
However, Yoshimura teaches generate a mapping of objects in the scene based on the object location data and scene data representing a layout of the scene ([0039] The scene generation unit 204 generates scene data to be used for rendering of the virtual viewpoint image, based on the images and the camera parameters (viewpoint information) acquired by the data acquisition unit 203; [0040] The layout generation unit 205 generates a layout that is a figure representing positions of the objects in the scene data, based on the scene data generated by the scene generation unit 204).
It would have been obvious before the effective filing date of the claimed invention to one having ordinary skill in the art to modify the device and method of Srinivasan to incorporate the step/system of generating a layout that is a figure representing positions of the objects based on the scene data taught by Yoshimura.
The suggestion/motivation for doing so would have been to improve a viewer's comprehension of a scene ([0034] a description is given of processing for generating, in addition to the virtual viewpoint image, a layout that is a figure representing positions of objects based on images captured from a plurality of different viewpoints and for displaying the layout together with the virtual viewpoint image in order to improve viewer's comprehension of a scene). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Srinivasan and Yoshimura to obtain the invention as specified in claim 15.
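For illustration only, the prior-frame lookup quoted from Srinivasan [0058] in the claim 15 analysis above (querying stored frame data with the coordinates of a face rectangle to determine whether a face was detected at that location within the previous twelve frames) can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Srinivasan's implementation: the class name `FrameHistory`, its methods, and the pixel `tolerance` used to decide that two coordinates are the same location are all hypothetical. Only the twelve-frame window comes from the cited passage.

```python
from collections import deque


class FrameHistory:
    """Keeps face-center coordinates for the most recent frames (hypothetical helper)."""

    def __init__(self, max_frames=12):
        # The twelve-frame window mirrors the example threshold in the cited
        # passage; older frames are discarded automatically by the deque.
        self._frames = deque(maxlen=max_frames)

    def record(self, face_centers):
        """Store the (x, y) centers of the faces detected in one frame."""
        self._frames.append(list(face_centers))

    def face_seen_near(self, x, y, tolerance=10.0):
        """Return True if any stored frame has a face within `tolerance` of (x, y).

        Assumption: "at that location" is modeled as an axis-aligned box of
        half-width `tolerance` around the queried coordinates.
        """
        for centers in self._frames:
            for cx, cy in centers:
                if abs(cx - x) <= tolerance and abs(cy - y) <= tolerance:
                    return True
        return False
```

A caller would invoke `record` once per frame and then call `face_seen_near` with the coordinates of a rectangle flagged as a possible false positive. A `True` result plays the role of the indication that a detection at that position corresponds to an earlier one.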

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL C. CHANG whose telephone number is (571) 270-1277. The examiner can normally be reached Monday-Thursday and alternate Fridays, 8:00-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan S. Park, can be reached at (571) 272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DANIEL C CHANG/
Examiner, Art Unit 2669

/CHAN S PARK/
Supervisory Patent Examiner, Art Unit 2669