Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 10/28/21 is being considered by the examiner.

Response to Arguments
Applicant’s arguments with respect to claims 1, 13 and 19 have been considered but are not persuasive.
Applicant argues that, because the claims recite that particular pixels in input images are projected onto a three-dimensional skeleton of an object, Macmillan's mapping of hemispherical camera images to the cubic representation cannot be interpreted as the same, since Macmillan does not teach mapping particular pixels to the cubic representation (p. 6 of Remarks).
Examiner notes that the claim language “mapping each of the input images to a top-down view of the object at least in part by projecting a plurality of pixels in the input images onto the three-dimensional skeleton of the object” does not require mapping particular pixels to the cubic representation. The claim sets no clear boundary that defines “particular pixels” as argued. The above claim language is merely interpreted as “mapping each view image onto a composite image”. Please note that “a 3D skeleton of the object” is not a real captured image; it should be interpreted as a 3D object model generated from a plurality of view images. Fig 6 is merely a 2D composite image, not a 3D skeleton of the object.
Chen discloses acquiring one or more images of vehicle, such as side view images, front and/or rear view images, top and/or bottom view images, inside view images in C15L28-32; “perform character or facial recognition in images to identify text, people, particular buildings, or other features of objects within images in order to automatically identify people, objects, or other features depicted within the image” in C1L42-45; “select or identify a base object model of the automobile or vehicle depicted in the obtained target object images” in C16L12-14; “the point cloud object model 402 may include points that define the outer contours of the various body panels of the automobile, but the point cloud model 402 could additionally or instead define other contours of the automobile” in C16L47-50; “the base object model…or could be made up of a set of two dimensional models (e.g., from a set of two or more images of the object from different points of view or angles, in a pre-damaged condition)” in C17L3-6; “the base object model 502 includes contour lines drawn between adjacent points in the point cloud model 402, forming triangles. Each triangle defines a particular surface segment of the automobile within the base object model 502 to thereby define the different flat surfaces (e.g., relatively flat or planar surfaces) of the base object…the base object model 502 of FIG. 5 is a triangulated surface model that illustrates the three dimensional surface contours of the vehicle being modeled” in C17L17-34; “if the target object images are used to create a composite target object model, this target object model can be rotated to the same position as the base object model” in C24L23-26.
In summary, Chen teaches “different images may show the same or overlapping portions of the vehicle” in C15L42-44 and generates a 3D skeleton model of a vehicle in Fig 5-6. Here, the overlapped portion within two perspective images (e.g., an overlapped region between a top-down view and a front view) corresponds to a same portion of the 3D skeleton model. Macmillan further discloses a system to capture images with overlapping portions, map a modified first/second hemispherical image to a first/second portion of the 2D projection of a cubic image, and generate an image representative of the spherical FOV (Abstract) as shown in Fig 3.

[Figure: Macmillan, Fig 3 (media_image1.png)]

Macmillan also states “FIG. 6 illustrates an example of a mapping from two hemispherical camera images to the projection of a cubic representation of the spherical FOV including the overlap portions captured by the hemispherical cameras” in [0010]; “the mapper 260 may convert these images to a cubic representation of the spherical images using various warping, mapping, and other image manipulation operations” in [0047]; “Converting the spherical image to the cubic representation may in one embodiment comprise mapping the spherical image to the six faces of a cube that forms the cubic image that represents the spherical FOV” in [0052]. Here, each portion (i.e. the captured images A-F) of the projection of a cubic image in Fig 3 corresponds to a projection of a 3D object (i.e. Chen’s vehicle in Fig 5-6) at a predetermined perspective angle. It is obvious that each pixel of the cubic image is projected from a 3D object, while the 3D model can be a 3D mesh model of a vehicle as shown in Chen’s Fig 5-6.
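For illustration only, the general idea of mapping a spherical view to the six faces of a cube can be sketched as follows. This is a generic cube-mapping sketch and is not taken from Macmillan; the function name, the face labels, and the simplified (u, v) orientation on each face are assumptions made for the example.

```python
def direction_to_cube_face(x, y, z):
    """Map a nonzero 3D view direction to (face, u, v) with u, v in [0, 1].

    Hypothetical helper: the face is chosen by the dominant axis of the
    direction, and the two remaining components are divided by the dominant
    magnitude to land on that face. Per-face orientation is simplified.
    """
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax == 0 and ay == 0 and az == 0:
        raise ValueError("direction must be nonzero")
    if ax >= ay and ax >= az:
        face = "+x" if x > 0 else "-x"
        major, a, b = ax, y, z
    elif ay >= az:
        face = "+y" if y > 0 else "-y"
        major, a, b = ay, x, z
    else:
        face = "+z" if z > 0 else "-z"
        major, a, b = az, x, y
    # Rescale the in-face coordinates from [-1, 1] to [0, 1].
    u = 0.5 * (a / major + 1.0)
    v = 0.5 * (b / major + 1.0)
    return face, u, v
```

Under these assumptions, a direction along the positive x axis lands at the center of the "+x" face, and every pixel of a cube face corresponds to exactly one view direction.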

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 6, 8-10, 12-13 and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (US 10,657,647) in view of Grabner et al. (US 2019/0147221) and Macmillan et al. (US 2018/0027178).
As to Claim 1, Chen teaches a method comprising:
 receiving a request to generate a multi-view panel of a designated object component, the request identifying a plurality of designated viewpoints of the designated object component, each of the designated viewpoints specifying a respective camera pose with respect to the designated object component, wherein the respective camera pose includes a respective rotational angle identifying a respective degree of rotation of the viewpoint relative to a designated fixed position of an object (Chen discloses “In any event, the block 202 may select a subset of the raw images of the target object to use in processing to determine or quantify the change to the target object as compared to the base object model for the target object. If not enough images, or if images of the right type or quality, have not been collected, the block 202 may provide feedback to a user (via a user interface 102 of FIG. 1, for example,) that additional images of the target object are needed for change processing. This operation is indicated by the dotted lines 203 in FIG. 2. Such an indication may include a request for more images of the target object, a request for images of the target object from a particular angle or perspective, distance, etc.” in C9L37-49; “the block 206 performs multi-dimensional alignment so that images of the object generated from the base object model are aligned in three dimensions with the target object as depicted in the corresponding images of the target object as collected by the block 202” in C10L16-21; camera model may include the camera position with respect to the object in C23L54-56; determining the camera angle and camera position in C24L66-67. 
Grabner further discloses “An input image including an object can be obtained, and a pose of the object in the input image can be determined” in Abstract; “The pose estimation can produce a 3D pose with six degrees of freedom, including three dimensions for orientation and three dimensions for translation” in [0076]; “given a single color input image (e.g., an RGB image) including the target object, the pose estimation system 608 can estimate the 3D pose of the target object in the input image” in [0150]; “The camera pose includes six degrees-of-freedom, including the rotation (e.g., roll, pitch, and yaw) and the 3D translation of the camera with respect to the world” in [0159]);
identifying via a processor respective component information for each of a plurality of input images of the object, the respective component information indicating a respective portion of the respective input image in which the designated component of the object is depicted (Chen discloses “the block 206 may isolate the target object as depicted in each of the one or more selected images (from the block 202) by eliminating or reducing background effects and/or content, noise, or other image artifacts” in C10L56-59; “to enable identification of or detection of the boundaries of the various different components of the target object as depicted in the corrected images of the target object” in C11L40-42; “identify (block 1835) the target vehicle is that is depicted in the portion of the set of images…may use any image recognition or analysis technology to ascertain the make, model, and/or year of the target vehicle” in C39L14-18); 
determining a respective viewpoint for each input image via the processor, the respective viewpoint indicating a respective camera pose for the respective input image relative to the object (Chen discloses “the base object model 120 may be a set of two dimensional models (e.g., images) that define different views of a three dimensional object, as well as information defining the exact view (e.g., the angle, the distance, etc. of the object from the focal point of the camera) of the object within the images” in C9L63-C10L1;  “the block 812 can determine the angle and distance at which the camera that took the picture of the target vehicle” in C23L39-41); 
determining via the processor a three-dimensional skeleton of the object based on the viewpoints, a top-down view of the object, and the component information (Chen discloses creating a 3D model of the target object from the set of target images in C3L33-39; target object from various different viewpoints or sides of the target object in C13L39-40; obtaining image data from different angles, positions, view-points, distances, etc. to determine a 3D triangle mesh models in C16L56-65, see also a 3D skeleton of vehicle in Fig 5-6; the images can be captured from a front view, a side view, a top view or a bottom view of the object etc. in C9L10-14); and
storing on a storage device a multi-view panel including the portions of the selected subset of images in which the designated component of the object is depicted (Chen discloses the image processing system stores a set of 3D base object models in C2L15-17; “if the object being analyzed is an automobile, the base object model may identify the various body panels of the automobile separately and may generate images of the object with the various components of the object identified, outlined, or isolated” in C11L3-7; “as illustrated in FIG. 6, the base object model 502 (illustrated in FIG. 6 with the triangular surface segments being slightly grayed) may include indications or outlines of various different body panels (indicated in FIG. 6 by thicker lines) associated with the automobile being modeled… each of these segments may define or correspond to a different body panel or body part that is, for example, viewable from the outside of the automobile and thus that is present in the base object model 502” in C17L37-53; see also database in Fig 1).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Chen with the teaching of Grabner so as to determine poses of objects depicted in images and determine 3D models for representing the objects (Grabner, [0002]).
Chen and Grabner do not directly use the claim language “mapping”. Macmillan, in combination, further teaches the following limitations:
mapping each of the input images to a top-down view of the object at least in part by projecting a plurality of pixels in the input images onto the three-dimensional skeleton of the object (Chen discloses “In many cases, different images may show the same or overlapping portions of the vehicle” in C15L42-44; 3D skeleton model of a vehicle in Fig 5-6. Here, the overlapped portion within two perspective images (e.g. an overlapped region between top-down view and a front view) corresponds to a same portion of the 3D skeleton model. Macmillan further discloses a system for capturing 360 degree content and outputting the content in an encoded projection of a cubic image in [0022]; see also Fig 3 below.

[Figure: Macmillan, Fig 3 (media_image1.png)]

Macmillan also states “FIG. 6 illustrates an example of a mapping from two hemispherical camera images to the projection of a cubic representation of the spherical FOV including the overlap portions captured by the hemispherical cameras” in [0010]; “the mapper 260 may convert these images to a cubic representation of the spherical images using various warping, mapping, and other image manipulation operations” in [0047]; “Converting the spherical image to the cubic representation may in one embodiment comprise mapping the spherical image to the six faces of a cube that forms the cubic image that represents the spherical FOV” in [0052]. Here, each portion (i.e. the captured images A-F) of the projection of a cubic image in Fig 3 corresponds to a projection of a 3D object (i.e. Chen’s vehicle in Fig 5-6) at a predetermined perspective angle. It is obvious that each pixel of the cubic image is projected from a 3D object, while the 3D model can be a 3D mesh model of a vehicle as shown in Chen’s Fig 5-6. See also above “Response to Arguments”);
evaluating via the processor the plurality of input images based on the mapping of the input images to the top-down view to select a subset of the images that includes the designated object component and that is associated with a respective viewpoint that matches one or more of the designated viewpoints (Chen discloses “select a sub-set of a series of images that have been previously collected or taken of the target object, such as different views of the target object from different angles” in C6L50-53; “Generally speaking, the block 206 performs multi-dimensional alignment so that images of the object generated from the base object model are aligned in three dimensions with the target object as depicted in the corresponding images of the target object as collected by the block 202… for each of the selected images, the block 206 produces an image from the base object model that matches or aligns with the orientation of the target object as depicted in the selected image. On the other hand, if desired, the block 206 may process the selected image to reorient the target object within the selected image to align with a particular image generated from the base object model or with a particular view of the base object model… This alignment can correct for (or be used to match) the camera position (e.g., camera angle), focal length, distance to the target object from the focal point of the camera, etc., to thereby size, scale, and orient the image from the base object model in a manner that is matched in size, scale, and orientation to the target object as depicted in the selected image” in C10L5-54. Macmillan further discloses a mapping between the captured images and the projection image of the 3D object in Fig 3 & 6).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Chen and Grabner with the teaching of Macmillan so as to generate a cubic image by mapping all the perspective images on a plane with efficient encoding for transmission (Macmillan, [0031]).
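For illustration only, the kind of relationship discussed above between image pixels, a 3D model, and a top-down view can be sketched as follows. This is a simplified sketch and does not reproduce the claimed method or the method of any cited reference; the pinhole camera with a horizontal heading, the joint names, the focal length, and the nearest-joint rule are all assumptions made for the example.

```python
import math

def project_point(point, cam_pos, yaw, focal=500.0, center=(320.0, 240.0)):
    """Pinhole-project a world point (x, y on the ground, z up) into an image.

    Assumed camera: located at cam_pos, looking horizontally along heading
    `yaw` (radians). Returns (u, v) in pixels, or None if the point is not
    in front of the camera.
    """
    dx = point[0] - cam_pos[0]
    dy = point[1] - cam_pos[1]
    dz = point[2] - cam_pos[2]
    depth = dx * math.cos(yaw) + dy * math.sin(yaw)   # along the forward axis
    right = dx * math.sin(yaw) - dy * math.cos(yaw)   # along the right axis
    if depth <= 0:
        return None
    return (center[0] + focal * right / depth,
            center[1] - focal * dz / depth)

def map_pixel_to_top_down(pixel, skeleton, cam_pos, yaw):
    """Associate a pixel with the skeleton joint whose projection is nearest.

    `skeleton` is a hypothetical dict of joint name -> (x, y, z). The
    top-down location of a joint is taken to be its (x, y) coordinates.
    Returns (joint name, (x, y)) or None if no joint is visible.
    """
    best = None
    for name, point in skeleton.items():
        uv = project_point(point, cam_pos, yaw)
        if uv is None:
            continue
        dist = math.hypot(uv[0] - pixel[0], uv[1] - pixel[1])
        if best is None or dist < best[0]:
            best = (dist, name, (point[0], point[1]))
    if best is None:
        return None
    return best[1], best[2]
```

With a camera at (-5, 0, 1.5) heading along the x axis, a pixel at the image center would be associated with a joint placed at (0, 0, 1.5), whose top-down location is (0, 0).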

As to Claim 6, Chen in view of Grabner and Macmillan teaches the method recited in claim 1, the method further comprising: wherein the multi-view panel is generated based on target viewpoint information defined in the top-down view of the object (Macmillan discloses a mapper 260 to perform a cubic mapping conversion in [0050-0053], see also a cubic image in Fig 3; “the preview render 1210 (and other renders) may be converted back, or mapped back, into a spherical image for presentation to the viewing user using a process that reverses the mapping of the spherical image to the cubic image of the projection” in [0102]; reconstructing 3D scene in [0095]. Here, a cubic image can be mapped back to the captured images from various viewpoints. Chen also discloses “the base object model 120 may be a set of two dimensional models (e.g., images) that define different views of a three dimensional object” in C9L59-60; “as illustrated in FIG. 6, the base object model 502 (illustrated in FIG. 6 with the triangular surface segments being slightly grayed) may include indications or outlines of various different body panels (indicated in FIG. 6 by thicker lines) associated with the automobile being modeled” in C17L37-42.)

As to Claim 8, Chen in view of Grabner and Macmillan teaches the method recited in claim 1, wherein the object is a vehicle, and wherein the three-dimensional skeleton includes a door and a windshield (Chen discloses capturing a set of images of a vehicle from different perspectives in C9L63-C10L1; an image depicts a front view, a side view, a corner view, a top view, a bottom view of the object, etc. in C9L13-14, see also Fig 5-6.)

As to Claim 9, Chen in view of Grabner and Macmillan teaches the method recited in claim 1, wherein the respective viewpoint further includes a respective distance of the camera from the object (Chen discloses “Such an indication may include a request for more images of the target object, a request for images of the target object from a particular angle or perspective, distance, etc.” in C9L46-49; “the base object model 120 may be a set of two dimensional models (e.g., images) that define different views of a three dimensional object, as well as information defining the exact view (e.g., the angle, the distance, etc. of the object from the focal point of the camera) of the object within the images” in C9L63-C10L1).

As to Claim 10, Chen in view of Grabner and Macmillan teaches the method recited in claim 1, wherein the respective camera pose includes a respective vertical angle identifying a respective angular height of the viewpoint relative to a 2D plane parallel to a surface on which the object is situated (Chen discloses camera model may include the camera position with respect to the object in C23L54-56; determining the camera angle and camera position in C24L66-67. Grabner further discloses “An input image including an object can be obtained, and a pose of the object in the input image can be determined” in Abstract; “The pose estimation can produce a 3D pose with six degrees of freedom, including three dimensions for orientation and three dimensions for translation” in [0076]; “given a single color input image (e.g., an RGB image) including the target object, the pose estimation system 608 can estimate the 3D pose of the target object in the input image” in [0150]; “The camera pose includes six degrees-of-freedom, including the rotation (e.g., roll, pitch, and yaw) and the 3D translation of the camera with respect to the world” in [0159].) 

As to Claim 12, Chen in view of Grabner and Macmillan teaches the method recited in claim 1, wherein the respective camera pose includes a respective position identifying a respective position of the viewpoint relative to a designated fixed position of the object (Chen discloses “the base object model 120 may be a set of two dimensional models (e.g., images) that define different views of a three dimensional object, as well as information defining the exact view (e.g., the angle, the distance, etc. of the object from the focal point of the camera) of the object within the images” in C9L63-C10L1; “the camera position (e.g., camera angle), focal length, distance to the target object from the focal point of the camera” in C10L48-50.)
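For illustration only, the viewpoint quantities discussed for claims 1, 9, 10 and 12 (a rotational angle about the object, a vertical angle above the plane on which the object sits, and a camera distance) can be derived from a camera position as sketched below. This sketch is not drawn from Chen or Grabner; the function name, the coordinate convention (x, y on the ground plane, z up), and the choice of the positive x axis as the fixed reference direction are assumptions.

```python
import math

def viewpoint_from_camera_position(cam_pos, obj_pos=(0.0, 0.0, 0.0)):
    """Return (rotation_deg, vertical_deg, distance) of a camera about an object.

    rotation_deg: angle in the ground plane, measured from the +x axis
                  through the object position, in [0, 360).
    vertical_deg: elevation of the camera above the ground plane through
                  the object position, in [-90, 90].
    distance:     straight-line distance from the object to the camera.
    """
    dx = cam_pos[0] - obj_pos[0]
    dy = cam_pos[1] - obj_pos[1]
    dz = cam_pos[2] - obj_pos[2]
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    if distance == 0:
        raise ValueError("camera and object positions coincide")
    rotation_deg = math.degrees(math.atan2(dy, dx)) % 360.0
    vertical_deg = math.degrees(math.asin(dz / distance))
    return rotation_deg, vertical_deg, distance
```

Under these assumptions, a camera placed 3 units from the object along the y axis at ground level has a rotational angle of 90 degrees, a vertical angle of 0 degrees, and a distance of 3.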

Claim 13 recites similar limitations as claim 1 but in a system form. Therefore, the same rationale used for claim 1 is applied.

Claim 17 is rejected based upon similar rationale as Claim 9.
Claim 18 is rejected based upon similar rationale as Claim 10.
Claim 19 recites similar limitations as claim 1 but in a computer readable medium form. Therefore, the same rationale used for claim 1 is applied.
Claim 20 is rejected based upon similar rationale as the combination of Claims 9 and 10.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Chen in view of Grabner, Macmillan and Holzer et al. (US 2017/0277363).
As to Claim 7, Chen in view of Grabner and Macmillan teaches the method recited in claim 1. Holzer, in combination, further teaches wherein the plurality of images form a multi-view capture of the object navigable in three dimensions, the multi-view capture constructed based in part on inertial measurement unit (IMU) data collected from an IMU in a mobile phone (Holzer discloses “multiple images can be captured from various viewpoints and fused together to provide a MIDMR” in [0084]; “the user navigates through the MIDMR” in [0115]; the source of data that can be used to generate a MIDMR includes location information 106 collected by an IMU in [0061].)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the invention of Chen, Grabner and Macmillan with the teaching of Holzer so as to capture location information by an IMU to generate a MIDMR for a user navigation application (Holzer, [0061, 0115]).

Conclusion
THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WEIMING HE whose telephone number is (571)270-1221. The examiner can normally be reached on Monday to Friday from 8:00 am to 4:30 pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Mehmood, can be reached on 571-272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Weiming He/
Primary Examiner, Art Unit 2612