DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Summary of Amendments
Claims 1, 17 and 18 have been amended.
Claims 2-16 have been previously presented.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6, 10 and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Clark et al. (hereinafter “Clark”, US 2018/0189974) in view of Li et al. (hereinafter “Li”, “Stereo Vision-based Semantic 3D Object and Ego-motion Tracking for Autonomous Driving”) in further view of Somasundaram et al. (hereinafter “Somasundaram”, US 2016/0004811).
Regarding claim 1, Clark teaches a method (0002 lines 1-6) comprising, by a computing system (0013 lines 1-9):
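accessing a plurality of images captured by one or more cameras from a plurality of camera poses (0005 lines 5-11 and Fig. 4: 200a);
generating, using the plurality of images, a plurality of semantic segmentations comprising semantic information of one or more objects captured in the plurality of images (0015 lines 18-22 and 0042 lines 1-14);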
accessing a three-dimensional (3D) model of the one or more objects (0015 lines 18-22 and Fig. 2: 206). However, Clark fails to teach determining, using the plurality of camera poses, a corresponding plurality of virtual camera poses relative to the 3D model of the one or more objects; and generating a semantic 3D model by projecting the semantic information of the plurality of semantic segmentations towards the 3D model using the plurality of virtual camera poses, wherein the semantic information from two or more of the plurality of semantic segmentations are combined to apply to a first object of the one or more objects.
Li teaches determining, using the plurality of camera poses, a corresponding plurality of virtual camera poses relative to the 3D model of the one or more objects (sec. 8 lines 4-6 and Fig. 2(c): ‘Camera pose’); and
generating a semantic 3D model by projecting the semantic information of the plurality of semantic segmentations towards the 3D model using the plurality of virtual camera poses (abst. lines 1-12, Fig. 1: capt. lines 1-4 and Fig. 2(c): ‘Semantic Info’). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the segmentation information of Clark with the semantic 3D models of Li because this modification would improve tracking of 3D models displayed within a plurality of images through providing semantic data related to the models during changes in virtual camera poses or viewpoints in the scene. However, Clark and Li fail to teach wherein the semantic information from two or more of the plurality of semantic segmentations are combined to apply to a first object of the one or more objects.
Somasundaram teaches wherein the semantic information from two or more of the plurality of semantic segmentations are combined to apply to a first object of the one or more objects (0023 lines 1-19 and 0027 lines 5-10, in which information related to a plurality of semantic segmentations are combined for application to a respective model). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the segmentation information of Clark and semantic 3D models of Li with the combined segmentation of Somasundaram because this modification would save time utilized to view a plurality of segmentation information through consolidation of that information together for application to the view of an object.
Regarding claim 2, Clark fails to teach generating, using the plurality of images, a plurality of geometry-based segmentations comprising geometric information of the one or more objects captured in the plurality of images. Li teaches generating, using the plurality of images, a plurality of geometry-based segmentations comprising geometric information of the one or more objects captured in the plurality of images (Fig. 1, in which a plurality of segmentations are displayed using geometric information related to the several tracked objects through a plurality of images). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the segmentation information of Clark with the semantic 3D models of Li because this modification would improve tracking of 3D models displayed within a plurality of images through providing semantic data related to the models during changes in virtual camera poses or viewpoints in the scene.
Regarding claim 3, Clark fails to teach wherein generating the semantic 3D model by projecting the semantic information of the plurality of semantic segmentations further comprises using the geometric information of the one or more objects to project the semantic information. Li teaches wherein generating the semantic 3D model by projecting the semantic information of the plurality of semantic segmentations further comprises using the geometric information of the one or more objects to project the semantic information (Fig. 1, in which several 3D semantic box models are generated in relation to several objects captured within the plurality of images). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the segmentation information of Clark with the semantic 3D models of Li because this modification would improve tracking of 3D models displayed within a plurality of images through providing semantic data related to the models during changes in virtual camera poses or viewpoints in the scene.
Regarding claim 4, Clark fails to teach further comprising generating, using the plurality of images, a plurality of instance segmentations comprising object identification of the one or more objects captured in the plurality of images. Li teaches further comprising generating, using the plurality of images, a plurality of instance segmentations comprising object identification of the one or more objects captured in the plurality of images (Fig. 2(a)-(c), in which a plurality of segmentations of objects within a plurality of images are provided to identify the objects tracked within those images). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the segmentation information of Clark with the semantic 3D models of Li because this modification would improve tracking of 3D models displayed within a plurality of images through providing semantic data related to the models during changes in virtual camera poses or viewpoints in the scene.
and shown in Fig. 2(c): ‘Camera Pose’, in which a plurality of camera poses related to segmented 3D models are identified within a plurality of images). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the segmentation information of Clark with the semantic 3D models of Li because this modification would improve tracking of 3D models displayed within a plurality of images through providing semantic data related to the models during changes in virtual camera poses or viewpoints in the scene.
Regarding claims 6 and 15, Clark fails to teach wherein generating the instance 3D model comprises combining object identification from each of the plurality of instance segmentations to apply to one of the one or more objects. Li teaches wherein generating the instance 3D model comprises combining object identification from each of the plurality of instance segmentations to apply to one of the one or more objects (Fig. 1 and Fig. 2(a)-(c), in which a plurality of instances of segmentations are applied and provided to a plurality of objects). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the segmentation information of Clark with the semantic 3D models of Li because this modification would improve tracking of 3D models displayed within a plurality of images through providing semantic data related to the models during changes in virtual camera poses or viewpoints in the scene.
Fig. 2(a)-(c), in which the 3D green cube models are generated based on the plurality of images). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the segmentation information of Clark with the semantic 3D models of Li because this modification would improve tracking of 3D models displayed within a plurality of images through providing semantic data related to the models during changes in virtual camera poses or viewpoints in the scene.
Regarding claim 16, Clark fails to teach wherein projecting the semantic information towards the 3D model comprises adding a label that corresponds to the respective object in the 3D model. Li teaches wherein projecting the semantic information towards the 3D model comprises adding a label that corresponds to the respective object in the 3D model (pg. 13 2nd para. lines 1-5).
Regarding claim 17, Clark teaches one or more computer-readable non-transitory storage media embodying software that is operable when executed (0050 lines 1-7) to: 
access a plurality of images captured by one or more cameras from a plurality of camera poses (0005 lines 5-11 and Fig. 4: 200a); 
generate, using the plurality of images, a plurality of semantic segmentations comprising semantic information of one or more objects captured in the plurality of images (0015 lines 18-22 and 0042 lines 1-14); 
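access a three-dimensional (3D) model of the one or more objects (0015 lines 18-22 and Fig. 2: 206). However, Clark fails to teach determine, using the plurality of camera poses, a corresponding plurality of virtual camera poses relative to the 3D model of the one or more objects; and generate a semantic 3D model by projecting the semantic information of the plurality of semantic segmentations towards the 3D model using the plurality of virtual camera poses, wherein the semantic information from two or more of the plurality of semantic segmentations are combined to apply to a first object of the one or more objects.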
Li teaches determine, using the plurality of camera poses, a corresponding plurality of virtual camera poses relative to the 3D model of the one or more objects (sec. 8 lines 4-6 and Fig. 2(c): ‘Camera pose’); and
generate a semantic 3D model by projecting the semantic information of the plurality of semantic segmentations towards the 3D model using the plurality of virtual camera poses (abst. lines 1-12, Fig. 1: capt. lines 1-4 and Fig. 2(c): ‘Semantic Info’). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the segmentation information of Clark with the semantic 3D models of Li because this modification would improve tracking of 3D models displayed within a plurality of images through providing semantic data related to the models during changes in virtual camera poses or viewpoints in the scene. However, Clark and Li fail to teach wherein the semantic information from two or more of the plurality of semantic segmentations are combined to apply to a first object of the one or more objects.
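Somasundaram teaches wherein the semantic information from two or more of the plurality of semantic segmentations are combined to apply to a first object of the one or more objects (0023 lines 1-19 and 0027 lines 5-10, in which information related to a plurality of semantic segmentations are combined for application to a respective model). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the segmentation information of Clark and semantic 3D models of Li with the combined segmentation of Somasundaram because this modification would save time utilized to view a plurality of segmentation information through consolidation of that information together for application to the view of an object.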
Regarding claim 18, Clark teaches a system (Fig. 6A: 300) comprising: 
one or more processors (0010 lines 1-5); and 
a non-transitory memory coupled to the processors comprising instructions executable by the processors, the processors operable when executing the instructions (0050 lines 1-7) to: 
access a plurality of images captured by one or more cameras from a plurality of camera poses (0005 lines 5-11 and Fig. 4: 200a); 
generate, using the plurality of images, a plurality of semantic segmentations comprising semantic information of one or more objects captured in the plurality of images (0015 lines 18-22 and 0042 lines 1-14); 
access a three-dimensional (3D) model of the one or more objects (0015 lines 18-22 and Fig. 2: 206). However, Clark fails to teach determine, using the plurality of camera poses, a corresponding plurality of virtual camera poses relative to the 3D model of the one or more objects; and generate a semantic 3D model by projecting the semantic information of the plurality of semantic segmentations towards the 3D model using the plurality of virtual camera poses, wherein the semantic information from two or more of the plurality of semantic segmentations are combined to apply to a first object of the one or more objects.
Li teaches determine, using the plurality of camera poses, a corresponding plurality of virtual camera poses relative to the 3D model of the one or more objects (sec. 8 lines 4-6 and Fig. 2(c): ‘Camera pose’); and
generate a semantic 3D model by projecting the semantic information of the plurality of semantic segmentations towards the 3D model using the plurality of virtual camera poses (abst. lines 1-12, Fig. 1: capt. lines 1-4 and Fig. 2(c): ‘Semantic Info’). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the segmentation information of Clark with the semantic 3D models of Li because this modification would improve tracking of 3D models displayed within a plurality of images through providing semantic data related to the models during changes in virtual camera poses or viewpoints in the scene. However, Clark and Li fail to teach wherein the semantic information from two or more of the plurality of semantic segmentations are combined to apply to a first object of the one or more objects.
Somasundaram teaches wherein the semantic information from two or more of the plurality of semantic segmentations are combined to apply to a first object of the one or more objects (0023 lines 1-19 and 0027 lines 5-10, in which information related to a plurality of semantic segmentations are combined for application to a respective model). Therefore it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the segmentation information of Clark and semantic 3D models of Li with the combined segmentation of Somasundaram because this modification would save time utilized to view a plurality of segmentation information through consolidation of that information together for application to the view of an object.

Claim Objections
Regarding claims 7-9 and 11-14, though Li teaches identifying a plurality of camera poses in relation to segmented objects within a plurality of images (Fig. 2(a)-(c)), Li fails to teach the limitations of claims 7-9 and 11-14. Therefore, claims 7-9 and 11-14 are objected to as 

Response to Arguments
Applicant's arguments filed 11/09/20 have been fully considered but they are not persuasive. The applicant’s arguments state that the amendments to claims 1, 17 and 18 were made to further clarify the distinction between the claims and the cited art, with the belief that these amendments obviate the Examiner’s rejections. However, upon further search and consideration, the applicant’s arguments with regard to claims 1, 17 and 18 are unpersuasive in view of the new grounds of rejection provided in this Office action.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, George Eng, can be reached on 571-272-7495. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/Said Broome/
Primary Examiner, Art Unit 2699