DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments, see applicant’s correspondence, filed 12/30/2020, with respect to the rejection(s) of claim(s) 1-20 under 35 U.S.C. 103 have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of Pollefeys.
Allowable Subject Matter
Claims 17-18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 6, 7, 10-14, 16, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Jaafar et al. (US 10,380,803 B1) in view of Banik et al. (US 2019/0089910 A1) and in further view of Pollefeys et al. (US 2020/0302634 A1).
Regarding claim 1, Jaafar discloses: 
A method (Col. 2, lines 19-35: system and methods for virtualizing a target object within mixed reality presentation) comprising: 
generating, using one or more processors of a mobile device, an image of a physical environment (Col. 5, lines 34-41 of Jaafar: capture device, e.g. video camera, configured to capture video data representative of the real-world environment, by using the incorporated capture device to capture the video data – see Figs. 3 and 4; Col. 7, lines 5-11: video capture facility 102 implemented by capture device 206 and computing components 208 – examiner notes it is inherent that a computer generates image data from captured video data in order to be able to perform computer-based processing of the image data); 
a selection of an object to be replaced in the image (Col. 2, lines 53-55 of Jaafar: mixed reality presentation system identifies a target object among the real objects included within the real-world environment; Col. 8, line 63 to Col. 9, line 3: “target object” is a real object within a real-world environment);
determining a three-dimensional orientation of the object as depicted within the image using (Col. 6, lines 51-61 of Jaafar: local storage facility 108 maintains suitable data representative of one or more target objects and 3D models of the target objects; Col. 8, line 56 to Col. 9, line 12: target objects may include predesignated objects whose characteristics are stored in a database, that have been automatically learned by machine learning or artificial intelligence technology; Also Col. 9, line 54 to Col. 10, line 12 discusses use of object recognition benefiting from data stored during previous encounters of the system including artificial intelligence techniques, machine learning techniques, etc., to improve ability to successfully recognize and identify real objects in accessed video, and may draw upon a library (e.g., a database) of images of known objects and/or known characteristics of objects such as particular models;
Col. 11, lines 24-36 of Jaafar: virtual object 504 replaces teddy bear object 404-4 such that teddy bear object 404-4 appears to move from first location coincident with a first location of teddy bear object 404-4 within real-world environment seated on a table to second location distinct from first location such that teddy bear object 404-4 may be virtualized so as to appear to come to life, stand up from where teddy bear object 404-4 is actually located, and move around the room freely; Col. 13, line 38 to Col. 14, line 8: once target object such as teddy bear object 404-4 has been extracted from real-world environment 300, system may replace the target object including generating a 3D model of the target object to serve as the second virtual object such that the model is appropriately sized, oriented in space, and provided with shading and lighting effects to be integrated into the virtual domain 500, such that the 3D model of the teddy bear object 404-4 implemented by virtual object 504 comes to life, walks around and interacts, rather than continuing to sit on the table);
removing from the image, the object using regions that are proximate to the object in the image (Col. 2, lines 56-60 of Jaafar: upon identifying the target object, the mixed reality presentation system extracts and replaces the target object in the mixed reality presentation; Col. 10, lines 12-20: extracting identified target object; Fig. 5 and Col. 10, lines 41-51: extraction object 502 aligned with virtual domain 500 so as to cover teddy bear object 404-4 within the video content currently presented within mixed reality presentation; Col. 11, lines 3-23: extraction object generated to mimic object or objects occluded by teddy bear object to produce effect that teddy bear object has been removed from scene leaving nothing in its place; Col. 11, lines 37-45: generate based on extracting what is behind the physical object, i.e. teddy bear 404-4; Col. 12, line 15 to Col. 13, line 3 discusses in more detail extrapolating occluded data based on representations of real objects associated with the occluded area, such as determining horizontal lines of blinds continue straight through occluded area – Figs. 4 and 5);
generating a render of a virtual model in the three-dimensional orientation and as illuminated by one or more virtual light sources based on a lighting scheme in the image (Col. 13, line 38 to Col. 14 line 9 of Jaafar: generating virtual object as a 3D model of physical object, e.g. teddy bear, and provide lighting effects for the model by generating such data based on captured video data, such that the model is appropriately sized, oriented in space, and provided with shading and lighting effects to be integrated into the virtual domain 500, such that the 3D model of the teddy bear object 404-4 implemented by virtual object 504 comes to life, walks around and interacts, rather than continuing to sit on the table; Col. 14, lines 34-46: system configured to simulate the interaction of virtual objects with light sources that illuminate the real-world environment, such that virtual object casts a similar shadow within real-world environment as a real object illuminated from angle and brightness, color of actual light sources illuminating the real-world environment; Col. 15, line 1 to Col. 16, line 26 provides a detailed description of accounting for lighting effects caused by real world light sources based on accessed video data); and
generating a modified image that depicts the render replacing the object in the physical environment (Col. 11, lines 12-23 of Jaafar: virtual object 504 replaces teddy bear object 404-4 (i.e. the physical object of the teddy bear), with an animated 3D model of the teddy bear object within the mixed reality presentation 400; Also Col. 11, lines 24-36 further discusses replacement; Col. 13, line 38 to Col. 14, line 9 discusses inserting virtual model of target object into video including lighting effects)
Jaafar does not explicitly disclose receiving a selection as used in the claim.  
Banik discloses:
Receiving, on a touchscreen of the mobile device, a selection of an object to be replaced in the image (Par. 22 of Banik: device 102 includes camera phone or smartphone, laptop computer, or tablet; Par. 26: graphical user interface including live preview of scene 108; Par. 27: first preview of scene comprises plurality of objects, allowing user to select an undesired object from the first preview of the scene, which may correspond to at least one of a touch-based input; Par. 28: image capture device configured to remove the detected undesired object from the first preview of the scene; Par. 29: fill in a portion of first preview of the scene corresponding to the removed undesired object with at least one of background region or foreground region; Par. 39: input device includes touch screen; Also Par. 58)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, to provide user controls for selecting a real-world object for processing as provided by Banik, using known electronic interfacing and programming techniques.  The modification provides a user with greater control for selecting a preferred target object out of a plurality of target objects, reducing unnecessary processing of unintended targets and allowing improved customization to user preferences with an easy-to-use touch interface.
Although Jaafar teaches determining a 3D orientation of an object depicted in the image as discussed above (i.e. orientating and sizing 3D model to match teddy bear image to come to life as discussed above), Jaafar modified by Banik does not explicitly disclose determining a 3D orientation of the object depicted within the image using a pose detection neural network comprising a convolutional neural network trained to detect three-dimensional orientation of objects in a plurality of object training images.
Pollefeys discloses: 
determining a three-dimensional orientation of the object as depicted within the image using a pose detection neural network comprising a convolutional neural network trained to detect three-dimensional orientation of objects in a plurality of object training images, the objects of the plurality of object training images being of a same type as the object detected in the image (Par. 18 of Pollefeys: estimate 3D articulated object and target object poses; Par. 24: data from the outward facing camera devices 18B may be used to jointly estimate three-dimensional articulated object and target object poses and recognize target objects and action classes for articulated objects and target objects in the scene being captured by the outward facing camera devices 18A, including human hand and various rigid or soft body objects in the physical environment;  Par. 30: object types for articulated objects; Par. 31: sequence of input image frames 34 processed by the trained neural network 40, including a fully convolutional neural network, where trained neural network 40 is configured as a single shot feedforward fully convolutional neural network that jointly estimates 3D articulated object and target poses and recognizes the target objects and action classes concurrently in  a feed-forward pass through the neural network, where the fully convolutional neural network 46 may be configured to process single input image frames 32 to determine highest confidence predictions for three-dimensional articulated object and target object poses at each input image frame 32; Par. 72: neural network 50 trained on single frames, and complete model takes as input a sequence of images and outputs per-frame three-dimensional articulated object-target object pose predictions, target object classes and action labels along with the estimates of interactions for the entire sequence)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, incorporating user controls for selecting a real-world object for processing as provided by Banik, with the use of a trained neural network for pose estimation of objects as provided by Pollefeys.
Regarding claim 19, Jaafar discloses:
A system comprising: one or more processors; memory storing instructions that, when executed by the one or more processors, cause the system to perform operations (Col. 23, line 30 to Col. 24, line 8 of Jaafar: system including software embodied on non-transitory computer-readable medium configured to perform the processes, where a processor (e.g., a microprocessor) receives instructions, from a non-transitory computer-readable medium, (e.g., a memory, etc.), and executes those instructions)
Further regarding claim 19, the operations perform the same method as recited in claim 1 and as such claim 19 is further rejected based on the same rationale as claim 1 set forth above and incorporated herein.  
Regarding claim 20, Jaafar discloses:
A machine-readable storage device embodying instructions that, when executed by a device, cause the device to perform operations (Col. 23, line 30 to Col. 24, line 8 of Jaafar: system including software embodied on non-transitory computer-readable medium configured to perform the processes, where a processor (e.g., a microprocessor) receives instructions, from a non-transitory computer-readable medium, (e.g., a memory, etc.), and executes those instructions)
Further regarding claim 20, the operations perform the same method as recited in claim 1 and as such claim 20 is further rejected based on the same rationale as claim 1 set forth above and incorporated herein.  
Regarding claim 2, Jaafar further discloses: 
Determining the lighting scheme of the image (Col. 14, lines 34-46 of Jaafar: system configured to simulate the interaction of virtual objects with light sources that illuminate the real-world environment, such that virtual object casts a similar shadow within real-world environment as a real object illuminated from angle and brightness, color of actual light sources illuminating the real-world environment; Col. 15, line 1 to Col. 16, line 26 provides a detailed description of accounting for lighting effects caused by real world light sources based on accessed video data)
Regarding claim 6, Jaafar further discloses: 
Wherein, in the image, the object is depicted in an object image region, and the regions that are proximate to the object in the image are proximate regions that are external to the object image region (Col. 2, lines 56-60 of Jaafar: upon identifying the target object, the mixed reality presentation system extracts and replaces the target object in the mixed reality presentation; Col. 10, lines 12-20: extracting identified target object; Fig. 5 and Col. 10, lines 41-51: extraction object 502 aligned with virtual domain 500 so as to cover teddy bear object 404-4 within the video content currently presented within mixed reality presentation; Col. 11, lines 3-23: extraction object generated to mimic object or objects occluded by teddy bear object to produce effect that teddy bear object has been removed from scene leaving nothing in its place; Col. 11, lines 37-45: generate based on extracting what is behind the physical object, i.e. teddy bear 404-4; Col. 12, line 15 to Col. 13, line 3 discusses in more detail extrapolating occluded data based on representations of real objects associated with the occluded area, such as determining horizontal lines of blinds continue straight through occluded area – Figs. 4 and 5)
Regarding claim 7, Jaafar further discloses: 
Wherein the object is removed by merging the proximate regions and the object image region (Col. 2, lines 56-60 of Jaafar: upon identifying the target object, the mixed reality presentation system extracts and replaces the target object in the mixed reality presentation; Fig. 5 and Col. 10, lines 41-51: extraction object 502 aligned with virtual domain 500 so as to cover teddy bear object 404-4 within the video content currently presented within mixed reality presentation; Fig. 7 shows merging image of area of target object with proximate regions around the target object on display screen of mobile device)
Regarding claim 10, Jaafar further discloses:
Displaying the image on a display device of the mobile device (Fig. 2 and Col. 6, line 51 to Col. 7, line 3 of Jaafar: mixed reality player device 202 including display screen 204; Fig. 7 and Col. 17, lines 11-29 discloses image displayed on mobile device with replaced target object)
Banik discloses: 
Receiving selection of the object through the display device of the mobile device (Par. 27: first preview of scene comprises plurality of objects, allowing user to select an undesired object from the first preview of the scene, which may correspond to at least one of a touch-based input; Par. 23: display associated with handheld computer or smartphone; Par. 39: device including touch screen; Also Par. 58)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, to provide user controls for selecting a real-world object for processing as provided by Banik, using known electronic interfacing and programming techniques.  The modification provides a user with greater control for selecting a preferred target object out of a plurality of target objects, reducing unnecessary processing of unintended targets and allowing improved customization to user preferences with an easy-to-use touch interface.
Regarding claim 11, Jaafar modified by Banik further discloses: 
Wherein receiving selection of the object comprises receiving selection of a selected region of the image that depicts the object (Par. 27 of Banik: The user of the image-capture device 102 may provide a first user input to select the first object 110 in the first preview of the scene 108 as the undesired object, using a touch-based input; Par. 58: user may single tap on the object rendered on the display)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, to provide user controls for selecting a real-world object for processing as provided by Banik, using known electronic interfacing and programming techniques.  The modification provides a user with greater control for selecting a preferred target object out of a plurality of target objects, reducing unnecessary processing of unintended targets and allowing improved customization to user preferences with an easy-to-use touch interface.
Regarding claim 12, Jaafar further discloses: 
Generating an image mask using the selected region (Fig. 5 and Col. 10, lines 21-35 of Jaafar: virtual domain 500 may be configured such that the blank areas within virtual domain 500 are masked to show the video data illustrated in mixed reality presentation 400 within FIG. 4, while the non-blank areas are displayed on top of (e.g., instead of) the corresponding areas of the video data)
Regarding claim 13, Jaafar modified by Banik further discloses: 
wherein the selected region is identified from a user input on the image as displayed on the touch screen of the mobile device (Par. 27 of Banik: The user of the image-capture device 102 may provide a first user input to select the first object 110 in the first preview of the scene 108 as the undesired object, using a touch-based input; Par. 39: input device includes touch screen; Par. 58: user may single tap on the object rendered on the display)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, to provide user controls for selecting a real-world object for processing as provided by Banik, using known electronic interfacing and programming techniques.  The modification provides a user with greater control for selecting a preferred target object out of a plurality of target objects, reducing unnecessary processing of unintended targets and allowing improved customization to user preferences with an easy-to-use touch interface.
Jaafar modified by Banik does not explicitly disclose segmenting the image into segment regions using an image segmentation convolutional neural network (CNN).
Pollefeys further discloses: 
Segmenting the image into segment regions using an image segmentation convolutional neural network (CNN) (Par. 40 of Pollefeys: trained neural network 40 processes each input frame with the fully convolutional network 46 and divides the input image frame 32 into a regular grid of input cells)
(note alternative rejection in view of Holzer below)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, providing user controls for selecting a real-world object for processing as provided by Banik, with the use of a trained neural network for pose estimation of objects as provided by Pollefeys, using known electronic interfacing and programming techniques.  The modification provides an improved and efficient object recognition technique for better determining object orientation and location in an image for improved visual transition between the display of real-world objects and virtual objects in an augmented reality display.  Moreover, the modification merely substitutes one known artificial intelligence system for object pose detection for another, yielding predictable results of utilizing a CNN image-based object recognition technique for determining the location and pose of an object, allowing for object substitution for augmented reality visualization.
Regarding claim 14, Jaafar modified by Banik further discloses: 
Wherein the user input is one of: a tap gesture or a click (Par. 27 of Banik: The user of the image-capture device 102 may provide a first user input to select the first object 110 in the first preview of the scene 108 as the undesired object, using a touch-based input; Par. 58: user may single tap on the object rendered on the display)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, to provide user controls for selecting a real-world object for processing as provided by Banik.
Regarding claim 16, Jaafar further discloses: 
Classifying the object into an object category using an object classification (Col. 9, lines 3-12 of Jaafar: target objects include predesignated objects that have been automatically “learned” by machine learning and/or artificial intelligence technology; Col. 9, line 43 to Col. 10, line 12: accessing and analyzing video data to identify one or more objects, including object recognition techniques having identifiable characteristics, e.g. furniture with back and four legs recognized as a chair, using artificial intelligence techniques or machine learning techniques, and drawing upon a library of images of known characteristics of objects, such as models and brands, etc. – i.e. using characteristics of chair to identify a chair object is a “classification”)
Jaafar does not explicitly teach use of a classification neural network.
Pollefeys further discloses: 
Classifying the object into an object category using an object classification neural network (Par. 61 of Pollefeys: computing one or more candidate target object classes for the target object and a respective target object class probability for each of the one or more candidate target object classes, where the trained neural network 40 is trained to recognize candidate target object classes based on feature of the target object in the input image frame identifiable by the trained neural network)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, providing user controls for selecting a real-world object for processing as provided by Banik, with the use of a trained neural network for pose estimation of objects as provided by Pollefeys, and with the additional neural network object classification also provided by Pollefeys, using known electronic interfacing and programming techniques.  The modification results in an improved identification of objects within an image using a neural network classification as provided by Pollefeys for a more accurate identification of objects in an image, for a more realistic and visually pleasing result.  In addition, the modification substitutes one known technique for identifying an object in a digital image for another, yielding predictable results of utilizing a neural network classification and identification of objects within an image used in the method and system for processing objects from the AR display.

Claims 3-5 are rejected under 35 U.S.C. 103 as being unpatentable over Jaafar et al. (US 10,380,803 B1) in view of Banik et al. (US 2019/0089910 A1) and Pollefeys et al. (US 2020/0302634 A1) and in further view of Yildiz et al. (US 10,540,812 B1).
Regarding claim 3, the limitations included from claim 2 are rejected based on the same rationale as claim 2 set forth above and incorporated herein.  Further regarding claim 3, Yildiz discloses: 
Wherein determining the lighting scheme comprises determining one or more bright regions of the image (Fig. 5 and Col. 13, lines 20-45 of Yildiz: use RGB camera feed to determine light intensity in regions of image (where intensity includes brightness), and map real-world location and size of real-world light source to virtual space, where images from RGB cameras are sent through processing step to identify regions of bright color as light source color)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, providing user controls for selecting a real-world object for processing as provided by Banik, and using a trained neural network for pose estimation of objects as provided by Pollefeys, with the technique for determining lighting for virtual simulation of real-world lighting as provided by Yildiz, using known electronic interfacing and programming techniques.  The modification results in an improved integration of virtual objects into a real-world object by providing more accurate lighting simulation, while also merely substituting one known technique for simulating lighting effects in a virtual environment for another, yielding predictable results of utilizing image characteristics for calculating and simulating lighting effects in augmented reality displays.
Regarding claim 4, Jaafar modified by Banik, Pollefeys and Yildiz further discloses:
Positioning, in a virtual environment, the one or more virtual light sources based on locations of the one or more bright regions of the image (Col. 13, lines 33-45 of Yildiz: images from the RGB cameras are sent through a processing step where regions of bright color are identified as potential light sources, and after regions of interest are determined, the depth image from the depth sensor can be used to find the distance to light source, and finally use distance with user’s position to determine location of real-world light source with respect to user; Fig. 5 and Col. 14, line 59 to Col. 15, line 2: detect location and size of light sources using SLAM pose and depth sensor data to transform real-world coordinates into virtual space coordinates, recording physical location, intensity, size, and color of real-world light source in virtual space coordinates; Col. 15, lines 10-20: set one or more artificial virtual light sources in rendering engine with the same location, intensity, color, size, and type detected in the real-world light source)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, providing user controls for selecting a real-world object for processing as provided by Banik, and using a trained neural network for pose estimation of objects as provided by Pollefeys, with the technique for determining lighting for virtual simulation of real-world lighting as provided by Yildiz, using known electronic interfacing and programming techniques.  The modification results in an improved integration of virtual objects into a real-world object by providing more accurate lighting simulation, while also merely substituting one known technique for simulating lighting effects in a virtual environment for another, yielding predictable results of utilizing image characteristics for calculating and simulating lighting effects in augmented reality displays.
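For illustration only, the light source placement cited from Yildiz for claim 4 can be sketched as a pinhole back-projection of a bright pixel and its depth reading into camera coordinates, followed by an offset by the user's position. The function names, the intrinsic parameters (fx, fy, cx, cy), and the simplification that the camera axes are aligned with the virtual space axes are assumptions made for this sketch and are not taken from Yildiz, which relies on a SLAM pose for the transform.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Map pixel (u, v) with a depth reading to a 3D point in camera coordinates.

    Assumes a simple pinhole camera with hypothetical intrinsics fx, fy, cx, cy.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def to_virtual_space(point_cam, camera_position):
    """Offset a camera-space point by the camera (user) position.

    Assumption: camera rotation is ignored, i.e. the camera axes are
    aligned with the virtual space axes.
    """
    return tuple(p + c for p, c in zip(point_cam, camera_position))


# Hypothetical values: a bright region centered at pixel (570, 240), 2.0 m away.
light_cam = backproject(570, 240, 2.0, fx=500, fy=500, cx=320, cy=240)
light_virtual = to_virtual_space(light_cam, camera_position=(0.5, 1.5, 0.0))
```

A rendering engine would then place a virtual light at light_virtual; a full implementation would also apply the camera rotation from the tracked pose.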
Regarding claim 5, Jaafar modified by Banik, Pollefeys and Yildiz further discloses: 
Wherein the determining of the one or more bright regions of the image comprises determining an area of pixels in the image having higher brightness values (Col. 14, lines 50-58 of Yildiz: transform RGB channels of image from camera into luminance for each pixel and search image for local maximums with Monte-Carlo method, and selecting the point with the largest luminance from set; Col. 14, lines 59-60: use threshold value T)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, providing user controls for selecting a real-world object for processing as provided by Banik, and using a trained neural network for pose estimation of objects as provided by Pollefeys, with the technique for determining lighting for virtual simulation of real-world lighting as provided by Yildiz, using known electronic interfacing and programming techniques.
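As a minimal sketch of the bright region determination discussed for claim 5, the following converts each RGB pixel to a luminance value, keeps the pixels above a threshold, and picks the pixel with the largest luminance. The Rec. 601 luma weights, the function names, and the sample image are assumptions for illustration. The sketch scans every pixel, whereas the cited passage of Yildiz describes a Monte-Carlo search for local maximums.

```python
def luminance(rgb):
    """Weighted sum of the RGB channels (Rec. 601 weights, assumed here)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def bright_pixels(image, threshold):
    """Return (row, col) coordinates whose luminance exceeds the threshold."""
    return [
        (y, x)
        for y, row in enumerate(image)
        for x, rgb in enumerate(row)
        if luminance(rgb) > threshold
    ]


def brightest_pixel(image):
    """Return the (row, col) coordinate with the largest luminance."""
    coords = [(y, x) for y, row in enumerate(image) for x in range(len(row))]
    return max(coords, key=lambda c: luminance(image[c[0]][c[1]]))


# Hypothetical 2x3 RGB image with a bright area in the right-hand column.
image = [
    [(10, 10, 10), (20, 20, 20), (250, 250, 240)],
    [(15, 15, 15), (30, 30, 30), (200, 200, 200)],
]
region = bright_pixels(image, threshold=180)
peak = brightest_pixel(image)
```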

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Jaafar et al. (US 10,380,803 B1) in view of Banik et al. (US 2019/0089910 A1) and Pollefeys et al. (US 2020/0302634 A1) and in further view of Guilin Liu et al. (Guilin Liu, et al., “Image Inpainting for Irregular Holes Using Partial Convolutions”, Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 85-100).
Regarding claim 8, the limitations included from claim 7 are rejected based on claim 7 set forth above and incorporated herein.  Further regarding claim 8, Liu discloses: 
Wherein the proximate regions and the object image region are merged using a neural network that implements partial convolution layers (Pages 1-2, Introduction of Guilin Liu, Par. 1 discusses use of image inpainting for image editing to remove unwanted image content, while filling in the resulting space with plausible imagery; Page 3, last paragraph, “In summary…” provides the use of partial convolutions with an automatic mask update step for image inpainting; Page 5, Section 3.1, Par. 1: “We refer to our partial convolution operation and mask update function jointly as the Partial Convolutional Layer”; Pages 5-6, Par. beginning with “After each partial convolution…” discloses implementation in any deep learning framework – Note Page 2, last Par. discusses deep neural network learning; Fig. 3 on Page 8 shows merging hole region with proximate regions into single visible image)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, providing user controls for selecting a real-world object for processing as provided by Banik, and using a trained neural network for pose estimation of objects as provided by Pollefeys, with the technique for replacing unwanted image features as provided by Guilin Liu, using known electronic interfacing and programming techniques.  The modification results in an improved removal of unwanted image features by utilizing the partial convolution layer technique to produce a more visually pleasing and continuous image, while also merely substituting one known technique for removing an unwanted object in a digital image for another, yielding predictable results of utilizing the partial convolution layer technique for artifact removal to remove the real-world object in the method and system for removing objects from the AR display.
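For illustration, the partial convolution operation and mask update referenced from Guilin Liu can be sketched for a single sliding window as follows. The output is computed only from valid (unmasked) inputs and rescaled by the ratio of window size to valid count, and the window is marked valid afterward if it contained at least one valid input. The function name, the flat-list window representation, and the sample values are assumptions for this sketch; a practical network would apply the operation across whole feature maps in a deep learning framework.

```python
def partial_conv_window(values, mask, weights, bias=0.0):
    """Apply one partial convolution step to a single window.

    values, mask, and weights are flat lists of equal length; mask entries
    are 1 for valid pixels and 0 for hole pixels.
    Returns (output, updated_mask_value).
    """
    valid = sum(mask)
    if valid == 0:
        # No valid input under the window: output 0 and keep the hole.
        return 0.0, 0
    scale = len(mask) / valid  # compensates for the missing inputs
    acc = sum(w * x * m for w, x, m in zip(weights, values, mask))
    return acc * scale + bias, 1


# Hypothetical 2x2 window flattened to 4 values; the last two are hole pixels.
out, new_mask = partial_conv_window(
    values=[1.0, 2.0, 3.0, 4.0],
    mask=[1, 1, 0, 0],
    weights=[0.25, 0.25, 0.25, 0.25],
)
```

Here the averaging kernel sees only the two valid pixels, so the rescaled output equals their mean rather than being darkened by the hole.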

Claim 9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Jaafar et al. (US 10,380,803 B1) in view of Banik et al. (US 2019/0089910 A1) and Pollefeys et al. (US 2020/0302634 A1) and in further view of Liu et al. (US 9,240,055 B1).
Regarding claim 9, the limitations included from claim 6 are rejected based on claim 6 set forth above and incorporated herein.  Further regarding claim 9, Liu discloses: 
Wherein the object is removed by interpolating the proximate regions and the object image region (Abstract of Liu: replacing pixels in images in order to remove obstructions from image; Col. 3, line 55 to Col. 4, line 3: hole filling or interpolation of images, including removing obstruction objects from image and filling resulting holes using symmetry of objects e.g. buildings; See Figs. 7 and 8 showing buildings as surrounding tree objects removed; Also Col. 4, lines 54-57: interpolation from neighboring pixels used to fill in hole pixel)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, providing user controls for selecting a real-world object for processing as provided by Banik, and using a trained neural network for pose estimation of objects as provided by Pollefeys, with the technique for replacing unwanted image features as provided by Liu, using known electronic interfacing and programming techniques.  The modification results in an improved removal of unwanted image features by utilizing interpolation to produce a more visually pleasing and continuous image, while also merely substituting one known technique for removing an unwanted object in a digital image for another, yielding predictable results of utilizing the interpolation technique for artifact removal to remove the real-world object in the method and system for removing objects from the AR display.

Claims 13-14 is/are alternatively rejected under 35 U.S.C. 103 as being unpatentable over Jaafar et al. (US 10,380,803 B1) in view of Banik et al. (US 2019/0089910 A1) and Pollefeys et al. (US 2020/0302634 A1) and in further view of Holzer et al. (US 2017/0148223 A1).
Regarding claim 13, the limitations included from claim 11 are rejected based on the same rationale as claim 11 set forth above and incorporated herein.  Further regarding claim 13, Jaafar modified by Banik further discloses: 
wherein the selected region is identified from a user input on the image as displayed on the touch screen of the mobile device (Par. 27 of Banik: The user of the image-capture device 102 may provide a first user input to select the first object 110 in the first preview of the scene 108 as the undesired object, using a touch-based input; Par. 39: input device includes touch screen; Par. 58: user may single tap on the object rendered on the display)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, to provide user controls for selecting a real-world object for processing as provided by Banik, using known electronic interfacing and programming techniques.  The modification provides a user with greater control for selecting a preferred target object out of a plurality of target objects, reducing unnecessary processing of unintended targets and allowing improved customization to user preferences with an easy-to-use touch interface.
Jaafar modified by Banik does not explicitly disclose segmenting the image into segment regions using an image segmentation convolutional neural network (CNN).
Holzer discloses: 
Segmenting the image into segment regions using an image segmentation convolutional neural network (CNN) (Par. 50 of Holzer: separating content of image by semantic segmentation with neural networks, where resulting separation may be used to remove parts of imagery; Par. 126: the neural network system is a convolutional neural network)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, providing user controls for selecting a real-world object for processing as provided by Banik, and using a trained neural network for pose estimation of objects as provided by Pollefeys, with the technique for separating image objects for further processing using the segmentation as provided by Holzer, using known electronic interfacing and programming techniques.  The modification results in an improved identification of objects within an image using the image segmentation provided by Holzer for a more accurate identification and removal of the target object, for a more realistic and visually pleasing result.  In addition, the modification substitutes one known technique for identifying and removing an unwanted object in a digital image for another, yielding predictable results of utilizing the segmentation technique for object removal within an image and used in the method and system for removing objects from the AR display.
Regarding claim 14, Jaafar modified by Banik further discloses: 
Wherein the user input is one of: a tap gesture or a click (Par. 27 of Banik: The user of the image-capture device 102 may provide a first user input to select the first object 110 in the first preview of the scene 108 as the undesired object, using a touch-based input; Par. 58: user may single tap on the object rendered on the display)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, to provide user controls for selecting a real-world object for processing as provided by Banik, using known electronic interfacing and programming techniques.  The modification provides a user with greater control for selecting a preferred target object out of a plurality of target objects, reducing unnecessary processing of unintended targets and allowing improved customization to user preferences with an easy-to-use touch interface.

Claim 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Jaafar et al. (US 10,380,803 B1) in view of Banik et al. (US 2019/0089910 A1) and Pollefeys et al. (US 2020/0302634 A1) and in further view of Price et al. (US 2016/0062615 A1).
Regarding claim 15, the limitations included from claim 11 are rejected based on the same rationale as claim 11 set forth above and incorporated herein.  Further regarding claim 15, Jaafar modified by Banik further discloses: 
receiving, on the touchscreen of the mobile device (Par. 22 of Banik: device 102 includes camera phone or smartphone, laptop computer, or tablet; Par. 39: input device includes touch screen; Also Par. 58)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, to provide user controls for selecting a real-world object for processing as provided by Banik, using known electronic interfacing and programming techniques.  The modification provides a user with greater control for selecting a preferred target object out of a plurality of target objects, reducing unnecessary processing of unintended targets and allowing improved customization to user preferences with an easy-to-use touch interface.
Price discloses: 
Wherein receiving selection of the object through the display device comprises: receiving, on the touchscreen of the mobile device, a swipe gesture over at least a portion of the object as depicted in the image (Fig. 2 and Paras. 27-28 of Price: user interface supporting touch interaction with input patterns mapped to corresponding actions and functionality, such as swiping back and forth across portions of image to cause addition of portions to a selected group – see Fig. 2 showing touch swipe across displayed image; Par. 17: computing device as handheld tablet or mobile phone)
It would have been obvious, at the time the invention was made and with a reasonable expectation of success, to modify the augmented reality system for replacing real-world objects with virtual objects in augmented reality as provided by Jaafar, providing user controls for selecting a real-world object for processing as provided by Banik, and using a trained neural network for pose estimation of objects as provided by Pollefeys, with the technique for selecting image portions using touch interface as provided by Price, using known electronic interfacing and programming techniques.  The modification results in an improved selection of object data by utilizing more intuitive touch controls, while also merely substituting one known technique for selecting an object in an image using user input for another, yielding predictable results of utilizing a swipe gesture for selecting an object in an image in an augmented reality system utilizing selection of a target object for processing.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM A BEUTEL whose telephone number is (571)272-3132.  The examiner can normally be reached on Monday-Friday 9:00 AM - 5:00 PM (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Devona Faulk, can be reached at 571-272-7515.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.