DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claims 37–39 are objected to because of the following informalities:  claim 37 recites “a display device a memory storing . . .” in lines 3–4, which is missing any type of grammatical separation between limitations.  Examiner suggests adding a semi-colon after “a display device”.  Appropriate correction is required.  Claims 38 and 39 depend from claim 37 and incorporate the same language as claim 37, and are therefore objected to for the same reason as claim 37.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 21–40 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1–15, 17, 18, and 20 of U.S. Patent No. 11,164,384 B2. Although the claims at issue are not identical, they are not patentably distinct from each other because the claims of U.S. Pat. 11,164,384 B2 anticipate the claims of the instant application.
The following table illustrates the conflicting claim pairs:
Instant Application | U.S. Pat. 11,164,384
21 | 1
22 | 1
23 | 2
24 | 3
25 | 4
26 | 5
27 | 6
28 | 7
29 | 8
30 | 9
31 | 10
32 | 11
33 | 12
34 | 13
35 | 14
36 | 15
37 | 17
38 | 17
39 | 18
40 | 20


The following table illustrates the limitations of claim 21 of the instant application when compared against the limitations of claim 1 of U.S. Patent No. 11,164,384 B2:
Instant Application – Claim 21 | U.S. Pat. 11,164,384 B2 – Claim 1
A method comprising: | A method comprising:
Generating, using one or more processors of a user device, an image of a physical environment; | generating, using one or more processors of a mobile device, an image of a physical environment;
Receiving, on a display device of the user device, a selection of an object to be replaced in the image; | receiving, on a touchscreen of the mobile device, (examiner notes that a touch screen of the mobile device is a “display device of the user device”) a selection of an object to be replaced in the image;
(no corresponding limitation) | classifying the object into an object category using an object classification neural network; selecting a pose detection neural network from a plurality of pose detection neural networks based on the object being classified in the object category, each of the plurality of pose detection neural networks being trained for different types of objects;
Determining a three-dimensional orientation of the object as depicted within the image using a pose detection neural network comprising a convolutional neural network trained to detect three-dimensional orientation of objects in a plurality of object training images, the objects of the plurality of object training images being of the same type as the object detected in the image; | determining a three-dimensional orientation of the object as depicted within the image using the pose detection neural network comprising a convolutional neural network trained to detect three-dimensional orientation of objects in a plurality of object training images, the objects of the plurality of object training images being of a same type as the object detected in the image;
Removing, from the image, the object using regions that are proximate to the object in the image; and | removing, from the image, the object using regions that are proximate to the object in the image;
(no corresponding limitation) | generating a render of a virtual model in the three-dimensional orientation and as illuminated by one or more virtual light sources based on a lighting scheme in the image; and
Generating a modified image that depicts a render of a virtual model that replaces the object in the physical environment | generating a modified image that depicts the render replacing the object in the physical environment.


The following table illustrates the limitations of claim 37 of the instant application when compared against the limitations of claim 17 of U.S. Patent No. 11,164,384 B2:
Instant Application – Claim 37 | U.S. Pat. 11,164,384 B2 – Claim 17
A system comprising: | A system comprising:
One or more processors | One or more processors;
A display device | A touch screen (examiner notes that a touch screen is a “display device”)
A memory storing instructions that, when executed by the one or more processors, cause the system to perform operations comprising: | A memory storing instructions that, when executed by the one or more processors, cause the system to perform operations comprising:
Generating an image of a physical environment; | generating an image of a physical environment;
Receiving, on the display device, a selection of an object to be replaced in the image; | receiving, on the touchscreen, a selection of an object to be replaced in the image;
(no corresponding limitation) | classifying the object into an object category using an object classification neural network; selecting a pose detection neural network from a plurality of pose detection neural networks based on the object being classified in the object category, each of the plurality of pose detection neural networks being trained for different types of objects;
Determining a three-dimensional orientation of the object as depicted within the image using a pose detection neural network comprising a convolutional neural network trained to detect three-dimensional orientation of objects in a plurality of object training images, the objects of the plurality of object training images being of the same type as the object detected in the image; | determining a three-dimensional orientation of the object as depicted within the image using the pose detection neural network comprising a convolutional neural network trained to detect three-dimensional orientation of objects in a plurality of object training images, the objects of the plurality of object training images being of a same type as the object detected in the image;
Removing, from the image, the object using regions that are proximate to the object in the image; and | removing, from the image, the object using regions that are proximate to the object in the image;
(no corresponding limitation) | generating a render of a virtual model in the three-dimensional orientation and as illuminated by one or more virtual light sources based on a lighting scheme in the image; and
Generating a modified image that depicts a render of a virtual model that replaces the object in the physical environment | generating a modified image that depicts the render replacing the object in the physical environment.


The following table illustrates the limitations of claim 40 of the instant application when compared against the limitations of claim 20 of U.S. Patent No. 11,164,384 B2:
Instant Application – Claim 40 | U.S. Pat. 11,164,384 B2 – Claim 20
A machine-readable storage device embodying instructions that, when executed by a device, cause the device to perform operations comprising: | A machine-readable storage device embodying instructions that, when executed by a device, cause the device to perform operations comprising:
Generating an image of a physical environment; | generating an image of a physical environment;
Receiving, on a display device, a selection of an object to be replaced in the image; | receiving, on a touchscreen, a selection of an object to be replaced in the image;
(no corresponding limitation) | classifying the object into an object category using an object classification neural network; selecting a pose detection neural network from a plurality of pose detection neural networks based on the object being classified in the object category, each of the plurality of pose detection neural networks being trained for different types of objects;
Determining a three-dimensional orientation of the object as depicted within the image using a pose detection neural network comprising a convolutional neural network trained to detect three-dimensional orientation of objects in a plurality of object training images, the objects of the plurality of object training images being of the same type as the object detected in the image; | determining a three-dimensional orientation of the object as depicted within the image using the pose detection neural network comprising a convolutional neural network trained to detect three-dimensional orientation of objects in a plurality of object training images, the objects of the plurality of object training images being of a same type as the object detected in the image;
Removing, from the image, the object using regions that are proximate to the object in the image; and | removing, from the image, the object using regions that are proximate to the object in the image;
(no corresponding limitation) | generating a render of a virtual model in the three-dimensional orientation and as illuminated by one or more virtual light sources based on a lighting scheme in the image; and
Generating a modified image that depicts a render of a virtual model that replaces the object in the physical environment | generating a modified image that depicts the render replacing the object in the physical environment.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 21, 27, 28, 30–33, 37 and 40 are rejected under 35 U.S.C. 103 as being unpatentable over Cohen et al. (US 2015/0097827 A1) in view of Li et al. (US 2020/0160616 A1, with filing priority to CN201811359461.2, Nov. 15, 2018).
Regarding claim 37, Cohen discloses:
A system comprising (Cohen ¶ 26 and Fig. 1: computing device with image capture devices): 
one or more processors (Cohen ¶ 27: computing device 102 having processor resources;  Also see ¶ 151); 
a display device; 
A memory storing instructions that, when executed by the one or more processors, cause the system to perform operations comprising (Cohen ¶ 152: computer-readable storage media such as memory; ¶ 159: software embodied on computer-readable storage media and executable by processing system to perform functions)
Generating an image of a physical environment; (Cohen ¶ 29–30: image capture devices capture images of scene – see Fig. 2, element 112)
Receiving, on the display device, a selection of an object to be replaced in the image; (Cohen ¶ 50: user may provide a mask specifying one or more target regions, such as through use of selection tool, gesture or through automatic selection by a module, e.g. of a foreground object)
Removing, from the image, the object using regions that are proximate to the object in the image; and (Cohen ¶ 36: fill in a target region of an image to remove an object from the image – e.g. remove basketball 204 – see Fig. 2; Also Fig. 11)
Generating a modified image (Cohen ¶ 36 and Fig. 2: generate image 201; Also Figs. 11–14 show generated image without object showing; Note ¶ 24: technique used for individual images) 
Cohen does not explicitly disclose determining a three-dimensional orientation of the object as depicted within the image using a pose detection neural network comprising a convolutional neural network trained to detect three-dimensional orientation of objects in a plurality of object training images, the objects of the plurality of object training images being of a same type as the object detected in the image; and generating a modified image that depicts a render of a virtual model that replaces the object in the physical environment.  
Li discloses:
determining a three-dimensional orientation of the object as depicted within the image using a pose detection neural network comprising a convolutional neural network trained to detect three-dimensional orientation of objects in a plurality of object training images, the objects of the plurality of object training images being of a same type as the object detected in the image; (Li ¶ 9: estimating of the 3D pose may include classifying a type of the object using the neural network, and estimating the 3D pose of the object based on a result of the classification using the neural network; ¶ 50: processor 101 estimates a 3D pose of an object in the 2D input image using the neural network; ¶ 98: neural network includes convolutional layer; ¶ 107: training process uses composite image 603 and real image as input to neural network, and uses 3D pose classifier and feature point detector for weighting image data; ¶ 108 further discusses processing input images of different domains; ¶ 121 further discusses matching target 3D model vehicle that matches vehicle within image – see Fig. 1 showing target 3D model as within same type as object in 2D input image)
generating a modified image that depicts a render of a virtual model that replaces the object in the physical environment.  (Li ¶ 52: processor aligns target 3D model and object based on 3D pose - see Fig. 4, final alignment)
Both Cohen and Li are directed to augmenting image data based on detection of a target object and replacement of the object data.  It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success, to modify the image processing system for determining and replacing a target object in the image as provided by Cohen, by utilizing the technique for detecting target object pose and replacing the target object with a 3D model as provided by Li, using known electronic interfacing and programming techniques.  The modification results in an enhanced augmented reality experience by providing a more accurate depiction of an object within an image, for more realistic viewing and improved augmented reality effect (e.g. Li ¶ 42 discussing alignment providing enhanced AR).  
Regarding claim 21, the system of claim 37 performs the method of claim 21 and as such claim 21 is rejected based on the same rationale as claim 37 set forth above. 
Regarding claim 40, Cohen discloses: 
A machine-readable storage device embodying instructions that, when executed by a device, cause the device to perform operations (Cohen ¶ 152: computer-readable storage media such as memory; ¶ 159: software embodied on computer-readable storage media and executable by processing system to perform functions)
Further regarding claim 40, the operations perform the same method as claim 21 and as such claim 40 is further rejected based on the same rationale as claim 21 set forth above and incorporated herein.  
Regarding claim 27, Cohen further discloses: 
Wherein, in the image, the object is depicted in an object image region, and the regions that are proximate to the object in the image are proximate regions that are external to the object image region (Cohen ¶ 45: “T” is target region vs “S” is source region, which is an area of the image outside the target region; ¶ 48: update best matches and blend matches into target region, driving content into target region from the outside region – See Fig. 3)
Regarding claim 28, Cohen further discloses: 
Wherein the object is removed by merging the proximate regions and the object image region (Cohen ¶ 75: color of target pixel calculated using a weighted blending of values of source patches “s” matched to each target patch “t”)
Regarding claim 30, Cohen further discloses:
Wherein the object is removed by interpolating the proximate regions and the object image region (Cohen ¶ 75: weighted blending of source patches “s” and target patch “t” using equation c, shown in paragraph 75)
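As an illustrative sketch only, the weighted blending relied upon in the claim 28 and claim 30 mappings may be understood as a weighted average of candidate source colors for a single target pixel. The sketch is not reproduced from Cohen and does not implement the equation of Cohen ¶ 75; the function name blend_color, the sample colors, and the weights are hypothetical, and the sketch assumes each matched source patch contributes one candidate color with a non-negative weight.

```python
def blend_color(candidates):
    """Return the weighted average of (color, weight) pairs.

    Each color is an (r, g, b) tuple; each weight is a non-negative number.
    The weighting scheme itself is an assumption made for illustration.
    """
    total = sum(weight for _, weight in candidates)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return tuple(
        sum(color[ch] * weight for color, weight in candidates) / total
        for ch in range(3)
    )


# Hypothetical candidates: two source patches propose a color for one target pixel.
candidates = [((100, 0, 0), 1.0), ((200, 0, 0), 3.0)]
blended = blend_color(candidates)
print(blended)
```

The candidate with the larger weight pulls the blended value toward its own color, which is the sense in which source content outside the target region is driven into the target region.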
Regarding claim 31, Cohen further discloses:  
Displaying an image on a display device of the user device; and receiving selection of the object through the display device of the user device (Cohen ¶ 50: user may provide a mask specifying one or more target regions, such as through use of selection tool, gesture or through automatic selection by a module, e.g. of a foreground object; Also rejected in view of Banik set forth below)
Regarding claim 32, Cohen further discloses:  
Wherein receiving selection of the object comprises receiving selection of a selected region of the image that depicts the object (Cohen ¶ 50: user may provide a mask specifying one or more target regions, such as through use of selection tool, gesture or through automatic selection by a module, e.g. of a foreground object; Also rejected in view of Banik set forth below)
Regarding claim 33, Cohen further discloses: 
Generating an image mask using the selected region (Cohen ¶ 50: user may then provide a mask specifying one or more target regions as shown in the examples 1000-1500 of FIGS. 10-15, performed by manual selection through user interaction, e.g. gesture or cursor device)

Claims 22, 23, 38 and 39 are rejected under 35 U.S.C. 103 as being unpatentable over Cohen et al. (US 2015/0097827 A1) in view of Li et al. (US 2020/0160616 A1, with filing priority to CN201811359461.2, Nov. 15, 2018) and in further view of Jaafar et al. (US 10,380,803 B1).
Regarding claim 38, the limitations included from claim 37 are rejected based on the same rationale as claim 37 set forth above and incorporated herein.  Further regarding claim 38, Li further discloses: 
Generating the render of the virtual model in the three-dimensional orientation (Li ¶ 52: processor aligns target 3D model and object based on 3D pose - see Fig. 4, final alignment)
Both Cohen and Li are directed to augmenting image data based on detection of a target object and replacement of the object data.  It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success, to modify the image processing system for determining and replacing a target object in the image as provided by Cohen, by utilizing the technique for detecting target object pose and replacing the target object with a 3D model as provided by Li, using known electronic interfacing and programming techniques.  The modification results in an enhanced augmented reality experience by providing a more accurate depiction of an object within an image, for more realistic viewing and improved augmented reality effect (e.g. Li ¶ 42 discussing alignment providing enhanced AR).  
Cohen and Li do not explicitly disclose the illuminating by one or more virtual light sources based on a lighting scheme in the image. 
Jaafar discloses: 
Generating the render of the virtual model in the three-dimensional orientation and as illuminated by one or more virtual light sources based on a lighting scheme in the image (Jaafar [15:1–24]: system accounts for lighting effects caused by light source 602 and other light sources within real-world environment, by identifying one or more light sources that project light onto target objects within real-world environment based on accessed video data, including mapping out light sources and storing data representative of the locations and characteristics of the light sources – i.e. “virtual light sources based on lighting scheme in the image” – and recalling that stored data during a later mixed reality presentation; [16:10–26] simulate interaction from light with virtual object 610; Note [2:19–35] discloses virtualizing a 3D model of the target object within the mixed reality presentation) 
Cohen, Li and Jaafar are directed to augmented reality image processing.  It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success, to modify the image processing system for determining and replacing a target object in the image as provided by Cohen, using the technique for detecting target object pose and replacing the target object with a 3D model as provided by Li, and further utilizing the lighting simulation technique of Jaafar, using known electronic interfacing and programming techniques.  The modification results in an enhanced augmented reality experience by providing improved light and shadow simulations on virtual objects to provide a more realistic insertion of a virtual object into the augmented reality scene.
Regarding claim 22, the system of claim 38 performs the method of claim 22 and as such claim 22 is rejected based on the same rationale as claim 38 set forth above. 
Regarding claim 39, Cohen modified by Li and Jaafar further discloses: 
Determining the lighting scheme of the image (Jaafar [14:34–46]: system configured to simulate the interaction of virtual objects with light sources that illuminate the real-world environment, such that virtual object casts a similar shadow within real-world environment as a real object illuminated from angle and brightness, color of actual light sources illuminating the real-world environment; [15:1 – 16:26] discloses accounting for lighting effects caused by real world light sources based on accessed video data)
Cohen, Li and Jaafar are directed to augmented reality image processing.  It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success, to modify the image processing system for determining and replacing a target object in the image as provided by Cohen, using the technique for detecting target object pose and replacing the target object with a 3D model as provided by Li, and further utilizing the lighting simulation technique of Jaafar, using known electronic interfacing and programming techniques.  The modification results in an enhanced augmented reality experience by providing improved light and shadow simulations on virtual objects to provide a more realistic insertion of a virtual object into the augmented reality scene.
Regarding claim 23, the system of claim 39 performs the method of claim 23 and as such claim 23 is rejected based on the same rationale as claim 39 set forth above. 

Claims 24–26 are rejected under 35 U.S.C. 103 as being unpatentable over Cohen et al. (US 2015/0097827 A1) in view of Li et al. (US 2020/0160616 A1, with filing priority to CN201811359461.2, Nov. 15, 2018) and Jaafar et al. (US 10,380,803 B1) and in further view of Yildiz et al. (US 10,540,812 B1).
Regarding claim 24, the limitations included from claim 23 are rejected based on the same rationale as claim 23 set forth above and incorporated herein.  Further regarding claim 24, Yildiz discloses: 
Wherein determining the lighting scheme comprises determining one or more bright regions of the image (Yildiz [13:20–45] and Fig. 5: use RGB camera feed to determine light intensity in regions of image (where intensity includes brightness), and map real-world location and size of real-world light source to virtual space, where images from RGB cameras are sent through processing step to identify regions of bright color as light source color)
Cohen, Li, Jaafar and Yildiz are directed to augmented reality image processing.  It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success, to modify the image processing system for determining and replacing a target object in the image as provided by Cohen, using the technique for detecting target object pose and replacing the target object with a 3D model as provided by Li, and further utilizing the lighting simulation technique of Jaafar, with the technique for determining lighting for virtual simulation of real-world lighting as provided by Yildiz, using known electronic interfacing and programming techniques.  The modification results in an improved integration of virtual objects into a real-world object by providing more accurate lighting simulation, while also merely substituting one known technique for simulating lighting effects in a virtual environment for another, yielding predictable results of utilizing image characteristics for calculating and simulating lighting effects in augmented reality displays.
Regarding claim 25, Cohen modified by Li, Jaafar and Yildiz further discloses:
Positioning, in a virtual environment, the one or more virtual light sources based on locations of the one or more bright regions of the image (Col. 13, lines 33-45 of Yildiz: images from the RGB cameras are sent through a processing step where regions of bright color are identified as potential light sources, and after regions of interest are determined, the depth image from the depth sensor can be used to find the distance to light source, and finally use distance with user’s position to determine location of real-world light source with respect to user; Fig. 5 and Col. 14, line 59 to Col. 15, line 2: detect location and size of light sources using SLAM pose and depth sensor data to transform real-world coordinates into virtual space coordinates, recording physical location, intensity, size, and color of real-world light source in virtual space coordinates; Col. 15, lines 10-20: set one or more artificial virtual light sources in rendering engine with the same location, intensity, color, size, and type detected in the real-world light source)
Cohen, Li, Jaafar and Yildiz are directed to augmented reality image processing.  It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success, to modify the image processing system for determining and replacing a target object in the image as provided by Cohen, using the technique for detecting target object pose and replacing the target object with a 3D model as provided by Li, and further utilizing the lighting simulation technique of Jaafar, with the technique for determining lighting for virtual simulation of real-world lighting as provided by Yildiz, using known electronic interfacing and programming techniques.  The modification results in an improved integration of virtual objects into a real-world object by providing more accurate lighting simulation, while also merely substituting one known technique for simulating lighting effects in a virtual environment for another, yielding predictable results of utilizing image characteristics for calculating and simulating lighting effects in augmented reality displays.
Regarding claim 26, Cohen modified by Li, Jaafar and Yildiz further discloses: 
Wherein the determining of the one or more bright regions of the image comprises determining an area of pixels in the image having higher brightness values (Col. 14, lines 50-58 of Yildiz: transform RGB channels of image from camera into luminance for each pixel and search image for local maximums with Monte-Carlo method, and selecting the point with the largest luminance from set; Col. 14, lines 59-60: use threshold value T)
Cohen, Li, Jaafar and Yildiz are directed to augmented reality image processing.  It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success, to modify the image processing system for determining and replacing a target object in the image as provided by Cohen, using the technique for detecting target object pose and replacing the target object with a 3D model as provided by Li, and further utilizing the lighting simulation technique of Jaafar, with the technique for determining lighting for virtual simulation of real-world lighting as provided by Yildiz, using known electronic interfacing and programming techniques.  The modification results in an improved integration of virtual objects into a real-world object by providing more accurate lighting simulation, while also merely substituting one known technique for simulating lighting effects in a virtual environment for another, yielding predictable results of utilizing image characteristics for calculating and simulating lighting effects in augmented reality displays.
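As an illustrative sketch only, the claim 26 mapping (converting RGB channels to a per-pixel luminance and comparing against a threshold value T) may be pictured as follows. The sketch is not reproduced from Yildiz: it substitutes an exhaustive scan for the Monte-Carlo search cited above, and the Rec. 601 luma weights, the function names, the sample image, and the threshold of 200 are all assumptions made for illustration.

```python
def luminance(rgb):
    """Approximate luminance of an (r, g, b) pixel using Rec. 601 weights (assumed)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def bright_pixels(image, threshold):
    """Return (row, col) coordinates of pixels whose luminance exceeds threshold."""
    return [
        (y, x)
        for y, row in enumerate(image)
        for x, pixel in enumerate(row)
        if luminance(pixel) > threshold
    ]


def brightest_pixel(image):
    """Return the (row, col) of the pixel with the largest luminance."""
    coords = [(y, x) for y, row in enumerate(image) for x in range(len(row))]
    return max(coords, key=lambda c: luminance(image[c[0]][c[1]]))


# Hypothetical 2x3 image: one near-white pixel among dark pixels.
image = [
    [(10, 10, 10), (20, 20, 20), (250, 250, 240)],
    [(15, 15, 15), (30, 30, 30), (40, 40, 40)],
]
print(bright_pixels(image, 200))
print(brightest_pixel(image))
```

The set of coordinates returned by bright_pixels corresponds to an area of pixels having higher brightness values, in the sense recited in claim 26.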

Claim 29 is rejected under 35 U.S.C. 103 as being unpatentable over Cohen et al. (US 2015/0097827 A1) in view of Li et al. (US 2020/0160616 A1, with filing priority to CN201811359461.2, Nov. 15, 2018) and in further view of Guilin Liu et al. (Guilin Liu, et al., “Image Inpainting for Irregular Holes Using Partial Convolutions”, Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 85-100).
Regarding claim 29, the limitations included from claim 28 are rejected based on claim 28 set forth above and incorporated herein.  Further regarding claim 29, Liu discloses: 
Wherein the proximate regions and the object image region are merged using a neural network that implements partial convolution layers (Pages 1-2, Introduction of Guilin Liu, Par. 1 discusses use of image inpainting for image editing to remove unwanted image content, while filling in the resulting space with plausible imagery; Page 3, last paragraph, “In summary…” provides the use of partial convolutions with an automatic mask update step for image inpainting; Page 5, Section 3.1, Par. 1: “We refer to our partial convolution operation and mask update function jointly as the Partial Convolutional Layer”; Pages 5-6, Par. beginning with “After each partial convolution…” discloses implementation in any deep learning framework – Note Page 2, last Par. discusses deep neural network learning; Fig. 3 on Page 8 shows merging hole region with proximate regions into single visible image)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success, to modify the image processing system for determining and replacing a target object in the image as provided by Cohen, using the technique for detecting target object pose and replacing the target object with a 3D model as provided by Li, with the technique for replacing unwanted image features as provided by Guilin Liu, using known electronic interfacing and programming techniques.  The modification results in an improved removal of unwanted image features by utilizing the partial convolution layer technique to produce a more visually pleasing and continuous image, while also merely substituting one known technique for removing an unwanted object in a digital image for another, yielding predictable results of utilizing the partial convolution layer technique for artifact removal to remove the real-world object in the method and system for removing objects from the AR display.
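For illustration only, the partial convolution operation and mask update step referenced above can be sketched as follows.  This is a minimal sketch under stated assumptions (single channel, stride 1, out-of-bounds pixels treated as invalid) and is not the implementation of Guilin Liu or of any other cited reference; the function name partial_conv2d and its parameters are hypothetical and appear in none of the references.

```python
def partial_conv2d(image, mask, kernel, bias=0.0):
    """Illustrative single-channel partial convolution with mask update.

    image, mask, kernel are lists of lists; mask holds 1 for valid pixels
    and 0 for hole pixels. All names here are hypothetical.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    window_size = kh * kw  # number of positions in a full window

    out = [[0.0] * w for _ in range(h)]
    new_mask = [[0] * w for _ in range(h)]

    for y in range(h):
        for x in range(w):
            acc = 0.0
            valid = 0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - ph, x + j - pw
                    # Only valid (unmasked, in-bounds) pixels contribute.
                    if 0 <= yy < h and 0 <= xx < w and mask[yy][xx]:
                        acc += kernel[i][j] * image[yy][xx]
                        valid += 1
            if valid > 0:
                # Rescale by window_size / valid to compensate for the
                # missing inputs, then mark the output location as valid.
                out[y][x] = acc * (window_size / valid) + bias
                new_mask[y][x] = 1
            # Otherwise the output and the updated mask both stay 0.
    return out, new_mask
```

In this sketch, a hole pixel surrounded by valid pixels receives a value computed only from its valid neighbors, and the updated mask marks it as filled, so repeated application shrinks the hole region.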

Claim(s) 31–33 is/are rejected under 35 U.S.C. 103 as being unpatentable over Cohen et al. (US 2015/0097827 A1) in view of Li et al. (US 2020/0160616 A1, with filing priority to CN201811359461.2, Nov. 15, 2018) and in further view of Banik et al. (US 2019/0089910 A1).
Regarding claim 31, the limitations included from claim 21 are rejected based on the same rationale as claim 21 set forth above and incorporated herein.  Further regarding claim 31, Banik discloses: 
Displaying the image on a display device of the user device and receiving selection of the object through the display device of the user device (Par. 27 of Banik: The user of the image-capture device 102 may provide a first user input to select the first object 110 in the first preview of the scene 108 as the undesired object, using a touch-based input; Par. 58: user may single tap on the object rendered on the display)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success, to modify the image processing system for determining and replacing a target object in the image as provided by Cohen, using the technique for detecting target object pose and replacing the target object with a 3D model as provided by Li, to provide user controls for selecting a real-world object for processing as provided by Banik, using known electronic interfacing and programming techniques.  The modification provides a user with greater control for selecting a preferred target object out of a plurality of target objects, reducing unnecessary processing of unintended targets and allowing improved customization to user preferences with an easy-to-use touch interface.
Regarding claim 32, Cohen modified by Li and Banik further discloses: 
Wherein receiving selection of the object comprises receiving selection of a selected region of the image that depicts the object (Par. 27 of Banik: The user of the image-capture device 102 may provide a first user input to select the first object 110 in the first preview of the scene 108 as the undesired object, using a touch-based input; Par. 58: user may single tap on the object rendered on the display)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success, to modify the image processing system for determining and replacing a target object in the image as provided by Cohen, using the technique for detecting target object pose and replacing the target object with a 3D model as provided by Li, to provide user controls for selecting a real-world object for processing as provided by Banik, using known electronic interfacing and programming techniques.  The modification provides a user with greater control for selecting a preferred target object out of a plurality of target objects, reducing unnecessary processing of unintended targets and allowing improved customization to user preferences with an easy-to-use touch interface.
Regarding claim 33, Cohen further discloses: 
Generating an image mask using the selected region (Cohen ¶ 50: user may then provide a mask specifying one or more target regions as shown in the examples 1000-1500 of FIGS. 10-15, performed by manual selection through user interaction, e.g. gesture or cursor device)

Claim(s) 34 and 35 is/are rejected under 35 U.S.C. 103 as being unpatentable over Cohen et al. (US 2015/0097827 A1) in view of Li et al. (US 2020/0160616 A1, with filing priority to CN201811359461.2, Nov. 15, 2018) and in further view of Banik et al. (US 2019/0089910 A1) and Holzer et al. (US 2017/0148223 A1).
Regarding claim 34, the limitations included from claim 32 are rejected based on the same rationale as claim 32 over Cohen modified by Li and Banik, and incorporated herein.  Further regarding claim 34, Banik further discloses: 
wherein the selected region is identified from a user input on the image as displayed by the user device (Par. 27 of Banik: The user of the image-capture device 102 may provide a first user input to select the first object 110 in the first preview of the scene 108 as the undesired object, using a touch-based input; Par. 58: user may single tap on the object rendered on the display)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success, to modify the image processing system for determining and replacing a target object in the image as provided by Cohen, using the technique for detecting target object pose and replacing the target object with a 3D model as provided by Li, to provide user controls for selecting a real-world object for processing as provided by Banik, using known electronic interfacing and programming techniques.  The modification provides a user with greater control for selecting a preferred target object out of a plurality of target objects, reducing unnecessary processing of unintended targets and allowing improved customization to user preferences with an easy-to-use touch interface.
Cohen further does not explicitly disclose segmenting the image into segment regions using an image segmentation convolutional neural network (CNN).
Holzer discloses: 
Segmenting the image into segment regions using an image segmentation convolutional neural network (CNN) (Par. 50 of Holzer: separating content of image by semantic segmentation with neural networks, where resulting separation may be used to remove parts of imagery; Par. 126: the neural network system is a convolutional neural network)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success, to modify the image processing system for determining and replacing a target object in the image as provided by Cohen, using the technique for detecting target object pose and replacing the target object with a 3D model as provided by Li, and providing user controls for selecting a real-world object for processing as provided by Banik, with the technique for separating image objects for further processing using the segmentation as provided by Holzer, using known electronic interfacing and programming techniques.  The modification results in an improved identification of objects within an image using image segmentation provided by Holzer for a more accurate identification and removal of the target object, for a more realistic and visually pleasing result.  In addition, the modification substitutes one known technique for identifying and removing an unwanted object in a digital image for another, yielding predictable results of utilizing a segmentation technique for object removal within an image and used in the method and system for removing objects from the AR display.
Regarding claim 35, Cohen modified by Li and Banik further discloses: 
Wherein the user input is one of the following: a tap gesture or a click (Par. 27 of Banik: The user of the image-capture device 102 may provide a first user input to select the first object 110 in the first preview of the scene 108 as the undesired object, using a touch-based input; Par. 58: user may single tap on the object rendered on the display)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success, to modify the image processing system for determining and replacing a target object in the image as provided by Cohen, using the technique for detecting target object pose and replacing the target object with a 3D model as provided by Li, to provide user controls for selecting a real-world object for processing as provided by Banik, using known electronic interfacing and programming techniques.  The modification provides a user with greater control for selecting a preferred target object out of a plurality of target objects, reducing unnecessary processing of unintended targets and allowing improved customization to user preferences with an easy-to-use touch interface.

Claim(s) 36 is/are rejected under 35 U.S.C. 103 as being unpatentable over Cohen et al. (US 2015/0097827 A1) in view of Li et al. (US 2020/0160616 A1, with filing priority to CN201811359461.2, Nov. 15, 2018) and in further view of Price et al. (US 2016/0062615 A1).
Regarding claim 36, the limitations included from claim 32 are rejected in view of Cohen and Li as set forth above and incorporated herein.  Further regarding claim 36, Price discloses:
Wherein receiving selection of the object through the display device comprises: receiving, through the display device, a swipe gesture over at least a portion of the object as depicted in the image (Fig. 2 and Paras. 27-28 of Price: user interface supporting touch interaction with input patterns mapped to corresponding actions and functionality, such as swiping back and forth across portions of image to cause addition of portions to a selected group)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success, to modify the image processing system for determining and replacing a target object in the image as provided by Cohen, using the technique for detecting target object pose and replacing the target object with a 3D model as provided by Li, with the technique for selecting image portions using touch interface as provided by Price, using known electronic interfacing and programming techniques.  The modification results in an improved selection of object data by utilizing more intuitive touch controls, while also merely substituting one known technique for selecting an object in an image using user input for another, yielding predictable results of utilizing a swipe gesture for selecting an object in an image in an augmented reality system utilizing selection of a target object for processing.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM A BEUTEL whose telephone number is (571)272-3132. The examiner can normally be reached Monday-Friday 9:00 AM - 5:00 PM (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kee Tung, can be reached at 571-272-7794. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/WILLIAM A BEUTEL/
Primary Examiner, Art Unit 2616