DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Remarks
This office action is responsive to the amendment filed on 01/18/2022.  
Claim(s) 1-20 is/are pending in the application.
Independent claim(s) 1, 11, 20 was/were amended.
Dependent claim(s) 2 was/were amended.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 01/18/2022 has been entered.
 
Response to Arguments
Applicant's argument(s), regarding the amended portion(s) as recited in independent claim 1 (and similarly in independent claim(s) 11, 20), filed 01/18/2022, have/has been fully considered and is/are persuasive. However, upon further consideration, a new ground(s) of rejection is made, adding/using Huang to be relied upon for the aforementioned amended portion(s). To note, applicant's amendment necessitated the new ground(s) of rejection presented in this office action.

Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claim(s) 1, 10-12, 19-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (US 2019/0082118 A1) in view of Tomisawa et al. (US 2004/0240056 A1), Webster et al. (US 2015/0002545 A1) and Huang et al. (US 2021/0027539 A1).

In regards to claim 1, Wang teaches a method, comprising:
selecting a set of augmented reality content generators from a plurality of available augmented reality content generators, the set of augmented reality content generators including at least one augmented reality content generator without a 3D effect and at least one augmented reality content generator with a 3D effect (e.g. [0042],Fig.3D: a full-screen playback view includes scene selector 313 that can be displayed when user 302d has selected the "SCENES" affordance 312; in an embodiment, scene selector 313 is a touch control that can be swiped by user 302d to select virtual background 303d, which in this example is a Japanese tea garden; see also [0045]: for virtual background processing, one or more of 2D image source 411, 3D image source 412 or 360 degree video source 413 can be used to generate virtual background content 415; in an embodiment, a 3D image source can be a rendered 3D image scene with 3D characters; these media sources can each be processed by motion source module 412, which selects the appropriate source depending on the virtual environment selected by the user; Examiner’s note: this shows that a set of scenes/virtual backgrounds (augmented reality content generators), which may be 2D or 3D, is displayed to the user, from which a plurality of different sources can be used)
receiving, at a client device, a selection of a selectable graphical item from a plurality of selectable graphical items, the selectable graphical item comprising an augmented reality content generator including a 3D effect (e.g. as above, [0042],Fig.3D: scene selector 313 is a touch control that can be swiped by user 302d to select virtual background 303d, which in this example is a Japanese tea garden; also [0045]: one or more of 2D image source 411, 3D image source 412 or 360 degree video source 413 can be used to generate virtual background content 415);
capturing image data and depth data using at least one camera of the client device (e.g. [0025]: a selfie subject can be composited with virtual background content extracted from a virtual environment data model; in a preprocessing stage, a coarse matte is generated from depth data provided by a depth sensor and then refined using video data (e.g. RGB video data); in an embodiment, the depth sensor is an infrared (IR) depth sensor embedded in the mobile device; see also [0044]: forward-facing camera 401 generates RGB video and IR depth sensor 402 generates depth data, which are received by Audio/Visual (A/V) processing module 403); and 
applying, to the image data and the depth data, the 3D effect based at least in part on the augmented reality content generator (e.g. [0025]: the matte is composited (e.g. using alpha compositing) with the video data containing an image of the selfie subject, and the real-world background behind the subject is replaced and continuously updated with virtual background content selected from a virtual environment selected by the user; the video data, refined matte, virtual background content and optionally one or more animation layers are composited to form an AR selfie video; the AR selfie video is displayed to the user by a viewport of the mobile device),
but does not explicitly teach the method, 
wherein selection is based on metadata associated with each respective augmented reality content generator, the metadata including information indicating a corresponding augmented reality content generator includes at least a 3D effect, and
wherein applying the 3D effect comprises:
generating a depth map using at least depth data,
generating a segmentation mask based at least on the image data, and
performing a background inpainting and blurring of the image data using at least the segmentation mask to generate background inpainted image data, the performing background inpainting comprising performing a diffusion based inpainting technique that fills in a missing region by propagating image content from a boundary between the missing region and a background region to an interior of the missing region.

However, Tomisawa teaches a method,
wherein selection is based on metadata associated with each respective augmented reality content generator, the metadata including information indicating a corresponding augmented reality content generator includes at least a 3D effect (e.g. [0067]: parameter providing device preferably provides a three-dimensional effect parameter (metadata) indicating an extent of three-dimensional effect, depending on a depth position and contents of at least one display object; three-dimensional effect parameter provided by the parameter providing device may include information indicating to which extent three-dimensional effect is to be emphasized (e.g. what length of depth range from among the entire length can provide a three-dimensional effect of a display object), or may include information indicating an instruction to display a display object in two-dimension without a three-dimensional effect being provided).

Therefore, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to have modified the teachings/combination of Wang to select/display effects, in the same conventional manner as taught by Tomisawa as both deal with image processing. The motivation 

However, Webster teaches a method,
wherein applying the effect comprises:
generating a depth map using at least depth data (e.g. [0088]: at a step 110, performed by the processor 405 as directed by the program 433, a captured image 111 (also referred to as an initial image) with large depth of field (see 1301 in Fig.13) is obtained; a corresponding binary depth map 112 is obtained (see 1302 in Fig.13) at a following step 120),
generating a segmentation mask based at least on the image data (e.g. [0088].Fig.1: at a following step 130, performed by the processor 405 as directed by the program 433, the binary depth map 112 is used to select background pixels in the image 111 obtained in the step 110 to form a background image 131; the binary depth map 112 identifies at least one foreground region 215 and at least one background region 220 (see Fig.2A) in the image 111; Examiner’s note: the binary depth map being viewed/used as a segmentation mask), and
performing a background inpainting and blurring of the image data using at least the segmentation mask to generate background inpainted image data (e.g. [0088]: the selected background pixels (i.e. the background image 131) are blurred using a spatially invariant filter to form a blurred background image 132 (also referred to as a noise-restored normalized blurred background image); also [0059]: in-paint the obscured background pixels before blurring).
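
For illustration only, the Webster steps cited above can be sketched as follows: a depth map is thresholded into a binary depth map, the binary map is used as a segmentation mask, and only the background pixels are blurred. This is a minimal sketch and not code from Webster; the function names, the threshold value, and the 3x3 box filter normalized over background pixels are assumptions made for this example.

```python
def binary_depth_map(depth, threshold):
    """Return a binary map: 1 = foreground (nearer than threshold), 0 = background.

    The threshold is a hypothetical parameter chosen for this sketch.
    """
    return [[1 if d < threshold else 0 for d in row] for row in depth]


def blur_background(image, mask):
    """Blur background pixels (mask == 0) with a 3x3 box filter.

    Only background neighbours contribute to each average, so foreground
    colour does not bleed into the blurred background. Foreground pixels
    are copied through unchanged.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            if mask[y][x] == 1:
                continue  # foreground pixel: leave as-is
            total, count = 0.0, 0
            for ny in range(max(0, y - 1), min(h, y + 2)):
                for nx in range(max(0, x - 1), min(w, x + 2)):
                    if mask[ny][nx] == 0:
                        total += image[ny][nx]
                        count += 1
            # count >= 1 because the pixel itself is background
            out[y][x] = total / count
    return out
```

In this sketch the same binary map serves both as the depth map output and as the segmentation mask, which mirrors the Examiner's note above that the binary depth map is viewed as a segmentation mask.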

Therefore, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to have modified the teachings/combination of Wang to perform processing, in the 

Even further, Huang teaches a method,
wherein performing inpainting comprises performing a diffusion based inpainting technique that fills in a missing region by propagating image content from a boundary between the missing region and a background region to an interior of the missing region (e.g. [0026]: removal engine 210 is configured to receive a selection of a region in an image and remove an object depicted in the region using areas surrounding the selected region; [0047]: in some example embodiments, the removal engine 210 implements a diffusion based inpainting scheme (e.g. Navier-Stokes) to fill missing areas in the images; Examiner’s note: this suggests the mixing/blurring of the missing region and the surrounding regions (viewed as background)).
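
For illustration only, the general idea of diffusion based inpainting (image content propagating from the boundary of a missing region into its interior) can be sketched as follows. This is a minimal sketch using simple isotropic (heat-equation style) diffusion; it is not the Navier-Stokes based scheme named in Huang, and the function name, the 4-connected neighbourhood, and the iteration count are assumptions made for this example.

```python
def diffusion_inpaint(image, missing, iterations=200):
    """Fill pixels where missing[y][x] is True by isotropic diffusion.

    On every pass each missing pixel is replaced by the mean of its
    4-connected neighbours. Known pixels are never modified, so their
    values flow inward from the boundary of the missing region.
    """
    h, w = len(image), len(image[0])
    # Start missing pixels at 0.0; known pixels keep their values.
    cur = [[0.0 if missing[y][x] else float(image[y][x]) for x in range(w)]
           for y in range(h)]
    for _ in range(iterations):
        nxt = [row[:] for row in cur]
        for y in range(h):
            for x in range(w):
                if not missing[y][x]:
                    continue
                neigh = [cur[ny][nx]
                         for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                         if 0 <= ny < h and 0 <= nx < w]
                if neigh:
                    nxt[y][x] = sum(neigh) / len(neigh)
        cur = nxt
    return cur
```

With enough iterations the filled region settles to a smooth interpolation of the surrounding known pixels; for example, a one-row image with known end values and a missing middle converges toward a linear ramp between the two ends.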

Therefore, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to have modified the teachings/combination of Wang, Tomisawa and Webster to inpaint, in the same conventional manner as taught by Huang as both deal with inpainting in images. The motivation to combine the two would be that it would allow the use of diffusion-based inpainting.

In regards to system claim 11 and medium claim 20, claim(s) 11, 20 recite(s) limitations that is/are similar in scope to the limitations recited in claim 1. Therefore, claim(s) 11, 20 is/are subject to rejections under the same rationale as applied hereinabove for claim 1. To note, Wang discloses the use of one or more processors in paragraph [0081] and memory/medium in paragraphs [0090]-[0091] and [0097]. 
	

In regards to claim 10, Wang teaches a method, further comprising:
causing display of an interface comprising a plurality of selectable graphical items, each selectable graphical item corresponding to a respective augmented reality content generator of the set of augmented reality content generators (e.g. as above, [0042],Fig.3D: a full-screen playback view includes scene selector 313 that can be displayed when user 302d has selected the "SCENES" affordance 312; in an embodiment, scene selector 313 is a touch control that can be swiped by user 302d to select virtual background 303d, which in this example is a Japanese tea garden).

In regards to claim 12, Wang teaches a system, further comprising:
generating a 3D message based at least in part on the applied 3D effect (e.g. [0073]: AR selfie video can be played back from storage through the viewport and also shared with others, for example, on social networks; as above, [0025]: the video data, refined matte, virtual background content and optionally one or more animation layers are composited to form an AR selfie video; also as above, [0045]: one or more of 2D image source 411, 3D image source 412 or 360 degree video source 413 can be used to generate virtual background content 415; Examiner’s note: sharing/posting content on social networks may be viewed as messages); and 
rendering a view of the 3D message based at least in part on the applied 3D effect (e.g. as above, [0073]; Examiner’s note: shared content on social networks suggests the displaying and rendering of said content when other users view such shared content).

In regards to claim 19, Wang teaches a system, wherein the augmented reality content generator includes a 3D object rendered in proximity to facial image data from the image data (e.g. as above, [0025]: the video data, refined matte, virtual background content and optionally one or more animation layers are composited to form an AR selfie video; also [0045]: for virtual background processing, one or more of 2D image source 411, 3D image source 412 or 360 degree video source 413 can be used to generate virtual background content 415; in an embodiment, a 3D image source can be a rendered 3D image scene with 3D characters; Examiner’s note: selfie would include facial image data, such as shown in Fig.3D).

Claim(s) 3-5, 7, 13-15, 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Wang, Tomisawa, Webster and Huang as applied to claims 1, 11 above, and further in view of Sun et al. (US 2013/0208093 A1).

In regards to claim 3, the combination of Wang, Tomisawa, Webster and Huang teaches the method of claim 1, but does not explicitly teach the method, wherein the at least one camera comprises a first camera and a second camera, the first camera having a first focal length and the second camera having a second focal length, the first focal length and the second focal length being different.

However, Sun teaches a method, wherein the at least one camera comprises a first camera and a second camera, the first camera having a first focal length and the second camera having a second focal length, the first focal length and the second focal length being different (e.g. [0022]: imaging system 12 (e.g. camera module 12) may include one or more image sensors 14 and corresponding lenses; when device 10 includes two image sensors 14, device 10 may be able to capture stereo images; [0038]: stereo disparity may be used in computing depth and the depth then used for blurring operations; depth (e.g. a depth map) may be deduced from a disparity map, which is a map of pixel disparities in left and right images (e.g. a map detailing how much each object in a scene shifts in the left and right image, which is indicative of its distance from the stereo imager); in general, the depth of a particular object is a function proportional to the focal length and baseline separation between the left and right cameras and inversely proportional to the disparity of that object; Examiner’s note: this suggests that each image sensor/camera would have a focal length).
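
For illustration only, the relationship cited from Sun [0038] (depth proportional to focal length and baseline, inversely proportional to disparity) corresponds to the standard pinhole stereo relation Z = f * B / d. The sketch below is not code from Sun; the function names and units are assumptions, and it further assumes a rectified image pair normalized to a single common focal length expressed in pixels.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres for one pixel under the relation Z = f * B / d.

    A non-positive disparity is treated as "infinitely far" in this sketch.
    """
    if disparity_px <= 0:
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


def depth_map_from_disparity_map(disparity_map, focal_length_px, baseline_m):
    """Apply the per-pixel relation to a whole disparity map (list of rows)."""
    return [[depth_from_disparity(d, focal_length_px, baseline_m) for d in row]
            for row in disparity_map]
```

Under these assumptions, doubling the disparity of an object halves its computed depth, which is the inverse proportionality described in the cited paragraph.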

Therefore, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to have modified the teachings/combination of Wang, Tomisawa, Webster and Huang to use multiple cameras, in the same conventional manner as taught by Sun as both deal with image processing. The motivation to combine the two would be that it would allow the use of more than one camera in order to determine depth data.

In regards to system claim 13, claim(s) 13 recite(s) limitations that is/are similar in scope to the limitations recited in claim 3. Therefore, claim(s) 13 is/are subject to rejections under the same rationale as applied hereinabove for claim 3.

In regards to claim 4, Sun also teaches a method, wherein a disparity map is generated based at least in part on a distance between a first pixel from a first image captured by the first camera and a second pixel from a second image captured by the second camera, the first pixel and second pixel corresponding to a same object (e.g. as above, [0038]: disparity map, which is a map of pixel disparities in left and right images (e.g. a map detailing how much each object in a scene shifts in the left and right image, which is indicative of its distance from the stereo imager)).

In addition, the same rationale/motivation of claim 3 is used for claim 4.

In regards to system claim 14, claim(s) 14 recite(s) limitations that is/are similar in scope to the limitations recited in claim 4. Therefore, claim(s) 14 is/are subject to rejections under the same rationale as applied hereinabove for claim 4.

In regards to claim 5, Sun also teaches a method, wherein the disparity map comprises an image where each pixel includes a distance value between a pixel from the first image to corresponding pixel from the second image (e.g. as above, [0038]: disparity map, which is a map of pixel disparities in left and right images (e.g. a map detailing how much each object in a scene shifts in the left and right image, which is indicative of its distance from the stereo imager); Examiner’s note: the map can be viewed as an image).

In addition, the same rationale/motivation of claim 4 is used for claim 5.

In regards to system claim 15, claim(s) 15 recite(s) limitations that is/are similar in scope to the limitations recited in claim 5. Therefore, claim(s) 15 is/are subject to rejections under the same rationale as applied hereinabove for claim 5.

In regards to claim 7, Sun also teaches a method, wherein the depth map is generated based at least in part on the disparity map (e.g. as above, [0038]: depth (e.g. a depth map) may be deduced from a disparity map).

In addition, the same rationale/motivation of claim 4 is used for claim 7.

In regards to system claim 17, claim(s) 17 recite(s) limitations that is/are similar in scope to the limitations recited in claim 7. Therefore, claim(s) 17 is/are subject to rejections under the same rationale as applied hereinabove for claim 7.

Claim(s) 6, 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Wang, Tomisawa, Webster, Huang and Sun as applied to claims 4, 14 above, and further in view of Javidnia et al. (US 2019/0333237 A1).

In regards to claim 6, the combination of Wang, Tomisawa, Webster, Huang and Sun teaches the method of claim 4, but does not explicitly teach the method, wherein first pixels of a first object in the disparity map have a greater brightness than second pixels of a second object in the disparity map, the first pixels having a lesser depth values than second depth values of the second pixels.

However, Javidnia teaches a method, wherein first pixels of a first object in the disparity map have a greater brightness than second pixels of a second object in the disparity map, the first pixels having a lesser depth values than second depth values of the second pixels (e.g. Figs.2A,B,F; Examiner’s note: Fig.2F shows a final disparity of the gray-scale images of Figs.2A,2B; Fig.2F also shows that objects with lesser depth values are displayed in a greater brightness).
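
For illustration only, a disparity map can be visualized as a gray-scale image in which nearer objects (larger disparity, lesser depth) appear brighter, consistent with the Examiner's reading of Fig.2F above. The sketch below is not code from Javidnia; the function name and the linear 0 to 255 scaling are assumptions made for this example.

```python
def disparity_to_gray(disparity_map):
    """Scale disparities linearly to 0..255 gray levels.

    Larger disparity means a nearer object, so nearer objects come out
    brighter. A constant map is rendered as all zeros in this sketch.
    """
    flat = [d for row in disparity_map for d in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0 for _ in row] for row in disparity_map]
    return [[round(255 * (d - lo) / (hi - lo)) for d in row]
            for row in disparity_map]
```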

Therefore, it would have been obvious to one of ordinary skill in the art at the time the invention was filed to have modified the teachings/combination of Wang, Tomisawa, Webster, Huang and Sun to determine a disparity map, in the same conventional manner as taught by Javidnia as both deal with image processing. The motivation to combine the two would be that it would allow the generation of a disparity map, which is visualized using brightness to represent depth.

In regards to system claim 16, claim(s) 16 recite(s) limitations that is/are similar in scope to the limitations recited in claim 6. Therefore, claim(s) 16 is/are subject to rejections under the same rationale as applied hereinabove for claim 6.

Claim(s) 8-9, 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Wang, Tomisawa, Webster and Huang as applied to claims 1, 11 above, and further in view of Sharif (US 2020/0020173 A1).

In regards to claim 8, the combination of Wang, Tomisawa, Webster and Huang teaches the method of claim 1, but does not explicitly teach the method, wherein the augmented reality content generator includes a beautification operation.

However, Sharif teaches a method, wherein the augmented reality content generator includes a beautification operation (e.g. [0057]: the 3D facial model may be used for constructing a 3D model of the user 102 that are applicable in personalization of products, services, gaming, graphical content, identification, augmented reality, facial make up, etc.; [0093]-[0095]: facial texture is generated so as to preserve lighting effects of the 2D facial image 400; in at least one example embodiment, the facial texture is generated by removing a plurality of pixels from the 2D facial image 400; it may be understood here that removal of unwanted pixels may include performing beautification of the face 402; facial graphics data obtained from the 2D facial image 400 are mapped to a generic 3D head model for rendering a 3D facial model of the face 402).



In regards to system claim 18, claim(s) 18 recite(s) limitations that is/are similar in scope to the limitations recited in claim 8. Therefore, claim(s) 18 is/are subject to rejections under the same rationale as applied hereinabove for claim 8.

In regards to claim 9, Wang teaches a method, wherein the augmented reality content generator includes a 3D object rendered in proximity to facial image data from the image data (e.g. as above, [0025]: the video data, refined matte, virtual background content and optionally one or more animation layers are composited to form an AR selfie video; also [0045]: for virtual background processing, one or more of 2D image source 411, 3D image source 412 or 360 degree video source 413 can be used to generate virtual background content 415; in an embodiment, a 3D image source can be a rendered 3D image scene with 3D characters; Examiner’s note: selfie would include facial image data, such as shown in Fig.3D).

Allowable Subject Matter
Claim(s) 2 is/are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.



Claim(s) 2 was/were carefully reviewed and a search with regards to independent claim(s) 1 has been made. Accordingly, those claim(s) are believed to be distinct from the prior art searched.

Regarding claim(s) 2 (and specifically independent claim(s) 1), the prior art searched was found to neither anticipate nor suggest a method, wherein the background region comprises a particular region of the image data without a foreground subject and the missing region includes the foreground subject, generating a depth map comprises converting a single channel floating point texture into a raw depth map, portions of the single channel floating point texture is sent into multiple lower precision channels, the raw depth map has a lower resolution than the image data, and further comprising: generating a 3D message based at least in part on the applied 3D effect; and rendering a view of the 3D message based at least in part on the applied 3D effect (emphasis added).

It is viewed that any of the previously cited references or any of the prior art searched, in part or in whole, cannot be combined in such a way to render the claimed invention obvious.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JED-JUSTIN IMPERIAL whose telephone number is (571)270-5807. The examiner can normally be reached Monday to Friday, 11am - 7pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JED-JUSTIN IMPERIAL/Examiner, Art Unit 2612
/JENNIFER MEHMOOD/Supervisory Patent Examiner, Art Unit 2612