Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
This action is in reply to the amendments and remarks filed on 02/17/2021.
Claims 1, 6-9, and 12-20 are pending.
Claims 1, 6, 8-9, 12-13, and 15-20 have been amended.
Claims 2-5 and 10-11 have been canceled.

Response to Arguments
Applicant’s arguments, with respect to objections of claims 1, 16-19 and some objection(s) of claim 6 have been fully considered and are persuasive. Therefore, the objections set forth in the previous office action have been withdrawn. However, upon further consideration, a new ground(s) of objection have been made.

Applicant’s arguments, with respect to claim 6 objections, have been fully considered but they are not persuasive. Examiner notes that not all of the objection(s) to claim 6 were addressed; the unaddressed objection(s) are therefore maintained.

Applicant’s arguments, with respect to rejection(s) of claim(s) 6, 10-11, 13, and some rejection(s) of claims 16-19 under 35 U.S.C. 112(b), have been fully considered and are persuasive. Therefore, the rejections set forth in the previous office action have been withdrawn.

Applicant’s arguments, with respect to the rejection(s) of claim(s) 17-19 under 35 U.S.C. 112(b), have been fully considered but they are not persuasive. Examiner notes that not all of the rejection(s) to claims 17-19 were addressed and are therefore maintained.

Applicant’s arguments, with respect to the rejection(s) of claim(s) 1 under 35 U.S.C. 103, have been considered but they are not persuasive. More specifically, the applicant argues that no art of record teaches the amended claim language that now states “applying the first neural network to frames in the video data to transform the frames to the first style; outputting the transformed video data to a display of the mobile device in real-time as the position and the orientation of the camera system is changing such that the transformed video data output to the display appears consistent with a current position and a current orientation of the camera system”, since “paragraphs [0053] and [0054] of Rymkowski do not describe presenting transformed video data in real-time, nor do they describe performing the transformation in real-time”. The examiner respectfully disagrees. Rymkowski, paragraphs 0032, 0035, and 0053-0054 teach displaying (outputting) “a video stream as it is captured while…contemporaneously generate a fused version of the captured video stream” on a mobile device “user interface” (transformed video data to a display of the mobile device in real-time…such that the transformed video data output to the display appears consistent with a current position and a current orientation of the camera system).
Further, Baruch was cited in alternative to Rymkowski, where Baruch paragraphs 0002, 0024, 0040, 0043, 0082, and 0083 teach a mobile phone’s “artistic preview-mode in an RGBD camera” application for a user to apply and view (outputting) a displayed “artistic filter” over a selected object in “real-time”, including and color editing via a neural network (transformed video data). It is further taught that this can be used to overcome “occlusions and abrupt and/or fast movements of the objects in the image or the camera” in “real-time” (the transformed video data output to the display appears consistent with a current position and a current orientation of the camera system). Paragraphs 0033-0036 further teach the overcoming of camera movements while displaying (outputting) the “processed” frames (transformed video data output) “to a person viewing the video sequence” in “real-time” so the processed video appears “instantaneous[ly]” (transformed video data output to the display appears consistent with a current position and a current orientation of the camera system).
See the 35 U.S.C. 103 section for the full mapping of claim limitations necessitated by applicant’s amendments.


Claim Objections
Claims 6 and 15 are objected to because of the following informalities:
Claim 6 recites a grammatical error stating “orientation in the real-time on the display”. An optional solution to overcome this objection is to change the claim to read “orientation in real-time on the display”.
Claim 15 recites a grammatical error stating “generating second frames that each includes one”. An optional solution to overcome this objection is to change the claim to read “generating second frames that each include one”.
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 6 and 17-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant regards as the invention.
Claim 6 recites dependency upon canceled claim 5 and is therefore indefinite. Applicant can overcome this rejection by listing dependency upon independent claim 1.

Claim 17 recites the limitation “the portion of the second frames” in line 3 with insufficient antecedent basis for this limitation in the claim.

Claim 17 recites the limitation “wherein an object associated with MVIDMR appears to change in a first orientation in the real-time on the display”, but it is unclear to the examiner if this MVIDMR refers to the “MVIDMR” of claim 15 or is a different MVIDMR.

Claim 18 recites the limitation “the style associated with the first neural network” in line 3 with insufficient antecedent basis for this limitation in the claim. Applicant can overcome this rejection by amending the claim to read “the first style associated with the first neural network”.

Claim 18 recites the limitation “wherein an object associated with MVIDMR appears to change in a first orientation in the real-time on the display”, but it is unclear to the examiner if this MVIDMR refers to the “MVIDMR” of claim 15 or is a different MVIDMR.

Claim 19 recites the limitation “a first style associated with the first neural network”, but it is unclear to the examiner if this “first style” refers to the “first style” of claim 1 or is a different “first style”. Applicant can overcome this rejection by amending the claim to read “…video data to the first style associated…”.

Claim 19 recites the limitation “the portion of the second frames…applying the neural network to a portion of the second frames” in line 3 with insufficient antecedent basis for this limitation in the claim, and further it is unclear if “a portion” is meant to refer to “the portion” in claim 19. Applicant can overcome this rejection by amending the claim to read “a portion of the second frames…applying the neural network to the portion of the second frames”.

Claim 19 recites the limitation “wherein an object associated with MVIDMR appears to change in a first orientation in the real-time on the display”, but it is unclear to the examiner if this MVIDMR refers to the “MVIDMR” of claim 15 or is a different MVIDMR.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 6, 12-14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Rymkowski et al (US Pub 20180082715) hereinafter Rymkowski, in view of Sommer et al (US Pub 20170357910) hereinafter Sommer, and in further view of Baruch et al (US Pub 20170337693) hereinafter Baruch.
Regarding claim 1, Rymkowski teaches a method comprising: 
receiving a selection of a first one of a plurality of styles on a mobile device (paragraphs 0033 and 0042 teach “a user may simply select the artistic style that he or she wishes to apply (receiving a selection of a first…styles), e.g., from a list of paintings, artists, or predetermined available artistic style ‘filters (of a plurality of styles).’” It is further taught that the neural network style transfer of videos being operated on “mobile phones (on a mobile device)”);
receiving, on the mobile device, a plurality of sets of weighting factors from a remote computing device via a network interface, the weighting factors associated with a first neural network, wherein the first neural network is trained to convert video images to the first style (paragraph 0045 teaches changing or updating “some of the stylization parameter(s)…e.g., the various neural network parameters or the size of the ‘interpolation neighborhood,’ i.e., the number of prior and subsequent frames pulled into the blending process” (receiving a plurality of sets of weighting factors) of the “neural network” (associated with the first neural network), where paragraphs 0030-0031 and 0033 teach these are used by the neural network for optimized “style transfer” of consecutive images in a video sequence (wherein the first neural network is trained to convert video images to the first style), that “gradually alter the target image’s content over multiple iterations”. Further, paragraph 0033 teaches the neural network style transfer of videos being operated on “mobile phones (on the mobile device)”.); 
receiving a live feed of video data from a camera system on the mobile device wherein a position and orientation of the camera system is changing as a function of time (paragraphs 0032, 0035, and 0053-0054 teach displaying “a video stream as it is captured (receiving a live feed of video data)” of moving, identified objects from frame to frame over time via (from) “image capture circuitry (camera system)” on the mobile phone (device), where the user can apply “stabilization constraint(s)” understood to be used when a handheld video is captured through a mobile device camera (wherein a position and orientation of the camera system is changing as a function of time)); 
applying the first neural network to frames in the video data to transform the frames to the first style (paragraphs 0033, 0048, and Fig. 3 teach using (applying) a “style-specific neural network pass (first style)” on “each image in the video sequence (frames in the video data to transform the frames)” to output “desired stylized output image[s]”, for example, by “application of the generated neural network (applying the first neural network) to the region 516 of the target image”); 
outputting the transformed video data to a display of the mobile device in real-time as the position and the orientation of the camera system is changing such that the transformed video data output to the display appears consistent with a current position and a current orientation of the camera system (paragraphs 0032, 0035, and 0053-0054 teach displaying (outputting) “a video stream as it is captured while…contemporaneously generate a fused version of the captured video stream” on a mobile device “user interface” (transformed video data to a display of the mobile device in real-time…such that the transformed video data output to the display appears consistent with a current position and a current orientation of the camera system) of moving, identified objects from frame to frame over time via “image capture circuitry (camera system)” on the mobile phone (device), where the user can apply “stabilization constraint(s)” understood to be used when a handheld video is captured through a mobile device camera (as the position and the orientation of the camera system is changing). Paragraph 0049 further teaches that the user can “preview” the changed images based on the current style, and can select a different style if the user desires.); and 
recording the video data from the live feed to a memory module (paragraphs 0045, 0050, 0053-0054, and Fig. 5 teach the user selects to capture a video when deemed necessary with the presented overlay and able to “store the stylized video sequence in memory”; where “a video stream [can be displayed] as it is captured while…contemporaneously generate a fused version of the captured video stream, storing the video stream in memory 760 and/or storage 765” (recording the video data from the live feed to a memory module)).
However Rymkowski does not explicitly teach receiving, on the mobile device, a plurality of sets of weighting factors from a remote computing device via a network interface.
Sommer teaches receiving, on the mobile device, a plurality of sets of weighting factors from a remote computing device via a network interface (paragraphs 0010-0014, 0024, and Fig. 1 teach an “AI cloud service” on a server transmitting an AI model update over a network (remote computing device via a network interface) to replace or change the AI models and/or the AI model weights on different mobile devices that host and run AI model such as neural networks (receiving, on the mobile device, a plurality of sets of weighting factors)).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to implement Sommer’s teachings of receiving mobile phone hosted neural network model updates from a remote server into Rymkowski’s teaching of mobile phone image/video sequence frame style transfer with a “style-specific neural network[s]” in order to increase storage space on the mobile 
Further, Rymkowski at least implies receiving a live feed of video data from a camera system on the mobile device wherein a position and orientation of the camera system is changing as a function of time (see mapping above) and outputting the transformed video data to a display of the mobile device in real-time as the position and the orientation of the camera system is changing such that the transformed video data output to the display appears consistent with a current position and a current orientation of the camera system (see mapping above), however Baruch teaches receiving a live feed of video data from a camera system on the mobile device wherein a position and orientation of the camera system is changing as a function of time (paragraphs 0002, 0024, 0040, 0043, 0082, and 0083 teach a mobile phone’s “artistic preview-mode in an RGBD camera (from a camera system on the mobile device)” able to display video in “real-time” (receiving a live feed of video data), where the user is able to apply filters for “overcoming occlusions and abrupt and/or fast movements of the objects in the image or the camera” in real-time (wherein a position and orientation of the camera system is changing as a function of time)); and 
outputting the transformed video data to a display of the mobile device in real-time as the position and the orientation of the camera system is changing such that the transformed video data output to the display appears consistent with a current position and a current orientation of the camera system (paragraphs 0002, 0024, 0040, 0043, 0082, and 0083 teach a mobile phone’s “artistic preview-mode 
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify mobile phone image/video sequence frame style transfer with a “style-specific neural network[s]”, as taught by Rymkowski as modified by receiving mobile phone hosted neural network model updates from a remote server as taught by Sommer, to include applying “artistic filter[s]” for color editing and “overcoming occlusions and abrupt and/or fast movements of the objects in the image or the camera” of a mobile device in real-time via neural networks as taught by Baruch in order to optimize the previewing of the artistically filtered video of a moving mobile phone camera (Baruch, paragraphs 0002, 0024, 0040, 0043, 0082, and 0083).

Regarding claim 6, the combination of Rymkowski, Sommer, and Baruch teach all the claim limitations of claim 5 above; and further teach generating, based upon a second style, a second neural network including second weighting factors (Rymkowski, paragraphs 0029-0035, 0039, 0042-0044, and 0048-0049 teach iteratively training and optimizing a style-specific neural network (generating, based upon a second style, a second neural network including second weighting factors) in order to apply it to image frames in a video sequence);
applying the second neural network to second frames in the video data to transform the second frames to the second style associated with the second neural network (Rymkowski teaches as shown in mapping below); and 
outputting the second frames to the display in the real-time (Rymkowski, paragraphs 0029-0035, 0039, 0042-0044, 0048-0049, 0053-0054, and Fig. 3 teach iteratively training and optimizing a style-specific neural network (second style associated with the neural network) in order to use the network on “each image in the video sequence (applying the second neural network to second frames in the video data to transform the second frames)” to output “desired stylized output image[s]”, for example, by “application of the generated neural network (applying the second neural network) to the region 516 of the target image”, and further displaying (outputting) “a video stream as it is captured while…contemporaneously generate a fused version of the captured video stream” on a mobile device “user interface” (outputting the second frames to the display in the real-time) of moving, identified objects from frame to frame over real time via “image capture circuitry (camera system)” on the mobile phone. Paragraph 0049 further teaches that the user can “preview” the changed images based on the current style, and can select a different style if the user desires.).

Regarding claim 12, the combination of Rymkowski, Sommer, and Baruch teach all the claim limitations of claim 1 above; and further teach generating composite frames including both video data and the transformed video data and outputting the composite frames to the display (Rymkowski, paragraphs 0029-0035, 0042-0044, 0048-0049, and 0052-0053 teach portions of the image frames from the video sequence being stylized (generating composite frames including both video data and the transformed video data), that the user can “preview” (outputting) the changed images based on the current style (the composite frames), and can select a different style if the user desires on the mobile device display (to the display)).
Rymkowski at least implies generating composite frames including both video data and the transformed video data and outputting the composite frames to a display (see mapping above), however Baruch teaches generating composite frames including both video data and the transformed video data and outputting the composite frames to the display (paragraphs 0002, 0024, 0040, 0043, 0082, and 0083 teach a mobile phone’s “artistic preview-mode in an RGBD camera” application for a user to apply and view (outputting) an “artistic filter” over a selected object (generating composite frames), including and color editing via a neural network (including both video data and the transformed video data). It is further taught that this can be used to overcome “occlusions and abrupt and/or fast movements of the objects in the image or the camera” in real-time (including both video data and the transformed video data). Paragraphs 0033-0036 further teach the overcoming of camera movements while displaying (outputting) the “processed” frames of selected objects (generating composite frames including both video data and the transformed video data) “to a .
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify mobile phone image/video sequence frame style transfer with a “style-specific neural network[s]”, as taught by Rymkowski as modified by receiving mobile phone hosted neural network model updates from a remote server as taught by Sommer, to include applying “artistic filter[s]” for color editing and “overcoming occlusions and abrupt and/or fast movements of the objects in the image or the camera” of a mobile device in real-time via neural networks as taught by Baruch in order to optimize the previewing of the artistically filtered video of a moving mobile phone camera (Baruch, paragraphs 0002, 0024, 0040, 0043, 0082, and 0083).

Regarding claim 13, the combination of Rymkowski, Sommer, and Baruch teach all the claim limitations of claim 1 above; and further teach identifying an object in the video data, wherein each frame in the video data includes a respective first portion that includes the object and a respective second portion that does not include the object (Baruch, paragraphs 0002, 0024, 0033-0036, 0038-0040, 0043, 0082, 0083, and Fig. 6 teach a mobile phone’s “artistic preview-mode in an RGBD camera” application for a user to apply and view an “artistic filter” over video frame’s segmented “user selected object” (identifying an object in the video data) in a “boundary” (wherein each frame in the video data includes a respective first portion that includes the object) away from the background (and a respective second portion that does not include the object), including and color editing via a neural network for the , and wherein applying the first neural network to the frames in the video data involves applying the first neural network to the first portion of each frame but not the second portion of each frame (Baruch, paragraphs 0002, 0024, 0033-0036, 0040, 0043, 0082, and 0083 teach a mobile phone’s “artistic preview-mode in an RGBD camera” application for a user to apply and view an “artistic filter” over a “user selected object”, including and color editing via a neural network for the selected object in a video (applying the first neural network to the first portion of each frame), “and not over the background” (but not the second portion of each frame)).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify mobile phone image/video sequence frame style transfer with a “style-specific neural network[s]”, as taught by Rymkowski as modified by receiving mobile phone hosted neural network model updates from a remote server as taught by Sommer, to include applying “artistic filter[s]” for color editing and “overcoming occlusions and abrupt and/or fast movements of the objects in the image or the camera” of a mobile device in real-time via neural networks as taught by Baruch in order to optimize the previewing of the artistically filtered video of a selected object (Baruch, paragraphs 0002, 0024, 0040, 0043, 0082, and 0083).

Regarding claim 14, the combination of Rymkowski, Sommer, and Baruch teach all the claim limitations of claim 13 above; and further teach the object is a person (Rymkowski, paragraph 0035 and Fig. 3 teach runner (object is a person)).
Rymkowski at least implies the object is a person (see mapping above), however Baruch teaches the object is a person (paragraphs 0002, 0024, 0033-0036, 0040, 0043, 0082, 0083, and Fig. 1 teach a mobile phone’s “artistic preview-mode in an RGBD camera” application for a user to apply and view an “artistic filter” over a “user selected object” that can be a human subject (the object is a person), including and color editing via a neural network for the selected object in a video, “and not over the background”).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify mobile phone image/video sequence frame style transfer with a “style-specific neural network[s]”, as taught by Rymkowski as modified by receiving mobile phone hosted neural network model updates from a remote server as taught by Sommer, to include applying “artistic filter[s]” for color editing and “overcoming occlusions and abrupt and/or fast movements of the objects in the image or the camera” of a mobile device in real-time via neural networks as taught by Baruch in order to optimize the previewing of the artistically filtered video of a selected object (Baruch, paragraphs 0002, 0024, 0040, 0043, 0082, and 0083).

Regarding claim 20, the combination of Rymkowski, Sommer, and Baruch teach all the claim limitations of claim 1 above; and further teach receiving a stream of video data from a remote device (Baruch, paragraphs 0081-0083, 0104, 0107, and Fig. 8 teach obtaining (receiving) video images (stream of video data) at “logic modules” from remote “imagining devices” (from a remote device)); 
applying the first neural network to first frames in the streamed video data to transform the first frames to the first style (Baruch, paragraphs 0002, 0024, 0040, 0043, and 0081-0083 teach a mobile phone’s “artistic preview-mode in an RGBD camera” application for a user to apply and view an “artistic filter” over a selected object, including and color editing via a neural network (applying the first neural network to first frames in the streamed video data to transform the first frames to the first style). It is further taught that this can be used to overcome “occlusions and abrupt and/or fast movements of the objects in the image or the camera” in real-time from frame to frame. Paragraphs 0033-0036 further teach the overcoming of camera movements while displaying the “processed” frames “to a person viewing the video sequence” in “real-time” so the processed video appears “instantaneous[ly]”); and 
outputting the transformed streamed video data to a display of the mobile device (Baruch, paragraphs 0002, 0024, 0040, 0043, and 0081-0083 teach a mobile phone’s “artistic preview-mode in an RGBD camera” application for a user to apply and view (outputting) an “artistic filter” over a selected object, including and color editing via a neural network (transformed streamed video data). It is further taught that this can be used to overcome “occlusions and abrupt and/or fast movements of the objects in the image or the camera” in real-time (display the transformed streamed video data). Paragraphs 0033-0036 further teach the overcoming of camera movements while displaying (outputting) the “processed” frames (display the transformed streamed video data) “to a person viewing the video sequence” on a mobile phone in “real-time” so the processed video appears “instantaneous[ly]” (outputting the transformed streamed video data to a display of the mobile device)).

Claims 7-9 are rejected under 35 U.S.C. 103 as being unpatentable over Rymkowski et al (US Pub 20180082715) hereinafter Rymkowski, in view of Sommer et al (US Pub 20170357910) hereinafter Sommer, in view of Baruch et al (US Pub 20170337693) hereinafter Baruch, and further in view of van der Merwe et al (US Pub 20130240628) hereinafter Merwe.
Regarding claim 7, the combination of Rymkowski, Sommer, Baruch, and Merwe teach all the claim limitations of claim 1 above; and further teach after the transformed video data is output to the display, receiving a request to record the video data from the live feed (Rymkowski, paragraph 0035 teaches “the process may utilize one or more prior-captured initial stylized images and/or one or more subsequently-captured initial stylized images (after the transformed video data is output to the display) to seed the optimization process” (receiving a request to record the video data from the live feed)).
Rymkowski at least implies after the transformed video data is output to the display, receiving a request to record the video data from the live feed (see mapping above), however Merwe teaches after the transformed video data is output to the display, receiving a request to record the video data from the live feed (paragraph 0050 and Fig. 5 teach overlaying the current user’s mobile device video on “a live video preview screen” (transformed video data is output to the display), and then (after) the user selects to capture a video when deemed necessary with the presented overlay (receiving a request to record the video data from the live feed)).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify mobile phone image/video sequence frame style transfer with a “style-specific neural network[s]”, as taught by Rymkowski as modified by receiving mobile phone hosted neural network model updates from a remote server as taught by Sommer, as further modified by applying “artistic filter[s]” for color editing and “overcoming occlusions and abrupt and/or fast movements of the objects in the image or the camera” of a mobile device in real-time via neural networks as taught by Baruch, to include a user selecting to capture a video after previewing an overlaid video stream as taught by Merwe in order to optimize user satisfaction and efficiency for previewing an overlaid video before choosing to capture the video (Merwe, paragraph 0050 and Fig. 5).

Regarding claim 8, the combination of Rymkowski, Sommer, Baruch, and Merwe teach all the claim limitations of claim 7 above; and further teach storing information about the first style to the memory module (Rymkowski, paragraph .

Regarding claim 9, the combination of Rymkowski, Sommer, Baruch, and Merwe teach all the claim limitations of claim 8 above; and further teach stopping the recording (Rymkowski, paragraphs 0029-0035, 0043-0044, and 0048 teach applying a “style-specific neural network pass” on each captured “image in the video sequence” including current frames of frames “0 to N” taught to be stored (stopping the recording), to output from the neural network the “desired stylized output image[s]” and “are combined into a stylized video sequence”); 
retrieving the recorded video data and the first style from the memory module (Rymkowski, paragraphs 0029-0035, 0043-0044, and 0048 teach applying a “style-specific neural network pass” on each captured “image in the video sequence” including current frames of frames “0 to N” taught to be stored (retrieving the recorded video data and the first style from the memory module), to output from the neural network the “desired stylized output image[s]” and “are combined into a stylized video sequence”); 
applying the first neural network to first frames from the recorded video data to transform the first frames to the first style (Rymkowski, paragraphs 0029-0035, 0043-0044, and 0048 teach applying a “style-specific neural network pass” on  and 
outputting the transformed video data to the display of the mobile device (Rymkowski, paragraphs 0049 and 0052-0053 teach that the user can “preview” (outputting) the changed images based on the current style (transformed video data), and can select a different style if the user desires on the mobile device display (to the display of the mobile device)). 

Claims 15-19 are rejected under 35 U.S.C. 103 as being unpatentable over Rymkowski et al (US Pub 20180082715) hereinafter Rymkowski, in view of Sommer et al (US Pub 20170357910) hereinafter Sommer, in view of Baruch et al (US Pub 20170337693) hereinafter Baruch, and further in view of Chan et al (“An Object-Based Approach to Image/Video-Based Synthesis and Processing for 3-D and Multiview Televisions”, 2009) hereinafter Chan.
Regarding claim 15, the combination of Rymkowski, Sommer, and Baruch teach all the claim limitations of claim 1 above; and further teach retrieving a multi-view interactive digital media representation (MVIDMR) from a memory on the mobile device, wherein the MVIDMR includes a plurality of images (Baruch teaches as mapped below); and 
generating second frames that each includes one of the plurality of images from the MVIDMR inserted into one of the frames of the video data (Baruch, paragraph 0035 teaches “Depth image data may be determined by a stereo camera system, such as with RGBD cameras, that captures images of the same scene from multiple angles (multi-view interactive digital media representation (MVIDMR)).  The system may perform a number of computations to determine a 3D space for the scene in the image and the depth dimension for each point, pixel, or feature in the image”. Paragraphs 0083-0086 teach 3D imaging can be accomplished on a mobile device through “multiple images” stored (retrieving a multi-view interactive digital media representation (MVIDMR) from a memory on the mobile device wherein the MVIDMR includes a plurality of images), and paragraphs 0019, 0036, 0040, 0062, and 0068 further teach the user selected object is “segmented from the background” and can be placed within a different background in the video frames while recording (generating second frames that each includes one of the plurality of images from the MVIDMR inserted into one of the frames of the video data)).
The combination at least implies retrieving a multi-view interactive digital media representation (MVIDMR) from a memory on the mobile device, wherein the MVIDMR includes a plurality of images; and generating second frames that each includes one of the plurality of images from the MVIDMR inserted into one of the frames of the video data (see mapping above); however, Chan teaches retrieving a multi-view interactive digital media representation (MVIDMR) from a memory on the mobile device, wherein the MVIDMR includes a plurality of images (Chan teaches as mapped below); and 
generating second frames that each includes one of the plurality of images from the MVIDMR inserted into one of the frames of the video data (Chan, section 1, paragraph 5 teaches placing “the IBR objects (MVIDMR) onto the background of the original or other plenoptic videos (generating second frames that each includes one of the plurality of images from the MVIDMR inserted into one of the frames of the video data)”, where an “IBR object (MVIDMR)” is segmented within its original “plenoptic video” where the object has its own “image sequence (includes a plurality of images), depth map, and other relevant information such as shape information”. Section 2B further teaches these objects are captured through multiple cameras at different angles and can be transferred to other devices once recorded (from memory on the mobile device)).
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify mobile phone image/video sequence frame style transfer with a “style-specific neural network[s]”, as taught by Rymkowski as modified by receiving mobile phone hosted neural network model updates from a remote server as taught by Sommer, as further modified by applying “artistic filter[s]” for color editing and “overcoming occlusions and abrupt and/or fast movements of the objects in the image or the camera” of a mobile device in real-time via neural networks as taught by Baruch, to include IBR objects placed on the background of “other plenoptic videos” as taught by Chan in order to maximize diversity of background videos for IBR images (Chan, section 1, paragraph 5 and section 2B).

Regarding claim 16, the combination of Rymkowski, Sommer, Baruch, and Chan teach all the claim limitations of claim 15 above; and further teach applying the first neural network to the second frames in the video data to transform the second frames to the first style (Baruch teaches as mapped below); and 
outputting the transformed second frames to the display of the mobile device in real-time as the position and the orientation of the camera system is changing, such that the transformed second frames output to the display appear consistent with the current position and the current orientation of the camera system, wherein an object associated with the MVIDMR appears to change in a first orientation in real-time on the display (Baruch, paragraph 0035 teaches “Depth image data may be determined by a stereo camera system, such as with RGBD cameras, that captures images of the same scene from multiple angles (MVIDMR). The system may perform a number of computations to determine a 3D space for the scene in the image and the depth dimension for each point, pixel, or feature in the image”. Paragraphs 0002, 0024, 0019, 0036, 0040, 0043, 0062, 0068, and 0082-0086 further teach 3D imaging can be accomplished on a mobile device through “multiple images” and the user selected object is “segmented from the background” and can be placed within a different background in the video frames while recording, where a mobile phone’s “artistic preview-mode in an RGBD camera” application allows a user to apply and view (outputting) an “artistic filter” over a selected object (MVIDMR), including color editing via a neural network (applying the first neural network to the second frames in the video data to transform the second frames to the first style…outputting the transformed second frames to the display of the mobile device). It is further taught that .
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify mobile phone image/video sequence frame style transfer with a “style-specific neural network[s]”, as taught by Rymkowski as modified by receiving mobile phone hosted neural network model updates from a remote server as taught by Sommer, as modified by applying “artistic filter[s]” for color editing and “overcoming occlusions and abrupt and/or fast movements of the objects in the image or the camera” of a mobile device in real-time via neural networks as taught by Baruch, as further modified by IBR objects placed on the background of “other plenoptic videos” as taught by Chan, to include three dimensional objects as the user selected, segmented video objects as taught by Baruch in order to enhance an image/video’s perception and style through neural networks (Baruch, 0002, 0024, 0019, 0033-0036, 0040, 0043, 0062, 0068, and 0082-0086).

Regarding claim 17, the combination of Rymkowski, Sommer, Baruch, and Chan teach all the claim limitations of claim 15 above; and further teach applying the first neural network to the one of the plurality of images from the MVIDMR inserted into each of the second frames in the video data to transform the portion of the second frames associated with the MVIDMR to the first style (Baruch teaches as mapped below); and 
outputting the transformed second frames to the display of the mobile device in real-time as the position and the orientation of the camera system is changing, such that transformed second frames output to the display appears consistent with the current position and the current orientation of the camera system, wherein an object associated with MVIDMR appears to change in a first orientation in real-time on the display (Baruch, paragraph 0035 teaches “Depth image data may be determined by a stereo camera system, such as with RGBD cameras, that captures images of the same scene from multiple angles (MVIDMR). The system may perform a number of computations to determine a 3D space for the scene in the image and the depth dimension for each point, pixel, or feature in the image”. Paragraphs 0002, 0024, 0019, 0036, 0040, 0043, 0062, 0068, and 0082-0086 further teach 3D imaging can be accomplished on a mobile device through “multiple images” and the user selected object is “segmented from the background” and can be placed within a different background in the video frames while recording, where a mobile phone’s “artistic preview-mode in an RGBD camera” application allows a user to apply and view (outputting) an “artistic filter” over a selected object (MVIDMR), including color editing via a neural network (applying the first neural network to the one of the plurality of images from the MVIDMR inserted into each of the second frames in the video data to transform the portion of the second frames associated with the MVIDMR to the first .
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify mobile phone image/video sequence frame style transfer with a “style-specific neural network[s]”, as taught by Rymkowski as modified by receiving mobile phone hosted neural network model updates from a remote server as taught by Sommer, as modified by applying “artistic filter[s]” for color editing and “overcoming occlusions and abrupt and/or fast movements of the objects in the image or the camera” of a mobile device in real-time via neural networks as taught by Baruch, as further modified by IBR objects placed on the background of “other plenoptic videos” as taught by Chan, to include three dimensional objects as the user selected, segmented video objects as taught by Baruch in order to enhance an image/video’s perception and style through neural networks (Baruch, 0002, 0024, 0019, 0033-0036, 0040, 0043, 0062, 0068, and 0082-0086).

Regarding claim 18, the combination of Rymkowski, Sommer, Baruch, and Chan teach all the claim limitations of claim 15 above; and further teach applying the first neural network to a portion of the second frames associated with the video data to transform the portion of the second frames associated with the video data to the style associated with the first neural network (Baruch teaches as mapped below); and 
outputting the transformed second frames to the display of the mobile device in real-time as the position and the orientation of the camera system is changing, such that transformed second frames output to the display appears consistent with the current position and the current orientation of the camera system, wherein an object associated with MVIDMR appears to change in a first orientation in real-time on the display (Baruch, paragraph 0035 teaches “Depth image data may be determined by a stereo camera system, such as with RGBD cameras, that captures images of the same scene from multiple angles (MVIDMR). The system may perform a number of computations to determine a 3D space for the scene in the image and the depth dimension for each point, pixel, or feature in the image”. Paragraphs 0002, 0024, 0019, 0036, 0040, 0043, 0062, 0068, and 0082-0086 further teach 3D imaging can be accomplished on a mobile device through “multiple images” and the user selected object is “segmented from the background” and can be placed within a different background in the video frames while recording, where a mobile phone’s “artistic preview-mode in an RGBD camera” application allows a user to apply and view (outputting) an “artistic filter” over a selected object (MVIDMR), including color editing via a neural network (applying the first neural network to a portion of the second .
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify mobile phone image/video sequence frame style transfer with a “style-specific neural network[s]”, as taught by Rymkowski as modified by receiving mobile phone hosted neural network model updates from a remote server as taught by Sommer, as modified by applying “artistic filter[s]” for color editing and “overcoming occlusions and abrupt and/or fast movements of the objects in the image or the camera” of a mobile device in real-time via neural networks as taught by Baruch, as further modified by IBR objects placed on the background of “other plenoptic videos” as taught by Chan, to include three dimensional objects as the user selected, segmented video objects as taught by Baruch in order to enhance an image/video’s perception and style through neural networks (Baruch, 0002, 0024, 0019, 0033-0036, 0040, 0043, 0062, 0068, and 0082-0086).

Regarding claim 19, the combination of Rymkowski, Sommer, Baruch, and Chan teach all the claim limitations of claim 15 above; and further teach applying the first neural network to the one of the plurality of images from the MVIDMR inserted into each of the second frames in the video data to transform the portion of the second frames associated with the MVIDMR to the first style (Baruch, paragraph 0035 teaches “Depth image data may be determined by a stereo camera system, such as with RGBD cameras, that captures images of the same scene from multiple angles (MVIDMR). The system may perform a number of computations to determine a 3D space for the scene in the image and the depth dimension for each point, pixel, or feature in the image”. Paragraphs 0002, 0024, 0019, 0033-0036, 0040, 0043, 0062, 0068, and 0082-0086 further teach 3D imaging can be accomplished on a mobile device through “multiple images” and the user selected object is “segmented from the background” and can be placed within a different background in the video frames while recording, where a mobile phone’s “artistic preview-mode in an RGBD camera” application allows a user to apply and view an “artistic filter” over a selected object (MVIDMR), including color editing via a neural network (applying the first neural network to the one of the plurality of images from the MVIDMR inserted into each of the second frames in the video data to transform the portion of the second frames associated with the MVIDMR to the first style)); 
applying the first neural network to a portion of the second frames associated with the video data to transform the portion of the second frames associated with the video data to a first style associated with the first neural network (Baruch teaches as mapped below); and 
outputting the transformed second frames to the display of the mobile device in real-time as the position and the orientation of the camera system is changing, such that transformed second frames output to the display appears consistent with the current position and the current orientation of the camera system, wherein an object associated with MVIDMR appears to change in a first orientation in real-time on the display (Baruch, paragraph 0035 teaches “Depth image data may be determined by a stereo camera system, such as with RGBD cameras, that captures images of the same scene from multiple angles (MVIDMR). The system may perform a number of computations to determine a 3D space for the scene in the image and the depth dimension for each point, pixel, or feature in the image”. Paragraphs 0002, 0024, 0019, 0036, 0040, 0043, 0062, 0068, and 0082-0086 further teach 3D imaging can be accomplished on a mobile device through “multiple images” and the user selected object is “segmented from the background” and can be placed within a different background in the video frames while recording, where a mobile phone’s “artistic preview-mode in an RGBD camera” application allows a user to apply and view (outputting) an “artistic filter” over a selected object (MVIDMR), including color editing via a neural network (applying the first neural network to a portion of the second frames associated with the video data to transform the portion of the second frames associated with the video data to a first style associated with the first neural network). It is further taught that this can be used to overcome “occlusions and abrupt and/or fast movements of the objects in the image or the camera” in real-time (outputting the .
Thus it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to modify mobile phone image/video sequence frame style transfer with a “style-specific neural network[s]”, as taught by Rymkowski as modified by receiving mobile phone hosted neural network model updates from a remote server as taught by Sommer, as modified by applying “artistic filter[s]” for color editing and “overcoming occlusions and abrupt and/or fast movements of the objects in the image or the camera” of a mobile device in real-time via neural networks as taught by Baruch, as further modified by IBR objects placed on the background of “other plenoptic videos” as taught by Chan, to include three dimensional objects as the user selected, segmented video objects as taught by Baruch in order to enhance an image/video’s perception and style through neural networks (Baruch, 0002, 0024, 0019, 0033-0036, 0040, 0043, 0062, 0068, and 0082-0086).

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CLINT MULLINAX whose telephone number is 571-272-3241.  The examiner can normally be reached on Mon - Fri 8:00-4:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached on 571-270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for 




/C.M./Examiner, Art Unit 2123

/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123