DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 7, 9, 11, 17, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over NA (PGPUB: 20150022698) in view of Lu (PGPUB: 20190130229), and further in view of Garten (PGPUB: 20110103644).

Regarding claims 1 and 11, NA teaches an electronic device comprising:
a display (see Fig. 1, item 130);
a camera (see Fig. 1, item 120);
a memory storing one or more instructions (see Fig. 1, item 110); and 
at least one processor (see Fig. 1, item 100) configured to execute the one or more instructions to: 
control the camera to obtain a plurality of images including a first image including a plurality of objects (see Fig. 4A, item buildings, 410, 421);
see Fig. 4A-4C, paragraph 43, generates a final desired image by replacing the unwanted objects with a background image);
detect movement of a second object based on a plurality of positions of the second object in at least two images of the plurality of images (see Fig. 6, paragraph 47, Control unit 100 may compare a previously buffered frame with the next frame (e.g., a frame currently being processed) at operation 615, and may identify motion vectors of objects in the image based on the analysis result at operation 617), the second object being different from the first object (see Fig. 4A, item 410 as the first object, such as buildings, the second object is 421 and 423);
based on detecting the movement of the second object, determine whether the second object is an undesired object (see Fig. 4A-4C, paragraphs 31, 41, and 52, the analyzer 230 may analyze movements of objects included in the image by implementing a frame by frame analysis. The following is an example of how analyzer 230 may identify an object moving in an image. Analyzer 230 may compare a frame output by buffer 220 with a subsequent frame output by image processing unit 210. The frames may be taken sequentially in time; the analyzer 230 and the shooting time setting unit 240 may identify motion vectors of moving objects 421, 423, and 425 in preview frames. Analyzer 230 and shooting time setting unit 240 may determine a photographing time and photographing interval that enables removal of the undesired objects so as to restore a background image by analyzing the speed of the motion vector and/or the object size; the photographing interval for each unwanted object may be determined such that control unit 100 may delete a plurality of undesired objects as well as minimize the photographing time);
obtain a second image by removing the second object from the first image, the second image including the first object (see Fig. 4D, paragraph 43, image synthesizer 250 may remove the frames with the unwanted moving objects 421, 423, and 425 from the image of FIG. 4A, and may generate a final image by synthesizing the background images in areas 431, 433, and 435, as shown in FIG. 4C. In this instance, the final image generated by the image synthesizer 250 may look like the image shown in FIG. 4D); and
control the display to display the obtained second image (see Fig. 4 and 5, paragraph 46, after removing the objects, control unit 100 displays the desired image in the display unit 130 and stores the generated image in the storage unit 110 at operation 523).
However, NA does not expressly teach a first AI model stored in the memory.
Lu teaches the act 304 can include applying a real-time salient content neural network to the real-time digital visual media feed to identify an object in the real-time digital visual media feed. Indeed, as described in FIG. 2, the deep salient object segmentation system 110 can train a real-time salient content neural network to predict salient objects and then apply the real-time salient content neural network to digital images of the real-time digital visual media feed captured at the mobile device 300 to identify objects portrayed in the real-time digital visual media feed (see Fig. 3, paragraph 81).
It would have been obvious to one of ordinary skill in the art before the effective 
However, the combination does not expressly teach, based on determining that the second object is the undesired object, removing the undesired object.
Garten teaches that motion of undesired objects in the image frame may be detected based on image data associated with the first and second frames. For example, the processor may be configured to perform motion estimation to determine one or more motion vectors associated with image data in the first and second frames. Detection of local motion regions may be determined by generating a difference image after image data for the first and second frames have been aligned to a single image that serves as a reference image (see Fig. 2, item 215, paragraph 19); based on the motion of detected objects, the processor may replace image data of the first frame with image data of the second frame at block 220. In certain embodiments, image data may be extracted from the first image and to replace image data of the second image to generate a correct image at block 220. Alternatively, or in combination, image data from a plurality of regions of each image may be employed to replace image data relative to the respective image. Image data replacement may be performed to replace unwanted or disturbing objects with image data associated with a background (see Fig. 2, item 220, paragraph 20).
It would have been obvious to one of ordinary skill in the art before the effective 
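For illustration only, the frame-comparison and background-replacement operations described in NA (paragraphs 43 and 47) and Garten (paragraphs 19-20) can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of either reference: frames are assumed to be already aligned grayscale grids, the names `difference_mask` and `median_background` and the threshold value are hypothetical, and the per-pixel median is a swapped-in simplification of the background synthesis the references describe.

```python
from statistics import median

# Hypothetical frame representation: rows of grayscale pixel values.
Frame = list[list[int]]


def difference_mask(frame_a: Frame, frame_b: Frame, threshold: int = 50) -> list[list[bool]]:
    """Mark pixels whose absolute difference between two aligned frames exceeds the threshold."""
    return [
        [abs(a - b) > threshold for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def median_background(frames: list[Frame]) -> Frame:
    """Estimate the still background as the per-pixel median over aligned frames.

    Assumption: a moving object covers any given pixel in fewer than half of the frames.
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [int(median(f[y][x] for f in frames)) for x in range(width)]
        for y in range(height)
    ]


# One-row example: a bright object (200) moves across a dark background (10).
frames = [
    [[200, 10, 10, 10, 10]],
    [[10, 10, 200, 10, 10]],
    [[10, 10, 10, 10, 200]],
]
mask = difference_mask(frames[0], frames[1])
background = median_background(frames)
```

In this toy example the difference mask flags both the old and the new position of the object, which is why a single two-frame replacement is not enough on its own and the sketch uses a third frame to recover the background.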

Regarding claims 7 and 17, the combination teaches wherein the at least one processor is further configured to:  
acquire information of second relative locations between the camera and the second object (see NA, Fig. 4, paragraph 43, if frames shown in FIGS. 4A to 4C are obtained in the eraser photographing mode, the buffer 220 temporarily stores the obtained frames. The image synthesizer 250 removes frames containing the objects 421, 423, and 425, as shown in FIGS. 4A to 4C, and generates a final desired image by replacing the unwanted objects with a background image);
input the information of the second relative locations to the first AI (see NA, Fig. 4A-4D, paragraph 44, after selecting an image, the control unit 100 may remove the moving object in the selected frame at operation 921. Control unit 100 may synthesize a plurality of frames by selecting a frame having a background image at the location of the removed object at operation 917); and
see NA, Fig. 4B, paragraph 41, moving objects 421, 423, and 425 cannot be completely removed because areas 431, 433, and 435 still contain portions of the moving objects).

Regarding claims 9 and 19, the combination teaches further comprising:
a user input interface configured to receive a user input of driving the camera (see NA, Fig. 5, paragraph 44, when a user requests to photograph an image using camera 120, control unit 100 detects the user request at operation 511 and enters a preview mode to process frames detected by the camera 120 at operation 513); and 
a communication interface configured to: 
form, in response to the user input, a communication link with a server that updates at least one of the first AI and the second AI neural network and receive, over the communication link, data for updating at least one of the first AI model and the second AI model from the server (see Lu, Fig. 1, paragraph 54, upon training one or more salient content neural networks, the deep salient object segmentation system 110 can then utilize the server(s) 102 to provide the one or more neural networks to the client device 104a (and/or the client devices 104b-104n). For instance, the deep salient object segmentation system 110 can provide a real-time salient content neural network and/or a static salient content neural network to the client device 104a (i.e., a mobile device) as part of a digital image editing application installed on the client device 104a. The deep salient object segmentation system 110 can then utilize the client device 104a (and the digital image editing application) to apply the real-time salient content neural network and/or the static salient content neural network).

Claims 2, 4-6, 8, 10, 12, 14-16, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over NA (PGPUB: 20150022698) in view of Lu (PGPUB: 20190130229), in view of Garten (PGPUB: 20110103644), and further in view of Lin (PGPUB: 20170287137).

Regarding claims 2 and 12, the combination teaches wherein the obtaining the second image comprises:
detect first data corresponding to the first object and second data corresponding to the second object (see NA, Fig. 4A, paragraph 41, the analyzer 230 and the shooting time setting unit 240 may identify motion vectors of moving objects 421, 423, and 425 in preview frames. Analyzer 230 and shooting time setting unit 240 may determine a photographing time and photographing interval that enables removal of the undesired objects so as to restore a background image by analyzing the speed of the motion vector and/or the object size);
remove the second data corresponding to the second object from the first image (see NA, Fig. 4A, paragraph 41, determine a photographing time and photographing interval that enables removal of the undesired objects); and
generate the second image by restoring third data corresponding to at least a portion of the first object hidden by the second object, wherein the third data replaces see Fig. 4A-4C, paragraph 41, area 431 may include the moving object 421 and a still object (for example, portion of bridge) may also be in the background. In this instance, the image of area 431 may be restored with a still background image after removing moving object 421).
However, the combination does not expressly teach restoring the third data using a second AI model.
Lin teaches that, based on the digital training image pairs, act 1100 can include the first neural network using deep learning neural network techniques to learn to generate an accurate probability map for a given input image. In one or more embodiments the first neural network is a fine-tuned deconvolution neural network; the act 1140 can include identifying, by at least one processor, a set of pixels corresponding to the object portrayed in the input image based on the segmentation mask. For example, in one or more embodiments, the act 1140 can also include segmenting the object portrayed in the input image for editing purposes. For example, act 1140 can include copying the set of pixels, deleting the set of pixels, replacing the set of pixels, as well as other editing functionality (see Fig. 11, paragraph 123).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination by Lin for providing the act 1140 can also include segmenting the object portrayed in the input image for editing purposes. For example, act 1140 can include copying the set of pixels, deleting the set of pixels, replacing the set of pixels, as well as other editing functionality, as restoring the third data using a second AI model. Therefore, the combination of the teaching, suggestion, or motivation in the prior art would have led one of ordinary skill to modify 

Regarding claims 4 and 14, the combination teaches wherein the at least one processor is further configured to: 
determine a first sharpness of the third data corresponding to the at least the portion of the first object (see Fig. 4A, item 410, paragraph 43, the image synthesizer 250 selects a specific image without the unwanted objects (for example, a good image having no blur), removes the frames with unwanted moving objects from the image, and generates a final image by synthesizing images having a background image at the location of the removed object); and 
perform image processing such that a second sharpness of the third data corresponding to the at least the portion of the first object corresponds to the first sharpness (see NA, Fig. 4, paragraph 43, image synthesizer 250 may remove the frames with the unwanted moving objects 421, 423, and 425 from the image of FIG. 4A, and may generate a final image by synthesizing the background images in areas 431, 433, and 435).  

Regarding claims 5 and 15, the combination teaches further comprising
a user input interface configured to receive user selection of the second object from the first image, wherein the at least one processor is further configured to detect the second data corresponding to the sub-object, based on the user selection (see NA, Fig. 3, paragraph 38, shooting time setting unit 240 may estimate a photographing interval for removing a selected object. This method may include identifying objects unwanted by a user. The analyzer 230 obtains motion vectors for each unwanted object. The shooting time setting unit 240 obtains the object size as the object moves toward a direction by using the motion vector, and calculates the maximum photographing frames per second by dividing the object size by the motion vector).
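For illustration only, the division described in NA, paragraph 38 (object size divided by motion vector) can be sketched as follows. This is a minimal sketch, not NA's implementation: the function names, the pixel and per-frame units, and the reading of the quotient as the number of frames an object needs to move by its own extent are all assumptions introduced here.

```python
def frames_to_clear(object_size_px: float, motion_px_per_frame: float) -> float:
    """Quotient of object size and motion magnitude.

    Assumed interpretation: the number of preview frames after which the object
    has moved by its own extent and no longer covers its original area.
    """
    if motion_px_per_frame <= 0:
        raise ValueError("the object must be moving (positive motion magnitude)")
    return object_size_px / motion_px_per_frame


def capture_interval_s(object_size_px: float, motion_px_per_frame: float, preview_fps: float) -> float:
    """Convert the frame count into seconds for an assumed preview frame rate."""
    return frames_to_clear(object_size_px, motion_px_per_frame) / preview_fps


# Example: a 120-pixel-wide object moving 8 pixels per frame at a 30 fps preview.
frames_needed = frames_to_clear(120, 8)
interval = capture_interval_s(120, 8, 30)
```

Under these assumptions a larger or slower object yields a longer interval, which is consistent with NA's statement that the interval depends on the speed of the motion vector and/or the object size.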

Regarding claims 6 and 16, the combination teaches wherein the display is further configured to display an indicator indicating the second data corresponding to the second object, together with the first image (see NA, Fig. 2, paragraph 27, an analyzer 230 may identify a moving vector of an unwanted object by tracking a movement of the object in the preview frames; Fig. 4A and 4B, Item 410, paragraph 41, moving objects 421, 423, and 425 are partially displayed in areas 431, 433, and 435 respectively).  

Regarding claims 8 and 18, the combination teaches further comprising: 
a user input interface configured to receive a user input of driving the camera (see NA, Fig. 5, paragraph 44, when a user requests to photograph an image using camera 120, control unit 100 detects the user request at operation 511 and enters a preview mode to process frames detected by the camera 120 at operation 513); and 
a communication interface configured to: 
form, in response to the user input, a communication link with a server including see Lu, Fig. 1, paragraph 56, Upon identifying one or more salient objects in digital visual media, the deep salient object segmentation system 110 can also modify the digital visual media. For example, the deep salient object segmentation system 110 can identify a salient object and move, copy, paste, or delete the selected salient object based on additional user input), 
transmit the first image to the server, and receive, from the server (see Lu, Fig. 1, paragraph 50, the environment 100 may also include the server(s) 102. The server(s) 102 may generate, store, receive, and transmit any type of data, including, for example: a training image repository, one or more neural networks, and/or digital image data. For example, the server(s) 102 may receive data from a client device, such as the client device 104a, and send the data to another client device, such as the client device 104b and/or 104n. In one example embodiment, the server(s) 102 is a data server), information about a result of detecting the first data corresponding to the first object and the second data corresponding to the second object from the first image (see Lu, Fig. 1, paragraph 31, the deep salient object segmentation system can utilize the real-time salient content neural network to analyze the real-time digital visual media feed and dynamically segment objects portrayed in the real-time digital visual media feed. The deep salient object segmentation system can also determine that a mobile device has captured (or is otherwise displaying) a static digital image).   

Regarding claims 10 and 20, the combination teaches wherein the first AI model see Lu, Fig. 1, paragraph 53, the deep salient object segmentation system 110 can train the real-time salient content neural network and the static salient content neural network to identify salient objects (e.g., foreground and background pixels) in new digital images. Additional detail regarding training the real-time salient content neural network and the static salient content neural network is provided below (e.g., in relation to FIGS. 2, 3, 5, and 6)).


Claims 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over NA (PGPUB: 20150022698) in view of Lu (PGPUB: 20190130229), in view of Garten (PGPUB: 20110103644), in view of Lin (PGPUB: 20170287137), and further in view of Bae (PGPUB: 20140022394).

Regarding claims 3 and 13, the combination teaches wherein the restoring of the third data corresponding to the at least the portion of the first object comprises:
restore the third data corresponding to the at least the portion of the first object, based on the information of the first relative locations (see NA, Fig. 1, paragraph 29, the image synthesizer 250 may reproduce the image by restoring a background image at the location of the removed object by synthesizing a plurality of preview frames).
The combination does not expressly teach that acquire information of first relative 
Bae teaches that the pattern setting unit 500 obtains information about the position of the horizon, an object to be tracked, information about the actual size of the object, and information about the actual height of the camera 100. In this state, the pattern setting unit 500 sets a pattern size P1 by applying the above information and a distance d1 from the horizon to a predetermined pixel pix1 to Equation 1 (see Fig. 3-4, paragraph 61).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination with the teaching of Bae, which obtains information about the position of the horizon, an object to be tracked, information about the actual size of the object, and information about the actual height of the camera 100, so as to acquire information of first relative locations between the camera and the first object. The combination would therefore provide the location information of the camera and an object for tracking the object.


Response to Arguments
Applicant's arguments filed 7/6/2021 have been fully considered but they are not persuasive. 
On page 12, lines 9-12, applicant argues that “As discussed in Applicant's previous remarks, to the extent that Lu discloses a neural network for identifying objects in images and, for example, highlighting the identified objects, (See Lu, paragraphs [0081]-[0082]), nothing in Lu discloses determining, using the neural network, whether 
Examiner respectfully disagrees. NA indeed teaches that the analyzer 230 may identify motion vectors for each unwanted object that appear to move between the intermediate frames (see Fig. 1-2, paragraph 32); control unit 100 may identify a photographing time and the size of an unwanted object by using the computed motion vector (see Fig. 1-2, paragraph 33); when removing an object 421 from the image of FIG. 4A, a frame may be captured such that object 421 is removed from area 431. Here, area 431 may include the moving object 421 and a still object (for example, portion of bridge) may also be in the background. In this instance, the image of area 431 may be restored with a still background image after removing moving object 421 (see Fig. 4, paragraph 41). NA teaches identifying the desired object, such as item 431, and the undesired object 421 via the control unit 100 and the analyzer 230.
Lu teaches that the act 304 can include applying a real-time salient content neural network to the real-time digital visual media feed to identify an object in the real-time digital visual media feed. Indeed, as described in FIG. 2, the deep salient object segmentation system 110 can train a real-time salient content neural network to predict salient objects and then apply the real-time salient content neural network to digital images of the real-time digital visual media feed captured at the mobile device 300 to identify objects portrayed in the real-time digital visual media feed (see Fig. 3, paragraph 81). Lu teaches identifying the object via a neural network.
Therefore, one skilled in the art combining NA with Lu according to the technologies and methods they disclose would arrive at determining, using the neural network, whether the identified object is a ‘desired object’ or an ‘undesired object’. And further, the 

On page 12, lines 17-20, applicant argues that NA does not disclose artificial intelligence technology. In other words, Applicant submits that NA does not disclose identifying a main object/sub-object using an artificial intelligence model such as a CNN and restoring the area corresponding to the sub-object using an artificial intelligence model such as a GAN.
Examiner respectfully disagrees. NA indeed teaches identifying the main object, sub-objects, and moving objects shown in Fig. 4A-D through analysis and extraction via the processing unit. NA does not expressly teach using a CNN to identify an object, but Lu teaches identifying an object via a CNN technique, which would have been obvious to one skilled in the art.

On page 14, lines 1-3, applicant argues that Lin discloses identifying pixel sets corresponding to objects from images, but does not disclose identifying a main object/sub-objects using the artificial intelligence model.
Examiner respectfully disagrees. NA indeed teaches identifying the main object, sub-objects, and moving objects shown in Fig. 4A-D through analysis and extraction via the processing unit. NA does not expressly teach using a CNN to identify an object, but Lu teaches identifying an object via a CNN technique, which would have been obvious to one skilled in the art.

On page 14, line 20, applicant argues that Lin does not disclose restoring the main object hidden by the sub-object.


Therefore, the combination teaches the limitation of “based on detecting the movement of the second object, determining, by a first AI model stored in a memory of the electronic device, whether the second object is an undesired object; obtaining a second image by removing the second object from the first image, the second image including the first object,” as claimed in claim 1.


Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to XIN JIA whose telephone number is (571)270-5536. The examiner can normally be reached on 9:00 am-7:30 pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matthew Bella, can be reached on (571)272-7778. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/XIN JIA/Primary Examiner, Art Unit 2667