Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Applicant’s response to the last Office action, filed August 20, 2021, has been entered and made of record. Claims 1-3, 6, 15, 17, 20, 22-23, and 26 have been amended; claims 7-14, 16, 21, and 27 have been cancelled. By this amendment, claims 1-6, 15, 17-20, and 22-26 are pending in this application.
In view of Applicant’s amendment, the objection to claims 22-27 for minor informalities has been withdrawn.

Response to Arguments
Applicant's arguments filed 08/20/2021 have been fully considered but they are not persuasive. 
-- Applicant asserted (page 12) that Feris fails to disclose or suggest the limitations:
“for each specified target set, selecting a preset number of specified target video frames from video frames corresponding to the specified target set according to a preset video frame extraction manner; recognizing the specified target in all specified target video frames through the preset recognition algorithm to obtain a target recognition result; and wherein recognizing the specified target in all specified target video frames through the preset recognition algorithm to obtain the target recognition result comprises: extracting features of the specified targets in all the specified target video frames to obtain target features; recognizing sensitive features from the target features through a preset target classification algorithm or the recognition algorithm; and using a relation between the number of the sensitive features and the number of all the target features as the target recognition result”.

	However, the Examiner respectfully disagrees for the following reasons: 
	It should be noted that Feris clearly discloses determining whether a specified target corresponding to each specified target set is a sensitive target respectively through the preset recognition algorithm, (see at least: col. 2, lines 42-67; see the Final Office action for more details). Further, Feris discloses in col. 5, lines 38-40, that the content obfuscation application identifies a subset of the tagged images that correlate to the user-defined sensitivities, where the subset of the tagged images implicitly includes a preset number of tagged images, “specified target video”, that correlate, “correspond”, to the user-defined sensitivities, which implicitly results in obtaining a target recognition result based on recognizing the specified target in all specified target video frames through the correlation. Further, Feris discloses in col. 6, lines 16-19, that the user-configurable settings for content obfuscation are configured to use prediction, machine learning algorithms, and neural networks, such as a deep convolutional neural network, which implicitly involves known algorithms such as “the preset recognition algorithm”, [i.e., for each specified target set, selecting a preset number of specified target video frames from video frames corresponding to the specified target set according to a preset video frame extraction manner, and recognizing the specified target in all specified target video frames through a preset recognition algorithm to obtain a target recognition in conjunction with the index, via the obfuscator engine 108, to enable the selection of options by a user to indicate what types of elements in a media file are considered to be sensitive to that user or family of users, [i.e., determining that the specified target corresponding to the target recognition result is a sensitive target that meets the user’s user settings 110 applied via the obfuscator engine 108, as well as the sensitive target that does not meet the user’s user settings 110 applied via the obfuscator engine 108]. Further, Feris discloses in col. 2, lines 50-52, processing media segments or video frames by the neural network to produce high-level features that are extracted, and forming a feature vector from the set of features [i.e., the extracted high-level features correspond to features of the specified targets in the media segments or video frames, “the specified target video frames”, to obtain a feature vector from the set of features, “obtain target features”]. Feris further discloses in col. 2, lines 54-62, determining a sensitivity indication for the media segment based on the classifier, [i.e., implicitly recognizing the sensitive features from the target features through a preset target classification algorithm or a recognition algorithm]. Feris further discloses in col. 3, lines 1-37, a confidence score, which indicates a value representing the level of confidence that a selected image from the content 102 matches an image from the training data. The content 102 is analyzed against the training data 106 via the content analyzer 104, and the content analyzer 104 generates a content index including sensitive characteristics and a corresponding confidence score, (col. 2, lines 42-45), [accordingly, the matching of a selected image from the content 102 to an image from the training data implies the relation between the number of the sensitive features and the number of all the target features as the target recognition result].
	For reasons stated above, the rejection of claim 1 and its dependent claims was proper, and the rejection is maintained.
	Claims 15 and 22 recite substantially similar limitations as set forth in claim 1. Thus, the remarks above also apply to claims 15 and 22.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 6-7, 15, 20-22, and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Feris et al, (US Patent 9,471,852) in view of Lin et al, (US-PGPUB 2014/0211002).

In regards to claim 1, Feris discloses a method for selecting a to-be-masked region in a video, comprising: 
obtaining a video to be detected, (see at least: col. 2, lines 30-34, the content 102 may be multimedia content, such as consumer programming. Alternatively, the content 102 may be a surveillance or security video, [i.e., a video image is implicitly obtained]);
determining specified target sets in the video through a preset target detection algorithm, (see at least: col. 2, lines 7-17,  the user-configurable settings for content obfuscation provide modification of video elements based on user-defined sensitivities. The sensitivities may include, e.g., emotional response, privacy, security, or other sensitivities. Elements subject to obfuscation may include images depicting objects used in the commission of violent acts, faces of people subject to privacy concerns, victims of violence, or any element that is determined to contribute to the defined sensitivities, [i.e., implicitly determining objects used in the commission of violent acts, faces of people subject to privacy concerns, victims of violence, “specified target sets in the video”, through the user-configurable settings for content obfuscation, “preset target detection algorithm”]);
determining whether a specified target corresponding to each specified target set is a sensitive target respectively through a preset recognition algorithm, (see at least: col. 2, lines 42-67, the content 102 is analyzed against the training data 106 via the content analyzer 104, “preset recognition algorithm”, which may include machine learning techniques, such as a neural network to process the content, wherein the sensitivity indication is determined for the media segment using a classifier, and the media segments are tagged to specify a type of sensitivity based on the results of the classifier, [i.e., 
using the specified target set as to-be-masked regions in the video when a specified target corresponding to this specified target set is determined as a sensitive target, (see at least: col. 5, lines 15-26, presenting a compiled listing of sensitivities (e.g., via the index) for a given media file, such that the user can select from the listing which sensitivities to apply when obfuscating aspects of the media file. The sensitivities can be broadly defined (e.g., acts of violence) or may be more granular in nature (e.g., blood, a war scene, etc.). A user-defined sensitivity may indicate any action or condition that is considered objectionable to the user and which can be identified from a media file through the processing described herein, [i.e., implicitly using a specified target set as to-be-masked regions in the video when a specified target corresponding to this specified target set is determined as a sensitive target]. See also, col. 6, lines 6-16);
wherein determining whether the specified target corresponding to each specified target set is the sensitive target respectively through the preset recognition algorithm comprises: 
for each specified target set, selecting a preset number of specified target video frames from video frames corresponding to the specified target set according to a preset video frame extraction manner, (Feris, see at least: col. 5, lines 38-40, implicit by identifying a subset of the tagged images, “number of specified target video frames from video frames corresponding to the specified target”. See also col. 6, lines 16-19);

determining that the specified target corresponding to the target recognition result is a sensitive target when the target recognition result meets a preset determination rule; or determining that the specified target corresponding to the target recognition result is not a sensitive target when the target recognition result does not meet the preset determination rule, (Feris, col. 3, lines 38-42, the user settings 110 enable the selection of options by a user to indicate what types of elements in a media file are considered to be sensitive to that user or family of users, [i.e., implicitly meeting a certain rule set by the user, “preset determination rule”]);
wherein recognizing the specified target in all specified target video frames through the preset recognition algorithm to obtain the target recognition result comprises: 
extracting features of the specified targets in all the specified target video frames to obtain target features, (Feris et al, see at least: col. 2, lines 50-52, processing media segments or video frames by the neural network to produce high-level features that are extracted, and forming a feature vector from the set of features [i.e., the extracted high-level features correspond to features of the specified targets in the media segments or video frames, “the specified target video frames”, to obtain a feature vector from the set of features, “obtain target features”]);
recognizing sensitive features from the target features through a preset target classification algorithm or the recognition algorithm, (Feris et al, see at least: col. 2, lines 54-62, determining a sensitivity indication for the media segment based on the classifier, [i.e., implicitly recognizing the sensitive features from the target features through a preset target classification algorithm or a recognition algorithm]); and
using a relation between the number of the sensitive features and the number of all the target features as the target recognition result, (Feris et al, see at least: col. 3, lines 1-37, a confidence score, which indicates a value representing the level of confidence that a selected image from the content 102 matches an image from the training data. The content 102 is analyzed against the training data 106 via the content analyzer 104, and the content analyzer 104 generates a content index including sensitive characteristics and a corresponding confidence score, (col. 2, lines 42-45), [accordingly, the matching of a selected image from the content 102 to an image from the training data implies the relation between the number of the sensitive features and the number of all the target features as the target recognition result]).
Feris does not expressly disclose wherein each specified target set is a set of pixels of one specified target in video frames of the video.
However, Lin et al discloses the video object detection system 30, which is capable of determining the location of the target video object based on the defined detection region of the video frame. Since each of the defined detection regions is formed with a set of image pixels in the video frame I, the object detection unit 308 can determine whether the target video object is located on the image pixels of the defined detection region for determining the location of the target video object, [i.e., the set of image pixels of the target video object, “a set of pixels of one specified target”, in video frames of the video], (see at least: Par. 0024).
Feris and Lin et al are combinable because they are both concerned with object detection. Therefore, it would have been obvious to a person of ordinary skill in the art to modify Feris to include the object detection unit 308, as taught by Lin et al, in order to determine the location of the target video object based on the set of image pixels in the video frame, (Lin et al, Par. 0024).

The following prior art of record, (CN103839057), also discloses wherein recognizing the specified target in all specified target video frames through the preset recognition algorithm to obtain the target recognition result comprises: extracting features of the specified targets in all the specified target video frames to obtain target features; recognizing sensitive features from the target features through a preset target classification algorithm or the recognition algorithm; and using a relation between the number of the sensitive features and the number of all the target features as the target recognition result, (see at least: Abstract, and Par. 0042-0045).
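
For illustration only, the limitation "using a relation between the number of the sensitive features and the number of all the target features as the target recognition result", together with the "preset determination rule", may be read as a simple ratio test. The following is a minimal sketch under that assumption. The function names, the per-feature boolean flags, and the 0.5 threshold are hypothetical and are not taken from Feris, Lin et al, CN103839057, or the application.

```python
from typing import Sequence


def target_recognition_result(feature_is_sensitive: Sequence[bool]) -> float:
    """Return the fraction of target features that were recognized as sensitive.

    Each entry is a hypothetical per-feature decision produced by some
    classifier; how that decision is made is outside this sketch.
    """
    total = len(feature_is_sensitive)
    if total == 0:
        return 0.0  # assumption: no features means no evidence of sensitivity
    sensitive = sum(1 for flag in feature_is_sensitive if flag)
    return sensitive / total


def is_sensitive_target(feature_is_sensitive: Sequence[bool],
                        threshold: float = 0.5) -> bool:
    """Apply an assumed determination rule: ratio at or above a threshold."""
    return target_recognition_result(feature_is_sensitive) >= threshold
```

Under this reading, a specified target with three sensitive features out of four target features would have a ratio of 0.75 and would meet the assumed 0.5 rule.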

In regards to claim 2, the combined teaching of Feris and Lin et al as a whole discloses the limitations of claim 1.
The combined teaching of Feris and Lin et al as a whole does not expressly disclose masking the to-be-masked regions in the video frames of the video.
However, masking one or more regions in the video frames of a video is exceedingly well-known and practiced in the art, (e.g., masking a privacy zone in a picture taken by the monitor camera).
Therefore, it would have been obvious to a person of ordinary skill in the art to mask a privacy zone (or more regions) in the video frames of the video.

The prior art of record, Wada et al, (US-Patent 6,744,461) discloses the aspect of masking a privacy zone in a picture taken by the monitor camera.

In regards to claim 6, the combined teaching of Feris and Lin et al as a whole discloses the limitations of claim 1.

determining that the specified target corresponding to the target recognition result is a sensitive target when the target recognition result meets a preset determination rule; or determining that the specified target corresponding to the target recognition result is not a sensitive target when the target recognition result does not meet the preset determination rule, (Feris, col. 3, lines 38-42, the user settings 110 enable the selection of options by a user to indicate what types of elements in a media file are considered to be sensitive to that user or family of users, [i.e., implicitly meeting a certain rule set by the user, “preset determination rule”]).

In regards to claim 7, the combined teaching of Feris and Lin et al as a whole discloses the limitations of claim 6.
Furthermore, Feris discloses wherein recognizing the specified target in all specified target video frames through a preset recognition algorithm to obtain a target recognition result comprises: 
extracting features of the specified targets in all the specified target video frames to obtain target features, (Feris et al, see at least: col. 2, lines 50-52);
recognizing sensitive features from the target features through a preset target classification algorithm or a recognition algorithm, (Feris et al, see at least: col. 2, lines 42-67);


Regarding claim 15, claim 15 recites substantially similar limitations as set forth in claim 1. As such, claim 15 is rejected for at least similar rationale.
The Examiner further acknowledged the following additional limitation(s): “electronic device, comprising a processor and a memory, wherein the memory is configured for storing a computer program; and the processor is configured for executing the program stored in the memory”. However, Feris discloses the electronic device, comprising a processor, (Feris, col. 1, line 26), and a memory, (Feris, col. 6, lines 36-40), wherein the memory is configured for storing a computer program, (Feris, col. 6, lines 36-40); and the processor is configured for executing the program stored in the memory, (Feris, col. 1, lines 26-29).

Regarding claim 20, claim 20 recites substantially similar limitations as set forth in claim 6. As such, claim 20 is rejected for at least similar rationale.

Regarding claim 21, claim 21 recites substantially similar limitations as set forth in claim 7. As such, claim 21 is rejected for at least similar rationale.

Regarding claim 22, claim 22 recites substantially similar limitations as set forth in claim 1. As such, claim 22 is rejected for at least similar rationale.
The Examiner further acknowledged the following additional limitation(s): “system for selecting a to-be-masked region in a video”. However, Feris discloses the “system for selecting a to-be-masked region in a video”, (Feris, col. 1, line 22).

Regarding claim 26, claim 26 recites substantially similar limitations as set forth in claim 6. As such, claim 26 is rejected for at least similar rationale.

Regarding claim 27, claim 27 recites substantially similar limitations as set forth in claim 7. As such, claim 27 is rejected for at least similar rationale.

Claims 3, 5, 17, 19, 23, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Feris et al, and Lin et al, as applied to claim 1 above; and further in view of Bolle et al, (US-PGPUB 2007/0201694)

In regards to claim 3, the combined teaching of Feris and Lin et al as a whole discloses the limitations of claim 1.
The combined teaching of Feris and Lin et al as a whole does not expressly disclose wherein determining specified target sets in the video through a preset target detection algorithm comprises: detecting regions corresponding to all specified targets in each video frame of the video respectively through the preset target detection algorithm; for each of the specified targets, associating regions corresponding to the specified target in chronological order to obtain a trajectory of the specified target; and using the trajectories of the specified targets as the specified target sets for the specified targets in the video.
However, Bolle et al discloses detecting regions corresponding to all specified targets in each video frame of the video respectively through the preset target detection algorithm, (see at least: Par. 0038); for each of the specified targets, associating regions corresponding to the specified target in chronological order, (see at least: Par. 0073), to obtain a trajectory of the specified target, (see at least: Par. 0082); and using the trajectories of the specified targets as the specified target sets for the specified targets in the video, (see at least: Par. 0093-0094).
Feris, Lin et al, and Bolle et al are combinable because they are all concerned with object detection. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Feris and Lin et al to include the tracking module 350, as taught by Bolle et al, in order to compute one or more trajectory-based attributes of each track, (Bolle, Par. 0082).
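
For illustration only, the step of "associating regions corresponding to the specified target in chronological order to obtain a trajectory" may be pictured as linking per-frame detection boxes from one frame to the next. The following is a minimal sketch that assumes a greedy intersection-over-union (IoU) match between consecutive frames. The IoU criterion, the 0.3 threshold, and all names are hypothetical and are not taken from Bolle et al, the other cited references, or the application.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def build_trajectories(frames: List[List[Box]],
                       min_iou: float = 0.3) -> List[List[Tuple[int, Box]]]:
    """Link detected regions frame by frame, in chronological order.

    Each trajectory is a list of (frame_index, box) pairs. A box extends a
    trajectory only if that trajectory ended in the immediately preceding
    frame and the overlap is at least min_iou; otherwise it starts a new one.
    """
    trajectories: List[List[Tuple[int, Box]]] = []
    for frame_idx, boxes in enumerate(frames):
        unmatched = list(boxes)
        for traj in trajectories:
            last_idx, last_box = traj[-1]
            if last_idx != frame_idx - 1 or not unmatched:
                continue
            best = max(unmatched, key=lambda b: iou(last_box, b))
            if iou(last_box, best) >= min_iou:
                traj.append((frame_idx, best))
                unmatched.remove(best)
        # Any region left over starts a new trajectory for a new target.
        for box in unmatched:
            trajectories.append([(frame_idx, box)])
    return trajectories
```

In this sketch each resulting trajectory plays the role of one "specified target set", i.e., the regions of one target collected across the video frames.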

In regards to claim 5, the combined teaching of Feris et al, Lin et al, and Bolle et al as a whole discloses the limitations of claim 3.
Furthermore, Feris et al discloses wherein for each of the specified targets, associating regions corresponding to the specified target in chronological order to obtain a trajectory of the specified target comprises: extracting features of regions corresponding to all the specified targets in each video frame of the video to obtain region features, (Feris et al, col. 2, lines 50-52).


Regarding claim 17, claim 17 recites substantially similar limitations as set forth in claim 3. As such, claim 17 is rejected for at least similar rationale.

Regarding claim 19, claim 19 recites substantially similar limitations as set forth in claim 5. As such, claim 19 is rejected for at least similar rationale.

Regarding claim 23, claim 23 recites substantially similar limitations as set forth in claim 3. As such, claim 23 is rejected for at least similar rationale.

Regarding claim 25, claim 25 recites substantially similar limitations as set forth in claim 5. As such, claim 25 is rejected for at least similar rationale.

Claims 4, 18, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Feris et al, and Lin et al, and Bolle et al, as applied to claim 3 above; and further in view of Cooper et al, (US-PGPUB 2004/0017938)

In regards to claim 4, the combined teaching of Feris et al, Lin et al, and Bolle et al as a whole discloses the limitations of claim 3.
Furthermore, Feris discloses partitioning each video frame of the video into a preset number of regions to obtain a plurality of pixel regions, (col. 2, lines 49-50); 
extracting a feature of each pixel region respectively through a pre-trained convolutional neural network, (col. 2, lines 50-52); 
determining whether each pixel region matches with any of the specified targets through a preset classifier according to the feature of the pixel region, (col. 3, lines 1-3).
The combined teaching of Feris et al, Lin et al, and Bolle et al as a whole does not expressly disclose, in response to a pixel region matching with a specified target, determining a region corresponding to the specified target through a bounding box regression algorithm based on all pixel regions matching with the specified target.
However, Cooper discloses matching a pixel region with a specified target, and determining a region corresponding to the specified target through a bounding box based on all pixel regions matching with the specified target, (see at least: Par. 0065-0069).
Feris, Lin et al, Bolle et al, and Cooper are combinable because they are all concerned with object detection. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Feris, Lin et al, and Bolle et al to match each region in the current frame to the face region in the preceding frame, using a bounding box, (Cooper, Par. 0069).
The combined teaching of Feris, Lin et al, Bolle et al, and Cooper as a whole does not expressly disclose the regression algorithm.
However, it should be noted that using a regression algorithm is exceedingly well-known and practiced in the art.
Therefore, it would have been obvious to a person of ordinary skill in the art to use the regression algorithm for target detection.

Regarding claim 18, claim 18 recites substantially similar limitations as set forth in claim 4. As such, claim 18 is rejected for at least similar rationale.

Regarding claim 24, claim 24 recites substantially similar limitations as set forth in claim 4. As such, claim 24 is rejected for at least similar rationale.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until 

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMARA ABDI whose telephone number is (571)270-1670. The examiner can normally be reached 9:00am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le, can be reached at (571)272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the 



/AMARA ABDI/
Primary Examiner, Art Unit 2668
10/26/2021