DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments (1/7/22 Remarks: page 8, line 14 - page 9, line 15) with respect to the rejection of claims 1-6, 16, & 19 under 35 USC §103 and the objection to claims 3-4 & 6 have been fully considered and are persuasive.  The rejection of claims 1-6, 16, & 19 under 35 USC §103 and the objection to claims 3-4 & 6 have been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of Tao (US 20170132472) and Yang (“Object Detection and Viewpoint Estimation with Auto-Masking Neural Network”).
Claims 1-2, 6, 16, & 19 are rejected under 35 U.S.C. 103(a) as being unpatentable over Tao (US 20170132472) in view of Yang (“Object Detection and Viewpoint Estimation with Auto-Masking Neural Network”).
Tao discloses:
Claim 1: A method for adapting a pre-trained Convolutional Neural Network (CNN) to a target video (Tao Abstract and paragraph 0070, tracking a position of a target object in a video sequence using a machine learning model pre-trained offline), comprising:
transforming a first feature map into a plurality of sub-feature maps, wherein the first feature map is generated by the pre-trained CNN according to a frame of the target video (Tao paragraphs 0056-0058, deep convolutional network DCN including first convolutional layer C1 which generates first feature maps which are input to a second convolutional layer C2, each convolutional layer applies a plurality of filters (generating a plurality of sub-feature maps from each of the plurality of filters)), and a correlation among at least some of the plurality of sub-feature maps is reduced by a mask layer (see secondary reference below);
convolving each of the sub-feature maps with one of a plurality of adaptive convolution kernels, respectively, to output a plurality of second feature maps with improved adaptability (Tao paragraph 0058, further convolutional layers in addition to C1 and C2 generating further feature maps, i.e. an additional plurality of sub-feature maps from each of the plurality of filters, incorporating further adaptations); and
training, frame by frame, the adaptive convolution kernels (Tao paragraphs 0038-0040 & 0049-0051, training the deep convolutional network kernels by adjusting the weights iteratively to reduce error).
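For illustration only, the data flow recited in the transforming and convolving steps can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Tao, Yang, or the application: the function names, the use of per-map scaling as a stand-in for the transforming step, and all numeric values are hypothetical. The mask layer and the frame-by-frame training are omitted.

```python
# Illustrative sketch only; names, shapes, and kernel values are assumptions.

def conv2d_valid(image, kernel):
    """Valid-mode 2-D cross-correlation of one map with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def to_sub_feature_maps(first_map, scales):
    """Hypothetical stand-in for the transforming step: one sub-feature map
    per scale (a 1x1 convolution on a single input channel is a scaling)."""
    return [[[s * v for v in row] for row in first_map] for s in scales]

def convolve_each(sub_maps, kernels):
    """Pair sub-feature map k with kernel k, one-to-one."""
    if len(sub_maps) != len(kernels):
        raise ValueError("need exactly one kernel per sub-feature map")
    return [conv2d_valid(m, k) for m, k in zip(sub_maps, kernels)]

first_map = [[1.0, 2.0, 3.0],
             [4.0, 5.0, 6.0],
             [7.0, 8.0, 9.0]]
sub_maps = to_sub_feature_maps(first_map, scales=[1.0, 0.5])
kernels = [[[1.0, 0.0], [0.0, 1.0]],   # kernel paired with sub-map 0
           [[0.0, 1.0], [1.0, 0.0]]]   # kernel paired with sub-map 1
second_maps = convolve_each(sub_maps, kernels)
```

The one-to-one pairing in convolve_each is the point of the sketch: each sub-feature map is processed by its own kernel rather than by a shared filter bank.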
Tao does not expressly disclose the element annotated “(see secondary reference below)” (i.e. correlation among a plurality of sub-feature maps reduced by a mask layer).
Yang discloses:
…a correlation among at least some of the plurality of sub-feature maps is reduced by a mask layer (Yang Abstract, Section 2.2 and Figure 1, mask layer between convolutional layers allowing only key sections of feature maps to pass, thereby reducing correlation among the convolutional layer output maps)…
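For illustration only, the effect of a mask that passes only selected sections of each feature map can be sketched as follows. This is a minimal sketch under stated assumptions and is not Yang's implementation: the binary masks, the map values, and the use of cosine similarity as a simple stand-in for correlation are all hypothetical choices.

```python
import math

def apply_mask(feature_map, mask):
    """Element-wise product: positions where the mask is 0 are suppressed."""
    return [[v * m for v, m in zip(row, mask_row)]
            for row, mask_row in zip(feature_map, mask)]

def cosine_similarity(map_a, map_b):
    """Normalized inner product of two flattened maps (assumed proxy
    for the correlation between them)."""
    a = [v for row in map_a for v in row]
    b = [v for row in map_b for v in row]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Two hypothetical sub-feature maps that are perfectly aligned (b = 2 * a).
sub_a = [[1.0, 2.0], [3.0, 4.0]]
sub_b = [[2.0, 4.0], [6.0, 8.0]]

# Hypothetical masks that keep different sections of each map.
mask_a = [[1, 1], [0, 0]]
mask_b = [[0, 0], [1, 1]]

before = cosine_similarity(sub_a, sub_b)
after = cosine_similarity(apply_mask(sub_a, mask_a), apply_mask(sub_b, mask_b))
```

With these assumed masks the two masked maps no longer share any active position, so their similarity drops from approximately 1 to 0.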
Tao and Yang are combinable because both are from the same field, neural network image processing.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to apply the mask layer of Yang to the neural network processing of Tao.
The suggestion/motivation for doing so would have been to reduce computation time and resources (Yang Section 2.2, remove elements other than key parts, thereby avoiding the need to process less important information).
Therefore, it would have been obvious to combine Tao with Yang to obtain the invention as specified in claim 1.
Applying these teachings as they are applied to claim 1 above to claims 2, 6, 16, & 19:
Claim 2: The method of claim 1 (see above), wherein the transforming and the convolving are implemented in an adaptive CNN comprising:
a first convolution layer, linked to the pre-trained CNN and configured to transform the first feature map into the plurality of sub-feature maps (Tao paragraphs 0056-0058, deep convolutional network DCN including first convolutional layer C1, convolutional layer applies a plurality of filters (generating a plurality of sub-feature maps from each of the plurality of filters)); and
a second convolution layer, linked to the first convolution layer and configured to convolve each of the sub-feature maps with one of the adaptive convolution kernels (Tao paragraphs 0056-0058, deep convolutional network DCN including second convolutional layer C2, convolutional layer applies a plurality of filters (generating a plurality of sub-feature maps from each of the plurality of filters)), respectively.
Claim 6: The method of claim 2 (see above), wherein the mask layer is linked to the second convolution layer of the adaptive CNN (Yang Abstract, Section 2.2 and Figure 1, mask layer situated between convolutional layers (i.e. first and second convolutional layers)).
Claim 16: A system for adapting a pre-trained CNN to a target video, comprising:
a memory that stores executable components (Tao paragraphs 0095 & 0110-0111, program memory); and
a processor electrically coupled to the memory (Tao paragraphs 0095 & 0110-0111, program memory associated with processor) to execute the method of claim 1 (see above).
Claim 19: A non-transitory computer readable storage medium for storing computer readable instructions (Tao paragraphs 0013 & 0112, software implemented in non-transitory computer-readable medium) executable by a processor (Tao paragraphs 0095 & 0110-0111, program memory associated with processor) to perform the method of claim 1 (see above).
Claim 21 is rejected under 35 U.S.C. 103(a) as being unpatentable over Tao in view of Yang as applied to claim 1, and further in view of Ranjan (US 20180211099, cited in 5/18/21 Office Action).
Tao in view of Yang discloses the invention of claim 1 (see above).
Tao in view of Yang does not expressly disclose the training of adaptive convolution kernels under different loss criteria.
Ranjan discloses:
Claim 21: The method of claim 1 (see above), wherein each of the adaptive convolution kernels is trained under a different loss criterion (Ranjan paragraph 0037, task-specific loss function).
Tao in view of Yang and Ranjan are combinable because they are from the same field, neural network image processing.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to apply the specific loss functions of Ranjan to the neural network training of Tao in view of Yang.
The suggestion/motivation for doing so would have been to provide task-specific specialized network training, thereby specifically tailoring the network function for those tasks (Ranjan paragraphs 0036-0037, task-specific loss functions).
Therefore, it would have been obvious to combine Tao in view of Yang with Ranjan to obtain the invention as specified in claim 21.
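For illustration only, training each kernel under a different loss criterion, as recited in claim 21, can be sketched as follows. This is a minimal sketch under stated assumptions and is not Ranjan's implementation: each "kernel" is reduced to a single hypothetical scalar weight, and the two criteria (squared error and absolute error), the learning rate, and the data are assumed.

```python
def squared_error_grad(pred, target):
    """Derivative of (pred - target)**2 with respect to pred."""
    return 2.0 * (pred - target)

def absolute_error_grad(pred, target):
    """Subgradient of |pred - target| with respect to pred (sign of the error)."""
    diff = pred - target
    return (diff > 0) - (diff < 0)

def train_kernel(weight, x, target, loss_grad, lr, steps):
    """Plain gradient descent on one scalar weight with prediction = weight * x."""
    for _ in range(steps):
        pred = weight * x
        weight -= lr * loss_grad(pred, target) * x   # chain rule: d(pred)/d(weight) = x
    return weight

# One loss criterion per kernel (hypothetical pairing).
criteria = [squared_error_grad, absolute_error_grad]
initial_weights = [0.0, 0.0]
x, target = 2.0, 4.0   # the ideal weight is 2.0

trained = [
    train_kernel(w, x, target, grad, lr=0.05, steps=50)
    for w, grad in zip(initial_weights, criteria)
]
```

Both weights approach 2.0, but by different update rules: the squared-error weight converges geometrically, while the absolute-error weight moves in fixed-size steps and settles within one step of the target.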
Allowable Subject Matter
Claims 7-15, 17-18, & 20 are allowed.
Claims 3-4 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
With respect to claim 3 (and dependent claim 4), the art of record does not teach or suggest the recited arrangement of feeding a first frame training sample forward through a pre-trained convolutional neural network (CNN) and an adaptive CNN, comparing the resulting image to a first frame ground truth, repeatedly back-propagating the first training errors through the pre-trained CNN and the adaptive CNN to train adaptive convolution kernels and obtain a plurality of parameters, grouping a parameter having the smallest training error and the remaining parameters into ensemble and candidate sets, and optimizing the parameters grouped into the candidate set according to a subsequent video frame in conjunction with the recited arrangement of generating a plurality of sub-feature maps, reducing sub-feature map correlation by a mask layer, and outputting a plurality of sub-feature maps.
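For illustration only, the grouping step referred to above can be sketched as follows, assuming the reading that the parameter with the smallest training error forms the ensemble set and the remaining parameters form the candidate set. The function name, the parameter labels, and the error values are hypothetical and do not come from the application.

```python
def group_parameters(training_errors):
    """Split parameter labels into (ensemble_set, candidate_set).

    training_errors maps a hypothetical parameter label to its training
    error; the label with the smallest error goes to the ensemble set.
    """
    if not training_errors:
        raise ValueError("at least one trained parameter is required")
    best = min(training_errors, key=training_errors.get)
    ensemble_set = [best]
    candidate_set = [name for name in training_errors if name != best]
    return ensemble_set, candidate_set

# Hypothetical training errors after first-frame training.
errors = {"kernel_0": 0.40, "kernel_1": 0.10, "kernel_2": 0.30}
ensemble_set, candidate_set = group_parameters(errors)
```

Under this reading, only the members of candidate_set would be further optimized on a subsequent video frame.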
With respect to claims 7 & 17 (and dependent claims 8-15, 18, & 20), the art of record does not teach or suggest the recited region of interest determination; forward feed of the region of interest through a convolutional neural network; first, second, and third location determination; and first, second, and third scale estimation in conjunction with the recited convolutional neural network feature map generation and network training arrangement.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
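For illustration only, the timing rule in the preceding paragraph can be sketched as follows. This is a minimal sketch under stated assumptions and is not an official computation: the dates are hypothetical, month arithmetic clamps to the last day of a shorter month, and any adjustment for periods ending on a weekend or Federal holiday is not modeled. It computes only the end of the shortened statutory period, not extension fees.

```python
import calendar
from datetime import date

def add_months(d, months):
    """Same day-of-month N months later, clamped to the month's last day."""
    idx = d.month - 1 + months
    year = d.year + idx // 12
    month = idx % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def shortened_period_end(mailing, first_reply=None, advisory_mailed=None):
    """End of the shortened statutory period under the rule stated above."""
    three = add_months(mailing, 3)
    six = add_months(mailing, 6)
    replied_in_two = (first_reply is not None
                      and first_reply <= add_months(mailing, 2))
    late_advisory = advisory_mailed is not None and advisory_mailed > three
    if replied_in_two and late_advisory:
        # Period runs to the advisory action date, capped at six months.
        return min(advisory_mailed, six)
    return three

mailing = date(2022, 3, 1)   # hypothetical mailing date
default_end = shortened_period_end(mailing)
extended_end = shortened_period_end(
    mailing, first_reply=date(2022, 4, 15), advisory_mailed=date(2022, 6, 10))
capped_end = shortened_period_end(
    mailing, first_reply=date(2022, 4, 15), advisory_mailed=date(2022, 10, 1))
```

With the assumed mailing date, the default period ends on June 1, 2022; a timely first reply followed by a late advisory action moves the end to the advisory mailing date, but never past September 1, 2022.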
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Kim and El-Khamy disclose examples of a convolutional network having a mask layer.
Any inquiry concerning the contents of this communication or earlier communications from the examiner should be directed to Stephen M. Brinich at 571-272-7430 (voice) or 571-273-7430 (fax).
Any inquiry relating to the status of this application, entry of papers into this application, or any other inquiries of a general nature concerning application processing should be directed to the Tech Center 2600 Customer Service Center at 571-272-2600 or to the USPTO Contact Center at 800-786-9199 or 571-272-1000.
The examiner can normally be reached on weekdays 7:30-4:00 Eastern Time.
If attempts to contact the examiner and the Customer Service Center are unsuccessful, supervisor Claire Wang can be contacted at 571-270-1051.
Hand-carried correspondence may be delivered to the Customer Service Window, located at the Randolph Building, 401 Dulany Street, Alexandria, VA 22314.
/S. M. B./
Examiner, Art Unit 2663
/CLAIRE X WANG/
Supervisory Patent Examiner, Art Unit 2663