DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of the Claims
Original claims 1-14, filed December 22, 2020, are pending.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on December 22, 2020, is being considered by the examiner.

Specification
The title of the invention is not descriptive.  A new title is required that is clearly indicative of the invention to which the claims are directed. 
The following title is suggested: Online Distillation Using Frame Cache.

Claim Objections
Claim 1 is objected to because of the following informality: “and” should be inserted at the end of line 22 on page 33. Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 

Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are:
the “storage unit” of claim 1,
the “storage unit” of claim 13, and
the “storage unit” of claim 14.

Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1-3, 6-8, and 13-14 is/are rejected under 35 U.S.C. 103 as being unpatentable over ‘Yoshioka’ (“Dataset Culling: Towards Efficient Training of Distillation-Based Domain Specific Models,” 2019).
Regarding claim 1, Yoshioka teaches an image analysis apparatus (see Note Regarding Apparatus below) comprising:
a controller (see Note Regarding Apparatus below); and 
a storage unit (see Note Regarding Apparatus below; Note interpretation under 35 U.S.C. 112(f); Corresponding structure, material and acts, and equivalents thereof, includes a computer memory – see e.g. [0046] of published application), 
wherein 
the controller is programmed to sequentially obtain a plurality of image frames that constitute moving image data (Fig. 2, raw dataset input; e.g. Sec. 3, 2nd par., dataset is a video, which is a sequentially obtained plurality of image frames that constitute moving image data), 
the controller is programmed to analyze the plurality of image frames by using a first image analysis model (e.g. Fig. 2, stage 2, compute teacher prediction; Also see pg. 3238, last par.; The teacher model is the first image analysis model), 
the controller is programmed to analyze the plurality of image frames by using a second image analysis model (e.g. Fig. 2, stage 1, compute student prediction; The student model is the second image analysis model), a processing speed of the second image analysis model is higher than that of the first image analysis model (e.g. Sec. 1, 2nd par., student model is much smaller – and therefore has a higher processing speed – than teacher; Pg. 3238, last par., “computationally-expensive teacher model”), an analysis accuracy of the second image analysis model is lower than that of the first image analysis model (e.g. Sec. 1, 1st par., “Since CNNs generally obtain better classification performance with larger networks, there exists a tradeoff between accuracy and computation cost (or efficiency)”; e.g. Pg. 3238, last par., outputs of teacher are treated as ground truth – i.e. as having perfect accuracy, which will necessarily be higher than the student’s accuracy), the second image analysis model is allowed to be updated by using a result of an analysis performed by using the first image analysis model (e.g. pg. 3240, Table 1 and Results, second image analysis model is trained on culled dataset that was produced using results of teacher as shown in Fig. 2; also e.g. Pg. 3238, last par., teacher outputs are used as ground truth), 
the storage unit stores therein a plurality of analyzed frames (e.g. Fig. 2, frames that advance to stage 2, and therefore have been analyzed by both student and teacher), which are image frames of the plurality of image frames that are already analyzed by using the first and second image analysis models (e.g. Fig. 2, frames that advance to stage 2, and therefore have been analyzed by both student and teacher), in association with an evaluation value for evaluating a result of an analysis performed by using the second image analysis model on the analyzed frames (Fig. 2, evaluation value is precision loss $L_{prec}$; Pg. 3238, last par., precision loss is an evaluation of the prediction analysis performed by the student based on comparing it to the teacher’s prediction), 
the controller is programmed to extract at least one analyzed frame that satisfies a predetermined extraction condition based on the evaluation value, from the plurality of analyzed frames stored in the storage unit (Fig. 2, stage 2, $n$ frames with largest $L_{prec}$ are extracted), 
the controller is programmed to update the second image analysis model by using a result of an analysis performed by using the first image analysis model on the extracted analyzed frame and a result of an analysis performed by using the first image analysis model on a new frame that is a newly obtained image frame of the plurality of image frames (e.g. Fig. 2, culling is applied to an entire dataset, including an earlier frame that may be extracted and included in the culled dataset, as well as a later frame that is included in the culled dataset; The later frame is a new frame that is a newly obtained image frame of the plurality of image frames at least because it corresponds to a newer/later time than the earlier extracted frame).

Note Regarding Apparatus.  While Yoshioka certainly implies the use of a computer (e.g. pg. 3237, second-to-last par., computation cost and GPU), Yoshioka focuses on describing an image analysis algorithm (e.g. Fig. 2) and does not discuss details of its computer implementation.  
In particular, Yoshioka does not explicitly teach implementing its algorithm as an image analysis apparatus comprising: a controller, and a storage unit, wherein the controller performs steps of the algorithm and the storage unit stores analyzed frames in association with evaluation values.
However, Examiner takes Official Notice that it is old and well-known in the art of image analysis to implement an algorithm as an image analysis apparatus comprising: a controller, and a storage unit, such as a computer memory, wherein the controller performs steps of the algorithm and the storage unit stores various data used and processed by the algorithm.  Such computer implementation advantageously allows an algorithm to be performed quickly and efficiently.
Examiner notes that the algorithm of Yoshioka uses and processes analyzed frames and associated evaluation values, and so these data would be stored in the storage unit were Yoshioka modified with the known computer implementation.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to implement the algorithm of Yoshioka as an image analysis apparatus comprising: a controller, and a storage unit, such as a computer memory, wherein the controller performs steps of the algorithm and the storage unit stores various data used and processed by the algorithm, with the reasonable expectation that this would result in an algorithm that advantageously could be performed quickly and efficiently.  
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Yoshioka to obtain the invention as specified in claim 1.	
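
For illustration only, the claim 1 mapping above can be sketched in Python. This is not code from Yoshioka or from the application. The names `AnalyzedFrame`, `FrameCache`, and `process_frame` are hypothetical, the models are plain callables, and the evaluation value is an assumed stand-in (absolute disagreement between teacher and student outputs), not the precision loss formula of Yoshioka.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple


@dataclass
class AnalyzedFrame:
    frame: Sequence[float]    # hypothetical stand-in for image data
    teacher_result: float     # result of the first (teacher) model
    evaluation_value: float   # larger value = worse student result


@dataclass
class FrameCache:
    """Hypothetical 'storage unit': analyzed frames with evaluation values."""
    frames: List[AnalyzedFrame] = field(default_factory=list)

    def store(self, item: AnalyzedFrame) -> None:
        self.frames.append(item)

    def extract_top(self, n: int) -> List[AnalyzedFrame]:
        # Extraction condition: the n frames with the largest evaluation values.
        ranked = sorted(self.frames, key=lambda f: f.evaluation_value, reverse=True)
        return ranked[:n]


def process_frame(
    frame: Sequence[float],
    teacher: Callable[[Sequence[float]], float],
    student: Callable[[Sequence[float]], float],
    cache: FrameCache,
    n: int,
) -> List[Tuple[Sequence[float], float]]:
    """Analyze a new frame, cache it, and return training pairs for the student."""
    teacher_out = teacher(frame)
    student_out = student(frame)
    # Assumed evaluation value: disagreement between student and teacher.
    evaluation = abs(teacher_out - student_out)
    extracted = cache.extract_top(n)  # drawn from previously stored frames
    cache.store(AnalyzedFrame(frame, teacher_out, evaluation))
    # Student update data: teacher results on extracted frames plus the new frame.
    return [(f.frame, f.teacher_result) for f in extracted] + [(frame, teacher_out)]
```

In this sketch the returned pairs would be handed to whatever training step updates the student model; that step is deliberately left out because neither the Office action nor this illustration specifies it.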

Regarding claim 2, Yoshioka teaches the apparatus of claim 1, and Yoshioka further teaches that 
the extraction condition includes a condition based on magnitude of the evaluation value (Fig. 2, stage 2, extraction condition is selecting $n$ frames with highest evaluation values $L_{prec}$, and therefore is based on a magnitude of the evaluation value).

Regarding claim 3, Yoshioka teaches the apparatus of claim 2, and Yoshioka further teaches that 
the evaluation value decreases more as a result of the analysis performed by using the second image analysis model becomes better (e.g. Fig. 2, as its name implies, confidence loss will decrease as the analysis by the student/second model improves – i.e. becomes more confident; Also see e.g. Sec. 2.1, first and last pars., the point of the culling is to select only the most difficult frames, for which the analysis by the student is worst, so that means that lower $L_{prec}$ values correspond to better student/second model performance), and 
the extraction condition includes at least one of 
a condition that a probability of extracting the analyzed frame increases as the associated evaluation value increases, 
a condition that the analyzed frame with which the associated evaluation value that is maximum or that is larger than a first threshold value is associated is extracted, and 
a condition that a predetermined number of the analyzed frames are extracted in descending order of the associated evaluation values (Fig. 2, stage 2, predetermined number of frames $n$ with largest evaluation values $L_{prec}$ are extracted).
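
As a hedged illustration, the three alternative extraction conditions recited in claim 3 can be written as small Python functions. Only the last alternative (a predetermined number of frames in descending order) is mapped to Yoshioka above; the other two are shown solely to contrast the claim language. All function names are hypothetical, and the weighted variant assumes sampling with replacement, which the claim does not specify.

```python
import random
from typing import List, Sequence


def extract_weighted(values: Sequence[float], k: int, rng: random.Random) -> List[int]:
    """Sample k frame indices; a larger evaluation value is more likely to be drawn.

    Assumes non-negative values that are not all zero, and sampling with replacement.
    """
    return rng.choices(range(len(values)), weights=values, k=k)


def extract_above_threshold(values: Sequence[float], threshold: float) -> List[int]:
    """Indices of frames whose evaluation value exceeds a first threshold value."""
    return [i for i, v in enumerate(values) if v > threshold]


def extract_top_n(values: Sequence[float], n: int) -> List[int]:
    """Indices of n frames, taken in descending order of evaluation value."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return order[:n]
```

For example, with evaluation values `[0.2, 0.9, 0.5, 0.7]`, `extract_top_n(values, 2)` and `extract_above_threshold(values, 0.6)` both select the frames at indices 1 and 3.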

Regarding claim 6, Yoshioka teaches the apparatus of claim 1, and Yoshioka further teaches that 
the controller is programmed to discard at least one analyzed frame that satisfies a predetermined discard condition based on the evaluation value, from the plurality of analyzed frames stored in the storage unit (e.g. Fig. 2, stage 2, frames that do not have the $n$ largest evaluation values $L_{prec}$ are not picked and are instead discarded/culled).

Regarding claim 7, Yoshioka teaches the apparatus of claim 6, and Yoshioka further teaches that 
the discard condition includes a condition based on magnitude of the evaluation value (Fig. 2, stage 2, frames are discarded if the magnitude of their evaluation value $L_{prec}$ does not fall within the highest $n$ values).

Regarding claim 8, Yoshioka teaches the apparatus of claim 7, and Yoshioka further teaches that
the evaluation value decreases more as a result of the analysis performed by using the second image analysis model becomes better (e.g. Fig. 2, as its name implies, confidence loss will decrease as the analysis by the student/second model improves – i.e. becomes more confident; Also see e.g. Sec. 2.1, first and last pars., the point of the culling is to select only the most difficult frames, for which the analysis by the student is worst, so that means that lower $L_{prec}$ values correspond to better student/second model performance), and
the discard condition includes at least one of 
a condition that a probability of discarding the analyzed frame increases as the associated evaluation value decreases, 
a condition that the analyzed frame with which the associated evaluation value that is minimum or that is less than a second threshold value is associated is discarded (Fig. 2, any frames with an evaluation value $L_{prec}$ that is less than the value of the $n$th-highest precision loss are discarded; Accordingly, the value of the $n$th-highest precision loss serves as a second threshold value), and 
a condition that a predetermined number of the analyzed frames are discarded in ascending order of the associated evaluation value (Fig. 2, frames with $n$ highest evaluation values are selected and the rest are discarded; If $N$ is the total number of frames in the raw dataset, then $N - n$ frames are discarded in ascending order of the associated evaluation value; e.g. pg. 3240, Results, $n$ has a predetermined value, such as 256; Par. spanning pgs. 3239-3240, in at least some circumstances, $N$ is also predetermined, such as a dataset for which $N = 86,400$; In circumstances where both $n$ and $N$ are predetermined, then the number of discarded frames $N - n$ is also predetermined).
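
The reasoning applied to claim 8 can be checked with a short Python sketch, offered for illustration only. The function name `cull` is hypothetical, and the sketch assumes distinct evaluation values and $1 \le n \le N$. It shows that keeping the $n$ highest values is the same as discarding $N - n$ frames in ascending order, and that the $n$th-highest value acts as a threshold below which every frame is discarded.

```python
from typing import List, Sequence, Tuple


def cull(values: Sequence[float], n: int) -> Tuple[List[int], List[int], float]:
    """Return (kept indices, discarded indices, threshold) for a top-n culling."""
    total = len(values)
    if not 1 <= n <= total:
        raise ValueError("n must satisfy 1 <= n <= N")
    ascending = sorted(range(total), key=lambda i: values[i])
    discarded = ascending[: total - n]  # N - n lowest values, in ascending order
    kept = ascending[total - n:]        # the n highest values
    threshold = values[kept[0]]         # the n-th highest evaluation value
    return kept, discarded, threshold


# Figures cited in the rejection: N = 86,400 frames, n = 256 frames kept.
N_TOTAL, N_KEPT = 86_400, 256
N_DISCARDED = N_TOTAL - N_KEPT  # 86,144 frames discarded
```

With both `N_TOTAL` and `N_KEPT` fixed in advance, `N_DISCARDED` is likewise fixed in advance, which is the point the rejection relies on.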

Regarding claim 13, Examiner notes that the claim recites a method that is substantially the same as the method performed by the apparatus of claim 1.  Yoshioka teaches the apparatus of claim 1.  Accordingly, claim 13 is also rejected under 35 U.S.C. 103 as being unpatentable over Yoshioka for substantially the same reasons as claim 1.



Regarding claim 14, Examiner notes that the claim recites a non-transitory program recording medium on which a computer program for allowing a computer to execute an image analysis method is recorded, the image analysis method being substantially the same as the method performed by the apparatus of claim 1.
Yoshioka teaches the apparatus of claim 1.
While Yoshioka certainly implies the use of a computer (e.g. pg. 3237, second-to-last par., computation cost and GPU), Yoshioka focuses on describing an image analysis method (e.g. Fig. 2) and does not discuss details of its computer implementation.  
In particular, Yoshioka does not explicitly teach implementing its method as a non-transitory program recording medium on which a computer program for allowing a computer to execute the method is recorded.
However, Examiner takes Official Notice that it is old and well-known in the art of image analysis to implement a method as a non-transitory program recording medium on which a computer program for allowing a computer to execute the method is recorded.
Such computer implementation advantageously allows an algorithm to be performed quickly and efficiently.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to implement the method of Yoshioka as described above with respect to claim 1 as a non-transitory program recording medium on which a computer program for allowing a computer to execute the method is recorded, with the reasonable expectation that this would result in a method that advantageously could be performed quickly and efficiently.  
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Yoshioka to obtain the invention as specified in claim 14.	

Allowable Subject Matter
Claims 4-5 and 9-12 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
The following prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
‘Cioppa’ (“ARTHuS: Adaptive Real-Time Human Segmentation in Sports through Online Distillation,” 2019)
Periodically updates a training dataset $D$ used for online distillation – pgs. 2506-2507, “Our online knowledge distillation method, ARTHuS” and Algorithm 1
Selects only one of every three frames in dataset to use for training – pg. 2507, right col., Training
Suggests that “the update strategy of $D$ can be revised in order to keep the possibility to use older frames if they are more informative than the new ones” – Sec. 4, last par.
‘Mullapudi’ (“Online Model Distillation for Efficient Video Inference,” 2018)
Selects a frame at each stride, evaluates both a student and a teacher model for that frame, and retrains the student if its output does not sufficiently match the output of the teacher – e.g. Fig. 3
The stride is variable and based on whether the training is able to improve the student’s performance sufficiently – Sec. 3.3, 3rd par., and Fig. 3, lines 19-22
‘Anantha’ (US 2021/0089833 A1)
Uses evaluation values indicating similarity between predictions of lightweight and heavyweight models to select data for training a decision tree – Figs. 3 and 5
‘Li’ (US 2016/0078339 A1)
General example of student-teacher knowledge distillation
‘Chen’ (US 10,990,850 B1)
Another general example of student-teacher knowledge distillation

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GEOFFREY E SUMMERS whose telephone number is (571)272-9915. The examiner can normally be reached Monday-Friday, 7:00 AM to 3:30 PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park can be reached on (571) 272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/GEOFFREY E SUMMERS/Examiner, Art Unit 2669