DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This is in response to the applicant’s reply filed May 2, 2021. In the applicant’s reply, claims 1, 3, 6, 8, 10-13, 15, 17 and 19-20 were amended. Claims 1-20 are pending in this application.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Examiner’s Responses to Applicant’s Remarks
Applicants' amendments filed on May 25, 2021 have been fully considered. The amendments overcome the following rejections set forth in the office action mailed on March 18, 2021.
Applicant’s amendments overcome the rejections of Claims 1-20 under 35 U.S.C. 103(a) as being unpatentable over Zhang et al. (US PGPub US 2010/0321513), hereby referred to as “Zhang”, in view of Jung et al. (US PGPub US 2011/0255741), hereby referred to as “Jung”, and the rejection is hereby withdrawn. 
Applicant’s amendments overcome the objection to the title of the specification, and the objection is hereby withdrawn. 
Applicant’s amendments overcome the rejections of claims 17-20 under 35 U.S.C. 101 for being directed to non-statutory subject matter, and the rejection is hereby withdrawn. 

Applicant's arguments with respect to claims 1-20 have been considered but are moot in view of the new ground(s) of rejection, presented below and necessitated by applicant’s amendments. 

Applicants' arguments filed on May 25, 2021 have been fully considered but they are not persuasive. The Examiner has thoroughly reviewed Applicants' arguments but firmly believes that the cited references reasonably and properly meet the claimed limitations.

Applicant requested reconsideration and withdrawal of the double patenting rejections in light of applicant’s amendments. 
Examiner has considered the amendments submitted by the applicant and they do not overcome the double patenting rejections set forth in the previous office action, and the rejection is maintained.  

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-20 of U.S. Application No. 16/706,608. Although the claims at issue are not identical, they are not patentably distinct from each other because they are both directed towards using variance-based image analysis of vehicular environments for vehicular control. This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains.  Patentability shall not be negatived by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103(a) as being unpatentable over Jung et al. (US PGPub US 2011/0255741), hereby referred to as “Jung”, in view of Kefi-Fatteh, Takoua, et al. "Human face detection improvement using incremental learning based on low variance directions." Signal, Image and Video Processing 13.8 (published May 2019): 1503-1510, hereby referred to as “Fatteh”.

Consider Claims 1, 8 and 15. 
Jung teaches: 
1. A method comprising: / 8. A system comprising: one or more processors; and one or more non-transitory computer-readable media that, when executed by the one or more processors, cause the system to perform operations comprising: / 15. One or more non-transitory computer-readable media that, when executed by one or more processors, cause the one or more processors to perform operations comprising: (Jung: abstract, A computer implemented method for detecting the presence of one or more pedestrians in the vicinity of the vehicle is disclosed. Imagery of a scene is received from at least one image capturing device. A depth map is derived from the imagery. A plurality of pedestrian candidate regions of interest (ROIs) is detected from the depth map by matching each of the plurality of ROIs with a 3D human shape model. At least a portion of the candidate ROIs is classified by employing a cascade of classifiers tuned for a plurality of depth bands and trained on a filtered representation of data within the portion of candidate ROIs to determine whether at least one pedestrian is proximal to the vehicle.)
1. receiving data associated with environments of vehicles; / 8. receiving sensor data from a sensor associated with environments of vehicles; / 15. determining, based on sensor data received from a sensor associated with a vehicle, (Jung: [0042]-[0047], [0043] FIG. 1 depicts a vehicle 100 that is equipped with an exemplary digital processing system 110 configured to acquire a plurality of images and detect the presence of one or more pedestrians 102 in a scene 104 in the vicinity of the vehicle 100, according to an embodiment of the present invention.)
1. determining annotated data based at least in part on the data, wherein the annotated data comprises an annotated high variance region in the data and an annotated low variance region in the data; / 8. annotating, as annotated data and based on the sensor data, an annotated low variance region associated with the sensor data; / 15. annotated data, wherein the annotated data comprises one or more of an annotated low variance region or an annotated high variance region; (Jung: [0050] In block S4, a structure classification (SC) module employs a combined image derived from the pyramid of depth images, D0+D1+D2, to classify image regions into several broad categories such as tall vertical structures, overhanging structures, ground, and poles and to remove pedestrian candidate regions having a significant overlap. These image regions classified as non-pedestrians are provided with scene labels 142. In block S5, the scene labels 142 are fused with the pedestrian candidate regions to produce a pruned set of pedestrian regions-of-interest (ROIs). In block S6, a pedestrian classification (PC) module takes in the list of pedestrian ROIs and confirms valid pedestrian detections 144 by using a cascade of classifiers tuned for several depth bands and trained on a combination of pedestrian contour and gradient features. [0056] To further classify the patches, in step 618, a representation from the range map is created called a vertical support (VS) histogram. More particularly, a discrete 2D grid of the world X-coordinates and the world disparities is defined. Each point from the range map which satisfies a given distance range and a given height range is projected to a cell on the grid and its height recorded. For each bin, the variance of heights of all the points projected in the bin is computed. This provides a 2D histogram in X-d coordinates which measures the support at a given world location from any visible structure above it.)
1. inputting the data into a model; / 8. training a model based at least in part on the annotated data and the sensor data to generate a trained model, / 15. inputting the sensor data into a model; (Jung: [0048], FIG. 3 is a block diagram illustrating exemplary software modules that execute the steps of a method for detecting a pedestrian in the vicinity of the vehicle, according to an embodiment of the present invention. Referring now to FIGS. 1-3, in block S1, at least one image of the scene is received by one or more image capturing devices 106 from the vehicle 100. In block S2, at least one stereo depth map is derived from the at least one image. In a preferred embodiment, disparities are generated at a plurality of pyramid resolutions, preferably three (Di, i=1, ..., 3), with D0 being the resolution of the input image.)
1. determining, by the model, an output comprising a high variance output and a low variance output; / 8. the trained model configured to output an indication of a low variance region and an indication of a high variance region based at least in part on an input; / 15. determining, by the model, an output comprising a high variance output and a low variance output; (Jung: [0051] FIG. 4 depicts exemplary steps executed by the pedestrian detector (PD) module 400 in greater detail, according to an embodiment of the present invention. In the PD module 400, template matching is conducted using a 3D pedestrian shape template applied to a plurality (e.g., three) disjoint range bands in front of the vehicle 100. The 3D shape size is a predetermined function of the actual range from the image capturing devices 106. [0052] As mentioned above, in step 402, depth maps are obtained at separate image resolutions, Di, i=1, ..., 3.)
1. and transmitting the model to a vehicle configured to be controlled by another output of the model. / 8. and transmitting the trained model to a vehicle configured to be controlled by another output of the model. / 15. and transmitting the model to a vehicle configured to be controlled by another output of the model. (Jung: Figure 4, [0053] FIGS. 5A-5D are visual depictions of an example of pedestrian ROI refinement, according to an embodiment of the present invention. Depth map based detected ROIs are further refined by examining a combination of depth and edge features of two individual pedestrian detections. In step 413, a new pedestrian ROI is initialized at each detected peak, which is refined first horizontally and then vertically to obtain a more centered and tightly fitting bounding box about a candidate pedestrian. This involves employing vertical and horizontal projections, respectively, of binarized disparity maps (similar to using the edge pixels above) followed by detection of peak and valley locations in the computed projections. After this refinement, in step 414, any resulting overlapping detections are again removed from the detection list. Jung: [0046] Portions of a processed video/audio data stream 130 may be stored temporarily in the computer readable medium 128 for later output to an on-board monitor 132, to an onboard automatic collision avoidance system 134, or to a network 136, such as the Internet. [0071]-[0073], [0071] FIGS. 12A-12C depict system performance based on different criteria. System performance was analyzed in terms of different distance intervals, which permit gauging the effectiveness of the system from an application point of view: low latency and high accuracy detection at short distances as well as distant target detection of potential threats of collisions. [0073] Performance was further analyzed in terms of another criteria that determines effectiveness for collision avoidance purposes.)
Jung does not teach: 
determining, by the model, an output comprising a low variance output including a first feature detection and a high variance output including a second feature detection based on the first feature detection
Fatteh teaches: 
1. A method comprising: / 8. A system comprising: one or more processors; and one or more non-transitory computer-readable media that, when executed by the one or more processors, cause the system to perform operations comprising: / 15. One or more non-transitory computer-readable media that, when executed by one or more processors, cause the one or more processors to perform operations comprising: (Fatteh: abstract, Face detection technology has been a hot topic in the past few decades. It has been maturely applied to many practical areas. Therefore, introducing an outperforming model is needed. Nevertheless, the proposed algorithms do not alter with the dynamic aspect of data and result in a high computational complexity. This paper expounds on how to promote the face detection rate from complex pictures by the means of one-class incremental learning strategy, while using low variance directions to project data. In fact, it has been shown that taking into account the information carried by low variance direction may improve the accuracy of the model in one-class classification problems. Besides, incremental learning is known to be compelling, especially in the case of dynamic data. A comparative evaluation of the proposed approach is performed in a decontextualized evaluation framework. Then a contextualized evaluation is conducted to show the effectiveness of the approach in the context of face detection.)
1. receiving data associated with facial detection; / 8. receiving sensor data from a sensor associated with environments of facial detection; / 15. determining, based on sensor data received from a sensor associated with facial detection, (Fatteh: page 1504 section 2 Related work on face detection Incremental learning performs effectively in dynamic environments, where more or less unstructured dataset is in continuous extension, such as the case of the facial data acquired by airport security systems. In fact, face images are sequentially obtained over the time and cannot be available when the training stage starts. Consequently, two major obstacles may appear: (i) the dimensionality of the treated objects is usually high and ever-increasing, and (ii) a huge number of non-face images may occur, without following a regular distribution. Thus, incremental learning approaches are a good candidate for face detection problems. They have been adopted for face detection in order to build performing systems. For instance, they have been used in [15] for feature extraction. In the proposed algorithm, incremental unsupervised approach is combined with semi-supervised approach for feature extraction in order to obtain objective function and solve the out-of-sample learning problem. page 1507 section 4.2.1 Synthetic datasets used)
1. determining annotated data based at least in part on the data, wherein the annotated data comprises an annotated high variance region in the data and an annotated low variance region in the data; / 8. annotating, as annotated data and based on the sensor data, an annotated low variance region associated with the sensor data; / 15. annotated data, wherein the annotated data comprises one or more of an annotated low variance (Fatteh: page 1507 section 4.2.1 Synthetic datasets used Knowing that the decontextualized evaluation aims to tease out the performance of the iCOSVM in a general context, we used data sets that are not related to human face detection. We have generated several 2D binary class datasets using specific distributions. page 1508 section 4.3 Contextualized evaluation and comparison subsection 4.3.1 Face datasets used; Face Detection Dataset and Benchmark (FDDB) database [27]: The FDDB includes 2845 images. It is known with a wide variation in backgrounds, appearance, poses and lighting. We used an automatic face detector [28] to extract the annotated faces. The obtained dataset contains 5171 faces, as target class, and 175 non-faces, as outliers.)
1. inputting the data into a model; / 8. training a model based at least in part on the annotated data and the sensor data to generate a trained model, / 15. inputting the sensor data into a model; (Fatteh: page 1508 section 4.3 Contextualized evaluation and comparison subsection 4.3.1 Face datasets used; The BioID, the NUAA and the FERET databases are originally intended for face recognition, and they have been used for several applications throughout the literature from emotion recognition to specific facial feature recognition, etc. However, our purpose is to build a basic face detection application that can act as a “proof-of-concept” system for comparison purposes among contemporary SVM-based incremental classifiers. Therefore, 300 non-face images were added to each one of them. Note that the face images of all the databases are aligned to the same size 19 × 19 pixels.)
1. determining, by the model, an output comprising a low variance output including a first feature detection and a high variance output including a second feature detection based on the first feature detection; / 8. the trained model configured to output an indication of a low variance region including a first feature detection and an indication of a high variance region including a second feature detection based at least in part on the first feature detection an input; / 15. determining, by the model, an output comprising a low variance output including a first feature detection and a high variance output including a second feature detection based on the first feature detection; (Fatteh: section 2, page 1505-1506 Therefore, in this paper we solve this problem by using an incremental Covariance-guided One-Class SVM (iCOSVM) approach. In fact, iCOSVM incrementally emphasizes the low variance direction to improve classification discriminative power and then classification’s performance. It estimates the covariance matrix in the kernel space and incorporates it in the optimization problem, which controls the direction of the separating hyperplane. Since the optimization problem is still convex, it results in one global optimum solution. Hence, we can use a numerical method to solve it efficiently. Besides, the incorporation of the low variance in an incremental SVM-based method is an effective tool. Section 3.2 Incremental One-Class Covariance-guided Support Vector Machine (iCOSVM) The basic principle of the incremental Covariance-guided One-Class SVM (iCOSVM) is plugging the covariance matrix in the dual optimization problem of Eq. (1). The covariance matrix has, indeed, the needed information both along high and low variances. Since we are dealing with a minimization problem, we can presume that the merge of the supplementary term, i.e., the covariance matrix, will lead to emphasizing the low variance directions.)
1. determining a difference between the output and the annotated data; / 1. altering one or more parameters of the model based at least in part on the difference; / 15. determining a difference between the annotated data and the output; / 15. altering one or more parameters associated with the model based at least in part on the difference; (Fatteh: page 1506-1507 section 3 Face Detection Method; Algorithm 1 Incremental Covariance-guided Face Detection Algorithm Input: Collect images of human faces from scenes and convert them into a matrix to train the iCOSVM. Step 1: Initialize parameters αi, b and R. Step 2: Compute R matrix, and update β and γ parameters, respectively, using equations (10), (9) and (13). Step 3: Estimate the largest increase of Δαc in order to satisfy the KKT conditions. Step 4: Recompute the needed parameters and update the subsets S, E and O. Output: The testing images are classified as faces or non-faces.)
It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify Jung’s real-time pedestrian detection in the realm of vehicular imaging in order to leverage Fatteh’s algorithm for incremental learning for human face detection based on low variance directions. The determination of obviousness is predicated upon the following findings: One skilled in the art would have been motivated to improve the overall accuracy of the real-time pedestrian detection of Jung by using Fatteh’s algorithm for incremental learning based facial detection, for accuracy and computational efficacy. Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained above using known engineering design, interface and programming techniques, without changing a “fundamental” operating principle of Jung, while the teaching of Fatteh continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of leveraging the variance-based image analysis algorithm for the context of vehicular imaging and detection. It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claims in question.
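For illustration only, the vertical support (VS) histogram that Jung describes at [0056] (points from the range map are projected onto a discrete X-disparity grid, and the variance of the recorded heights is computed per bin) may be sketched as follows. The function name, the bin sizes, the default ranges, and the use of population variance are assumptions made for this sketch and are not taken from Jung.

```python
from collections import defaultdict
from statistics import pvariance


def vertical_support_histogram(points, x_bin=0.5, d_bin=1.0,
                               disparity_range=(0.0, 64.0),
                               height_range=(0.0, 3.0)):
    """Return {(x_cell, d_cell): variance of heights} for a range map.

    points: iterable of (world_x, disparity, height) tuples.
    All default values are hypothetical, chosen only for the example.
    """
    d_lo, d_hi = disparity_range
    h_lo, h_hi = height_range
    cells = defaultdict(list)
    for x, d, h in points:
        # Keep only points inside the given distance and height ranges.
        if d_lo <= d <= d_hi and h_lo <= h <= h_hi:
            cell = (int(x // x_bin), int(d // d_bin))
            cells[cell].append(h)  # record the height in its grid cell
    # Per-bin variance of the recorded heights (population variance assumed).
    return {cell: pvariance(heights) for cell, heights in cells.items()}


if __name__ == "__main__":
    sample = [(0.1, 0.2, 1.0), (0.2, 0.3, 2.0), (1.1, 0.2, 1.5), (0.1, 0.1, 5.0)]
    print(vertical_support_histogram(sample))
```

Under these assumptions, a bin holding points at widely differing heights yields a high variance value, while a bin holding a single height yields zero.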

Consider Claims 2, 14 and 16. 
The combination of Jung and Fatteh teaches: 
2. The method as claim 1 recites, wherein the annotated low variance region is determined from one or more statistical models. / 14. The system as claim 8 recites, wherein the annotated low variance region is determined from one or more statistical models based at least in part on one or more of entropy, pixel intensity, or aspect ratios associated with indications of low variance regions. / 16. The one or more non-transitory computer-readable media as claim 15 recites, wherein determining the annotated data comprises determining the annotated low variance region based at least in part on a statistical model associated with one or more of the sensor data or an intermediary output of the model based at least in part on the sensor data. (Jung: Figure 4, [0053] FIGS. 5A-5D are visual depictions of an example of pedestrian ROI refinement, according to an embodiment of the present invention. Depth map based detected ROIs are further refined by examining a combination of depth and edge features of two individual pedestrian detections. In step 413, a new pedestrian ROI is initialized at each detected peak, which is refined first horizontally and then vertically to obtain a more centered and tightly fitting bounding box about a candidate pedestrian. This involves employing vertical and horizontal projections, respectively, of binarized disparity maps (similar to using the edge pixels above) followed by detection of peak and valley locations in the computed projections. After this refinement, in step 414, any resulting overlapping detections are again removed from the detection list. [0079] The input ROI 1502 to the multi-layer convolutional network 1500 may be preprocessed before propagation through the network 1500, according to an embodiment of the present invention. In a preferred embodiment, the input ROI 1502 may comprise an 80x40 pixel block. Contrast normalization is applied to the input ROI 1502. Each pixel's intensity is divided by the standard deviation of the surrounding neighborhood pixels (e.g., a 7x7 pixel neighborhood). This preprocessing step increases contrast in low-contrast regions and decreases contrast in high-contrast regions.)
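For illustration only, the contrast normalization quoted from Jung [0079] (each pixel's intensity divided by the standard deviation of its surrounding neighborhood, e.g., 7x7) may be sketched as follows. The function name, the clipping of the window at the image border, and the small constant added to avoid division by zero are assumptions of this sketch; Jung does not specify them.

```python
from statistics import pstdev


def contrast_normalize(image, radius=3, eps=1e-6):
    """Divide each pixel by the standard deviation of its neighborhood.

    image:  list of rows of numeric intensities.
    radius: 3 gives the 7x7 neighborhood mentioned in Jung [0079].
    eps:    hypothetical guard against zero deviation in flat regions.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Window is clipped at the image border (an assumption).
            window = [image[rr][cc]
                      for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                      for cc in range(max(0, c - radius), min(cols, c + radius + 1))]
            out[r][c] = image[r][c] / (pstdev(window) + eps)
    return out


if __name__ == "__main__":
    print(contrast_normalize([[0, 2], [0, 2]]))
```

Because the divisor is small where local deviation is small, low-contrast regions are amplified and high-contrast regions are attenuated, consistent with the effect Jung describes.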

Consider Claims 3 and 10. 
The combination of Jung and Fatteh teaches: 
3. The method as claim 1 recites, wherein the annotated low variance region is determined based at least in part on a feature associated with the data. / 10. The system as claim 9 recites, wherein the annotated low variance region is associated with the first feature detection of the sensor data determined by the model. (Jung: [0050] In block S4, a structure classification (SC) module employs a combined image derived from the pyramid of depth images, D0+D1+D2, to classify image regions into several broad categories such as tall vertical structures, overhanging structures, ground, and poles and to remove pedestrian candidate regions having a significant overlap. These image regions classified as non-pedestrians are provided with scene labels 142. In block S5, the scene labels 142 are fused with the pedestrian candidate regions to produce a pruned set of pedestrian regions-of-interest (ROIs). In block S6, a pedestrian classification (PC) module takes in the list of pedestrian ROIs and confirms valid pedestrian detections 144 by using a cascade of classifiers tuned for several depth bands and trained on a combination of pedestrian contour and gradient features. [0056] To further classify the patches, in step 618, a representation from the range map is created called a vertical support (VS) histogram. Fatteh: page 1507 section 4.2.1 Synthetic datasets used Knowing that the decontextualized evaluation aims to tease out the performance of the iCOSVM in a general context, we used data sets that are not related to human face detection. We have generated several 2D binary class datasets using specific distributions. page 1508 section 4.3 Contextualized evaluation and comparison subsection 4.3.1 Face datasets used; Face Detection Dataset and Benchmark (FDDB) database [27]: The FDDB includes 2845 images. It is known with a wide variation in backgrounds, appearance, poses and lighting. We used an automatic face detector [28] to extract the annotated faces. The obtained dataset contains 5171 faces, as target class, and 175 non-faces, as outliers. section 2, page 1505-1506 Therefore, in this paper we solve this problem by using an incremental Covariance-guided One-Class SVM (iCOSVM) approach. In fact, iCOSVM incrementally emphasizes the low variance direction to improve classification discriminative power and then classification’s performance. It estimates the covariance matrix in the kernel space and incorporates it in the optimization problem, which controls the direction of the separating hyperplane. Since the optimization problem is still convex, it results in one global optimum solution. Hence, we can use a numerical method to solve it efficiently. Besides, the incorporation of the low variance in an incremental SVM-based method is an effective tool.)
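For illustration only, the notion of a "low variance direction" referenced in the Fatteh passages above may be sketched for 2D data as the eigenvector of the sample covariance matrix having the smallest eigenvalue. This sketch is not the iCOSVM of Fatteh, which estimates the covariance matrix in kernel space and incorporates it into an SVM optimization problem; the function name and the closed-form 2x2 eigen-decomposition are assumptions made for the example.

```python
from math import hypot, sqrt


def low_variance_direction(points):
    """Return (unit_vector, variance) for the lowest-variance direction.

    points: list of (x, y) pairs. Uses the population covariance matrix
    [[a, b], [b, c]] and its smaller eigenvalue (2D closed form).
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    a = sum((p[0] - mx) ** 2 for p in points) / n
    c = sum((p[1] - my) ** 2 for p in points) / n
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Smaller eigenvalue of the symmetric 2x2 covariance matrix.
    lam = (a + c) / 2 - sqrt(((a - c) / 2) ** 2 + b ** 2)
    if b == 0:
        # Axes are already uncorrelated: pick the axis with less variance.
        vec = (1.0, 0.0) if a <= c else (0.0, 1.0)
    else:
        # (A - lam*I) v = 0 is satisfied by v = (b, lam - a).
        vx, vy = b, lam - a
        norm = hypot(vx, vy)
        vec = (vx / norm, vy / norm)
    return vec, lam


if __name__ == "__main__":
    print(low_variance_direction([(0, 0.1), (1, 0.9), (2, 2.1), (3, 2.9)]))
```

For points scattered closely around the line y = x, the returned direction is approximately perpendicular to that line, and the variance along it is small.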

Consider Claims 4 and 11. 
The combination of Jung and Fatteh teaches: 
4. The method as claim 3 recites, the method further comprising: inputting the feature into an additional model; receiving, from the additional model, a reconstructed output; and determining a loss based on a difference between the reconstructed output and the data, wherein altering the one or more parameters is further based at least in part on the loss. / 11. The system as claim 10 recites, the operations further comprising: mapping the first feature detection to a reconstructed input; and determining, as a loss, a difference between the sensor data and the reconstructed input, wherein training the model is further based at (Jung: [0063]-[0065] For candidate ROIs (pedestrians) located at greater distances beyond a predetermined threshold, a cascade of HOG based classifiers is employed. HOG-based classifiers have been proven to be effective for relatively low-resolution images when body contours are distinguishable from the background. Each HOG classifier is trained separately for each resolution band. For this purpose, in the training phase. Zhang: [0031] The centroid around variance of the upper half image and lower half image are then compared to the image adaptive threshold VARTH, in the steps 308 and 312, respectively. If any of them is less than the threshold, a confidence value is set to a value (e.g. 3) and the process jumps to the object of interest isolation module. Otherwise, the processing is continued to a transition map-based detection. [0038] Fatteh: page 1507 section 4.2.1 Synthetic datasets used Knowing that the decontextualized evaluation aims to tease out the performance of the iCOSVM in a general context, we used data sets that are not related to human face detection. We have generated several 2D binary class datasets using specific distributions. page 1508 section 4.3 Contextualized evaluation and comparison subsection 4.3.1 Face datasets used; Face Detection Dataset and Benchmark (FDDB) database [27]: The FDDB includes 2845 images. It is known with a wide variation in backgrounds, appearance, poses and lighting. We used an automatic face detector [28] to extract the annotated faces. The obtained dataset contains 5171 faces, as target class, and 175 non-faces, as outliers. section 2, page 1505-1506 Therefore, in this paper we solve this problem by using an incremental Covariance-guided One-Class SVM (iCOSVM) approach. In fact, iCOSVM incrementally emphasizes the low variance direction to improve classification discriminative power and then classification’s performance. It estimates the covariance matrix in the kernel space and incorporates it in the optimization problem, which controls the direction of the separating hyperplane. Since the optimization problem is still convex, it results in one global optimum solution. Hence, we can use a numerical method to solve it efficiently. Besides, the incorporation of the low variance in an incremental SVM-based method is an effective tool.)

Consider Claim 18. The combination of Jung and Fatteh teaches: The one or more non-transitory computer-readable media as claim 17 recites, wherein altering the one or more parameters is further based at least in part on the second difference. (Jung: [0063]-[0065] For candidate ROIs (pedestrians) located at greater distances beyond a predetermined threshold, a cascade of HOG based classifiers is employed. HOG-based classifiers have been proven to be effective for relatively low-resolution images when body contours are distinguishable from the background. Each HOG classifier is trained separately for each resolution band. For this purpose, in the training phase.)

Consider Claims 5 and 19. 
The combination of Jung and Fatteh teaches: 
5. The method as claim 1 recites, wherein: the model is a neural network, and the high variance output is based on the low variance output, the method further comprising determining an additional high variance output, and further wherein altering the one or more parameters comprises training the model end-to-end based at least in part on the low variance output, the high variance output, and the additional high variance output. / 19. The one or more non-transitory computer-readable media as claim 17 recites, wherein: the model is a neural network, the operations further comprising determining an additional high variance output, and further wherein altering the one or more parameters comprises training the model end- (Jung: [0063]-[0065] For candidate ROIs (pedestrians) located at greater distances beyond a predetermined threshold, a cascade of HOG-based classifiers is employed. HOG-based classifiers have been proven to be effective for relatively low-resolution images when body contours are distinguishable from the background. Each HOG classifier is trained separately for each resolution band. For this purpose, in the training phase. [0079] The input ROI 1502 to the multi-layer convolutional network 1500 may be preprocessed before propagation through the network 1500, according to an embodiment of the present invention. In a preferred embodiment, the input ROI 1502 may comprise an 80x40 pixel block. Contrast normalization is applied to the input ROI 1502. Each pixel's intensity is divided by the standard deviation of the surrounding neighborhood pixels (e.g., a 7x7 pixel neighborhood). This preprocessing step increases contrast in low-contrast regions and decreases contrast in high-contrast regions.)
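
For illustration only, the contrast normalization quoted from Jung [0079] (each pixel's intensity divided by the standard deviation of its surrounding neighborhood, e.g. 7x7) can be sketched as follows. This is a minimal sketch, not Jung's implementation: the function name `contrast_normalize`, the reflect padding at the ROI border, and the `eps` floor that guards against division by zero in flat regions are assumptions introduced here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def contrast_normalize(roi, size=7, eps=1e-6):
    """Divide each pixel by the standard deviation of its size x size neighborhood."""
    roi = np.asarray(roi, dtype=np.float64)
    pad = size // 2
    # Assumption: reflect-pad so that border pixels also get a full neighborhood.
    padded = np.pad(roi, pad, mode="reflect")
    # windows[i, j] is the size x size block centered on roi[i, j].
    windows = sliding_window_view(padded, (size, size))
    local_std = windows.std(axis=(-2, -1))
    # Assumption: floor the deviation so flat neighborhoods do not divide by zero.
    return roi / np.maximum(local_std, eps)


# Example on a random 80x40 block, the ROI size mentioned in Jung [0079].
rng = np.random.default_rng(0)
roi = rng.uniform(1.0, 255.0, size=(80, 40))
normalized = contrast_normalize(roi)
```

Because the local deviation scales with the intensities, multiplying the whole ROI by a positive constant leaves the normalized result unchanged, which is consistent with the quoted statement that the step raises contrast in low-contrast regions and lowers it in high-contrast regions.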

Consider Claims 6 and 20.  
The combination of Jung and Fatteh teaches: 
6. The method as claim 1 recites, wherein the first feature detection is a head detection and the second feature detection comprises a pedestrian detection. / 20. The one or more non-transitory computer-readable media as claim 16 recites, wherein the first feature detection comprises head detection and second feature detection comprises a pedestrian detection. (Jung: Figure 4, [0053] FIGS. 5A-5D are visual depictions of an example of pedestrian ROI refinement, according to an embodiment of the present invention. Depth map based detected ROIs are further refined by examining a combination of depth and edge features of two individual pedestrian detections. In step 413, a new pedestrian ROI is initialized at each detected peak, which is refined first horizontally and then vertically to obtain a more centered and tightly fitting bounding box about a candidate pedestrian. This involves employing vertical and horizontal projections, respectively, of binarized disparity maps (similar to using the edge pixels above) followed by detection of peak and valley locations in the computed projections. After this refinement, in step 414, any resulting overlapping detections are again removed from the detection list. Fatteh: page 1507, section 4.2.1, Synthetic datasets used: Knowing that the decontextualized evaluation aims to tease out the performance of the iCOSVM in a general context, we used data sets that are not related to human face detection. We have generated several 2D binary class datasets using specific distributions. Page 1508, section 4.3, Contextualized evaluation and comparison, subsection 4.3.1, Face datasets used: Face Detection Dataset and Benchmark (FDDB) database [27]: The FDDB includes 2845 images. It is known with a wide variation in backgrounds, appearance, poses and lighting. We used an automatic face detector [28] to extract the annotated faces. The obtained dataset contains 5171 faces, as target class, and 175 non-faces, as outliers. Section 2, pages 1505-1506: Therefore, in this paper we solve this problem by using an incremental Covariance-guided One-Class SVM (iCOSVM) approach. In fact, iCOSVM incrementally emphasizes the low variance direction to improve classification discriminative power and then classification’s performance. It estimates the covariance matrix in the kernel space and incorporates it in the optimization problem, which controls the direction of the separating hyperplane. Since the optimization problem is still convex, it results in one global optimum solution. Hence, we can use a numerical method to solve it efficiently. Besides, the incorporation of the low variance in an incremental SVM-based method is an effective tool.)

Consider Claims 7 and 9. 
The combination of Jung and Fatteh teaches: 
7. The method as claim 1 recites, wherein the data comprises image data, a batch of image data, or an image space. / 9. The system as claim 8 recites, wherein the sensor data comprises at least one of image data, a batch of image data, or an image space. (Jung: [0042]-[0047], [0043] FIG. 1 depicts a vehicle 100 that is equipped with an exemplary digital processing system 110 configured to acquire a plurality of images and detect the presence of one or more pedestrians 102 in a scene 104 in the vicinity of the vehicle 100, according to an embodiment of the present invention.)

Consider Claim 12. 
The combination of Jung and Fatteh teaches: 12. The system as claim 11 recites, wherein mapping the first feature detection to reconstructed input comprises: inputting the feature into an additional model; and receiving, from the additional model, the reconstructed input. (Jung: [0042]-[0047], [0043] FIG. 1 depicts a vehicle 100 that is equipped with an exemplary digital processing system 110 configured to acquire a plurality of images and detect the presence of one or more pedestrians 102 in a scene 104 in the vicinity of the vehicle 100, according to an embodiment of the present invention. Jung: [0063]-[0065] For candidate ROIs (pedestrians) located at greater distances beyond a predetermined threshold, a cascade of HOG-based classifiers is employed. HOG-based classifiers have been proven to be effective for relatively low-resolution images when body contours are distinguishable from the background. Each HOG classifier is trained separately for each resolution band. For this purpose, in the training phase. [0079] The input ROI 1502 to the multi-layer convolutional network 1500 may be preprocessed before propagation through the network 1500, according to an embodiment of the present invention. In a preferred embodiment, the input ROI 1502 may comprise an 80x40 pixel block. Contrast normalization is applied to the input ROI 1502. Each pixel's intensity is divided by the standard deviation of the surrounding neighborhood pixels (e.g., a 7x7 pixel neighborhood). This preprocessing step increases contrast in low-contrast regions and decreases contrast in high-contrast regions. Fatteh: page 1507, section 4.2.1, Synthetic datasets used: Knowing that the decontextualized evaluation aims to tease out the performance of the iCOSVM in a general context, we used data sets that are not related to human face detection. We have generated several 2D binary class datasets using specific distributions. Page 1508, section 4.3, Contextualized evaluation and comparison, subsection 4.3.1, Face datasets used: Face Detection Dataset and Benchmark (FDDB) database [27]: The FDDB includes 2845 images. It is known with a wide variation in backgrounds, appearance, poses and lighting. We used an automatic face detector [28] to extract the annotated faces. The obtained dataset contains 5171 faces, as target class, and 175 non-faces, as outliers. Section 2, pages 1505-1506: Therefore, in this paper we solve this problem by using an incremental Covariance-guided One-Class SVM (iCOSVM) approach. In fact, iCOSVM incrementally emphasizes the low variance direction to improve classification discriminative power and then classification’s performance. It estimates the covariance matrix in the kernel space and incorporates it in the optimization problem, which controls the direction of the separating hyperplane. Since the optimization problem is still convex, it results in one global optimum solution. Hence, we can use a numerical method to solve it efficiently. Besides, the incorporation of the low variance in an incremental SVM-based method is an effective tool.)

Consider Claim 13. 
(Jung: [0063]-[0065] For candidate ROIs (pedestrians) located at greater distances beyond a predetermined threshold, a cascade of HOG-based classifiers is employed. HOG-based classifiers have been proven to be effective for relatively low-resolution images when body contours are distinguishable from the background. Each HOG classifier is trained separately for each resolution band. For this purpose, in the training phase.)

Consider Claim 17. 
The combination of Jung and Fatteh teaches: 
17. The one or more non-transitory computer-readable media as claim 15 recites, the operations further comprising: inputting at least a portion of the sensor data into the model; receiving, as a set of features, an intermediate output of the model; inputting the set of features into one or more of an additional model or a portion of the model; receiving, from the one or more of additional model or portion of the model, a reconstructed input; and determining a second difference between the reconstructed output and the portion of the sensor data, wherein determining the annotated data comprises determining, using a statistical model, the low variance region associated with the set of features. (Jung: Figure 4, [0053] FIGS. 5A-5D are visual depictions of an example of pedestrian ROI refinement, according to an embodiment of the present invention. Depth map based detected ROIs are further refined by examining a combination of depth and edge features of two individual pedestrian detections. In step 413, a new pedestrian ROI is initialized at each detected peak, which is refined first horizontally and then vertically to obtain a more centered and tightly fitting bounding box about a candidate pedestrian. This involves employing vertical and horizontal projections, respectively, of binarized disparity maps (similar to using the edge pixels above) followed by detection of peak and valley locations in the computed projections. After this refinement, in step 414, any resulting overlapping detections are again removed from the detection list.)
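
For illustration only, the ROI refinement quoted from Jung [0053] (refining a candidate box first horizontally and then vertically using projections of a binarized disparity map) can be sketched as follows. This is a simplified sketch under stated assumptions, not Jung's implementation: the function name `refine_roi`, the `threshold` and `min_count` parameters, and the use of simple above-count extents in place of Jung's peak and valley detection are all hypothetical.

```python
import numpy as np


def refine_roi(disparity, threshold, min_count=1):
    """Return a tight (top, left, bottom, right) box around above-threshold disparity.

    Returns None when no row or column meets min_count.
    """
    # Binarize the disparity map.
    mask = np.asarray(disparity) > threshold

    # Horizontal refinement: project onto the column axis and keep the
    # span of columns with enough foreground pixels.
    col_proj = mask.sum(axis=0)
    cols = np.flatnonzero(col_proj >= min_count)
    if cols.size == 0:
        return None
    left, right = int(cols[0]), int(cols[-1])

    # Vertical refinement: project the refined column span onto the row axis.
    row_proj = mask[:, left:right + 1].sum(axis=1)
    rows = np.flatnonzero(row_proj >= min_count)
    if rows.size == 0:
        return None
    top, bottom = int(rows[0]), int(rows[-1])
    return top, left, bottom, right


# Example: a 10x12 disparity map with one foreground block.
disparity = np.zeros((10, 12))
disparity[3:7, 4:9] = 5.0
box = refine_roi(disparity, threshold=1.0)
```

A peak and valley analysis of `col_proj` and `row_proj`, as the quoted passage describes, would replace the extent computation when several candidates share one map.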

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TAHMINA N ANSARI, whose telephone number is (571) 270-3379.  The examiner can normally be reached on IFP Flex - Monday through Friday 9 to 5.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, SUMATI LEFKOWITZ, can be reached on 571-272-3638.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.


TAHMINA N. ANSARI
Examiner
Art Unit 2662

August 21, 2021

/TAHMINA N ANSARI/Primary Examiner, Art Unit 2662