DETAILED ACTION
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

Claims 1-10 are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the limitation of independent claim 1, “a controller configured to generate feature pyramid images….”, uses “…controller” as a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function, and the generic placeholder is not preceded by a structural modifier.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure {such as a computer, or a processor, and non-transitory computer memory components} as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Bovyrin (US 20180314916 A1) in view of Ramaswamy (US 20150077323 A1).
Re Claim 1, Bovyrin discloses an apparatus for detecting an object of a vehicle (see Bovyrin: e.g., Fig. 1, [0002]-[0005], and [0080]), comprising:
a camera configured to acquire an image of an area in front of the vehicle (see Bovyrin: e.g., Fig. 1, and in [0005]-[0015]); and
a controller configured to generate a feature pyramid based on a plurality of feature images extracted from the image (see Bovyrin: e.g., -- a. Input: color image, output: list of rectangles, containing "objects" b. Set the initial scale=1, make the list of object candidates or feature candidate list (FCL) empty.  c. Until the original image width is not less than W.sub.0*scale and the original image height is not less than H.sub.0*scale, where (W.sub.0.times.H.sub.0) is the window size for the trained classifier, do: c.1 Generate feature channels.  Fast pyramid calculation approach can be used--, in [0034], {here, the windows are the feature images}).
Bovyrin, however, does not explicitly disclose generating feature pyramid images based on a plurality of feature images extracted from the image.
Ramaswamy teaches generating feature pyramid images based on a plurality of feature images extracted from the image (see Ramaswamy: e.g., -- Extracted features can also be based on higher-level characteristics or features of a user.  One example of higher-level feature detection may involve detection of a user feature (e.g., head or face) and then validating existence of the user in an image by detecting more granular components (e.g., eyes, nose, mouth).  In this example, a representation of the user can be detected within an image by generating from the image a set of pyramidal or hierarchical images that are convolved and subsampled at each ascending level of the image pyramid or hierarchy--, in [0040]-[0041]).
Bovyrin and Ramaswamy are combinable because they are in the same field of endeavor: detecting and identifying objects based on image features. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Bovyrin’s apparatus using Ramaswamy’s teachings, by adding the generation of feature pyramid images based on a plurality of feature images extracted from the image to Bovyrin’s analysis of image features, in order to detect a face and determine where a representation of the face is positioned in one or more images (see Ramaswamy: e.g., abstract, [0040]-[0041]).
Bovyrin as modified by Ramaswamy further discloses that the controller is configured to generate feature aggregation images by filtering the feature pyramid images (see Ramaswamy: e.g., -- Matching cost computations can also be based on truncated quadratics, contaminated Gaussians, phase responses, filter-bank responses, among others.  Another step for disparity computation is cost aggregation, which relates to distributing the matching cost computation over a support region, such as by summing or averaging over a respective window or region of a pair of stereo images.  A support region can be either two-dimensional at a fixed disparity or three-dimensional in x-y-d space.  Two-dimensional cost aggregation techniques can be based on square windows, Gaussian convolutions, multiple shiftable windows, windows with adaptive sizes, and windows based on connected components of constant disparity.  Three-dimensional cost aggregation techniques can be based on disparity differences, limited disparity gradients, and Prazdny's coherence principle.  In some embodiments, iterative diffusion can also be used for aggregating the matching cost to a pixel's neighbors.  Iterative diffusion operates by adding to each pixel's cost the weighted values of its neighboring pixels' costs.--, in [0033]-[0037]; also see Bovyrin: e.g., -- determining random features by defining a maximum allowed feature size of a training sample, sampling random filter positions of a training sample, calculating pixel weights in a patch of the maximum allowed feature size, and selecting a feature for applying a boosting classifier.  The method may also include using aggregate channel features for a first group of weak classifiers, using the same downscaled features channel, applying the filtered channel feature, and using features constructed based on errors of previously used weak classifiers as boosting classifiers.--, in [0078]),
detect a pedestrian area from the feature aggregation images (see Bovyrin: e.g., -- pedestrian detection--, in [0003]; see also S. Zhang, et al., "Filtered channel features for pedestrian detection," in Proc. of CVPR, 2015, which Bovyrin incorporates by reference at [0013] and which is provided in the IDS {such as “The input image is transformed into a set of feature channels (also called feature maps)” in Fig. 1 and on pages 4-5}); and
detect face regions from the feature pyramid images, and determine at least one of the face regions that overlaps the pedestrian area as a face of a pedestrian (see Bovyrin: e.g., --"Aggregate Channel Features for Multi-view Face Detection," Biometrics (IJCB), 2014 IEEE International Joint Conference on IEEE, 2014 {as provided in the IDS}.  Most of the non-object windows are rejected using the fast features.--, in [0012], which Bovyrin incorporates by reference; also see Ramaswamy: e.g., -- face (or facial features, such as the eyebrows, eyes, nose, etc.) can generally be tracked, or detected --, in [0021]-[0028]).
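
For illustration only, the final limitation of claim 1 (keeping the face regions that overlap the pedestrian area) could be sketched as below. The box layout (x, y, width, height) and the names `overlap_area` and `faces_overlapping` are hypothetical; they are not taken from Bovyrin, Ramaswamy, or the application.

```python
from typing import List, Tuple

# Hypothetical box layout: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def overlap_area(a: Box, b: Box) -> int:
    """Return the intersection area of two axis-aligned boxes (0 if disjoint)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return max(w, 0) * max(h, 0)


def faces_overlapping(faces: List[Box], pedestrian: Box) -> List[Box]:
    """Keep only the face regions that share at least one pixel with the pedestrian area."""
    return [face for face in faces if overlap_area(face, pedestrian) > 0]
```

Any positive overlap counts here; a stricter criterion (for example, a minimum overlap fraction) would be an equally plausible reading.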

Re Claim 2, Bovyrin as modified by Ramaswamy further discloses wherein the feature pyramid images are generated by down-scaling the feature image from an original size at a predetermined ratio (see Bovyrin: e.g., -- For first Nf weak classifiers, very fast features may be used.  For example, Aggregate Channel Features (ACF), with a cell size that equals to 6.times.6 pixels, may be used.  To accelerate calculations, all feature channels can be downscaled (e.g. six times).  Thus, the ACF feature is represented by a corresponding pixel value in that corresponding channel.--, in [0018]; also see Ramaswamy: e.g., --The Naive Bayes classifier estimates the local appearance and position of face patterns at multiple resolutions.  At each scale, a face image is decomposed into subregions and the subregions are further decomposed according to space, frequency, and orientation.  The statistics of each projected subregion are estimated from the projected samples to learn the joint distribution of object and position.--, in [0046] {so that the scales correspond to the low-level and higher-level features of the pyramid images}).
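
As a minimal sketch of down-scaling at a predetermined ratio, assuming a feature image stored as a list of rows and nearest-neighbour sampling (neither of which is stated in the cited references), the pyramid could be built as follows. The stopping rule mirrors the window-size condition quoted from Bovyrin [0034]; `downscale`, `build_pyramid`, and the default values are hypothetical.

```python
from typing import List

Image = List[List[float]]


def downscale(image: Image, ratio: float) -> Image:
    """Nearest-neighbour down-scaling of a 2D image by `ratio` (0 < ratio < 1)."""
    h, w = len(image), len(image[0])
    new_h, new_w = int(h * ratio), int(w * ratio)
    return [
        [image[min(h - 1, int(r / ratio))][min(w - 1, int(c / ratio))]
         for c in range(new_w)]
        for r in range(new_h)
    ]


def build_pyramid(image: Image, ratio: float = 0.5,
                  min_h: int = 2, min_w: int = 2) -> List[Image]:
    """Collect levels from the original size down to the smallest one
    that still holds a (min_h x min_w) classifier window."""
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be in (0, 1)")
    levels = []
    current = image
    while len(current) >= min_h and len(current[0]) >= min_w:
        levels.append(current)
        # Stop before producing an empty level.
        if int(len(current) * ratio) < 1 or int(len(current[0]) * ratio) < 1:
            break
        current = downscale(current, ratio)
    return levels
```

With an 8x8 input, a ratio of 0.5, and a 2x2 window, this yields levels of 8x8, 4x4, and 2x2.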

Re Claim 3, Bovyrin as modified by Ramaswamy further discloses generating a filter bank by performing a convolution operation or a correlation operation on a first feature filter and a second feature filter, and generating the feature aggregation images by filtering the feature pyramid images using the filter bank (see Ramaswamy: e.g., -- filter-bank responses--, in [0013]; details of the pooling filter bank and its use are taught in S. Zhang, et al., "Filtered channel features for pedestrian detection," in Proc. of CVPR, 2015, which Bovyrin incorporates by reference at [0012]-[0013]).
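
One possible reading of the filter-bank limitation, sketched under stated assumptions: each filter in the bank is the full 2D convolution of one first-stage filter with one second-stage filter, and each pyramid image is then filtered by "valid" correlation. The pairing scheme and the names `convolve2d_full`, `build_filter_bank`, and `correlate_valid` are hypothetical and are not drawn from the claims or the cited art.

```python
from typing import List

Matrix = List[List[float]]


def convolve2d_full(a: Matrix, b: Matrix) -> Matrix:
    """Full 2D convolution of two small kernels."""
    ah, aw = len(a), len(a[0])
    bh, bw = len(b), len(b[0])
    out = [[0.0] * (aw + bw - 1) for _ in range(ah + bh - 1)]
    for i in range(ah):
        for j in range(aw):
            for k in range(bh):
                for m in range(bw):
                    out[i + k][j + m] += a[i][j] * b[k][m]
    return out


def build_filter_bank(first: List[Matrix], second: List[Matrix]) -> List[Matrix]:
    """Combine every first-stage filter with every second-stage filter."""
    return [convolve2d_full(f1, f2) for f1 in first for f2 in second]


def correlate_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Correlate without padding; the output shrinks by (kernel size - 1)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    return [
        [sum(image[r + i][c + j] * kernel[i][j]
             for i in range(kh) for j in range(kw))
         for c in range(iw - kw + 1)]
        for r in range(ih - kh + 1)
    ]
```

Applying `correlate_valid` to each pyramid level with each filter in the bank would give one aggregation image per (level, filter) pair.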

Re Claim 4, Bovyrin as modified by Ramaswamy further discloses the controller is configured to generate a training feature image based on feature information extracted from a training image and generate the first feature filter from the training feature image based on a Local Binary Pattern method (see Bovyrin: e.g., -- Right after each intermediate sum is computed, it is compared to a "threshold," whose value is determined during the training.  If the sum is below "threshold," the remaining stages are skipped and the window is considered as "not an object." Otherwise, if the intermediate sums are all above the corresponding thresholds, the window is considered as a good object candidate and its position and size are stored together with the final sum of responses, which is treated as a candidate score.  All such candidates are collected from all the image layers.--, in [0017]-[0022]. Details of feature training, such as LBP, are taught on page 2 of S. Zhang, et al., "Filtered channel features for pedestrian detection," in Proc. of CVPR, 2015, which Bovyrin incorporates by reference at [0012]-[0013]).
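
For context, the basic Local Binary Pattern operator compares each pixel with its eight neighbours and packs the comparison results into an 8-bit code. The sketch below shows only that basic operator; how the application derives a feature filter from it is not stated in this action, and the neighbour ordering and function names are assumptions.

```python
from typing import List

# Offsets of the 8 neighbours, clockwise from top-left (ordering is an assumption).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image: List[List[int]], row: int, col: int) -> int:
    """8-bit LBP code of an interior pixel: bit k is set when
    neighbour k is greater than or equal to the centre value."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_image(image: List[List[int]]) -> List[List[int]]:
    """LBP codes for all interior pixels (the one-pixel border is skipped)."""
    h, w = len(image), len(image[0])
    return [[lbp_code(image, r, c) for c in range(1, w - 1)]
            for r in range(1, h - 1)]
```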

Re Claim 5, Bovyrin as modified by Ramaswamy further discloses the controller is configured to generate a first feature aggregation image by filtering the training feature image using the first feature filter and generate a second feature filter from the first feature aggregation image based on a Locally De-correction Channel Feature (LDCF) method (see Bovyrin: e.g., in [0012]-[0022]; details of Locally De-correction Channel Feature (LDCF), the pooling filter bank, and their use are taught {such as on pages 4-5} in S. Zhang, et al., "Filtered channel features for pedestrian detection," in Proc. of CVPR, 2015, which Bovyrin incorporates by reference at [0012]-[0013]).

Re Claim 6, Bovyrin as modified by Ramaswamy further discloses the controller is configured to generate a training feature aggregation image by filtering the training feature image using the filter bank and generate a pedestrian classifier for classifying the pedestrian area in the training feature aggregation image (see Ramaswamy: e.g., -- filter-bank responses--, in [0013]; details of the pooling filter bank and its use are taught {such as on page 4} in S. Zhang, et al., "Filtered channel features for pedestrian detection," in Proc. of CVPR, 2015, which Bovyrin incorporates by reference at [0012]-[0013]).

Re Claim 7, Bovyrin as modified by Ramaswamy further discloses the controller is configured to detect the pedestrian area from the feature aggregation images using the pedestrian classifier (see Bovyrin: e.g., in [0012]-[0022]; details of Locally De-correction Channel Feature (LDCF), the pooling filter bank, and their use are taught {such as in Fig. 1 on page 1, and on pages 4-5} in S. Zhang, et al., "Filtered channel features for pedestrian detection," in Proc. of CVPR, 2015, which Bovyrin incorporates by reference at [0012]-[0013]).

Re Claim 8, Bovyrin as modified by Ramaswamy further discloses wherein the controller is configured to synthesize a region, having a highest score among the face regions that overlap the pedestrian area, with the pedestrian area (see Bovyrin: e.g., -- Apply non-maxima suppression to FCL, constructed in step c: [0043] d.1 Sort FCL in descending order by the score S.sub.i [0044] d.2.  For each (R.sub.i S.sub.i) starting from the highest score,--, in [0040]-[0044]; also see “max pooling” {such as on page 4} in S. Zhang, et al., "Filtered channel features for pedestrian detection," in Proc. of CVPR, 2015, which Bovyrin incorporates by reference at [0012]-[0013]).

Re Claim 9, Bovyrin as modified by Ramaswamy further discloses the controller is configured to determine the region having the highest score as the face of the pedestrian (see Bovyrin: e.g., -- Apply non-maxima suppression to FCL, constructed in step c: [0043] d.1 Sort FCL in descending order by the score S.sub.i [0044] d.2.  For each (R.sub.i S.sub.i) starting from the highest score,--, in [0040]-[0044]; also see “max pooling” {such as on page 4} in S. Zhang, et al., "Filtered channel features for pedestrian detection," in Proc. of CVPR, 2015, which Bovyrin incorporates by reference at [0012]-[0013]).
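
The non-maxima suppression step quoted from Bovyrin for claims 8 and 9 sorts the candidate list by score and processes it from the highest score down. The quotation stops before the suppression criterion, so the sketch below assumes the common greedy variant with an intersection-over-union (IoU) threshold; the threshold value, box layout, and function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]      # hypothetical layout: (x, y, width, height)
Candidate = Tuple[Box, float]        # (region R_i, score S_i)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    inter = max(w, 0) * max(h, 0)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(candidates: List[Candidate],
                        iou_threshold: float = 0.5) -> List[Candidate]:
    """Greedy NMS: visit candidates in descending score order and keep one
    only if it does not overlap an already kept candidate too strongly."""
    ordered = sorted(candidates, key=lambda c: c[1], reverse=True)
    kept: List[Candidate] = []
    for box, score in ordered:
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept
```

The first element of the returned list is always the highest-scoring candidate, which corresponds to the region that claim 9 designates as the face of the pedestrian.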

Re Claim 10, Bovyrin as modified by Ramaswamy further discloses the controller is configured to calculate a probability that an object of the image in the face region that overlaps the pedestrian area is a face of the pedestrian, and generate the score based on the probability (see Ramaswamy: e.g., --The Naive Bayes classifier estimates the local appearance and position of face patterns at multiple resolutions.  At each scale, a face image is decomposed into subregions and the subregions are further decomposed according to space, frequency, and orientation.  The statistics of each projected subregion are estimated from the projected samples to learn the joint distribution of object and position.  A face is determined to be within an image if the likelihood ratio is greater than the ratio of prior probabilities--, in [0046]).
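
The decision rule quoted from Ramaswamy [0046] (a face is present when the likelihood ratio exceeds the ratio of prior probabilities) can be illustrated as follows. Using the posterior probability as the score is an assumption made for this sketch, as are the function and parameter names.

```python
def is_face(lik_face: float, lik_nonface: float, prior_face: float) -> bool:
    """Likelihood-ratio test: P(x|face) / P(x|non-face) > P(non-face) / P(face).
    Written in cross-multiplied form to avoid dividing by zero."""
    return lik_face * prior_face > lik_nonface * (1.0 - prior_face)


def face_score(lik_face: float, lik_nonface: float, prior_face: float) -> float:
    """Posterior probability P(face | x) from Bayes' rule, used here as the score."""
    numerator = lik_face * prior_face
    denominator = numerator + lik_nonface * (1.0 - prior_face)
    return numerator / denominator if denominator > 0 else 0.0
```

The two functions agree: the test passes exactly when the posterior exceeds 0.5.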


Re Claims 11-20, claims 11-20 are the method claims corresponding to claims 1-10, respectively. Thus, claims 11-20 are rejected for the same reasons as discussed above for claims 1-10, respectively. Furthermore, Bovyrin as modified by Ramaswamy further discloses robot identifying an obstacle on a method of steps (see Bovyrin: e.g., Fig. 1, [0002]-[0005], and [0080]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Satzoda (US 20180365888 A1, which claims priority from U.S. provisional application US 62521106, June 16, 2017) teaches determining vehicle event data based on the sampled sensor signals.  Block S110 is preferably performed by an onboard vehicle system as described above, but can additionally or alternatively be performed by any other suitable system or subsystem.  Vehicle events associated with the vehicle event data determined in Block S110 can include: near-collision events, collision events, traffic events (e.g., merging, lane-keeping, slowing to a stop, operating in stop-and-go traffic, accelerating from a stop to a speed limit, holding a determined distance from a leading vehicle, etc.).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to WEI WEN YANG, whose telephone number is (571)270-5670.  The examiner can normally be reached from 8:00 am to 5:00 pm.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matthew Bella, can be reached at 571-272-7778.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 

If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/WEI WEN YANG/Primary Examiner, Art Unit 2667