DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments (9/14/22 Remarks: page 4, lines 8-22) with respect to the rejection of claims 1-20 under 35 USC §112, the rejection of claims 1-20 under 35 USC §103, and the objection to claims 1, 10, & 17 have been fully considered. The rejection of claims 1-20 under 35 USC §112, the rejection of claims 1-20 under 35 USC §103, and the objection to claims 1, 10, & 17 have been obviated by the claims’ cancellation. However, upon further consideration, a new ground(s) of rejection is made in view of Wang (“Improvement of Non-Maximum Suppression in RGB-D Object Detection”, cited in 4/14/22 Office Action), Collet Romea (US 20140307056, cited in 9/15/21 Office Action) and Kim (“Implementation of Yolo-v2 Image Recognition and Other Testbenches for a CNN Accelerator”).
Claim Objections
Claim 21 is objected to because of the following informalities:
In claim 21, line 7, context indicates that “read” should apparently read “red”.
Appropriate correction is required.
Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over Wang (“Improvement of Non-Maximum Suppression in RGB-D Object Detection”, cited in 4/14/22 Office Action) in view of Collet Romea (US 20140307056, cited in 9/15/21 Office Action) and Kim (“Implementation of Yolo-v2 Image Recognition and Other Testbenches for a CNN Accelerator”).
Wang discloses:
Claim 21: A method for detecting road hazards, comprising:
receiving … having a red channel, a blue channel, and a green channel, wherein the red channel, the blue channel, and the green channel produce a 3-channel RGB image (Wang Figure 3, RGB image);
receiving a depth image (Wang Figure 3, depth image);
inputting the 3-channel RGB image and the depth image into an ensemble network (Wang Figure 3, RGB image and depth image input to network) to concatenate the depth image to the red, green, and blue channels of the 3-channel RGB image to produce a 4-channel RGBD image (Wang Figure 3, combining of RGB and depth images) and to determine one or more road hazards within …, wherein the ensemble network includes:
a first plurality of layers (Wang Figure 3, initial set of layers including “DBL” which combines convolutional, batch normal, and leaky ReLU elements), wherein each layer in the first plurality of layers includes:
a convolutional layer (Wang Figure 3, initial set of layers including “DBL” which includes convolution element);
a batch normal layer (Wang Figure 3, initial set of layers including “DBL” which includes batch normal element);
a leaky ReLU activation function (Wang Figure 3, initial set of layers including “DBL” which includes leaky ReLU element); and
…
a second plurality of layers (Wang Figure 3, another set of layers including “DBL” which combines convolutional, batch normal, and leaky ReLU elements), wherein each layer in the second plurality of layers includes:
a convolutional layer (Wang Figure 3, another set of layers including “DBL” which includes convolution element);
a batch normal layer (Wang Figure 3, another set of layers including “DBL” which includes batch normal element); and
a leaky ReLU activation function (Wang Figure 3, another set of layers including “DBL” which includes leaky ReLU element),
and wherein the second plurality of layers does not include a max pooling layer (Wang Figure 3, another set of layers including “DBL” which combines convolutional, batch normal, and leaky ReLU elements, “second plurality of layers” mapped to a set of layers including the above listed elements and excluding any max pooling layer which may be found in the network);
a third plurality of layers (Wang Figure 3, another set of layers including “DBL” which combines convolutional, batch normal, and leaky ReLU elements),
wherein each layer in the third plurality of layers includes:
a convolutional layer (Wang Figure 3, another set of layers including “DBL” which includes convolution element);
a batch normal layer (Wang Figure 3, another set of layers including “DBL” which includes batch normal element);
a leaky ReLU activation function (Wang Figure 3, another set of layers including “DBL” which includes leaky ReLU element); and
a you-only-look-once (“YOLO”) layer (Wang Section III-B, YOLOv3 basic framework); and
one or more skip architectures (Wang Section III-B, concatenation of features from multiple convolution kernels (i.e. concatenation of features including features from layers other than the immediately preceding one, thus skipping at least one layer)).
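For illustration only, the concatenation step recited in claim 21 (appending the depth image to the red, green, and blue channels to form a 4-channel RGBD image) can be sketched as below. This is a minimal sketch under stated assumptions, not a characterization of the Wang implementation or of the claimed ensemble network: the function name `concat_rgbd` and the nested-list image layout are hypothetical.

```python
def concat_rgbd(rgb, depth):
    """Append a depth value to each RGB pixel, giving a 4-channel RGBD image.

    rgb   : H x W nested list of [r, g, b] pixels (assumed layout)
    depth : H x W nested list of depth values (assumed layout)
    """
    # The two images must be spatially aligned before channels are concatenated.
    if len(rgb) != len(depth) or any(len(r) != len(d) for r, d in zip(rgb, depth)):
        raise ValueError("RGB and depth images must have the same height and width")
    # Each output pixel is [r, g, b, d].
    return [[list(px) + [dz] for px, dz in zip(row, drow)]
            for row, drow in zip(rgb, depth)]


# Usage: a 1 x 2 RGB image and a matching 1 x 2 depth image.
rgbd = concat_rgbd([[[1, 2, 3], [4, 5, 6]]], [[9, 8]])
```

In this sketch each pixel of the result carries four values, which corresponds to the "4-channel RGBD image" language of the claim.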
Wang does not expressly disclose:
…a stereo image…
Collet Romea discloses:
…a stereo image (Collet Romea paragraph 0025, stereo RGB camera)…
Wang and Collet Romea are combinable because they are from the field of depth image processing (Wang Figure 3, RGB image and depth image input to network; Collet Romea Abstract and paragraph 0009, depth image).
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to apply the Wang object detection arrangement to a stereo image.
The suggestion/motivation for doing so would have been to obtain stereo depth data in addition to the depth image data of Wang.
Wang does not expressly disclose:
…a max pooling layer…
Kim discloses the use of a max pooling layer in conjunction with the use of a convolution layer in a YOLO architecture (Kim Abstract, convolution and max pooling layers used as combined or separate layers in implementing a YOLO architecture).
Wang and Kim are combinable because they are from the field of image detection and recognition.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to apply the use of a max pooling layer in a YOLO implementation as taught by Kim in the YOLO implementation of Wang.
The suggestion/motivation for doing so would have been to provide a YOLO implementation that supports the development of an accelerator with improved performance (Kim Abstract).
Therefore, it would have been obvious to combine Wang with Collet Romea and Kim to obtain the invention as specified in claim 21.
Applying the teachings set forth for claim 21 above to claims 22-23:
Claim 22: The method of claim 21 (see above), wherein the one or more skip architectures include an output from a layer in the first plurality of layers that provides input to a layer including an upsample layer (Wang Figure 3, upsample layer in initial set of layers).
Claim 23: The method of claim 21 (see above), wherein the one or more skip architectures include an output from a layer in the second plurality of layers that provides input to the layer including an upsample layer (Wang Figure 3, upsample layer in another set of layers).
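For illustration only, a skip architecture of the kind recited in claims 22-23 (an output of an earlier layer provided as input to a layer that includes an upsample layer) can be sketched as below. This is a hedged sketch, not the Wang network or the claimed network: the names `upsample2x` and `skip_concat`, the nearest-neighbour method, the factor of 2, and the nested-list feature-map layout are all assumptions.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of an H x W x C feature map (assumed method)."""
    out = []
    for row in fmap:
        # Repeat each pixel twice horizontally.
        wide = [list(px) for px in row for _ in range(2)]
        out.append(wide)
        # Repeat the widened row once more to double the height.
        out.append([list(px) for px in wide])
    return out


def skip_concat(earlier, deeper):
    """Concatenate an earlier layer's output with the upsampled deeper map, channel-wise."""
    up = upsample2x(deeper)
    if len(up) != len(earlier) or len(up[0]) != len(earlier[0]):
        raise ValueError("spatial sizes must match after upsampling")
    return [[list(a) + list(b) for a, b in zip(ra, rb)]
            for ra, rb in zip(earlier, up)]


# Usage: a 2 x 2 x 1 earlier output joined with a 1 x 1 x 1 deeper output.
merged = skip_concat([[[1], [2]], [[3], [4]]], [[[7]]])
```

In this sketch the earlier output bypasses the intermediate layers and is joined directly with the upsampled features, which is the sense in which at least one layer is skipped.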
Conclusion
Any inquiry concerning the contents of this communication or earlier communications from the examiner should be directed to Stephen M. Brinich at 571-272-7430 (voice) or 571-273-7430 (fax).
Any inquiry relating to the status of this application, entry of papers into this application, or any other inquiries of a general nature concerning application processing should be directed to the Tech Center 2600 Customer Service center at 571-272-2600 or to the USPTO Contact Center at 800-786-9199 or 571-272-1000.
The examiner can normally be reached on weekdays 7:30-4:00 Eastern Time.
If attempts to contact the examiner and the Customer Service Center are unsuccessful, supervisor Claire Wang can be contacted at 571-270-1051.
Hand-carried correspondence may be delivered to the Customer Service Window, located at the Randolph Building, 401 Dulany Street, Alexandria, VA 22314.
/Stephen M Brinich/
Examiner, Art Unit 2663