DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Applicant previously filed claims 1-20. Claims 1, 10, 14, 15, 19 and 20 have been amended. New Claim 21 has been added. Accordingly, claims 1-21 are pending in the current application.
Response to Arguments
Applicant's arguments filed 05/03/2022 have been fully considered but they are not persuasive. 
Applicant’s arguments with respect to the claim(s) have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Applicant is reminded that although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-21 are rejected under 35 U.S.C. 103 as being unpatentable over McKennoch et al. (US 10600192 B1) in view of Wang et al. (US 20190065910 A1).
Regarding Claim 1, McKennoch et al. teaches a method for automated training of deep learning based object detection system (Column 23, Lines 39-59), comprising: 
(a) capturing two or more images of an object from a plurality of angles by a camera system that has been calibrated, the camera system including two or more cameras having a predetermined position between each camera and the object (Column 3, Lines 6-20); 
(b) generating a modelled bounding box offset and dimensions, the modelled bounding box having an approximate offset between two bounding boxes of the same object on images from two different cameras (Column 12, Line 17 – Column 13, Line 14); 
(c) propagating the two or more images through a neural network model, thereby producing a predicted object bounding box and class identifier for each captured image, and generating a predicted bounding box offset between bounding boxes from images of neighboring cameras (Column 13, Line 15 – Column 14, Line 5); 
(d) automated computing a loss value as a sum of: (i) a first penalty value computed as the discrepancy between the modelled bounding box offset and a predicted bounding box offset; and (ii) a second penalty value computed as the discrepancy between the modelled bounding box dimensions and dimensions of predicted bounding boxes from the same image in the two or more images captured by the camera system (Column 12, Line 17 – Column 14, Line 5); and 
(e) adjusting the plurality of neural network parameters Wi until the loss function is minimized to less than a predetermined threshold, based on an optimization algorithm and steps (c) and (d) for the loss computation with selected neural network parameters values (Column 10, Line 6 – Column 11, Line 54).
However, McKennoch et al. does not explicitly teach propagating the two or more images through a deep learning-based object detection neural network model; and training the object detection neural network model by adjusting a plurality of neural network weights Wi.
Wang et al., however, teaches propagating the two or more images through a deep learning-based object detection neural network model; and training the object detection neural network model by adjusting a plurality of neural network weights Wi (Paragraphs 3-4; Paragraphs 18-23).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the training of the object detection system as shown in McKennoch et al. to include the deep learning-based object detection neural network model as shown in Wang et al. above, in order to increase the detection quality of the images and the speed at which the objects are recognized and classified (see Wang et al., Paragraph 5).
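For reference, the loss recited in step (d) of claim 1 can be sketched as follows. This is an illustrative reading only, not code from the application or from either cited reference. The `Box` type, the function names, and the use of an absolute difference as the "discrepancy" are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels (hypothetical helper type)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def predicted_offset(a: Box, b: Box) -> tuple:
    """Offset between boxes of the same object seen by two neighboring cameras."""
    return (b.x - a.x, b.y - a.y)


def dependency_loss(pred_a: Box, pred_b: Box, modelled_offset, modelled_size) -> float:
    """Sum of (i) an offset penalty and (ii) a dimension penalty.

    Assumption: "discrepancy" is taken as an absolute difference; the claim
    language does not fix a particular norm.
    """
    dx, dy = predicted_offset(pred_a, pred_b)
    offset_penalty = abs(dx - modelled_offset[0]) + abs(dy - modelled_offset[1])
    size_penalty = sum(
        abs(p.w - modelled_size[0]) + abs(p.h - modelled_size[1])
        for p in (pred_a, pred_b)
    )
    return offset_penalty + size_penalty
```

Under this reading, step (e) would repeat the forward pass and the loss computation while an optimizer adjusts the weights Wi, stopping once `dependency_loss` falls below the predetermined threshold.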
Regarding Claim 2, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches after the adjusting step, further comprising iteratively repeating steps (a) through (e) by moving the camera system relative to the object to one or more different angles and/or one or more different distances until all or substantially all required distances and view angles are processed and the loss value is less than a predetermined threshold (Column 25, Line 53 – Column 26, Line 45).
Regarding Claim 3, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches wherein the computing the loss value as the sum of comprises (iii) a third penalty value added if the predicted class identifier for a particular image differs from the predicted class identifier for other images or differs from expected value, provided at the configuration of the neural network model; and (iv) a fourth penalty value added if more than one object per image is predicted (Column 12, Line 17 – Column 14, Line 5; Column 14, Line 56 – Column 15, Line 21).
Regarding Claim 4, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches wherein the loss value as a sum comprises an absolute value for each difference in offset, size, and one or more penalty values added if predicted class identifiers are different or there is more than one class identifier predicted for each image (Column 12, Line 17 – Column 14, Line 5; Column 14, Line 56 – Column 15, Line 21).
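The third and fourth penalty values recited in claims 3 and 4 can be illustrated with a short sketch. It is not taken from the application or the cited references. The function name, the list-of-lists input format, and the fixed per-violation `weight` are assumptions chosen for the example.

```python
def class_and_count_penalty(detections, expected_id=None, weight=1.0):
    """Penalty for inconsistent class identifiers and for extra detections.

    detections: one list of predicted class identifiers per image
    (hypothetical format). expected_id: optional class identifier provided
    at configuration time.
    """
    penalty = 0.0

    # Fourth penalty: more than one object predicted in a single image.
    for ids in detections:
        if len(ids) > 1:
            penalty += weight

    # Third penalty: the class identifier of an image differs from the
    # expected value, or from the first image's prediction when no expected
    # value is configured (an assumption about how "other images" is read).
    first_ids = [ids[0] for ids in detections if ids]
    if expected_id is not None:
        reference = expected_id
    else:
        reference = first_ids[0] if first_ids else None
    for class_id in first_ids:
        if class_id != reference:
            penalty += weight

    return penalty
```

In this sketch the returned value would simply be added to the offset and dimension penalties of claim 1 to form the total loss.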
Regarding Claim 5, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches wherein the discrepancy between the modelled bounding box dimensions are computed using image analysis, based on background subtraction and color based segmentation (Column 7, Lines 1-26).
Regarding Claim 6, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches wherein the two or more cameras comprises three or more cameras, the three more cameras being grouped to stereo pairs and dependency based box loss is computed as a discrepancy between physical object coordinates, estimated using different stereo pairs, from predicted bounding boxes, the first camera being common for all the stereo pairs (Column 8, Line 61 – Column 9, Line 13; Column 20, Line 45 – Column 21, Line 53).
Regarding Claim 7, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches wherein the two or more cameras comprises three or more cameras, the three more cameras being positioned in equidistant between the two more cameras and/or aligned to each other (Column 8, Line 61 – Column 9, Line 13; Column 20, Line 45 – Column 21, Line 53).
Regarding Claim 8, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches wherein the two or more cameras comprises three or more cameras, the three more cameras being positioned not equidistant between the two more cameras and/or unaligned to each other (Column 8, Line 61 – Column 9, Line 13; Column 20, Line 45 – Column 21, Line 53).
Regarding Claim 9, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches wherein two or more cameras are capturing moving objects within an instrumented environment and dependency based box loss is computed as a discrepancy between expected bounding box offset and the offset between predicted object bounding boxes associated with neighboring in time images (Column 12, Line 17 – Column 14, Line 5).
Regarding Claim 10, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches wherein the camera system further comprises a plurality of sensors and dependency based loss is extended with context information of an environment and an object, measuring discrepancy between predicted object box/class identifier and sensor values or other prior info about the environment and objects (Column 3, Line 6 – Column 4, Line 41; Column 5, Line 7 – Column 6, Line 29).
Regarding Claim 11, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches where the camera system moves on x-axis, y-axis, and z-axis image the object from different angles and one or more different distances (Column 4, Lines 42-60).
Regarding Claim 12, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches wherein workspace is equipped with calibration pattern and two or more cameras are capturing the images of an object and pattern and dependency based box loss is computed as a discrepancy between physical object coordinates with respect to the pattern, estimated using homography projection between first camera plane and workspace and physical object coordinates with respect to the pattern, estimated using homography projection between second camera plane and workspace (Column 10, Line 6 – Column 11, Line 54).
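The homography-based discrepancy recited in claim 12 can be sketched as below. This is an illustrative reading under stated assumptions, not the method of the application or of McKennoch et al. The function names are hypothetical, each homography is assumed to be a known 3x3 matrix mapping image pixels to workspace coordinates defined by the calibration pattern, and the discrepancy is assumed to be a Euclidean distance.

```python
import math


def project(H, u, v):
    """Map image point (u, v) to workspace coordinates with a 3x3 homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return (x / w, y / w)


def homography_box_loss(H1, point_cam1, H2, point_cam2):
    """Distance between the object position estimated from each camera.

    point_cam1 and point_cam2 are image coordinates of the same object
    feature (for example, a predicted bounding box center) in the first
    and second camera images.
    """
    x1, y1 = project(H1, *point_cam1)
    x2, y2 = project(H2, *point_cam2)
    return math.hypot(x1 - x2, y1 - y2)
```

When both predicted boxes are consistent with the same physical location on the workspace, the two projections coincide and the loss is zero.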
Regarding Claim 13, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches wherein the plurality of parameter values comprise a random set of parameters, or a predetermined set of parameters from a pretrained neural network (Column 15, Line 66 – Column 16, Line 25).
Regarding Claim 14, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches prior to the capturing step, further comprising initializing a neural network with a plurality of parameter values W0 (Column 19, Lines 8-47).
However, McKennoch et al. does not explicitly teach a neural network with a plurality of parameter weight values W0 in the plurality of neural network weights Wi.
Wang et al. teaches a neural network with a plurality of parameter weight values W0 in the plurality of neural network weights Wi (Paragraphs 3-4; Paragraphs 18-23).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the training of the object detection system as shown in McKennoch et al. to include the deep learning-based object detection neural network model as shown in Wang et al. above, in order to increase the detection quality of the images and the speed at which the objects are recognized and classified (see Wang et al., Paragraph 5).
Regarding Claim 15, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches wherein the set of parameter values, Wi comprises W0, W1, W2 ... Wi, as determined by the optimizer (Column 10, Line 6 – Column 11, Line 54).
However, McKennoch et al. does not explicitly teach the plurality of neural network weights Wi.
Wang et al. teaches the plurality of neural network weights Wi (Paragraphs 3-4; Paragraphs 18-23).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the training of the object detection system as shown in McKennoch et al. to include the deep learning-based object detection neural network model as shown in Wang et al. above, in order to increase the detection quality of the images and the speed at which the objects are recognized and classified (see Wang et al., Paragraph 5).
Regarding Claim 16, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches wherein the propagating step is performed by a forward pass, the forward pass step being executed by an object detection neural network model (Column 10, Line 6 – Column 11, Line 54).
Regarding Claim 17, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches after step (g), wherein the neural network is self-trained to detect and identify the object (Column 3, Lines 39-59; Column 10, Line 6 – Column 11, Line 54).
Regarding Claim 18, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches wherein the optimization algorithm comprises Stochastic Gradient Descent (SGD), or Adaptive Moment Estimation (ADAM) Particle Filtering (Column 7, Lines 1-26).
Claim 19 has limitations similar to those rejected in claims 1-18 above and is rejected for the same reasons of obviousness as used above.
Claim 20 is drawn to the apparatus corresponding to the method claimed in claim 1 above, and is rejected for the same reasons of obviousness as used above. McKennoch et al. further teaches a system for automated training of deep learning based object detection system (Column 23, Lines 39-59).
Regarding Claim 21, McKennoch et al. and Wang et al. teach the method of claim 1, McKennoch et al. further teaches wherein the automated computing comprises computing without using or requiring ground truth information for each image in the two or more images (Column 5, Lines 7-67: “The epipoles 218A-18B may be points where a line from the focal center of the first image capture device 212A to the focal center of the second image capture device 212B intersects in their respective images 204A-04B. That is the epipole 218A is the point of intersection in the image 204A of the line from the first image capture device 212A, and the epipole 218B is the point of intersection in the image 204B of the same line. The line that intersects the epipoles 218A-18B may be determined from information about the physical proximity and orientation of each of the cameras. The relationship between the positions of the cameras may be determined initially using static correspondence points in the background.”). This passage teaches computing and deriving the relative orientation of the cameras directly from the static images, without any ground truth describing that orientation.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to FARHAN MAHMUD whose telephone number is (571)272-7712.  The examiner can normally be reached on 10-7.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Joseph Ustaris, can be reached at 571-272-7383. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/FARHAN MAHMUD/Primary Examiner, Art Unit 2483