Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
Claims 1 – 20 are pending in this application. Claims 1, 8 and 15 are independent.

Specification
The title of the invention, "SYSTEM AND METHOD FOR OPTIMIZING PERFORMANCE OF AT LEAST ONE DOWNSTREAM TASK", is not descriptive. A new title that is clearly indicative of the invention to which the claims are directed is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 – 20 are rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan, Praveen (US-20220012916-A1, hereinafter "Srinivasan").

Regarding independent claims 1, 8 and 15, Srinivasan teaches:
A method for optimizing performance of at least one downstream task (See at least Srinivasan, ¶ [0052], FIG. 1, "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…"), the method comprising the step of:
generating visual semantic segmentation data (e.g., bounding boxes of objects in an environment can allow various systems of an autonomous vehicle to perform segmentation in Srinivasan) of a scene (e.g., environment 106 (FIG. 1) of Srinivasan) by a visual semantic segmentation model based on at least one image of the scene (See at least Srinivasan, ¶ [0042, 0052], FIG. 1, "…A machine-learning model can be trained to generate relative depth data using training image data and training lidar data as a ground truth for training… segmentation operations (e.g., semantic segmentation, instance segmentation, etc.) can be performed on the training image data…", "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…");
generating labeled point cloud data (e.g., a point cloud of data (e.g., the local map or depth data) in Srinivasan) of the scene (e.g., environment 106 (FIG. 1) of Srinivasan) by a vision model based on raw point cloud data of the scene and the visual semantic segmentation data (See at least Srinivasan, ¶ [0027, 0042, 0052], FIG. 1, "…During localization operations, a vehicle can use depth data generated by the machine-learned model as a point cloud of data (e.g., the local map or depth data)…", "…A machine-learning model can be trained to generate relative depth data using training image data and training lidar data as a ground truth for training… segmentation operations (e.g., semantic segmentation, instance segmentation, etc.) can be performed on the training image data…", "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…");
generating one or more clusters (e.g., by using clustering algorithms OR by classifying and/or detecting static objects and/or dynamic objects associated with the image data of Srinivasan) of the scene by a cluster generator model based on the labeled point cloud data (See at least Srinivasan, ¶ [0027, 0029, 0042, 0052], FIG. 1, "…During localization operations, a vehicle can use depth data generated by the machine-learned model as a point cloud of data (e.g., the local map or depth data)…", "…the localization operation can use perception operations to classify and/or detect the static objects and/or the dynamic objects associated with the image data…", "…A machine-learning model can be trained to generate relative depth data using training image data and training lidar data as a ground truth for training… segmentation operations (e.g., semantic segmentation, instance segmentation, etc.) can be performed on the training image data…", "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…");
determining a clustering loss error (e.g., by computing a loss and/or minimizing an error of the depth data of Srinivasan) between the one or more clusters generated by the cluster generator model and one or more ground truth clusters (See at least Srinivasan, ¶ [0022, 0027, 0029, 0042, 0052], FIG. 1, "…the machine-learned model can use a Least Absolute Deviations algorithm (e.g., an L1 loss function) and/or a Least Square Errors (e.g., an L2 loss function) to compute a loss and/or minimize an error of the depth data…", "…During localization operations, a vehicle can use depth data generated by the machine-learned model as a point cloud of data (e.g., the local map or depth data)…", "…the localization operation can use perception operations to classify and/or detect the static objects and/or the dynamic objects associated with the image data…", "…A machine-learning model can be trained to generate relative depth data using training image data and training lidar data as a ground truth for training… segmentation operations (e.g., semantic segmentation, instance segmentation, etc.) can be performed on the training image data…", "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…"); and
adjusting, based on the clustering loss error, one or more model weights (e.g., a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error of Srinivasan) of at least one of: the visual semantic segmentation model and the vision model (See at least Srinivasan, ¶ [0022, 0027, 0029, 0042, 0052], FIG. 1, "…the machine-learned model can use a Least Absolute Deviations algorithm (e.g., an L1 loss function) and/or a Least Square Errors (e.g., an L2 loss function) to compute a loss and/or minimize an error of the depth data…", "…During localization operations, a vehicle can use depth data generated by the machine-learned model as a point cloud of data (e.g., the local map or depth data)…", "…the localization operation can use perception operations to classify and/or detect the static objects and/or the dynamic objects associated with the image data…", "…A machine-learning model can be trained to generate relative depth data using training image data and training lidar data as a ground truth for training… segmentation operations (e.g., semantic segmentation, instance segmentation, etc.) can be performed on the training image data…", "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…").
Srinivasan teaches all of the subject matter of the claimed invention, as set forth in the rejection above. However, these teachings appear in separate embodiments.
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the separate embodiments of Srinivasan for the desirable and advantageous purpose of improving the functioning of a computing device by providing additional depth data for performing subsequent operations to control an autonomous vehicle, such that depth data associated with image data can allow subsequent processes such as localization, perception, route planning, trajectory generation, and the like to be performed more accurately, using less processing power, and/or requiring less memory, as discussed in Srinivasan (See ¶ [0025]), thereby helping to improve overall system robustness.

Regarding dependent claims 2, 9 and 16, Srinivasan teaches:
determining a labeled point cloud data error between the labeled point cloud data and ground truth labeled point cloud data (e.g., the error can include a difference between the depth value output based on the image data and a ground truth depth value associated with the captured depth data in Srinivasan) (See at least Srinivasan, ¶ [0022, 0027, 0029, 0042, 0052], FIG. 1, "…the machine-learned model can use a Least Absolute Deviations algorithm (e.g., an L1 loss function) and/or a Least Square Errors (e.g., an L2 loss function) to compute a loss and/or minimize an error of the depth data…", "…During localization operations, a vehicle can use depth data generated by the machine-learned model as a point cloud of data (e.g., the local map or depth data)…", "…the localization operation can use perception operations to classify and/or detect the static objects and/or the dynamic objects associated with the image data…", "…A machine-learning model can be trained to generate relative depth data using training image data and training lidar data as a ground truth for training… segmentation operations (e.g., semantic segmentation, instance segmentation, etc.) can be performed on the training image data…", "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…"); and
adjusting one or more model weights of the visual semantic segmentation model based on the labeled point cloud data error (See at least Srinivasan, ¶ [0022, 0027, 0029, 0042, 0052], FIG. 1, "…the machine-learned model can use a Least Absolute Deviations algorithm (e.g., an L1 loss function) and/or a Least Square Errors (e.g., an L2 loss function) to compute a loss and/or minimize an error of the depth data…", "…During localization operations, a vehicle can use depth data generated by the machine-learned model as a point cloud of data (e.g., the local map or depth data)…", "…the localization operation can use perception operations to classify and/or detect the static objects and/or the dynamic objects associated with the image data…", "…A machine-learning model can be trained to generate relative depth data using training image data and training lidar data as a ground truth for training… segmentation operations (e.g., semantic segmentation, instance segmentation, etc.) can be performed on the training image data…", "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…").

Regarding dependent claims 3, 10 and 17, Srinivasan teaches:
determining a visual semantic segmentation data error between the visual semantic segmentation data and ground truth visual semantic segmentation data (e.g., the error can include a difference between the depth value output based on the image data and a ground truth depth value associated with the captured depth data in Srinivasan) (See at least Srinivasan, ¶ [0022, 0027, 0029, 0042, 0052], FIG. 1, "…the machine-learned model can use a Least Absolute Deviations algorithm (e.g., an L1 loss function) and/or a Least Square Errors (e.g., an L2 loss function) to compute a loss and/or minimize an error of the depth data…", "…During localization operations, a vehicle can use depth data generated by the machine-learned model as a point cloud of data (e.g., the local map or depth data)…", "…the localization operation can use perception operations to classify and/or detect the static objects and/or the dynamic objects associated with the image data…", "…A machine-learning model can be trained to generate relative depth data using training image data and training lidar data as a ground truth for training… segmentation operations (e.g., semantic segmentation, instance segmentation, etc.) can be performed on the training image data…", "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…"); and
adjusting one or more model weights of the visual semantic segmentation model based on the visual semantic segmentation data error (See at least Srinivasan, ¶ [0022, 0027, 0029, 0042, 0052], FIG. 1, "…the machine-learned model can use a Least Absolute Deviations algorithm (e.g., an L1 loss function) and/or a Least Square Errors (e.g., an L2 loss function) to compute a loss and/or minimize an error of the depth data…", "…During localization operations, a vehicle can use depth data generated by the machine-learned model as a point cloud of data (e.g., the local map or depth data)…", "…the localization operation can use perception operations to classify and/or detect the static objects and/or the dynamic objects associated with the image data…", "…A machine-learning model can be trained to generate relative depth data using training image data and training lidar data as a ground truth for training… segmentation operations (e.g., semantic segmentation, instance segmentation, etc.) can be performed on the training image data…", "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…").

Regarding dependent claims 4, 11 and 18, Srinivasan teaches:
determining a labeled point cloud data error between the labeled point cloud data and ground truth labeled point cloud data (e.g., the error can include a difference between the depth value output based on the image data and a ground truth depth value associated with the captured depth data in Srinivasan) (See at least Srinivasan, ¶ [0022, 0027, 0029, 0042, 0052], FIG. 1, "…the machine-learned model can use a Least Absolute Deviations algorithm (e.g., an L1 loss function) and/or a Least Square Errors (e.g., an L2 loss function) to compute a loss and/or minimize an error of the depth data…", "…During localization operations, a vehicle can use depth data generated by the machine-learned model as a point cloud of data (e.g., the local map or depth data)…", "…the localization operation can use perception operations to classify and/or detect the static objects and/or the dynamic objects associated with the image data…", "…A machine-learning model can be trained to generate relative depth data using training image data and training lidar data as a ground truth for training… segmentation operations (e.g., semantic segmentation, instance segmentation, etc.) can be performed on the training image data…", "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…"); and
adjusting one or more model weights of the vision model based on the labeled point cloud data error (See at least Srinivasan, ¶ [0022, 0027, 0029, 0042, 0052], FIG. 1, "…the machine-learned model can use a Least Absolute Deviations algorithm (e.g., an L1 loss function) and/or a Least Square Errors (e.g., an L2 loss function) to compute a loss and/or minimize an error of the depth data…", "…During localization operations, a vehicle can use depth data generated by the machine-learned model as a point cloud of data (e.g., the local map or depth data)…", "…the localization operation can use perception operations to classify and/or detect the static objects and/or the dynamic objects associated with the image data…", "…A machine-learning model can be trained to generate relative depth data using training image data and training lidar data as a ground truth for training… segmentation operations (e.g., semantic segmentation, instance segmentation, etc.) can be performed on the training image data…", "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…").

Regarding dependent claims 5, 12 and 19, Srinivasan teaches:
adjusting one or more model weights of the cluster generator model based on the clustering loss error (See at least Srinivasan, ¶ [0022, 0027, 0029, 0042, 0052], FIG. 1, "…the machine-learned model can use a Least Absolute Deviations algorithm (e.g., an L1 loss function) and/or a Least Square Errors (e.g., an L2 loss function) to compute a loss and/or minimize an error of the depth data…", "…During localization operations, a vehicle can use depth data generated by the machine-learned model as a point cloud of data (e.g., the local map or depth data)…", "…the localization operation can use perception operations to classify and/or detect the static objects and/or the dynamic objects associated with the image data…", "…A machine-learning model can be trained to generate relative depth data using training image data and training lidar data as a ground truth for training… segmentation operations (e.g., semantic segmentation, instance segmentation, etc.) can be performed on the training image data…", "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…").

Regarding dependent claims 6 and 13, Srinivasan teaches:
wherein the one or more clusters are in the form of one or more bounding boxes (e.g., bounding boxes (e.g., FIGS. 5, 7) of objects in an environment can allow various systems of an autonomous vehicle to perform segmentation in Srinivasan) (See at least Srinivasan, ¶ [0022, 0027, 0029, 0042, 0052], FIG. 1, "…the machine-learned model can use a Least Absolute Deviations algorithm (e.g., an L1 loss function) and/or a Least Square Errors (e.g., an L2 loss function) to compute a loss and/or minimize an error of the depth data…", "…During localization operations, a vehicle can use depth data generated by the machine-learned model as a point cloud of data (e.g., the local map or depth data)…", "…the localization operation can use perception operations to classify and/or detect the static objects and/or the dynamic objects associated with the image data…", "…A machine-learning model can be trained to generate relative depth data using training image data and training lidar data as a ground truth for training… segmentation operations (e.g., semantic segmentation, instance segmentation, etc.) can be performed on the training image data…", "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…").

Regarding dependent claims 7, 14 and 20, Srinivasan teaches:
wherein the raw point cloud data is data from a LIDAR sensor (See at least Srinivasan, ¶ [0022, 0027, 0029, 0042, 0052], FIG. 1, "…the machine-learned model can use a Least Absolute Deviations algorithm (e.g., an L1 loss function) and/or a Least Square Errors (e.g., an L2 loss function) to compute a loss and/or minimize an error of the depth data…", "…During localization operations, a vehicle can use depth data generated by the machine-learned model as a point cloud of data (e.g., the local map or depth data)…", "…the localization operation can use perception operations to classify and/or detect the static objects and/or the dynamic objects associated with the image data…", "…A machine-learning model can be trained to generate relative depth data using training image data and training lidar data as a ground truth for training… segmentation operations (e.g., semantic segmentation, instance segmentation, etc.) can be performed on the training image data…", "…image data and ground truth data (e.g., lidar associated with the image data) data can be input into a machine-learning model. For example, the training data can be input to a machine-learning model where a known result (e.g., a ground truth, such as a known depth value) can be used to adjust weights and/or parameters of the machine-learning model to minimize an error…").

Conclusion
The prior art made of record and not relied upon is considered pertinent to Applicant's disclosure. See the Notice of References Cited (PTO–892).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to IDOWU O OSIFADE whose telephone number is (571)272-0864. The Examiner can normally be reached on Monday-Friday 8:00am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the Examiner’s Supervisor, Emily Terrell, can be reached at (571) 270 – 3717. The fax phone number for the organization where this application or proceeding is assigned is (571) 273 – 8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov.
Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at (866) 217 – 9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call (800) 786 – 9199 (IN USA OR CANADA) or (571) 272 – 1000.



/IDOWU O OSIFADE/
Primary Examiner, Art Unit 2666