DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 21-32 and 35-40 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Banerjee et al. (US Pub. 2020/0301013).
With respect to claim 21, Banerjee discloses a method performed by one or more data processing apparatus for aligning multi-modal sensor data (see figure 3), the method comprising:
obtaining multi-modal sensor data characterizing an environment, wherein the multi-modal sensor data comprises: (i) first sensor data generated by a first sensor modality, and (ii) second sensor data generated by a second sensor modality, wherein the second sensor modality is different than the first sensor modality, (see figure 3, camera and LiDAR, two different sensors “multi-modal sensor”);
processing each of a plurality of regions of the first sensor data using a first embedding neural network that is specific to the first sensor modality to generate a respective region embedding of each of the plurality of regions of the first sensor data;
processing each of a plurality of regions of the second sensor data using a second embedding neural network that is specific to the second sensor modality to generate a respective region embedding of each of the plurality of regions of the second sensor data, (see paragraph 0008, …feeding the image data and the projected depth data in to respective separate convolutional neural network to learn separate features…);
determining a plurality of similarity scores, wherein each similarity score measures a similarity between a region embedding of a respective region of the first sensor data and a region embedding of a respective region of the second sensor data; and
identifying a plurality of region embedding pairs that collectively define an alignment of the first sensor data and the second sensor data based on the plurality of similarity scores, wherein each region embedding pair comprises a region embedding of a respective region of the first sensor data and a region embedding of a respective region of the second sensor data, (see paragraph 0079, calibration between the camera and LiDAR …extracted edges from the camera image and LiDAR point cloud image … similarity …evaluated…similarity score can serve as the measure of good calibration…), as claimed.
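For illustration only, the similarity-score and pair-identification limitations recited in claim 21 can be sketched as follows. The sketch assumes cosine similarity as the similarity measure and a simple per-region best-match rule; neither choice is specified by the claim or taken from Banerjee. All function names and the toy embeddings are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_scores(first_embeddings, second_embeddings):
    """scores[i][j] compares region i of the first sensor data
    with region j of the second sensor data."""
    return [
        [cosine_similarity(u, v) for v in second_embeddings]
        for u in first_embeddings
    ]


def best_match_pairs(scores):
    """For each first-modality region, pick the second-modality region
    with the highest score (hypothetical pairing rule)."""
    pairs = []
    for i, row in enumerate(scores):
        j = max(range(len(row)), key=row.__getitem__)
        pairs.append((i, j))
    return pairs


# Toy region embeddings standing in for the outputs of the two
# modality-specific embedding networks (values are made up).
image_region_embeddings = [[1.0, 0.0], [0.0, 1.0]]
point_cloud_region_embeddings = [[0.1, 0.9], [0.9, 0.1]]

scores = similarity_scores(image_region_embeddings, point_cloud_region_embeddings)
pairs = best_match_pairs(scores)
```

In this toy case, image region 0 pairs with point cloud region 1 and image region 1 pairs with point cloud region 0; the resulting list of pairs plays the role of the claimed alignment.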

With respect to claim 22, Banerjee further discloses wherein the first sensor modality is an imaging modality and the first sensor data comprises an image that characterizes a visual appearance of the environment, (see paragraph 0079, …the camera and the LiDAR), as claimed.

With respect to claim 23, Banerjee further discloses wherein the second sensor modality is a surveying sensor modality and the second sensor data comprises a point cloud, wherein the point cloud includes a collection of data points that characterize a three-dimensional geometry of the environment, wherein each data point defines a respective three-dimensional spatial position of a point on a surface in the environment, (see paragraph 0075, In a LiDAR, …to get the exact location the point in 3D space….), as claimed.  

With respect to claim 24, Banerjee further discloses wherein the second sensor data is captured by a lidar sensor or a radar sensor, (see paragraph 0075, In a LiDAR), as claimed.

With respect to claim 25, Banerjee further discloses wherein the second sensor data is captured by a lidar sensor, and each data point in the point cloud additionally defines a strength of a reflection of a pulse of light that was transmitted by the lidar sensor and that reflected from the point on the surface of the environment at the three-dimensional spatial position defined by the data point, (see paragraph 0074, LiDAR typically used ultraviolet light to determine the distance to an object…), as claimed.

With respect to claim 26, Banerjee further discloses wherein the first sensor data and the second sensor data are captured by sensors mounted on a vehicle, (see figure 17, 170 vehicle, 171 LiDAR and 172 camera), as claimed.

With respect to claim 27, Banerjee further discloses using the alignment of the first sensor data and the second sensor data to determine whether a first sensor that captured the first sensor data and a second sensor that captured the second sensor data are accurately calibrated, (see paragraph 0079), as claimed.

With respect to claim 28, Banerjee further discloses obtaining data defining a position of an object in the first sensor data; and identifying a corresponding position of the object in the second sensor data based on: (i) the position of the object in the first sensor data, and (ii) the alignment of the first sensor data and the second sensor data, (see paragraphs 0078-0079), as claimed.

With respect to claim 29, Banerjee further discloses generating fused sensor data by fusing the first sensor data and the second sensor data using the alignment of the first sensor data and the second sensor data; and processing the fused sensor data using a neural network to generate a neural network output, (see figures 1 and 3; and paragraph 0026), as claimed.

With respect to claim 30, Banerjee further discloses wherein the neural network output comprises data identifying positions of objects in the environment, (see paragraph 0078, …six degrees of freedom with three translation parameters along X, Y and Z axis…), as claimed.

With respect to claim 31, Banerjee further discloses wherein the plurality of regions of the first sensor data cover the first sensor data, (see figure 17, 172 camera observing the surroundings of the vehicle), as claimed.

With respect to claim 32, Banerjee further discloses wherein the plurality of regions of the second sensor data cover the second sensor data, (see figure 17, 171 LiDAR observing the surroundings of the vehicle), as claimed.

Claims 35-37 and 38-40 are rejected for the same reasons as set forth in the rejections of claims 21-23, because claims 35-37 and 38-40 claim subject matter of similar scope to that of claims 21-23, respectively.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 33-34 are rejected under 35 U.S.C. 103 as being unpatentable over Banerjee et al. (US Pub. 2020/0301013) in view of “3D object instance recognition and pose estimation using triplet loss with dynamic margin,” by Zakharov et al. (IDS document).
With respect to claim 33, Banerjee discloses all the limitations as disclosed and as rejected above in claim 21. However, Banerjee fails to disclose wherein the plurality of region embedding pairs are identified using a greedy nearest neighbor matching algorithm, (emphasis added) as claimed.
Zakharov teaches wherein the plurality of region embedding pairs are identified using a greedy nearest neighbor matching algorithm, (emphasis added) (see section II Related work, third column, nearest neighbor search “a greedy nearest neighbor matching algorithm”), as claimed.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teaching of Zakharov, matching views using a similarity measure, with the Banerjee system, because Banerjee already discloses matching (see paragraph 0079) and the modification would yield an improved system (see Zakharov, last paragraph of section II).
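For illustration only, one common reading of a “greedy nearest neighbor matching algorithm” over a matrix of similarity scores can be sketched as follows. The sketch assumes a one-to-one matching in which the highest remaining score is accepted at each step; this reading is an assumption and is not quoted from claim 33, Banerjee, or Zakharov. The function name and the example scores are hypothetical.

```python
def greedy_nearest_neighbor_matching(scores):
    """Greedily pair regions: repeatedly accept the highest remaining
    similarity score whose row and column are both still unmatched.

    scores[i][j] is the similarity between region i of the first sensor
    data and region j of the second sensor data.
    """
    candidates = sorted(
        ((score, i, j) for i, row in enumerate(scores) for j, score in enumerate(row)),
        key=lambda item: item[0],
        reverse=True,
    )
    used_first, used_second = set(), set()
    pairs = []
    for score, i, j in candidates:
        if i in used_first or j in used_second:
            continue  # one of the two regions is already matched
        pairs.append((i, j))
        used_first.add(i)
        used_second.add(j)
    return pairs


# Made-up similarity scores for two regions per modality.
example_scores = [
    [0.9, 0.8],
    [0.7, 0.1],
]
example_pairs = greedy_nearest_neighbor_matching(example_scores)
```

With these scores the greedy rule accepts (0, 0) first and is then left with (1, 1), even though pairing (0, 1) with (1, 0) would give a larger total score. The example shows that a greedy matching is fast but not necessarily globally optimal.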

With respect to claim 34, Banerjee discloses all the limitations as disclosed and as rejected above in claim 21. However, Banerjee fails to disclose wherein the visual embedding neural network and the shape embedding neural network are jointly trained using a triplet loss objective function or a contrastive loss objective function, as claimed.
Zakharov teaches wherein the visual embedding neural network and the shape embedding neural network are jointly trained using a triplet loss objective function or a contrastive loss objective function, (see section II Related work, third column, adds classification loss to triplet loss and learns the embedding from the input “trained using a triplet loss”) as claimed.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the teaching of Zakharov, training a neural network using a triplet loss, with the Banerjee system, because Banerjee already discloses neural network modeling (see paragraph 0019) and the modification would yield an improved system (see Zakharov, last paragraph of section II).
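For illustration only, the standard form of a triplet loss objective can be sketched as follows. The sketch assumes Euclidean distance and a fixed margin; Zakharov's title refers to a dynamic margin, which is not reproduced here, and some formulations use squared distances instead. Treating the anchor as an embedding from one modality and the positive and negative as embeddings from the other modality is likewise an assumption. All names and values are hypothetical.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss with a fixed margin (assumed form):
    max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    return max(
        0.0,
        euclidean_distance(anchor, positive)
        - euclidean_distance(anchor, negative)
        + margin,
    )


# Anchor: embedding of an image region (made-up values).
# Positive: embedding of the matching point cloud region.
# Negative: embedding of a non-matching point cloud region.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]

far_negative = [3.0, 4.0]   # well separated: no penalty
near_negative = [0.0, 1.5]  # too close to the anchor: penalized

loss_far = triplet_loss(anchor, positive, far_negative)
loss_near = triplet_loss(anchor, positive, near_negative)
```

The loss is zero when the non-matching region is already farther from the anchor than the matching region by at least the margin, and positive otherwise, so minimizing it pulls matching embeddings together and pushes non-matching embeddings apart.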

Any inquiry concerning this communication or earlier communications from the examiner should be directed to VIKKRAM BALI whose telephone number is (571)272-7415. The examiner can normally be reached Monday-Friday 7:00AM-3:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Claire Wang, can be reached at 571-270-1051. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/VIKKRAM BALI/Primary Examiner, Art Unit 2663