DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):

(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claim 9 is rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.  This claim has been amended to include the term “the same housing”; however, there is insufficient antecedent basis for this term in the claim, as no housing is recited elsewhere in the claims.  As such, for purposes of compact prosecution, Examiner is interpreting “the same housing” to instead be “a shared housing”.  Appropriate correction is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

The claimed invention within Claims 1-2, 4-10, 12-14, 16-17, and 19-30 (not Claims 3, 11, 15, and 18) is directed to an abstract idea judicial exception without significantly more.  Claims 1-2 and 4-10 recite a first method, Claims 12-14 recite a second method, Claims 16-17 and 19-26 recite a first apparatus, and Claims 27-30 recite a second apparatus.  These are each statutory categories.  However, these claims merely recite the judicial exception of an abstract idea, particularly, a mental process, which is not integrated into a practical application, nor do these claims include additional elements that would be sufficient to amount to significantly more than the judicial exception.
These claims recite the abstract idea of a mental process, involving receiving first and second data and then processing the received data to come up with new “results” data (i.e. first and second data feature extractions, data conversion/transformation into a common spatial domain, concatenating feature data into a common feature map, detecting one or more objects/object features within the common feature map, determining feature map values, determining changes or lack of changes within the data, estimating sizes of one or more detected objects, etc.), which can be done in the human mind.  Within these claims, the only claimed structures are (a) an on-board computer of a host vehicle, which is generic computing structure, (b) at least one processor of the on-board computer of the host vehicle (per Claims 16-30), which is also generic computing structure, (c) a camera sensor of the host vehicle, which is external to the on-board computer and describes insignificant pre-solution activity by merely defining where the first received data is coming from, (d) a radar sensor of the host vehicle, which is also external to the on-board computer and also describes insignificant pre-solution activity by merely defining where the second received data is coming from, and (e) a neural network (per Claims 13 and 28), which may also be external to the on-board computer and/or may be generic computing structure, and also describes insignificant post-solution activity by merely defining an intended location of where to provide one or more results of the processing steps made to the received first and second data.  Further, each of the generically claimed computing components is performing old and well-known generic computing functions.
It should be noted that Claims 12 and 27 also include the term “an encoder-decoder network” which could potentially be considered a structure (albeit one that would be considered similar to the above mentioned “neural network”); however, its use in these claims precludes this interpretation because it is being used in the limitations as though it is a data transformation step rather than a structure (“apply/-ing an encoder-decoder network on the first camera frame to generate a first camera feature map in a spatial domain of the first radar frame”).
Under broadest reasonable interpretation, using the human mind to achieve these same functions covers performance of the limitations in the mind but for the recitation of generic computer components.  That is, other than reciting generic computing components, nothing in the claim elements precludes the functions described in these claims from practically being performed in the mind.  For example, but for the generic computing components language, receiving data in the context of these claims encompasses a user visually and mentally receiving data or recalling data from memory, perhaps based on previously seeing a first photograph/map of surroundings as taken with a camera sensor from a first perspective (i.e. first received data) as well as a second photograph/map of surroundings as taken with a radar sensor from a second perspective (i.e. second received data).  Similarly, because these claim limitations do not require the processing steps to have any particular level of accuracy or precision, nothing in the claim elements precludes these functions from practically being performed in the mind as well (i.e. the user’s data processing may be completely wrong, inaccurate, and/or inappropriate).  Further, saving any received data, or saving new data (i.e. results data) based on that data processing of the received data, in the context of what may later be included into these claims, would merely encompass a user committing all or a portion of the received data and/or the outcome of the processing steps to their memory/mind/brain (even if that saved received/new data is essentially a “decision” that may lead to a practical application at a later point in time).  If a claim limitation under its broadest reasonable interpretation covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas.  Accordingly, these claims recite an abstract idea.
In accordance with the April 2018 memo “Changes in Examination Procedure Pertaining to Subject Matter Eligibility, Recent Subject Matter Eligibility Decision (Berkheimer v. HP, Inc.)”, Part III. A., in a step 2B analysis, an additional element (or combination of elements) is not well-understood, routine or conventional unless the examiner finds, and expressly supports a rejection in writing with, one or more of the following: 1. A citation to an express statement in the specification or to a statement made by an applicant during prosecution that demonstrates the well-understood, routine, conventional nature of the additional element(s); 2. A citation to one or more of the court decisions discussed in MPEP § 2106.05(d)(II) as noting the well-understood, routine, conventional nature of the additional element(s); 3. A citation to a publication that demonstrates the well-understood, routine, conventional nature of the additional element(s); 4. A statement that the examiner is taking Official Notice of the well-understood, routine, conventional nature of the additional element(s).
The abstract idea of a mental process done based on data determined by the person doing the mental process has previously been found to be ineligible under 35 USC 101 per the very similar claim concepts found in the precedential decision relating to Smart Systems Innovations.  As far as the claimed data receiving limitations which occur prior to the processing steps, generic data-gathering elements are similar to concepts that have also been identified as abstract by the courts, such as obtaining and comparing intangible data in Cybersource.
At best, these limitations can be performed by a generically recited and/or general purpose “on-board computer” computing device that has been pre-loaded with the data necessary for the subsequent data processing.  Even if all or a portion of the received data utilized for the data processing steps comes from one or more computing devices different than the one or more computing devices completing the data processing steps (in other words, if the camera sensor and/or radar sensor are components of a different computer than the “on-board computer”), then these limitations still involve no more than a plurality of generic computing devices in communication with each other.  Mere data communication steps that can be performed between any number of generic computing devices have also been previously identified by the courts as an abstract idea (i.e. a judicial exception): (A) Receiving and/or transmitting data is considered to be well-understood, routine, or conventional at least as evidenced by MPEP § 2106.05(d)(II)(i) "Receiving or transmitting data over a network", and (iv) "Storing and retrieving information in memory", and (B) Comparing the received data to other data is considered to be well-understood, routine or conventional at least as evidenced by MPEP § 2106.05(d)(II)(ii) "Performing repetitive calculations".  Furthermore, the Office takes Official Notice of the fact that (a) the only claimed structures as previously mentioned are either well-known structures of any generic general purpose computer or else describe insignificant pre- or post-solution activity, and (b) the claimed functions to be executed by, and/or the capabilities of, these well-known structures of any generic general purpose computer, are also well-understood, routine, and conventional computing functions previously known to the industry, as per the above-cited court decisions and MPEP sections.
It is undeniably old and well known for computers (and human minds) to receive data, process that received data, and then potentially generate new data (results data) (potentially for providing to other components of the computer or to other computers) based on the outcome/-s of the data processing steps.  These functionalities are all standard functionalities that computers and minds are known to be capable of.  All other claim limitations within these claims describe functionalities that are internal to the generic computing components (or human mind) and nothing received, processed, detected/determined/transformed/generated/provided/etc. is definitively utilized to serve any practical purpose.  Thus, within these claims, there are no elements that integrate the judicial exception into a practical application.  It should be noted that dependent Claims 10 and 14 each include a limitation that requires “performing an autonomous driving operation based on detecting the one or more objects”, and dependent Claims 25 and 29 each include a limitation that requires “trigger an autonomous driving operation based on detecting the one or more objects”; however, many autonomous driving operations already include functions similar to those described herein (such as receiving data, processing data, analyzing data, creating new data), so under BRI it is entirely reasonable for these “autonomous driving operations” to still be generic computing functionalities that a computer (or a human mind) is capable of (for example, the operation could be a computer-based internal-only data-manipulation), and they are thus not necessarily describing a practical application regardless of whether they use the term “performing” versus “trigger”.
Finally, these particular claims do not recite an improvement to another technology or technical field, an improvement to the functioning of the computer/-s itself/themselves, or meaningful limitations beyond generally linking the use of an abstract idea to a particular technological environment, that is, implementation via computer on-board a host/autonomous vehicle.  The limitations are no more than a field of use, or a field of use that also involves insignificant extra-solution activity (for example, pre-solution activity such as receiving data prior to data processing, or post-solution activity such as generating/providing result data based on the outputs from the data processing steps).  Based on this, when viewed as a whole, there are no claim elements in these particular claims that provide meaningful limitations to transform the abstract idea into a patent eligible application of the abstract idea such that the claims amount to significantly more than the abstract idea itself.  Therefore, Claims 1-2, 4-10, 12-14, 16-17, and 19-30 are rejected under 35 U.S.C. 101 as being directed to non-statutory subject matter.  Corrective actions are required.
It should be noted that Claims 3, 11, 15, and 18 are specifically not included in this rejection because: (a) Claims 11 and 15 clearly describe a practical application by further defining the “performing an autonomous driving operation” to specifically be (performing) “one or more of braking, accelerating, steering, adjusting a cruise control setting, or signaling”, and (b) Claims 3 and 18 specifically require “performing an explicit inverse perspective mapping transformation on the first camera feature map”, which is being considered too computationally complex to be reasonably performed purely in the human mind.  Additionally, Claims 26 and 30 were previously excluded from this rejection as they were similar to Claims 11 and 15; however, Claim 25 (which Claim 26 is dependent upon) and Claim 29 (which Claim 30 is dependent upon), were amended to change “perform an autonomous driving operation” to instead be “trigger an autonomous operation” (unlike Claims 11 and 15), and “triggering” alone is not “performing” as it may require external influence and/or additional unclaimed steps to actually get to the performance of said autonomous operation (i.e. the “triggering” could just be the outputting of a signal relating to an autonomous operation that would then lead to the performance of that autonomous operation only if that signal was later received and executed by an external structure in a future step).  As such, regardless of the autonomous operation being further defined in Claims 26 and 30, neither of these claims necessarily constitutes a practical application anymore due to the change of “perform” to “trigger” in Claims 25 and 29, and that is why they are not included along with Claims 11 and 15 as exceptions to this 35 USC 101 rejection as mentioned at the top of this paragraph.
Claim Rejections - 35 USC § 102/103
Firstly, this application currently names joint inventors.  In considering patentability of the claims the Examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Secondly, the following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Thirdly, the following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
A) Claims 1, 3-7, 10-11, 16, 18-22, and 24-26 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Rust (US 2018/0341263, filed 25 May 17).
B) Claims 2, 8-9, 12-15, 17, 23, and 27-30 are rejected under 35 U.S.C. 103 as being obvious over Rust.
C) Additionally/alternatively, Claims 13 and 28 are rejected under 35 U.S.C. 103 as being obvious over Rust, further in view of Ozdemir et al. (US 2019/0122073, filed 23 Oct 17), herein “Ozdemir”.
Regarding Claims 1, 12, 16, and 27 (all independent) and Claims 2 and 17 (dependent), Rust discloses:
a method of fusing camera and radar frames to perform object detection in one or more spatial domains performed by an on-board computer of a host vehicle, comprising: (per Claims 1 and 12) / An on-board computer of a host vehicle, comprising: at least one processor configured to: (per Claims 16 and 27) (“system 100 provides for low level processing of three-dimensional images of surroundings of the vehicle 10, in the form of point clouds, to determine velocity of surrounding objects for use in controlling the vehicle 10”, Paragraph 32, “vehicle 10 is an autonomous vehicle and the system 100, and/or components thereof, are incorporated into the autonomous vehicle 10 (hereinafter referred to as the autonomous vehicle 10)”, Paragraph 34, “autonomous vehicle 10 corresponds to a level four or level five automation system under the Society of Automotive Engineers (SAE) “J3016” standard taxonomy of automated driving levels”, Paragraph 35, “sensing devices 40a-40n might include, but are not limited to, radars, lidars, global positioning systems, optical cameras, thermal cameras, ultrasonic sensors, and/or other sensors. The actuator system 30 includes one or more actuator devices 42a-42n that control one or more vehicle features such as, but not limited to, the propulsion system 20, the transmission system 22, the steering system 24, and the brake system 26”, Paragraph 39, “controller 34 includes at least one processor 44 and a computer-readable storage device or media 46”, Paragraph 41, “the autonomous driving system 70 can include a sensor fusion system 74, a positioning system 76, a guidance system 78, and a vehicle control system 80… the sensor fusion system 74 synthesizes and processes sensor data and predicts the presence, location, classification, and/or path of objects and features of the environment of the vehicle 10. 
In various embodiments, the sensor fusion system 74 can incorporate information from multiple sensors, including but not limited to cameras, lidars, radars, and/or any number of other types of sensors”, Paragraphs 53-54, “Referring now to FIG. 5, and with continued reference to FIGS. 1-4, a flowchart illustrates a control method 500 that can be performed by the systems 100 of FIG. 1 in accordance with the present disclosure, which includes the autonomous driving system 70 of FIG. 3 and the point cloud processing system 400 of FIG. 5”, Paragraph 81)
receive/-ing, from a camera sensor of the host vehicle, a plurality of camera frames (per Claims 1, 12, 16, and 27); receive/-ing, from a radar sensor of the host vehicle, a plurality of radar frames (per Claims 1, 12, 16, and 27); perform/-ing a camera feature extraction process on a first camera frame of the plurality of camera frames to generate a first camera feature map; (per Claims 1 and 16); perform/-ing a radar feature extraction process on a first radar frame of the plurality of radar frames to generate a first radar feature map; (per Claims 1 and 16) (“sensor system 28 obtains three-dimensional imaging data 402 of surroundings of the vehicle 10 through sensor devices 42n. The three-dimensional imaging data 402 may be obtained from a lidar device, radar device or other range finding device 42n or an optical camera. The three-dimensional data 402 may be divided into frames, whereby the sensor device 42n captures time spaced images of each scene surrounding the vehicle 10. The scene surrounding the vehicle 10 include objects (not shown) that can be identified and, through processing methods described herein with reference to FIGS. 4 and 5, velocity of the objects can be determined. The three dimensional imaging data 402 may be obtained through the sensor fusion system 74”, Paragraph 60)
convert/-ing the first camera feature map, the first radar feature map, or both to a common spatial domain (per Claims 1 and 16); concatenate/-ing the first radar feature map and the first camera feature map to generate a first concatenated feature map in the common spatial domain; and (per Claims 1 and 16) / apply/-ing an encoder-decoder network (i.e. interpreted under BRI to be the point cloud processing system 400 as discussed below) on the first camera frame to generate a first camera feature map in a spatial domain of the first radar frame (per Claims 12 and 27) (see obvious design choice discussion below); combine/-ing the first radar frame and the first camera feature map to generate a first combined feature map in the spatial domain of the first radar frame (see obvious design choice discussion below); and (per Claims 12 and 27) (“The point cloud processing system 400 includes a static scene alignment module 414 configured to use the segmented static data points 411a,b to position align the first point cloud 410a with at least one second point cloud 410b, which is/are less recent point clouds. That is, static features, as identified by the static data points 411a,b, in the first point cloud 410a are aligned with corresponding static features in the second point cloud 410b. The static scene alignment module 414 is configured to output position aligned first and second point clouds 416 that have moving data points 413a from the first point cloud 410a relatively shifted as compared to the moving data points 413b from the second point cloud 410b in a common static frame of reference provided by the position aligned first and second point clouds 416. 
The relatively moving data points 413a, b form clusters of moving data points 413a from the first point cloud 410a that are shifted in position relative to the clusters of moving data points 413b from the second point cloud when viewed in a static frame of reference provided by the position aligned first and second point clouds 416”, Paragraph 64)
detect/-ing one or more objects in the first concatenated feature map (per Claims 1 and 16) / detect/-ing one or more objects in the first combined feature map (per Claims 12 and 27) (“the sensor fusion system 74 synthesizes and processes sensor data and predicts the presence, location, classification, and/or path of objects and features of the environment of the vehicle 10”, Paragraph 54),
wherein the common spatial domain is a spatial domain of the radar sensor (per Claims 2 and 17, dependent upon independent Claims 1 and 16, respectively) (see obvious design choice discussion below).
While Rust mentions using one cluster of data points from a first point cloud to match with another cluster of data points from a second point cloud (“the method includes identifying a cluster of data points in the first and second time spaced point clouds corresponding to the moving object. The method may comprise matching an identified cluster of data points in the point cloud with an identified cluster of data points in the second point cloud. The matching step may comprise determine a spatial transformation from a cluster of data points in the first point cloud with a cluster of data points in the second point cloud”, Paragraph 8), Rust does not specify that the first or second point clouds refer to any particular sensors, such as the radar sensor and/or the camera sensor (“The sensor may be a lidar sensor or other range finding sensor such as radar. The sensor may also be an optical camera”, Paragraph 22, “sensing devices 40a-40n might include, but are not limited to, radars, lidars, global positioning systems, optical cameras, thermal cameras, ultrasonic sensors, and/or other sensors”, Paragraph 39), and thus Rust remains silent in that the common static frame of reference (i.e. common spatial domain) is specifically (a) a spatial domain of the radar sensor, as opposed to (b) a spatial domain of the camera sensor, or (c) a common spatial domain different from both a spatial domain of the radar sensor and a spatial domain of the camera sensor.  
Relating to (c), since the sensor fusion system may have other sensors besides just the camera sensor and the radar sensor, it may thus be preferable to use a common spatial domain that is a spatial domain of another sensor (as obvious examples, one of ordinary skill in the art at the time of filing may prefer to try using a spatial domain of whichever sensor is closest to the center/middle of all the sensors, or may prefer using a spatial domain of whichever sensor collects the most data, for the sake of trying to minimize processing requirements to localize all the sensory data together).  Additionally relating to (c), it may alternatively be preferable to use a common spatial domain that is some sort of average of all the sensors utilized in the fusion but not one in particular (although this may increase processing efforts because all sensor data would need to be transformed rather than eliminating one sensor’s data from needing to be transformed by choosing that sensor’s spatial domain as the common spatial domain).  Therefore, it appears to only be an obvious matter of design choice (and/or an obvious-to-try option out of a finite number of reasonable possibilities) as to which common spatial domain to utilize, and if it coincides with one of the sensors’ spatial domains, which sensor to utilize for that purpose.  In support of this obviousness rationale, the Applicant’s specification does not appear to particularly point out any unexpected result or particular advantage by specifically having the common spatial domain be a spatial domain of the radar sensor (as opposed to all other obvious-to-try options out of a finite number of reasonable alternatives for the common spatial domain, based on all of the sensors involved in the sensor fusion process and how processing-intensive the data transformations may be for each one).
As such, it would have been obvious to one of ordinary skill in the art at the time of filing to have modified the disclosure of Rust to specifically use a spatial domain of the radar sensor as the common spatial domain, as is merely one obvious-to-try option out of a finite number of reasonable options (and merely a matter of obvious design choice), in order to eliminate at least one set of sensory data (in this case, the radar sensor’s data) from having to be transformed in order to merge with the other sensory data (in this case, the camera sensor’s data).
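For illustration only, the processing trade-off described above can be sketched in Python. The sketch assumes a small grid of per-cell feature lists and a hypothetical lookup table from radar cells to camera cells; none of the names below come from Rust or from the Applicant's specification. It shows that, when the radar sensor's domain is chosen as the common spatial domain, only the camera features are resampled before concatenation and the radar features are used as received.

```python
# Illustrative sketch only; all names and the grid layout are hypothetical.

def resample_to_radar_grid(camera_map, lookup, rows, cols, channels):
    """Resample camera features onto the radar grid.

    lookup maps a radar cell (row, col) to the camera cell that covers it;
    radar cells with no camera coverage keep zero-valued features.
    """
    out = [[[0.0] * channels for _ in range(cols)] for _ in range(rows)]
    for (row, col), (cam_row, cam_col) in lookup.items():
        out[row][col] = list(camera_map[cam_row][cam_col])
    return out


def concatenate_feature_maps(radar_map, camera_map_in_radar_domain):
    """Concatenate the per-cell channel lists of two maps on the same grid."""
    if len(radar_map) != len(camera_map_in_radar_domain):
        raise ValueError("maps must have the same number of rows")
    fused = []
    for radar_row, camera_row in zip(radar_map, camera_map_in_radar_domain):
        if len(radar_row) != len(camera_row):
            raise ValueError("maps must have the same number of columns")
        fused.append([r + c for r, c in zip(radar_row, camera_row)])
    return fused


# Radar map: 2x2 grid, 1 channel per cell (already in the common domain).
radar_map = [[[0.9], [0.1]], [[0.0], [0.4]]]
# Camera map: 1x2 grid, 2 channels per cell (must be resampled).
camera_map = [[[0.2, 0.7], [0.5, 0.5]]]
lookup = {(0, 0): (0, 0), (0, 1): (0, 1), (1, 0): (0, 0)}

camera_in_radar = resample_to_radar_grid(camera_map, lookup, 2, 2, 2)
fused = concatenate_feature_maps(radar_map, camera_in_radar)
```

Choosing a third domain instead would require a second resampling call for the radar map, which is the additional processing effort noted in the discussion of option (c).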
Regarding Claims 3 and 18, Rust discloses the method of Claim 1 and the on-board computer of Claim 16, respectively, and Rust further discloses that converting the first camera feature map, the first radar feature map, or both to the common spatial domain comprises converting the first camera feature map to the common spatial domain, and converting the first camera feature map to the common spatial domain comprises performing an explicit inverse perspective mapping transformation on the first camera feature map (“the matching step may comprise determine a spatial transformation from a cluster of data points in the first point cloud with a cluster of data points in the second point cloud”, Paragraph 8, “an iterative closest point calculation to determine a spatial transformation between clusters of moving data points…The method may comprise determining a spatial transformation the mesh in the first and second position aligned point clouds, thereby allowing distance moved to be determined”, Paragraph 10, “The matching process may derive a spatial transformation between the moving objects, which are represented by clusters of moving data points”, Paragraph 15, “The spatial transformation may be determined using an iterative closest point or by using a mesh matching algorithm. The mesh matching algorithm can include a step of generating a mesh based on a cluster of moving data points in each of the first and second point clouds. The distance moved as described above can be determined based on a distance between the meshes in the position aligned first and second point clouds”, Paragraph 21, “The matching or registration process comprises, in various embodiments, a mesh matching algorithm. An exemplary mesh matching algorithm converts the first and second clusters of moving data points 413a,b into search and template polygonal or triangular mesh models representing a topology of the clusters of moving data points 413a,b. 
Each mesh surface of the template is matched to a search surface and transformation parameters therebetween are determined”, Paragraph 73).
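For reference, one common textbook form of an explicit inverse perspective mapping can be sketched as follows. This is an assumption-laden illustration, not the transformation disclosed by Rust or recited in the claims: it assumes a pinhole camera with a horizontal optical axis mounted at a known height over flat ground, and the function and parameter names are hypothetical.

```python
# Illustrative sketch only; assumes a pinhole camera with a horizontal
# optical axis over flat ground. All names are hypothetical.

def inverse_perspective_map(u, v, focal_px, cx, cy, camera_height_m):
    """Map image pixel (u, v) to ground-plane (lateral x, forward z) in meters.

    (cx, cy) is the principal point in pixels. Pixels at or above the
    horizon row (v <= cy) do not intersect the ground and return None.
    """
    dv = v - cy
    if dv <= 0:
        return None
    # Similar triangles: dv / focal_px = camera_height_m / z
    z = focal_px * camera_height_m / dv
    x = (u - cx) * z / focal_px
    return (x, z)


# A pixel 150 rows below the horizon and 100 columns right of center.
point = inverse_perspective_map(740, 510, focal_px=1000.0, cx=640.0,
                                cy=360.0, camera_height_m=1.5)
```

Applying this mapping to every cell of a camera feature map places the camera features on a ground-plane (bird's eye view) grid, which is the kind of domain in which range-finding data is typically represented.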
Regarding Claims 4 and 19, Rust discloses the method of Claim 1 and the on-board computer of Claim 16, respectively, and Rust further discloses that converting the first camera feature map, the first radar feature map, or both to the common spatial domain comprises converting the first camera feature map to the common spatial domain, and converting the first camera feature map to the common spatial domain occurs during performing the camera feature extraction process (“The scene surrounding the vehicle 10 include objects (not shown) that can be identified and, through processing methods described herein with reference to FIGS. 4 and 5, velocity of the objects can be determined. The three dimensional imaging data 402 may be obtained through the sensor fusion system 74”, Paragraph 60, “The present disclosure proposes methods and system that allow object velocities to be determined from the point cloud data 406, without necessarily requiring object classification and high level processing through, for example, positioning system 70, which generally involves determining and tracking bounding boxes amongst other high level processing operation. The present disclosure allows for processing efficient determining of velocity of objects based on frames of point cloud data 406. Referring now to FIG. 5, and with continued reference to FIGS. 1-4, a flowchart illustrates a control method 500 that can be performed by the systems 100 of FIG. 1 in accordance with the present disclosure, which includes the autonomous driving system 70 of FIG. 3 and the point cloud processing system 400 of FIG. 5. As can be appreciated in light of the disclosure, the order of operation within the method is not limited to the sequential execution as illustrated in FIG. 5, but may be performed in one or more varying orders as applicable and in accordance with the present disclosure. 
In various embodiments, the method 400 can be scheduled to run based on one or more predetermined events, and/or can run continuously during operation of the autonomous vehicle 10”, Paragraphs 80-81).
Regarding Claims 5 and 20, Rust discloses the method of Claim 1 and the on-board computer of Claim 16, respectively, and Rust further discloses:
hash/-ing a plurality of blocks of the first camera frame to identify one or more blocks that have not changed between a previous camera frame of the plurality of camera frames and the first camera frame; and copy/-ing feature map values of a second camera feature map of the previous camera frame to corresponding feature map values of the first feature map (“the static scene alignment module 414 is configured to register the point clouds 416 based on visual odometry methods, in which two frames are compared and the difference between them is minimized. Such methods are able to remove error from inertial sensors, as well as to build a high resolution local map”, Paragraph 67, “Clusters of moving data points 413a,b that correspond to the same object in real space are identified in the matching step 508. A registration or matching algorithm is run in step 508 to derive a spatial transformation from a reference cluster of moving data points 413a to a target cluster of moving data points 413b. The registration or matching algorithm may be an iterative closest point algorithm or a mesh matching algorithm in exemplary embodiments. The matching step 508 is carried out through the object matching module 418 and produces transformation data 420. The method 500 includes a step 510 of determining distance moved d.sub.1 . . . d.sub.n of each cluster identified as being corresponding in step 508. In particular, the transformation data 420 provides a spatial relationship between clusters of moving data points 413a,413b that have moved in the position aligned static scenes 416 constituting a static frame of reference. Such a spatial relationship allows a distance parameter d.sub.1 . . . d.sub.n to be derived in scalar or vector form. The step 510 of determining distance moved d.sub.1 . . . d.sub.n is carried out through the distance module 422”, Paragraphs 85-86).
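For illustration only, the limitation recited in Claims 5 and 20 (hashing blocks of the current camera frame, identifying unchanged blocks, and copying the corresponding feature map values from the previous frame) can be sketched as follows. This is a minimal sketch and is not taken from Rust or from Applicant's specification. The block size, the use of SHA-256, the one-feature-value-per-block layout, and the names `tiles`, `hash_tile`, `extract_all`, and `update_features` are all assumptions made for this example.

```python
import hashlib


def tiles(frame, block):
    """Yield ((block_row, block_col), tile) for each block x block tile of a 2D frame."""
    for r in range(0, len(frame), block):
        for c in range(0, len(frame[0]), block):
            tile = [row[c:c + block] for row in frame[r:r + block]]
            yield (r // block, c // block), tile


def hash_tile(tile):
    """Hash the pixel values (assumed to be integers 0-255) of one tile."""
    return hashlib.sha256(bytes(v for row in tile for v in row)).hexdigest()


def extract_all(frame, extract, block):
    """Run the (hypothetical) per-block feature extractor on every tile."""
    return {key: extract(tile) for key, tile in tiles(frame, block)}


def update_features(prev_frame, curr_frame, prev_features, extract, block):
    """Build the current feature map, copying values for blocks whose hash is unchanged."""
    prev_hashes = {key: hash_tile(tile) for key, tile in tiles(prev_frame, block)}
    features, reused = {}, []
    for key, tile in tiles(curr_frame, block):
        if prev_hashes.get(key) == hash_tile(tile):
            features[key] = prev_features[key]  # unchanged block: copy the old value
            reused.append(key)
        else:
            features[key] = extract(tile)       # changed block: recompute
    return features, reused


# Toy example: a 4x4 frame in which only the top-left pixel changes.
block_sum = lambda tile: sum(sum(row) for row in tile)  # stand-in for feature extraction
prev = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
curr = [row[:] for row in prev]
curr[0][0] = 99
features, reused = update_features(prev, curr, extract_all(prev, block_sum, 2), block_sum, 2)
```

In this toy example only the top-left block is recomputed; the other three blocks reuse the feature values of the previous frame.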
Regarding Claims 6 and 21, Rust discloses the method of Claim 1 and the on-board computer of Claim 16, respectively, and Rust further discloses estimate/-ing a width, length, or both of the one or more objects based on a bounding box in the first camera frame encapsulating each of the one or more objects (“A number of possibilities exist for identifying matching moving data points 413a, 413b. For example, groupings can be made based on similar rotational/translational velocities. Grouping can be based on similar shapes between recent point clouds 416. Grouping can be based on shapes in a predetermined obstacle set”, Paragraph 69, “The matching or registration process comprises, in various embodiments, a mesh matching algorithm. An exemplary mesh matching algorithm converts the first and second clusters of moving data points 413a,b into search and template polygonal or triangular mesh models representing a topology of the clusters of moving data points 413a,b. Each mesh surface of the template is matched to a search surface and transformation parameters therebetween are determined”, Paragraph 73).
Regarding Claims 7 and 22, Rust discloses the method of Claim 6 and the on-board computer of Claim 21, respectively, and Rust further discloses that the width, length, or both of the one or more objects is estimated based at least in part on a make, model, or both of the one or more objects (“Grouping can be based on similar shapes between recent point clouds 416. Grouping can be based on shapes in a predetermined obstacle set. For example, a member of this obstacle set could be a particular vehicle model which tend to look the same no matter where and when you see them”, Paragraph 69).
Regarding Claims 8 and 23, Rust discloses the method of Claim 1 and the on-board computer of Claim 16, respectively, but Rust remains silent regarding specifically perform/-ing the camera feature extraction process on a second camera frame of the plurality of camera frames to generate a second camera feature map; perform/-ing the radar feature extraction process on a second radar frame of the plurality of radar frames to generate a second radar feature map; convert/-ing the second camera feature map to the common spatial domain to generate a converted camera feature map, the second radar feature map to the common spatial domain to generate a converted radar feature map, or both; and concatenate/-ing the converted second radar feature map, the converted second camera feature map, or both to generate a second concatenated feature map, wherein detecting the one or more objects is further based on the second concatenated feature map.  However, this is merely the exact duplication of the previously cited steps of (per the prior art rejection of independent Claims 1 and 16 shown above): extracting features from a first camera frame of the plurality of camera frames to generate a first camera feature map, extracting features from a first radar frame of the plurality of radar frames to generate a first radar feature map, convert/-ing the first camera feature map and/or the first radar feature map to a common spatial domain; and concatenate/-ing the converted first radar feature map and the converted first camera feature map to generate a first concatenated feature map, wherein detecting the one or more objects is based on the first concatenated feature map.
As such, it is merely a matter of obvious design choice to duplicate an already known process (for example, repeating the process over time) in order to improve the accuracy of the feature extraction processes, especially when trying to detect one or more objects that may be moving over time or may be present in one or more feature maps at one point in time but missing in one or more feature maps at another point in time.
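For illustration only, the duplication discussed above (repeating the convert-and-concatenate steps for a second camera/radar frame pair) can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Rust's method or Applicant's implementation. Each feature map is assumed to already be converted to the common spatial domain and to be represented as a grid of per-cell channel lists; the names `concatenate_feature_maps` and `fuse_frames` are hypothetical.

```python
def concatenate_feature_maps(camera_map, radar_map):
    """Channel-wise concatenation of two feature maps that share one spatial grid.

    Each map is a list of rows; each cell is a list of channel values.
    """
    same_rows = len(camera_map) == len(radar_map)
    same_cols = all(len(a) == len(b) for a, b in zip(camera_map, radar_map))
    if not (same_rows and same_cols):
        raise ValueError("feature maps must share the same spatial grid")
    return [
        [cam_cell + radar_cell for cam_cell, radar_cell in zip(cam_row, radar_row)]
        for cam_row, radar_row in zip(camera_map, radar_map)
    ]


def fuse_frames(frame_pairs):
    """Repeat the same concatenation for every (camera, radar) frame pair in order."""
    return [concatenate_feature_maps(cam, radar) for cam, radar in frame_pairs]


# Two time steps on a 1x2 grid: 1 camera channel and 2 radar channels per cell.
first = ([[[1], [2]]], [[[10, 11], [20, 21]]])
second = ([[[3], [4]]], [[[30, 31], [40, 41]]])
fused = fuse_frames([first, second])
```

The second concatenated feature map is produced by exactly the same function call as the first, applied to the later frame pair.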
Regarding Claim 9, Rust discloses the method of Claim 1, and Rust further renders obvious that: the radar sensor and the camera sensor are collocated in a shared housing in the host vehicle (i.e. the shared housing may be considered the dotted line around sensor system 28 itself as shown in Fig. 1, which includes sensors 40a, 40b, 40n, or even the shared housing being the body of the vehicle itself; regardless, collocating multiple sensors, even of different types, is undeniably known in the art and it would be merely a matter of obvious design choice to do so, as Applicant’s own specification describes collocation and non-collocation as equal options and has not provided any reasoning for one option being advantageous over the other (“Although FIG. 1 illustrates an example in which the radar component and the camera component are collocated components in a shared housing, as will be appreciated, they may be separately housed in different locations within the vehicle 100”, Paragraph 29 of the Specification); “vehicle 10 is an autonomous vehicle and the system 100, and/or components thereof, are incorporated into the autonomous vehicle 10 (hereinafter referred to as the autonomous vehicle 10)”, Paragraph 34, “sensor system 28 includes one or more sensing devices 40a-40n that sense observable conditions of the exterior environment and/or the interior environment of the autonomous vehicle 10. The sensing devices 40a-40n might include, but are not limited to, radars, lidars, global positioning systems, optical cameras, thermal cameras, ultrasonic sensors, and/or other sensors”, Paragraph 39).
Regarding Claims 10, 14, 25, and 29, Rust discloses the method of Claim 1 (per Claim 10) and the on-board computer of Claim 16 (per Claim 25), and Rust renders obvious the method of Claim 12 (per Claim 14) and the on-board computer of Claim 27 (per Claim 29), and Rust further discloses: perform/-ing an autonomous driving operation based on detecting the one or more objects (per Claims 10 and 14) / trigger an autonomous driving operation based on detecting the one or more objects (per Claims 25 and 29) (“Exemplary uses of the velocity parameters by the autonomous driving system 70 include inference of the future motion of identified objects. Such inference may involve use of a Kalman filter that assumed a pre-determined movement model, or a generative model that has been trained on how similar looking obstacles have moved in the past e.g. pedestrians on this crosswalk tend to ignore the light etc. Based on the inferred future motion, the autonomous driving system 70 can generate one or more autonomous driving commands taking into account probable future motion of identified object”, Paragraph 78).  It should be further noted that regarding Claims 25 and 29, “triggering” an autonomous operation could merely be describing an intended use, as these claims do not have any limitations anywhere describing any one or more actuators that may later potentially execute said autonomous operation (assuming these one or more actuators actually successfully receive what was merely used to provide the “trigger”), and even if they did, these one or more actuators would certainly not be part of the claimed on-board computers of Claims 16 and 27.  As such, these particular claims merely mentioning an intended use of the detected one or more objects (i.e. to “trigger” an autonomous operation) do not appear to contain significant patentable weight as currently claimed.
Regarding Claims 11, 15, 26, and 30, Rust discloses the method of Claim 10 (per Claim 11) and the on-board computer of Claim 25 (per Claim 26), and Rust renders obvious the method of Claim 14 (per Claim 15) and the on-board computer of Claim 29 (per Claim 30), and Rust further discloses that the autonomous driving operation is one or more of braking, accelerating, steering, adjusting a cruise control setting, or signaling (“The actuator system 30 includes one or more actuator devices 42a-42n that control one or more vehicle features such as, but not limited to, the propulsion system 20, the transmission system 22, the steering system 24, and the brake system 26”, Paragraph 39).
Regarding Claims 13 and 28, Rust renders obvious the method of Claim 12 and the on-board computer of Claim 27, respectively, but Rust remains silent regarding provide/-ing the first combined feature map to a neural network.  However, “providing” said feature map is merely describing its intended use, as the neural network itself is not claimed to actually receive said feature map, is not claimed to be part of the claimed on-board computer of Claim 27, nor has any limitations anywhere describing what the neural network may utilize this feature map for (assuming it actually successfully received what was merely “provided”).  As such, these claims merely mentioning an intended target of the first combined feature map do not appear to contain significant patentable weight as currently written.  As such, these claims are considered to be insignificantly different from (or at best, obvious variants to) their respective independent claims.  However, in the case that Applicant traverses this obviousness “intended-use” argument, Ozdemir teaches these limitations precisely (“The system and method receives a signal based upon acquired data from a subject or object in the multi-dimensional space. It interprets a combination of information from the signal (e.g. signal intensity, signal phase, etc.) and confidence information (which quantifies uncertainty), and based thereon, performs at least one of detection and characterization of at least one property of interest related to the object or subject. Illustratively, the multi-dimensional space can be a 2D image (such as pixels) or a 3D spatial representation (such as voxels, multi-layer slices, etc.). The detection and/and characterization can include use of a learning algorithm (such as a convolutional neural network) trained based on the combination of information from the signal and confidence level information, and can optionally include evaluation by the learning algorithm that has been trained via (e.g.) the convolutional neural network”, Paragraph 9, “during prediction, the system and method passes the test slice through the neural network M”, Paragraph 48, “the system and method can apply to 2D and 3D data that is derived from automotive sensors and sensor arrays (for example as used in collision avoidance, self-parking and self-driving arrangements). Such sensors can include visible light cameras and pattern-recognition, LIDAR and/or RADAR and the resulting images are used by the automotive processor(s) to evaluate include obstacles to avoid, street signs to identify, traffic signals, road markings, and/or other driving hazards. More generally, the system and method herein is applicable where uncertainty information is fused temporally across multiple frames to refine previously computed confidence estimates. For example, if an object of interest is a pedestrian with high confidence in acquired 2D or 3D image frames 1-3 and 5-8 of a stream of acquired images (by any of the devices/imaging modalities described above), there is a high likelihood that the object is a person in frame 4 as well. Even if temporary occlusions or lighting changes in frame 4 (shadows, flying birds, camera glare, fog, haze, smoke, dust, etc.) cause uncertainty when frame 4 is evaluated in isolation. In general, the system and method can apply where the temporally fusing the confidence estimate is based on subject/object tracking (i.e. a moving object, a moving acquisition device (aircraft, satellite, watercraft, robot manipulator, conveyor, etc.) or both), rather than purely upon spatial location. The fusion of such confidence information can occur in a variety of ways”, Paragraph 59).  It would have been obvious to one of ordinary skill in the art at the time of filing to have further modified the disclosure of Rust with the teachings of Ozdemir in order to improve the performance, accuracy, learning, training, etc. of the method/system.
Regarding Claim 24, Rust discloses the on-board computer of Claim 16, and Rust further discloses that: the radar sensor and the camera sensor are collocated in the host vehicle (“vehicle 10 is an autonomous vehicle and the system 100, and/or components thereof, are incorporated into the autonomous vehicle 10 (hereinafter referred to as the autonomous vehicle 10)”, Paragraph 34, “sensor system 28 includes one or more sensing devices 40a-40n that sense observable conditions of the exterior environment and/or the interior environment of the autonomous vehicle 10. The sensing devices 40a-40n might include, but are not limited to, radars, lidars, global positioning systems, optical cameras, thermal cameras, ultrasonic sensors, and/or other sensors”, Paragraph 39).
Response to Arguments
Applicant’s arguments/remarks filed 4 April 2022 have been fully considered but they have only been found to be partially persuasive.
Firstly, with regards to Applicant’s remarks about the previously made 35 USC 101 rejections, Examiner has not been persuaded.  Applicant firstly submits that the claimed invention cannot be done in the human mind because a human, while able to view camera/radar frames on a display or printout, cannot receive camera/radar frames from a camera/radar sensor without the help of software/hardware of the vehicle.  Applicant secondly submits that the claimed invention provides an improvement to another technology or technical field because it brings radar and camera frames into the same scene so that existing object detection techniques are applicable, and that “the lower the level at which fusion is performed, the higher the subsequent computational cost. However, the accuracy can be much higher. As such, it would be beneficial to be able to fuse information from different sources at an early stage while reducing computation costs”.
In rebuttal to the aforementioned arguments, firstly, Examiner points out that the abstract idea itself is what can be a mental process, but the mere data gathering steps that occur prior to the abstract idea/mental process (i.e. camera/radar sensor data being sent to and received by generic computing processing components) only describe insignificant pre-solution activity, as described in the 35 USC 101 rejection.  As such, neither these sensors nor the alleged inability of a human mind to receive camera/radar frames from a camera/radar sensor without the help of software/hardware of the vehicle is sufficient to overcome this rejection.  Secondly, Examiner points out that the alleged improvements to another technology or technical field as per the Applicant’s remarks involve discussing a benefit to a timing of when sensor fusion occurs (i.e. early stage/lower level); however, these claims never once mention particular stages or levels as to when sensor fusion will occur, nor any timings.  As such, this argument is also non-persuasive, and the 35 USC 101 rejection has been substantially maintained (except where amendments to the claims necessitated a change to the rejection).
Secondly, with regards to Applicant’s remarks about the previously made 35 USC 112 rejections, stating that these should all be withdrawn due to being either improper or moot in view of the amendments made to the claims, Examiner is persuaded and thus these rejections have been withdrawn.  However, in view of the amendments made to the claims, a new rejection under 35 USC 112(b) has been made against Claim 9 for lack of antecedent basis in the term “the same housing”.
Thirdly, with regards to Applicant’s remarks about the previously made 35 USC 102/103 prior art rejections, Examiner is not persuaded.  Applicant asserts that Rust does not disclose or suggest at least the features of "concatenating the first radar feature map and the first camera feature map to generate a first concatenated feature map in the common spatial domain; and detecting one or more objects in the first concatenated feature map," as recited in independent Claim 1 and similarly claimed in independent Claims 12, 16, and 27.  Applicant further asserts that “while the sensor device 42n may be a radar or an optical camera, there is no disclosure or suggestion that the point clouds 410a,b are from different sensors.  Instead, they appear to be from the same sensor, just at different times.  That is, the method described in Rust with reference to Fig. 4 may be performed separately for different types of sensor devices 42n – the method is not performed on different types of frames from different types of sensor devices 42n in the same execution.”  The remainder of Applicant’s arguments all stem from this one premise, which Examiner respectfully disagrees with.
In rebuttal to Applicant’s prior art arguments, Examiner points out that Rust first brings up the sensing devices in Paragraph 39, which does not merely mention a single sensor 42n (which was merely an example sensor used for explaining the process of Fig. 4), but in fact mentions sensing devices 40a-40n: “The sensor system 28 includes one or more sensing devices 40a-40n that sense observable conditions of the exterior environment and/or the interior environment of the autonomous vehicle 10. The sensing devices 40a-40n might include, but are not limited to, radars, lidars, global positioning systems, optical cameras, thermal cameras, ultrasonic sensors, and/or other sensors”.  It should be further noted that ‘devices’ is plural, there’s an ‘a’ through ‘n’ rather than just an ‘n’, and the list of potential types of sensors says “not limited to” and follows with an ‘and/or’ rather than just an ‘or’.  Secondly, Rust’s Paragraph 54 states “the sensor fusion system 74 synthesizes and processes sensor data and predicts the presence, location, classification, and/or path of objects and features of the environment of the vehicle 10. In various embodiments, the sensor fusion system 74 can incorporate information from multiple sensors, including but not limited to cameras, lidars, radars, and/or any number of other types of sensors”.  It should be noted that again, ‘multiple sensors’ are mentioned, including both cameras and radars, and the list ends with an ‘and/or’ rather than just ‘or’.  As such, it is clear that Rust’s disclosure includes any number of any combination of these single type or multiple types of sensors, to include the required possibility of there being a first sensor as either a radar or a camera and a second sensor as the other of the radar or the camera.  Further, the process of Fig. 4 that the Applicant points to includes “The point cloud processing system 400 is configured to process frames of point clouds obtained through sensor system 28”, Paragraph 59, and from Rust’s Paragraph 39 it is known that the sensor system 28 may include sensing devices 40a-40n, which may comprise cameras and radars. Additionally, with regards to the process of Fig. 4 that Applicant points to, Rust’s Paragraph 60 states “The three dimensional imaging data 402 may be obtained through the sensor fusion system 74”, but additionally, Rust’s Paragraph 58 states “The systems and methods for determining velocity described further below with respect to FIGS. 4 and 5 is, in embodiments, incorporated in the positioning system 70 and may be included downstream of the sensor fusion system 74”, and further, Rust’s Paragraph 61 states “The three-dimensional data 402 obtained from the sensor system 28 may be preprocessed by the sensor fusion system 74, which may be included as part of a data receiving module 404 in order to generate point cloud data 406”.  In other words, the process of Fig. 4 may occur ‘downstream’ (i.e. after the data from the various sensors have already been processed through the sensor fusion system 74).  This further evidences the Examiner’s perspective that Rust discloses that the point clouds that are being merged into a common spatial domain may in fact come from different types of sensors (or alternatively, from multiple sensors of the same type, or alternatively still, from a single sensor of a single type).  Therefore, Applicant’s arguments have been found to be non-persuasive, and the 35 USC 102/103 prior art rejections have been substantially maintained (except where amendments to the claims necessitated a change to the rejection/-s).
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
The Examiner has cited particular paragraphs or columns and line numbers in the references applied to the claims above for the convenience of the Applicant.  Although the specified citations are representative of the teachings of the art and are applied to specific limitations within the individual claim, other passages and figures may apply as well.  Applicant is respectfully requested, in preparing responses, to fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.  See MPEP 2141.02 [R-07.2015] VI. A prior art reference must be considered in its entirety, i.e., as a whole, including portions that would lead away from the claimed invention.  W.L. Gore & Associates, Inc. v. Garlock, Inc., 721 F.2d 1540, 220 USPQ 303 (Fed. Cir. 1983), cert. denied, 469 U.S. 851 (1984).  See also MPEP §2123.
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure, and may be found on an accompanying PTO-892, when applicable.  When a PTO-892 exists, all cited references have either (a) been utilized in the above rejections for their specific teachings (wherein relevant teachings are cited to within the prior art rejections above in specific association with the limitation/-s that they disclose, teach, suggest, or render obvious), (b) have significant relevance to the application as a whole (analogous art), or (c) have significant relevance to one or more specific limitation/-s within the claims.  If a cited reference does not pre-date the effective filing date of the instant application, despite not being “prior” art, it still represents a current state of the art that may be found useful to the Applicant.  Currently, it is the Office’s belief that the reason/-s for why a particular reference has been included in any past or current PTO-892 is self-evident; however, if Applicant cannot determine why any one or more reference/-s has/have been included on a PTO-892, upon request from the Applicant, the Examiner can provide an explanation within a future Office action and/or during a future interview.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to THOMAS E WORDEN whose telephone number is 571-272-4876.  The examiner can normally be reached between 1000-1700hrs, M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool.  To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Geepy Pe, can be reached at 571-270-3703.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).  If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/THOMAS E WORDEN/Primary Examiner, Art Unit 3663