Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Breed et al. (U.S. Patent Publication No. 2018/0025632).
Regarding claim 1, Breed et al. teaches A method comprising: receiving a plurality of images, each image corresponding to a video frame captured by one or more sensors of a vehicle; (Abstract; See "Vehicle-mounted device includes an inertial measurement unit (IMU) (8) having at least one accelerometer or gyroscope, a GPS receiver (6), a camera (10) positioned to obtain unobstructed images of an area exterior of the vehicle (16) and a control system (20) coupled to these components. The control system (20) re-calibrates each accelerometer or gyroscope using signals obtained by the GPS receiver (6), and derives information about objects in the images obtained by the camera (10) and location of the objects based on data from the IMU (8) and GPS receiver (6). A communication system (18) communicates the information derived by the control system (20) to a location separate and apart from the vehicle (16). The control system (20) includes a processor that provides a location of the camera (10) and a direction in which the camera (10) is imaging based on data from the IMU corrected based on data from the GPS receiver (6), for use in creating the map database (12). (FIG. 2)") processing the plurality of images using a first neural network model configured to generate, for a first image of the plurality of images, a first feature map indicating, for each pixel of the first image, a feature vector corresponding to an intent associated with the pixel; (Par. 0046; See "When an image is acquired by camera 10, it can be subjected to a coding process and coded data entered into a pattern recognition algorithm such as a neural network in the ECU 20. In one preferred implementation, the pixels of each image from camera 10 are arranged into a vector and the pixels are scanned to locate edges of objects. 
When an edge is found by processing hardware and/or software in the ECU 20, the value of the data element in the vector which corresponds to the pixel can be set to indicate the angular orientation of the edge in the pixel. For example, a vertical edge can be assigned a 1 and a horizontal element an 8 and those at in between angles assigned numbers between 1 and 8 depending on the angle. If no edge is found, then the pixel data can be assigned a value of 0. When this vector is entered into a properly trained neural network, the network algorithm can output data indicating that a pole, tree, building, or other desired to-be-recognized object has been identified and provide the pixel locations of the object. This can be accomplished with high accuracy providing the neural network has been properly trained with sufficient examples of the objects sought to be identified. Development of the neural network is known to those skilled in the art with the understanding as found by the inventors that a large number of vectors may be needed to make up the training database for the neural network. In some cases, the number of vectors in the training database can approach or exceed one million. Only those objects which are clearly recognizable are chosen as fiduciaries." The vectors are used to determine the intent of the objects within the images. The combination of vectors identified by the neural network acts as a feature map, which is also described in Par. 0048; See “This is only one of a large number of such techniques where observed object properties exhibited in the pixels are used to form the neural network vectors. Others include color, texture, material properties, reflective properties etc. 
and this invention should not be limited to a particular method of forming the neural network vectors or the pixel properties chosen.”) using a second neural network model, identifying one or more objects within the first image based upon the first image and the first feature map of the first image; (Par. 0048; See “The above described neural network is based on using the edges of objects to form the vectors analyzed by the neural network in the ECU 20. This is only one of a large number of such techniques where observed object properties exhibited in the pixels are used to form the neural network vectors. Others include color, texture, material properties, reflective properties etc. and this invention should not be limited to a particular method of forming the neural network vectors or the pixel properties chosen.” The first and second neural networks are no different from one another. The neural network within Breed et al. performs the same functions by identifying pixels within the first image and using them to identify objects in the image.) determining, for each of the identified one or more objects, an overall intent of the object, based upon an aggregation of the feature vectors corresponding to pixels encompassed by the object; (Pars. 0047-0048; See "Once pixels which represent a pole, for example, have been identified, then one or more vectors can be derived extending from the camera in the direction of the pole based on the location and angle of the camera 10. When the pole is identified in two such images (from the same or different cameras 10) then the intersection of the vectors can be calculated and the pole location in space determined." & "The above described neural network is based on using the edges of objects to form the vectors analyzed by the neural network in the ECU 20. This is only one of a large number of such techniques where observed object properties exhibited in the pixels are used to form the neural network vectors. 
Others include color, texture, material properties, reflective properties etc. and this invention should not be limited to a particular method of forming the neural network vectors or the pixel properties chosen." While the literal word intent is not mentioned in reference to the objects, knowing the vectors with the properties of the neural network allows for understanding the intent of the objects. While poles are stationary, Breed et al. uses multiple images to locate the vectors of the objects in relation to a moving object. The neural network uses the vectors of the objects identified in the image to identify the intent of the object. Other properties may be identified as well as described in Par. 0048.) and generating one or more commands to control the vehicle based upon the determined overall intents of the one or more objects. (Par. 0053; See "In this regard, a display may be provided to the driver of the probe vehicle 16 indicating the maximum speed which is determined based on the number of fiduciaries in the images being obtained by the camera 10 on the probe vehicle 16. If the probe vehicle 16 is autonomous, then its speed may be limited by known control systems the number of fiduciaries in the images being obtained by camera 10. In the same manner, the highest speed of the probe vehicle 16 may be notified to the driver or limited by control systems based on the accuracy desired for the images obtained by the camera 10, i.e., on the illumination present and the properties of the imager, as a sort of feedback technique. Data about the time and accuracy of the processing of images from the camera 10 by the ECU 20 is thus used to control a driver display (not shown) to show the highest speed or to control the autonomous vehicle speed control system.")
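For illustration only, the edge-orientation coding quoted from Par. 0046 in the rejection of claim 1 (0 for no edge, 1 for a vertical edge, 8 for a horizontal edge, intermediate numbers for intermediate angles) can be sketched as follows. The function names and the linear spacing of the intermediate codes are assumptions made for this sketch; Breed et al. does not state how the in-between angles are mapped.

```python
from typing import List, Optional, Sequence


def edge_code(angle_from_vertical_deg: Optional[float]) -> int:
    """Return 0 for no edge, otherwise a code from 1 (vertical) to 8 (horizontal)."""
    if angle_from_vertical_deg is None:
        return 0  # no edge found in this pixel
    # An edge orientation is undirected, so fold the angle into [0, 90] degrees.
    a = abs(angle_from_vertical_deg) % 180.0
    if a > 90.0:
        a = 180.0 - a
    # Assumption: codes are spaced linearly between vertical (1) and horizontal (8).
    return 1 + int(round(7 * a / 90.0))


def image_to_vector(edge_angles: Sequence[Sequence[Optional[float]]]) -> List[int]:
    """Flatten a 2-D grid of per-pixel edge angles into one coded vector."""
    return [edge_code(a) for row in edge_angles for a in row]
```

Under these assumptions, a 2x2 grid with a vertical edge, a horizontal edge, and two edge-free pixels yields a four-element vector of codes that would then be fed to the trained network.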
Regarding claim 2, Breed et al. teaches The method of claim 1, wherein a feature vector associated with a pixel corresponds to a plurality of intents associated with the pixel, each intent associated with a predicted value representative of a statistical distribution of the intent and an uncertainty value associated with the predicted value. (Par. 0046; See "When an image is acquired by camera 10, it can be subjected to a coding process and coded data entered into a pattern recognition algorithm such as a neural network in the ECU 20. In one preferred implementation, the pixels of each image from camera 10 are arranged into a vector and the pixels are scanned to locate edges of objects. When an edge is found by processing hardware and/or software in the ECU 20, the value of the data element in the vector which corresponds to the pixel can be set to indicate the angular orientation of the edge in the pixel. For example, a vertical edge can be assigned a 1 and a horizontal element an 8 and those at in between angles assigned numbers between 1 and 8 depending on the angle. If no edge is found, then the pixel data can be assigned a value of 0. When this vector is entered into a properly trained neural network, the network algorithm can output data indicating that a pole, tree, building, or other desired to-be-recognized object has been identified and provide the pixel locations of the object. This can be accomplished with high accuracy providing the neural network has been properly trained with sufficient examples of the objects sought to be identified. Development of the neural network is known to those skilled in the art with the understanding as found by the inventors that a large number of vectors may be needed to make up the training database for the neural network. In some cases, the number of vectors in the training database can approach or exceed one million. Only those objects which are clearly recognizable are chosen as fiduciaries." & Par. 
0061; See "Two images of a particular fiduciary (taken from different locations) are necessary to establish an estimate of the location of the fiduciary. Such an estimate contains errors in, for example, the GPS determination of the location of the device each second for calibration, errors in the IMU determination of its location over and above the GPS errors, errors in the determination of the angle of the fiduciary as determined by the IMU and the camera pixels and errors due to the resolutions of all of these devices. When a third image is available, two additional estimates are available when image 1 is compared with image 3 and image 2 is also compared with image 3. The number of estimates E available can be determined by the formula E=n*(n-1)/2, wherein n is the number of images. Thus the number of estimates grows rapidly with the number of images. For example, if 10 images are available, 45 estimates of the position of the fiduciary can be used. Since the number of estimates increases rapidly with the number of images, convergence to any desired accuracy level is rapid. 100 images, for example, can provide almost 5000 such estimates.")
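The arithmetic in the quoted Par. 0061 can be checked with a short sketch. The formula E=n*(n-1)/2 is the number of distinct image pairs; the function names below are hypothetical and are introduced only for this check.

```python
from itertools import combinations


def num_estimates(n: int) -> int:
    """Number of pairwise position estimates from n images: E = n*(n-1)/2."""
    return n * (n - 1) // 2


def image_pairs(n: int):
    """Enumerate the image pairs (1-based indices) that each yield one estimate."""
    return list(combinations(range(1, n + 1), 2))


print(num_estimates(3))    # 3: the first pair plus the two additional pairs
print(num_estimates(10))   # 45
print(num_estimates(100))  # 4950, i.e. "almost 5000"
```

The enumeration for three images gives the pairs (1, 2), (1, 3), and (2, 3), which matches the statement that a third image adds two estimates to the original one.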
Regarding claim 3, Breed et al. teaches The method of claim 1, wherein the determined overall intents are representative of predicted actions to be performed by the one or more objects. (Par. 0047; See "Once pixels which represent a pole, for example, have been identified, then one or more vectors can be derived extending from the camera in the direction of the pole based on the location and angle of the camera 10. When the pole is identified in two such images (from the same or different cameras 10) then the intersection of the vectors can be calculated and the pole location in space determined.” The intent of the pole is determined from the relationship between the vectors of the pole within the multiple images. While the pole is stationary, this method would work similarly to moving objects as everything is moving within the images while the vehicle is moving.)
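As a hedged illustration of the passage quoted from Par. 0047, the intersection of two direction vectors extending from two camera positions can be computed as below. The sketch assumes a flat two-dimensional ground plane and error-free bearings, and the function name is hypothetical; Breed et al. does not specify the computation.

```python
from typing import Optional, Tuple

Vec2 = Tuple[float, float]


def intersect_rays(p1: Vec2, d1: Vec2, p2: Vec2, d2: Vec2) -> Optional[Vec2]:
    """Intersect the lines p1 + t*d1 and p2 + s*d2; return None if they are parallel."""
    # 2-D cross product of the two directions; zero means parallel lines.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-12:
        return None
    # Offset between the two camera positions.
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    # Parameter along the first line where the lines cross.
    t = (rx * d2[1] - ry * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])


# Two camera positions 10 units apart, each sighting the same pole.
print(intersect_rays((0.0, 0.0), (1.0, 1.0), (10.0, 0.0), (-1.0, 1.0)))  # (5.0, 5.0)
```

The result is a position estimate for a located object; nothing in this computation produces a prediction of a future action.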
Regarding claim 4, Breed et al. teaches The method of claim 1, wherein identifying the one or more objects using the second network model, further comprises: performing object recognition on the first image to generate a bounding box around each of the one or more objects in the first image, the bounding box encompassing a plurality of pixels representative of the object. (Par. 0046; See "When an image is acquired by camera 10, it can be subjected to a coding process and coded data entered into a pattern recognition algorithm such as a neural network in the ECU 20. In one preferred implementation, the pixels of each image from camera 10 are arranged into a vector and the pixels are scanned to locate edges of objects. When an edge is found by processing hardware and/or software in the ECU 20, the value of the data element in the vector which corresponds to the pixel can be set to indicate the angular orientation of the edge in the pixel. For example, a vertical edge can be assigned a 1 and a horizontal element an 8 and those at in between angles assigned numbers between 1 and 8 depending on the angle. If no edge is found, then the pixel data can be assigned a value of 0. When this vector is entered into a properly trained neural network, the network algorithm can output data indicating that a pole, tree, building, or other desired to-be-recognized object has been identified and provide the pixel locations of the object. This can be accomplished with high accuracy providing the neural network has been properly trained with sufficient examples of the objects sought to be identified. Development of the neural network is known to those skilled in the art with the understanding as found by the inventors that a large number of vectors may be needed to make up the training database for the neural network. 
In some cases, the number of vectors in the training database can approach or exceed one million. Only those objects which are clearly recognizable are chosen as fiduciaries.") Edges of the object are determined, thus creating an estimated boundary.
Regarding claim 5, Breed et al. teaches The method of claim 1, wherein the overall intent of the object is based on a relationship with one or more other objects in the first image. (Par. 0047; See "Once pixels which represent a pole, for example, have been identified, then one or more vectors can be derived extending from the camera in the direction of the pole based on the location and angle of the camera 10. When the pole is identified in two such images (from the same or different cameras 10) then the intersection of the vectors can be calculated and the pole location in space determined.” The intent of the pole is determined from the relationship between the vectors of the pole within the multiple images.)
Regarding claim 6, Breed et al. teaches The method of claim 5, wherein the relationship is based on relative positions of the object with the one or more other objects in the first image. (Par. 0046; See "When an image is acquired by camera 10, it can be subjected to a coding process and coded data entered into a pattern recognition algorithm such as a neural network in the ECU 20. In one preferred implementation, the pixels of each image from camera 10 are arranged into a vector and the pixels are scanned to locate edges of objects. When an edge is found by processing hardware and/or software in the ECU 20, the value of the data element in the vector which corresponds to the pixel can be set to indicate the angular orientation of the edge in the pixel. For example, a vertical edge can be assigned a 1 and a horizontal element an 8 and those at in between angles assigned numbers between 1 and 8 depending on the angle. If no edge is found, then the pixel data can be assigned a value of 0. When this vector is entered into a properly trained neural network, the network algorithm can output data indicating that a pole, tree, building, or other desired to-be-recognized object has been identified and provide the pixel locations of the object. This can be accomplished with high accuracy providing the neural network has been properly trained with sufficient examples of the objects sought to be identified. Development of the neural network is known to those skilled in the art with the understanding as found by the inventors that a large number of vectors may be needed to make up the training database for the neural network. In some cases, the number of vectors in the training database can approach or exceed one million. Only those objects which are clearly recognizable are chosen as fiduciaries." & Par. 0061; See "Two images of a particular fiduciary (taken from different locations) are necessary to establish an estimate of the location of the fiduciary. 
Such an estimate contains errors in, for example, the GPS determination of the location of the device each second for calibration, errors in the IMU determination of its location over and above the GPS errors, errors in the determination of the angle of the fiduciary as determined by the IMU and the camera pixels and errors due to the resolutions of all of these devices. When a third image is available, two additional estimates are available when image 1 is compared with image 3 and image 2 is also compared with image 3. The number of estimates E available can be determined by the formula E=n*(n-1)/2, wherein n is the number of images. Thus the number of estimates grows rapidly with the number of images. For example, if 10 images are available, 45 estimates of the position of the fiduciary can be used. Since the number of estimates increases rapidly with the number of images, convergence to any desired accuracy level is rapid. 100 images, for example, can provide almost 5000 such estimates.")
Regarding claim 7, Breed et al. teaches The method of claim 1, wherein the overall intent of the object is based on a presence or absence of other objects in the first image. (Par. 0006; See "In order to achieve a new and improved method and arrangement for creating maps of terrain surrounding and/or including roads, a method and system for mapping terrain including one or more roads in accordance with the invention includes a vehicle equipped with at least one camera, a position determining system that determines its position and an inertial measurement unit (IMU) that provides, when corrected by readings from the position determining system, the position and angular orientation of the camera(s) and IMU, all of which are in a set configuration relative to one another. A processor at a remote location apart from the vehicle receives data from the vehicle and converts information related to fiduciaries from the images from the camera(s) to a map including objects from the images by identifying common objects in multiple images, which may be obtained from the same or different vehicles, and using the position information and the inertial measurement information from when the multiple images were obtained and knowledge of the set configuration of the camera(s), the position determining system and the IMU. The information derived from the images, position information and inertial measurement information are transmitted to the processor by a communications unit on the vehicle. The position determining unit, the IMU, the camera and the communications unit may be combined into a single device for easy retrofit application to one or more vehicles.")
Regarding claim 8, Breed et al. teaches The method of claim 1, further comprising: generating, using the first neural network model, a second feature map for a second image that includes an object from the one or more objects within the first image, the second feature map indicating a feature vector for each pixel encompassed by the object within the second image. (Pars. 0046-0047; See "When an image is acquired by camera 10, it can be subjected to a coding process and coded data entered into a pattern recognition algorithm such as a neural network in the ECU 20. In one preferred implementation, the pixels of each image from camera 10 are arranged into a vector and the pixels are scanned to locate edges of objects. When an edge is found by processing hardware and/or software in the ECU 20, the value of the data element in the vector which corresponds to the pixel can be set to indicate the angular orientation of the edge in the pixel. For example, a vertical edge can be assigned a 1 and a horizontal element an 8 and those at in between angles assigned numbers between 1 and 8 depending on the angle. If no edge is found, then the pixel data can be assigned a value of 0. When this vector is entered into a properly trained neural network, the network algorithm can output data indicating that a pole, tree, building, or other desired to-be-recognized object has been identified and provide the pixel locations of the object. This can be accomplished with high accuracy providing the neural network has been properly trained with sufficient examples of the objects sought to be identified. Development of the neural network is known to those skilled in the art with the understanding as found by the inventors that a large number of vectors may be needed to make up the training database for the neural network. 
In some cases, the number of vectors in the training database can approach or exceed one million. Only those objects which are clearly recognizable are chosen as fiduciaries." & "Once pixels which represent a pole, for example, have been identified, then one or more vectors can be derived extending from the camera in the direction of the pole based on the location and angle of the camera 10. When the pole is identified in two such images (from the same or different cameras 10) then the intersection of the vectors can be calculated and the pole location in space determined.") Multiple images are received from one or more vehicles and are then analyzed using the neural networks.
Regarding claim 9, Breed et al. teaches The method of claim 7, further comprising: identifying, using the second neural network model, the object within the second image based upon the second image and the second feature map of the second image; and determining an updated overall intent of the object based upon the aggregation of the feature vectors corresponding to the pixels encompassed by the object within the second image. (Pars. 0046-0047; See "When an image is acquired by camera 10, it can be subjected to a coding process and coded data entered into a pattern recognition algorithm such as a neural network in the ECU 20. In one preferred implementation, the pixels of each image from camera 10 are arranged into a vector and the pixels are scanned to locate edges of objects. When an edge is found by processing hardware and/or software in the ECU 20, the value of the data element in the vector which corresponds to the pixel can be set to indicate the angular orientation of the edge in the pixel. For example, a vertical edge can be assigned a 1 and a horizontal element an 8 and those at in between angles assigned numbers between 1 and 8 depending on the angle. If no edge is found, then the pixel data can be assigned a value of 0. When this vector is entered into a properly trained neural network, the network algorithm can output data indicating that a pole, tree, building, or other desired to-be-recognized object has been identified and provide the pixel locations of the object. This can be accomplished with high accuracy providing the neural network has been properly trained with sufficient examples of the objects sought to be identified. Development of the neural network is known to those skilled in the art with the understanding as found by the inventors that a large number of vectors may be needed to make up the training database for the neural network. 
In some cases, the number of vectors in the training database can approach or exceed one million. Only those objects which are clearly recognizable are chosen as fiduciaries." & "Once pixels which represent a pole, for example, have been identified, then one or more vectors can be derived extending from the camera in the direction of the pole based on the location and angle of the camera 10. When the pole is identified in two such images (from the same or different cameras 10) then the intersection of the vectors can be calculated and the pole location in space determined." & Par. 0058; See "Image acquisition by a probe vehicle 16 can be controlled by a remote site (e.g., personnel at the remote station 14) on an as-needed basis. If the remote station 14 determines that more images would be useful, for example if it indicates a change or error in the map, it can send a command to the probe vehicle 16 to upload one or more images. In this manner, the roads can be continuously monitored for changes and the maps kept continuously accurate. Similarly, once the system is largely operational, a probe vehicle 16 can be constantly comparing what it sees using the camera 10 with its copy of the map in map database 12 and when it finds a discrepancy in the presence or location of a fiduciary found in the image from camera 10 relative to the contents of the map database 12, for example, it can notify the control site and together they can determine whether the probe vehicle's map needs updating or whether more images are needed indicating a change in the roadway or its surrounding terrain.") Multiple images are received from one or more vehicles and are then analyzed using the neural networks. The vehicle continuously updates surrounding sensor information using the multiple images after they are analyzed by the neural network.
Regarding claim 10, Breed et al. teaches A non-transitory computer readable medium storing instructions that when executed by a processor cause the processor to perform steps comprising: receiving a plurality of images, each image corresponding to a video frame captured by one or more sensors of a vehicle; (Abstract; See "Vehicle-mounted device includes an inertial measurement unit (IMU) (8) having at least one accelerometer or gyroscope, a GPS receiver (6), a camera (10) positioned to obtain unobstructed images of an area exterior of the vehicle (16) and a control system (20) coupled to these components. The control system (20) re-calibrates each accelerometer or gyroscope using signals obtained by the GPS receiver (6), and derives information about objects in the images obtained by the camera (10) and location of the objects based on data from the IMU (8) and GPS receiver (6). A communication system (18) communicates the information derived by the control system (20) to a location separate and apart from the vehicle (16). The control system (20) includes a processor that provides a location of the camera (10) and a direction in which the camera (10) is imaging based on data from the IMU corrected based on data from the GPS receiver (6), for use in creating the map database (12). (FIG. 2)") processing the plurality of images using a first neural network model configured to generate, for a first image of the plurality of images, a first feature map indicating, for each pixel of the first image, a feature vector corresponding to an intent associated with the pixel; (Par. 0046; See "When an image is acquired by camera 10, it can be subjected to a coding process and coded data entered into a pattern recognition algorithm such as a neural network in the ECU 20. In one preferred implementation, the pixels of each image from camera 10 are arranged into a vector and the pixels are scanned to locate edges of objects. 
When an edge is found by processing hardware and/or software in the ECU 20, the value of the data element in the vector which corresponds to the pixel can be set to indicate the angular orientation of the edge in the pixel. For example, a vertical edge can be assigned a 1 and a horizontal element an 8 and those at in between angles assigned numbers between 1 and 8 depending on the angle. If no edge is found, then the pixel data can be assigned a value of 0. When this vector is entered into a properly trained neural network, the network algorithm can output data indicating that a pole, tree, building, or other desired to-be-recognized object has been identified and provide the pixel locations of the object. This can be accomplished with high accuracy providing the neural network has been properly trained with sufficient examples of the objects sought to be identified. Development of the neural network is known to those skilled in the art with the understanding as found by the inventors that a large number of vectors may be needed to make up the training database for the neural network. In some cases, the number of vectors in the training database can approach or exceed one million. Only those objects which are clearly recognizable are chosen as fiduciaries." The vectors are used to determine the intent of the objects within the images. The combination of vectors identified by the neural network acts as a feature map, which is also described in Par. 0048; See “This is only one of a large number of such techniques where observed object properties exhibited in the pixels are used to form the neural network vectors. Others include color, texture, material properties, reflective properties etc. 
and this invention should not be limited to a particular method of forming the neural network vectors or the pixel properties chosen.”) using a second neural network model, identifying one or more objects within the first image based upon the first image and the first feature map of the first image; (Par. 0048; See “The above described neural network is based on using the edges of objects to form the vectors analyzed by the neural network in the ECU 20. This is only one of a large number of such techniques where observed object properties exhibited in the pixels are used to form the neural network vectors. Others include color, texture, material properties, reflective properties etc. and this invention should not be limited to a particular method of forming the neural network vectors or the pixel properties chosen.” The first and second neural networks are no different from one another. The neural network within Breed et al. performs the same functions by identifying pixels within the first image and using them to identify objects in the image.) determining, for each of the identified one or more objects, an overall intent of the object, based upon an aggregation of the feature vectors corresponding to pixels encompassed by the object; (Pars. 0047-0048; See "Once pixels which represent a pole, for example, have been identified, then one or more vectors can be derived extending from the camera in the direction of the pole based on the location and angle of the camera 10. When the pole is identified in two such images (from the same or different cameras 10) then the intersection of the vectors can be calculated and the pole location in space determined." & "The above described neural network is based on using the edges of objects to form the vectors analyzed by the neural network in the ECU 20. This is only one of a large number of such techniques where observed object properties exhibited in the pixels are used to form the neural network vectors. 
Others include color, texture, material properties, reflective properties etc. and this invention should not be limited to a particular method of forming the neural network vectors or the pixel properties chosen." While the literal word intent is not mentioned in reference to the objects, knowing the vectors with the properties of the neural network allows for understanding the intent of the objects. While poles are stationary, Breed et al. uses multiple images to locate the vectors of the objects in relation to a moving object. The neural network uses the vectors of the objects identified in the image to identify the intent of the object. Other properties may be identified as well as described in Par. 0048.) and generating one or more commands to control the vehicle based upon the determined overall intents of the one or more objects. (Par. 0053; See "In this regard, a display may be provided to the driver of the probe vehicle 16 indicating the maximum speed which is determined based on the number of fiduciaries in the images being obtained by the camera 10 on the probe vehicle 16. If the probe vehicle 16 is autonomous, then its speed may be limited by known control systems the number of fiduciaries in the images being obtained by camera 10. In the same manner, the highest speed of the probe vehicle 16 may be notified to the driver or limited by control systems based on the accuracy desired for the images obtained by the camera 10, i.e., on the illumination present and the properties of the imager, as a sort of feedback technique. Data about the time and accuracy of the processing of images from the camera 10 by the ECU 20 is thus used to control a driver display (not shown) to show the highest speed or to control the autonomous vehicle speed control system.")
Regarding claim 11, Breed et al. teaches The non-transitory computer readable medium of claim 10, wherein a feature vector associated with a pixel corresponds to a plurality of intents associated with the pixel, each intent associated with a predicted value representative of a statistical distribution of the intent and an uncertainty value associated with the predicted value. (Par. 0046; See "When an image is acquired by camera 10, it can be subjected to a coding process and coded data entered into a pattern recognition algorithm such as a neural network in the ECU 20. In one preferred implementation, the pixels of each image from camera 10 are arranged into a vector and the pixels are scanned to locate edges of objects. When an edge is found by processing hardware and/or software in the ECU 20, the value of the data element in the vector which corresponds to the pixel can be set to indicate the angular orientation of the edge in the pixel. For example, a vertical edge can be assigned a 1 and a horizontal element an 8 and those at in between angles assigned numbers between 1 and 8 depending on the angle. If no edge is found, then the pixel data can be assigned a value of 0. When this vector is entered into a properly trained neural network, the network algorithm can output data indicating that a pole, tree, building, or other desired to-be-recognized object has been identified and provide the pixel locations of the object. This can be accomplished with high accuracy providing the neural network has been properly trained with sufficient examples of the objects sought to be identified. Development of the neural network is known to those skilled in the art with the understanding as found by the inventors that a large number of vectors may be needed to make up the training database for the neural network. 
In some cases, the number of vectors in the training database can approach or exceed one million. Only those objects which are clearly recognizable are chosen as fiduciaries." & Par. 0061; See "Two images of a particular fiduciary (taken from different locations) are necessary to establish an estimate of the location of the fiduciary. Such an estimate contains errors in, for example, the GPS determination of the location of the device each second for calibration, errors in the IMU determination of its location over and above the GPS errors, errors in the determination of the angle of the fiduciary as determined by the IMU and the camera pixels and errors due to the resolutions of all of these devices. When a third image is available, two additional estimates are available when image 1 is compared with image 3 and image 2 is also compared with image 3. The number of estimates E available can be determined by the formula E=n*(n-1)/2, wherein n is the number of images. Thus the number of estimates grows rapidly with the number of images. For example, if 10 images are available, 45 estimates of the position of the fiduciary can be used. Since the number of estimates increases rapidly with the number of images, convergence to any desired accuracy level is rapid. 100 images, for example, can provide almost 5000 such estimates.")
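For illustration only: the estimate count in the passage quoted from Par. 0061 of Breed et al. is the number of distinct image pairs. The short Python sketch below reproduces that arithmetic. The function name pairwise_estimates is hypothetical and does not appear in the reference.

```python
def pairwise_estimates(n_images: int) -> int:
    """Count distinct image pairs, E = n * (n - 1) / 2 (formula quoted from Par. 0061)."""
    if n_images < 0:
        raise ValueError("n_images must be non-negative")
    # Integer division is exact because n * (n - 1) is always even.
    return n_images * (n_images - 1) // 2


if __name__ == "__main__":
    # Print the pair count for a few image counts, including those in the quote.
    for n in (2, 3, 10, 100):
        print(n, pairwise_estimates(n))
```

Under this sketch, 10 images give 45 pairs and 100 images give 4950 pairs, which agrees with the "45" and "almost 5000" figures in the quoted passage.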
Regarding claim 12, Breed et al. teaches The non-transitory computer readable medium of claim 10, wherein the determined overall intents are representative of predicted actions to be performed by the one or more objects. (Par. 0046; See "When an image is acquired by camera 10, it can be subjected to a coding process and coded data entered into a pattern recognition algorithm such as a neural network in the ECU 20. In one preferred implementation, the pixels of each image from camera 10 are arranged into a vector and the pixels are scanned to locate edges of objects. When an edge is found by processing hardware and/or software in the ECU 20, the value of the data element in the vector which corresponds to the pixel can be set to indicate the angular orientation of the edge in the pixel. For example, a vertical edge can be assigned a 1 and a horizontal element an 8 and those at in between angles assigned numbers between 1 and 8 depending on the angle. If no edge is found, then the pixel data can be assigned a value of 0. When this vector is entered into a properly trained neural network, the network algorithm can output data indicating that a pole, tree, building, or other desired to-be-recognized object has been identified and provide the pixel locations of the object. This can be accomplished with high accuracy providing the neural network has been properly trained with sufficient examples of the objects sought to be identified. Development of the neural network is known to those skilled in the art with the understanding as found by the inventors that a large number of vectors may be needed to make up the training database for the neural network. In some cases, the number of vectors in the training database can approach or exceed one million. Only those objects which are clearly recognizable are chosen as fiduciaries." & Par. 
0061; See "Two images of a particular fiduciary (taken from different locations) are necessary to establish an estimate of the location of the fiduciary. Such an estimate contains errors in, for example, the GPS determination of the location of the device each second for calibration, errors in the IMU determination of its location over and above the GPS errors, errors in the determination of the angle of the fiduciary as determined by the IMU and the camera pixels and errors due to the resolutions of all of these devices. When a third image is available, two additional estimates are available when image 1 is compared with image 3 and image 2 is also compared with image 3. The number of estimates E available can be determined by the formula E=n*(n-1)/2, wherein n is the number of images. Thus the number of estimates grows rapidly with the number of images. For example, if 10 images are available, 45 estimates of the position of the fiduciary can be used. Since the number of estimates increases rapidly with the number of images, convergence to any desired accuracy level is rapid. 100 images, for example, can provide almost 5000 such estimates.")
Regarding claim 13, Breed et al. teaches The non-transitory computer readable medium of claim 10, wherein identifying the one or more objects using the second network model, further comprises: performing object recognition on the first image to generate a bounding box around each of the one or more objects in the first image, the bounding box encompassing a plurality of pixels representative of the object. (Par. 0046; See "When an image is acquired by camera 10, it can be subjected to a coding process and coded data entered into a pattern recognition algorithm such as a neural network in the ECU 20. In one preferred implementation, the pixels of each image from camera 10 are arranged into a vector and the pixels are scanned to locate edges of objects. When an edge is found by processing hardware and/or software in the ECU 20, the value of the data element in the vector which corresponds to the pixel can be set to indicate the angular orientation of the edge in the pixel. For example, a vertical edge can be assigned a 1 and a horizontal element an 8 and those at in between angles assigned numbers between 1 and 8 depending on the angle. If no edge is found, then the pixel data can be assigned a value of 0. When this vector is entered into a properly trained neural network, the network algorithm can output data indicating that a pole, tree, building, or other desired to-be-recognized object has been identified and provide the pixel locations of the object. This can be accomplished with high accuracy providing the neural network has been properly trained with sufficient examples of the objects sought to be identified. Development of the neural network is known to those skilled in the art with the understanding as found by the inventors that a large number of vectors may be needed to make up the training database for the neural network. 
In some cases, the number of vectors in the training database can approach or exceed one million. Only those objects which are clearly recognizable are chosen as fiduciaries.") Edges of the object are determined, thus creating an estimated boundary.
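For illustration only: the per-pixel edge coding in the passage quoted from Par. 0046 can be sketched as below. The reference states only that a vertical edge is assigned 1, a horizontal edge 8, in-between angles values between 1 and 8, and a pixel with no edge 0. The linear spacing of the intermediate codes, the use of degrees measured from vertical, and the names edge_code and encode_pixels are assumptions made for this sketch and are not taken from the reference.

```python
from typing import List, Optional


def edge_code(angle_from_vertical_deg: Optional[float]) -> int:
    """Map an edge orientation to an integer code in the range 0..8.

    None means no edge was found in the pixel and yields 0.
    0 degrees (vertical) yields 1 and 90 degrees (horizontal) yields 8.
    Intermediate codes are linearly spaced, which is an assumption.
    """
    if angle_from_vertical_deg is None:
        return 0
    # Fold any orientation into the range [0, 90] degrees from vertical.
    folded = abs(angle_from_vertical_deg) % 180.0
    if folded > 90.0:
        folded = 180.0 - folded
    return 1 + round(7.0 * folded / 90.0)


def encode_pixels(angles: List[Optional[float]]) -> List[int]:
    """Arrange the per-pixel codes into a single input vector (a plain list)."""
    return [edge_code(a) for a in angles]
```

A vector produced this way would be the kind of input the quoted passage describes as being entered into the trained neural network; the network itself is outside the scope of the sketch.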
Regarding claim 14, Breed et al. teaches The non-transitory computer readable medium of claim 10, wherein the overall intent of the object is based on a relationship with one or more other objects in the first image. (Par. 0006; See "In order to achieve a new and improved method and arrangement for creating maps of terrain surrounding and/or including roads, a method and system for mapping terrain including one or more roads in accordance with the invention includes a vehicle equipped with at least one camera, a position determining system that determines its position and an inertial measurement unit (IMU) that provides, when corrected by readings from the position determining system, the position and angular orientation of the camera(s) and IMU, all of which are in a set configuration relative to one another. A processor at a remote location apart from the vehicle receives data from the vehicle and converts information related to fiduciaries from the images from the camera(s) to a map including objects from the images by identifying common objects in multiple images, which may be obtained from the same or different vehicles, and using the position information and the inertial measurement information from when the multiple images were obtained and knowledge of the set configuration of the camera(s), the position determining system and the IMU. The information derived from the images, position information and inertial measurement information are transmitted to the processor by a communications unit on the vehicle. The position determining unit, the IMU, the camera and the communications unit may be combined into a single device for easy retrofit application to one or more vehicles.")
Regarding claim 15, Breed et al. teaches The non-transitory computer readable medium of claim 10, wherein the overall intent of the object is based on a presence or absence of other objects in the first image. (Par. 0047; See "Once pixels which represent a pole, for example, have been identified, then one or more vectors can be derived extending from the camera in the direction of the pole based on the location and angle of the camera 10. When the pole is identified in two such images (from the same or different cameras 10) then the intersection of the vectors can be calculated and the pole location in space determined." The intent of the pole is determined from the relationship between the vectors of the pole within the multiple images.)
Regarding claim 16, Breed et al. teaches The non-transitory computer readable medium of claim 15, wherein the relationship is based on relative positions of the object with the one or more other objects in the first image. (Par. 0047; See "Once pixels which represent a pole, for example, have been identified, then one or more vectors can be derived extending from the camera in the direction of the pole based on the location and angle of the camera 10. When the pole is identified in two such images (from the same or different cameras 10) then the intersection of the vectors can be calculated and the pole location in space determined." The intent of the pole is determined from the relationship between the vectors of the pole within the multiple images.)
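For illustration only: the intersection described in the passage quoted from Par. 0047 can be sketched in two dimensions as follows. The reference speaks of vectors in space. Restricting the problem to a flat plane, treating each sight line as an infinite line, the function name intersect_sight_lines, the tolerance eps, and the sample coordinates are all assumptions of this sketch and are not taken from the reference.

```python
from typing import Optional, Tuple

Vec2 = Tuple[float, float]


def cross(a: Vec2, b: Vec2) -> float:
    """Scalar 2-D cross product a.x * b.y - a.y * b.x."""
    return a[0] * b[1] - a[1] * b[0]


def intersect_sight_lines(
    p1: Vec2, d1: Vec2, p2: Vec2, d2: Vec2, eps: float = 1e-9
) -> Optional[Vec2]:
    """Intersect the lines p1 + t*d1 and p2 + s*d2.

    p1 and p2 are camera positions; d1 and d2 are directions toward the object.
    Returns None when the lines are parallel or nearly parallel.
    """
    denom = cross(d1, d2)
    if abs(denom) < eps:
        return None
    # Solve p1 + t*d1 = p2 + s*d2 for t by crossing both sides with d2.
    r = (p2[0] - p1[0], p2[1] - p1[1])
    t = cross(r, d2) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])


if __name__ == "__main__":
    # Two hypothetical camera positions sighting the same pole.
    print(intersect_sight_lines((0.0, 0.0), (1.0, 1.0), (10.0, 0.0), (-1.0, 1.0)))
```

In three dimensions, measured sight lines generally do not meet exactly, so a practical system would more likely use a closest-point or least-squares estimate; that choice is not specified in the quoted passage.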
Regarding claim 17, Breed et al. teaches The non-transitory computer readable medium of claim 10, further storing instructions that cause the processor to perform the step of: generating, using the first neural network model, a second feature map for a second image that includes an object from the one or more objects within the first image, the second feature map indicating a feature vector for each pixel encompassed by the object within the second image. (Pars. 0046-0047; See "When an image is acquired by camera 10, it can be subjected to a coding process and coded data entered into a pattern recognition algorithm such as a neural network in the ECU 20. In one preferred implementation, the pixels of each image from camera 10 are arranged into a vector and the pixels are scanned to locate edges of objects. When an edge is found by processing hardware and/or software in the ECU 20, the value of the data element in the vector which corresponds to the pixel can be set to indicate the angular orientation of the edge in the pixel. For example, a vertical edge can be assigned a 1 and a horizontal element an 8 and those at in between angles assigned numbers between 1 and 8 depending on the angle. If no edge is found, then the pixel data can be assigned a value of 0. When this vector is entered into a properly trained neural network, the network algorithm can output data indicating that a pole, tree, building, or other desired to-be-recognized object has been identified and provide the pixel locations of the object. This can be accomplished with high accuracy providing the neural network has been properly trained with sufficient examples of the objects sought to be identified. Development of the neural network is known to those skilled in the art with the understanding as found by the inventors that a large number of vectors may be needed to make up the training database for the neural network. 
In some cases, the number of vectors in the training database can approach or exceed one million. Only those objects which are clearly recognizable are chosen as fiduciaries." & "Once pixels which represent a pole, for example, have been identified, then one or more vectors can be derived extending from the camera in the direction of the pole based on the location and angle of the camera 10. When the pole is identified in two such images (from the same or different cameras 10) then the intersection of the vectors can be calculated and the pole location in space determined.") Multiple images are received from one or more vehicles and are then analyzed using the neural networks.
Regarding claim 18, Breed et al. teaches The non-transitory computer readable medium of claim 16, further storing instructions that cause the processor to perform the steps of: identifying, using the second neural network model, the object within the second image based upon the second image and the second feature map of the second image; and determining an updated overall intent of the object based upon the aggregation of the feature vectors corresponding to the pixels encompassed by the object within the second image. (Pars. 0046-0047; See "When an image is acquired by camera 10, it can be subjected to a coding process and coded data entered into a pattern recognition algorithm such as a neural network in the ECU 20. In one preferred implementation, the pixels of each image from camera 10 are arranged into a vector and the pixels are scanned to locate edges of objects. When an edge is found by processing hardware and/or software in the ECU 20, the value of the data element in the vector which corresponds to the pixel can be set to indicate the angular orientation of the edge in the pixel. For example, a vertical edge can be assigned a 1 and a horizontal element an 8 and those at in between angles assigned numbers between 1 and 8 depending on the angle. If no edge is found, then the pixel data can be assigned a value of 0. When this vector is entered into a properly trained neural network, the network algorithm can output data indicating that a pole, tree, building, or other desired to-be-recognized object has been identified and provide the pixel locations of the object. This can be accomplished with high accuracy providing the neural network has been properly trained with sufficient examples of the objects sought to be identified. Development of the neural network is known to those skilled in the art with the understanding as found by the inventors that a large number of vectors may be needed to make up the training database for the neural network. 
In some cases, the number of vectors in the training database can approach or exceed one million. Only those objects which are clearly recognizable are chosen as fiduciaries." & "Once pixels which represent a pole, for example, have been identified, then one or more vectors can be derived extending from the camera in the direction of the pole based on the location and angle of the camera 10. When the pole is identified in two such images (from the same or different cameras 10) then the intersection of the vectors can be calculated and the pole location in space determined." & Par. 0058; See "Image acquisition by a probe vehicle 16 can be controlled by a remote site (e.g., personnel at the remote station 14) on an as-needed basis. If the remote station 14 determines that more images would be useful, for example if it indicates a change or error in the map, it can send a command to the probe vehicle 16 to upload one or more images. In this manner, the roads can be continuously monitored for changes and the maps kept continuously accurate. Similarly, once the system is largely operational, a probe vehicle 16 can be constantly comparing what it sees using the camera 10 with its copy of the map in map database 12 and when it finds a discrepancy in the presence or location of a fiduciary found in the image from camera 10 relative to the contents of the map database 12, for example, it can notify the control site and together they can determine whether the probe vehicle's map needs updating or whether more images are needed indicating a change in the roadway or its surrounding terrain.") Multiple images are received from one or more vehicles and are then analyzed using the neural networks. The vehicle continuously updates surrounding sensor information using the multiple images after they have been analyzed by the neural network.
Regarding claim 19, Breed et al. teaches A system comprising: A hardware processor; and A non-transitory computer readable medium storing instructions that when executed by a processor cause the processor to perform steps comprising: (Par. 0008; See "These individual readings are directed to vehicle angular and displacement determination algorithms, incorporated or executed by a processor such as, for example, a computer, to be used to determine location and orientation of the vehicle.") receiving a plurality of images, each image corresponding to a video frame captured by one or more sensors of a vehicle; (Abstract; See "Vehicle-mounted device includes an inertial measurement unit (IMU) (8) having at least one accelerometer or gyroscope, a GPS receiver (6), a camera (10) positioned to obtain unobstructed images of an area exterior of the vehicle (16) and a control system (20) coupled to these components. The control system (20) re-calibrates each accelerometer or gyroscope using signals obtained by the GPS receiver (6), and derives information about objects in the images obtained by the camera (10) and location of the objects based on data from the IMU (8) and GPS receiver (6). A communication system (18) communicates the information derived by the control system (20) to a location separate and apart from the vehicle (16). The control system (20) includes a processor that provides a location of the camera (10) and a direction in which the camera (10) is imaging based on data from the IMU corrected based on data from the GPS receiver (6), for use in creating the map database (12). (FIG. 2)") processing the plurality of images using a first neural network model configured to generate, for a first image of the plurality of images, a first feature map indicating, for each pixel of the first image, a feature vector corresponding to an intent associated with the pixel; (Par. 
0046; See "When an image is acquired by camera 10, it can be subjected to a coding process and coded data entered into a pattern recognition algorithm such as a neural network in the ECU 20. In one preferred implementation, the pixels of each image from camera 10 are arranged into a vector and the pixels are scanned to locate edges of objects. When an edge is found by processing hardware and/or software in the ECU 20, the value of the data element in the vector which corresponds to the pixel can be set to indicate the angular orientation of the edge in the pixel. For example, a vertical edge can be assigned a 1 and a horizontal element an 8 and those at in between angles assigned numbers between 1 and 8 depending on the angle. If no edge is found, then the pixel data can be assigned a value of 0. When this vector is entered into a properly trained neural network, the network algorithm can output data indicating that a pole, tree, building, or other desired to-be-recognized object has been identified and provide the pixel locations of the object. This can be accomplished with high accuracy providing the neural network has been properly trained with sufficient examples of the objects sought to be identified. Development of the neural network is known to those skilled in the art with the understanding as found by the inventors that a large number of vectors may be needed to make up the training database for the neural network. In some cases, the number of vectors in the training database can approach or exceed one million. Only those objects which are clearly recognizable are chosen as fiduciaries." The vectors are used to determine the intent of the objects within the images. The combination of vectors identified by the neural network acts as a feature map, which is also described within Par. 0048; See “This is only one of a large number of such techniques where observed object properties exhibited in the pixels are used to form the neural network vectors. 
Others include color, texture, material properties, reflective properties etc. and this invention should not be limited to a particular method of forming the neural network vectors or the pixel properties chosen.”) using a second neural network model, identifying one or more objects within the first image based upon the first image and the first feature map of the first image; (Par. 0048; See “The above described neural network is based on using the edges of objects to form the vectors analyzed by the neural network in the ECU 20. This is only one of a large number of such techniques where observed object properties exhibited in the pixels are used to form the neural network vectors. Others include color, texture, material properties, reflective properties etc. and this invention should not be limited to a particular method of forming the neural network vectors or the pixel properties chosen.” The first and second neural networks are no different from one another. The neural network within Breed et al. performs the same functions by identifying pixels within the first image and using them to identify objects in the image.) determining, for each of the identified one or more objects, an overall intent of the object, based upon an aggregation of the feature vectors corresponding to pixels encompassed by the object; (Pars. 0047-0048; See "Once pixels which represent a pole, for example, have been identified, then one or more vectors can be derived extending from the camera in the direction of the pole based on the location and angle of the camera 10. When the pole is identified in two such images (from the same or different cameras 10) then the intersection of the vectors can be calculated and the pole location in space determined." & "The above described neural network is based on using the edges of objects to form the vectors analyzed by the neural network in the ECU 20. 
This is only one of a large number of such techniques where observed object properties exhibited in the pixels are used to form the neural network vectors. Others include color, texture, material properties, reflective properties etc. and this invention should not be limited to a particular method of forming the neural network vectors or the pixel properties chosen." While the literal word intent is not mentioned in reference to the objects, knowing the vectors with the properties of the neural network allows for understanding the intent of the objects. While poles are stationary, Breed et al. uses multiple images to locate the vectors of the objects in relation to a moving object. The neural network uses the vectors of the objects identified in the image to identify the intent of the object. Other properties may be identified as well as described in Par. 0048.) and generating one or more commands to control the vehicle based upon the determined overall intents of the one or more objects. (Par. 0053; See "In this regard, a display may be provided to the driver of the probe vehicle 16 indicating the maximum speed which is determined based on the number of fiduciaries in the images being obtained by the camera 10 on the probe vehicle 16. If the probe vehicle 16 is autonomous, then its speed may be limited by known control systems the number of fiduciaries in the images being obtained by camera 10. In the same manner, the highest speed of the probe vehicle 16 may be notified to the driver or limited by control systems based on the accuracy desired for the images obtained by the camera 10, i.e., on the illumination present and the properties of the imager, as a sort of feedback technique. Data about the time and accuracy of the processing of images from the camera 10 by the ECU 20 is thus used to control a driver display (not shown) to show the highest speed or to control the autonomous vehicle speed control system.")
Regarding claim 20, Breed et al. teaches The system of claim 19, wherein a feature vector associated with a pixel corresponds to a plurality of intents associated with the pixel, each intent associated with a predicted value representative of a statistical distribution of the intent and an uncertainty value associated with the predicted value. (Par. 0046; See "When an image is acquired by camera 10, it can be subjected to a coding process and coded data entered into a pattern recognition algorithm such as a neural network in the ECU 20. In one preferred implementation, the pixels of each image from camera 10 are arranged into a vector and the pixels are scanned to locate edges of objects. When an edge is found by processing hardware and/or software in the ECU 20, the value of the data element in the vector which corresponds to the pixel can be set to indicate the angular orientation of the edge in the pixel. For example, a vertical edge can be assigned a 1 and a horizontal element an 8 and those at in between angles assigned numbers between 1 and 8 depending on the angle. If no edge is found, then the pixel data can be assigned a value of 0. When this vector is entered into a properly trained neural network, the network algorithm can output data indicating that a pole, tree, building, or other desired to-be-recognized object has been identified and provide the pixel locations of the object. This can be accomplished with high accuracy providing the neural network has been properly trained with sufficient examples of the objects sought to be identified. Development of the neural network is known to those skilled in the art with the understanding as found by the inventors that a large number of vectors may be needed to make up the training database for the neural network. In some cases, the number of vectors in the training database can approach or exceed one million. Only those objects which are clearly recognizable are chosen as fiduciaries." & Par. 
0061; See "Two images of a particular fiduciary (taken from different locations) are necessary to establish an estimate of the location of the fiduciary. Such an estimate contains errors in, for example, the GPS determination of the location of the device each second for calibration, errors in the IMU determination of its location over and above the GPS errors, errors in the determination of the angle of the fiduciary as determined by the IMU and the camera pixels and errors due to the resolutions of all of these devices. When a third image is available, two additional estimates are available when image 1 is compared with image 3 and image 2 is also compared with image 3. The number of estimates E available can be determined by the formula E=n*(n-1)/2, wherein n is the number of images. Thus the number of estimates grows rapidly with the number of images. For example, if 10 images are available, 45 estimates of the position of the fiduciary can be used. Since the number of estimates increases rapidly with the number of images, convergence to any desired accuracy level is rapid. 100 images, for example, can provide almost 5000 such estimates.")
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Yang et al. (U.S. Publication No. 2016/0140438) teaches hyper-class augmented and regularized deep learning for fine-grained image classification.
Phogat et al. (U.S. Patent No. 11,348,246) teaches segmenting objects in vector graphics images.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANTHONY M GARTRELLE whose telephone number is 313-446-6539.  The examiner can normally be reached on Telework 7:30am-3:30pm (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Elaine Gort, can be reached at (571) 272-6781.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/ANTHONY M GARTRELLE/Examiner, Art Unit 3661  
10/18/2022
/Elaine Gort/Supervisory Patent Examiner, Art Unit 3661