Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED OFFICE ACTION

Status of Claims

Claims 1, 3-9, and 11-26 are allowed.
Claims 2 and 10 are canceled.

Reasons for Allowance


1.	The following is an examiner’s statement of reasons for allowance: 
Prior art made of record fails to teach the limitations underlined within the independent claims mentioned below.

Regarding Claim 1,
A method for estimating an indication of motion using input from an event-based camera, the method comprising: receiving events captured by an event-based camera, wherein each of the events represents a location of a change in pixel intensity, a polarity of the change, and a time; discretizing the events into time discretized event volumes, each of which contain events that occur within a specified time range; providing the time discretized event volumes as input to an encoder- decoder neural network trained to predict an indication of motion using a loss function that measures quality of image deblurring, wherein the loss function minimizes a sum of squares of an average timestamp at each pixel, where the average timestamp for each pixel is generated using bilinear interpolation; generating, using the neural network, an estimate of 

Regarding Claim 9,
A system for estimating an indication of motion using input from an event-based camera, the system comprising: a time discretized event volume generator for receiving events captured by an event-based camera, wherein each of the events represents a location of a change in pixel intensity, a polarity of the change, and a time and discretizing the events into time discretized event volumes, each of which contain events that occur within a specified time range; and an encoder-decoder neural network trained to estimate an indication of motion using a loss function that measures quality of image deblurring, wherein the encoder-decoder neural network receives, as input, the time discretized event volumes and generates, as output, an estimate of the indication of motion, wherein the loss function minimizes a sum of squares of an average timestamp at each pixel, where the average timestamp for each pixel is generated using bilinear interpolation.  
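For illustration only, the discretizing step recited above can be sketched as follows. This is a minimal sketch, not the applicant's implementation: the `Event` type, the `discretize_events` name, and the choice of equal-width time ranges are all assumptions introduced here.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    x: int         # pixel column of the intensity change
    y: int         # pixel row of the intensity change
    t: float       # time of the event
    polarity: int  # +1 for a brightness increase, -1 for a decrease


def discretize_events(events: List[Event], t_start: float, t_end: float,
                      num_bins: int) -> List[List[Event]]:
    """Group events into num_bins equal-width time ranges over [t_start, t_end).

    Equal-width ranges are an assumption; the claim only requires that each
    volume contain the events that occur within a specified time range.
    """
    if num_bins <= 0 or t_end <= t_start:
        raise ValueError("need num_bins > 0 and t_end > t_start")
    bin_width = (t_end - t_start) / num_bins
    volumes: List[List[Event]] = [[] for _ in range(num_bins)]
    for ev in events:
        # Events outside the overall window are ignored.
        if t_start <= ev.t < t_end:
            index = min(int((ev.t - t_start) / bin_width), num_bins - 1)
            volumes[index].append(ev)
    return volumes
```

Each returned list holds the events of one time range; a practical system would typically convert each list into a tensor before passing it to the network.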


Regarding Claim 16,
A non-transitory computer readable medium having stored thereon executable instructions that when executed by a processor of a computer control the computer to perform steps comprising: receiving events captured by an event-based camera, wherein each of the events represents a location of a change in pixel intensity, a polarity of the change, and a time; discretizing the events into time discretized event volumes, each of which contain events that occur within a specified time range; providing the time discretized event volumes as input to an encoder- decoder neural network trained to estimate an indication of motion using a loss function that measures quality of image deblurring, wherein the loss function minimizes a sum of squares of an average timestamp at each pixel, where the average timestamp for each pixel is generated using bilinear interpolation; generating, using the neural network, an estimate of the indication of motion; and using the estimate of the indication of motion in a machine vision application.  
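For illustration only, the loss limitation recited above (a sum of squares of an average timestamp at each pixel, with the average generated using bilinear interpolation) can be sketched as follows. This is a hedged sketch under stated assumptions, not the applicant's loss function: the function names are hypothetical, events are assumed to be `(x, y, t)` tuples whose coordinates may be fractional, and each event is assumed to spread its timestamp over its four nearest pixels with bilinear weights.

```python
import math
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]


def average_timestamp_image(events: List[Tuple[float, float, float]]
                            ) -> Dict[Pixel, float]:
    """Return the bilinear-weighted average timestamp at each touched pixel."""
    weighted_sum: Dict[Pixel, float] = {}
    weight_total: Dict[Pixel, float] = {}
    for x, y, t in events:
        x0, y0 = math.floor(x), math.floor(y)
        # Visit the four integer pixels surrounding (x, y).
        for px in (x0, x0 + 1):
            for py in (y0, y0 + 1):
                # Bilinear weight: falls linearly to zero one pixel away.
                w = max(0.0, 1.0 - abs(x - px)) * max(0.0, 1.0 - abs(y - py))
                if w > 0.0:
                    key = (px, py)
                    weighted_sum[key] = weighted_sum.get(key, 0.0) + w * t
                    weight_total[key] = weight_total.get(key, 0.0) + w
    return {k: weighted_sum[k] / weight_total[k] for k in weighted_sum}


def timestamp_loss(events: List[Tuple[float, float, float]]) -> float:
    """Sum of squares of the average timestamp over all touched pixels."""
    return sum(avg * avg for avg in average_timestamp_image(events).values())
```

In a training setting the coordinates would come from a differentiable computation so that the loss can be minimized with respect to the network output; that part is omitted here.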

Regarding Claim 17,
A method for estimating an indication of motion using input from an event-based camera, the method comprising:  receiving events captured by an event-based camera, wherein each of the events represents a location of a change in pixel intensity, a polarity of the change, and a time; generating, from the events, event timestamp images, where each event image includes a first channel that encodes a number of positive events that occurred at each pixel during a time period, a second channel that encodes a number of negative events that occurred at each pixel during the time period; a third channel that encodes the most recent positive event at each pixel, and a fourth channel that encodes the most recent negative event at each pixel;  providing the event timestamp images as input to a neural network trained using event timestamp images as input and a loss function generated from frame-based camera images synchronized with the event timestamp images as a supervisory signal; generating, using the neural network, an estimate of the indication of motion; and using the estimate of the indication of motion in a machine vision application.  
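For illustration only, the four-channel event timestamp image recited above can be sketched as follows. This is a minimal sketch, not the applicant's implementation: the `event_timestamp_image` name, the `(x, y, t, polarity)` tuple layout, the channel ordering in memory, and the use of 0.0 to mean "no event" (which assumes non-negative timestamps) are all assumptions introduced here.

```python
from typing import List, Tuple


def event_timestamp_image(events: List[Tuple[int, int, float, int]],
                          width: int, height: int) -> List[List[List[float]]]:
    """Build a 4-channel image indexed as image[channel][y][x].

    Channel 0: number of positive events at each pixel.
    Channel 1: number of negative events at each pixel.
    Channel 2: timestamp of the most recent positive event at each pixel.
    Channel 3: timestamp of the most recent negative event at each pixel.
    """
    image = [[[0.0] * width for _ in range(height)] for _ in range(4)]
    for x, y, t, polarity in events:
        if not (0 <= x < width and 0 <= y < height):
            continue  # ignore events that fall outside the sensor area
        if polarity > 0:
            image[0][y][x] += 1.0
            image[2][y][x] = max(image[2][y][x], t)
        else:
            image[1][y][x] += 1.0
            image[3][y][x] = max(image[3][y][x], t)
    return image
```

Taking the maximum timestamp keeps the most recent event even if the input list is not sorted by time.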

Regarding Claim 22,
A system for estimating an indication of motion using input from an event-based camera, the system comprising: an event timestamp image generator for receiving events captured by an event-based camera, wherein each of the events represents a location of a change in pixel intensity, a polarity of the change, and a time and generating, from the events, event timestamp images, where each event timestamp image includes a first channel that encodes a number of positive events that occurred at each pixel during a time period, a second channel that encodes a number of negative events that occurred at each pixel during the time period; a third channel that encodes the most recent positive event at each pixel, and a fourth channel that encodes the most recent negative event at each pixel; and a neural network trained using event timestamp images as input and a loss function generated from frame-based camera images synchronized with the event timestamp images as a supervisory signal, wherein the neural network receives the event timestamp images as input and generates an estimate of the indication of motion.

Regarding Claim 26,
A non-transitory computer readable medium having stored thereon executable instructions that when executed by a processor of a computer controls the computer to perform steps comprising: receiving events captured by an event-based camera, wherein each of the events represents a location of a change in pixel intensity, a polarity of the change, and a time;  generating, from the events, event timestamp images, where each event timestamp image includes a first channel that encodes a number of positive events that occurred at each pixel during a time period, a second channel that encodes a number of negative events that occurred at each pixel during the time period; a third channel that encodes the most recent positive event at each pixel, and a fourth channel that encodes the most recent negative event at each pixel; providing the event timestamp images as input to a neural network trained using event timestamp images as input and a loss function generated from frame-based camera images synchronized with the event timestamp images as a supervisory signal; generating, using the neural network, an estimate of an indication of motion; and using the estimate of the indication of motion in a machine vision application.  


Regarding Claim 1: Claim 1 was previously rejected over WANG et al. (USPUB 20200211206) in view of Bardow (NPL Doc.: “Simultaneous Optical Flow and Intensity Estimation from an Event Camera,” June 2016, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Pages 886-889) and in further view of Dharur et al. (USPUB 20200193609). The combination teaches a method for estimating an indication of motion using input from an event-based camera, the method comprising: receiving events captured by an event-based camera, wherein each of the events represents a location of a change in pixel intensity, a polarity of the change, and a time; discretizing the events into time discretized event volumes, each of which contain events that occur within a specified time range; … generating, using the neural network, an estimate of the indication of motion; and using the estimate of the indication of motion in a machine vision application (see the detailed rejection of claim 1 in the Office Action dated 08/17/2021), but does not teach the following limitation of the claim (previously indicated as the objected-to, allowable limitation of claim 2 in the Office Action dated 08/17/2021): “providing the time discretized event volumes as input to an encoder-decoder neural network trained to predict an indication of motion using a loss function that measures quality of image deblurring, wherein the loss function minimizes a sum of squares of an average timestamp at each pixel, where the average timestamp for each pixel is generated using bilinear interpolation;”

Regarding Claim 9: Claim 9 was previously rejected over WANG et al. (USPUB 20200211206) in view of Bardow (NPL Doc.: “Simultaneous Optical Flow and Intensity Estimation from an Event Camera,” June 2016, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Pages 886-889) and in further view of Dharur et al. (USPUB 20200193609). The combination teaches a system for estimating an indication of motion using input from an event-based camera, the system comprising: a time discretized event volume generator for receiving events captured by an event-based camera, wherein each of the events represents a location of a change in pixel intensity, a polarity of the change, and a time and discretizing the events into time discretized event volumes, each of which contain events that occur within a specified time range; and an encoder-decoder neural network trained to estimate an indication of motion using a loss function that measures quality of image deblurring (see the detailed rejection of claim 9 in the Office Action dated 08/17/2021), but does not teach the following limitation of the claim (previously indicated as the objected-to, allowable limitation of claim 10 in the Office Action dated 08/17/2021): “wherein the encoder-decoder neural network receives, as input, the time discretized event volumes and generates, as output, an estimate of the indication of motion, wherein the loss function minimizes a sum of squares of an average timestamp at each pixel, where the average timestamp for each pixel is generated using bilinear interpolation.”

Regarding Claim 16: Claim 16 was previously rejected over WANG et al. (USPUB 20200211206) in view of Bardow (NPL Doc.: “Simultaneous Optical Flow and Intensity Estimation from an Event Camera,” June 2016, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Pages 886-889) and in further view of Dharur et al. (USPUB 20200193609). The combination teaches a non-transitory computer readable medium having stored thereon executable instructions that when executed by a processor of a computer control the computer to perform steps comprising: receiving events captured by an event-based camera, wherein each of the events represents a location of a change in pixel intensity, a polarity of the change, and a time; discretizing the events into time discretized event volumes, each of which contain events that occur within a specified time range; … generating, using the neural network, an estimate of the indication of motion; and using the estimate of the indication of motion in a machine vision application (see the detailed rejection of claim 16 in the Office Action dated 08/17/2021), but does not teach the following limitation of the claim (previously indicated as the objected-to, allowable limitation of claim 2 in the Office Action dated 08/17/2021): “providing the time discretized event volumes as input to an encoder-decoder neural network trained to estimate an indication of motion using a loss function that measures quality of image deblurring, wherein the loss function minimizes a sum of squares of an average timestamp at each pixel, where the average timestamp for each pixel is generated using bilinear interpolation;”

Regarding Claim 17: For Claim 17, WANG et al. (USPUB 20200211206) teaches the following limitations: a method for estimating an indication of motion using input from an event-based camera (camera motion taught within Paragraphs [0021-0022]), the method comprising: receiving events captured by an event-based camera, wherein each of the events represents a location of a change in pixel intensity (Paragraphs [0056-0057] and Paragraph [0154]: “…every pixel is explained by either rigid motion, non-rigid/object motion, or occluded/non-visible regions. In one or more embodiments, a holistic motion parser (HMP) is used to parse pixels in an image to different regions, and various losses were designed to encourage the depth, camera motion, and optical flow consistency…”), … generating, using the neural network (Paragraphs [0033-0035]), an estimate of the indication of motion; and using the estimate of the indication of motion in a machine vision application (computer vision taught within Paragraph [0154]).
Within analogous art, Bardow (NPL Doc.: “Simultaneous Optical Flow and Intensity Estimation from an Event Camera,” June 2016, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Pages 886-889) teaches a polarity of the change, and a time (Page 886, Col. 1: “… the event in the image domain,…is its time-stamp to microsecond resolution and … is its polarity (sign of the brightness change)…”); and generating, from the events, event timestamp images (Page 886, Col. 1, Section 3.1, Event Camera Definitions, and Page 887, Col. 1, lines 4-10), but does not teach the following limitations of the claim: “where each event image includes a first channel that encodes a number of positive events that occurred at each pixel during a time period, a second channel that encodes a number of negative events that occurred at each pixel during the time period; a third channel that encodes the most recent positive event at each pixel, and a fourth channel that encodes the most recent negative event at each pixel; providing the event timestamp images as input to a neural network trained using event timestamp images as input and a loss function generated from frame-based camera images synchronized with the event timestamp images as a supervisory signal;”

Regarding Claim 22: For Claim 22, WANG et al. (USPUB 20200211206) teaches the following limitations: a system for estimating an indication of motion using input from an event-based camera (camera motion taught within Paragraphs [0021-0022]), an event timestamp image generator for receiving events captured by an event-based camera, wherein each of the events represents a location of a change in pixel intensity (Paragraphs [0056-0057] and Paragraph [0154]: “…every pixel is explained by either rigid motion, non-rigid/object motion, or occluded/non-visible regions. In one or more embodiments, a holistic motion parser (HMP) is used to parse pixels in an image to different regions, and various losses were designed to encourage the depth, camera motion, and optical flow consistency…”), … wherein the neural network receives the event timestamp images as input and generates an estimate of the indication of motion (computer vision taught within Paragraph [0154]).
Within analogous art, Bardow (NPL Doc.: “Simultaneous Optical Flow and Intensity Estimation from an Event Camera,” June 2016, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Pages 886-889) teaches a polarity of the change and a time (Page 886, Col. 1: “… the event in the image domain,…is its time-stamp to microsecond resolution and … is its polarity (sign of the brightness change)…”) and generating, from the events, event timestamp images (Page 886, Col. 1, Section 3.1, Event Camera Definitions, and Page 887, Col. 1, lines 4-10), but does not teach the following limitations of the claim: “where each event timestamp image includes a first channel that encodes a number of positive events that occurred at each pixel during a time period, a second channel that encodes a number of negative events that occurred at each pixel during the time period; a third channel that encodes the most recent positive event at each pixel, and a fourth channel that encodes the most recent negative event at each pixel; and a neural network trained using event timestamp images as input and a loss function generated from frame-based camera images synchronized with the event timestamp images as a supervisory signal,”

Regarding Claim 26: For Claim 26, WANG et al. (USPUB 20200211206) teaches the following limitations: a non-transitory computer readable medium having stored thereon executable instructions that when executed by a processor of a computer controls the computer to perform (Paragraphs [0161-0162]) steps comprising: receiving events captured by an event-based camera, wherein each of the events represents a location of a change in pixel intensity (Paragraphs [0056-0057] and Paragraph [0154]: “…every pixel is explained by either rigid motion, non-rigid/object motion, or occluded/non-visible regions. In one or more embodiments, a holistic motion parser (HMP) is used to parse pixels in an image to different regions, and various losses were designed to encourage the depth, camera motion, and optical flow consistency…”), … generating, using the neural network, an estimate of an indication of motion (Paragraphs [0033-0035]); and using the estimate of the indication of motion in a machine vision application (computer vision taught within Paragraph [0154]).
Within analogous art, Bardow (NPL Doc.: “Simultaneous Optical Flow and Intensity Estimation from an Event Camera,” June 2016, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Pages 886-889) teaches a polarity of the change and a time (Page 886, Col. 1: “… the event in the image domain,…is its time-stamp to microsecond resolution and … is its polarity (sign of the brightness change)…”); and generating, from the events, event timestamp images (Page 886, Col. 1, Section 3.1, Event Camera Definitions, and Page 887, Col. 1, lines 4-10), but does not teach the following limitations of the claim: “where each event timestamp image includes a first channel that encodes a number of positive events that occurred at each pixel during a time period, a second channel that encodes a number of negative events that occurred at each pixel during the time period; a third channel that encodes the most recent positive event at each pixel, and a fourth channel that encodes the most recent negative event at each pixel; providing the event timestamp images as input to a neural network trained using event timestamp images as input and a loss function generated from frame-based camera images synchronized with the event timestamp images as a supervisory signal;”

Conclusion


2.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to OMAR S ISMAIL, whose telephone number is (571) 272-9799 and whose fax number is (571) 273-9799. The examiner can normally be reached M-F, 9:00am-6:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at
http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David C. Payne, can be reached at (571) 272-3024. The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/OMAR S ISMAIL/
Primary Examiner, Art Unit 2637