DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 4 April 2022 has been entered.

Response to Amendment
Applicant’s response, filed 14 February 2022, to the last office action has been entered and made of record. 
In response to the amendments to the specification and claims, they are acknowledged, supported by the original disclosure, and no new matter is added.
In response to the amendments to the specification, the amended language has overcome the objection to the abstract of the previous Office action, and the respective objection has been withdrawn.
In response to the amendments to the claims, specifically addressing the objection to claim 8 of the previous Office action, the amended language has overcome the respective objection, and the objection has been withdrawn.
Amendments to independent claims 1, 17, and 25 have necessitated a new ground of rejection over the applied prior art. Please see below for the updated interpretations and rejections.

Response to Arguments
Applicant’s arguments with respect to claims 1, 17, and 25 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-3, 6-8, 15-18, 23, and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Heitz et al. (US 2018/0012460), herein Heitz, in view of Kim et al. (“Efficient region-based motion segmentation for a video monitoring system”), herein Kim, and Zhu et al. (US 2020/0082544, effectively filed 10 September 2018), herein Zhu. 
Regarding claim 1, Heitz discloses a computer-implemented method of event-driven object segmentation for image processing, comprising: 
obtaining clusters of events indicating motion of image content between frames of at least one video sequence of frames formed of pixel image data (see Heitz [0242]-[0243], where motion vectors are received and form their own clusters in event space; see Heitz [0226], where motion vectors represent motion of a motion entity in a scene depicted in a video portion; see also Heitz [0228]-[0230], where a motion mask is used to identify motion entities which are tracked across multiple frames to obtain the motion vector); 
forming cluster groups depending, at least in part, on the position of the clusters relative to each other and without tracking all pixel locations forming the frames (see Heitz [0243], where new motion vectors that are density reachable from an existing point in a cluster are assigned to that cluster); and 
providing regions-of-interest to applications associated with object segmentation (see Heitz [0245], where the event indicators for the motion events of the new event category are assigned a color of the new event category; see also Heitz [0235], where known event categories include a person running, a person jumping, a person walking, a dog running, a bird flying, a car passing by, a door opening, a door closing, and leaves rustling; see Heitz [0328]-[0335], where a deep neural network that performs scalable object detection is used to determine whether an image corresponding to a detected motion event includes one or more potential instances of a person, to denote regions encompassing potential instances of a person, and to determine whether the one or more regions include a person).
Although Heitz teaches that a captured video stream is used to identify motion entities and obtain corresponding motion vectors, and that motion masks, which mask all pixel locations that have no motion pixels and show the motion pixels of the frame of the video segment, are used to identify motion entities which are tracked and to generate the motion vectors (see Heitz [0222] and [0228]-[0230]), Heitz does not explicitly disclose wherein each event is a change in image data from frame to frame at a single pixel, and that forming cluster groups depends, at least in part, on the position of the clusters relative to each other on a grid of pixel locations forming the frames.
Kim teaches, in a related and pertinent region-based motion segmentation method for video monitoring (see Kim Abstract), that a motion mask is generated based on the pixel difference between two successive image frames (see Kim sect. 2.1.1. Variation detection, sect. 2.1.2 Adaptive thresholding, and Fig. 4), and that adjacent regions are merged if a motion discrepancy based on a distance between the corresponding regions’ motion vectors is below a threshold, where the adjacent regions are determined with pixel coordinates of the image space (see Kim sect. 2.2.3 Motion estimation and sect. 2.2.4 Region merging).
At the time of filing, one of ordinary skill in the art would have found it obvious to apply the teachings of Kim to the teachings of Heitz, such that the motion mask for identifying motion entities is generated based on the pixel differences between successive image frames, and that the clusters of motion vectors are grouped further using a distance metric based on pixel coordinates in image space between the motion vectors of the corresponding clusters. This modification is rationalized as an application of a known technique to a known device ready for improvement to yield predictable results. In this instance, Heitz discloses a base method comprising determining motion events in captured video based on clustering detected motion vectors, based on tracked motion entities identified using a motion mask, where motion vectors that are density reachable from a cluster are grouped. Kim teaches known techniques of obtaining a motion mask based on the pixel differences between successive image frames, and merging adjacent regions of motion vectors using a motion discrepancy metric based on a distance between the corresponding regions’ motion vectors being below a threshold, where the adjacent regions are determined according to pixel coordinates of the image space. One of ordinary skill in the art would have recognized that applying Kim’s techniques would allow Heitz’s method to obtain the motion mask for identifying motion entities based on the pixel differences between successive image frames, and to merge motion vectors into clusters based on the distance, in pixel coordinates of the image space of the captured video images, between corresponding adjacent regions’ motion vectors, predictably leading to a more robust and improved method for obtaining the motion mask and for merging and grouping the motion vectors into corresponding clusters.
Although Heitz teaches the use of a deep neural network to perform scalable object detection to determine whether a first image includes one or more potential instances of a person and to denote regions encompassing potential instances of a person (see Heitz [0332]-[0334]), Heitz and Kim do not explicitly disclose inputting the cluster groups, group by group, into at least one convolutional layer of a neural network to generate per pixel outputs without inputting non-grouped pixels of an image into the neural network; generating regions-of-interest comprising using the outputs.
Zhu teaches, in a related and pertinent artificial neural network based process for detecting motion between frames, tracking a region of interest, and classifying an object within the region of interest (see Zhu Abstract), that groups of adjacent blocks of data elements having motion vectors, denoted as regions of interest, are input to a convolutional neural network classifier to classify any objects in that region of interest, where only the image data for the regions of interest is input to the CNN classifier (see Zhu [0089]-[0093]; see also Zhu [0025], where convolutional neural networks have one or more convolutional layers), and that the convolutional neural network outputs classification labels for any objects found, together with indications of any regions of interest and motion vectors (see Zhu [0094]), where the classifications for any objects in those regions of interest are associated with those regions as they are subsequently tracked (see Zhu [0099]).
At the time of filing, one of ordinary skill in the art would have found it obvious to apply the teachings of Zhu to the teachings of Heitz and Kim, such that the clustered motion regions corresponding to the detected motion events are input into a convolutional neural network classifier to classify any objects, e.g. a person, in a corresponding clustered motion region and output corresponding classification labels and associated regions of interest and motion vectors which are subsequently tracked.  This modification is rationalized as an application of a known technique to a known device ready for improvement to yield predictable results. In this instance, Heitz and Kim disclose a base method comprising determining motion events in captured video based on clustering detected motion vectors, based on tracked motion entities identified using a motion mask, where motion vectors that are density reachable from a cluster are grouped, and a deep neural network to perform scalable object detection to determine whether a first image includes one or more potential instances of a person and to denote regions encompassing potential instances of a person. Zhu teaches known techniques of inputting groups of adjacent blocks of data elements having motion vectors, denoted as regions of interest, to a convolutional neural network classifier to classify any objects in that region of interest, where only the image data for the regions of interest is input to the CNN classifier, and outputting object classification labels and associated regions of interest and motion vectors. One of ordinary skill in the art would have recognized that applying Zhu’s techniques would allow the method of Heitz and Kim to classify any objects, e.g. a person, in a corresponding clustered motion region that is input to a convolutional neural network classifier and to output object classification labels and associated regions of interest and motion vectors which may be subsequently tracked, predictably leading to an improved motion event determining method where only determined clustered motion regions are classified with a convolutional neural network classifier to classify any objects, e.g. a person, in a corresponding clustered motion region to be subsequently tracked. 

Regarding claim 2, please see the above rejection of claim 1. Heitz, Kim, and Zhu disclose the method of claim 1 wherein each event indicates a change in image data at a pixel location that meets a criterion deemed to indicate sufficient motion of image content (see Kim sect. 2.1.1. Variation detection, sect. 2.1.2 Adaptive thresholding, where pixel differences between successive frames that are greater than a threshold are used to obtain the mask).
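For illustration only, the thresholded frame differencing for which Kim is cited can be sketched as follows. This is a minimal sketch under stated assumptions and is not Kim's implementation: the function name `motion_mask` and the sample frames are hypothetical, frames are assumed to be grayscale values in nested lists, and a fixed threshold stands in for Kim's adaptive thresholding.

```python
def motion_mask(prev_frame, curr_frame, threshold):
    """Return a binary mask: 1 where the absolute pixel difference exceeds threshold.

    Assumption: a fixed threshold is used here for simplicity; the cited
    reference describes an adaptive threshold instead.
    """
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


# Hypothetical 2x3 grayscale frames.
prev_frame = [[10, 10, 10], [10, 10, 10]]
curr_frame = [[10, 60, 12], [11, 10, 90]]
mask = motion_mask(prev_frame, curr_frame, threshold=20)
print(mask)  # [[0, 1, 0], [0, 0, 1]]
```

Each 1 in the mask marks a single pixel location whose frame-to-frame change meets the criterion, which is the sense in which a per-pixel "event" is read in the rejection above.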

Regarding claim 3, please see the above rejection of claim 1. Heitz, Kim, and Zhu disclose the method of claim 1 wherein the clusters are formed by listing an anchor pixel location, a timestamp, and a size of the cluster without listing all pixel locations on an image and without listing all pixels in the cluster (see Heitz [0241], where cluster creation time, cluster weight, which records a member count for the cluster, cluster center, and cluster radius are stored and updated for each motion vector cluster).
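For illustration only, a compact per-cluster record of the kind described (creation time, member count, center, and radius, with no per-pixel list) might look like the sketch below. The class name `ClusterSummary` and its update rule are assumptions introduced for this example: the running-mean center and the "largest distance seen so far" radius are one plausible choice and are not taken from Heitz.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusterSummary:
    """Summary statistics for one cluster; member points are never stored."""
    created_at: float      # cluster creation time
    center: tuple          # (x, y) running mean of member positions
    weight: int = 1        # member count
    radius: float = 0.0    # approximate extent (assumed update rule, see add)

    def add(self, point):
        """Fold one new member into the summary without keeping the point."""
        self.weight += 1
        cx, cy = self.center
        px, py = point
        # Incremental mean update of the center.
        self.center = (cx + (px - cx) / self.weight, cy + (py - cy) / self.weight)
        # Assumption: radius grows to cover the newest member as seen from the
        # updated center; it is an approximation, not an exact bound.
        self.radius = max(self.radius, math.dist(self.center, point))


cluster = ClusterSummary(created_at=0.0, center=(0.0, 0.0))
cluster.add((2.0, 0.0))
print(cluster.weight, cluster.center, cluster.radius)  # 2 (1.0, 0.0) 1.0
```

The storage cost of such a record is constant regardless of how many members the cluster accumulates.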

Regarding claim 6, please see the above rejection of claim 1. Heitz, Kim, and Zhu disclose the method of claim 1 wherein forming cluster groups comprises determining whether neighbor clusters adjacent to a current cluster meet a criterion (see Kim sect. 2.2.3 Motion estimation and sect. 2.2.4 Region merging, where regions are merged if a motion discrepancy based on the distance between the corresponding regions’ motion vectors is below a threshold, where the distance is determined with pixel coordinates of the image space).
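For illustration only, merging adjacent regions whose motion vectors are sufficiently similar can be sketched with a union-find structure. This is a minimal sketch under stated assumptions and is not Kim's algorithm: the function name `merge_regions`, the Euclidean distance as the discrepancy measure, and the sample data are hypothetical, and each comparison uses the original per-region vectors rather than re-estimating motion after each merge.

```python
import math


def merge_regions(motion_vectors, adjacency, max_discrepancy):
    """Group adjacent regions whose motion vectors differ by less than max_discrepancy.

    motion_vectors: dict mapping region id -> (dx, dy)
    adjacency: iterable of (region_a, region_b) pairs that touch in image space
    Returns a dict mapping each region id -> id of its merged group.
    """
    parent = {r: r for r in motion_vectors}

    def find(r):
        # Follow parent links to the group representative, compressing the path.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for a, b in adjacency:
        ax, ay = motion_vectors[a]
        bx, by = motion_vectors[b]
        if math.hypot(ax - bx, ay - by) < max_discrepancy:
            ra, rb = find(a), find(b)
            if ra != rb:
                # Keep the smaller id as the representative of the merged group.
                parent[max(ra, rb)] = min(ra, rb)
    return {r: find(r) for r in motion_vectors}


# Hypothetical regions: 0 and 1 move alike, 2 moves differently.
vectors = {0: (2.0, 0.0), 1: (2.1, 0.1), 2: (-3.0, 1.0)}
groups = merge_regions(vectors, [(0, 1), (1, 2)], max_discrepancy=0.5)
print(groups)  # {0: 0, 1: 0, 2: 2}
```

Only pairs listed as adjacent are ever compared, so two regions with similar motion that do not touch on the pixel grid remain separate.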

Regarding claim 7, please see the above rejection of claim 1. Heitz, Kim, and Zhu disclose the method of claim 1 comprising placing a cluster group on a patch array, and generating representative pixel values that indicate the number of events near a current pixel (see Heitz [0232], where an event mask may be generated by aggregating all motion pixels from all frames of the video segment at each pixel location).

Regarding claim 8, please see the above rejection of claim 7. Heitz, Kim, and Zhu disclose the method of claim 7 wherein at least some of the representative pixel values factor two or more adjacent clusters in the cluster group (see Heitz [0232], where the event mask generated by aggregating all motion pixels from all frames of the video segment at each pixel location is also referred to as a motion energy map, and the characteristics of the motion energy maps for different types of motion events are used to differentiate them from one another, thus suggesting the motion energy map values are used to factor between adjacent clusters corresponding to different motion events).

Regarding claim 15, please see the above rejection of claim 7. Heitz, Kim, and Zhu disclose the method of claim 7 comprising comparing the representative pixel values to at least one criterion to determine whether a sufficient number of events occur near a pixel to consider the pixel to indicate sufficient cohesive motion among the pixels (see Heitz [0232], where the event mask is generated by aggregating all motion pixels from all frames of the video segment at each pixel location and masking pixel locations that have less than a threshold number of motion pixels); and 
generating the regions-of-interest by using only those pixels that meet the at least one criterion (see Heitz [0232], where the motion energy map of a motion event candidate is vectorized to generate the representative motion vector; see Heitz [0245], where promoted dense clusters become a new vector category and are assigned a new event category).
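For illustration only, the aggregation and masking attributed to Heitz [0232] can be sketched as a per-pixel sum of binary motion masks followed by a count threshold. This is a minimal sketch under stated assumptions, not Heitz's implementation: the function name `motion_energy_map`, the parameter `min_count`, and the sample masks are hypothetical.

```python
def motion_energy_map(masks, min_count):
    """Sum binary motion masks per pixel, then zero out pixels below min_count.

    masks: list of equally sized 2-D binary masks, one per frame
    min_count: minimum number of motion pixels required to keep a location
    """
    height, width = len(masks[0]), len(masks[0][0])
    energy = [[0] * width for _ in range(height)]
    for mask in masks:
        for y in range(height):
            for x in range(width):
                energy[y][x] += mask[y][x]
    # Mask out locations with fewer than min_count motion pixels.
    return [[v if v >= min_count else 0 for v in row] for row in energy]


# Hypothetical motion masks for three frames of a 2x2 video segment.
masks = [
    [[1, 0], [1, 1]],
    [[1, 0], [0, 1]],
    [[1, 1], [0, 1]],
]
energy = motion_energy_map(masks, min_count=2)
print(energy)  # [[3, 0], [0, 3]]
```

Locations that survive the threshold are the only ones carried forward, which corresponds to using only those pixels that meet the criterion.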

Regarding claim 16, please see the above rejection of claim 1. Heitz, Kim, and Zhu disclose the method of claim 1 comprising using fixed function hardware of an event-driven processing unit that forms the clusters, forms the cluster groups, and generates the regions-of-interest, wherein the computational load and processing time depends on the number of clusters that are determined (see Heitz [0278]-[0279], where the event segmentation is processed by a server system where segments are assigned to a queue associated with particular categorizers and the assignment is based on a load balancing scheme and on the relative amount of data assigned to each categorizer queue, thus suggesting that computational load and processing time depend on the number of segments assigned to each queue).

Regarding claim 17, Heitz, Kim, and Zhu disclose a system of event-driven object segmentation for image processing, comprising: 
a memory storing image data of frames of a video sequence (see Heitz [0112]-[0116], where memory includes data receiving module for receiving data from electronic devices, e.g., video data from a camera); and 
at least one event-driven processor (see Heitz [0112], where the server system includes one or more processing units) being communicatively connected to the memory and being arranged to operate by: 
obtaining clusters of events indicating motion of image content between frames of at least one video sequence of frames formed of pixel image data (see Heitz [0242], where motion vectors are received and form their own clusters in event space; see Heitz [0226], where motion vectors represent motion of a motion entity in a scene depicted in a video portion; see also Heitz [0228]-[0230], where a motion mask is used to identify motion entities which are tracked across multiple frames to obtain the motion vector), wherein each event is a change in image data from frame to frame at a single pixel (see Kim sect. 2.1.1. Variation detection, sect. 2.1.2 Adaptive thresholding, and Fig. 4, where a motion mask is generated based on the pixel difference between two successive image frames); 
forming cluster groups depending, at least in part, on the position of the clusters relative to each other on a grid of pixel locations (see Heitz [0243], where new motion vectors that are density reachable from an existing point in a cluster are assigned to that cluster; see Kim sect. 2.2.3 Motion estimation and sect. 2.2.4 Region merging, where regions are merged if a motion discrepancy based on the distance between the corresponding regions’ motion vectors is below a threshold, where the distance is determined with pixel coordinates of the image space); 
inputting the cluster groups, group by group, into at least one convolutional layer of a neural network to generate per pixel outputs without inputting non-grouped pixels of an image into the neural network (see Zhu [0092]-[0094], where groups of adjacent blocks of data elements having motion vectors, denoted as regions of interest are input to a convolutional neural network classifier to classify any objects in that region of interest, where only the image data for the regions of interest is input to the CNN classifier, and object classification labels and associated regions of interest and motion vectors are output); 
generating regions-of-interest comprising using the outputs (see Zhu [0094] and [0099], where object classification labels and associated regions of interest and motion vectors are output, and that classifications for any objects in those regions of interest are associated with those regions as they are subsequently tracked) and without tracking all pixel locations forming the frames (see Heitz [0245], where promoted dense clusters become a new vector category and assigned a new event category, and are tracked and updated in event space; see Zhu [0093], where only the image data for the regions of interest is input to the CNN classifier); and 
providing the regions-of-interest to applications that use segmented objects (see Heitz [0245], where the event indicators for the motion events of the new event category are assigned a color of the new event category; see also Heitz [0235], where known event categories include a person running, a person jumping, a person walking, a dog running, a bird flying, a car passing by, a door opening, a door closing, and leaves rustling; see Zhu [0094] and [0099], where classifications for any objects in those regions of interest are associated with those regions as they are subsequently tracked).
Please see the above rejection of claim 1, as the rationale to combine the teachings of Heitz, Kim, and Zhu is similar, mutatis mutandis. 

Regarding claim 18, please see the above rejection of claim 17. Heitz, Kim, and Zhu disclose the system of claim 17, wherein using the cluster groups comprises comparing representative pixel values to at least one criterion to determine whether a sufficient number of events occur near a pixel to consider the pixel to indicate sufficient cohesive motion among the pixels (see Heitz [0232], where the event mask is generated by aggregating all motion pixels from all frames of the video segment at each pixel location and masking pixel locations that have less than a threshold number of motion pixels); and 
generating the regions-of-interest by using only those pixels that meet the at least one criterion (see Heitz [0232], where the motion energy map of a motion event candidate is vectorized to generate the representative motion vector; see Heitz [0245], where promoted dense clusters become a new vector category and are assigned a new event category).

Regarding claim 23, please see the above rejection of claim 17. Heitz, Kim, and Zhu disclose the system of claim 17 comprising fixed function hardware of the event-driven processing unit arranged to form the clusters, form the cluster groups, and generate the regions-of-interest, wherein the computational load and processing time depend, at least in part, on the number of clusters that are determined (see Heitz [0278]-[0279], where the event segmentation is processed by a server system where segments are assigned to a queue associated with particular categorizers and the assignment is based on a load balancing scheme and on the relative amount of data assigned to each categorizer queue, thus suggesting that computational load and processing time depend on the number of segments assigned to each queue).

Regarding claim 25, it recites a non-transitory computer-readable article performing the method of claim 1. Heitz, Kim, and Zhu teach a non-transitory computer readable article performing the method of claim 1 (see Heitz [0112], where a non-transitory computer readable storage medium stores programs and modules for implementing the disclosed invention). Please see above for detailed claim analysis, with the exception of the following further limitations:
Please see the above rejection of claim 1, as the rationale to combine the teachings of Heitz, Kim, and Zhu is similar, mutatis mutandis. 

Claims 4-5 and 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Heitz, Kim, and Zhu as applied to claims 1 and 17 above, and further in view of Numaguchi et al. (US 2017/0296913), herein Numaguchi.
Regarding claim 4, please see the above rejection of claim 1. Heitz, Kim, and Zhu do not explicitly disclose the method of claim 1 wherein forming cluster groups comprises listing clusters in an order of anchor coordinates of the clusters on a reverse mapping table.
Numaguchi teaches in a related and pertinent system for detecting and tracking targets in captured images (see Numaguchi Abstract), where a detected block is listed in a table with a positional information that expresses the center of gravity of the block image region, in addition with feature information, such as a size (see Numaguchi [0085]-[0086]).
At the time of filing, one of ordinary skill in the art would have found it obvious to apply the teachings of Numaguchi to the combined teachings of Heitz, Kim, and Zhu, such that the clusters of motion vectors may be tracked and listed in a table and stored with positional information and size information of the cluster. This modification is rationalized as an application of a known technique to a known device ready for improvement to yield predictable results. In this instance, Heitz, Kim, and Zhu disclose a base method comprising determining motion events in captured video based on clustering detected motion vectors, where the clusters are tracked. Numaguchi teaches a known technique of using a list to track detected block image regions, which stores positional information and size information of the detected block image regions. One of ordinary skill in the art would have recognized that applying Numaguchi’s technique would allow the method of Heitz, Kim, and Zhu to track the corresponding detected motion event clusters based on the cluster center of gravity position and cluster size, predictably leading to an improved method of managing and tracking the detected cluster motion events. 

Regarding claim 5, please see the above rejection of claim 4. Heitz, Kim, Zhu, and Numaguchi disclose the method of claim 4 wherein the reverse mapping table lists an anchor location and a size of the cluster without listing any more parameters of the cluster (see Numaguchi [0085]-[0086], where a detected block is listed in a table with a positional information that expresses the center of gravity of the block image region, in addition with feature information, such as a size).

Regarding claim 20, please see the above rejection of claim 17. Heitz, Kim, Zhu, and Numaguchi disclose the system of claim 17 wherein generating regions-of-interest comprises setting pixel locations on a label array with an initial label and only with pixel locations that are both in a cluster and have a representative value that passes at least one criterion (see Heitz [0232], where the motion energy map is generated by aggregating all motion pixels from all frames of the video segment at each pixel location and masking pixel locations that have less than a threshold number of motion pixels; see Heitz [0245], where dense clusters are established as new event categories and the event indicators for the motion events within a dense cluster take on a color assigned to the new event category; see Numaguchi [0085]-[0090], where detected block regions are assigned a label in a table with the corresponding block region position information); and 
updating the initial label of a pixel location depending on labels of neighbor pixels (see Kim sect. 2.2.4 Region merging, where regions are merged if a motion discrepancy based on the distance between the corresponding regions’ motion vectors is below a threshold, where the distance is determined with pixel coordinates of the image space).
Please see the above rejection of claim 4, as the rationale to combine the teachings of Heitz, Kim, Zhu, and Numaguchi is similar, mutatis mutandis. 

Regarding claim 21, please see the above rejection of claim 20. Heitz, Kim, Zhu, and Numaguchi disclose the system of claim 20 wherein labels on the label array are formed for one cluster at a time (see Numaguchi [0085]-[0090], where labels are assigned to detected block regions in the table one at a time); and 
wherein generating regions-of-interest comprises storing a bottom-most label of an upper cluster to provide neighbor labels to top-most pixel locations on a lower cluster (see Numaguchi [0087]-[0090], where the labels of the detected block regions are stored and the positions of the blocks are updated whenever necessary; see Kim sect. 2.2.4 Region merging, where regions are merged if a motion discrepancy based on the distance between the corresponding regions’ motion vectors is below a threshold, where the distance is determined with pixel coordinates of the image space; where the combined teachings suggest that the corresponding labeled regions are updated and merging may be performed when two labeled regions are within the distance threshold).

Regarding claim 22, please see the above rejection of claim 17. Heitz, Kim, Zhu, and Numaguchi disclose the system of claim 17 comprising an association table on the at least one event-driven processor, and wherein generating regions-of-interest comprises updating region-of-interest boundaries on the association table as pixel locations receive updated labels (see Numaguchi [0087]-[0090], where the labels of the detected block regions are stored and the positions of the blocks are updated whenever necessary; and see Numaguchi [0086], where the positional information may be a position and a size of a quadrangle circumscribing the image).
Please see the above rejection of claim 4, as the rationale to combine the teachings of Heitz, Kim, Zhu, and Numaguchi is similar, mutatis mutandis. 

Claims 9-13 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Heitz, Kim, and Zhu as applied to claims 7 and 18 above, and further in view of Courtney et al. (US 6,424,370), herein Courtney, and Szeliski (“Computer Vision: Algorithms and Applications”).
Regarding claim 9, please see the above rejection of claim 7. Heitz, Kim, and Zhu do not explicitly disclose the method of claim 7 wherein forming cluster groups comprises using a single layer convolution to generate the representative pixel values.
Courtney teaches, in a related and pertinent reference (see Courtney Abstract), that a difference image between two image frames, representing motion regions, is smoothed via a low-pass filter to remove noisy data (see Courtney col. 7, ln. 25-50).
At the time of filing, one of ordinary skill in the art would have found it obvious to apply the teachings of Courtney to the combined teachings of Heitz, Kim, and Zhu, such that the motion energy map representing the accumulated motion of the image frames is smoothed via low-pass filtering to remove noise. This modification is rationalized as an application of a known technique to a known device ready for improvement to yield predictable results. In this instance, Heitz, Kim, and Zhu disclose a base method comprising determining motion events in captured video based on clustering detected motion vectors, where motion energy maps are generated by aggregating all motion pixels from all frames of the video segment at each pixel location. Courtney teaches a known technique of smoothing a difference image, representing motion regions between two image frames, via a low-pass filter to remove noisy data. One of ordinary skill in the art would have recognized that applying Courtney’s technique would allow the method of Heitz, Kim, and Zhu to remove noisy data from the generated motion energy maps, predictably leading to an improved method where noise is reduced in the motion energy maps. 
Heitz, Kim, Zhu, and Courtney do not explicitly disclose that a single layer convolution is used to generate the representative pixel values. 
Szeliski teaches that low-pass filtering or smoothing of an image is commonly known to one of ordinary skill in the art to be performed by convolving the image with a low-pass filter kernel (see Szeliski sect. 3.2.1 Separable filtering and sect. 3.2.2 Examples of linear filtering, cited below in pertinent art section). 
At the time of filing, one of ordinary skill in the art would have recognized that performing smoothing upon the motion energy maps via low-pass filtering suggests performing a convolution operation upon the motion energy maps with a low-pass filter kernel. This modification is rationalized as some teaching, suggestion or motivation in the prior art that would have led one of ordinary skill to combine prior art reference teachings to arrive at the claimed invention. In this instance, the combination of Heitz, Kim, Zhu, and Courtney suggests performing smoothing of motion energy maps via low-pass filtering. Szeliski teaches that, in knowledge generally available to one of ordinary skill in the art, low-pass filtering is commonly performed by convolving an image with a low-pass filter kernel. Thus, the combined teachings of Heitz, Kim, Zhu, Courtney, and Szeliski would have led one of ordinary skill in the art to understand that performing smoothing upon the motion energy maps via low-pass filtering, as suggested by the combination of Heitz, Kim, Zhu, and Courtney, would further suggest that a convolution operation is performed upon the motion energy maps with a low-pass filter kernel, reading upon the broadest reasonable interpretation of a single layer convolution being used to generate the representative pixel values. 

Regarding claim 10, please see the above rejection of claim 9. Heitz, Kim, Zhu, Courtney, and Szeliski disclose the method of claim 9 wherein no other neural network layers are used (see Courtney col. 7, ln. 25-50, where smoothing via low-pass filtering is suggested, where no other neural network layers are suggested to be used).

Regarding claim 11, please see the above rejection of claim 7. Heitz, Kim, Zhu, Courtney, and Szeliski disclose the method of claim 7 comprising traversing a filter over the patch array to generate the representative pixel values (see Courtney col. 7, ln. 25-50, where smoothing via low-pass filtering is suggested; see Szeliski sect. 3.2.2 Examples of linear filtering, where a low-pass filter kernel is suggested to be used in the convolution).

Regarding claim 12, please see the above rejection of claim 11. Heitz, Kim, Zhu, Courtney, and Szeliski disclose the method of claim 11 comprising determining the representative pixel value as a convolutional sum determined by using the filter (see Courtney col. 7, ln. 25-50, where smoothing via low-pass filtering is suggested; see Szeliski sect. 3.2.2 Examples of linear filtering, where a low-pass filter kernel is suggested to be used in the convolution); and 
providing a convolutional sum for individual pixel locations on the cluster group (see Courtney col. 7, ln. 25-50, where smoothing via low-pass filtering is suggested; see Szeliski sect. 3.2.1 Separable filtering, where convolution operation includes performing a convolutional sum for individual pixel locations of the convolved image and filter kernel).

Regarding claim 13, please see the above rejection of claim 11. Heitz, Kim, Zhu, Courtney, and Szeliski disclose the method of claim 11 wherein the filter is a unity filter (see Courtney col. 7, ln. 25-50, where smoothing via low-pass filtering is suggested; see Szeliski sect. 3.2.2 Examples of linear filtering, where the simplest filter to implement is a moving average or box filter which is equivalent to convolving the image with a kernel of all ones, where a moving average or box filter is equivalent to a unity filter).
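As an illustrative sketch only, the relationship between a kernel of all ones and a moving average may be shown as follows. The function name box_sum, the 2x2 window size, and the sample patch values are assumptions for illustration, and whether the claimed unity filter is normalized is not addressed here.

```python
# Illustrative sketch: window size and patch values are assumptions.

def box_sum(image, size):
    """Traverse a size x size all-ones filter over the image (valid positions only)."""
    rows, cols = len(image), len(image[0])
    return [
        [
            sum(image[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(cols - size + 1)
        ]
        for r in range(rows - size + 1)
    ]

patch = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# Convolutional sum with an all-ones 2x2 kernel at each valid pixel location.
sums = box_sum(patch, 2)

# Dividing each sum by the number of kernel taps yields the moving average.
means = [[s / 4 for s in row] for row in sums]
```

Because every kernel weight equals one, each output value is simply the sum of the pixels under the window, and the moving average differs from that sum only by a constant scale factor.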

Regarding claim 19, please see the above rejection of claim 18. Heitz, Kim, Zhu, Courtney, and Szeliski disclose the system of claim 18 wherein the representative pixel value is a convolutional sum (see Courtney col. 7, ln. 25-50, where smoothing via low-pass filtering is suggested; see Szeliski sect. 3.2.2 Examples of linear filtering, where a low-pass filter kernel is suggested to be used in the convolution), 
wherein at least some of the convolutional sums factor multiple clusters in a group of clusters (see Heitz [0232], where the event mask generated by aggregating all motion pixels from all frames of the video segment at each pixel location is also referred to as a motion energy map, and the characteristics of the motion energy maps for different types of motion events are used to differentiate them from one another, thus suggesting the motion energy map values are used to factor between adjacent clusters corresponding to different motion events).
Please see the above rejection of claim 9, as the rationale to combine the teachings of Heitz, Kim, Zhu, Courtney, and Szeliski is similar, mutatis mutandis.

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Heitz, Kim, Zhu, Courtney, and Szeliski as applied to claim 11 above, and further in view of El-Khamy et al. (US 2018/0300624), herein El-Khamy.
Regarding claim 14, please see the above rejection of claim 11. Heitz, Kim, Zhu, Courtney, and Szeliski do not explicitly disclose the method of claim 11 comprising inputting values from the filter into a multiply-accumulate (MAC) array to generate the representative pixel value.
El-Khamy teaches, in a related and pertinent method for reducing the complexity of convolutional neural networks (see El-Khamy Abstract), that convolution operations can be performed by repetitive use of an array of multiply accumulate (MAC) units (see El-Khamy [0019]).
At the time of filing, one of ordinary skill in the art would have found it obvious to apply the teachings of El-Khamy to the combined teachings of Heitz, Kim, Zhu, Courtney, and Szeliski, such that the convolution of motion energy maps with low-pass filter kernels is implemented using an array of multiply accumulate (MAC) units. This modification is rationalized as an application of a known technique to a known device ready for improvement to yield predictable results. In this instance, Heitz, Kim, Zhu, Courtney, and Szeliski disclose a base method for determining motion events in captured video based on clustering detected motion vectors, where motion energy maps are low-pass filtered by convolving with low-pass filter kernels. El-Khamy teaches a known technique of implementing convolution operations by repetitive use of an array of multiply accumulate (MAC) units. One of ordinary skill in the art would have recognized that applying El-Khamy’s technique would allow for the implementation of the low-pass filtering of the motion energy maps as suggested by Heitz, Kim, Zhu, Courtney, and Szeliski to remove noisy data from the generated motion energy maps, predictably leading to an improved and efficient implementation of the low-pass filtering of noisy motion energy maps.
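As an illustrative sketch only, performing a convolution by repetitive use of an array of MAC units may be modeled in software as follows. The class MacUnit, the function mac_array_convolve, the one-unit-per-output-pixel arrangement, and the sample values are assumptions for illustration and do not reproduce the hardware architecture of El-Khamy.

```python
# Illustrative sketch: names, array arrangement, and values are assumptions.

class MacUnit:
    """A single multiply-accumulate unit with one accumulator register."""

    def __init__(self):
        self.acc = 0

    def step(self, pixel, weight):
        self.acc += pixel * weight  # multiply, then accumulate


def mac_array_convolve(image, kernel):
    """Valid-mode convolution using one MAC unit per output pixel location."""
    rows, cols = len(image), len(image[0])
    kr, kc = len(kernel), len(kernel[0])
    out_r, out_c = rows - kr + 1, cols - kc + 1
    macs = [[MacUnit() for _ in range(out_c)] for _ in range(out_r)]
    # Each filter value is broadcast to the whole array in turn, so the same
    # array of units is used repeatedly, once per kernel tap.
    for i in range(kr):
        for j in range(kc):
            weight = kernel[kr - 1 - i][kc - 1 - j]  # flipped kernel: true convolution
            for r in range(out_r):
                for c in range(out_c):
                    macs[r][c].step(image[r + i][c + j], weight)
    return [[unit.acc for unit in row] for row in macs]


patch = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
all_ones = [[1, 1], [1, 1]]

representative = mac_array_convolve(patch, all_ones)
```

After the final kernel tap has been applied, each accumulator holds the convolutional sum for its pixel location, so the accumulated values can be read out directly as the representative pixel values.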

Claim 24 is rejected under 35 U.S.C. 103 as being unpatentable over Heitz, Kim, and Zhu as applied to claim 17 above, and further in view of Liu et al. (US 2019/0354159), herein Liu.
Regarding claim 24, please see the above rejection of claim 17. Heitz, Kim, and Zhu do not explicitly disclose the system of claim 17 comprising a power circuit and a power control unit arranged to provide at least one of:
dynamic voltage and frequency scaling depending on the number of events, 
power gating the at least one event-driven processor when the event-driven processor is idle, 
Cdyn scaling depending on the number of clusters, and 
retention of states that provide events, clusters, cluster groups, regions-of-interest, or any combination of these at the at least one event-driven processor when no event is generated during a predetermined amount of time.
Liu teaches, in a related and pertinent convolution operation device (see Liu Abstract), the use of a dynamic voltage frequency scaling (DVFS) device which acquires the working state information of the convolution operation device and sends voltage frequency scaling information to the convolution operation device according to the working state of the device, in which the voltage frequency scaling information may be configured to instruct the convolution operation device to scale its working voltage or frequency (see Liu [0118]).
At the time of filing, one of ordinary skill in the art would have found it obvious to apply the teachings of Liu to the combined teachings of Heitz, Kim, and Zhu, such that a DVFS device is arranged with the server system of Heitz, Kim, and Zhu to acquire the working state information and provide voltage frequency scaling information to scale the server system’s working voltage or frequency. This modification is rationalized as an application of a known technique to a known device ready for improvement to yield predictable results. In this instance, Heitz, Kim, and Zhu disclose a base server system which performs event segmentation, where segments are assigned to categorizer queues based on the relative amount of data assigned to the respective queues when implementing a load balancing scheme. Liu teaches a known technique of implementing a DVFS system to acquire the working state information of a convolution operation device and send voltage frequency scaling information to the convolution operation device according to the working state of the device, to instruct the convolution operation device to scale its working voltage or frequency. One of ordinary skill in the art would have recognized that applying Liu’s technique would allow for the implementation of a DVFS device to acquire the work load of corresponding categorizer queues of the server system and provide voltage frequency scaling information to scale the server system’s working voltage or frequency, predictably leading to an improved and efficient implementation of the server system.
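As an illustrative sketch only, the acquisition of working state information and the resulting voltage frequency scaling information may be modeled as follows. The operating points, the utilization thresholds, and the function name scaling_info are hypothetical values chosen for illustration; they are not disclosed by Liu, Heitz, Kim, or Zhu.

```python
# Illustrative sketch: operating points and thresholds are hypothetical.

# Assumed (voltage in volts, frequency in MHz) operating points, lowest first.
OPERATING_POINTS = [(0.7, 400), (0.9, 800), (1.1, 1200)]


def scaling_info(queue_load, capacity):
    """Map the observed working state (queue load) to an operating point."""
    utilization = queue_load / capacity
    if utilization < 0.3:
        return OPERATING_POINTS[0]  # light load: lowest voltage and frequency
    if utilization < 0.7:
        return OPERATING_POINTS[1]  # moderate load: intermediate point
    return OPERATING_POINTS[2]      # heavy load: highest voltage and frequency


# Hypothetical categorizer queue holding 50 of 100 possible segments.
voltage, frequency = scaling_info(50, 100)
```

The sketch selects a lower operating point when the queue is lightly loaded and a higher one as the load grows, which corresponds to scaling the working voltage or frequency according to the working state.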

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TIMOTHY WING HO CHOI whose telephone number is (571)270-3814. The examiner can normally be reached 9:00 AM to 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, VINCENT RUDOLPH, can be reached on (571) 272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/TIMOTHY CHOI/Examiner, Art Unit 2661

/VINCENT RUDOLPH/Supervisory Patent Examiner, Art Unit 2661