DETAILED ACTION
This action is in response to amendments filed 21 January 2022 for application 15/090874 filed on 5 April 2016. Claims 1-20 and 22-23 are currently pending. Claim 21 has been canceled.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant's arguments filed 21 January 2022 have been fully considered but they are not persuasive. 

Specifically, the Applicants Argue:
The Office Action fails to establish a prima facie case of obviousness over Cobb, Dupont, and Cobb_3. In rejecting claim 1, the Office Action alleges "Cobb teaches ... determining a set of matching statistical descriptions based on [a] first neuro-linguistic model and [a] second linguistic model," referencing the following in Cobb (at paragraph [0065] thereof): "The choice test generally provides a ranking of the existing clusters, relative to the vector input data. Once ranked, the vigilance test evaluates the existing clusters to determine whether to map the input to a given cluster." Applicant disagrees for at least the reasons that Cobb does not describe the basis on which the choice test ranking is performed, much less such ranking being based on a matching of statistical descriptions (thus, Cobb is different from and fails to suggest Applicant's 

Examiner’s Response
The Examiner respectfully disagrees, noting that, during examination, a claim must be given its broadest reasonable interpretation consistent with the specification (M.P.E.P. 2173.01(I), M.P.E.P. 2111.01(II)). Cobb teaches “determining a set of matching statistical descriptions based on the first neuro-linguistic model and the second neuro-linguistic model” because he teaches that a cluster model representing (probabilistically) sequential (semantic) patterns is formed at each layer in the cortex module (Figure 3) such that a first and second neuro-linguistic model is formed at that given layer in which each model is characterized by matching statistical descriptions determined from the previous layers (i.e., the first and second model at least share a common semantic representation based on preceding clusters each with statistical descriptions that determine an ART label and a label sequence probability) and such that a prototype cluster in such a layer is also characterized by matching statistical descriptions (even if it may not statistically match any of the clusters residing in that layer) formed from common statistical representations over preceding layers (-viz., [0065, 0066, 0087, Figure 3, Figure 7], The prototype is generated first, as a copy of the input vector used to create a new cluster. Subsequently, the prototype may be updated as new inputs are mapped to that cluster. As stated, inputs are mapped to clusters in an ART network using a choice test and a vigilance test. The choice and vigilance tests are used to evaluate the vector passed to the ART network and select what cluster to map the inputs to (or create a new cluster). The choice test generally provides a ranking of the existing clusters, relative to the vector input data. 
Once ranked, the vigilance test evaluates the existing clusters to determine whether to map the input to a given cluster., At step 910, the cluster layer determines a formal language distance for each sequence received at step 905 to each cluster in the ART network of that cluster layer, e.g., based on the formal language vector corresponding to a given segment and the equations set forth above.). In other words, at each layer, an association between a (neuro-linguistic) model at a given layer and another (neuro-linguistic) model at a preceding layer is determined by semantically matching those models (i.e., mapping a semantic representation of a cluster in a preceding layer to a cluster in a successive layer) in which there are various processes for determining the matching including using a distance metric and a statistical representation of each cluster. The current Office action and the 21 October 2021 NOFA clearly indicate that, while there are matching statistical descriptions, Cobb does not fully disclose the generation of similarity scores based on the set of matching statistical descriptions; Dupont is relied upon (for the current OA and previous NOFA) to teach this limitation where it is noted that Dupont also teaches the “determining a set of matching statistical descriptions” for the two models.
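By way of illustration only, the cluster-mapping behavior quoted above from Cobb (ranking the existing clusters against an input vector, then deciding whether to map the input to the closest cluster or to create a new cluster) may be sketched as follows. This is a simplified sketch under stated assumptions and is not Cobb's implementation: a Euclidean ranking stands in for the choice test, a fixed distance bound stands in for the vigilance test, and the names Cluster, map_input, and vigilance_radius are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cluster:
    """Hypothetical cluster record: a prototype mean plus a reinforcement count."""
    mean: List[float]
    count: int = 1  # number of inputs mapped to this cluster so far

    def update(self, x: List[float]) -> None:
        # Running-mean update of the prototype as a new input is mapped to it.
        self.count += 1
        self.mean = [m + (xi - m) / self.count for m, xi in zip(self.mean, x)]


def euclidean(a: List[float], b: List[float]) -> float:
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def map_input(clusters: List[Cluster], x: List[float],
              vigilance_radius: float) -> Tuple[Cluster, bool]:
    """Map x to the closest cluster, or create a new one.

    Returns (cluster, created), where created is True if a new cluster was made.
    """
    # Stand-in for the choice test: rank existing clusters by distance to x.
    ranked = sorted(clusters, key=lambda c: euclidean(c.mean, x))
    # Stand-in for the vigilance test: accept the best-ranked cluster only if
    # it lies within the (assumed) distance bound.
    if ranked and euclidean(ranked[0].mean, x) <= vigilance_radius:
        ranked[0].update(x)
        return ranked[0], False
    # No acceptable match: the input becomes the prototype of a new cluster.
    new_cluster = Cluster(mean=list(x))
    clusters.append(new_cluster)
    return new_cluster, True
```

In this sketch, an input that falls outside the bound of every existing cluster yields a new cluster, which corresponds to the situation the cited passages describe as a possible emergent behavior or anomaly.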

The Applicants Further Argue:
Without acceding to the Office Action's assertions, and in the interest of compact prosecution, Applicant has herein amended independent claim 1 (and, similarly, independent claims 8 and 15) to recite, among other things, "generat[ing] a first neuro-linguistic model that includes a first set of statistical descriptions for a distribution of values in the first neuro-linguistic model," "generat[ing] a second neuro-linguistic model that includes a second set of statistical descriptions for a distribution of values in the second neuro-linguistic model," and "in "[a] newly computed clump can be merged with existing clusters,"4 none of Cobb, Dupont, or Cobb_3 discloses or suggests the "statistically merging a trend model associated with a time interval with [an] update when the update is associated with the time interval" and the "updating the trend model when the update is not associated with the time interval" as now recited by Applicant's amended claim 1.  At least in view of the foregoing, Applicant respectfully submits that independent claim 1 (and, accordingly, all claims depending therefrom) are patentable over Cobb in view of Dupont and Cobb_3. 

Examiner’s Response:
The Examiner respectfully disagrees. Cobb teaches “and in response to detecting an update to at least one of the first neuro-linguistic model or the second neuro-linguistic model, one of: statistically merging a trend model associated with a time interval with the update when the update is associated with the time interval,” because he teaches that each (neuro-linguistic) statistical model of the input data (trajectories of objects in a scene) is statistically updated (-viz., Illustratively, the kinematic data vectors (k) derived for foreground objects in a sequence of video frames are batched into a set of trajectories T. 510—one trajectory (T) for each foreground object. In one embodiment, the cluster layer batches trajectories (T) for n objects (e.g., 100 foreground objects) and then using the list of kinematic vectors for those objects trains the SOM 515. Further, the cluster layer may use a fixed number of kinematic vectors from each trajectory (T) to train the SOM 515. For example, the cluster layer may select 10 equally spaced kinematic vectors from each trajectory T. Thus, for a trajectory of 1500 kinematic vectors (representing a foreground object present in the scene for five minutes, sampled at a rate of 5 hz), the kinematic vectors corresponding to the 1st, 150th, 300th ... 1500th ones in the trajectory for this foreground object may be used to train the SOM 515., As is known, an ART network provides a specialized neural network configured to create clusters from vector inputs of N elements. For example, an ART network may receive a vector as input and either update an existing cluster or create a new cluster, as determined using a choice test and a vigilance test for the ART network. Each cluster itself may be characterized by a mean and a variance from a prototype input representing that cluster. 
The mean specifies a center location for the cluster (in an N-dimensional space for N elements) and the variance specifies a radius of the cluster. … Subsequently, the prototype may be updated as new inputs are mapped to that cluster., At the same time, the cluster layer 600 continues to batch trajectories for foreground objects until another n trajectories are available, e.g., batch 2 630. At 635, the n trajectories in batch 2 630 may be normalized, passed to the nodes of a SOM 2 635, and denormalized in the same manner described for the kinematic data vectors in batch 1 605. Further, the denormalized node weight vectors in SOM 2 may be used to update the clusters in the ART network 625. Doing so allows the ART network 625 to further refine the clusters in that ART network as well as respond to changes in behavior occurring in the scene. That is, as new behaviors emerge in the scene, new clusters will emerge in the ART network 625., The process of batching trajectories and refining the ART network 625 may be repeated indefinitely. In each iteration, a batch of kinematic data vectors (or micro-feature vectors) is mapped into a self organizing map (SOM) and the resulting nodes of the SOM are used to train (or update) clusters in the ART network 625.). 
In addition, Cobb teaches “or updating the trend model when the update is not associated with the time interval” because he teaches that the update of the trend model (trajectory prediction model) also occurs in a form that is not directly responsive to a particular time interval (i.e., not directly in response to events or objects associated with the data of that time interval) through the feedback mechanism from an upper layer to a lower layer which may update particular clusters in a lower layer (thereby modifying any “linguistic” model such as may be associated with any cluster but also any other model derived from that cluster representation at higher layers in the framework) (-viz., [0059], In one embodiment, layers in the context model component 240 may be configured to provide feedback to the layer below it. For example, Suppose cluster layer w, has an element that is between c, and c, but a bit closer to c. However sequence layer w, is building sequence c, c. . . . and the sequence c, c, c, is much more probable than c, c, c. In such a case, layer w, may provide enough feedback to layer w, so that the element in question would be assigned to cluster c, rather than ca. Thus, the 'expectations' of a sequencing layer may influence the clustering of inputs performed by the layer below it. Additionally, as the sequences and clusters mature in the sequence layers and cluster layers, anomalies may be generated when input data does not match well with a mature model of behavior at a given layer in the cortex model component 240). Moreover, it is noted that the claim only requires one of the two updating scenarios (i.e., the claims recite a disjunctive list rather than a conjunctive list as suggested in the arguments). 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
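The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.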


The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-5, 8-12, 15-19, and 22-23 are rejected under 35 U.S.C. 103 as being unpatentable over Cobb et al. (2011/0064268), hereinafter referred to as Cobb, in view of Dupont et al. (US 2012/0137367), hereinafter referred to as Dupont, and in further view of Cobb et al. (2011/0044499), hereinafter referred to as Cobb_3.

Regarding claim 1, Cobb teaches A computer implemented method comprising: receiving a first set of input data, from at least one sensor and during a first time period; performing, via a neuro-linguistic module, neural network-based linguistic analysis of the first set of input data, ([0060, 0081, Figures 2-4], FIG. 4 illustrates a method 400 for the cortex model component 240 of FIG. 2 to evaluate a sequence of video frames using alternating clustering and sequencing layers, according to one embodiment of the invention. As shown, the method 400 begins at step 405 where the lowest layer of the cortex model component 240 receives sensory input data from the computer vision engine., However, as noted above, unlike the numerical data input to the first ventral and dorsal cluster layers of the cortex model component 240, the output of the sequencing layer is a symbolic symbol stream (i.e., the sequence of ART network labels segmented by the sequencing layer 710). In order to cluster the symbolic data, a formal language measure and a distance measure is used to train a self organizing map at higher layers of the cortex model component 240., wherein input sensor data in the form of a sequence of video frames (a period of time associated with, say, N frames) is received by a module that performs neural-based analysis on that data through a machine learning engine (neuro-linguistic module) which includes a cortex model component but also a succession of layers of analysis applied to symbolic symbol streams formed from predecessor layers such that the analysis of these symbolic streams in an ART neural network framework is a linguistic analysis.) to generate a first neuro-linguistic model that includes a first set of statistical descriptions for a distribution of values in the first neuro-linguistic model; ([0051, 0065],  Probabilities of observing a given sequence or input data mapping to a given cluster may be stored by the cluster statistics 230. 
, As is known, an ART network provides a specialized neural network configured to create clusters from vector inputs of N elements. For example, an ART network may receive a vector as input and either update an existing cluster or create a new cluster, as determined using a choice test and a vigilance test for the ART network. Each cluster itself may be characterized by a mean and a variance from a prototype input representing that cluster. The mean specifies a center location for the cluster (in an N-dimensional space for N elements) and the variance specifies a radius of the cluster., wherein the clusters derived from the ART network that performs the linguistic analysis includes a statistical representation of those clusters (a distribution of values) in the form of mean and variance, including as well as the probability of observing a mapping of input data to that cluster and wherein it is noted that each cluster at each layer of abstraction is not only itself representative of a distribution of values (through its respective statistical representation as already noted) but also receiving a second set of input data, from the at least one sensor and during a second time period; performing, via the neuro-linguistic module, neural network-based linguistic analysis of the second set of input data, to generate a second neuro-linguistic model that includes a second set of statistical descriptions for a distribution of values in the second neuro-linguistic model ([0038, 0065, 0094, Figures 10], The computer vision engine 135 processes the received video data frame-by-frame, while the machine-learning engine 140 processes data every N-frames., The mean specifies a center location for the cluster (in an N-dimensional space for N elements) and the variance specifies a radius of the cluster. The prototype is generated first, as a copy of the input vector used to create a new cluster. 
Subsequently, the prototype may be updated as new inputs are mapped to that cluster., For example, at step 1030 the cortex model component may determine that an input data vector is available for the ART network at a given cluster layer. … At step 1035, once such an input data is mapped to a cluster in a trained ART network, the cortex model component may evaluate which cluster the input data is mapped to, and in some cases identify an anomaly, e.g., if the input to a trained ART network does not match any cluster or when the input maps to a “rare” or “immature' cluster.,  wherein the machine learning engine (neuro-linguistic module) performs analysis of successive sequences of N-frames such that each set of N-frames undergoes the same linguistic analysis (ART/SOM) with an additional analysis to compare/reconcile a new sequence of frames with models derived up to that point and to incorporate/evolve those models (over successive layers) with the cluster information derived from the new set of data, including the mean/variance/probability cluster information noted previously as corresponding to a distribution of values.) … detect a pattern in at least one of the first set of input data or the second set of input data…; ([0038, 0048, Figure 3, Figure 7], In one embodiment, the machine-learning engine 140 receives the video frames and the data generated by the computer vision engine 135. 
The machine-learning engine 140 may be configured to analyze the received data, build semantic representations of events depicted in the video frames, detect patterns, and, ultimately, to learn from these observed patterns to identify normal and/or abnormal events., For example, the cortex model component 240 may subscribe to receive the kinematic data vectors and micro feature vectors output from the computer vision engine 135 and use this information to construct progressively complex abstractions representing behavioral patterns., wherein patterns are detected in the form of trajectories (of objects) discerned from the analysis of frames of data.) determining a set of matching statistical descriptions based on the first neuro-linguistic model and the second neuro-linguistic model; ([0065, 0066, 0087, Figure 3, Figure 7], The prototype is generated first, as a copy of the input vector used to create a new cluster. Subsequently, the prototype may be updated as new inputs are mapped to that cluster. As stated, inputs are mapped to clusters in an ART network using a choice test and a vigilance test. The choice and vigilance tests are used to evaluate the vector passed to the ART network and select what cluster to map the inputs to (or create a new cluster). The choice test generally provides a ranking of the existing clusters, relative to the vector input data. 
Once ranked, the vigilance test evaluates the existing clusters to determine whether to map the input to a given cluster., At step 910, the cluster layer determines a formal language distance for each sequence received at step 905 to each cluster in the ART network of that cluster layer, e.g., based on the formal language vector corresponding to a given segment and the equations set forth above., wherein a cluster model representing (probabilistically) sequential (semantic) patterns is formed at each layer in the cortex module generating a plurality of similarity scores…; identifying a maximum similarity score from the plurality of similarity scores ([0066, 0069], The choice test generally provides a ranking of the existing clusters, relative to the vector input data. Once ranked, the vigilance test evaluates the existing clusters to determine whether to map the input to a given cluster., In response, the ART network may specify a mapping to a "closest cluster within ART network 625 for that input data vector (determined in the first cluster layers, e.g., using a Euclidian distance measure). If the distance between the input data and the closest cluster in the ART network 625 exceeds a specified amount, or if the closest cluster has not been reinforced a specified minimum number of times (i.e., the cluster is “immature'), an alert specifying the occurrence of an anomalous observation may be generated. wherein a match between input data and existing clusters is determined through a choice test, vigilance test, and a Euclidean distance measure that provide a ranking over existing clusters at a given layer based on the input data (second data set) such that the ranking of the clusters and the Euclidean distance measure are similarity scores given the cluster statistical representations and such that this ranking identifies a closest cluster (with the greatest or maximum similarity).) 
detecting, via the cognitive module, a trend change based on a comparison between the maximum similarity score and a … threshold, the trend change being associated with multiple changes in learning behavior over a period of time;  ([0051, 0066, 0069, 0071, 0094], That is, as the computer vision engine 135 builds a trajectory of kinematic data, micro feature data or primitive event data while observing a foreground object in the scene, the behavioral anomaly detector 225 may evaluate the emergent trajectories to identify anomalous events, based on the prior observations of the scene as represented by the then existing State of the sequencing/clustering layers 245. For example, if a current input (e.g., a kinematic data vector) does not map to an ART network cluster with a probability of mapped to that is above a specified threshold, relative to the input data being mapped to other clusters in that ART network, a cluster anomaly may be issued. Similarly, for sequence layers, if the current input (e.g., a cluster label assigned to a cluster in an ART network) is not an element of a sequence having a probability of occurring above a specified threshold (relative to prior observation), a sequence anomaly may be issued., Otherwise, if the input vector does not match any available cluster (using the vigilance test), the ART network may create a new cluster by storing a new pattern similar to the input vector., If the distance between the input data and the closest cluster in the ART network 625 exceeds a specified amount, or if the closest cluster has not been reinforced a specified minimum number of times (i.e., the cluster is “immature'), an alert specifying the occurrence of an anomalous observation may be generated., That is, as new behaviors emerge in the scene, new clusters will emerge in the ART network 625. 
Further, over time, as the new clusters mature, the cluster layer 600 may treat input data mapping to such clusters as being representative of an observation of normal behavior. Thus, when input data (e.g., a kinematic data vectors) maps to a cluster in ART network 625 that has not matured it may represent a new emergent behavior or the observation of an anomalous event (at that layer of the cortex model component 240)., For example, using the vigilance parameter, it is possible that an input will not match any existing ART cluster., wherein new behavior/trend change is detected when a new cluster is generated or when a mapping to an immature cluster occurs (as well as when various other probabilistic criteria are met) such that the detection of this new trend behavior occurs as the result of a vigilance test matching requirement (a vigilance parameter-threshold that determines whether or not a match is found)  or as the result of a Euclidean distance measure between the input data and a cluster (the Euclidean/similarity distance to each ranked mature cluster is greater than a threshold corresponding to that measure associated with the lowest ranked/least dissimilar mature cluster), and wherein this determination trend change/anomaly detection is made according to a set of changes over time such as may occur in an emergent trajectory with each of those changes individually corresponding to a modification of particular models (e.g., changes in statistics) without resulting with an alert issued only once the emergent anomalous trajectory is detected (such as at a higher layer in the cortex module).)  
and, in response to detecting the trend change, outputting an alert including a representation of the trend change; ([0038, 0071], Additionally, data describing whether a normal/abnormal behavior/event has been determined and/or what such behavior/event is may be provided to output devices 118 to issue alerts, for example, an alert message presented on a GUI interface screen., Accordingly, in one embodiment, the ART network 625 may issue an alert to users of the video surveillance system when such an event occurs., wherein an indication of the new behavior/trend change is issued/outputted to users in the form of an alert which also includes additional descriptive information regarding/representative of that change.) and in response to detecting an update to at least one of the first neuro-linguistic model or the second neuro-linguistic model, one of: statistically merging a trend model associated with a time interval with the update when the update is associated with the time interval, ([0062, Illustratively, the kinematic data vectors (k) derived for foreground objects in a sequence of video frames are batched into a set of trajectories T. 510—one trajectory (T) for each foreground object. In one embodiment, the cluster layer batches trajectories (T) for n objects (e.g., 100 foreground objects) and then using the list of kinematic vectors for those objects trains the SOM 515. Further, the cluster layer may use a fixed number of kinematic vectors from each trajectory (T) to train the SOM 515. For example, the cluster layer may select 10 equally spaced kinematic vectors from each trajectory T. Thus, for a trajectory of 1500 kinematic vectors (representing a foreground object present in the scene for five minutes, sampled at a rate of 5 hz), the kinematic vectors corresponding to the 1, 150', 300' ... 
1500' ones in the trajectory for this foreground object may be used to train the SOM 515., As is known, an ART network provides a specialized neural network configured to create clusters from vector inputs of N elements. For example, an ART network may receive a vector as input and either update an existing cluster or create a new cluster, as determined using a choice test and a vigilance test for the ART network. Each cluster itself may be characterized by a mean and a variance from a prototype input representing that cluster. The mean specifies a center location for the cluster (in an N-dimensional space for N elements) and the variance specifies a radius of the cluster. … Subsequently, the prototype may be updated as new inputs are mapped to that cluster., At the same time, the cluster layer 600 continues to batch trajectories for foreground objects until another n trajectories are available, e.g., batch 2630. At 635, then trajectories in batch 2630 may be normalized, passed to the nodes of a SOM 2 635, and denormalized in the same manner described for the kinematic data vectors in batch 1605. Further, the denormalized node weight vectors in SOM2 may be used to update the clusters in the ART network 625. Doing so allows the ART network 625 to further refine the clusters in that ART network as well as respond to changes in behavior occurring in the scene. That is, as new behaviors emerge in the scene, new clusters will emerge in the ART network 625., The process of batching trajectories and refining the ART network 625 may be repeated indefinitely. 
In each iteration, a batch of kinematic data vectors (or micro-feature vectors) is mapped into a self organizing map (SOM) and the resulting nodes of the SOM are used to train (or update) clusters in the ART network 625., wherein, each (neuro-linguistic) statistical model of the input data (trajectories of objects in a scene) is statistically updated (including updating a trend represented in the model up to that update time) in each batch of input data such that the system “detects”/implements the update process when the batch data is received but also “detects”/implements the update process when a mapping between the received input batch data is made to a particular cluster, wherein the time interval associated with this update is being interpreted as corresponding to the duration of the data collection used to amass the batch data which depends on the sample rate (e.g., 5 Hz) and the number of kinematic vectors per trajectory of each object in the batch (e.g., N or 1500).) or updating the trend model when the update is not associated with the time interval. ([0059], In one embodiment, layers in the context model component 240 may be configured to provide feedback to the layer below it. For example, Suppose cluster layer w, has an element that is between c, and c, but a bit closer to c. However sequence layer w, is building sequence c, c. . . . and the sequence c, c, c, is much more probable than c, c, c. In such a case, layer w, may provide enough feedback to layer w, so that the element in question would be assigned to cluster c, rather than ca. Thus, the 'expectations' of a sequencing layer may influence the clustering of inputs per formed by the layer below it. Additionally, as the sequences and clusters mature in the sequence layers and cluster layers, anomalies may be generated when input data does not match well with a mature model of behavior at a given layer in the cortex model component 240., 
However, Cobb does not explicitly teach executing a plurality of executable codelets, via a cognitive module, to … the plurality of executable codelets including at least one of deterministic codelets or stochastic codelets; … based on the set of matching statistical descriptions; … tunable. Although Cobb teaches pattern detection, he does not explicitly disclose the use of codelets to perform this function. Although Cobb uses the choice and vigilance test, probabilistic metrics, and Euclidean metrics to compare information derived from a new set of data with that derived from previous data (clusters with statistical description) and although these tests, at each ART/SOM level in the learning framework, involve a quantification of the matching between the two sets of data for that level, he does not disclose that the input data that is being compared to existing clusters is represented statistically in making that comparison. Although Cobb does teach that the input data becomes a new cluster prototype with mean and standard deviation if it is not mapped to any existing cluster, he does not disclose when the statistics of that prototype are computed relative to the creation of that cluster (i.e., if the input data itself is a direct expression of these statistics). Also, although additional mapping of other input data to an immature cluster (indicative of an emerging trend) modifies the statistics of that immature cluster, Cobb does not disclose a comparison with that immature cluster with other 
However, Dupont, in the analogous environment of trend behavior modeling for anomaly detection teaches determining a set of matching statistical descriptions based on the first … model and the second … model;  ([0164, 1038, 1044, 0025], the system establishes baseline behaviors (260) … then assesses deviations (265) by comparing assessed behaviors (205) to such a baseline.  This allows the detection of anomalies in recent or past behavior. , The first step in detecting anomalies by deviation is to define the reference features (3070) against which analyzed features (3075) will be compared., This analyzed feature (3075) is usually defined as a sliding window (380) of recent data for the target subject (272), or as a window around an event (100) (unique or recurring) in the case where behaviors around two different events [100] have to be compared., wherein (as also taught by Cobb) the behaviors between two time periods for features that are present in both time periods are assessed by comparing the analyzed features (matched statistical descriptions) associated with those time periods in which the (common) statistically analyzed features that are used in each model are matched statistical descriptions) generating a plurality of similarity scores based on the set of matching statistical descriptions … detecting … a trend change based on a comparison between the … similarity score and a tunable threshold, …;   ([1032, 1060, 1062, 1070, Figure 8], A deviation is detected when the absolute value of the difference between the analyzed feature descriptor (3080) and the reference feature descriptor (3080) is larger than A times the variance of the reference feature descriptor (3080) across the reference observations. 
The amplitude of each deviation is mapped into the interval [0,1] where 1 represents the absence of deviation, and this value is used as a multiplier of the confidence level (870) of the considered anomaly (270)., In one embodiment, the threshold multiplier A has a default value of 10, and is tuned to larger or smaller values based on feedback given by the user regarding detected anomalies by deviation (see the section on Anomaly detection tuning), The system generates alerts (305) for detected anomalies (270)… [that] is Conf*Rel≥km where Conf is the confidence level (870) of the anomaly, Rel its relevance (280), and k is a threshold for alert generation., wherein the statistical deviation between the behaviors at the two time periods, which is based upon the comparison of the statistics of the two sets of feature descriptors (corresponding to based on the matching statistical descriptions with each feature contributing a distinct deviation-based comparison/similarity score), is mapped into an interval that characterizes the deviation of each feature (corresponding to similarity), the set of which is used to characterize the overall confidence of the set of deviations used in the computation of an anomaly by deviation occurrence alert metric (corresponding to similarity score) such that an anomaly/trend change is detected according to a tunable threshold by virtue of the inclusion of the relevance parameter indicative of a level of risk associated with the anomaly/trend change in the threshold evaluation procedure but also by virtue of the tunability of the threshold via user feedback).
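For illustration only, the deviation test and alert condition quoted above from Dupont may be sketched as follows. The function names, the linear mapping of the deviation amplitude into [0,1], and the sample values are assumptions made for this sketch and do not appear in Dupont; only the threshold multiplier A (default value of 10), the comparison against A times the reference variance, and the condition that Conf*Rel meet an alert threshold k are taken from the quoted passages.

```python
def is_deviation(analyzed, reference, reference_variance, a=10.0):
    """True when |analyzed - reference| exceeds A times the reference variance."""
    return abs(analyzed - reference) > a * reference_variance


def deviation_multiplier(analyzed, reference, reference_variance, a=10.0):
    """Map the deviation amplitude into [0, 1], where 1 means no deviation.

    Assumption: a linear ramp that reaches 0 at the detection threshold.
    The quoted text does not specify the mapping function.
    """
    limit = a * reference_variance
    if limit <= 0:
        return 1.0 if analyzed == reference else 0.0
    return max(0.0, 1.0 - abs(analyzed - reference) / limit)


def should_alert(confidence, relevance, k):
    """Alert condition from the quoted passage: Conf * Rel >= k."""
    return confidence * relevance >= k
```

In this sketch, tuning A upward (for example, in response to user feedback) makes the deviation test less sensitive, which is the sense in which the threshold is described as tunable.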
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Cobb to incorporate the teachings of Dupont to compare statistical representations of two sets of data by using a statistical similarity measure based on matching statistical descriptions of features to detect trend behavior using a tunable threshold. The modification would have been obvious because one of ordinary skill would have been motivated to improve the accuracy and efficiency of detecting anomalous behavior for reducing operational risk by using a less memory intensive and less intrusive 

Regarding claim 2, rejection of claim 1 is incorporated and Cobb further teaches wherein the first time period precedes the second time period ([0038, 0061], The computer vision engine 135 processes the received video data frame-by-frame, while the machine-learning engine 140 processes data every N-frames., Once the clusters have matured, e.g., after a specified period of time or after clustering a specified minimum number of input data values, new input data values are mapped to clusters in the ART network., wherein any set of N-frames is collected in a second time period after one or more previous sets of N-frames has been previously collected, particularly if the analysis of the previous sets of N-frames have resulted in the generation of clusters (even prototype clusters) with a statistical description.)  
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Cobb to incorporate the teachings of Dupont and Cobb_3 for the same reasons as pointed out for claim 1.

Regarding claim 3, rejection of claim 1 is incorporated and Cobb further teaches wherein the first and second neuro-linguistic models are generated either periodically based on a time interval or based on a number of operations  ([0061, 0038], Once the clusters have matured, e.g., after a specified period of time or after clustering a specified minimum number of input data values, new input data values are mapped to clusters in the ART network., The computer vision engine 135 processes the received video data frame-by-frame, while the machine-learning engine 140 processes data every N-frames. wherein cluster models are formed based upon successive N-frames of data (periodic generation) such that even an update of existing clusters based upon that data may be considered a generation/regeneration of a model and wherein the formation of mature cluster models (particularly with reference to the first model) is based upon a certain number of input vectors having been mapped to the corresponding cluster (a number of operations).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Cobb to incorporate the teachings of Dupont and Cobb_3  for the same reasons as pointed out for claim 1.

Regarding claim 4, rejection of claim 1 is incorporated and Dupont further teaches wherein the neuro-linguistic models are statistically described probabilistic distributions ([0068], As stated, inputs mapping to an existing ART network cluster may be used to update a mean and variance for each dimension of the ART network, changing the position, shape and size of the cluster. Alternatively, the clusters may be defined using a mean and a covariance., wherein the cluster models at each layer in the machine learning module (semantic layers) are represented/defined by a mean and standard deviation across each dimension (statistically described probabilistic distribution).)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Cobb to incorporate the teachings of Dupont and Cobb_3 for the same reasons as pointed out for claim 1.
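As a non-authoritative sketch of the cluster description relied upon for claim 4, the following shows a cluster whose prototype starts as a copy of the first input vector and whose per-dimension mean and variance are updated as further inputs are mapped to it. The class name and the use of Welford's online update are assumptions for illustration; the cited reference does not state which update rule is used.

```python
class Cluster:
    """Cluster described by a per-dimension mean and variance."""

    def __init__(self, prototype):
        # The prototype begins as a copy of the input vector that created it.
        self.count = 1
        self.mean = list(prototype)
        self._m2 = [0.0] * len(prototype)  # running sum of squared deviations

    def update(self, vector):
        """Fold one more input vector into the mean and variance (Welford)."""
        self.count += 1
        for i, x in enumerate(vector):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self._m2[i] += delta * (x - self.mean[i])

    def variance(self):
        """Sample variance per dimension; zero until two inputs are seen."""
        if self.count < 2:
            return [0.0] * len(self.mean)
        return [m2 / (self.count - 1) for m2 in self._m2]
```

Each update shifts the mean (the position of the cluster) and changes the variance (its size and shape), which corresponds to the statistically described distribution discussed above.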

wherein the plurality of similarity scores is based on a comparison of mean values from the set of matching statistical descriptions and the variance of a statistical description of the first set of statistical descriptions within the set of matching statistical descriptions. Although Cobb uses the choice and vigilance test, probabilistic metrics, and Euclidean metrics to compare information derived from a new set of data with that derived from previous data (clusters with statistical description) and although these tests, at each ART/SOM level in the learning framework, involve a quantification of the matching between the two sets of data for that level, he does not disclose that the input data that is being compared to existing clusters is represented statistically to make that comparison. 
However, Dupont, in the analogous environment of trend behavior modeling for anomaly detection, teaches wherein the plurality of similarity scores is based on a comparison of mean values from the set of matching statistical descriptions and the variance of a statistical description of the first set of statistical descriptions within the set of matching statistical descriptions ([1060, 1039-1043, 1070], A deviation is detected when the absolute value of the difference between the analyzed feature descriptor (3080) and the reference feature descriptor (3080) is larger than A times the variance of the reference feature descriptor (3080) across the reference observations., A feature descriptor (3080) is an aggregated value computed over a time interval… One or more of the following definitions are available for a feature descriptor (3080): the average value of the feature (2920), the variance of the feature (2920), any combination of the previous values., The system generates alerts (305) for detected anomalies (270)… [that] is Conf*Rel≥km where Conf is the confidence level (870) of the anomaly, Rel its relevance (280), and k is a threshold for alert generation., wherein, based upon 
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Cobb to incorporate the teachings of Dupont to compare statistical representations of two sets of data by using a statistical similarity measure/score to detect trend behavior. The modification would have been obvious because one of ordinary skill would have been motivated to improve the accuracy and efficiency of detecting anomalous behavior by using a less memory intensive and less intrusive statistical representation of components of the dataset for representing behavior over time and by using this representation framework to achieve continuous/real-time monitoring and responsiveness to detected behavioral changes (Dupont, [0150, 0244, 0974, 1210]).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Cobb and Dupont to incorporate the teachings of Cobb_3 for the same reasons as pointed out for claim 1.
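By way of a hedged example, the feature descriptors quoted from Dupont (an aggregated value, such as the average or the variance of a feature, computed over a time interval) may be computed over a sliding window as sketched below. The function names, the dictionary layout, and the choice of sample variance are assumptions for illustration and are not taken from Dupont.

```python
from statistics import mean, variance


def feature_descriptor(values):
    """Aggregate one window of feature values into a mean and a variance.

    Assumes at least two values, since sample variance needs two points.
    """
    return {"mean": mean(values), "variance": variance(values)}


def sliding_descriptors(series, window):
    """Compute one descriptor per position of a sliding window."""
    return [
        feature_descriptor(series[i:i + window])
        for i in range(len(series) - window + 1)
    ]
```

A descriptor computed over an earlier window can then serve as the reference against which the descriptor of a more recent window is compared.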

Claim 8 is also rejected because it is just a computer-readable storage medium implementation of the same subject matter of claim 1 which can be found in Cobb, Dupont, and Cobb_3. It is noted that claim 8 also recites a computer-readable storage medium for instructions which can be found in Cobb ([0033] , One embodiment of the invention is implemented as a program product for use with a computer system. The program(s) of the program product defines functions of the embodiments (including the methods described herein) and can be contained on a variety of computer-readable storage media.).

Claim 9/8 is also rejected because it is just a computer-readable storage medium implementation of the same subject matter of claim 2/1 which can be found in Cobb, Dupont, and Cobb_3.

Claim 10/8 is also rejected because it is just a computer-readable storage medium implementation of the same subject matter of claim 3/1 which can be found in Cobb, Dupont, and Cobb_3.

Claim 11/8 is also rejected because it is just a computer-readable storage medium implementation of the same subject matter of claim 4/1 which can be found in Cobb, Dupont, and Cobb_3.

Claim 12/8 is also rejected because it is just a computer-readable storage medium implementation of the same subject matter of claim 5/1 which can be found in Cobb, Dupont, and Cobb_3.

Claim 15 is also rejected because it is just a system implementation of the same subject matter of claim 1 which can be found in Cobb, Dupont, and Cobb_3. It is noted that claim 15 also recites a processor and memory for storing code which can be found in Cobb ([0009], Another embodiment of the invention includes a computer-readable storage medium containing a program which, when executed by a processor, performs an operation for analyzing a sequence of video frames depicting a scene captured by a video camera.).

Claim 16/15 is also rejected because it is just a system implementation of the same subject matter of claim 2/1 which can be found in Cobb, Dupont, and Cobb_3.

Claim 17/15 is also rejected because it is just a system implementation of the same subject matter of claim 3/1 which can be found in Cobb, Dupont, and Cobb_3.

Claim 18/15 is also rejected because it is just a system implementation of the same subject matter of claim 4/1 which can be found in Cobb, Dupont, and Cobb_3.

Claim 19/15 is also rejected because it is just a system implementation of the same subject matter of claim 5/1 which can be found in Cobb, Dupont, and Cobb_3.

Regarding claim 22, rejection of claim 15 is incorporated and Cobb further teaches wherein the neuro-linguistic module includes a mapper configured to generate a set of clusters for each feature from a plurality of features of the first set of input data, ([0050, 0057], A second layer contains sequences of clusters of features (e.g., sequences of ART network labels to which Successive kinematic data vectors are mapped to), and a third layer contains clusters of sequences of clusters of features, etc., This approach allows distinct object types to emerge from the clustering of micro features (e.g., using an ART network to cluster the micro features). For example, the micro features of multiple passenger cars may all map to a common cluster in an ART network at the cluster layer d 315, and therefore, be presumed as being instances of a common agent type.,  wherein clusters are formed in a given layer based upon a mapping between input data to higher order features performed from a preceding sequencing layer and wherein the ART is a part of this mapping process which associates the features used in a given level with clusters formed by the ART network(s) for that level.) the system configured to determine that a cluster from the set of clusters has statistical significance when an amount of input data mapping to that cluster exceeds a threshold amount, ([0061, 0069], For example, the cluster layer may map the input data to a cluster in an adaptive resonance theory (ART) network. That is, the ART network is used to generate clusters modeling the input data. Once the clusters have matured, e.g., after a specified period of time or after clustering a specified minimum number of input data values, new input data values are mapped to clusters in the ART network. 
In such a case, the output of that cluster layer may be a sequence of labels assigned to the particular ART network clusters to which successive input data are mapped., If the distance between the input data and the closest cluster in the ART network 625 exceeds a specified amount, or if the closest cluster has not been reinforced a specified minimum number of times (i.e., the cluster is “immature”), an alert specifying the occurrence of an anomalous observation may be generated., wherein a cluster has statistical significance for use when at least a certain minimum number of input data values have been mapped to a cluster so as to reinforce (to establish significance of) the representation of the cluster based on the data mapped to it, so that a mature cluster may be used to generate output in the form of a sequence of labels or to declare an anomaly.) the mapper configured to limit symbols that can be sent to a lexical component to statistically significant clusters from the set of clusters. ([0061, 0075, 0078, Figure 7], For example, the cluster layer may map the input data to a cluster in an adaptive resonance theory (ART) network. That is, the ART network is used to generate clusters modeling the input data. Once the clusters have matured, e.g., after a specified period of time or after clustering a specified minimum number of input data values, new input data values are mapped to clusters in the ART network. In such a case, the output of that cluster layer may be a sequence of labels assigned to the particular ART network clusters to which successive input data are mapped., Specifically, the elements of k0, k1, k2, k3, k4, k5, k6, k7 in the trajectory vector 732 are mapped to clusters labeled D, D, B, B, C, C, F, F, respectively. Note, the particular sequence at which nodes of the trajectory vector 732 are mapped into the SOM 325 creates an ordered sequence 747. Removing redundant elements results in a sequence of {D, B, C, F}, shown in FIG. 7 as label sequence 750., FIG. 7 shows sequence labels 750 (which includes the labels of {D, B, C, F}) being supplied the sequence layer 710. In one embodiment, the sequences received by the sequence layer 710 may be stored in a pool and once a threshold number of trajectories are available (e.g., 100) the voting experts component 760 may generate (or update) the ngram trie 755, as well as determine the internal entropies and the boundary entropies for each traceable sequences in the ngram trie 755., wherein the ART output is a sequence of symbols associated with mature clusters (thereby restricted by/conditioned on the maturity state of the cluster as noted above) such that this sequence is received by a sequence layer which performs lexical analysis on those symbols.)
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Cobb to incorporate the teachings of Dupont and Cobb_3 for the same reasons as pointed out for claim 15.
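The two operations discussed for claim 22, namely restricting output to clusters that have been reinforced a minimum number of times and collapsing the ordered label sequence D, D, B, B, C, C, F, F into {D, B, C, F}, may be sketched as follows. This is a minimal illustration only; the function names, the reinforcement-count dictionary, and the `min_count` parameter are hypothetical and are not drawn from Cobb, which does not state a specific minimum.

```python
from itertools import groupby


def label_sequence(labels):
    """Remove consecutive repeats from an ordered sequence of cluster labels."""
    return [label for label, _ in groupby(labels)]


def mature_labels(labels, reinforcement_counts, min_count):
    """Keep only labels whose cluster was reinforced at least min_count times.

    min_count is a hypothetical stand-in for the "specified minimum number".
    """
    return [
        label for label in labels
        if reinforcement_counts.get(label, 0) >= min_count
    ]
```

Applied to the ordered sequence quoted above, `label_sequence` yields D, B, C, F, which corresponds to label sequence 750.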

wherein the executing the plurality of executable codelets is performed as part of a cognitive cycle that further includes iteratively copying memories and percepts to a workspace. Although each of Cobb and Dupont teaches pattern detection as previously noted, neither explicitly discloses the use of codelets to perform this function. 
However, Cobb_3, in the analogous environment of trend behavior modeling for anomaly detection, teaches  wherein the executing the plurality of executable codelets is performed as part of a cognitive cycle that further includes iteratively copying memories and percepts to a workspace; ([0043, Figure 2, Figure 3],    The workspace 240 may be configured to copy information from the perceptual memory 230, retrieve relevant memories from the episodic memory 235 and the long-term memory 225, select which codelets 245 to execute. In one embodiment, each codelet 245 is a software program configured to evaluate different sequences of events and to determine how one sequence may follow (or otherwise relate to) another (e.g., a finite state machine). More generally, the codelets may provide a software module configured to detect interesting patterns from the streams of data fed to the machine-learning engine. In turn, the codelet 245 may create, retrieve, reinforce, or modify memories in the episodic memory 235 and the long-term memory 225. By repeatedly scheduling codelets 245 for execution, copying memories and percepts to/from the workspace 240, the machine-learning engine 140 performs a cognitive cycle used to observe, and learn, about pattern of behavior that occur within the scene., wherein the codelets are executed to repeatedly copy memories and percepts to and from a workspace in each cognitive cycle for the purpose of analyzing those memories and percepts (e.g., pattern detection).) 
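For illustration, one pass of the cognitive cycle described in the quoted passage (copying percepts and memories to a workspace, executing codelets against it, and storing what the codelets find) might look like the sketch below. All names here are hypothetical, the workspace is reduced to a dictionary, and the single deterministic codelet is an assumed example rather than a codelet disclosed in Cobb_3.

```python
def cognitive_cycle(perceptual_memory, long_term_memory, codelets):
    """Run one cycle: copy data to a workspace, then execute each codelet."""
    workspace = {
        "percepts": list(perceptual_memory),   # copied, not shared
        "memories": list(long_term_memory),
    }
    findings = []
    for codelet in codelets:
        result = codelet(workspace)
        if result is not None:
            findings.append(result)
            long_term_memory.append(result)    # codelet output becomes a memory
    return findings


def repeated_event_codelet(workspace):
    """Deterministic codelet: report the first event that occurs twice in a row."""
    percepts = workspace["percepts"]
    for first, second in zip(percepts, percepts[1:]):
        if first == second:
            return ("repeat", first)
    return None
```

Calling `cognitive_cycle` repeatedly with fresh percepts corresponds, in this simplified form, to the repeated scheduling of codelets described in the quoted passage.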

Claims 6, 7, 13, 14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Cobb, in view of Dupont, in view of Cobb_3, and in further view of Cobb et al. (US2010/0260376), hereinafter referred to as Cobb_2.

Regarding claim 6, rejection of claim 1 is incorporated and Cobb further teaches receiving a third set of input data, from the at least one sensor and during a third time period; performing, via the neuro-linguistic module, neural network-based linguistic analysis of the third set of input data, to generate a third neuro-linguistic model that includes a third set of statistical descriptions; …, the third time period preceding the first and second time periods ([0038, 0061], The computer vision engine 135 processes the received video data frame-by-frame, while the machine-learning engine 140 processes data every N-frames., Once the clusters have matured, e.g., after a specified period of time or after clustering a specified minimum number of input data values, new input data values are mapped to clusters in the ART network., wherein any set of N-frames is collected in a third period preceding a first time period which in turn precedes a second time period after one or more previous sets of N-frames has been previously collected, particularly if the analysis of the previous sets of N-frames have resulted in the generation of clusters (even prototype clusters) with a statistical description in the third time period and if the analysis of the frames in the first time period similarly may be characterized by the emergence or further development of clusters distinct from those associated with the third time period.)
However, Cobb, Dupont, and Cobb_3 do not explicitly teach merging the first neuro-linguistic model with the third neuro-linguistic model. Although Cobb as well as Cobb_3 teach the mapping/incorporation of input data into immature clusters such that a pre-existing immature cluster may be considered a first model, thereby incorporating the new information into the new model, they do not explicitly teach the mergence of two models each explicitly characterized by a statistical description. Furthermore, Dupont does not explicitly teach that the statistical models associated with successive peer-group referentials are merged; rather he states that they are re-computed (corresponding to updated). 
However, Cobb_2, in the analogous environment of performing trend analysis to detect anomalies, teaches merging the first neuro-linguistic model with the third neuro-linguistic model. ([0054], As stated, clusters of a given ART network 315 may dynamically expand and contract by learning as the mean and variance from the prototypical cluster value changes based on inputs to that ART network 315. Further, multiple clusters may collapse to a single cluster when they overlap by a specified amount (e.g., the two clusters share greater than a specified percentage of their area). In such a case, the mean and variance of each cluster contributes to the mean and variance of the merged cluster. Additionally, the statistical significance of each cluster participating in the merger may contribute to a significance determined for the merged cluster., wherein two clusters (first linguistic model and third linguistic model) are collapsed/merged to a single cluster based upon the overlap of their statistical description.) 
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Cobb, Dupont, and Cobb_3 to incorporate the teachings of Cobb_2 to have merged two models based on their statistical representation of sets of data. The modification would have been obvious because one of ordinary skill would have been motivated to improve the efficiency of detecting anomalous behavior by restraining the growth of the number of clusters to a manageable number (Cobb_2, [0059]).
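As a sketch only, the statement in Cobb_2 that the mean and variance of each cluster contribute to the mean and variance of the merged cluster can be illustrated with a count-weighted pooling of two clusters. The function name and the pooling formula (count-weighted mean and pooled population variance, per dimension) are assumptions for illustration; Cobb_2 does not state the formula, and the overlap test that triggers a merger is not modeled here.

```python
def merge_clusters(n1, mean1, var1, n2, mean2, var2):
    """Merge two clusters given counts, per-dimension means, and variances.

    Assumption: variances are population variances and each cluster is
    weighted by the number of inputs mapped to it.
    """
    n = n1 + n2
    mean = [(n1 * a + n2 * b) / n for a, b in zip(mean1, mean2)]
    var = [
        (n1 * (va + (a - m) ** 2) + n2 * (vb + (b - m) ** 2)) / n
        for a, b, va, vb, m in zip(mean1, mean2, var1, var2, mean)
    ]
    return n, mean, var
```

Under these assumptions the merged cluster is never smaller in count than either participant, which is consistent with the stated purpose of restraining growth in the number of clusters.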

Regarding claim 7, rejection of claim 1 is incorporated and Cobb, Dupont, and Cobb_3  do not further teach updating the second neuro-linguistic model by merging the second neuro-linguistic model with the first neuro-linguistic model. Although each of Cobb and Cobb_3 teaches the mapping/incorporation of input data into immature clusters such that a pre-existing immature cluster may be considered a first model, thereby incorporating the new information into the new model, each does not explicitly teach the mergence of two models each explicitly characterized by a statistical description. Furthermore, Dupont does not explicitly teach that the statistical models associated with successive peer-group referentials are merged; rather he states that they are re-computed (corresponding to updated). 
However, Cobb_2, in the analogous environment of performing trend analysis to detect anomalies, teaches updating the second neuro-linguistic model by merging the second neuro-linguistic model with the first neuro-linguistic model. ([0054, 0026], As stated, clusters of a given ART network 315 may dynamically expand and contract by learning as the mean and variance from the prototypical cluster value changes based on inputs to that ART network 315. Further, multiple clusters may collapse to a single cluster when they overlap by a specified amount (e.g., the two clusters share greater than a specified percentage of their area). In such a case, the mean and variance of each cluster contributes to the mean and variance of the merged cluster. Additionally, the statistical significance of each cluster participating in the merger may contribute to a significance determined for the merged cluster., Further, the clusters may decay over time. For example, if a cluster does not receive a set of input data (reinforcing the importance of that cluster) for a specified period of time. Such a cluster may be removed from an ART network., wherein two clusters (first linguistic model and second linguistic model) are collapsed/merged to a single cluster based upon the overlap of their statistical description such that the result of the mergence is being interpreted as an update to the second model since it is an incorporation of more recent information.) 
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Cobb, Dupont, and Cobb_3 to incorporate the teachings of Cobb_2 to have merged two models based on their statistical representation of sets of data. The modification would have been obvious because one of ordinary skill would have been motivated to improve the efficiency of detecting anomalous behavior by restraining the growth of the number of clusters to a manageable number (Cobb_2, [0059, 0026]).

Regarding claim 13, rejection of claim 8 is incorporated and Cobb further teaches … the first neuro-linguistic model with a third neuro-linguistic model, the third neuro-linguistic model including a third set of statistical descriptions for a third input data, wherein the third neuro-linguistic model provides a set of statistical descriptions of the third input data transmitted from the at least one sensor during a third time period,  the third time period preceding the first and second time periods ([0038, 0061], The computer vision engine 135 processes the received video data frame-by-frame, while the machine-learning engine 140 processes data every N-frames., Once the clusters have matured, e.g., after a specified period of time or after clustering a specified minimum number of input data values, new input data values are mapped to clusters in the ART network., wherein any set of N-frames is collected in a third period preceding a first time period which in turn precedes a second time period after one or more previous sets of N-frames has been previously collected, particularly if the analysis of the previous sets of N-frames have resulted in the generation of clusters (even prototype clusters) with a statistical description in the third time period and if the analysis of the frames in the first time period similarly may be characterized by the emergence or further development of clusters distinct from those associated with the third time period.)
However, Cobb does not explicitly teach merging the first neuro-linguistic model with a third neuro-linguistic model, the third neuro-linguistic model including a third set of statistical descriptions for a third input data. Although each of Cobb and Cobb_3 teaches the mapping/incorporation of input data into immature clusters such that a pre-existing immature cluster may be considered a first model, thereby incorporating the new information into the new model, each does not explicitly teach the mergence of two models each explicitly characterized 
However, Cobb_2, in the analogous environment of performing trend analysis to detect anomalies, teaches merging the first neuro-linguistic model with a third neuro-linguistic model, the third neuro-linguistic model including a third set of statistical descriptions for a third input data. ([0054], As stated, clusters of a given ART network 315 may dynamically expand and contract by learning as the mean and variance from the prototypical cluster value changes based on inputs to that ART network 315. Further, multiple clusters may collapse to a single cluster when they overlap by a specified amount (e.g., the two clusters share greater than a specified percentage of their area). In such a case, the mean and variance of each cluster contributes to the mean and variance of the merged cluster. Additionally, the statistical significance of each cluster participating in the merger may contribute to a significance determined for the merged cluster., wherein two clusters (first linguistic model and third linguistic model) are collapsed/merged to a single cluster based upon the overlap of their statistical description.) 
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Cobb, Dupont, and Cobb_3 to incorporate the teachings of Cobb_2 to compare statistical representations of two sets of data by using a statistical similarity measure to detect trend behavior. The modification would have been obvious because one of ordinary skill would have been motivated to improve the efficiency of detecting anomalous behavior by restraining the growth of the number of clusters to a manageable number (Cobb_2, [0059]).

Claim 14/8 is also rejected because it is just a computer-readable storage medium implementation of the same subject matter of claim 7/1 which can be found in Cobb, Dupont, Cobb_3, and Cobb_2.

Claim 20/15 is also rejected because it is just a system implementation of the same subject matter of claim 13/8 which can be found in Cobb, Dupont, Cobb_3, and Cobb_2.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Eaton et al. (US2009/0087085, Published 2 April 2009) teach a pattern/trend recognition system using the semantics of trajectories in a video stream which are distributions of values. 
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ROBERT LEWIS KULP whose telephone number is (571)272-7983. The examiner can normally be reached M, Th, F 8-5:30; Tu 8-3.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached on 571-270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ROBERT LEWIS KULP/Examiner, Art Unit 2124                                                                                                                                                                                                        
/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124