DETAILED ACTION
This action is in response to the claims filed 02/07/2022. Claims 1-26 are pending and have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant's arguments filed 02/07/2022 with respect to 35 U.S.C. 101 have been fully considered but they are not persuasive.
Applicant states that using a neural network to infer content from individual images cannot practically be performed in the mind. Examiner notes that, to infer content about an image, one may simply look at the image and determine a feature of it, such as whether or not the image contains the color “black”. Examiner further notes that even a sufficiently simple neural network could process a single input value, or a handful of input values, representing the image with a single layer to make such a simple inference. This amounts to a single matrix multiplication operation. Furthermore, as stated in the rejection, the additional element “using a neural network” only generally links the inference to a particular technology.
Further, applicant compares the present application to “Example 39, from the USPTO’s Subject Matter Eligibility Examples”, stating “Like the claim from example 39….claim 1…provides specific improvements in training neural network”. Examiner notes that the claims in the present application are dissimilar to the cited example. Furthermore, claims are considered according to the analysis established in MPEP 2106, not according to their similarity to specific cases. In
Applicant’s other arguments filed 02/07/2022 pertaining to the rejections under 35 U.S.C. 102 and 35 U.S.C. 103 have been fully considered and are persuasive.
In view of applicant’s arguments, Sabokrou does not disclose all elements of at least claim 1. For this reason, updated rejections for claims 1-26 are presented in view of Patraucean et al.
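For illustration only, the single-layer observation made in the response to the 35 U.S.C. 101 arguments can be sketched as follows. This is a minimal sketch under stated assumptions, not a characterization of the claimed network: the function names, the choice to summarize the image by its darkest pixel intensity, and the weight and tolerance values are all hypothetical.

```python
def single_layer(weights, bias, inputs):
    """One dense layer with one output unit: a 1 x N matrix product plus a bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def contains_black(pixels, tolerance=0.05):
    """Return True if a flat list of intensities (0.0 = black, 1.0 = white)
    contains a pixel within `tolerance` of black."""
    # Assumed representation: the image is summarized by a single input
    # value, the intensity of its darkest pixel.
    features = [min(pixels)]
    # Assumed weight and bias: the output is positive only when the darkest
    # pixel is darker than `tolerance`.
    score = single_layer([-1.0], tolerance, features)
    return score > 0.0
```

Under these assumptions, `contains_black([0.0, 0.8, 0.9])` evaluates one multiplication and one addition before thresholding the result.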
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1, 2, 6-8, and 21-26 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
Regarding Claim 1
Step 1 Analysis: The claim is directed to a processor, which is directed to a product, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a processor. Each of the following limitations:
to infer content from individual images in a sequence of images
to infer changes in the content in the sequence of images
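as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. For example, but for the generic computer components language (“processor…one or more arithmetic logic units”), the above limitations in the context of this claim encompass the following: “to infer content from individual images in a sequence of images” and “to infer changes in the content in the sequence of images” (corresponds to an evaluation performed in the human mind). As such the claim recites an abstract idea.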
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional element(s) of “processor…one or more arithmetic logic units”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. In addition, the claim recites the additional element(s) “use at least one neural network”, which only generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration 

Regarding Claim 2
Step 1 Analysis: The claim is directed to a processor, which is directed to a product, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a processor. Each of the following limitations:
 a probabilistic model to determine an anomalous event in the sequence of images in response
obtaining information associated with the changes in the content in the sequence of images
obtaining information associated with errors from reconstructing the sequence of images.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. For example, but for the generic computer components language (“processor…one or more arithmetic logic units”), the above limitations in the context of this claim encompass the following: “a probabilistic model to determine an anomalous event in the sequence of images in response” (corresponds to an evaluation performed in the human mind). Examiner notes that a probabilistic model as claimed, “obtaining information associated with the changes in the content in the sequence of images” and “obtaining information associated with errors from reconstructing the sequence of images”, corresponds to an evaluation or judgment performed in the human mind. As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional element(s) of “processor…one or more arithmetic logic units”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 6
Step 1 Analysis: The claim is directed to a processor, which is directed to a product, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a processor. Each of the following limitations:
wherein content from individual images in the sequence of images includes spatial information.  
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. For example, but for the generic computer components language (“processor…one or more arithmetic logic units”), the above limitations in the context of this claim encompass “wherein content from individual images in the sequence of images includes spatial information.” and further defines the abstract idea, the above limitation including: “to infer content from individual images in a sequence of images” and “to infer changes in the content in the sequence of images” (corresponds to an evaluation performed in the human mind). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional element(s) of “processor…one or more arithmetic logic units”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 7
Step 1 Analysis: The claim is directed to a processor, which is directed to a product, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a processor. Each of the following limitations:
wherein changes in the content in the sequence of images includes temporal information.
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. For example, but for the generic computer components language (“processor…one or more arithmetic logic units”), the above limitations in the context of this claim encompass “wherein changes in the content in the sequence of images includes temporal information.” and further define the abstract idea, the above limitations including: “to infer content from individual images in a sequence of images” and “to infer changes in the content in the sequence of images” (corresponds to an evaluation performed in the human mind). This is because the human mind can make evaluations as to whether image content has changed over a temporal period. As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional element(s) of “processor…one or more arithmetic logic units”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the 

Regarding Claim 8
Step 1 Analysis: The claim is directed to a processor, which is directed to a product, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a processor. Each of the following limitations:
receive the sequence of images
provide the sequence of images for anomalous event detection
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. For example, but for the generic computer components language (“processor…one or more arithmetic logic units”), the above limitations in the context of this claim encompass the following: “receive the sequence of images” and “provide the sequence of images for anomalous event detection” (corresponds to an evaluation or judgment performed in the human mind). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional element(s) of “processor…one or more arithmetic logic units”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. In addition, the claim recites the additional element(s) “at least one or more stationary video cameras” and “the one or more stationary video cameras provide…without reconfigurations”, which only generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 21
Step 1 Analysis: The claim is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method. Each of the following limitations:
to infer content from individual images in a sequence of images
to infer changes in the content in the sequence of images
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. For example, the above limitations in the context of this claim encompass the following: “to infer content from individual images in a sequence of images” and “to infer changes in the content in the sequence of images.” (corresponds to an evaluation performed in the human mind). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The claim recites the additional element(s) “using a first/second portion of at least one neural network”, which only generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a 

Regarding Claim 22
Step 1 Analysis: The claim is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a step for carrying out the method of claim 21. The Step 2A Prong One Analysis for claim 21 is applicable here since claim 22 carries out the method of claim 21 but for the recitation of the additional element “wherein the first portion is a convolutional autoencoder.” (particular technological environment or field of use).
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). In addition, the claim recites the additional element(s) “wherein the first portion is a convolutional autoencoder.”, which only generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the 

Regarding Claim 23
Step 1 Analysis: The claim is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a step for carrying out the method of claim 21. The Step 2A Prong One Analysis for claim 21 is applicable here since claim 23 carries out the method of claim 21 but for the recitation of the additional element “wherein the second portion is a Long Short-Term Memory (LSTM).” (particular technological environment or field of use).
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). In addition, the claim recites the additional element(s) “wherein the second portion is a Long Short-Term Memory (LSTM)”, which only generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer 

Regarding Claim 24
Step 1 Analysis: The claim is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method. Each of the following limitations:
to infer content from individual images in a sequence of images
to infer changes in the content in the sequence of images
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. For example, the above limitations in the context of this claim encompass “wherein content from individual images in the sequence of image includes one or more latent representations of the individual images” and further defines the abstract idea, the above limitation including: “to infer content from individual images in a sequence of images” and “to infer changes in the content in the sequence of images” (corresponds to an evaluation performed in the human mind). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) that are mere instructions to
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 25
Step 1 Analysis: The claim is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a method. Each of the following limitations:
to determine one or more anomalous events in the sequence of images based at least in part on changes in the content in the sequence of images.  
as drafted, is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including an observation, evaluation, judgment, opinion)) but for the recitation of generic computer components. For example, the above limitations in the context of this claim encompass the following: “to determine one or more anomalous events in the sequence of images based at least in part on changes in the content in the sequence of images.” (corresponds to an evaluation performed in the human mind). As such the claim recites an abstract idea.
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The claim recites the additional element(s) “using a third portion of at least one neural network”, which only generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.

Regarding Claim 26
Step 1 Analysis: The claim is directed to a method, which is directed to a process, one of the statutory categories.
Step 2A Prong One Analysis: The claim recites a step for carrying out the method of claim 21. The Step 2A Prong One Analysis for claim 21 is applicable here since claim 26 carries out the method of claim 21 but for the recitation of the additional element “wherein the third portion of the at least one neural network is a probabilistic model” (particular technological environment or field of use).
Step 2A Prong Two Analysis: The judicial exception is not integrated into a practical application. In particular, the claim recites additional element(s) that are mere instructions to implement an abstract idea on a computer, or merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). In addition, the claim recites the additional element(s) “wherein the third portion of the at least one neural network is a probabilistic model”, which only generally link the use of the judicial exception to a particular technological environment or field of use. See MPEP 2106.05(h). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, the claim is not patent eligible.


Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.



Claims 1, 3-7, 9, 17, and 21-24 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Patraucean et al., “Spatio-temporal video autoencoder with differentiable memory”, hereinafter Patraucean.


Claim 1
Patraucean teaches, A processor, comprising: one or more arithmetic logic units (ALUs) to: (pg 9 ¶02 “We trained our architecture on Camvid… However, due to memory limitations, we were not able to use all the (unlabelled) available frames” Examiner notes that a system which trains and has memory constraints was implemented on a processor or CPU, which consists of ALUs.) use at least one neural network to infer content from individual images in a sequence of images; (pg 4 ¶01 “At each time step t, the LSTM module receives as input a new video frame after projection in the spatial feature space.” The module performs a spatial projection on a video frame, which is an image from a sequence of images. The projection reveals or infers content from the image. Pg 3 Section 3.1 and Figure 1 “The spatial autoencoder is a classic convolutional encoder – decoder architecture. The encoder contains at least one convolutional layer” As shown in figure 1, a convolutional encoder, which is a neural network, performs the spatial projection.) and use the at least one neural network to infer changes in the content in the sequence of images (pg 4 ¶01-02 “the LSTM module receives as input a new video frame after projection in the spatial feature space… we replace the fully connected transformations with spatial local convolutions…. the LSTM module outputs a memory map… temporal features learnt by the memory” The LSTM module, corresponding to the “at least one neural network”, infers temporal features or changes in the spatial convolutions. The spatial convolutions are derived from each new video frame, or image.)
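For illustration only, the two-stage mapping discussed above (a per-frame spatial projection followed by a module that carries state across frames) can be sketched as follows. This is a minimal sketch under stated assumptions and is not Patraucean's implementation: frames are flat lists of numbers, the encoder is a single 1-D convolution with tanh, and an exponential moving average stands in for the LSTM memory. All names and the `decay` parameter are hypothetical.

```python
import math


def encode_frame(frame, kernel):
    """Per-frame spatial projection: sliding-window weighted sum
    (convolution in the neural network sense) followed by tanh."""
    k = len(kernel)
    return [
        math.tanh(sum(kernel[j] * frame[i + j] for j in range(k)))
        for i in range(len(frame) - k + 1)
    ]


def run_sequence(frames, kernel, decay=0.5):
    """Encode each frame, then compare it with a memory carried across time.

    Returns one change score per frame after the first.
    """
    memory = None
    changes = []
    for frame in frames:
        features = encode_frame(frame, kernel)
        if memory is None:
            memory = features
            continue
        # Temporal signal: distance between the new features and the memory.
        changes.append(sum(abs(f - m) for f, m in zip(features, memory)))
        # Simplified recurrent update (a stand-in for the LSTM, not an LSTM).
        memory = [decay * m + (1 - decay) * f for f, m in zip(features, memory)]
    return changes
```

In this sketch a static sequence yields change scores of zero, while a frame that differs from its predecessors yields a positive score.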

Claim 3
Patraucean teaches claim 1  
Patraucean teaches, wherein the one or more ALUs are to train a first component of the at least one neural network, (pg 5 Section 3.2.4 “Training the network comes down to minimising the reconstruction error between the predicted next frame and the ground truth next frame” pg 5 Section 4 “The training was done using rmsprop” the joint LSTM and convolutional autoencoder is trained according to the reconstruction error.) wherein the first component is an autoencoder with an internal layer that maps the sequence of images to generate one or more latent representations in a feature space. (Figure 1 Section 3.1 pg 3 “The encoder E contains one convolutional layer, followed by tanh non-linearity” the layers of the encoder transform the input data into a latent representation. 
[Figure 1 of Patraucean, reproduced as media_image1.png]
as shown in figure 1 the output of the encoder phase of the autoencoder is the latent representation of the video sequence)
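For illustration only, the quoted training objective (minimising the reconstruction error between the predicted next frame and the ground truth next frame) can be sketched as follows. This is a minimal sketch under stated assumptions: mean squared error is assumed as the error measure, the "network" is a single hypothetical scalar gain, and plain gradient descent is used in place of rmsprop. The learning rate is chosen for small-valued frames and is not taken from the reference.

```python
def reconstruction_error(predicted, target):
    """Mean squared error between two equally sized frames (flat lists)."""
    if len(predicted) != len(target):
        raise ValueError("frames must have the same number of pixels")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def fit_gain(frames, steps=200, lr=0.1):
    """Fit a scalar g so that g * frames[t] approximates frames[t + 1]."""
    g = 0.0
    pairs = list(zip(frames[:-1], frames[1:]))
    for _ in range(steps):
        grad = 0.0
        count = 0
        for current, following in pairs:
            for c, n in zip(current, following):
                # Derivative of (g * c - n) ** 2 with respect to g.
                grad += 2 * (g * c - n) * c
                count += 1
        g -= lr * grad / count
    return g
```

For a sequence in which every frame is twice the previous one, the fitted gain approaches 2 and the reconstruction error of the predicted next frame approaches zero.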

Claim 4
Patraucean teaches claim 3
Patraucean teaches, wherein the autoencoder is a convolutional autoencoder (Pg 3 Section 3.1 and Figure 1 “The spatial autoencoder is a classic convolutional encoder – decoder architecture. The encoder contains at least one convolutional layer” As shown in figure 1, a convolutional encoder, which is a neural network, performs the spatial projection.)

Claim 5
Patraucean teaches claim 3  
Patraucean teaches, wherein the one or more ALUs are to train a second component of the at least one neural network, (pg 5 Section 3.2.4 “Training the network comes down to minimising the reconstruction error between the predicted next frame and the ground truth next frame” pg 5 Section 4 “The training was done using rmsprop” the joint LSTM and convolutional autoencoder is trained according to the reconstruction error.)  wherein the second component is a Long Short-Term Memory (LSTM) that receives the one or more latent representations from the first component to infer changes in the sequence of images over a period of time. (Figure 1 Section 3.1 pg 3 “The encoder E contains one convolutional layer, followed by tanh non-linearity” the layers of the encoder transform the input data into a latent representation. 
[Figure 1 of Patraucean, reproduced as media_image1.png]
as shown in figure 1 the output of the encoder phase of the autoencoder is the latent representation of the video sequence)

Claim 6
Patraucean teaches claim 1  
Patraucean teaches, content from individual images in the sequence of images includes spatial information. (pg 3 Section 3 “Our architecture consists of a temporal autoencoder nested into a spatial autoencoder (see Figure 1).” Pg 4 ¶01 “At each time step t, the LSTM module receives as input a new video frame after projection in the spatial feature space.” The spatial autoencoder outputs a video frame projected into “spatial feature space” to be processed by the LSTM. The spatial feature space vector contains spatial information revealed by the convolutional layer of the encoder.)

Claim 7
Patraucean teaches claim 1  
Patraucean teaches, wherein changes in the content in the sequence of images includes temporal information. (pg 4 ¶01-02 “the LSTM module receives as input a new video frame after projection in the spatial feature space… we replace the fully connected transformations with spatial local convolutions…. the LSTM module outputs a memory map… temporal features learnt by the memory” the LSTM module, corresponding to the “at least one neural network” infers temporal features or changes in the spatial convolutions. The spatial convolutions are derived from each new video frame, or image.)

Claim 9
Patraucean teaches, A system, comprising: one or more computers having one or more processors to train one or more neural networks (pg 9 ¶02 “We trained our architecture on Camvid… However, due to memory limitations, we were not able to use all the (unlabelled) available frames” Examiner notes that a system which trains and has memory constraints was implemented on a processor or CPU, which consists of ALUs) to infer content from individual images in a sequence of images (pg 4 ¶01 “At each time step t, the LSTM module receives as input a new video frame after projection in the spatial feature space.” The module performs a spatial projection on a video frame, which is an image from a sequence of images. The projection reveals or infers content from the image. Pg 3 Section 3.1 and Figure 1 “The spatial autoencoder is a classic convolutional encoder – decoder architecture. The encoder contains at least one convolutional layer” as shown in figure 1, a convolutional encoder, which is a neural network, performs the spatial projection.) [and to infer] changes in the content in the sequence of images. (pg 4 ¶01-02 “the LSTM module receives as input a new video frame after projection in the spatial feature space… we replace the fully connected transformations with spatial local convolutions…. the LSTM module outputs a memory map… temporal features learnt by the memory” the LSTM module, corresponding to the “at least one neural network” infers temporal features or changes in the spatial convolutions. The spatial convolutions are derived from each new video frame, or image.)

Claim 17
Patraucean teaches, A machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least: train one or more neural networks (pg 9 ¶02 “We trained our architecture on Camvid… However, due to memory limitations, we were not able to use all the (unlabelled) available frames” Examiner notes that a system which trains and has memory constraints was implemented on a processor or CPU, which consists of ALUs) to infer content from individual images in a sequence of images (pg 4 ¶01 “At each time step t, the LSTM module receives as input a new video frame after projection in the spatial feature space.” The module performs a spatial projection on a video frame, which is an image from a sequence of images. The projection reveals or infers content from the image. Pg 3 Section 3.1 and Figure 1 “The spatial autoencoder is a classic convolutional encoder – decoder architecture. The encoder contains at least one convolutional layer” as shown in figure 1, a convolutional encoder, which is a neural network, performs the spatial projection.) [and to infer] changes in the content in the sequence of images. (pg 4 ¶01-02 “the LSTM module receives as input a new video frame after projection in the spatial feature space… we replace the fully connected transformations with spatial local convolutions…. the LSTM module outputs a memory map… temporal features learnt by the memory” the LSTM module, corresponding to the “at least one neural network” infers temporal features or changes in the spatial convolutions. The spatial convolutions are derived from each new video frame, or image.)

Claim 21
Patraucean teaches, A method comprising: using a first portion of at least one neural network to infer content from individual images in a sequence of images; (pg 4 ¶01 “At each time step t, the LSTM module receives as input a new video frame after projection in the spatial feature space.” The module performs a spatial projection on a video frame, which is an image from a sequence of images. The projection reveals or infers content from the image. Pg 3 Section 3.1 and Figure 1 “The spatial autoencoder is a classic convolutional encoder – decoder architecture. The encoder contains at least one convolutional layer” as shown in figure 1, a convolutional encoder, which is a neural network, performs the spatial projection.) and using a second portion of the at least one neural network [to infer] changes in the content in the sequence of images. (pg 4 ¶01-02 “the LSTM module receives as input a new video frame after projection in the spatial feature space… we replace the fully connected transformations with spatial local convolutions…. the LSTM module outputs a memory map… temporal features learnt by the memory” the LSTM module, corresponding to the “at least one neural network” infers temporal features or changes in the spatial convolutions. The spatial convolutions are derived from each new video frame, or image.)
Claim 22
Patraucean teaches claim 21
Patraucean teaches, wherein the first portion is a convolutional autoencoder. (Pg 3 Section 3.1 and Figure 1 “The spatial autoencoder is a classic convolutional encoder – decoder architecture. The encoder contains at least one convolutional layer” as shown in figure 1, a convolutional encoder, which is a neural network, performs the spatial projection.)
Claim 23
Patraucean teaches claim 21
Patraucean teaches, wherein the second portion is a Long Short-Term Memory (LSTM). (pg 4 ¶01-02 “the LSTM module receives as input a new video frame after projection in the spatial feature space… we replace the fully connected transformations with spatial local convolutions…. the LSTM module outputs a memory map… temporal features learnt by the memory” the LSTM module, corresponding to the “at least one neural network” infers temporal features or changes in the spatial convolutions. The spatial convolutions are derived from each new video frame, or image.)


Claim 24
Patraucean teaches claim 21  
Patraucean teaches, wherein content from individual images in the sequence of images includes one or more latent representations of the individual images. (Figure 1 Section 3.1 pg 3 “The encoder E contains one convolutional layer, followed by tanh non-linearity” the layers of the encoder transform the input data into a latent representation. 
[Image media_image1.png: Patraucean Figure 1]
as shown in figure 1 the output of the encoder phase of the autoencoder is the latent representation of the video sequence)


Claims 2, 8, 10-16, 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Patraucean, and further in view of Sabokrou et al. “Real-time anomaly detection and localization in crowded scenes” hereinafter Sabokrou.

Claim 2
Patraucean teaches claim 1  
Patraucean does not explicitly teach, use a probabilistic model to determine an anomalous event in the sequence of images in response to obtaining information associated with 
Sabokrou, when addressing using a probabilistic model for anomaly discovery based on content feature vectors and changes in content feature vectors, teaches, use a probabilistic model to determine an anomalous event in the sequence of images (pg 5 ¶02 “To model the normal activities in each video [sequence of images] patch, we incorporate two Gaussian classifiers C1 and C2. For classifying x’ patches, as described, we use two partially independent feature sets (global and local)” the Gaussian classifiers correspond to the probabilistic model, where one of the classes indicates an anomalous event.) in response to obtaining information associated with the changes in the content in the sequence of images (pg 3 ¶01 “a descriptor-based similarity metric between adjacent patches for detecting sudden changes in spatio-temporal domains.” a vector representing “changes in content” is used by the Gaussian model; as shown in figure 1, the vector d is used by a Gaussian model. The similarity metric is the local descriptor used in the Gaussian classifier mentioned above) obtaining information associated with errors from reconstructing the sequence of images. (pg 3 ¶01 “Presenting a feature learning procedure for describing videos for the task of video anomaly localization.” Pg 4 ¶02 “The auto-encoder minimizes the objective defined in Eq. (1) by re-reconstructing the original raw data:
[Image media_image2.png: Sabokrou Eq. (1)]
” The auto-encoder is part of a system that describes a video with global features. Further, the global features learned by the autoencoder are associated with the reconstruction loss because the autoencoder is trained with a loss function that uses reconstruction error, shown in the boxed region of the equation. The feature vector f shown in Figure 1 is the information associated with errors.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the autoencoder system taught by Sabokrou, which uses content feature vectors and change-in-content feature vectors from images to predict anomalies, to the disclosed invention of Patraucean.
One of ordinary skill in the art would have been motivated to make this modification because both autoencoder systems reveal temporal content and spatial content from images. Sabokrou uses these vectors with two classifiers in a system that “achieves accurate and reliable anomaly detection and localization,” which is useful for “real-time surveillance applications, in which we are dealing with live streams of videos” (Sabokrou Conclusion).


Claim 8
Patraucean teaches claim 1  
Patraucean does not explicitly teach, wherein the one or more ALUs to receive the sequence of images from at least one or more stationary video cameras, wherein the one or more stationary video cameras provide the sequence of images for anomalous event detection without reconfigurations.
Sabokrou, when addressing using a probabilistic model for anomaly discovery, teaches, wherein the one or more ALUs (pg 2 ¶02 “The overall scheme of our algorithm is shown in Fig. 1. We achieve 25 fps processing power, and with enduring some bit errors we reach up to 200 fps using a PC with 3.5GHz CPU and 8G RAM in MATLAB 2012a” pg 5 Section 3 ¶01 “We empirically demonstrate that our approach is suitable to be used in surveillance systems”) to receive the sequence of images from at least one or more stationary video cameras, wherein the one or more stationary video cameras provide the sequence of images for anomalous event detection without reconfigurations. (pg 5 last paragraph pg 6 ¶01 “UCSD datasets. This dataset includes two subsets…are recorded with a static camera at 10 fps, with the resolutions 158 × 234 and 240 × 360, respectively. The dominant mobile objects in these scenes are pedestrians. Therefore, any object (e.g., a car, skateboarder, wheelchair, or bicycle) is considered as being an anomaly” the scheme implemented on a CPU, which consists of ALUs, receives video data, or sequences of images, that contain anomalous data for the scheme to detect. The static camera is able to capture images at a constant rate of 10 fps, therefore not needing to be reconfigured to capture a complete sequence of images.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the autoencoder system taught by Sabokrou, which uses content feature vectors and change-in-content feature vectors from images to predict anomalies, to the disclosed invention of Patraucean.
One of ordinary skill in the art would have been motivated to make this modification because both autoencoder systems reveal temporal content and spatial content from images. Sabokrou uses these vectors with two classifiers in a system that “achieves accurate and reliable anomaly detection and localization,” which is useful for “real-time surveillance applications, in which we are dealing with live streams of videos” (Sabokrou Conclusion).

Claim 10
Patraucean teaches claim 9
Patraucean teaches, wherein the one or more processors are to train the one or more neural networks to: input the sequence of images to a first neural network of the one or more neural networks to generate a first set of information representing content from individual images of the sequence of images; (pg 4 ¶01 “At each time step t, the LSTM module receives as input a new video frame after projection in the spatial feature space.” The module performs a spatial projection on a video frame, which is an image from a sequence of images. The projection reveals or infers content from the image. Pg 3 Section 3.1 and Figure 1 “The spatial autoencoder is a classic convolutional encoder – decoder architecture. The encoder contains at least one convolutional layer” as shown in figure 1, a convolutional encoder, which is a neural network, performs the spatial projection.) reproduce the sequence of images using the first set of information; (pg 3 Section 3.2 “and the decoder constrains the learning of its own feature space to satisfy this decomposition and to reconstruct the input,” as shown in figure 1, the input to the decoder is based on the first set of information, the output from the encoder.)
input the first set of information to a second neural network of the one or more neural networks to generate a second set of information associated with the changes in the content in the sequence of images; (Figure 1 Section 3.1 pg 3 “The encoder E contains one convolutional layer, followed by tanh non-linearity” the layers of the encoder transform the input data into a latent representation. 
[Image media_image1.png: Patraucean Figure 1]
as shown in figure 1 the output of the encoder phase of the autoencoder is the latent representation of the video sequence)

Patraucean does not explicitly teach, and use a probabilistic model to generate a third set of information based at least in part on… the second set of information. a probabilistic model to generate a third set of information based at least in part on receiving error measurements associated with the reproduced sequence of images
Sabokrou, when addressing using a probabilistic model for anomaly discovery based on content feature vectors and changes in content feature vectors, teaches, and use a probabilistic model to generate a third set of information based at least in part on… the second set of information. (pg 5 ¶02 “To model the normal activities in each video [sequence of images] patch, we incorporate two Gaussian classifiers C1 and C2. For classifying x’ patches, as described, we use two partially independent feature sets (global and local)” the Gaussian classifiers correspond to the probabilistic model, where one of the classes indicates an anomalous event. The Gaussian classifiers each take a set of information as input; the vector d corresponds to the second set of information. This is shown in Figure 1) based at least in part on receiving error measurements associated with the reproduced sequence of images (pg 3 ¶01 “Presenting a feature learning procedure for describing videos for the task of video anomaly localization.” Pg 4 ¶02 “The auto-encoder minimizes the objective defined in Eq. (1) by re-reconstructing the original raw data:
[Image media_image2.png: Sabokrou Eq. (1)]
” The auto-encoder is part of a system that describes a video with global features. Further, the global features learned by the autoencoder are associated with the reconstruction loss because the autoencoder is trained with a loss function that uses reconstruction error, shown in the boxed region of the equation. The feature vector f shown in Figure 1 is fed to the probability model. The vector f is generated by the reconstruction process and is therefore based in part on the error measurements associated with reconstruction.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the autoencoder system taught by Sabokrou, which uses content feature vectors and change-in-content feature vectors from images to predict anomalies, to the disclosed invention of Patraucean.
One of ordinary skill in the art would have been motivated to make this modification because both autoencoder systems reveal temporal content and spatial content from images. Sabokrou uses these vectors with two classifiers in a system that “achieves accurate and reliable anomaly detection and localization,” which is useful for “real-time surveillance applications, in which we are dealing with live streams of videos” (Sabokrou Conclusion).

Claim 11
Patraucean/Sabokrou teaches claim 10 
Further Patraucean teaches, wherein the first neural network is a convolutional autoencoder that takes the sequence of images as input to generate the first set of information. (Pg 3 Section 3.1 and Figure 1 “The spatial autoencoder is a classic convolutional encoder – decoder architecture. The encoder contains at least one convolutional layer” as shown in figure 1, a convolutional encoder, which is a neural network, performs the spatial projection.)

Claim 12
Patraucean/Sabokrou teaches claim 11
Patraucean teaches, wherein the convolutional autoencoder maps features of the sequence of images to generate the first set of information in a reduced feature space from which the sequence of images can be approximately reproduced from the first set of information in the reduced feature space. (Figure 1 Section 3.1 pg 3 “The encoder E contains one convolutional layer, followed by tanh non-linearity” the layers of the encoder transform the input data into a latent representation. 
[Image media_image1.png: Patraucean Figure 1]
as shown in figure 1 the output of the encoder phase of the autoencoder is the latent representation of the video sequence)

Claim 13
Patraucean/Sabokrou teaches claim 10
Patraucean teaches, wherein the second neural network is a Long Short-Term Memory (LSTM) that takes the first set of information as input. (Figure 1 Section 3.1 pg 3 “The encoder E contains one convolutional layer, followed by tanh non-linearity” the layers of the encoder transform the input data into a latent representation. 
[Image media_image1.png: Patraucean Figure 1]
as shown in figure 1 the output of the encoder phase of the autoencoder is the latent representation of the video sequence, which is taken as input by the LSTM.)

Claim 14
Patraucean/Sabokrou teaches claim 10
Patraucean teaches, wherein the one or more processors are to train the one or more neural networks (pg 5 Section 3.2.4 “Training the network comes down to minimising the reconstruction error between the predicted next frame and the ground truth next frame” pg 5 Section 4 “The training was done using rmsprop” the joint LSTM and convolutional autoencoder is trained according to the reconstruction error.)  
Further Sabokrou teaches,  to obtain the sequence of images from one or more static video cameras to detect anomalous events in the sequence of images. (pg 5 last paragraph pg 6 ¶01  “UCSD datasets. This dataset includes two subsets…are recorded with a static camera at 10 fps, with the resolutions 158 × 234 and 240 × 360, respectively. The dominant mobile objects in these scenes are pedestrians. Therefore, any object (e.g., a car, skateboarder, wheelchair, or bicycle) is considered as being an anomaly” the scheme implemented on a CPU, which consists of ALUs, receives video data, or sequences of images, that contain anomalous data for the scheme to detect. The static camera is able to capture images at a constant rate of 10 fps, therefore not needing to be reconfigured to capture a complete sequence of images.)

Claim 15
Patraucean/Sabokrou teaches claim 10
Further Sabokrou teaches, wherein the third set of information includes at least one indicator of an anomaly event in the sequence of images. (pg 5 ¶02 “To model the normal activities in each video [sequence of images] patch, we incorporate two Gaussian classifiers C1 and C2. For classifying x’ patches, as described, we use two partially independent feature sets (global and local)…
[Image media_image3.png: excerpt from Sabokrou]
” the Gaussian classifiers correspond to the probabilistic model, where one of the classes signifies an anomalous event. The classification score is indicative of information indicating an anomaly event. pg 3 ¶01 “a descriptor-based similarity metric between adjacent patches for detecting sudden changes in spatio-temporal domains.” Once the video patches (corresponding to sequence of images) are extracted by the neural network, the similarity metric is utilized to detect sudden changes (corresponding to infer changes). The similarity metric is the local descriptor used in the Gaussian classifier mentioned above)

Claim 16
Patraucean/Sabokrou teaches claim 10
Further Sabokrou teaches, wherein the probabilistic model is previously trained (pg 5 ¶02 “To model the normal activities in each video [sequence of images] patch, we incorporate two Gaussian classifiers C1 and C2. For classifying x’ patches, as described, we use two partially independent feature sets (global and local)…” “Selecting a “good” threshold is important for the performance; it can be selected based on training patches” the threshold, indicative of previous training, is selected based on the training patches.) on a collection of training images. (pg 6 ¶01 “This subset includes 12 video samples, and each sample is divided into training and test frames. To evaluate the localization, we utilize the ground truth of all test frames” the frames are a collection of training images for the system)

Claim 18
Patraucean teaches claim 17 
Further Sabokrou teaches, wherein the set of instructions further cause the one or more processors (pg 2 ¶02 “The overall scheme of our algorithm is shown in Fig. 1. We achieve 25 fps processing power, and with enduring some bit errors we reach up to 200 fps using a PC with 3.5GHz CPU and 8G RAM in MATLAB 2012a”) to at least train the at least one neural network by using a probabilistic model to generate information associated with a likelihood of normal behavior in the sequence of images. (pg 5 ¶02 “To model the normal activities in each video [sequence of images] patch, we incorporate two Gaussian classifiers C1 and C2. For classifying x’ patches, as described, we use two partially independent feature sets (global and local)…” “Selecting a “good” threshold is important for the performance; it can be selected based on training patches” the Gaussian classifiers model normal activity, and an indication of an anomaly by the classifier is information associated with a “likelihood of normal behavior.” As shown in figure 1, both classifiers need to agree that the behavior is abnormal.)
For the reasons to combine Patraucean/Sabokrou see at least the rejection of claim 2.


Claims 19, 20, 25, 26 is/are rejected under 35 U.S.C. 103 as being unpatentable over Patraucean/Sabokrou, and further in view of Zong et al. “Deep Autoencoding Gaussian Mixture Model for Unsupervised Anomaly Detection” hereinafter Zong

Claim 19
Patraucean/Sabokrou teaches claim 18 
Patraucean/Sabokrou does not explicitly teach, wherein the probabilistic model is a Gaussian Mixture Model (GMM).
Zong, however, when addressing issues related to using a Gaussian mixture model to determine anomalies informed by features revealed by an autoencoder network teaches, wherein the probabilistic model is a Gaussian Mixture Model (GMM). (pg 5 section 3.3 ¶01 “the estimation network [probabilistic model] performs density estimation under the framework of GMM. [Gaussian mixture model]”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a Gaussian mixture model to classify samples processed by an autoencoder as normal or anomalous, as taught by Zong, into the disclosed invention of Patraucean/Sabokrou.
One of ordinary skill in the art would have been motivated to make this modification in order to implement “DAGMM [deep auto encoding mixture model] [which] demonstrates superior performance over state-of-the-art techniques on public benchmark datasets with up to 14% improvement on the standard F1 score, and suggests a promising direction for unsupervised anomaly detection on multi or high-dimensional data” (Zong Conclusion)

Claim 20
Patraucean/Sabokrou/Zong teaches claim 19
Further Zong teaches, wherein the GMM determines, based at least in part on information associated with the changes in the content in the sequence of images, a score indicating a likelihood of one or more anomalous events. (pg 5 section 3.3 ¶01 “the estimation network performs density estimation under the framework of GMM. [probabilistic model]” Pg 5 section 3.4 “This objective function includes three component… the loss function that characterizes the reconstruction error [error measurements] caused by the deep autoencoder in the compression network… E(zi) models the probabilities that we could observe the input sample” pg 5 section 3.3 ¶03 “during the testing phase with the learned GMM parameters, it is straightforward to estimate sample energy, and predict samples of high energy as anomalies by a pre-chosen threshold” The GMM uses the reconstruction error, which is representative of the changes in the content due to reconstruction by the autoencoder, and a threshold, corresponding to a score, which indicates a probability that a sample is anomalous or normal.)

Claim 25
Patraucean teaches claim 24 
Patraucean does not explicitly teach, using a third portion of the at least one neural network to determine one or more anomalous events in the sequence of images based at least in part on changes in the content in the sequence of images.
Sabokrou, when addressing using a probabilistic model for anomaly discovery based on content feature vectors and changes in content feature vectors, teaches, determine one or more anomalous events in the sequence of images based at least in part on changes in the content in the sequence of images. (pg 5 ¶02 “To model the normal activities in each video [sequence of images] patch, we incorporate two Gaussian classifiers C1 and C2. For classifying x’ patches, as described, we use two partially independent feature sets (global and local)” the Gaussian classifiers correspond to the probabilistic model, where one of the classes indicates an anomalous event.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the autoencoder system taught by Sabokrou, which uses content feature vectors and change-in-content feature vectors from images to predict anomalies, to the disclosed invention of Patraucean.
One of ordinary skill in the art would have been motivated to make this modification because both autoencoder systems reveal temporal content and spatial content from images. Sabokrou uses these vectors with two classifiers in a system that “achieves accurate and reliable anomaly detection and localization,” which is useful for “real-time surveillance applications, in which we are dealing with live streams of videos” (Sabokrou Conclusion).
Patraucean/Sabokrou does not explicitly teach, using a third portion of the at least one neural network to determine one or more anomalous events 
Zong, however, when addressing issues related to using a Gaussian mixture model to determine anomalies informed by features revealed by an autoencoder network teaches, using a third portion of the at least one neural network to determine one or more anomalous events (pg 5 section 3.3 ¶01 “the estimation network performs density estimation under the framework of GMM. [probabilistic model]” Pg 5 section 3.4 “This objective function includes three component… the loss function that characterizes the reconstruction error [error measurements] caused by the deep autoencoder in the compression network… E(zi) models the probabilities that we could observe the input sample” pg 5 section 3.3 ¶03 “during the testing phase with the learned GMM parameters, it is straightforward to estimate sample energy, and predict samples of high energy as anomalies [third set of information] by a pre-chosen threshold” the GMM is the third portion of the neural network; it determines whether a sample is an anomalous event by comparing it to a threshold level. This is based on the reconstruction error, which is a measure indicating the change in a sample due to the reconstruction.)
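It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate a Gaussian mixture model to classify samples processed by an autoencoder as normal or anomalous, as taught by Zong, into the disclosed invention of Patraucean/Sabokrou.
One of ordinary skill in the art would have been motivated to make this modification in order to implement “DAGMM [deep auto encoding mixture model] [which] demonstrates superior performance over state-of-the-art techniques on public benchmark datasets with up to 14% improvement on the standard F1 score, and suggests a promising direction for unsupervised anomaly detection on multi or high-dimensional data” (Zong Conclusion)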

Claim 26
Patraucean/Sabokrou/Zong teaches claim 25
Further Zong teaches wherein the third portion of the at least one neural network is a probabilistic model. ( pg 5 section 3.3 ¶01 “the estimation network [probabilistic model] performs density estimation under the framework of GMM. [Gaussian mixture model]” a GMM is a probabilistic model, which is part of the joint autoencoder-GMM neural network.)

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached M-F 7:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on 571-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/J.R.G./
Examiner, Art Unit 2122

/NICHOLAS KLICOS/Primary Examiner, Art Unit 2145