DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The amendment filed 2022-07-05 has been entered.  The claim status is as follows:
Claims 1-25 remain pending in the application.
Claims 1, 10, and 19 are amended.
Response to Arguments
Applicant's arguments with respect to rejections under 35 USC 101 have been fully considered but they are not persuasive.  Applicant’s recitation of a DNN, generically stated without details about the structure and method of training the DNN amounts to mere instructions to apply an exception.  See MPEP 2106.05(f)(1):  “Whether the claim recites only the idea of a solution or outcome i.e., the claim fails to recite details of how a solution to a problem is accomplished”, which cites Intellectual Ventures I v. Capital One Fin. Corp., 850 F.3d 1332, 121 USPQ2d 1940 (Fed. Cir. 2017):  “Although the claims purported to modify the underlying XML document in response to modifications made in the dynamic document, nothing in the claims indicated what specific steps were undertaken other than merely using the abstract idea in the context of XML documents. The court thus held the claims ineligible, because the additional limitations provided only a result-oriented solution and lacked details as to how the computer performed the modifications, which was equivalent to the words "apply it". 850 F.3d at 1341-42; 121 USPQ2d at 1947-48”.  Here, analogously to the above, Applicant is merely using the abstract idea in the context of a DNN, as there are no specific steps as to how the computer performs the modifications of the DNN.
Applicant's arguments in response to rejections under 35 USC 103 have been fully considered but they are not persuasive.  Applicant argues that “it appears the state of a specific node at time t refers to whether the specific node is awake or sleeping at time t. However, the ‘inferred response data’ recited in claim 1 represents an inference of the received data identified as response data, where the received data is sensor data. It does not represent a state of a specific node in a sensor network.”  Examiner respectfully disagrees, as the “state of a specific node” is still a piece of data, that data being whether or not the node is awake or sleeping.  This piece of data is inferred response data, as it is an inference made in response to received data, wherein the received data was received from a sensor (Xie, Page 12 Section 5.2 Para 2: “inferred by its dependent node at time t-1”).  Thus, the state of one sensor is an inferred response to received data of another sensor.  The term “data” is broad, and may comprise not only the measurements made by the sensor of the environment, but also self-referential information generated by the sensor about the status of the sensor itself.
Furthermore, secondary reference Erden also teaches inferred response data that represents an inference of received data in [0044]:  “For sensors 1201, 1202 that provide overlapping data, landmarks (or other identifiable static features) are defined for use in comparison of abnormalities (e.g., moving objects). Landmarks may be defined using machine learning, e.g., extracting features and finding common patterns from different sensors 1201, 1202.”
Examiner acknowledges that the newly amended matter “and the data processing model to comprise a deep neural network (DNN)” is not taught by the combination of Xie and Erden.  The argument is moot, as Examiner has added Khelifi et al. ("Bringing Deep Learning at the Edge of Information-Centric Internet of Things") to teach this limitation.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-25 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea, particularly a mental process, without significantly more.
Step 1 Analysis:
Claims 1-9 are directed to an apparatus, Claims 10-18 are directed to a non-transitory computer-readable storage medium, and Claims 19-25 are directed to a system.  Thus, each of the claims is directed to one of the four statutory categories of patent eligible subject matter.
Step 2A Prong 1 Analysis:
Independent Claims 1, 10 and 19 recite:
“identify a first portion of the received data as prediction data”; identifying is a mental process
“identify a second portion, different than the first portion, of the received data as response data”; identifying is a mental process
“generate inferred response data based in part on a data processing model and the prediction data, the inferred response data to represent an inference of the received data identified as response data”; generating data based on a model can be performed by a human with pen and paper, and is thus a mental process
“(store either the prediction data or the received data to a memory storage location) based in part on a comparison between the inferred response data, the response data, and an error threshold”; performing a comparison between pieces of data based on an error threshold can be performed by a human with pen and paper, and is thus a mental process
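For illustration only, the four recited limitations above can be sketched as a short procedure.  This is a minimal sketch under stated assumptions and is not Applicant's implementation:  the function names, the sensor identifiers, and the averaging function used as a stand-in for the data processing model are all hypothetical.  The claims recite a DNN, which this sketch does not attempt to reproduce.

```python
from statistics import mean


def split_received(received, predictor_ids):
    """Identify the first portion (prediction data) and the second,
    different portion (response data) of the received sensor data."""
    prediction = {k: v for k, v in received.items() if k in predictor_ids}
    response = {k: v for k, v in received.items() if k not in predictor_ids}
    return prediction, response


def infer_response(prediction, response_ids):
    """Hypothetical stand-in model: infer every response value as the
    mean of the prediction values (an assumption, not a DNN)."""
    estimate = mean(prediction.values())
    return {k: estimate for k in response_ids}


def select_data_to_store(received, predictor_ids, error_threshold):
    """Return the prediction data when the inferred response is within
    the error threshold of the actual response data; otherwise return
    all of the received data. Assumes both portions are non-empty."""
    prediction, response = split_received(received, predictor_ids)
    inferred = infer_response(prediction, response.keys())
    worst_error = max(abs(response[k] - inferred[k]) for k in response)
    return prediction if worst_error <= error_threshold else received
```

With three hypothetical readings {"s1": 20.0, "s2": 22.0, "s3": 21.2} and "s1" and "s2" treated as prediction data, an error threshold of 0.5 selects only the two prediction readings for storage, while a threshold of 0.1 selects all three readings.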
Step 2A Prong 2 Analysis:
The judicial exception is not integrated into a practical application, because additional elements “receive data from a plurality of data provider devices, the received data to represent sensor data” and “store either the prediction data or the received data to a memory storage location” amount to insignificant extra solution activity (mere data gathering, MPEP 2106.05(g)(3)), and mere instructions to apply the judicial exception on a computer (MPEP 2106.05(f)), respectively.  Additional element “the data processing model to comprise a deep neural network (DNN)”, generically stated without details about the structure and method of training the DNN, amounts to mere instructions to apply an exception (see MPEP 2106.05(f)(1)).
As per Claim 19, which additionally recites “an interface”, “a processor”, and “a memory”; the computer and related components are recited at a high-level of generality (i.e., as a generic processor performing a generic computer function) such that it amounts to no more than mere instructions to apply the exception using generic computer components (MPEP 2106.05(f)).  
Accordingly, these additional elements do not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea.
Step 2B Analysis:
The additional elements “receive data from a plurality of data provider devices” and “store either the prediction data or the received data to a memory storage location” are not sufficient to amount to significantly more than the judicial exception, as they amount to well-understood, routine, and conventional activity (receiving data over a network as per MPEP 2106.05(d)(II)(i) and storing data in memory as per MPEP 2106.05(d)(II)(iv), respectively).  Additional element “the data processing model to comprise a deep neural network (DNN)”, generically stated without details about the structure and method of training the DNN, amounts to mere instructions to apply an exception (see MPEP 2106.05(f)(1)).
As per Claim 19, the additional elements (“an interface”, “a processor”, and “a memory”) as described amount to no more than mere instructions to apply the exception using generic computer components (MPEP 2106.05(f)). 
Accordingly, these additional elements are not sufficient to amount to significantly more than the judicial exception.
Dependent Claims:
Dependent claims 2-9, 11-18, and 20-25 are also rejected under 35 USC 101 for the following reasons:
Claims 2, 11, and 20 recite the same limitations as Claims 1, 10, and 19, further reciting: “execute the data processing model with the prediction data as input to generate the inferred response data”; executing a sufficiently simple data processing model can be performed by a human with pen and paper, and is still a mental process.
Claims 3, 12, and 21 recite the same limitations as Claims 1, 10, and 19, further reciting: “each of the plurality of data provider devices comprising at least one sensor, the received data comprising indications of signals received from the at least one sensor of the plurality of data provider devices, the memory storing instructions, which when executed by the processor cause the processor to:
identify the first portion of the received data based in part on the at least one sensor of the plurality of data provider devices associated with the first portion of the received data
and identify the second portion of the received data based in part on the at least one sensor of the plurality of data provider devices associated with the second portion of the received data, wherein the at least one sensor of the plurality of data provider devices associated with the first portion of the received data are different from the at least one sensor of the plurality of data provider devices associated with the second portion of the received data.”; identifying can be performed in the human mind, and is thus a mental process; additional elements “at least one sensor” and “data provider devices” merely provide further details on the “receiving” data (where the data has come from), and receiving data is still insignificant extra solution activity, being mere data gathering as per MPEP 2106.05(g)(3) (via receiving data over a network as per MPEP 2106.05(d)(II)(i)). The claims are still directed to a mental process.
Claims 4, 13, and 22 recite the same limitations as Claims 1, 10, and 19, further reciting: “train the data processing model based in part on the received data, to generate a further trained data processing model”; while a specific method of training a machine learning model is not considered a mental process (see MPEP 2106.04(a)(1)(vii)), a generic recitation of training a model with no further details is not sufficient to overcome this rejection, as it could mean performing a simple linear regression, which can be performed by a human with pen and paper; thus Claims 4 and 13 are still directed to a mental process; Claim 22 is continued below.
Claims 5, 14, and 22 recite the same limitations as Claims 4 and 13, further reciting: “update a version of the data processing model based on the further trained data processing model; store the updated version of the data processing model to a model database; receive additional data from the plurality of data provider devices; identify a first portion of the received additional data as additional prediction data; identify a second portion, different than the first portion, of the received additional data as additional response data; generate additional inferred response data based in part on the updated version of the data processing model and the additional prediction data; add metadata to the received additional data including an indication of the updated version of the data processing model; store either the additional prediction data or the received additional data to the memory storage location based in part on a comparison between the inferred additional response data, the additional response data, and the error threshold”; identifying, generating data, adding metadata, and a “comparison” can be performed by a human with pen and paper, and are thus a mental process; additional elements reciting “receive” from “devices” and “store” to “memory” amount to insignificant extra solution activity (mere data gathering, MPEP 2106.05(g)(3)), and mere instructions to apply the judicial exception on a computer (MPEP 2106.05(f), also storing data in memory as per MPEP 2106.05(d)(II)(iv)), respectively; thus the claims are still directed to a mental process.
Claims 6, 15, and 23 recite the same limitations as Claims 1, 10, and 19, further reciting:  “determine a difference between the response data and the inferred response data; determine whether the difference is less than, or less than or equal to the error threshold; store the prediction data to the memory storage location based on a determination that the difference is less than, or less than or equal to the error threshold”; determining a difference can be performed by a human with pen and paper, and is thus a mental process; additional element “store” to “memory” amounts to mere instructions to implement the judicial exception on a computer (MPEP 2106.05(f), also storing data in memory as per MPEP 2106.05(d)(II)(iv)); thus the claims are still directed to a mental process.
Claims 7 and 16 recite the same limitations as Claims 6 and 15, further reciting:  “store the received data to the memory storage location based on a determination that the difference is not less than, or not less than or equal to the error threshold”; determining a difference can be performed by a human with pen and paper, and is thus a mental process; additional element “store” to “memory” amounts to mere instructions to implement the judicial exception on a computer (MPEP 2106.05(f), also storing data in memory as per MPEP 2106.05(d)(II)(iv)); thus the claims are still directed to a mental process.
Claims 8, 17, and 24 recite the same limitations as Claims 1, 10, and 19, further reciting:  “send an information element comprising indications of either the prediction data or the received data to a cloud computing device or an edge computing device, wherein the cloud computing device or the edge computing device is to store the prediction data or the received data to the memory storage location”; sending an information element amounts to insignificant extra solution activity (necessary data outputting, MPEP 2106.05(g)(3)), via well-understood, routine, and conventional activity (transmitting data over a network, 2106.05(d)(II)(i)); thus the claims are still directed to a mental process.
Claims 9, 18, and 25 recite the same limitations as Claims 1, 10, and 19, further reciting:  “retrieve the prediction data from the memory storage location; generate the inferred response data based in part on the prediction data and the data processing model to retrieve the response data”; generating the inferred response data can be performed by a human with pen and paper, and is thus a mental process; additional element to “retrieve” data from “memory” amounts to mere instructions to apply the judicial exception on a computer (MPEP 2106.05(f)), and is well-understood, routine, and conventional activity (storing data in memory, MPEP 2106.05(d)(II)(iv)); thus the claims are still directed to a mental process.
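The discussion of Claims 4 and 13 above notes that a generic recitation of training could encompass a simple linear regression.  For illustration only, the closed-form least-squares fit of a line can be sketched as follows.  This is a minimal sketch under stated assumptions:  the function name and the sample values are hypothetical, and nothing here is taken from Applicant's specification.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Uses the closed-form solution: slope = Sxy / Sxx, and the fitted
    line passes through the point (mean_x, mean_y). Assumes the x
    values are not all identical, so Sxx is non-zero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

For the hypothetical inputs xs = [1, 2, 3, 4] and ys = [3, 5, 7, 9], the sums are Sxy = 10 and Sxx = 5, giving a slope of 2 and an intercept of 1.  Calling the function again on an extended data set yields an updated fit.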

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 8-13, 17-21, and 24-25 are rejected under 35 U.S.C. 103 as being unpatentable over Xie et al. (“Anomaly Detection and Redundancy Elimination of Big Sensor Data in Internet of Things”; hereinafter “Xie”) in view of Erden (US 2020/0154110 A1) and Khelifi et al. ("Bringing Deep Learning at the Edge of Information-Centric Internet of Things"; hereinafter “Khelifi”).
As per Claim 1, Xie teaches receive data from a plurality of data provider devices, the received data to represent sensor data (Xie, Page 11 Section 5 “Redundancy Elimination of Big Sensor Data in IoT”, discloses:  “In a sensor network, there are many factors which cause data redundancy. For example, where the gap among each node is close, the type of collecting data is similar. Redundant data not only waste the storage space but also exert harmful influence on data feature extraction. In this part, we mainly focus on the methods of redundancy elimination directly from the perspective of gathered sensor data. Two methods are proposed for static and dynamic sensor data redundancy elimination separatively.”  Here, Xie discloses receiving data (“collecting data”) from a plurality of data provider devices, which are sensors (“sensor network”)).
identify a first portion of the received data as prediction data (Xie, Page 12 Section 5.2 Para 2, discloses “The varying dependencies of each node in DBN reflects the real-time characteristic of a sensor network. The main point of real-time data redundancy detection is that the state of a specific node at time t can be inferred by its dependent node at time t - 1. So, first of all we should build the real-time dependencies network for the sensor nodes.”  Here, Xie discloses a first portion of the received data as prediction data (“dependent node at time t – 1”)).
identify a second portion, different than the first portion, of the received data as response data (Xie, Page 12 Section 5.2 Para 2, recited above, discloses a second portion of the received data as response data (“the state of a specific node at time t”)).
generate inferred response data based in part on a data processing model and the prediction data, the inferred response data to represent an inference of the received data identified as response data (Xie, as shown above, discloses prediction data (“dependent node at time t – 1”) and response data (the actual value of “the state of a specific node at time t”).  Xie also discloses inferred response data, since “the state of a specific node at time t”, rather than being directly measured from that node, can actually be “inferred” as shown in Page 12 Section 5.2 Para 2:  “state of a specific node at time t can be inferred by its dependent node at time t – 1”.  This is based on prediction data (“dependent node at time t – 1”) as well as on a data processing model as disclosed in Page 12 Section 5.2 Para 1:  “According to the characteristic of DBN, in this section we will post a method to build a DBN structure for a working sensor network”).  Here, Xie discloses a data processing model (“DBN”, Dynamic Bayesian Model).  The Bayesian Model is used to make the inference, as shown at the bottom of Page 9:  “If each state of a sensor node is considered to be one category, then the problem of inferring the state of current node from the state of its parents can be seen as a classification problem given the state of parent nodes. So after the establishment of the Bayesian network, the state inference can use Naive Bayes classifier to solve the problem. We use the Naive Bayes classifier to infer the state of a node at a specific time, and the state of its parent nodes can be regarded as one feature for state inference.”)
Xie suggests, but does not explicitly teach store either the prediction data or the received data to a memory storage location based in part on a comparison between the inferred response data, the response data, and an error threshold (Xie, Pages 17-18 Section 6.3 “The Result of Sensor Data Redundancy Elimination”, discloses:  “In a gathered static dataset, if a specific node is detected as redundant node, it denotes the data collected by this node are redundant and we can get these data by its parent nodes. Based on this mechanism, we can weigh the performance of our algorithm by the accurately recovering the redundant data. The root-mean-square-error (RMSE) between real and predict values of redundant data is regarded as metrics.”  Here, Xie discloses comparison between the response data (“real values”) and inferred response data (“predict values”).  Xie, Page 20, first full paragraph, also discloses:  “In order to validate the accuracy of the predicted state, we recover the redundant data. And Fig. 15(d) shows the mean RMSE of real and estimated data of all redundant data. From Fig. 15(d) we can learn that the RSDRDA is good at real-time redundancy detection.” Here, Xie discloses that by “validate the accuracy”, they determine that the algorithm is “good” at redundancy detection.  Thus, Xie suggests comparing the error to an error threshold, as some threshold is implied by determining that the results have “good” accuracy.  Xie, Page 4 Section 1.3 Para 2, discloses:  “And two sensor data redundancy elimination approaches based on SBN and DBNs are proposed, respectively, i.e., static sensor data redundancy detection algorithm (SSDRDA) for eliminating redundant data in static data sets, and real-time sensor data redundancy detection algorithm (RSDRDA) for eliminating redundant sensor data in real-time.”  Here, Xie discloses “eliminating” redundant data, which suggests that Xie’s algorithm determines whether or not this data will be stored.)
Erden teaches an apparatus, comprising: a processor; and a memory storing instructions (Erden, Para [0052], discloses:  “Any of the embodiments described above may be implemented using this set of hardware (CPU 1302, RAM 1304, storage 1306, network 1308) or a subset thereof.”)
And explicitly teaches and store either the prediction data or the received data to a memory storage location based in part on a comparison between the inferred response data, the response data, and an error threshold (Erden, Para [0044], discloses:  “The processing in block 1206 involves a tuning phase which is performed initially and occasionally repeated. Each block 1200 identifies background in its own sensor's video streams and the blocks 1200 (and/or centralized node) also receives background from other sensors 1201, 1202 via other blocks 1200. The background data may be reduced in size by sending a subset of DWT components. For sensors 1201, 1202 that provide overlapping data, landmarks (or other identifiable static features) are defined for use in comparison of abnormalities (e.g., moving objects). Landmarks may be defined using machine learning, e.g., extracting features and finding common patterns from different sensors 1201, 1202.”  Here, Erden discloses using machine learning to determine response data and inferred response data (“overlapping data”).  Erden discloses that the “common patterns” may be defined using “machine learning”, which one of ordinary skill in the art will appreciate comprises minimizing an error threshold in making a prediction.  This is to determine redundancies of data provided by video sensors, as Erden, Para [0046], discloses:  “The spatial DWT representations of sensors 1201, 1202 are stored after eliminating the redundancies. This reduces an amount of data stored to only the non-redundant DWT representations, and may further be reduced by storing only a subset of the extracted spatial DWT components, e.g., just the “L” components for some nodes 1200. The extracted data sent outside of the storage nodes 1200 (e.g., to a centralized compute function) may also be reduced by including a subset of DWT components and only non-redundant information.”  Here, Erden discloses store either the prediction data or the received data to a memory storage location, as Erden only stores the necessary non-redundant data.)
Xie and Erden are analogous art because they are both in the field of endeavor of applying machine learning to sensor data, and eliminating redundant sensor data.
It would have been obvious before the effective filing date of the claimed invention to combine Xie’s redundancy elimination of sensor data and Erden’s storage of only non-redundant sensor data in memory.  One of ordinary skill in the art would be motivated to do so in order to save resources (Erden, [0050]:  “The ability to detect redundancies can reduce local storage at each node 1200”).
However, the combination of Xie and Erden does not explicitly teach the data processing model to comprise a deep neural network (DNN).
Khelifi teaches the data processing model to comprise a deep neural network (DNN). (Khelifi, Page 53 Section III, discloses: “We leverage DL for IoT applications on top of EC, aiming to improve the learning performance, and enhance the network traffic and user experience. Usually, DL models allow content to be stored in a real time manner, reduce the computation time, and optimize complex data processing at the EC. The DL is a machine learning method that uses Deep Neural Network (DNN), which has multi-layers (input layer, hidden layers, and output layer). Each layer is composed of several neurons.”)
Khelifi and the combination of Xie and Erden are analogous art because they are both in the field of endeavor of applying machine learning to the Internet-of-Things.
It would have been obvious before the effective filing date of the claimed invention to combine the DNN of Khelifi with the IoT network of the combination of Xie and Erden.  One of ordinary skill in the art would be motivated to do so in order to gain improved machine learning performance over a large data set (Khelifi, Page 52 Bottom Right: “Furthermore, Deep Learning (DL) [6] shows an outstanding performance in computer science fields including natural language processing, bioinformatics domain, and vision recognition. It can perform a strong analysis over a huge volume of data.”)
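For illustration only, the mechanism quoted from Xie in the rejection of Claim 1 above (inferring the state of a node at time t from its dependent node at time t - 1, and measuring the RMSE between real and predicted values) can be sketched as follows.  This is a minimal sketch under stated assumptions and is not Xie's algorithm:  a simple frequency table is substituted for Xie's DBN and Naive Bayes classifier, and all function names and state labels are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def learn_transitions(parent_states, node_states):
    """Count how often each node state at time t follows each parent
    state at time t - 1 (a frequency-table stand-in, not a DBN)."""
    counts = defaultdict(Counter)
    for t in range(1, len(node_states)):
        counts[parent_states[t - 1]][node_states[t]] += 1
    return counts


def infer_state(counts, parent_state_prev):
    """Infer the node state at time t as the state most often observed
    after the given parent state at time t - 1. Raises IndexError if
    that parent state never appeared in the training sequence."""
    return counts[parent_state_prev].most_common(1)[0][0]


def rmse(real, predicted):
    """Root-mean-square error between real and predicted values."""
    squared = sum((r - p) ** 2 for r, p in zip(real, predicted))
    return sqrt(squared / len(real))
```

In this sketch, the inferred state plays the role of the inferred response data, and the RMSE value is the quantity that would be compared against an error threshold.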

As per Claim 2, the combination of Xie, Erden, and Khelifi teaches the apparatus of claim 1 as well as memory and a processor (See Rejection to Claim 1). Xie teaches the memory storing instructions, which when executed by the processor cause the processor to 
execute the data processing model with the prediction data as input to generate the inferred response data (Xie, Bottom of Page 19, discloses:  “Fig. 13 shows the result of predicting the data of redundant node by its parent nodes”)

As per Claim 3, the combination of Xie, Erden, and Khelifi teaches the apparatus of claim 1 as well as memory and a processor (See Rejection to Claim 1).  Xie teaches each of the plurality of data provider devices comprising at least one sensor, the received data comprising indications of signals received from the at least one sensor of the plurality of data provider devices (Xie, Page 11 Section 5 “Redundancy Elimination of Big Sensor Data in IoT”, discloses:  “In a sensor network, there are many factors which cause data redundancy. For example, where the gap among each node is close, the type of collecting data is similar. Redundant data not only waste the storage space but also exert harmful influence on data feature extraction. In this part, we mainly focus on the methods of redundancy elimination directly from the perspective of gathered sensor data. Two methods are proposed for static and dynamic sensor data redundancy elimination separatively.”  Here, Xie discloses receiving data (“collecting data”) from a plurality of data provider devices (“sensor network”)).
the memory storing instructions, which when executed by the processor cause the processor to: 
identify the first portion of the received data based in part on the at least one sensor of the plurality of data provider devices associated with the first portion of the received data (Xie, Page 12 Section 5.2 Para 2, discloses “The varying dependencies of each node in DBN reflects the real-time characteristic of a sensor network. The main point of real-time data redundancy detection is that the state of a specific node at time t can be inferred by its dependent node at time t - 1. So, first of all we should build the real-time dependencies network for the sensor nodes.”  Here, Xie discloses a first portion of the received data as prediction data (“dependent node at time t – 1”)).
and identify the second portion of the received data based in part on the at least one sensor of the plurality of data provider devices associated with the second portion of the received data, wherein the at least one sensor of the plurality of data provider devices associated with the first portion of the received data are different from the at least one sensor of the plurality of data provider devices associated with the second portion of the received data  (Xie, Page 12 Section 5.2 Para 2, recited above, discloses a second portion of the received data as response data (“the state of a specific node at time t”).  Xie discloses that the sensors between these two sets of data are different, as Xie discloses “specific node” and “dependent node”, and thus discloses two separate nodes/sensors).

As per Claim 4, the combination of Xie, Erden, and Khelifi teaches the apparatus of claim 1 as well as memory and a processor (See Rejection to Claim 1).  Xie teaches the memory storing instructions, which when executed by the processor cause the processor to 
train the data processing model based in part on the received data, to generate a further trained data processing model (Xie, Page 12 Section 5.2 End of Para 2, discloses:  “Thus, in the second part of the time slice, we use the structure which is trained in former part to predict the working state of each node. Base on this mechanism, the sensor network is in a circle of collecting data, learning transition network, and working state inference.”  Xie also discloses training on Page 14 just before Section 6:  “From Eq. (21) we can learn that the probability of current node in a specific state is the sum of the prior probability of the parent nodes in all states. We can get the prior probability of the states of parent nodes in previous time through training data sets.”)

As per Claim 8, the combination of Xie, Erden, and Khelifi teaches the apparatus of claim 1 as well as memory and a processor (See Rejection to Claim 1).  Erden teaches the memory storing instructions, which when executed by the processor cause the processor to 
send an information element comprising indications of either the prediction data or the received data to a cloud computing device or an edge computing device, wherein the cloud computing device or the edge computing device is to store the prediction data or the received data to the memory storage location (Erden, Para [0018], discloses sending some or all data to both edge and cloud devices:  “Eventually, some or all the data generated by the client devices 102 and edge devices 104 might be stored in a cloud service 108, which generally refers to one or more remotely-located data centers. The cloud service 108 may also be able to provide computation services that are offered together with the storage, as well as features such as security, data backup, etc. However, bandwidth-heavy computation and associated data flows can be done more efficiently within the edges 104 themselves. There can also be peer-to-peer data flow among edges 104. Because of the benefits offered by “edge” architectures, edge focused applications are increasing, and started to cover wide variety of applications.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Xie and Erden for at least the reasons recited in Claim 1.

As per Claim 9, the combination of Xie, Erden, and Khelifi teaches the apparatus of claim 1 as well as memory and a processor (See Rejection to Claim 1).  Erden teaches the memory storing instructions, which when executed by the processor cause the processor to: 
retrieve the prediction data from the memory storage location (Erden, as discussed in the rejection of Claim 1, discloses storing non-redundant data in memory in Para [0046]:  “The spatial DWT representations of sensors 1201, 1202 are stored after eliminating the redundancies. This reduces an amount of data stored to only the non-redundant DWT representations, and may further be reduced by storing only a subset of the extracted spatial DWT components, e.g., just the “L” components for some nodes 1200. The extracted data sent outside of the storage nodes 1200 (e.g., to a centralized compute function) may also be reduced by including a subset of DWT components and only non-redundant information.”  One of ordinary skill in the art will appreciate that in order to be used in any computations, the data must be retrieved from memory.  Erden discloses this in [0048], where they describe the data going from storage to a centralized compute function:   “The spatial DWT representations of sensors 1201, 1202 are stored after eliminating the redundancies. This reduces an amount of data stored to only the non-redundant DWT representations, and may further be reduced by storing only a subset of the extracted spatial DWT components, e.g., just the “L” components for some nodes 1200. The extracted data sent outside of the storage nodes 1200 (e.g., to a centralized compute function) may also be reduced by including a subset of DWT components and only non-redundant information.”  Here, Erden discloses “extracted data sent outside of storage”).
However, Erden does not teach generate the inferred response data based in part on the prediction data and the data processing model to retrieve the response data.
Xie teaches generate the inferred response data based in part on the prediction data and the data processing model to retrieve the response data (Xie, Bottom of Page 19, discloses:  “Fig. 13 shows the result of predicting the data of redundant node by its parent nodes”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Xie and Erden for at least the reasons recited in Claim 1.

As per Claims 10-13 and 17-18, these are non-transitory computer-readable medium claims corresponding to apparatus Claims 1-4 and 8-9, respectively.  The difference is that they recite a non-transitory computer-readable medium. Erden Para [0063] discloses: “Such instructions may be stored on a non-transitory computer-readable medium and transferred to the processor for execution as is known in the art.”  Claims 10-13 and 17-18 are rejected for the same reasons as Claims 1-4 and 8-9.

As per Claims 19-21 and 24-25, these are system claims corresponding to apparatus Claims 1-3 and 8-9, respectively.  The difference is that these claims recite an interface; a processor; and a memory. Xie, Page 1 Bottom, discloses:  “Massive sensor data are gathered by distributed equipments. There are plenty of sensor data generated everyday. In order to analyze and process the data, all of these data should be stored within a certain period.”  Here, Xie discloses an interface (“sensor”) that collects data, comprising a processor (“process the data”) and memory (“data should be stored”). Xie Fig. 1 shows the architecture, as recited in Page 2 Section 1.1: “As shown in Fig. 1, the structure of Internet of things mainly consists of four layers.” Claims 19-21 and 24-25 are rejected for the same reasons as Claims 1-3 and 8-9.

Claims 5, 14, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Xie in view of Erden and Khelifi, further in view of Gupta Hyde et al. (US 2019/0050683 A1; hereinafter “Gupta Hyde”).
As per Claim 5, the combination of Xie, Erden, and Khelifi teaches the apparatus of claim 4 as well as memory and a processor and the further trained data processing model (See Rejection to Claim 1).  Xie teaches the memory storing instructions, which when executed by the processor cause the processor to: 
receive additional data from the plurality of data provider devices (Xie, Page 11 Section 5 “Redundancy Elimination of Big Sensor Data in IoT”, discloses:  “In a sensor network, there are many factors which cause data redundancy. For example, where the gap among each node is close, the type of collecting data is similar. Redundant data not only waste the storage space but also exert harmful influence on data feature extraction. In this part, we mainly focus on the methods of redundancy elimination directly from the perspective of gathered sensor data. Two methods are proposed for static and dynamic sensor data redundancy elimination separatively.”  Here, Xie discloses receiving data (“collecting data”) from a plurality of data provider devices (“sensor network”).  Xie also discloses an iterative process, and thus discloses receiving “additional” data).
identify a first portion of the received additional data as additional prediction data (Xie, Page 12 Section 5.2 Para 2, discloses “The varying dependencies of each node in DBN reflects the real-time characteristic of a sensor network. The main point of real-time data redundancy detection is that the state of a specific node at time t can be inferred by its dependent node at time t - 1. So, first of all we should build the real-time dependencies network for the sensor nodes.”  Here, Xie discloses a first portion of the received data as prediction data (“dependent node at time t – 1”)).
identify a second portion, different than the first portion, of the received additional data as additional response data (Xie, Page 12 Section 5.2 Para 2, recited above, discloses a second portion of the received data as response data (“the state of a specific node at time t”)).
generate additional inferred response data based in part on the updated version of the data processing model and the additional prediction data (Xie, Page 12 Section 5.2 Para 2, recited above, discloses that “state of a specific node at time t can be inferred by its dependent node at time t – 1”.  Xie, Page 12 Section 5.2 Para 1, discloses:  “According to the characteristic of DBN, in this section we will post a method to build a DBN structure for a working sensor network.”  Here, Xie discloses that the inferred response data is generated based on a data processing model (“DBN”, a dynamic Bayesian network)).
comparison between the inferred additional response data, the additional response data, and an error threshold (Xie, Pages 17-18 Section 6.3 “The Result of Sensor Data Redundancy Elimination”, discloses:  “In a gathered static dataset, if a specific node is detected as redundant node, it denotes the data collected by this node are redundant and we can get these data by its parent nodes. Based on this mechanism, we can weigh the performance of our algorithm by the accurately recovering the redundant data. The root-mean-square-error (RMSE) between real and predict values of redundant data is regarded as metrics.”  Here, Xie discloses a comparison between the response data (“real values”) and the inferred response data (“predict values”).  Xie, Page 20, first full paragraph, also discloses:  “In order to validate the accuracy of the predicted state, we recover the redundant data. And Fig. 15(d) shows the mean RMSE of real and estimated data of all redundant data. From Fig. 15(d) we can learn that the RSDRDA is good at real-time redundancy detection.”  Here, by “validate the accuracy”, Xie determines that the algorithm is “good” at redundancy detection.  Thus, Xie suggests comparing the error to an error threshold, as some threshold is implied by determining that the results have “good” accuracy.)
However, Xie suggests, but does not explicitly teach, store either the additional prediction data or the received additional data to a memory storage location.  (Xie, Page 4 Section 1.3 Para 2, discloses:  “And two sensor data redundancy elimination approaches based on SBN and DBNs are proposed, respectively, i.e., static sensor data redundancy detection algorithm (SSDRDA) for eliminating redundant data in static data sets, and real-time sensor data redundancy detection algorithm (RSDRDA) for eliminating redundant sensor data in real-time.”  Here, Xie discloses “eliminating” redundant data, which implies that Xie’s algorithm determines whether or not this data will be stored.  However, Xie does not explicitly disclose anything about storage.)
Erden teaches and store either the additional prediction data or the additional received data to a memory storage location based in part on a comparison between the inferred additional response data, the additional response data (Erden, Para [0044], discloses:  “The processing in block 1206 involves a tuning phase which is performed initially and occasionally repeated. Each block 1200 identifies background in its own sensor's video streams and the blocks 1200 (and/or centralized node) also receives background from other sensors 1201, 1202 via other blocks 1200. The background data may be reduced in size by sending a subset of DWT components. For sensors 1201, 1202 that provide overlapping data, landmarks (or other identifiable static features) are defined for use in comparison of abnormalities (e.g., moving objects). Landmarks may be defined using machine learning, e.g., extracting features and finding common patterns from different sensors 1201, 1202.”  Here, Erden discloses using machine learning to determine response data and inferred response data (“overlapping data”).  This is to determine redundancies of data provided by video sensors, as Erden, Para [0046], discloses:  “The spatial DWT representations of sensors 1201, 1202 are stored after eliminating the redundancies. This reduces an amount of data stored to only the non-redundant DWT representations, and may further be reduced by storing only a subset of the extracted spatial DWT components, e.g., just the “L” components for some nodes 1200. The extracted data sent outside of the storage nodes 1200 (e.g., to a centralized compute function) may also be reduced by including a subset of DWT components and only non-redundant information.”  Here, Erden discloses store either the prediction data or the received data to a memory storage location, as Erden only stores the necessary non-redundant data.)
Thus, the combination of Xie, Erden, and Khelifi suggests store either the additional prediction data or the additional received data to a memory storage location based in part on a comparison between the inferred additional response data, the additional response data, and an error threshold, as Xie and Erden both disclose response data and inferred response data, and Erden discloses storing in memory only the necessary non-redundant data, while this is suggested by Xie.  
Also, Xie, as shown above, discloses a comparison between the inferred additional response data, the additional response data, and an error threshold.  However, Xie’s comparison is performed after the pruning of redundant sensor data in order to evaluate the accuracy of the algorithm (the actual pruning is done by a probabilistic method, DBN).  Thus, Xie does not explicitly teach store either the additional prediction data or the additional received data to a memory storage location based in part on a comparison between the inferred additional response data, the additional response data, and an error threshold.  While one of ordinary skill in the art will appreciate that it would be trivial to incorporate Xie’s measure of accuracy in an additional decision-making step (e.g., in view of Xie, Page 19, Fig. 15(d), one could decide never to set any “Microphone” sensors to “sleeping” (Page 20)), Examiner draws attention to Erden for a more concrete mapping.  Erden, Para [0046], discloses:  “Machine learning and/or secondary sensors may also be used to calculate rotation and scaling between sensor signals. This can be used to convert the overlapping data between the sensors 1201, 1202 into a common reference frame, from which abnormalities can be compared”.  Here, it is clear that Erden is performing some comparison between the data of two sensors.  Thus, to determine whether the data between the video sensors is redundant, there must be some threshold of difference/similarity between the sensor signals, below which they are considered to comprise redundant data.  Another reference is combined for the more specific language regarding this limitation in Claim 6.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Xie and Erden for at least the reasons recited in Claim 1.
However, the combination of Xie, Erden, and Khelifi does not teach update a version of the data processing model based on the further trained data processing model; store the updated version of the data processing model to a model database; add metadata to the received additional data including an indication of the updated version of the data processing model.
Gupta Hyde teaches update a version of the data processing model based on the further trained data processing model (Gupta Hyde, Para [0058], discloses: “In examples disclosed herein, the model trainer 230 of the example edge device 130 instructs the model processor 235 to train using the local data and the response. (Block 430). During training, the example model trainer 230 updates the model stored in the model data store 210 to reduce an amount of error generated by the example model processor 235 when using the local data to attempt to correctly output the desired response. As a result of the training, a model update is created and is stored in the model data store 210. In examples disclosed herein, the model update can be computed with any sort of model learning algorithm such as, for example, Stochastic Gradient Descent.”  Here, Gupta Hyde discloses based on the further trained data processing model (“During training, the example model trainer 230 updates the model”), update a version of the data processing model (“a model update is created”)).
store the updated version of the data processing model to a model database (Gupta Hyde, Para [0058] as shown above, discloses “As a result of the training, a model update is created and is stored in the model data store 210.”)
add metadata to the received additional data including an indication of the updated version of the data processing model (Gupta Hyde, Para [0058], discloses:  “The example model trainer 230 then stores metadata in the example model data store 210 in association with the updated model to include identifications of local data (and/or information associated with the local data such as, for example, information about a user associated with the local data). The example metadata enables a later determination by the permissions enforcer 250 of whether the updated model should be shared. In some examples, reverse engineering attacks might be used to decipher input data from the resultant model. By storing metadata in association with the updated model, the example permissions enforcer can reduce the risk of reverse engineering attacks by preventing sharing of machine learning models that were trained on data that would otherwise not be shared.”  Gupta Hyde, Para [0061], also discloses:  “The example permissions enforcer 250 accesses metadata associated with the accessed item. (Block 520). In examples disclosed herein, in the context of metadata associated with local data, the metadata may represent, for example, a time at which the data was collected, a user and/or properties of a user (e.g., age, sex, etc.) identified in association with the collected data, a type of the local data (e.g., image data, audio data, text input, etc.), or any other property of the local data. 
In the context of metadata associated with locally created machine learning model(s), the metadata may represent, for example, a time when the machine learning model was created, information about a prior version of the machine learning model (e.g., a source of the prior version of the machine learning model), information about the local data used to train the machine learning model, etc.” In these paragraphs, Gupta Hyde discloses metadata associated with data that is indicative of the updated version of the data processing model, such as what data was used to train the current version, what user trained the current version, information about the previous version, etc.)
Gupta Hyde and the combination of Xie, Erden, and Khelifi are analogous art because they are both in the field of endeavor of machine learning.
It would have been obvious before the effective filing date of the claimed invention to combine Gupta Hyde’s model store with the model to identify sensor redundancy of Xie, Erden, and Khelifi.  One of ordinary skill in the art would be motivated to do so in order to be able to choose the best model from a plurality of models that may be best suited to a particular prediction scenario or environment (Gupta Hyde [0013]: “In examples disclosed herein, local training is utilized to train a model. Such local training does not require user data to be automatically provided to the cloud service provider. Moreover, such an approach advantageously trains the machine learning model to better understand the local user, as the model is trained based on local user data. Therefore, if the user has a dialect, the model will be trained based on that dialect.”)
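For illustration only, the three limitations mapped to Gupta Hyde above (update a version of the model, store the updated version to a model database, and add metadata indicating that version to received data) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from any cited reference: the names `ModelStore`, `save_update`, and `tag_with_model_version`, the integer version scheme, and the use of a plain list as "weights" are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ModelStore:
    """Hypothetical in-memory model database keyed by an integer version."""
    versions: dict = field(default_factory=dict)
    latest: int = 0

    def save_update(self, weights):
        # Record the further trained model as a new version and return its number.
        self.latest += 1
        self.versions[self.latest] = list(weights)
        return self.latest


def tag_with_model_version(records, version):
    # Attach metadata to each received record indicating the model version in effect.
    return [{"data": r, "metadata": {"model_version": version}} for r in records]


store = ModelStore()
v1 = store.save_update([0.10, 0.20])   # initial model (illustrative values)
v2 = store.save_update([0.12, 0.18])   # further trained model (illustrative values)
tagged = tag_with_model_version([21.5, 21.7], v2)
```

In this sketch the metadata travels with the data records rather than with the model; whether a given reference attaches metadata to the data, to the model, or to both is a question of claim mapping that the sketch does not resolve.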

As per Claim 14, this is a non-transitory computer-readable medium claim corresponding to apparatus Claim 5.  The difference is that it recites a non-transitory computer-readable medium. Erden Para [0063] discloses: “Such instructions may be stored on a non-transitory computer-readable medium and transferred to the processor for execution as is known in the art.”  Claim 14 is rejected for the same reasons as Claim 5.

As per Claim 22, this is a system claim corresponding to apparatus Claim 5.  The difference is that this claim recites an interface; a processor; and a memory. Xie, Page 1 Bottom, discloses:  “Massive sensor data are gathered by distributed equipments. There are plenty of sensor data generated everyday. In order to analyze and process the data, all of these data should be stored within a certain period.”  Here, Xie discloses an interface (“sensor”) that collects data, comprising a processor (“process the data”) and memory (“data should be stored”). Xie Fig. 1 shows the architecture, as recited in Page 2 Section 1.1: “As shown in Fig. 1, the structure of Internet of things mainly consists of four layers.” Claim 22 is rejected for the same reasons as Claim 5.

Claims 6-7, 15-16, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Xie in view of Erden and Khelifi, further in view of Xu et al. (US 2019/0362235 A1; hereinafter “Xu”).
As per Claim 6, the combination of Xie, Erden, and Khelifi teaches the apparatus of claim 1 as well as memory and a processor (See Rejection to Claim 1).  Xie teaches the memory storing instructions, which when executed by the processor cause the processor to: 
determine a difference between the response data and the inferred response data (Xie, Pages 17-18 Section 6.3 “The Result of Sensor Data Redundancy Elimination”, discloses:  “In a gathered static dataset, if a specific node is detected as redundant node, it denotes the data collected by this node are redundant and we can get these data by its parent nodes. Based on this mechanism, we can weigh the performance of our algorithm by the accurately recovering the redundant data. The root-mean-square-error (RMSE) between real and predict values of redundant data is regarded as metrics.”  Here, Xie discloses comparison between the response data (“real values”) and inferred response data (“predict values”)).
determine whether the difference is less than, or less than or equal to the error threshold (Xie, Page 20, first full paragraph, also discloses:  “In order to validate the accuracy of the predicted state, we recover the redundant data. And Fig. 15(d) shows the mean RMSE of real and estimated data of all redundant data. From Fig. 15(d) we can learn that the RSDRDA is good at real-time redundancy detection.”  Here, by “validate the accuracy”, Xie determines that the algorithm is “good” at redundancy detection.  Thus, Xie suggests comparing the error to an error threshold, as some threshold is implied by determining that the results have “good” accuracy.)
However, Xie does not teach store the prediction data to the memory storage location.
Erden teaches store the prediction data to the memory storage location (Erden, Para [0046], discloses:  “The spatial DWT representations of sensors 1201, 1202 are stored after eliminating the redundancies. This reduces an amount of data stored to only the non-redundant DWT representations, and may further be reduced by storing only a subset of the extracted spatial DWT components, e.g., just the “L” components for some nodes 1200. The extracted data sent outside of the storage nodes 1200 (e.g., to a centralized compute function) may also be reduced by including a subset of DWT components and only non-redundant information.”  Here, Erden discloses store either the prediction data or the received data to a memory storage location, as Erden only stores the necessary non-redundant data.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Xie and Erden for at least the reasons recited in Claim 1.
However, the combination of Xie, Erden, and Khelifi does not explicitly teach store the prediction data to the memory storage location based on a determination that the difference is less than, or less than or equal to the error threshold.
Xu teaches store the prediction data to the memory storage location based on a determination that the difference is less than, or less than or equal to the error threshold.  (Xu, Para [0046], discloses:  “The pruned version of the neural network may then be caused to be implemented on a computing platform and tested 535 against a set of test input data to determine what affect this initial pruning of the particular layer has on the overall accuracy of the neural network model. If the pruned version of the neural network has an accuracy that is within an acceptable range or above an acceptable threshold set for the pruning (at 510), then the pruning steps for the particular layer are repeated 545 to attempt to further prune channels from the particular layer. If, however, the initial prune results in the accuracy falling below the threshold, the initial percentage, in some cases, may be decreased and the pruning steps repeated based on this lower percentage. In other cases, if the accuracy falls below the threshold after the initial prune, it may be determined that the layer should not be pruned. In either instance, following the sensitivity test of the neural network with the initially pruned version of the layer (e.g., by performing a forward-propagation of the modified network) the resulting accuracy or accuracy change may be recorded, along with data describing the pruned version of the particular layer used during the test.”  Here, Xu discloses determining whether or not certain nodes should be stored in memory (or, conversely, pruned), based on results being lower than an error threshold (“accuracy that is within an acceptable range or above an acceptable threshold”).  If the node is pruned, then the “prediction data” (previous nodes) are saved to memory, but “received data” (pruned node) is not.).
Xu and the combination of Xie, Erden, and Khelifi are analogous art because they are both in the field of endeavor of machine learning.
The combination of Xie, Erden, and Khelifi teaches pruning sensor data from a sensor network in an Internet of Things, comparing response data and inferred response data, and storing only non-redundant data in a memory location.  Xu discloses pruning a neural network, wherein the nodes of a neural network are analogous to the sensors in the IoT network, in that they each represent pieces of information to be used to make an inference.  Xu faces a similar issue to Xie, Erden, and Khelifi, as stated in Xu [0039]:  “As introduced above, neural network pruning may refer to the removal of some redundant weights (or channels) which are determined or predicted to not contribute meaningfully to the output of a network”.  The goal is the same, to remove redundant data that may be inferred by other nodes/sensors.  Xu, like Xie, discloses checking the accuracy of the result of the pruning, but unlike Xie, uses this as a decision point to continue with the pruning.  Thus the combination of Xie, Erden, and Xu results in the claimed limitation.  It would have been obvious before the effective filing date to combine Xu with the combination of Xie and Erden.  One of ordinary skill in the art would be motivated to do so in order to decrease resource usage, while still maintaining accuracy (Xu [0039]:  “Pruning a neural network model reduces the model size and thereby helps preventing over-fitting, and eventually generates a sparse (or thinner) version of the model. Weight pruning, for instance, shows high compression rate on some neural networks by pruning redundant weights or additionally allowing splicing of previously pruned weights. Channel-pruning prunes entire channels of the model (i.e., as opposed to the more surgical pruning of individual weights). However, naively pruning an amount of channels based on a calculation of importance of channels, may result in drastic reduction in the accuracy of systems employing the model in machine learning applications. 
For instance, while channel pruning may cause channels determined to be relatively less important to be removed, and thereby finetune the pruned network, individual network models have different sensitivity within and across layers to output accuracy.”)
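For illustration only, the accuracy-gated pruning loop described in the quoted Xu [0046] (prune, test against a threshold, repeat while accuracy holds, stop otherwise, and record the resulting accuracy) can be sketched as follows. This is a simplified sketch under stated assumptions, not Xu's implementation: the function name `prune_layer`, the `(name, importance)` channel representation, and the toy `evaluate` function are hypothetical, and the option in Xu of retrying with a lower pruning percentage is omitted.

```python
def prune_layer(channels, evaluate, threshold, step=1):
    """Drop the least important channels while accuracy stays at or above threshold.

    channels: list of (name, importance) pairs (hypothetical representation).
    evaluate: callable returning an accuracy for a candidate channel list.
    Returns the kept channels and a history of (channels_kept, accuracy) records.
    """
    kept = sorted(channels, key=lambda c: c[1], reverse=True)
    history = []
    while len(kept) > step:
        candidate = kept[:-step]           # tentatively remove the weakest channels
        accuracy = evaluate(candidate)     # sensitivity test on the pruned version
        history.append((len(candidate), accuracy))
        if accuracy < threshold:
            break                          # accuracy fell below threshold: keep prior version
        kept = candidate                   # accuracy acceptable: accept and try pruning further
    return kept, history


# Toy accuracy model (an assumption): accuracy equals the summed importance retained.
layer = [("a", 0.5), ("b", 0.3), ("c", 0.15), ("d", 0.05)]
toy_evaluate = lambda chans: sum(importance for _, importance in chans)
kept, history = prune_layer(layer, toy_evaluate, threshold=0.9)
kept_names = [name for name, _ in kept]
```

With these toy values, removing channel "d" keeps the accuracy above the threshold, while additionally removing "c" does not, so the loop stops with channels "a", "b", and "c" retained.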

As per Claim 7, the combination of Xie, Erden, Khelifi, and Xu teaches the apparatus of claim 6 as well as memory and a processor (See Rejection to Claim 1).  
Xu teaches store the received data to the memory storage location based on a determination that the difference is not less than, or not less than or equal to the error threshold (Xu, Para [0046], discloses:  “The pruned version of the neural network may then be caused to be implemented on a computing platform and tested 535 against a set of test input data to determine what affect this initial pruning of the particular layer has on the overall accuracy of the neural network model. If the pruned version of the neural network has an accuracy that is within an acceptable range or above an acceptable threshold set for the pruning (at 510), then the pruning steps for the particular layer are repeated 545 to attempt to further prune channels from the particular layer. If, however, the initial prune results in the accuracy falling below the threshold, the initial percentage, in some cases, may be decreased and the pruning steps repeated based on this lower percentage. In other cases, if the accuracy falls below the threshold after the initial prune, it may be determined that the layer should not be pruned. In either instance, following the sensitivity test of the neural network with the initially pruned version of the layer (e.g., by performing a forward-propagation of the modified network) the resulting accuracy or accuracy change may be recorded, along with data describing the pruned version of the particular layer used during the test.”  Here, Xu discloses not pruning a layer if the accuracy is too low, meaning the error is too high:  “If the accuracy falls below the threshold after the initial prune, it may be determined that the layer should not be pruned”.  Thus, if the layer is not pruned, the node is “received data” and is stored in memory.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Xu with the combination of Xie, Erden, and Khelifi for at least the reasons recited in Claim 6.
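For illustration only, the decision recited in Claims 6 and 7 as mapped above (determine a difference between the response data and the inferred response data, compare it to an error threshold, and store either the prediction data or the received data accordingly) can be sketched as follows. This is a minimal sketch under stated assumptions: RMSE is used as the difference measure because Xie reports RMSE, but the claims do not require it, and the names `rmse` and `select_data_to_store` and the dictionary layout are hypothetical.

```python
import math


def rmse(actual, inferred):
    # Root-mean-square error between real and inferred values.
    if not actual or len(actual) != len(inferred):
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(actual, inferred)) / len(actual))


def select_data_to_store(prediction_data, response_data, inferred_response, error_threshold):
    """Return what would be written to the memory storage location."""
    difference = rmse(response_data, inferred_response)
    if difference <= error_threshold:
        # The response data is recoverable from the prediction data: store only the latter.
        return {"stored": "prediction_data", "payload": prediction_data}
    # Otherwise keep the received data in full (prediction and response portions).
    received = {"prediction": prediction_data, "response": response_data}
    return {"stored": "received_data", "payload": received}


# Illustrative values only.
close = select_data_to_store([19.8, 20.9], [20.0, 21.0], [20.1, 20.9], error_threshold=0.5)
far = select_data_to_store([19.8, 20.9], [20.0, 21.0], [22.0, 23.0], error_threshold=0.5)
```

The sketch uses the "less than or equal to" alternative of the claim language; replacing `<=` with `<` gives the "less than" alternative.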

As per Claims 15-16, these are non-transitory computer-readable medium claims corresponding to apparatus Claims 6-7, respectively.  The difference is that they recite a non-transitory computer-readable medium. Erden Para [0063] discloses: “Such instructions may be stored on a non-transitory computer-readable medium and transferred to the processor for execution as is known in the art.”  Claims 15-16 are rejected for the same reasons as Claims 6-7.

As per Claim 23, this is a system claim corresponding to apparatus Claim 6.  The difference is that this claim recites an interface; a processor; and a memory. Xie, Page 1 Bottom, discloses:  “Massive sensor data are gathered by distributed equipments. There are plenty of sensor data generated everyday. In order to analyze and process the data, all of these data should be stored within a certain period.”  Here, Xie discloses an interface (“sensor”) that collects data, comprising a processor (“process the data”) and memory (“data should be stored”). Xie Fig. 1 shows the architecture, as recited in Page 2 Section 1.1: “As shown in Fig. 1, the structure of Internet of things mainly consists of four layers.”  Claim 23 is rejected for the same reasons as Claim 6.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Ojha et al. (“DVSP: Dynamic Virtual Sensor Provisioning in Sensor–Cloud-Based Internet of Things”) discloses a system for identifying which nodes to activate in order to avoid wasting energy on redundant nodes.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEONARD A SIEGER whose telephone number is (571)272-9710. The examiner can normally be reached M-F 8:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ann Lo can be reached on (571) 272-9767. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/L.A.S./Examiner, Art Unit 2126           
/ANN J LO/Supervisory Patent Examiner, Art Unit 2126