DETAILED ACTION
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
2.	This action is responsive to the following communication:  Original claims filed 06/17/20.  This action is made non-final.
3.	Claims 16-30 are pending in the case.  Claims 16, 29 and 30 are independent claims.

Claim Rejections - 35 USC § 102
4.	The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

5.	Claims 16-19 and 21-30 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Miao (US 20150242760).
Regarding claim 16, Miao discloses an apparatus comprising:
at least one processor (FIG. 1, processor); and
at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to perform (FIG. 1, memory, processor):
receiving a plurality of data sets representing sensed data from one or more devices (personalizing machine learning may be performed locally at a computing device, and may include interaction with a server on a network shared with a plurality of other computing devices. A personalized machine learning approach may use a distributed asynchronous optimization algorithm to deliver personalized machine learning models that fit at least fairly well with substantially all computing devices on a shared network, paragraph 0015);
determining, using one or more local learned models, local parameters based on the received data sets (machine learning model 202, in part as a result of offline training module 204, can be configured for a relatively large population of users. For example, machine learning model 202 can include a number of classification threshold values that are set based on average characteristics of the population of users of offline training module 204. Client devices 206A-C can modify machine learning model 202, however, subsequent to machine learning model 202 being loaded onto client devices 206A-C. In this way, customized/personalized machine learning can occur on individual client devices 206A-C. The modified machine learning model is designated as machine learning 208A-C. In some implementations, for example, machine learning 208A comprises a portion of an operating system of client device 206A. Modifying machine learning on a client device is a form of local training of a machine learning model. Such training can utilize personal information already present on the client device, as explained below. Moreover, users of client devices can be confident that their personal information remains private while the client devices remain in their possession);
generating a combined data set by combining the plurality of data sets (a machine learning model includes classifiers that make decisions based, at least in part, on comparing a value with a threshold value, paragraph 0071 and aggregating multiple feature distributions is also a technique for combining sampling data from multiple users of client devices on a server, paragraph 0076);
determining, using the one or more local learned models, global parameters based on the combined data set (a client device may modify a global classification threshold value t of a consensus machine learning model based, at least in part, on the information collected locally by the client device by performing an operation (e.g., minimization problem) defined by equation 2, introduced above, paragraph 0073);
transmitting, to a remote system, the global parameters for determining updated global parameters using one or more global learned models based at least partially on the global parameters (in some embodiments, an iterative process that continuously improves upon a machine learning model includes communication between the server and the individual computing devices. For example, data gathered locally by each of the computing devices can be used to personalize machine learning models on each of the respective computing devices. The personalized machine learning models of each of the respective computing devices can be transmitted from each of the computing devices to a server, which can subsequently aggregate this plurality of personalized machine learning models. A process of aggregation can be performed by the server using any of a number of techniques, such as normalization, some of which are described below, paragraph 0016);
receiving, from the remote system, the updated global parameters (subsequent to such aggregation, the server can update a global machine learning model based, at least in part, on the plurality of personalized machine learning models from the respective computing devices. The server can then transmit the updated global machine learning model to each of the respective computing devices, each of which can subsequently aggregate this updated global machine learning model with the personalized machine learning model already on the respective computing device. A process of aggregation can be performed by each of the computing devices using any of a number of techniques, some of which are described below, paragraph 0017); and
updating the one or more local learned models using both the local parameters and updated global parameters (subsequent to such aggregation, each of the computing devices can update their respective personalized machine learning model based, at least in part, on the global machine learning model received from the server. Moreover, data gathered locally by each of the computing devices can be used to further personalize the updated machine learning models on each of the respective computing devices. The updated personalized machine learning models of each of the respective computing devices can then be transmitted from each of the computing devices to the server. This process of updating and communicating (e.g., transmitting and receiving) between a plurality of computing devices and the server repeats and, in doing so, iteratively improves upon the global machine learning model maintained by the server and each of the personalized machine learning models of the respective computing devices, paragraph 0018).
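For illustration only, the sequence of limitations mapped above for claim 16 can be sketched as a single round of local and global parameter exchange. This is a minimal sketch under stated assumptions and is not taken from Miao or from the application: each "learned model" is reduced to one scalar parameter (a mean), the remote system is stood in for by a plain averaging function, and all names (local_parameters, global_parameters, remote_update, update_local_models, the device labels, and the blending weight) are hypothetical.

```python
from statistics import mean


def local_parameters(data_sets):
    """Local parameters: one value per device, here the mean of its readings."""
    return {device: mean(values) for device, values in data_sets.items()}


def global_parameters(data_sets):
    """Global parameter computed on the combined (pooled) data set."""
    combined = [v for values in data_sets.values() for v in values]
    return mean(combined)


def remote_update(submitted):
    """Hypothetical stand-in for the remote system: averages what it receives."""
    return mean(submitted)


def update_local_models(local, updated_global, weight=0.5):
    """Blend each local parameter with the updated global parameter."""
    return {
        device: weight * param + (1 - weight) * updated_global
        for device, param in local.items()
    }


# Sensed data received from two hypothetical devices.
data_sets = {"device_a": [1.0, 3.0], "device_b": [5.0, 7.0]}

local = local_parameters(data_sets)        # per-device parameters
global_p = global_parameters(data_sets)    # parameter from the combined data set
# The remote system also holds a parameter (6.0) from another, hypothetical apparatus.
updated = remote_update([global_p, 6.0])
models = update_local_models(local, updated)
print(models)
```

The blending weight is an arbitrary choice made for the sketch; the cited paragraphs describe aggregation only in general terms ("any of a number of techniques").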
Regarding claim 17, Miao discloses wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform:
transmitting, to the one or more devices, the updated one or more local learned models (the server can then transmit the updated global machine learning model to each of the respective computing devices, each of which can subsequently aggregate this updated global machine learning model with the personalized machine learning model already on the respective computing device. A process of aggregation can be performed by each of the computing devices using any of a number of techniques, some of which are described below, paragraph 0017).
Regarding claim 18, Miao discloses wherein the one or more devices are members of a predetermined group of devices (in some implementations, a user of a client device has to "opt-in" or take other affirmative action before personalized machine learning can occur, paragraph 0014; the devices belong to a group of devices that is predetermined).
Regarding claim 19, Miao discloses wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform: 
identifying from which device each of the plurality of data sets is received from such that the generated local parameters are particular to each of the identified devices (characteristics of machine learning 208A-C change in accordance with particular users of client devices 206A-C. For example, machine learning 208A hosted by client device 206A and operated by a particular user can be different from machine learning 208B hosted by client device 206B and operated by another particular user. Behaviors and/or personal information of a user of a client device are considered for modifying various parameters of machine learning hosted by the client device. Behaviors of the user or personal information collected over a predetermined time can be considered. For example, machine learning 208A can be modified based, at least in part, on historical use patterns, behaviors, and/or personal information of a user of client device 206A over a period of time, such as hours, days, months, and so on. Accordingly, modification of machine learning 208A can continue with time, and become more personal to the particular user of client device 208A. A number of benefits result from machine learning 208A becoming more personal to the particular user. Among such benefits, precision of output of machine learning 208A increases, efficiency (e.g., speed) of operation of machine learning 208A increases, and memory footprint of machine learning 208A decreases, just to name a few example benefits. Additionally or alternatively, users may be allowed to opt out of the use of personal/private information to personalize the machine learning, paragraph 0043).
Regarding claim 21, Miao discloses wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform:
normalizing the plurality of data sets and/or using the plurality of data sets to capture high-order correlations (distributions of features of a number of users of client devices can be aggregated by a process of normalizing distributions of the individual users based on information collected locally by the individual client devices. Such a process, which can be performed by a server and/or by the individual client devices, can lead to an aggregated distribution that can be resolved. Such a resolved aggregated distribution can have a clearly definable (e.g. non-ambiguous) classification boundary, which can be incorporated into an updated (e.g., further personalized) machine learning model, paragraph 0022).
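For illustration only, the normalization described in the cited paragraph can be sketched as standardizing each user's feature samples before pooling them. This is a minimal sketch under stated assumptions and is not Miao's disclosed technique: z-score standardization is swapped in as one plausible normalization, and the function names, user labels, and sample values are hypothetical.

```python
from statistics import mean, pstdev


def normalize(samples):
    """Standardize one user's samples to zero mean and unit spread."""
    mu, sigma = mean(samples), pstdev(samples)
    if sigma == 0:
        return [0.0 for _ in samples]
    return [(x - mu) / sigma for x in samples]


def aggregate(per_user):
    """Pool the normalized samples of every user into one distribution."""
    return [z for samples in per_user.values() for z in normalize(samples)]


# Two hypothetical users whose raw measurements sit on different scales.
per_user = {
    "user_1": [10.0, 12.0, 20.0, 22.0],
    "user_2": [30.0, 32.0, 40.0, 42.0],
}
pooled = aggregate(per_user)
print(pooled)
```

In the raw data a single boundary cannot separate the low and high measurements of both users at once; after per-user normalization, both users' samples fall on either side of zero, so one boundary serves the pooled distribution.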
Regarding claim 22, Miao discloses wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform:
combining the local parameters and the updated global parameters; and updating the one or more local learned models using the combined parameters (subsequent to such aggregation, each of the computing devices can update their respective personalized machine learning model based, at least in part, on the global machine learning model received from the server. Moreover, data gathered locally by each of the computing devices can be used to further personalize the updated machine learning models on each of the respective computing devices. The updated personalized machine learning models of each of the respective computing devices can then be transmitted from each of the computing devices to the server. This process of updating and communicating (e.g., transmitting and receiving) between a plurality of computing devices and the server repeats and, in doing so, iteratively improves upon the global machine learning model maintained by the server and each of the personalized machine learning models of the respective computing devices, paragraph 0018).
Regarding claim 23, Miao discloses wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform:
combining the plurality of data sets with one or more historical data sets (in some implementations, individual real-time actions of a user of a client device need not influence personalized machine learning, while long-term behaviors of the user show patterns that can be used to personalize machine learning. For example, the feature output of the machine learning model can be responsive to a pattern of behavior of a user of the client device over at least a predetermined time, such as hours, days, months, and so on, paragraph 0027).
Regarding claim 24, Miao discloses wherein the plurality of data sets corresponds to different users of a single device (A process of updating a global machine learning model on the server can include normalization, which can alleviate such problems that arise from aggregating feature distributions of multiple users of client devices, as described below, paragraph 0084).
Regarding claim 25, Miao discloses wherein the plurality of data sets correspond to a single user of different respective devices (client devices 206A-C can include computing devices that receive, store, and operate on data that a user of the computing device, paragraph 0044).
Regarding claim 26, Miao discloses wherein the data sets represent sensed health or activity data (a generic machine learning model and a feature of a user, a smiling classifier can be used to determine whether a user is smiling or not. This can be useful to determine whether the user is happy or sad, for example. To build a generic (e.g., global) machine learning model, measurements of mouth sizes can be collected for a population of users (e.g., 100, 500, or 1000 or more people). Measurements can be taken from captured images of the users as the users play a video game, watch a television program, or the like. The measurements can indicate how often the users smile. Measurements can be performed for each user every 60 seconds for 3 hours, for example. These measurements can be used as an initial training set for the generic machine learning model, which will include an initial (e.g., global) classification threshold value, paragraph 0074).
Regarding claim 27, Miao discloses wherein the one or more devices are portable health monitoring devices (wearable computers, see paragraph 0014) and wherein the data sets are received wirelessly from one or more health monitoring devices (may be performed in whole or in part by a server or other computing device in a network (e.g., the Internet or the cloud), paragraph 0025).
Regarding claim 28, Miao discloses wherein the apparatus is a hub device and wherein the data sets are received from the one or more devices in a local network (a server or other computing device in a network (e.g., the Internet or the cloud). For example, a server can update and improve a global machine learning model by normalizing and aligning feature distributions of multiple client devices. The server may, for example, receive, from a first client device, a first feature distribution generated by a first machine learning model hosted by the first client device, and receive, from a second client device, a second feature distribution generated by a second machine learning model hosted by the second client device, paragraph 0025).
Regarding claim 29, Miao discloses a method comprising:
receiving a plurality of data sets representing sensed data from one or more devices (personalizing machine learning may be performed locally at a computing device, and may include interaction with a server on a network shared with a plurality of other computing devices. A personalized machine learning approach may use a distributed asynchronous optimization algorithm to deliver personalized machine learning models that fit at least fairly well with substantially all computing devices on a shared network, paragraph 0015); 
determining, using one or more local learned models, local parameters based on the received data sets (machine learning model 202, in part as a result of offline training module 204, can be configured for a relatively large population of users. For example, machine learning model 202 can include a number of classification threshold values that are set based on average characteristics of the population of users of offline training module 204. Client devices 206A-C can modify machine learning model 202, however, subsequent to machine learning model 202 being loaded onto client devices 206A-C. In this way, customized/personalized machine learning can occur on individual client devices 206A-C. The modified machine learning model is designated as machine learning 208A-C. In some implementations, for example, machine learning 208A comprises a portion of an operating system of client device 206A. Modifying machine learning on a client device is a form of local training of a machine learning model. Such training can utilize personal information already present on the client device, as explained below. Moreover, users of client devices can be confident that their personal information remains private while the client devices remain in their possession);
generating a combined data set by combining the plurality of data sets (a machine learning model includes classifiers that make decisions based, at least in part, on comparing a value with a threshold value, paragraph 0071 and aggregating multiple feature distributions is also a technique for combining sampling data from multiple users of client devices on a server, paragraph 0076);
determining, using one or more local learned models, global parameters based on the combined data set (a client device may modify a global classification threshold value t of a consensus machine learning model based, at least in part, on the information collected locally by the client device by performing an operation (e.g., minimization problem) defined by equation 2, introduced above, paragraph 0073);
transmitting, to a remote system, the global parameters for determining updated global parameters using one or more global learned models based at least partially on the global parameters (in some embodiments, an iterative process that continuously improves upon a machine learning model includes communication between the server and the individual computing devices. For example, data gathered locally by each of the computing devices can be used to personalize machine learning models on each of the respective computing devices. The personalized machine learning models of each of the respective computing devices can be transmitted from each of the computing devices to a server, which can subsequently aggregate this plurality of personalized machine learning models. A process of aggregation can be performed by the server using any of a number of techniques, such as normalization, some of which are described below, paragraph 0016);
receiving, from the remote system, the updated global parameters (subsequent to such aggregation, the server can update a global machine learning model based, at least in part, on the plurality of personalized machine learning models from the respective computing devices. The server can then transmit the updated global machine learning model to each of the respective computing devices, each of which can subsequently aggregate this updated global machine learning model with the personalized machine learning model already on the respective computing device. A process of aggregation can be performed by each of the computing devices using any of a number of techniques, some of which are described below, paragraph 0017); and
updating the one or more local learned models using both the local parameters and updated global parameters (subsequent to such aggregation, each of the computing devices can update their respective personalized machine learning model based, at least in part, on the global machine learning model received from the server. Moreover, data gathered locally by each of the computing devices can be used to further personalize the updated machine learning models on each of the respective computing devices. The updated personalized machine learning models of each of the respective computing devices can then be transmitted from each of the computing devices to the server. This process of updating and communicating (e.g., transmitting and receiving) between a plurality of computing devices and the server repeats and, in doing so, iteratively improves upon the global machine learning model maintained by the server and each of the personalized machine learning models of the respective computing devices, paragraph 0018).
Regarding claim 30, Miao discloses a non-transitory computer readable medium comprising program instructions stored thereon for performing at least the following:
receiving a plurality of data sets representing sensed data from one or more devices (personalizing machine learning may be performed locally at a computing device, and may include interaction with a server on a network shared with a plurality of other computing devices. A personalized machine learning approach may use a distributed asynchronous optimization algorithm to deliver personalized machine learning models that fit at least fairly well with substantially all computing devices on a shared network, paragraph 0015);
determining, using one or more local learned models, local parameters based on the received data sets (machine learning model 202, in part as a result of offline training module 204, can be configured for a relatively large population of users. For example, machine learning model 202 can include a number of classification threshold values that are set based on average characteristics of the population of users of offline training module 204. Client devices 206A-C can modify machine learning model 202, however, subsequent to machine learning model 202 being loaded onto client devices 206A-C. In this way, customized/personalized machine learning can occur on individual client devices 206A-C. The modified machine learning model is designated as machine learning 208A-C. In some implementations, for example, machine learning 208A comprises a portion of an operating system of client device 206A. Modifying machine learning on a client device is a form of local training of a machine learning model. Such training can utilize personal information already present on the client device, as explained below. Moreover, users of client devices can be confident that their personal information remains private while the client devices remain in their possession);
generating a combined data set by combining the plurality of data sets (a machine learning model includes classifiers that make decisions based, at least in part, on comparing a value with a threshold value, paragraph 0071 and aggregating multiple feature distributions is also a technique for combining sampling data from multiple users of client devices on a server, paragraph 0076);
determining, using one or more local learned models, global parameters based on the combined data set (a client device may modify a global classification threshold value t of a consensus machine learning model based, at least in part, on the information collected locally by the client device by performing an operation (e.g., minimization problem) defined by equation 2, introduced above, paragraph 0073);
transmitting, to a remote system, the global parameters for determining updated global parameters using one or more global learned models based at least partially on the global parameters (in some embodiments, an iterative process that continuously improves upon a machine learning model includes communication between the server and the individual computing devices. For example, data gathered locally by each of the computing devices can be used to personalize machine learning models on each of the respective computing devices. The personalized machine learning models of each of the respective computing devices can be transmitted from each of the computing devices to a server, which can subsequently aggregate this plurality of personalized machine learning models. A process of aggregation can be performed by the server using any of a number of techniques, such as normalization, some of which are described below, paragraph 0016);
receiving, from the remote system, the updated global parameters (subsequent to such aggregation, the server can update a global machine learning model based, at least in part, on the plurality of personalized machine learning models from the respective computing devices. The server can then transmit the updated global machine learning model to each of the respective computing devices, each of which can subsequently aggregate this updated global machine learning model with the personalized machine learning model already on the respective computing device. A process of aggregation can be performed by each of the computing devices using any of a number of techniques, some of which are described below, paragraph 0017); and
updating the one or more local learned models using both the local parameters and updated global parameters (subsequent to such aggregation, each of the computing devices can update their respective personalized machine learning model based, at least in part, on the global machine learning model received from the server. Moreover, data gathered locally by each of the computing devices can be used to further personalize the updated machine learning models on each of the respective computing devices. The updated personalized machine learning models of each of the respective computing devices can then be transmitted from each of the computing devices to the server. This process of updating and communicating (e.g., transmitting and receiving) between a plurality of computing devices and the server repeats and, in doing so, iteratively improves upon the global machine learning model maintained by the server and each of the personalized machine learning models of the respective computing devices, paragraph 0018).

Claim Rejections - 35 USC § 103
6.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

7.	Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Miao in view of Ramage (US 20160063393).
Regarding claim 20, Miao does not disclose wherein at least one of the received data sets comprises at least one missing data field, and wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform: inferring and inserting the at least one missing data field using the one or more local learned models.
However, Ramage discloses the following: FIG. 1B is a block diagram of the example system 100 that can use the residual data to train a local model for the user 102. In general, a model (e.g., a global model or a local model) may refer to any representational model f(x) that is invertible, where given the representation output by f(x), a maximum likelihood pseudo-observation f^{-1}(f(x)) may be found. For example, a local model may be trained to predict the vector valued f^{-1}(f(x)) − x. In some implementations, a model may refer to a representational model f(x) that is not invertible, but an inference function g may be applied to the representation output by f(x) to estimate a pseudo-observation g(f(x)). For example, a global model or a local model may be a neural network with multiple hidden layers such that it may be challenging to invert the output, but an inference function may be applied to the output to estimate the input observation (paragraph 0039).
The combination of Miao and Ramage would have resulted in the learning model of Miao utilizing Ramage’s teachings of using inference data to predict new values.  One would have been motivated to combine the teachings because Miao already uses learning models iteratively to improve user data, and as such, utilizing the well-known teachings of Ramage would have improved the learning models.  Therefore, the combination of references would have resulted in a predictable invention.
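For illustration only, the limitation of claim 20 (inferring and inserting a missing data field using a local learned model) can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Ramage's inference function or Miao's models: the local learned model is reduced to a one-variable least-squares line fitted on complete historical records, and the field names (steps, heart_rate), record values, and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fill_missing(records, slope, intercept):
    """Return copies of the records with a missing heart_rate inferred."""
    filled = []
    for record in records:
        record = dict(record)  # leave the caller's data untouched
        if record["heart_rate"] is None:
            record["heart_rate"] = slope * record["steps"] + intercept
        filled.append(record)
    return filled


# Complete historical records used to fit the hypothetical local model.
history = [
    {"steps": 0, "heart_rate": 60.0},
    {"steps": 100, "heart_rate": 80.0},
    {"steps": 200, "heart_rate": 100.0},
]
slope, intercept = fit_line(
    [r["steps"] for r in history], [r["heart_rate"] for r in history]
)

# A newly received data set with one missing data field.
incoming = [{"steps": 150, "heart_rate": None}]
filled = fill_missing(incoming, slope, intercept)
print(filled)
```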

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID E CHOI whose telephone number is (571)270-3780.  The examiner can normally be reached on M-F: 7-2, 7-10 (PST). If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sherief Badawi, can be reached at 571-272-9782.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DAVID E CHOI/Primary Examiner, Art Unit 2174