DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 6-8, 14-16, 19-21, 27-28 are rejected under 35 U.S.C. 103 as being unpatentable over Miao et al (US 20150242760 A1) in view of McMahan et al (US 20170109322 A1).
Regarding claims 1, 7, 8, Miao et al discloses a local device (computing devices 102 can include 102a-102e) to train deep learning models (machine learning module 114 or machine learning module 202; an initial training set for the generic machine learning model, which will include an initial (global) classification threshold value; normalizing the first feature distribution with respect to a training distribution; paragraph 0023-0024), the local device (computing devices 102) comprising: a reference generator (machine learning model can generate a personalized machine learning model; machine learning model generates a personalized machine learning model, which can be transmitted (via input/output interface 106) to a server; subsequently, computing device 102 may receive a global machine learning model from the server; paragraph 0031) to label input data received at the local device to generate training data (training module 204: train a generic machine learning model; a feature output of a global machine learning model on a server can be updated; the global machine learning model generates a normalized output that can be aggregated with the de-identified data received from the client devices; the global machine learning model may be based on the personalized machine learning model transmitted to the server and an aggregation, performed by the server, of a plurality of other personalized machine learning models transmitted from a plurality of other client devices to the server; paragraph 0024, 0031, 0068); a trainer to train (training module 204; machine learning model 202 can receive training data from offline training module 204; training data can include data from a population, such as a population of users operating client devices or applications executed by a processor of client devices; training distribution; paragraph 0041-0042) a local deep learning model and to transmit (machine learning can also include transmitting the personalized machine learning model to a server that updates the consensus machine learning model) the local deep learning model to a server (the server may receive, from a first client device, a first feature distribution generated by a first machine learning model hosted by the first client device, and receive, from a second client device, a second feature distribution generated by a second machine learning model hosted by the second client device; paragraph 0025) that is to receive a plurality of local deep learning models from a plurality of local devices (a global classification threshold value can initially be set during training, which is based on a plurality of users; for instance, an initial global classification threshold value may be set to a value determined by a priori training of a generic machine learning model; furthermore, the server then provides to the first client device a normalized first feature distribution resulting from normalizing the first feature distribution with respect to the second feature distribution; paragraph 0074-0075), the server (the server may modify a consensus (or global) machine learning model based, at least in part, on the aggregated and normalized personalized machine learning models transmitted to the server from the multiple client devices; paragraph 0058, 0062) for a global deep learning model (the measurements can indicate how often the users smile; measurements can be performed for each user every 60 seconds for 3 hours; these measurements can be used as an initial training set for the generic machine learning model, which will include an initial (global) classification threshold value; paragraph 0074-0075); and an updater to update (modifying global machine learning model 520 generates an updated global machine learning model; in addition, the updated global machine learning model is transmitted by server 504 to client device 502; paragraph 0058, 0062) the local deep learning model received from the server (modifying consensus machine learning model 510 generates a global machine learning model that is transmitted by server 504 and received by client device 502; in addition, server 504 may transmit the global machine learning model to at least a portion of the plurality of client devices 506; note that the first feature distribution is based on information collected locally by the first client device; paragraph 0059-0060, 0065, 0085).
However, Miao et al does not specifically teach that the server determines a set of weights for a global deep learning model, and updates the local deep learning model based on the set of weights received from the server.
On the other hand, McMahan et al, from the same field of endeavor, discloses that a server (server 304: server 304 can receive each local update from user device 302, and can aggregate the local updates to determine a global update to the model; paragraph 0034) determines a set of weights (the server 304 can determine a weighted average of the local updates and determine the global update based at least in part on the average; paragraph 0034-0035) for a global deep learning model (global machine learning models 306: training one or more global machine learning models 306 using training data 308 stored locally on a plurality of user devices 302; furthermore, the server 304 can be configured to access machine learning model 306, and to provide model 306 to a plurality of user devices 302; paragraph 0030-0031), and updates (user devices 302 can each be configured to determine one or more local updates associated with model 306 based at least in part on training data 308; note that the local updates can be a gradient vector associated with the model; for instance, user devices 302 can determine a gradient (an average gradient) associated with the model based at least in part on training data 308 respectively stored on user devices 302; paragraph 0032) the local deep learning model (the server can aggregate the gradients to determine a global model update) based on the set of weights received from the server (server 304 can receive local update from user device 302; note that the server can aggregate the data by determining a weighted average; for instance, the user devices may determine an updated version of the model (using one or more stochastic gradient descent techniques) using local data; the server can then determine a weighted average of the resulting models to determine a global update to the model; paragraph 0029, 0041, 0044).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to apply the technique of McMahan to the system of Miao in order to provide a method for training processing engines for updating a global model.
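For illustration only, the server-side aggregation relied upon above (a weighted average of locally trained models that yields a global update, which each device then adopts) can be sketched as follows. The function names, the representation of model parameters as flat lists of floats, and the weighting of each device by its local example count are assumptions made for this sketch; they are not taken from Miao, McMahan, or the application.

```python
from typing import List, Sequence


def weighted_average(local_weights: Sequence[Sequence[float]],
                     num_examples: Sequence[int]) -> List[float]:
    """Server side: combine per-device parameter vectors into one global vector.

    Each device's vector is weighted by its local example count
    (an assumption of this sketch, not a requirement of the references).
    """
    if not local_weights or len(local_weights) != len(num_examples):
        raise ValueError("need exactly one example count per local model")
    total = sum(num_examples)
    if total <= 0:
        raise ValueError("total example count must be positive")
    dim = len(local_weights[0])
    global_weights = [0.0] * dim
    for weights, n in zip(local_weights, num_examples):
        if len(weights) != dim:
            raise ValueError("all local models must have the same shape")
        for i, w in enumerate(weights):
            global_weights[i] += w * n / total
    return global_weights


def adopt_global_weights(global_weights: Sequence[float]) -> List[float]:
    """Device side: replace the local parameters with the received global set."""
    return list(global_weights)


if __name__ == "__main__":
    # Two hypothetical devices holding 1 and 3 local examples respectively.
    merged = weighted_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
    print(adopt_global_weights(merged))
```

In this sketch only model parameters travel between the devices and the server; the local input data never leaves the device.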
Regarding claim 2, Miao et al as modified discloses a local device (computing devices 102 can include 102a-102e) to train deep learning models (machine learning module 114 or machine learning module 202; an initial training set for the generic machine learning model, which will include an initial (global) classification threshold value; normalizing the first feature distribution with respect to a training distribution; paragraph 0023-0024), further including a data receiver to receive the input data directly at the local device (machine learning model generates a personalized machine learning model, which can be transmitted (via input/output interface 106) to a server; subsequently, computing device 102 may receive a global machine learning model from the server ; paragraph 0031).
 Regarding claim 3, Miao et al as modified discloses a local device (computing devices 102 can include 102a-102e) to train deep learning models (machine learning module 114 or machine learning module 202; an initial training set for the generic machine learning model, which will include an initial (global) classification threshold value; normalizing the first feature distribution with respect to a training distribution; paragraph 0023-0024), wherein the input data does not pass through the server (paragraph 0032). 
Regarding claim 6, Miao et al as modified discloses a local device (computing devices 102 can include 102a-102e) to train deep learning models (machine learning module 114 or machine learning module 202; an initial training set for the generic machine learning model, which will include an initial (global) classification threshold value; normalizing the first feature distribution with respect to a training distribution; paragraph 0023-0024), wherein the local device does not transmit the input data to the server (machine learning model 202 can receive training data from offline training module 204; paragraph 0041-0042).
Regarding claims 14, 20, 21, Miao et al discloses a non-transitory computer readable medium comprising instructions that, when executed, cause a local device (computing devices 102 can include 102a-102e) to at least: label input data received at the local device (computing devices 102) to generate (machine learning model can generate a personalized machine learning model; machine learning model generates a personalized machine learning model, which can be transmitted (via input/output interface 106) to a server; subsequently, computing device 102 may receive a global machine learning model from the server; paragraph 0031) training data (training module 204: train a generic machine learning model; a feature output of a global machine learning model on a server can be updated; the global machine learning model generates a normalized output that can be aggregated with the de-identified data received from the client devices; the global machine learning model may be based on the personalized machine learning model transmitted to the server and an aggregation, performed by the server, of a plurality of other personalized machine learning models transmitted from a plurality of other client devices to the server; paragraph 0024, 0031, 0068); train (training module 204; machine learning model 202 can receive training data from offline training module 204; training data can include data from a population, such as a population of users operating client devices or applications executed by a processor of client devices; training distribution; paragraph 0041-0042) a local deep learning model; transmit (machine learning can also include transmitting the personalized machine learning model to a server that updates the consensus machine learning model) the local deep learning model to a server (the server may receive, from a first client device, a first feature distribution generated by a first machine learning model hosted by the first client device, and receive, from a second client device, a second feature distribution generated by a second machine learning model hosted by the second client device; paragraph 0025), the server to receive a plurality of local deep learning models from a plurality of local devices (a global classification threshold value can initially be set during training, which is based on a plurality of users; for instance, an initial global classification threshold value may be set to a value determined by a priori training of a generic machine learning model; furthermore, the server then provides to the first client device a normalized first feature distribution resulting from normalizing the first feature distribution with respect to the second feature distribution; paragraph 0074-0075), the server (the server may modify a consensus (or global) machine learning model based, at least in part, on the aggregated and normalized personalized machine learning models transmitted to the server from the multiple client devices; paragraph 0058, 0062) for a global deep learning model (the measurements can indicate how often the users smile; measurements can be performed for each user every 60 seconds for 3 hours; these measurements can be used as an initial training set for the generic machine learning model, which will include an initial (global) classification threshold value; paragraph 0074-0075); and update (modifying global machine learning model 520 generates an updated global machine learning model; in addition, the updated global machine learning model is transmitted by server 504 to client device 502; paragraph 0058, 0062) the local deep learning model received from the server (modifying consensus machine learning model 510 generates a global machine learning model that is transmitted by server 504 and received by client device 502; in addition, server 504 may transmit the global machine learning model to at least a portion of the plurality of client devices 506; note that the first feature distribution is based on information collected locally by the first client device; paragraph 0059-0060, 0065, 0085).
However, Miao et al does not specifically teach that the server determines a set of weights for a global deep learning model, and updates the local deep learning model based on the set of weights received from the server.
On the other hand, McMahan et al, from the same field of endeavor, discloses that a server (server 304: server 304 can receive each local update from user device 302, and can aggregate the local updates to determine a global update to the model; paragraph 0034) determines a set of weights (the server 304 can determine a weighted average of the local updates and determine the global update based at least in part on the average; paragraph 0034-0035) for a global deep learning model (global machine learning models 306: training one or more global machine learning models 306 using training data 308 stored locally on a plurality of user devices 302; furthermore, the server 304 can be configured to access machine learning model 306, and to provide model 306 to a plurality of user devices 302; paragraph 0030-0031), and updates (user devices 302 can each be configured to determine one or more local updates associated with model 306 based at least in part on training data 308; note that the local updates can be a gradient vector associated with the model; for instance, user devices 302 can determine a gradient (an average gradient) associated with the model based at least in part on training data 308 respectively stored on user devices 302; paragraph 0032) the local deep learning model (the server can aggregate the gradients to determine a global model update) based on the set of weights received from the server (server 304 can receive local update from user device 302; note that the server can aggregate the data by determining a weighted average; for instance, the user devices may determine an updated version of the model (using one or more stochastic gradient descent techniques) using local data; the server can then determine a weighted average of the resulting models to determine a global update to the model; paragraph 0029, 0041, 0044).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to apply the technique of McMahan to the system of Miao in order to provide a method for training processing engines for updating a global model.
Regarding claim 15, Miao et al as modified discloses a non-transitory computer readable medium comprising instructions that, when executed, cause a local device (computing devices 102 can include 102a-102e), wherein the input data is received directly at the local device (machine learning model generates a personalized machine learning model, which can be transmitted (via input/output interface 106) to a server; subsequently, computing device 102 may receive a global machine learning model from the server; paragraph 0031).
Regarding claim 16, Miao et al as modified discloses a non-transitory computer readable medium comprising instructions that, when executed, cause a local device (computing devices 102 can include 102a-102e), wherein the input data does not pass through the server (paragraph 0032).
Regarding claim 19, Miao et al as modified discloses a non-transitory computer readable medium comprising instructions that, when executed, cause a local device (computing devices 102 can include 102a-102e), wherein the input data is not transmitted to the server (machine learning model 202 can receive training data from offline training module 204; paragraph 0041-0042).
Regarding claim 27, Miao et al discloses a method (figs. 1-2, fig. 4) to train deep learning models (machine learning module 114 or machine learning module 202; an initial training set for the generic machine learning model, which will include an initial (global) classification threshold value; normalizing the first feature distribution with respect to a training distribution; paragraph 0023-0024), the method comprising: labelling, by executing an instruction with at least one processor at a local device (computing devices 102), input data received at the local device to generate (machine learning model can generate a personalized machine learning model; machine learning model generates a personalized machine learning model, which can be transmitted (via input/output interface 106) to a server; subsequently, computing device 102 may receive a global machine learning model from the server; paragraph 0031) training data (training module 204: train a generic machine learning model; a feature output of a global machine learning model on a server can be updated; the global machine learning model generates a normalized output that can be aggregated with the de-identified data received from the client devices; the global machine learning model may be based on the personalized machine learning model transmitted to the server and an aggregation, performed by the server, of a plurality of other personalized machine learning models transmitted from a plurality of other client devices to the server; paragraph 0024, 0031, 0068); training (training module 204; machine learning model 202 can receive training data from offline training module 204; training data can include data from a population, such as a population of users operating client devices or applications executed by a processor of client devices; training distribution; paragraph 0041-0042), by executing an instruction with the at least one processor, a local deep learning model (machine learning can also include transmitting the personalized machine learning model to a server that updates the consensus machine learning model); transmitting the local deep learning model to a server (the server may receive, from a first client device, a first feature distribution generated by a first machine learning model hosted by the first client device, and receive, from a second client device, a second feature distribution generated by a second machine learning model hosted by the second client device; paragraph 0025), the server to receive a plurality of local deep learning models from a plurality of local devices (a global classification threshold value can initially be set during training, which is based on a plurality of users; for instance, an initial global classification threshold value may be set to a value determined by a priori training of a generic machine learning model; furthermore, the server then provides to the first client device a normalized first feature distribution resulting from normalizing the first feature distribution with respect to the second feature distribution; paragraph 0074-0075), the server (the server may modify a consensus (or global) machine learning model based, at least in part, on the aggregated and normalized personalized machine learning models transmitted to the server from the multiple client devices; paragraph 0058, 0062) for a global deep learning model (the measurements can indicate how often the users smile; measurements can be performed for each user every 60 seconds for 3 hours; these measurements can be used as an initial training set for the generic machine learning model, which will include an initial (global) classification threshold value; paragraph 0074-0075); and updating (modifying global machine learning model 520 generates an updated global machine learning model; in addition, the updated global machine learning model is transmitted by server 504 to client device 502; paragraph 0058, 0062), by executing an instruction with the at least one processor at the local device, the local deep learning model received from the server (modifying consensus machine learning model 510 generates a global machine learning model that is transmitted by server 504 and received by client device 502; in addition, server 504 may transmit the global machine learning model to at least a portion of the plurality of client devices 506; note that the first feature distribution is based on information collected locally by the first client device; paragraph 0059-0060, 0065, 0085).
However, Miao et al does not specifically teach that the server determines a set of weights for a global deep learning model, and updates the local deep learning model based on the set of weights received from the server.
On the other hand, McMahan et al, from the same field of endeavor, discloses that a server (server 304: server 304 can receive each local update from user device 302, and can aggregate the local updates to determine a global update to the model; paragraph 0034) determines a set of weights (the server 304 can determine a weighted average of the local updates and determine the global update based at least in part on the average; paragraph 0034-0035) for a global deep learning model (global machine learning models 306: training one or more global machine learning models 306 using training data 308 stored locally on a plurality of user devices 302; furthermore, the server 304 can be configured to access machine learning model 306, and to provide model 306 to a plurality of user devices 302; paragraph 0030-0031), and updates (user devices 302 can each be configured to determine one or more local updates associated with model 306 based at least in part on training data 308; note that the local updates can be a gradient vector associated with the model; for instance, user devices 302 can determine a gradient (an average gradient) associated with the model based at least in part on training data 308 respectively stored on user devices 302; paragraph 0032) the local deep learning model (the server can aggregate the gradients to determine a global model update) based on the set of weights received from the server (server 304 can receive local update from user device 302; note that the server can aggregate the data by determining a weighted average; for instance, the user devices may determine an updated version of the model (using one or more stochastic gradient descent techniques) using local data; the server can then determine a weighted average of the resulting models to determine a global update to the model; paragraph 0029, 0041, 0044).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to apply the technique of McMahan to the system of Miao in order to provide a method for training processing engines for updating a global model.
Regarding claim 28, Miao et al as modified discloses a method (figs. 1-2, fig. 4) to train deep learning models (machine learning module 114 or machine learning module 202; an initial training set for the generic machine learning model, which will include an initial (global) classification threshold value; normalizing the first feature distribution with respect to a training distribution; paragraph 0023-0024), wherein the input data is received directly at the local device (machine learning model generates a personalized machine learning model, which can be transmitted (via input/output interface 106) to a server; subsequently, computing device 102 may receive a global machine learning model from the server; paragraph 0031).

Claims 4, 5, 9-11, 13, 17-18, 22, 23 are rejected under 35 U.S.C. 103 as being unpatentable over Miao et al (US 20150242760 A1) in view of McMahan et al (US 20170109322 A1) as applied to the claims above, and further in view of Williamson et al (US 20180232528 A1).
Regarding claims 4, 5, 9-11, 13, 17-18, 22, 23, Miao and McMahan disclose everything claimed as explained above except the features of sampling the input data by selecting a pseudo-random portion of the input data, wherein the sample controller is to sample the input data by down sampling the input data to reduce a data size of the input data, wherein the trainer is further to determine a difference between a label determined by the labelling and an output of the local deep learning model; wherein the local device is communicatively coupled to a sensor that collects the input data and transmits the input data to the local device.
However, Williamson et al discloses the features of sampling the input data (the confidence value is determined by adding all the determinations of sensitive data made by the components of the data classifier 108 together as weighted by the respective significance factors; furthermore, the data pre-processor 106 may randomly sample data from the input data sources 102A-N and place this randomly sampled data into the common data structure instead of placing the entire set of data from the input data sources 102A-N into the common data structure; paragraph 0029) by selecting a pseudo-random portion of the input data (the deep learning classifier 212 determines whether a data portion is sensitive using a machine learning model; the deep learning classifier 212 may employ one or more machine learning models; in addition, for each of the subsections of the input data sources 102A-N which have separate metadata labels, the data pre-processor 106 may randomly sample data from within that subsection; paragraph 0063, 0070-0071), wherein the sample controller is to sample the input data by down sampling the input data to reduce a data size of the input data (the data classification reporting module 114 also presents an input data sources view 504 indicating the number of input data sources that have been processed or sampled; paragraph 0092), wherein the trainer is further to determine a difference between a label determined by the labelling and an output of the local deep learning model (when computing the confidence value, the confidence value calculator 218 also considers the quality of the data portion for which the confidence value is to be calculated, as well as the uniqueness of that data portion; the determination that data is sensitive may be a binary outcome, as may be the case when analyzing metadata, comparing to reference or patterns, using logical classifiers, and analyzing context; paragraph 0029-0030, 0080, 0085); wherein the local device is communicatively coupled to a sensor (a motion sensor) that collects the input data and transmits the input data to the local device (the data classification reporting module 114 may also present an overall security posture of the organization based on the collected information; paragraph 0044, 0114). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to apply the technique of Williamson to the modified system of McMahan and Miao in order to provide a deep learning classifier that may employ one or more machine learning models to classify sensitive data.
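For illustration only, the sampling features addressed above (selecting a pseudo-random portion of the input data so that the amount of data is reduced) can be sketched as follows. The function name, the `fraction` parameter, and the fixed default seed are hypothetical choices made for this sketch; they do not come from Williamson or from the application.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_pseudo_random(data: Sequence[T], fraction: float, seed: int = 0) -> List[T]:
    """Return a pseudo-random subset holding roughly `fraction` of the items.

    A seeded generator makes the selection pseudo-random and repeatable;
    keeping only part of the input reduces its size (down sampling).
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if not data:
        return []
    rng = random.Random(seed)
    # Keep at least one item so a small input is never reduced to nothing.
    k = max(1, int(len(data) * fraction))
    return rng.sample(list(data), k)


if __name__ == "__main__":
    readings = list(range(100))          # stand-in for sensor input data
    subset = sample_pseudo_random(readings, 0.25)
    print(len(subset))                   # 25 items out of 100
```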
Allowable Subject Matter
Claim 12 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARCEAU MILORD whose telephone number is (571)272-7853. The examiner can normally be reached 10-6.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CHARLES APPIAH, can be reached on 571-272-7904. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

MARCEAU MILORD
Examiner
Art Unit 2641



/MARCEAU MILORD/Primary Examiner, Art Unit 2641