DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-4, 8-11, and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Shariat (US 20180115598) in view of Moloney (WO2018033890).
Regarding claim 1, Shariat teaches: a computer-implemented method for training machine-learned models, the method comprising: (Shariat: Abstract “The server trains machine-learned models using” A computer-implemented method for training machine-learned models is taught as the server training machine-learned models.); obtaining, by one or more computing devices, a plurality of regions based at least in part on temporal availability of user devices (Shariat: Paragraph [0019] “A client device 110 may interact with the travel coordination system 130 through a client application configured to interact with the travel coordination system 130. The client application of the client device 110 can present information received from the travel coordination system 130 on a user interface, such as a map of the geographic region, and the current location of the client device 110.” Obtaining, by one or more computing devices, a plurality of regions based at least in part on temporal availability of user devices is taught as a client device interacting with a travel coordination system on an interface with a map of the current location of the client device.); selecting, by the one or more computing devices, a plurality of available user devices within a region (Shariat: Paragraph [0013] “For clarity, although only the client device 110A and the client device 110B are shown in FIG. 1, embodiments of the system environment 100 can have any number of client devices 110, as well as multiple travel coordination systems 130 and model management systems 160.” Selecting, by the one or more computing devices, a plurality of available user devices within a region is taught as the two client devices, which communicate with a local area network in the system environment.
Estimating number of available providers in various geographical locations & a provider can indicate availability, via a client application on the client device); providing, by the one or more computing devices, a current version of a machine-learned model associated with the region to the plurality of selected user devices within the region (Shariat: Paragraph [0027] “For example, the travel coordination system 130 at a server farm in Asia may store the subset of the machine-learned models 170 associated with making predictions for geographic regions in Asia, and not receive machine-learned models 170 associated with making predictions for other geographic regions (e.g., North America or Western Europe)” Providing, by the one or more computing devices, a current version of a machine-learned model associated with the region to the plurality of selected user devices within the region is taught as a server farm in Asia storing the subset of machine learning models associated with making predictions for geographic regions in Asia.) … associated with the region (Shariat: Paragraph [0023] “For example, the travel coordination system 130 located at a server farm in California may process trip requests received within the West Coast of the United States.” Associated with the region is taught as the riders and providers associated with the server farm in California.)… obtaining, by the one or more computing devices from the plurality of selected user devices, updated machine-learned model data generated by the plurality of selected user devices through training of the current version of the machine-learned model associated with the region using data local to each of the plurality of selected user devices (Shariat: Paragraph [0030] “the model management system 160 receives travel information from the travel coordination system 130 and generates the machine-learned models based on conditions identified in the travel information. 
The model management system 160 may generate models for a variety of different aspects of services provided by the travel coordination system 130. In one embodiment, the model management system 160 generates models for predicting ETAs that providers will arrive at locations of riders. In other example, the model management system 160 generates models for predicting demands for trip requests at various times during the day and week, estimated number of available providers in various geographical locations, and the like. In addition, the model management system 160 provides different selected subsets of the generated machine-learned models to different server farms implementing the travel coordination system 130.” Obtaining, by the one or more computing devices from the plurality of selected user devices is taught as the machine-learned models predicting for trips at the rider devices. Updated machine-learned model data generated by the plurality of selected user devices through training of the current version of the machine-learned model associated with the region using data local to each of the plurality of selected user devices is taught by the system generates models for predicting demands for trip requests at various times during the day and week, estimated number of available providers in various geographical locations. The geographical location in this scenario being San Francisco. The devices are associated with the region using local data according to their locations.)… and generating, by the one or more computing devices, an updated machine-learned model associated with the region based on the updated machine-learned model data (Shariat: Paragraph [0033] “The model management system 160 generates machine-learned models associated with particular conditions represented in the hierarchy. The model management system 160 generates the models for a given node using travel information associated with the aspect of the travel information represented by that node. 
Thus, if a node represents a state in the United States such as California, the model management system 160 uses travel information associated with trips involving California to generate the models for that node.” Generating, by the one or more computing devices, an updated machine-learned model associated with the region based on the updated machine-learned model data is taught as the model management system generates machine-learned models associated with particular conditions represented in the hierarchy. The model management system generates the models for a given node using travel information associated with the aspect of the travel information represented by that node. Thus, if a node represents a state in the United States such as California, the model management system uses travel information associated with trips involving California to generate the models for that node (i.e. associated with the region).).
Shariat does not explicitly disclose…, the current version of the machine-learned model comprising a copy of a global machine-learned model…;…, the data local to each of the plurality of selected user devices comprising data that is locally-generated and locally-stored at each of the plurality of selected user devices, wherein the data local to each of the plurality of selected user devices is used with the current version of the machine-learned model locally at each of the plurality of selected user devices in generating the updated machine-learned model data;
	Moloney further teaches …, the current version of the machine-learned model comprising a copy of a global machine-learned model… (Moloney: Paragraph [0019] “it allows the host server to learn a deep learning model that takes into account all available information (e.g., training data) and provide the globally-learned deep learning model to the local devices” The current version of the machine-learned model comprising a copy of a global machine-learned model is taught as the globally learned deep learning model to the local devices.)
;…, the data local to each of the plurality of selected user devices comprising data that is locally-generated and locally-stored at each of the plurality of selected user devices (Moloney: Paragraph [0024] “The example host server 104 aggregates the local deep learning model data received from the plurality of local devices 102 and distributes the aggregated results back to the plurality of local devices 102.” The data local to each of the plurality of selected user devices comprising data that is locally-generated and locally-stored at each of the plurality of selected user devices is taught as the local deep learning model data received from the plurality of local devices (i.e. comprising data that is locally-generated and locally-stored at each of the plurality of selected user devices)), wherein the data local to each of the plurality of selected user devices is used with the current version of the machine-learned model locally at each of the plurality of selected user devices in generating the updated machine-learned model data (Moloney: Paragraph [0032] “The example local devices 102 transmit the weights and/or other details of the respective local deep learning models to the example host server 104. The example weight aggregator 110 of the example host server 104 aggregates the weights to develop a global set of weights. The example weight distributor 112 distributes the aggregated weights back to the local devices 102 to update the respective local deep learning models with the globally aggregated weights. 
For example, the local devices 102 may then utilize the globally updated respective local deep learning models to classify test data (e.g., data that has not been classified or for which classification is desired).” The data local to each of the plurality of selected user devices is used with the current version of the machine-learned model locally at each of the plurality of selected user devices in generating the updated machine-learned model data is taught as the local devices transmitting the weights and/or other details of the respective local deep learning models to the example host server in order to develop a global set of weights which are further sent back to the local deep learning models in order to utilize the globally updated respective local deep learning models.);
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the selective distribution of machine-learned models of Shariat with the global model of Moloney in order to utilize the distributive platform in which a host server learns based on all the local models thereby allowing the local devices to adopt the globally-learned deep learning model and also adjust it to take into account any local variations (Moloney: Paragraph [0019] “beneficial because (1) it allows the host server to learn a deep learning model that takes into account all available information (e.g., training data) and provide the globally-learned deep learning model to the local devices, and (2) it allows the local devices to adopt the globally -learned deep learning model and also adjust it to take into account any local variations.”).
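For illustration only, the aggregate-and-redistribute cycle quoted from Moloney Paragraph [0032] can be sketched as follows. This is a minimal sketch, not taken from either reference: the names `aggregate_weights` and `distribute` are hypothetical, and simple averaging is an assumed aggregation rule, since the quoted passage says only that the host server "aggregates the weights."

```python
from typing import Dict, List

# A model is represented here as named parameter vectors (an assumption).
Weights = Dict[str, List[float]]


def aggregate_weights(local_weights: List[Weights]) -> Weights:
    """Combine per-device weights into one global set.

    Assumed rule: element-wise mean across devices. The quoted passage
    does not specify how the aggregation is performed.
    """
    if not local_weights:
        raise ValueError("at least one set of local weights is required")
    count = len(local_weights)
    first = local_weights[0]
    return {
        name: [
            sum(device[name][i] for device in local_weights) / count
            for i in range(len(first[name]))
        ]
        for name in first
    }


def distribute(global_weights: Weights, device_count: int) -> List[Weights]:
    """Give every device its own copy of the aggregated weights."""
    return [
        {name: list(values) for name, values in global_weights.items()}
        for _ in range(device_count)
    ]


if __name__ == "__main__":
    # Two hypothetical devices report weights for a single layer.
    reported = [{"layer": [1.0, 3.0]}, {"layer": [3.0, 5.0]}]
    print(distribute(aggregate_weights(reported), device_count=2))
```

Each device receives a separate copy, so a later local adjustment on one device does not alter the weights held by another.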

Claims 8 and 15 are similarly rejected; refer to claim 1 for further analysis.

Regarding claim 2, Shariat in view of Moloney teaches the computer-implemented method of claim 1, Shariat further teaches wherein obtaining, by one or more computing devices, a plurality of regions based at least in part on temporal availability of user devices further comprises generating the plurality of regions based at least in part on one or more of: time zones, latitude ranges, longitude ranges, semantic boundaries, user population, or, diurnal availability patterns (Shariat: Paragraph [0025] “a machine-learned model may generate predictions for estimated times of arrival (ETAs) for trip requests based on input data such as the distance from the provider to the rider, the geographic region in which the provider and the rider are located (e.g., city of San Francisco, country of U.S.), the day of week (e.g., Monday) and/or time of day (e.g. morning, afternoon, evening), and/or current traffic conditions in the geographic region, etc.” Obtaining, by one or more computing devices, a plurality of regions is taught as the provider/the rider devices in the multiple regions in California. Based at least in part on temporal availability of user devices is taught as the day of week (e.g., Monday) and/or time of day (e.g. morning, afternoon, evening)(i.e. diurnal cycles). Generating the plurality of regions based at least in part on one or more of user populations are taught as the rider and provider devices in San Francisco.).

Claims 9 and 16 are similarly rejected; refer to claim 2 for further analysis.

Regarding claim 3, Shariat in view of Moloney teaches the computer-implemented method of claim 2, Shariat further teaches wherein each region is generated such that each region comprises a user population having a similar diurnal cycle (Shariat: Paragraph [0025] “a machine-learned model may generate predictions for estimated times of arrival (ETAs) for trip requests based on input data such as the distance from the provider to the rider, the geographic region in which the provider and the rider are located (e.g., city of San Francisco, country of U.S.), the day of week (e.g., Monday) and/or time of day (e.g. morning, afternoon, evening), and/or current traffic conditions in the geographic region, etc.” A machine-learned model may generate predictions for estimated times of arrival (ETAs) for trip requests based on input data such as the distance from the provider to the rider, the geographic region in which the provider and the rider are located (e.g., city of San Francisco, country of U.S.), the day of week (e.g., Monday) and/or time of day (e.g. morning, afternoon, evening), and/or current traffic conditions in the geographic region, etc (i.e. grouping and generating regions of users based on day of the week, time of the day, i.e. diurnal cycles)).

Claims 10 and 17 are similarly rejected; refer to claim 3 for further analysis.

Regarding claim 4, Shariat in view of Moloney teaches the computer-implemented method of claim 1, Moloney further teaches further comprising associating, by the one or more computing devices, the copy of the global machine-learned model (Moloney: Paragraph [0019] “it allows the host server to learn a deep learning model that takes into account all available information (e.g., training data) and provide the globally-learned deep learning model to the local devices” The current version of the machine-learned model comprising a copy of a global machine-learned model is taught as the globally learned deep learning model to the local devices.) with each region wherein the machine-learned model associated with the region is trained using federated learning based on users in the region (Moloney: Paragraph [0024] “The example host server 104 aggregates the local deep learning model data received from the plurality of local devices 102 and distributes the aggregated results back to the plurality of local devices 102.” With each region wherein the machine-learned model associated with the region is taught as the local deep learning model data received from the plurality of local devices. Trained using federated learning based on users in the region is taught as the local devices may then utilize the globally updated respective local deep learning models.).
	It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the selective distribution of machine-learned models of Shariat with the global model of Moloney in order to utilize the distributive platform in which a host server learns based on all the local models thereby allowing the local devices to adopt the globally-learned deep learning model and also adjust it to take into account any local variations (Moloney: Paragraph [0019] “beneficial because (1) it allows the host server to learn a deep learning model that takes into account all available information (e.g., training data) and provide the globally-learned deep learning model to the local devices, and (2) it allows the local devices to adopt the globally -learned deep learning model and also adjust it to take into account any local variations.”).

Claim 11 is similarly rejected; refer to claim 4 for further analysis.

Claims 6, 13, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Shariat (US 20180115598) in view of Moloney (WO2018033890) and Kelm (WO 2018015080).
Regarding claim 6, Shariat in view of Moloney teaches the computer-implemented method of claim 1, Shariat further teaches…associated with the region (Shariat: Paragraph [0023] “For example, the travel coordination system 130 located at a server farm in California may process trip requests received within the West Coast of the United States.” Associated with the region is taught as the riders and providers associated with the server farm in California.)…associated with the region (Shariat: Paragraph [0023] “For example, the travel coordination system 130 located at a server farm in California may process trip requests received within the West Coast of the United States.” Associated with the region is taught as the riders and providers associated with the server farm in California.)…associated with at least one other region (Shariat: Paragraph [0023] “For example, the travel coordination system 130 located at a server farm in California may process trip requests received within the West Coast of the United States.” Associated with the region is taught as the riders and providers associated with the server farm in multiple areas of California.).
Shariat in view of Moloney does not explicitly disclose wherein generating, by the one or more computing devices, the updated machine-learned model … based on the updated machine-learned model data further comprises performing, by the one or more computing devices, multitask learning to bias the machine-learned model … toward at least one machine-learned model …
Kelm, when addressing multi-task neural networks, teaches wherein generating, by the one or more computing devices, the updated machine-learned model … based on the updated machine-learned model data further comprises performing, by the one or more computing devices, multitask learning to bias the machine-learned model … toward at least one machine-learned model … (Kelm: Page 7, Paragraph 3 “A multi-task network is a network using Multi-task learning (MTL). MTL is an approach in machine learning that refers to the joint training of multiple problems, enforcing a common intermediate parameterization or representation. If the different problems are sufficiently related, MTL can lead to better generalization and benefit all of the tasks. This often leads to a better model for the main task, because it allows the learner to use the commonality among the tasks. Therefore, multi-task learning is a kind of inductive transfer. This type of machine learning is an approach to inductive transfer that improves generalization by using the domain information contained in the training datasets of related tasks as an inductive bias. It does this by learning tasks in parallel while using a shared representation; what is learned for each task can help other tasks be learned better. The goal of MTL is to improve the performance of learning algorithms by learning classifiers for multiple tasks jointly. Multi-task learning works, because encouraging a classifier (or a modification thereof) to also perform well on a slightly different task is a better regularization than uninformed regularizers (e.g. to enforce that all weights are small).” Multitask learning to bias the machine-learned model toward at least one machine-learned model is taught as multi-task learning biasing a machine-learned model toward other models.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Shariat and Moloney with the multi-task learning of Kelm in order to utilize multi-task learning, which is an approach to inductive transfer, thereby improving generalization by using the domain information contained in the training datasets of related tasks as an inductive bias (Kelm: Page 3. “This type of machine learning is an approach to inductive transfer that improves generalization by using the domain information contained in the training datasets of related tasks as an inductive bias.”).

Claims 13 and 19 are similarly rejected; refer to claim 6 for further analysis.

Claims 5, 7, 12, 14, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Shariat (US 20180115598) in view of Moloney (WO2018033890) and Masato (JP WO2017/145852).
Regarding claim 5, Shariat in view of Moloney teaches the computer-implemented method of claim 1, Shariat further teaches … the plurality of selected user devices within the region (Shariat: Paragraph [0013] “the system environment 100 can have any number of client devices” The plurality of selected user devices within the region is taught as the system environment 100 can have any number of client devices. In this case the devices are associated with San Francisco.)… associated with the region (Shariat: Paragraph [0023] “For example, the travel coordination system 130 located at a server farm in California may process trip requests received within the West Coast of the United States.” Associated with the region is taught as the riders and providers associated with the server farm in California.)… associated with the region from at least one model of at least one other region (Shariat: Paragraph [0023] “For example, the travel coordination system 130 located at a server farm in California may process trip requests received within the West Coast of the United States.” Associated with the region is taught as the riders and providers associated with the server farm in California. The multiple cities of California are taught as the other regions.).
Shariat in view of Moloney does not explicitly disclose providing, by the one or more computing devices, a regularization term to …, wherein the regularization term is added to the loss function for training of the current version of the machine-learned model…, and wherein the regularization term represents a sum of distances, measured in parameter space, of the model …
Masato further teaches providing, by the one or more computing devices, a regularization term to …, wherein the regularization term is added to the loss function for training of the current version of the machine-learned model…, and wherein the regularization term represents a sum of distances, measured in parameter space, of the model … (Masato: Paragraph [7] “For example, the effect of regularization on learning will be described with reference to FIG. 3. FIG. 3 is a diagram illustrating an example of the function of the regularization term. For simplicity, consider the case of parameters w_1 and w_2. If regularization is too weak, two parameters are updated in the direction of reducing only the loss function (A in FIG. 3), and overlearning occurs. Conversely, if regularization is too strong, parameters are updated in a direction (B in FIG. 3) that reduces only the regularization term, many parameters converge to zero, and learning does not proceed. Therefore, it is necessary to adjust the update direction by appropriately setting the size of regularization (C in FIG. 3), and update so as not to cause any problems.” A regularization term to …, wherein the regularization term is added to the loss function for training of the current version of the machine-learned model is taught as the function of the regularization term. Wherein the regularization term represents a sum of distances, measured in parameter space, of the model is taught as the parameters that converge to zero and the adjustment of the update direction by setting the size of the regularization.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Shariat and Moloney with the regularization function of Masato in order to utilize a regularization term, thereby appropriately adjusting the setting of the [distance] parameters by applying the regularization term to the loss function in order to reduce problems (Masato: Paragraph [7] “if regularization is too strong, parameters are updated in a direction (B in FIG. 3) that reduces only the regularization term, many parameters converge to zero, and learning does not proceed. Therefore, it is necessary to adjust the update direction by appropriately setting the size of regularization (C in FIG. 3), and update so as not to cause any problems.”).
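For illustration only, the claim 5 limitation of a regularization term that is added to the loss function and represents a sum of distances in parameter space can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of any cited reference: the function names, the `strength` parameter, and the choice of squared Euclidean distance are all hypothetical.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two parameter vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("parameter vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def regularized_loss(
    base_loss: float,
    params: Sequence[float],
    other_region_params: List[Sequence[float]],
    strength: float,
) -> float:
    """Add a penalty equal to the summed distances from this region's
    parameters to the parameters of the other regions' models."""
    penalty = sum(squared_distance(params, other) for other in other_region_params)
    return base_loss + strength * penalty


if __name__ == "__main__":
    # One region's model compared against two hypothetical other regions.
    print(regularized_loss(1.0, [0.0, 0.0], [[1.0, 0.0], [0.0, 2.0]], strength=0.5))
```

A larger `strength` pulls the region's parameters toward the other regions' models, which corresponds to the trade-off between weak and strong regularization described in the quoted Masato passage.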

Claims 12 and 18 are similarly rejected; refer to claim 5 for further analysis.

Regarding claim 7, Shariat in view of Moloney teaches the computer-implemented method of claim 1. Shariat further teaches … of at least one other region in parameter space (Shariat: Paragraph [0023] “For example, the travel coordination system 130 located at a server farm in California may process trip requests received within the West Coast of the United States.” Other regions in the parameter space may be other cities in California.);… to the plurality of (Shariat: Paragraph [0013] “the system environment 100 can have any number of client devices” The plurality of selected user devices within the region is taught as the system environment 100 can have any number of client devices. In this case the devices are associated with San Francisco.)… of the model associated with the region from… associated with the region (Shariat: Paragraph [0023] “For example, the travel coordination system 130 located at a server farm in California may process trip requests received within the West Coast of the United States.” Associated with the region is taught as the models for the riders and providers associated with the server farm in California.).
Moloney further teaches computing, by the one or more devices, a centroid of at least one model … (Moloney: Paragraph [0014] “A centralized deep learning training platform includes a host server and a plurality of local devices. Each of the local devices can gather input data and transmit the input data to the host server. The host server can aggregate the input data received from each of the local devices,” Computing, by the one or more devices, a centroid of at least one model is taught as the centralized deep learning training platform includes a host server that can aggregate the input data received from each of the local devices.); and providing, by the one or more computing devices, the centroid …,… the centroid as (Moloney: Paragraph [0014] “A centralized deep learning training platform includes a host server and a plurality of local devices. Each of the local devices can gather input data and transmit the input data to the host server. The host server can aggregate the input data received from each of the local devices,” Computing, by the one or more devices, a centroid of at least one model is taught as the centralized deep learning training platform includes a host server that can aggregate the input data received from each of the local devices.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the selective distribution of machine-learned models of Shariat with the global model of Moloney in order to utilize the distributive platform in which a host server learns based on all the local models thereby allowing the local devices to adopt the globally-learned deep learning model and also adjust it to take into account any local variations (Moloney: Paragraph [0019] “beneficial because (1) it allows the host server to learn a deep learning model that takes into account all available information (e.g., training data) and provide the globally-learned deep learning model to the local devices, and (2) it allows the local devices to adopt the globally -learned deep learning model and also adjust it to take into account any local variations.”).
Masato further teaches …, wherein each of the plurality of selected user devices computes a distance, measured in parameter space, … a regularization term that is added to the loss function for training of the current version of the machine-learned model …(Masato: Paragraph [7] “For example, the effect of regularization on learning will be described with reference to FIG. 3. FIG. 3 is a diagram illustrating an example of the function of the regularization term. For simplicity, consider the case of parameters w_1 and w_2. If regularization is too weak, two parameters are updated in the direction of reducing only the loss function (A in FIG. 3), and overlearning occurs. Conversely, if regularization is too strong, parameters are updated in a direction (B in FIG. 3) that reduces only the regularization term, many parameters converge to zero, and learning does not proceed. Therefore, it is necessary to adjust the update direction by appropriately setting the size of regularization (C in FIG. 3), and update so as not to cause any problems.” A regularization term to …, wherein the regularization term is added to the loss function for training of the current version of the machine-learned model is taught as the function of the regularization term which is applied to the loss function of Shariat (Refer to Shariat Paragraph [0051] “generating the machine-learned models include determining a set of parameters for the model that minimize a loss as a function of the training set through any minimization algorithm.”). Wherein the regularization term represents a sum of distances, measured in parameter space, of the model is taught as the parameters that converge to zero and the adjustment of the update direction by setting the size of the regularization.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the combination of Shariat and Moloney with the regularization function of Masato in order to utilize a regularization term, thereby appropriately adjusting the setting of the [distance] parameters by applying the regularization term to the loss function in order to reduce problems (Masato: Paragraph [7] “if regularization is too strong, parameters are updated in a direction (B in FIG. 3) that reduces only the regularization term, many parameters converge to zero, and learning does not proceed. Therefore, it is necessary to adjust the update direction by appropriately setting the size of regularization (C in FIG. 3), and update so as not to cause any problems.”).
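For illustration only, the claim 7 limitations of computing a centroid of other regions' models and having each device compute a distance to that centroid in parameter space can be sketched as follows. This is a minimal sketch under stated assumptions: the names `centroid` and `distance_to_centroid` are hypothetical, the centroid is assumed to be the element-wise mean of the parameter vectors, and Euclidean distance is an assumed metric; none of these details is drawn from the cited references.

```python
import math
from typing import List, Sequence


def centroid(models: List[Sequence[float]]) -> List[float]:
    """Element-wise mean of several parameter vectors (assumed definition)."""
    if not models:
        raise ValueError("at least one model is required")
    size = len(models[0])
    if any(len(m) != size for m in models):
        raise ValueError("all parameter vectors must have the same length")
    return [sum(m[i] for m in models) / len(models) for i in range(size)]


def distance_to_centroid(params: Sequence[float], center: Sequence[float]) -> float:
    """Euclidean distance a device could add to its loss as a regularization term."""
    return math.sqrt(sum((p - c) ** 2 for p, c in zip(params, center)))


if __name__ == "__main__":
    # Hypothetical parameter vectors for two other regions.
    center = centroid([[0.0, 0.0], [2.0, 4.0]])
    print(center, distance_to_centroid([4.0, 6.0], center))
```

Sending a single centroid to the devices, rather than every other region's model, means each device computes one distance instead of a sum over all other regions.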
Claims 14 and 20 are similarly rejected; refer to claim 7 for further analysis.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  

Any inquiry concerning this communication or earlier communications from the examiner should be directed to AHSIF A. SHEIKH whose telephone number is (571)272-2607.  The examiner can normally be reached on Mon-Fri 7:30-5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov can be reached on 571-270-3428.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to 






/A.A.S./Examiner, Art Unit 2123
/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123