Traditional machine learning requires the training data to be gathered on a single machine or in a single data center, and Google and the other cloud giants have built enormous cloud computing infrastructure to process that data. Now, to train models from human-computer interaction on mobile devices, Google has coined a new term: Federated Learning.
Google says this will be another major direction for the future of machine learning.
So what is Federated Learning?
Federated Learning lets multiple smartphones collaboratively learn a shared prediction model while all of the training data stays on the device. In other words, storing data in the cloud is no longer a prerequisite for large-scale machine learning.
The most important point: Federated Learning goes beyond running local model predictions on the phone (as the Mobile Vision API and On-Device Smart Reply already do); it lets mobile devices collaborate on model training itself.
Federated Learning works as follows:
The smartphone downloads the current version of the model
It improves the model by learning from its local data
The improvements are summarized as a small, focused update
The update is encrypted and sent to the cloud
It is immediately averaged with other users' updates to improve the shared model
All training data remains on the end user's device, and no individual update is stored in the cloud.
As Lei Feng network sees it, the whole process has three key steps:
Each phone improves and personalizes the model locally, based on how its user behaves
The individual changes are combined into a single update to the overall model
That update is applied to the shared model
The process then repeats, round after round.
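The cycle above can be sketched in a few lines of code. This is a toy simulation under assumed simplifications (a linear model, three simulated phones, synthetic data); the function names are illustrative, not Google's actual API:

```python
import numpy as np

def client_update(global_w, X, y, lr=0.1, epochs=5):
    """On-device step: download the model, improve it on local data,
    and return only a small, focused update (the weight delta)."""
    w = global_w.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # mean-squared-error gradient
        w -= lr * grad
    return w - global_w                    # only the delta leaves the phone

def server_aggregate(global_w, deltas):
    """Cloud step: average the uploaded deltas into the shared model."""
    return global_w + np.mean(deltas, axis=0)

# Simulate the cycle: three phones, data generated from a true model w_true.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
w = np.zeros(2)
for _ in range(50):                        # 50 training rounds
    deltas = []
    for _ in range(3):                     # three simulated phones
        X = rng.normal(size=(20, 2))
        deltas.append(client_update(w, X, X @ w_true))
    w = server_aggregate(w, deltas)
```

After a few dozen rounds the shared model converges toward the true weights, even though the server only ever sees averaged deltas, never any phone's raw data.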
According to Google, Federated Learning's main advantages are:
Smarter models
Lower power consumption
Better protection of user privacy
On top of contributing updates to the shared model, the local improvements can be used immediately, giving each user a personalized experience.
Gboard, Google's input method
Google is currently testing Federated Learning in Gboard, its input method. When Gboard shows a suggested search query, the phone stores the relevant information locally, whether or not the user ends up tapping the suggestion. Federated Learning processes this on-device history and improves Gboard's query suggestion model.
The recommendation algorithm itself is much the same as before, but the model update now happens on the device first and is then aggregated in the cloud.
Technical challenges and solutions
Google says that implementing Federated Learning poses many algorithmic and technical challenges. For example:
In a typical machine learning system, a very large dataset is partitioned evenly across servers in the cloud, and an optimization algorithm such as stochastic gradient descent (SGD) runs on top of it. Iterative algorithms of this kind demand low-latency, high-throughput connections to the training data. In Federated Learning, by contrast, the data is spread across millions of mobile devices in a highly uneven way, and those devices have higher latency, lower network throughput, and are only intermittently available for training, since the user's everyday use must not be disturbed.
To work within these bandwidth and latency constraints, Google developed an algorithm called Federated Averaging, which trains deep neural networks with 10 to 100 times less communication than a naively federated version of SGD. The core idea is to use the increasingly powerful processors in modern smartphones to compute higher-quality updates rather than single optimization steps. Because fewer iterations of high-quality updates are needed to build a good model, training requires far less communication.
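The averaging rule at the heart of the approach can be stated compactly: each phone's locally trained model is weighted by how many training examples it holds. This is a minimal sketch of the published FedAvg weighting idea, not any specific Google code:

```python
import numpy as np

def federated_averaging(client_models, client_sizes):
    """Combine locally trained model weights into the shared model,
    weighting each phone by its number of local training examples."""
    total = sum(client_sizes)
    return sum((n / total) * w for w, n in zip(client_models, client_sizes))

# A phone with 300 examples pulls the shared model three times as hard
# as a phone with 100 examples.
avg = federated_averaging(
    [np.array([1.0, 0.0]), np.array([0.0, 1.0])],
    [300, 100],
)
```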
Because uplink speeds are typically much slower than downlink speeds, Google also developed a novel way to cut uplink communication by a further factor of 100 or more: compressing updates using random rotations and quantization. These approaches target deep network training, but Google has also designed algorithms for high-dimensional sparse convex models, which excel at problems such as click-through rate prediction.
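The quantization half of that compression scheme can be sketched as follows. This is an assumed, simplified form (uniform stochastic quantization, omitting the random rotation), not Google's exact algorithm:

```python
import numpy as np

def compress_update(update, bits=2, seed=0):
    """Quantize 64-bit float updates down to a few bits per value,
    shrinking the uplink payload. Rounding is randomized so each
    quantized value is an unbiased estimate of the original."""
    rng = np.random.default_rng(seed)
    lo, hi = float(update.min()), float(update.max())
    levels = 2 ** bits - 1
    scaled = (update - lo) / (hi - lo + 1e-12) * levels
    base = np.floor(scaled)
    # round up with probability equal to the fractional part
    q = (base + (rng.random(update.shape) < scaled - base)).astype(np.uint8)
    return q, lo, hi  # a few bits per value, plus two floats of metadata

def decompress_update(q, lo, hi, bits=2):
    """Server side: map the small integers back to approximate floats."""
    return lo + q.astype(np.float64) / (2 ** bits - 1) * (hi - lo)
```

With 2 bits per value instead of 64, each reconstructed value is off by at most one quantization level, while the payload shrinks dramatically.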
Deploying Federated Learning across millions of heterogeneous smartphones requires a very sophisticated technology stack. On-device training uses a miniature version of TensorFlow, and a fine-grained scheduler ensures the model trains only when the phone is idle, plugged in, and on Wi-Fi, so Federated Learning never hurts performance during everyday use.
Google stresses that Federated Learning must not compromise the user experience in any way; only with that guarantee met does a phone participate in Federated Learning.
The system then needs to aggregate the model updates in a secure, efficient, scalable, and fault-tolerant way.
Federated Learning already avoids storing user data in the cloud, but to rule out privacy leaks Google went a step further and developed a cryptographic protocol called Secure Aggregation. Under this protocol, the server can only decrypt the average update over at least 100 or 1,000 users; no individual user's update can be inspected before aggregation.
According to Lei Feng network, this is the first protocol of its kind that is practical for deep-network-sized problems under real-world communication bottlenecks. Google says Federated Averaging was designed so that the server only ever needs the aggregated update, which is exactly where Secure Aggregation comes in. The protocol also has broader potential and can be applied to other problems; Google is pushing toward a production implementation.
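The cancellation trick behind Secure Aggregation can be illustrated with a heavily simplified toy: the real protocol derives masks from cryptographic key exchange and survives users dropping out, while this sketch only shows why the summed masks vanish:

```python
import numpy as np

def mask_updates(updates, seed=42):
    """Every pair of users shares a random mask; one adds it, the other
    subtracts it. Each individual upload looks like noise, but the
    masks cancel exactly when the server sums all the uploads."""
    masked = [u.astype(float).copy() for u in updates]
    rng = np.random.default_rng(seed)
    for i in range(len(updates)):
        for j in range(i + 1, len(updates)):
            pair_mask = rng.normal(size=updates[0].shape)
            masked[i] += pair_mask   # user i adds the shared mask
            masked[j] -= pair_mask   # user j subtracts the same mask
    return masked

updates = [np.full(3, 1.0), np.full(3, 2.0), np.full(3, 3.0)]
masked = mask_updates(updates)
```

Summing the masked uploads recovers the true total exactly, while each upload on its own reveals nothing useful about that user's update.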
Google says Federated Learning's potential is enormous, but so far it has only scratched the surface. It cannot address every machine learning problem: for many models, the necessary training data already lives in the cloud (training Gmail's spam filter, for example). Google will therefore continue to explore cloud-based ML while remaining, in its words, "determined" to keep expanding Federated Learning's capabilities. Beyond query suggestions in Gboard, Google hopes to improve language models based on what people actually type on their phones, and to improve photo ranking based on browsing data.
Applying Federated Learning requires machine learning developers to adopt new development tools and a new mindset, from model development and training through to model evaluation.
Whether Federated Learning becomes a major topic in AI or stays in the lab like mesh networking, we will wait and see.