Data silos, data privacy, and data security are the “three mountains” that cannot be avoided when applying artificial intelligence and cloud computing in industry at scale.
As a new generation of artificial intelligence algorithm, “Federated Learning” makes joint modeling possible and improves AI model performance while keeping data local. This protects data privacy and breaks through the limits of small data sets and data silos, which makes it one of the ways to cross the “three mountains”. As the world’s first industrial-scale open source project in Federated Learning, FATE has therefore received a lot of attention, and developers are eager to join in building its community. (FATE open source community address: https://github.com/FederatedAI/FATE).
After the introduction of the contributor bounty program, the FATE open-source community welcomed its first-level contributor: Yang Liu from Tencent Cloud. How does Federated Learning empower data security in industry? How do privacy practitioners assess FATE? Dr. Yang Liu shared his views in an interview.
Data operations improve by 70%, speeding up the implementation of enterprise applications
Yang Liu, who holds a Ph.D. from The Australian National University, is a senior researcher at Tencent Cloud, responsible for the privacy protection algorithms of Tencent Shield Sandbox. He said that because of his work he has been closely following “Federated Learning” since the beginning of this year.
That is how FATE came to the attention of Yang Liu and the Tencent Cloud team. After studying FATE in depth, Yang Liu concluded that the concept of privacy security plus distributed learning behind Tencent Shield Sandbox coincides with the three issues FATE sets out to solve: “data security”, “data privacy” and “data compliance”. The team has gradually started using FATE to meet the functional requirements of Shield Sandbox.
Yang Liu said that after working with FATE for a long time, he came to agree with its logistic regression and XGBoost algorithm flows, and so he joined the construction of the FATE open-source community and put forward optimization suggestions. Replacing the Paillier cipher with a symmetric affine cipher reduces the load of homomorphic operations and can improve training time by more than 70%. In the future, when partner enterprises apply the optimized version of FATE, it can effectively reduce the time cost of data computation and improve their technological competitiveness in the AI era.
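To give a feel for why a symmetric affine cipher can be cheaper than Paillier, here is a minimal toy sketch of an additively homomorphic affine scheme. It is not FATE's implementation and it is not secure for real use; the names ToyAffineKey and AffineCiphertext, the key size, and the way the offset is tracked are all assumptions made for illustration. The point is that encryption, ciphertext addition, and decryption need only modular multiplication and addition, whereas Paillier relies on modular exponentiation.

```python
import secrets
from dataclasses import dataclass
from math import gcd


@dataclass(frozen=True)
class AffineCiphertext:
    value: int  # (a * x + b * mult) mod n
    mult: int   # how many copies of the secret offset b are folded into value
    n: int      # modulus, carried along so ciphertexts can be combined

    def __add__(self, other):
        # Adding ciphertexts adds the hidden plaintexts (additive homomorphism).
        return AffineCiphertext((self.value + other.value) % self.n,
                                self.mult + other.mult, self.n)

    def scale(self, k):
        # Multiplying by a plaintext integer k scales the hidden plaintext by k.
        return AffineCiphertext((self.value * k) % self.n, self.mult * k, self.n)


class ToyAffineKey:
    """Hypothetical symmetric key (a, b, n); illustration only, not secure."""

    def __init__(self, bits=256):
        # Random odd modulus with the top bit set.
        self.n = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        while True:
            a = secrets.randbelow(self.n - 2) + 2
            if gcd(a, self.n) == 1:  # a must be invertible modulo n
                break
        self.a = a
        self.a_inv = pow(a, -1, self.n)  # modular inverse (Python 3.8+)
        self.b = secrets.randbelow(self.n)

    def encrypt(self, x):
        return AffineCiphertext((self.a * x + self.b) % self.n, 1, self.n)

    def decrypt(self, c):
        x = ((c.value - self.b * c.mult) * self.a_inv) % self.n
        # Map the upper half of the range back to negative integers.
        return x - self.n if x > self.n // 2 else x
```

For example, with key = ToyAffineKey(), decrypting key.encrypt(12) + key.encrypt(30) recovers the sum of the two plaintexts without the party doing the addition ever seeing them. The trade-off is that the scheme is symmetric: whoever encrypts can also decrypt, unlike Paillier's public-key setting.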
Data security is an urgent issue in the industry
In AI application scenarios, the traditional way of cooperating, in which multiple parties merge their data in one data center for processing, has a serious privacy leakage problem, which has even become a key obstacle to enterprises applying AI at scale.
In Yang Liu’s opinion, the key to solving the data security problem is the compromise between data privacy and utility. Specifically, for data to leave its island and be shared safely, it must go through some kind of “masking” operation. One option is to encrypt the data into ciphertext with cryptographic tools, which protects privacy, but who holds the key greatly affects the utility of the data. Another is to perturb the original data with noise, as in differential privacy: the more noise is added, the stronger the privacy guarantee, but the lower the utility of the data the user obtains. How to find a compromise between privacy and utility is one of the key issues in the secure circulation of data.
Ideally, any data user could perform efficient data mining on freely flowing, aggregated distributed data without noticing any privacy constraints. In the MPC (Multi-Party Computation) field, the industry is still largely stuck on solutions such as garbled circuits and trusted computing. Although these support general computing tasks, they require extra hardware support and have a higher learning cost, which hinders large-scale application and does not help form a secure data alliance.
Within a general federated framework, Federated Learning instead customizes the privacy protection for each kind of machine learning algorithm, so that using it is no different from using a classic centralized machine learning model. In this way Federated Learning ensures ease of use while keeping costs stable. Yang Liu said that this makes the solution offered by Federated Learning more attractive to companies; for the industry, easier operation will attract more effort from developers, which promotes the construction of a secure data alliance.
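The noise-versus-utility trade-off can be made concrete with a small sketch of the Laplace mechanism from differential privacy, applied to a bounded mean. This is an illustration only: the function names, the choice of a mean query, and the bounds are assumptions, and Python's random module is used for reproducibility rather than for security.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Release the mean of values clipped to [lower, upper] with epsilon-DP noise."""
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    # Changing one record moves the mean by at most this much.
    sensitivity = (upper - lower) / len(clipped)
    # Smaller epsilon means stronger privacy and therefore larger noise.
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


def average_error(values, lower, upper, epsilon, trials=500, seed=0):
    """Mean absolute error of the released value over repeated releases."""
    rng = random.Random(seed)  # seeded only to make the demo repeatable
    true_mean = sum(values) / len(values)
    errors = [abs(private_mean(values, lower, upper, epsilon, rng) - true_mean)
              for _ in range(trials)]
    return sum(errors) / trials
```

Calling average_error on the same data with epsilon=0.1 and then with epsilon=10.0 shows the compromise described above: the stricter privacy setting yields a much larger average error, so the released statistic is less useful.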
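The claim that a federated model can be used like a centralized one can be illustrated with a deliberately simplified sketch. It assumes a horizontal setting in which each party holds different samples of the same features, fits a one-variable linear model, and shares plaintext gradients. A real deployment such as FATE would additionally protect the exchanged values, for example with homomorphic encryption, and none of the function names below come from FATE.

```python
def local_gradient(w, b, data):
    """Gradient of mean squared error for y = w*x + b on one party's own data."""
    n = len(data)
    gw = sum(2 * (w * x + b - y) * x for x, y in data) / n
    gb = sum(2 * (w * x + b - y) for x, y in data) / n
    return gw, gb, n


def federated_step(w, b, parties, lr):
    """One round: raw data stays local; only gradients and sample counts are shared."""
    updates = [local_gradient(w, b, data) for data in parties]
    total = sum(n for _, _, n in updates)
    # Weight each party's gradient by how many samples it holds.
    gw = sum(g * n for g, _, n in updates) / total
    gb = sum(g * n for _, g, n in updates) / total
    return w - lr * gw, b - lr * gb


def train(parties, rounds=500, lr=0.05):
    """Run gradient descent from zero over the given list of parties."""
    w, b = 0.0, 0.0
    for _ in range(rounds):
        w, b = federated_step(w, b, parties, lr)
    return w, b
```

Training with train([party_a, party_b]) and with train([party_a + party_b]) gives the same model up to floating-point error, because the sample-weighted average of the local gradients equals the gradient on the pooled data. The difference is that in the first call no party ever hands over its raw records.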
FATE ecosystem × Tencent Cloud: looking ahead to the future of data security
FATE and Tencent Cloud Shield Sandbox have been conducting business and technical exchanges since early May of this year. Currently, the core computing module of Shield Sandbox is provided by FATE, and the two sides work closely together in building the platform. Yang Liu said in the interview that as the team uses the FATE framework and algorithms, it will contribute useful suggestions to the FATE open source project and take part in building the open-source community.
This form of cooperation, characterized by “mutual benefit and open-source construction”, not only helps polish the Shield Sandbox product and improve the FATE project, but also sets a good example for other technical projects and teams: embracing new technologies in an open manner benefits not only the company but also the development of the industry.
Yang Liu envisions that in the future the two parties can cooperate more deeply on technical influence and business implementation, for example by publishing significant papers, filing patents, and jointly taking on real internal and external business, so that “academia” and “industry” advance together.
As more and more contributors join in building the theoretical standards and industrial applications of FATE, FATE is bound to have broader prospects. In this regard, Yang Liu said that the combination of Shield Sandbox and FATE will help data security take root and grow faster, and build a future secure data alliance on top of the data silos.