FederatedML
Introduction
FederatedML includes implementations of many common machine learning algorithms as well as the necessary utility tools. All modules are developed in a decoupled, modular way to enhance scalability. Specifically, we provide:
1. FML Algorithms: Federated machine learning algorithms for data I/O, data preprocessing, feature engineering, and modeling. More details are listed below.
2. Utilities: Tools that enable federated learning, such as encryption tools, statistics modules, parameter definitions, and a transfer variable autogenerator.
3. Framework: Kits and base models for developing new algorithm modules. Framework provides reusable functions to standardize modules and make them compact.
4. Secure Protocol: Provides multiple security protocols for more secure multi-party interactive computation.
Algorithm List
DataIO
This component is typically the first component of a modeling task. It transforms user-uploaded data into Instance objects that the following components can use.
- Corresponding module name: DataIO
- Data Input: DTable whose values are raw data.
- Data Output: Transformed DTable whose values are data instances defined in federatedml/feature/instance.py
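The transformation from raw values to instances can be pictured with a minimal sketch. It uses a plain dict in place of a DTable, and the `Instance` dataclass, the `to_instances` helper, and the "label is the first field" layout are all assumptions made for illustration; they are not the actual classes or options of the DataIO module.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Instance:
    """Hypothetical stand-in, not the class in federatedml/feature/instance.py."""
    features: List[float]
    label: Optional[int] = None


def to_instances(raw_table: Dict[str, str], with_label: bool = True,
                 delimiter: str = ",") -> Dict[str, Instance]:
    """Parse each raw value string into an Instance, keeping the same keys."""
    table = {}
    for key, line in raw_table.items():
        fields = line.split(delimiter)
        if with_label:
            # Assumption: the label is the first field of the raw value.
            table[key] = Instance([float(v) for v in fields[1:]], int(fields[0]))
        else:
            table[key] = Instance([float(v) for v in fields])
    return table


# A dict keyed by sample id stands in for the uploaded DTable.
raw_table = {"id1": "1,0.5,2.0", "id2": "0,1.5,3.0"}
instances = to_instances(raw_table)
```

The keys are preserved, so later components can still match records across parties by id.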
Intersect
Computes the intersection of two parties' data sets without leaking information about the difference set. Mainly used in hetero scenario tasks.
- Corresponding module name: Intersection
- Data Input: DTable
- Data Output: DTable whose keys occur in both parties.
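The output contract described above can be sketched in a few lines. This sketch works on plaintext keys held in one process, so it offers none of the privacy protection the real module provides; it only illustrates which records each party ends up with. The `intersect` helper and the dict-based tables are hypothetical.

```python
def intersect(guest, host):
    """Return each party's records restricted to the keys both parties hold.

    Plaintext illustration only: a real protocol would never let one
    party see the other's raw keys.
    """
    common = sorted(guest.keys() & host.keys())
    return {k: guest[k] for k in common}, {k: host[k] for k in common}


guest = {"u1": [0.1], "u2": [0.2], "u3": [0.3]}
host = {"u2": [5.0], "u3": [6.0], "u4": [7.0]}
guest_out, host_out = intersect(guest, host)
```

Here `u1` and `u4` belong to the difference set, which is exactly the information the secure version is meant to hide from the other party.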
Federated Sampling
Samples data in a federated way so that its distribution becomes balanced in each party. This module supports both federated and standalone versions.
- Corresponding module name: FederatedSample
- Data Input: DTable
- Data Output: The sampled data. Both random and stratified sampling are supported.
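The difference between the two sampling modes can be illustrated with a standalone sketch. The function names, the `fraction` and `seed` parameters, and the dict of `key -> label` standing in for a DTable are assumptions for illustration, not the module's actual parameters.

```python
import random
from collections import defaultdict


def random_sample(table, fraction, seed=0):
    """Pick a fraction of all keys uniformly at random."""
    rng = random.Random(seed)
    keys = sorted(table)
    picked = rng.sample(keys, round(len(keys) * fraction))
    return {k: table[k] for k in picked}


def stratified_sample(table, fraction, seed=0):
    """Pick the same fraction from every label group separately."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for key in sorted(table):
        by_label[table[key]].append(key)
    sampled = {}
    for label, keys in by_label.items():
        for k in rng.sample(keys, round(len(keys) * fraction)):
            sampled[k] = label
    return sampled


# Eight records with label 0 and four with label 1.
table = {f"id{i}": (0 if i < 8 else 1) for i in range(12)}
subset = stratified_sample(table, 0.5)
```

Random sampling only fixes the overall size of the result, while stratified sampling fixes the size of each label group. Using a different fraction per label is one way to rebalance a skewed data set.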
Feature Scale
Module for feature scaling and standardization.
- Corresponding module name: FeatureScale
- Data Input: DTable, whose values are instances.
- Data Output: Transformed DTable.
- Model Output: Transform factors like min/max, mean/std.
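The relation between the transform factors and the transformed data can be sketched for a single column. The helper names are hypothetical, and the use of the population standard deviation is an assumption; the module may compute its factors differently.

```python
from statistics import mean, pstdev


def fit_min_max(column):
    """Return the (min, max) factors for one column."""
    return min(column), max(column)


def min_max_scale(column, lo, hi):
    """Map values linearly into [0, 1]; a constant column maps to 0.0."""
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in column]


def fit_standard(column):
    """Return the (mean, std) factors, using the population std."""
    return mean(column), pstdev(column)


def standard_scale(column, mu, sigma):
    """Center to zero mean and scale to unit std."""
    return [(v - mu) / sigma if sigma else 0.0 for v in column]


column = [1.0, 2.0, 3.0, 4.0, 5.0]
lo, hi = fit_min_max(column)
scaled = min_max_scale(column, lo, hi)
mu, sigma = fit_standard(column)
standardized = standard_scale(column, mu, sigma)
```

Keeping the fitted factors separate from the scaling step is what allows them to be saved as a model and reapplied to new data.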
Hetero Feature Binning
Bins the input data, calculates each column's IV (information value) and WOE (weight of evidence), and transforms the data according to the binning information.
- Corresponding module name: HeteroFeatureBinning
- Data Input: DTable with y in guest and without y in host.
- Data Output: Transformed DTable.
- Model Output: iv/woe, split points, event counts, non-event counts etc. of each column.
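The quantities listed in the model output can be computed for one column as in the sketch below. It runs on plaintext data held by a single party, so the federated exchange between guest and host is left out entirely. The `woe_iv` helper is hypothetical, and the sign convention for WOE (event rate over non-event rate) is an assumption, since conventions differ between tools. Bins with a zero count would need smoothing, which is omitted here.

```python
import math
from bisect import bisect_left


def woe_iv(values, labels, split_points):
    """Compute per-bin WOE, the column IV, and per-bin event counts.

    A value v falls into the first bin whose split point is >= v;
    values above the last split point go into the final bin.
    """
    n_bins = len(split_points) + 1
    event = [0] * n_bins
    non_event = [0] * n_bins
    for v, y in zip(values, labels):
        b = bisect_left(split_points, v)
        if y == 1:
            event[b] += 1
        else:
            non_event[b] += 1

    total_event, total_non_event = sum(event), sum(non_event)
    woe, iv = [], 0.0
    for e, n in zip(event, non_event):
        e_rate = e / total_event
        n_rate = n / total_non_event
        # Assumed convention: WOE = ln(event rate / non-event rate).
        w = math.log(e_rate / n_rate)
        woe.append(w)
        iv += (e_rate - n_rate) * w
    return woe, iv, event, non_event


values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
woe, iv, event, non_event = woe_iv(values, labels, split_points=[4.0])
```

Flipping the WOE convention changes the sign of each bin's WOE but leaves IV unchanged, because both factors in each IV term change sign together.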
OneHot Encoder
Transforms a column into one-hot format.
- Corresponding module name: OneHotEncoder
- Data Input: Input DTable.
- Data Output: Transformed DTable with new headers.
- Model Output: A map from the original header and feature values to the new header.
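How one column expands into several new header entries can be sketched as follows. The `one_hot` helper and the `column_value` naming scheme for new headers are assumptions for illustration; the module may name its output columns differently.

```python
def one_hot(header, rows, column):
    """Replace one column by 0/1 indicator columns, one per distinct value.

    Returns the new header, the transformed rows, and a map from each
    original value to the new header name it produced.
    """
    idx = header.index(column)
    values = sorted({row[idx] for row in rows})
    # Assumed naming scheme: "<original column>_<value>".
    value_to_header = {v: f"{column}_{v}" for v in values}
    new_header = header[:idx] + [value_to_header[v] for v in values] + header[idx + 1:]
    new_rows = [
        row[:idx] + [1 if row[idx] == v else 0 for v in values] + row[idx + 1:]
        for row in rows
    ]
    return new_header, new_rows, {column: value_to_header}


header = ["age", "color"]
rows = [[30, "red"], [41, "blue"], [25, "red"]]
new_header, new_rows, header_map = one_hot(header, rows, "color")
```

The returned map plays the role of the model output: it records which original value each new column stands for, so the same encoding can be applied to later data.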
Hetero Feature Selection
Provides 5 types of filters. Each filter selects columns according to the user config.
- Corresponding module name: HeteroFeatureSelection
- Data Input: Input DTable.
- Data Output: Transformed DTable with new headers and filtered data instances.
- Model Output: Whether each column is kept or not.
- Model Input: If IV filters are used, a hetero_binning model is needed.
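As one example of such a filter, the sketch below keeps the columns whose IV reaches a threshold. The IV values stand in for what a binning model would supply. The `iv_filter` helper, the `iv_threshold` parameter, and its default value are hypothetical and do not reflect the module's actual configuration keys.

```python
def iv_filter(header, rows, iv_by_column, iv_threshold=0.1):
    """Keep only the columns whose IV is at least iv_threshold.

    Returns the new header, the filtered rows, and a per-column flag
    saying whether the column was kept.
    """
    left = {col: iv_by_column.get(col, 0.0) >= iv_threshold for col in header}
    keep_idx = [i for i, col in enumerate(header) if left[col]]
    new_header = [header[i] for i in keep_idx]
    new_rows = [[row[i] for i in keep_idx] for row in rows]
    return new_header, new_rows, left


header = ["x0", "x1", "x2"]
rows = [[1, 2, 3], [4, 5, 6]]
# Illustrative IV values, as if taken from a previously fitted binning model.
iv_by_column = {"x0": 0.35, "x1": 0.02, "x2": 0.18}
new_header, new_rows, left = iv_filter(header, rows, iv_by_column)
```

The `left` flags correspond to the model output, so the same column selection can be replayed on new data without recomputing IV.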
Hetero LR
Builds a hetero logistic regression model across multiple parties.
- Corresponding module name: HeteroLR
- Data Input: Input DTable.
- Model Output: Logistic Regression model.
Hetero LinR
Builds a hetero linear regression model across multiple parties.
- Corresponding module name: HeteroLinR
- Data Input: Input DTable.
- Model Output: Linear Regression model.
Hetero Poisson
Builds a hetero Poisson regression model across multiple parties.
- Corresponding module name: HeteroPoisson
- Data Input: Input DTable.
- Model Output: Poisson Regression model.
Homo LR
Builds a homo logistic regression model across multiple parties.
- Corresponding module name: HomoLR
- Data Input: Input DTable.
- Model Output: Logistic Regression model.
Homo NN
Builds a homo neural network model across multiple parties.
- Corresponding module name: HomoNN
- Data Input: Input DTable.
- Model Output: Neural Network model.
Hetero Secure Boosting
Builds a hetero secure boosting model across multiple parties.
- Corresponding module name: HeteroSecureBoost
- Data Input: DTable whose values are instances.
- Model Output: SecureBoost model, consisting of model-meta and model-param.
Evaluation
Outputs the model evaluation metrics for the user.
- Corresponding module name: Evaluation
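The kind of metrics such a component might report for a binary classifier can be sketched as below. The choice of accuracy, precision, and recall, the `binary_metrics` helper, and the 0.5 threshold are all assumptions for illustration; this section does not list which metrics the Evaluation module actually outputs.

```python
def binary_metrics(labels, scores, threshold=0.5):
    """Compute accuracy, precision, and recall from predicted scores."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(labels),
        # Guard against division by zero when nothing is predicted positive
        # or when there are no positive labels.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]
metrics = binary_metrics(labels, scores)
```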
More algorithms are coming soon.