FederatedML

Introduction

FederatedML includes implementations of many common machine learning algorithms, as well as the necessary utility tools. All modules are developed in a decoupled, modular fashion to enhance scalability. Specifically, we provide:

1. FML Algorithms: Federated machine learning algorithms for data I/O, data preprocessing, feature engineering, and modeling. More details are listed below.

2. Utilities: Tools that enable federated learning, such as encryption tools, statistics modules, parameter definitions, and the transfer variable auto-generator.

3. Framework: Kits and base models for developing new algorithm modules. Framework provides reusable functions to standardize modules and make them compact.

4. Secure Protocol: Provides multiple security protocols for more secure multi-party interactive computation.

Algorithm List

DataIO

This component is typically the first component of a modeling task. It transforms user-uploaded data into Instance objects, which can then be used by the following components.

  • Corresponding module name: DataIO
  • Data Input: DTable, values are raw data.
  • Data Output: Transformed DTable, whose values are data instances defined in federatedml/feature/instance.py.
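
The transformation described above can be pictured with a small sketch. It is not the DataIO implementation: the `Instance` class here is a hypothetical stand-in for the one in federatedml/feature/instance.py, a plain dict stands in for a DTable, and the "label first, comma-delimited" row layout is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    # Hypothetical stand-in; the real class may carry other fields.
    features: List[float]
    label: Optional[int] = None


def to_instance(raw: str, with_label: bool = True, delimiter: str = ",") -> Instance:
    """Parse one raw row into an Instance (assumed layout: label first)."""
    fields = raw.split(delimiter)
    if with_label:
        return Instance(features=[float(v) for v in fields[1:]], label=int(fields[0]))
    return Instance(features=[float(v) for v in fields])


# A dict keyed by sample id plays the role of the DTable in this sketch.
raw_table = {"u1": "1,0.5,3.2", "u2": "0,1.5,2.0"}
instances = {key: to_instance(value) for key, value in raw_table.items()}
```

Keys are left untouched and only the values change type, which mirrors the "Data Input" and "Data Output" descriptions above.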

Intersect

Computes the intersection of two parties' data sets without leaking information about the difference set. Mainly used in hetero scenario tasks.

  • Corresponding module name: Intersection
  • Data Input: DTable
  • Data Output: DTable whose keys occur in both parties.
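
To make the shape of the output concrete, here is a minimal sketch of what each party ends up holding. It deliberately shows only the result: a plain set intersection like this exposes every key and gives none of the privacy guarantee that the module provides. The function name and the use of dicts in place of DTables are assumptions for illustration.

```python
def intersect_tables(guest, host):
    """Keep only the entries whose keys appear in both tables.

    NOTE: this is an insecure illustration of the output only. The actual
    module computes the same result without revealing non-shared keys.
    """
    common = sorted(guest.keys() & host.keys())
    guest_out = {key: guest[key] for key in common}
    host_out = {key: host[key] for key in common}
    return guest_out, host_out


guest_table = {"u1": [1.0], "u2": [2.0], "u3": [3.0]}
host_table = {"u2": [20.0], "u3": [30.0], "u4": [40.0]}
guest_common, host_common = intersect_tables(guest_table, host_table)
```

After the call, both parties hold tables with identical keys, and each keeps its own values.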

Federated Sampling

Samples data in a federated manner so that its distribution becomes balanced in each party. This module supports both federated and standalone versions.

  • Corresponding module name: FederatedSample
  • Data Input: DTable
  • Data Output: The sampled data. Both random and stratified sampling are supported.
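
The difference between the two sampling modes can be sketched in standalone form as follows. This is an illustration under assumptions rather than the module's code: rows are `(key, label)` tuples, and the function names, the `fraction` parameter, and the seeding scheme are all hypothetical.

```python
import random
from collections import defaultdict


def random_sample(rows, fraction, seed=0):
    """Draw a simple random sample without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, round(len(rows) * fraction))


def stratified_sample(rows, fraction, seed=0):
    """Sample the same fraction from every label group."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for key, label in rows:
        by_label[label].append((key, label))
    sampled = []
    for label in sorted(by_label):
        group = by_label[label]
        # Keep at least one row per label so no class disappears.
        sampled.extend(rng.sample(group, max(1, round(len(group) * fraction))))
    return sampled


rows = [(i, 0) for i in range(90)] + [(i, 1) for i in range(90, 100)]
picked = stratified_sample(rows, 0.2)
```

Stratified sampling preserves the 9:1 label ratio of `rows` exactly, whereas `random_sample` only preserves it on average.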

Feature Scale

Module for feature scaling and standardization.

  • Corresponding module name: FeatureScale
  • Data Input: DTable, whose values are instances.
  • Data Output: Transformed DTable.
  • Model Output: Transform factors like min/max, mean/std.
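
As a sketch of how transform factors such as min/max and mean/std relate to the scaled output, the following plain-Python example fits the factors on one column and then applies them. The function names are hypothetical and this is not the FeatureScale implementation. Using the population standard deviation is an assumption.

```python
import statistics


def fit_min_max(column):
    """Return the transform factors for min-max scaling."""
    return {"min": min(column), "max": max(column)}


def apply_min_max(column, factors):
    span = factors["max"] - factors["min"]
    if span == 0:  # constant column: map everything to 0.0
        return [0.0 for _ in column]
    return [(x - factors["min"]) / span for x in column]


def fit_standard(column):
    """Return the transform factors for standardization."""
    return {"mean": statistics.mean(column), "std": statistics.pstdev(column)}


def apply_standard(column, factors):
    if factors["std"] == 0:  # constant column: map everything to 0.0
        return [0.0 for _ in column]
    return [(x - factors["mean"]) / factors["std"] for x in column]


column = [2.0, 4.0, 6.0]
min_max_scaled = apply_min_max(column, fit_min_max(column))
standard_scaled = apply_standard(column, fit_standard(column))
```

Keeping "fit" and "apply" separate is what makes the factors worth saving as model output: the same factors can be reapplied to new data later.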

Hetero Feature Binning

Bins the input data, calculates each column's IV (information value) and WOE (weight of evidence), and transforms the data according to the binning information.

  • Corresponding module name: HeteroFeatureBinning
  • Data Input: DTable. The guest's data contains the label y, and the host's data does not.
  • Data Output: Transformed DTable.
  • Model Output: IV/WOE, split points, event counts, non-event counts, etc. of each column.
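
For readers unfamiliar with WOE and IV, here is a minimal, non-federated sketch of how they follow from the per-bin event and non-event counts listed above. The sign convention for WOE varies between references; the one used here, `ln(event_rate / non_event_rate)`, is an assumption, as are the function name and the `adjustment` smoothing for empty bins. The IV value does not depend on the sign convention.

```python
import math


def woe_iv(event_counts, non_event_counts, adjustment=0.5):
    """Compute per-bin WOE and the column's total IV from bin counts."""
    total_event = sum(event_counts)
    total_non_event = sum(non_event_counts)
    woes = []
    iv = 0.0
    for events, non_events in zip(event_counts, non_event_counts):
        if events == 0 or non_events == 0:
            # Assumed smoothing so that log() never sees zero.
            events, non_events = events + adjustment, non_events + adjustment
        event_rate = events / total_event
        non_event_rate = non_events / total_non_event
        woe = math.log(event_rate / non_event_rate)
        woes.append(woe)
        iv += (event_rate - non_event_rate) * woe
    return woes, iv


# Two bins: the second bin holds most of the events.
woes, iv = woe_iv(event_counts=[10, 30], non_event_counts=[40, 20])
```

A column whose bins all have the same event-to-non-event ratio gets an IV of zero, which is why IV is useful as a feature-selection score.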

OneHot Encoder

Transforms a column into one-hot format.

  • Corresponding module name: OneHotEncoder
  • Data Input: Input DTable.
  • Data Output: Transformed DTable with new headers.
  • Model Output: Map from the original header and feature values to the new header.
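
The relationship between the original header, the feature values, and the new header can be sketched as below. This is an illustration, not the OneHotEncoder code: the function names and the `column_value` naming scheme for new headers are assumptions.

```python
def fit_one_hot(header, rows, columns):
    """Build the map {column: {value: new_header_name}} for chosen columns."""
    mapping = {}
    for col in columns:
        idx = header.index(col)
        values = sorted({row[idx] for row in rows})
        mapping[col] = {value: f"{col}_{value}" for value in values}
    return mapping


def transform_one_hot(header, rows, mapping):
    """Expand each mapped column into one 0/1 column per known value."""
    new_header = []
    for col in header:
        new_header.extend(mapping[col].values() if col in mapping else [col])
    new_rows = []
    for row in rows:
        out = []
        for col, value in zip(header, row):
            if col in mapping:
                out.extend(1 if value == known else 0 for known in mapping[col])
            else:
                out.append(value)
        new_rows.append(out)
    return new_header, new_rows


header = ["age", "city"]
rows = [[30, "a"], [40, "b"]]
mapping = fit_one_hot(header, rows, columns=["city"])
new_header, new_rows = transform_one_hot(header, rows, mapping)
```

In this sketch a value that was not seen during fitting encodes as all zeros, which is one common choice among several.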

Hetero Feature Selection

Provides 5 types of filters. Each filter can select columns according to the user config.

  • Corresponding module name: HeteroFeatureSelection
  • Data Input: Input DTable.
  • Data Output: Transformed DTable with new headers and filtered data instances.
  • Model Input: If IV filters are used, a hetero_binning model is needed.
  • Model Output: Whether each column is kept or filtered out.

Hetero LR

Builds a hetero logistic regression model across multiple parties.

  • Corresponding module name: HeteroLR
  • Data Input: Input DTable.
  • Model Output: Logistic Regression model.

Hetero LinR

Builds a hetero linear regression model across multiple parties.

  • Corresponding module name: HeteroLinR
  • Data Input: Input DTable.
  • Model Output: Linear Regression model.

Hetero Poisson

Builds a hetero Poisson regression model across multiple parties.

  • Corresponding module name: HeteroPoisson
  • Data Input: Input DTable.
  • Model Output: Poisson Regression model.

Homo LR

Builds a homo logistic regression model across multiple parties.

  • Corresponding module name: HomoLR
  • Data Input: Input DTable.
  • Model Output: Logistic Regression model.

Homo NN

Builds a homo neural network model across multiple parties.

  • Corresponding module name: HomoNN
  • Data Input: Input DTable.
  • Model Output: Neural Network model.

Hetero Secure Boosting

Builds a hetero secure boosting model across multiple parties.

  • Corresponding module name: HeteroSecureBoost
  • Data Input: DTable, values are instances.
  • Model Output: SecureBoost model, consisting of model-meta and model-param.

Evaluation

Outputs the model evaluation metrics for the user.

  • Corresponding module name: Evaluation

 

More available algorithms are coming soon.

Introduction