secretflow.ml.boost.homo_boost package#

Subpackages#

Submodules#

secretflow.ml.boost.homo_boost.homo_booster module#

Classes:

SFXgboost(server, clients)

class secretflow.ml.boost.homo_boost.homo_booster.SFXgboost(server, clients)[source]#

Bases: object

Methods:

__init__(server, clients)

check_params(params)

train(train_hdf, valid_hdf[, params, ...])

Federated XGBoost training interface

save_model(model_path)

Federated XGBoost model saving interface

dump_model(model_path)

Federated XGBoost model dumping interface

eval(model_path, hdata, params)

Federated XGBoost evaluation interface

__init__(server, clients)[source]#
check_params(params)[source]#
train(train_hdf: HDataFrame, valid_hdf: HDataFrame, params: Optional[Dict] = None, num_boost_round: int = 10, obj=None, feval=None, maximize: Optional[bool] = None, early_stopping_rounds: Optional[int] = None, evals_result: Optional[Dict] = None, verbose_eval: Union[int, bool] = True, xgb_model: Optional[Dict] = None, callbacks: Optional[List[Callable]] = None) → SFXgboost[source]#

Federated XGBoost training interface

Parameters
  • train_hdf – horizontal federated table used for training

  • valid_hdf – horizontal federated table used for validation

  • params – dictionary of XGBoost training parameters

  • num_boost_round – number of boosting rounds, i.e. the number of trees to build

  • obj – custom objective function (the default objective type is squared_error)

  • feval – custom evaluation function

  • maximize – whether feval should be maximized

  • early_stopping_rounds – same as the xgboost early_stopping_rounds option

  • evals_result – container for storing evaluation results

  • verbose_eval – same as the xgboost verbose_eval option

  • xgb_model – path of an existing xgb model file, used to continue training from a checkpoint

  • callbacks – list of callback functions
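A minimal end-to-end sketch in local simulation mode. The party names, file paths, label column, and parameter values are illustrative assumptions, not part of this reference:

```python
import secretflow as sf
from secretflow.data.horizontal import read_csv
from secretflow.ml.boost.homo_boost import SFXgboost
from secretflow.security.aggregation import SecureAggregator
from secretflow.security.compare import PlainComparator

sf.init(['alice', 'bob', 'charlie'], address='local')
alice, bob, charlie = sf.PYU('alice'), sf.PYU('bob'), sf.PYU('charlie')

# Each client reads its own horizontal partition; the aggregator and
# comparator are used to derive global statistics such as bin splits.
train_hdf = read_csv(
    {alice: 'alice_train.csv', bob: 'bob_train.csv'},  # hypothetical paths
    aggregator=SecureAggregator(charlie, [alice, bob]),
    comparator=PlainComparator(charlie),
)

params = {
    'max_depth': 4,        # tree options mirror TreeParam below
    'eta': 0.3,
    'objective': 'binary:logistic',
    'verbosity': 0,
    'hess_key': 'hess',    # working-column names used internally
    'grad_key': 'grad',
    'label_key': 'y',      # hypothetical label column
}

bst = SFXgboost(server=charlie, clients=[alice, bob])
bst.train(train_hdf, train_hdf, params=params, num_boost_round=10)
```

Here the training table doubles as the validation table only to keep the sketch short; in practice valid_hdf would be a separate horizontal partition.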

save_model(model_path: Dict)[source]#

Federated XGBoost model saving interface

Parameters

model_path – path where the model is stored, as a dict mapping each device to its local path

dump_model(model_path: Dict)[source]#

Federated XGBoost model dumping interface

Parameters

model_path – path where the model dump is stored, as a dict mapping each device to its local path
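Both methods take per-party paths, so each device persists its own copy. A hedged sketch continuing the example above (paths are illustrative):

```python
# Hypothetical per-party paths; each device keyed in the dict writes
# its own local copy of the model.
bst.save_model({
    alice: '/tmp/alice/homo_xgb.json',
    bob: '/tmp/bob/homo_xgb.json',
    charlie: '/tmp/charlie/homo_xgb.json',
})

# As in xgboost, a dump is a human-readable rendering of the trees,
# intended for inspection rather than reloading.
bst.dump_model({
    alice: '/tmp/alice/homo_xgb.dump',
    bob: '/tmp/bob/homo_xgb.dump',
    charlie: '/tmp/charlie/homo_xgb.dump',
})
```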

eval(model_path: Union[str, Dict[PYU, str]], hdata: HDataFrame, params: Dict)[source]#

Federated XGBoost evaluation interface

Parameters
  • model_path – path where the model is stored, either a single path or a dict mapping each PYU to its local path

  • hdata – horizontal dataframe to be evaluated

  • params – XGBoost params

Returns

evaluation result

Return type

Dict
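A hedged sketch continuing the example above; the metric parameters are illustrative assumptions:

```python
# Evaluate the stored model on a horizontal dataframe. model_path may be
# a single path or a per-PYU dict, matching save_model above.
result = bst.eval(
    model_path={alice: '/tmp/alice/homo_xgb.json',
                bob: '/tmp/bob/homo_xgb.json'},
    hdata=train_hdf,
    params={'objective': 'binary:logistic', 'eval_metric': 'logloss'},
)
print(result)  # dict of evaluation results
```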

secretflow.ml.boost.homo_boost.homo_booster_worker module#

Homo Booster

Classes:

HomoBooster

alias of ActorProxy(HomoBooster)

secretflow.ml.boost.homo_boost.homo_booster_worker.HomoBooster[source]#

alias of ActorProxy(HomoBooster)

Methods:

__init__(*args, **kwargs)

Abstract base class for device objects.

set_split_point(bin_split_points, *[, ...])

gen_mock_data([data_num, columns, ...])

Generate mock data with the same schema so that the SERVER can synchronize the training process

homo_train(train_hdf, valid_hdf[, params, ...])

Entry point for federated XGBoost training

homo_eval(eval_hdf, params, model_path, *[, ...])

save_model(model_path, *[, _ray_trace_ctx])

dump_model(model_path, *[, _ray_trace_ctx])

initialize(comm, *[, _ray_trace_ctx])

Initialize networking

recv(name, src_device[, step_id, _ray_trace_ctx])

Receive messages from the source device.

recv_message(key, value, *[, _ray_trace_ctx])

Receive message

send(name, value, dst_device[, step_id, ...])

Send message to target device.

secretflow.ml.boost.homo_boost.homo_decision_tree module#

Homo Decision Tree

Classes:

HomoDecisionTree([tree_param, data, ...])

Class for the federated version of the decision tree

class secretflow.ml.boost.homo_boost.homo_decision_tree.HomoDecisionTree(tree_param: Optional[TreeParam] = None, data: Optional[HDataFrame] = None, bin_split_points: Optional[ndarray] = None, group_id: Optional[int] = None, tree_id: Optional[int] = None, iter_round: Optional[int] = None, hess_key: str = 'hess', grad_key: str = 'grad', label_key: str = 'label')[source]#

Bases: DecisionTree

Class for the federated version of the decision tree

tree_param#

params for tree build

data#

training data, HDataFrame

bin_split_points#

global binning infos

tree_id#

tree id

group_id#

id of the group this tree belongs to (used in multi-class training)

iter_round#

iteration round within the overall XGBoost training process

hess_key#

unique column name for hess value

grad_key#

unique column name for grad value

label_key#

unique column name for the label

Methods:

__init__([tree_param, data, ...])

key(name)

get_valid_features_by_tree()

get_valid_features_by_level()

cal_root_node()

cal_local_hist_bags(cur_to_split, ...)

cal_split_info_list(agg_histograms)

fit()

Entry point for fitting the homo decision tree

__init__(tree_param: Optional[TreeParam] = None, data: Optional[HDataFrame] = None, bin_split_points: Optional[ndarray] = None, group_id: Optional[int] = None, tree_id: Optional[int] = None, iter_round: Optional[int] = None, hess_key: str = 'hess', grad_key: str = 'grad', label_key: str = 'label')[source]#
key(name: str) → str[source]#
get_valid_features_by_tree()[source]#
get_valid_features_by_level()[source]#
cal_root_node()[source]#
static cal_local_hist_bags(cur_to_split, cur_data_frame, bin_split_points, valid_features, use_missing, grad_key, hess_key, thread_pool)[source]#
cal_split_info_list(agg_histograms)[source]#
fit()[source]#

Entry point for fitting the homo decision tree
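HomoDecisionTree is normally constructed and driven by the booster worker rather than used directly. The sketch below only shows how the constructor arguments fit together; the prepared frame with grad/hess/label columns and the bin split points are assumptions standing in for what the worker computes, and whether a plain pandas frame is accepted outside the worker is also an assumption:

```python
import numpy as np
import pandas as pd
from secretflow.ml.boost.homo_boost.homo_decision_tree import HomoDecisionTree
from secretflow.ml.boost.homo_boost.tree_param import TreeParam

# Assumed local partition: two features plus per-sample gradient,
# hessian and label columns named to match the *_key arguments.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'x1': rng.normal(size=100),
    'x2': rng.normal(size=100),
    'grad': rng.normal(size=100),
    'hess': np.ones(100),
    'label': rng.integers(0, 2, size=100).astype(float),
})

# Assumed global binning info: one row of candidate split points per feature.
bin_split_points = np.stack([
    np.quantile(df['x1'].to_numpy(), np.linspace(0.1, 0.9, 9)),
    np.quantile(df['x2'].to_numpy(), np.linspace(0.1, 0.9, 9)),
])

tree = HomoDecisionTree(
    tree_param=TreeParam(max_depth=3, eta=0.3),
    data=df,
    bin_split_points=bin_split_points,
    group_id=0,    # group (class) index within the boosting round
    tree_id=0,     # index of this tree
    iter_round=0,  # current boosting iteration
    hess_key='hess',
    grad_key='grad',
    label_key='label',
)
# tree.fit() runs the federated split-finding loop; it needs the
# networking context set up by the booster worker, so it is not called here.
```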

secretflow.ml.boost.homo_boost.tree_param module#

Classes:

TreeParam([max_depth, eta, verbosity, ...])

Parameter class, the externally exposed interface

class secretflow.ml.boost.homo_boost.tree_param.TreeParam(max_depth: int = 3, eta: float = 0.3, verbosity: int = 0, objective: Optional[Union[callable, str]] = None, tree_method: str = 'hist', criterion_method: str = 'xgboost', gamma: float = 0.0001, min_child_weight: float = 1, subsample: float = 1, colsample_bytree: float = 1, colsample_byleval: float = 1, reg_alpha: float = 0.0, reg_lambda: float = 0.1, base_score: float = 0.5, random_state: int = 1234, num_parallel: Optional[int] = None, importance_type: str = 'split', use_missing: bool = False, min_sample_split: int = 2, max_split_nodes: int = 20, min_leaf_node: int = 1, decimal: int = 10, num_class: int = 0)[source]#

Bases: object

Parameter class, the externally exposed interface

max_depth#

the maximum depth of a decision tree

Type

int

eta#

learning rate, same as xgb's "eta"

Type

float

verbosity#

level of log printing; valid values are 0 (silent) to 3 (debug)

Type

int

objective#

objective function, default 'squareloss'

Type

Optional[Union[callable, str]]

tree_method#

tree type; only 'hist' is supported

Type

str

criterion_method#

split criterion method, default 'xgboost'

Type

str

gamma#

same as min_impurity_split, the minimum gain required to make a split

Type

float

min_child_weight#

minimum sum of hessian needed in a child node

Type

float

subsample#

subsample rate for rows

Type

float

colsample_bytree#

subsample rate for columns (by tree)

Type

float

colsample_byleval#

subsample rate for columns (by level)

Type

float

reg_alpha#

L1 regularization term on weights (xgb's alpha)

Type

float

reg_lambda#

L2 regularization term on weights (xgb's lambda)

Type

float

base_score#

base score, the global bias

Type

float

random_state#

random number seed

Type

int

num_parallel#

number of parallel threads used when building trees

Type

Optional[int]

importance_type#

importance type, one of ['gain', 'split']

Type

str

use_missing#

whether missing values participate in training

Type

bool

min_sample_split#

minimum number of samples required to split a node, default 2

Type

int

max_split_nodes#

maximum number of nodes whose splits are searched in parallel per batch

Type

int

min_leaf_node#

minimum number of samples required on a node to split

Type

int

decimal#

number of decimal places reserved for the gain

Type

int

num_class#

number of classes

Type

int

Attributes:

max_depth

eta

verbosity

objective

tree_method

criterion_method

gamma

min_child_weight

subsample

colsample_bytree

colsample_byleval

reg_alpha

reg_lambda

base_score

random_state

num_parallel

importance_type

use_missing

min_sample_split

max_split_nodes

min_leaf_node

decimal

num_class

Methods:

__init__([max_depth, eta, verbosity, ...])

max_depth: int = 3#
eta: float = 0.3#
verbosity: int = 0#
objective: Union[callable, str] = None#
tree_method: str = 'hist'#
criterion_method: str = 'xgboost'#
gamma: float = 0.0001#
min_child_weight: float = 1#
subsample: float = 1#
colsample_bytree: float = 1#
colsample_byleval: float = 1#
reg_alpha: float = 0.0#
reg_lambda: float = 0.1#
base_score: float = 0.5#
random_state: int = 1234#
num_parallel: int = None#
importance_type: str = 'split'#
use_missing: bool = False#
min_sample_split: int = 2#
max_split_nodes: int = 20#
min_leaf_node: int = 1#
decimal: int = 10#
num_class: int = 0#
__init__(max_depth: int = 3, eta: float = 0.3, verbosity: int = 0, objective: Optional[Union[callable, str]] = None, tree_method: str = 'hist', criterion_method: str = 'xgboost', gamma: float = 0.0001, min_child_weight: float = 1, subsample: float = 1, colsample_bytree: float = 1, colsample_byleval: float = 1, reg_alpha: float = 0.0, reg_lambda: float = 0.1, base_score: float = 0.5, random_state: int = 1234, num_parallel: Optional[int] = None, importance_type: str = 'split', use_missing: bool = False, min_sample_split: int = 2, max_split_nodes: int = 20, min_leaf_node: int = 1, decimal: int = 10, num_class: int = 0) → None#
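TreeParam is a plain parameter container whose fields mirror the xgb-style keys accepted in SFXgboost.train's params dict. A minimal sketch; the chosen values are illustrative:

```python
from secretflow.ml.boost.homo_boost.tree_param import TreeParam

# Construct tree-building parameters; unspecified fields keep the
# defaults listed above.
tp = TreeParam(
    max_depth=4,
    eta=0.3,
    objective='squareloss',  # default objective per the docstring
    tree_method='hist',      # the only supported tree type
    reg_lambda=0.1,
    importance_type='split',
)
print(tp.max_depth, tp.gamma)  # -> 4 0.0001
```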

Module contents#

Classes:

SFXgboost(server, clients)

class secretflow.ml.boost.homo_boost.SFXgboost(server, clients)[source]#

Bases: object

Methods:

__init__(server, clients)

check_params(params)

train(train_hdf, valid_hdf[, params, ...])

Federated XGBoost training interface

save_model(model_path)

Federated XGBoost model saving interface

dump_model(model_path)

Federated XGBoost model dumping interface

eval(model_path, hdata, params)

Federated XGBoost evaluation interface

__init__(server, clients)[source]#
check_params(params)[source]#
train(train_hdf: HDataFrame, valid_hdf: HDataFrame, params: Optional[Dict] = None, num_boost_round: int = 10, obj=None, feval=None, maximize: Optional[bool] = None, early_stopping_rounds: Optional[int] = None, evals_result: Optional[Dict] = None, verbose_eval: Union[int, bool] = True, xgb_model: Optional[Dict] = None, callbacks: Optional[List[Callable]] = None) → SFXgboost[source]#

Federated XGBoost training interface

Parameters
  • train_hdf – horizontal federated table used for training

  • valid_hdf – horizontal federated table used for validation

  • params – dictionary of XGBoost training parameters

  • num_boost_round – number of boosting rounds, i.e. the number of trees to build

  • obj – custom objective function (the default objective type is squared_error)

  • feval – custom evaluation function

  • maximize – whether feval should be maximized

  • early_stopping_rounds – same as the xgboost early_stopping_rounds option

  • evals_result – container for storing evaluation results

  • verbose_eval – same as the xgboost verbose_eval option

  • xgb_model – path of an existing xgb model file, used to continue training from a checkpoint

  • callbacks – list of callback functions

save_model(model_path: Dict)[source]#

Federated XGBoost model saving interface

Parameters

model_path – path where the model is stored, as a dict mapping each device to its local path

dump_model(model_path: Dict)[source]#

Federated XGBoost model dumping interface

Parameters

model_path – path where the model dump is stored, as a dict mapping each device to its local path

eval(model_path: Union[str, Dict[PYU, str]], hdata: HDataFrame, params: Dict)[source]#

Federated XGBoost evaluation interface

Parameters
  • model_path – path where the model is stored, either a single path or a dict mapping each PYU to its local path

  • hdata – horizontal dataframe to be evaluated

  • params – XGBoost params

Returns

evaluation result

Return type

Dict