secretflow.ml.boost.homo_boost package#

Subpackages#

Submodules#

secretflow.ml.boost.homo_boost.homo_booster module#

Classes:

SFXgboost(server, clients)

class secretflow.ml.boost.homo_boost.homo_booster.SFXgboost(server, clients)[source]#

Bases: object

Methods:

__init__(server, clients)

check_params(params)

train(train_hdf, valid_hdf[, params, ...])

Federated XGBoost training interface

save_model(model_path)

Federated XGBoost model saving interface

dump_model(model_path)

Federated XGBoost model dumping interface

eval(model_path, hdata, params)

Federated XGBoost evaluation interface

__init__(server, clients)[source]#
check_params(params)[source]#
train(train_hdf: HDataFrame, valid_hdf: HDataFrame, params: Optional[Dict] = None, num_boost_round: int = 10, obj=None, feval=None, maximize: Optional[bool] = None, early_stopping_rounds: Optional[int] = None, evals_result: Optional[Dict] = None, verbose_eval: Union[int, bool] = True, xgb_model: Optional[Dict] = None, callbacks: Optional[List[Callable]] = None) → SFXgboost[source]#

Federated XGBoost training interface

Parameters
  • train_hdf – horizontal federated table used for training

  • valid_hdf – horizontal federated table used for validation

  • params – dictionary of XGBoost training parameters

  • num_boost_round – number of boosting rounds, i.e. the number of trees to build

  • obj – custom objective function (the default objective type is squared_error)

  • feval – custom evaluation function

  • maximize – whether feval should be maximized

  • early_stopping_rounds – same as the xgboost early_stopping_rounds option

  • evals_result – container for storing evaluation results

  • verbose_eval – same as the xgboost verbose_eval option

  • xgb_model – path of an existing xgb model file, used to continue training from a checkpoint

  • callbacks – list of callback functions
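A minimal end-to-end sketch in local simulation mode. The party names, file paths, label column, and parameter values are illustrative assumptions, not part of this reference:

```python
import secretflow as sf
from secretflow.data.horizontal import read_csv
from secretflow.ml.boost.homo_boost import SFXgboost
from secretflow.security.aggregation import SecureAggregator
from secretflow.security.compare import PlainComparator

sf.init(['alice', 'bob', 'charlie'], address='local')
alice, bob, charlie = sf.PYU('alice'), sf.PYU('bob'), sf.PYU('charlie')

# Each client reads its own horizontal partition; the aggregator and
# comparator are used to derive global statistics such as bin splits.
train_hdf = read_csv(
    {alice: 'alice_train.csv', bob: 'bob_train.csv'},  # hypothetical paths
    aggregator=SecureAggregator(charlie, [alice, bob]),
    comparator=PlainComparator(charlie),
)

params = {
    'max_depth': 4,        # tree options mirror TreeParam below
    'eta': 0.3,
    'objective': 'binary:logistic',
    'verbosity': 0,
    'hess_key': 'hess',    # working-column names used internally
    'grad_key': 'grad',
    'label_key': 'y',      # hypothetical label column
}

bst = SFXgboost(server=charlie, clients=[alice, bob])
bst.train(train_hdf, train_hdf, params=params, num_boost_round=10)
```

Here the training table doubles as the validation table only to keep the sketch short; in practice valid_hdf would be a separate horizontal partition.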

save_model(model_path: Dict)[source]#

Federated XGBoost model saving interface

Parameters

model_path – path where the model is stored, as a dict mapping each device to its local path

dump_model(model_path: Dict)[source]#

Federated XGBoost model dumping interface

Parameters

model_path – path where the model dump is stored, as a dict mapping each device to its local path
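Both methods take per-party paths, so each device persists its own copy. A hedged sketch continuing the example above (paths are illustrative):

```python
# Hypothetical per-party paths; each device keyed in the dict writes
# its own local copy of the model.
bst.save_model({
    alice: '/tmp/alice/homo_xgb.json',
    bob: '/tmp/bob/homo_xgb.json',
    charlie: '/tmp/charlie/homo_xgb.json',
})

# As in xgboost, a dump is a human-readable rendering of the trees,
# intended for inspection rather than reloading.
bst.dump_model({
    alice: '/tmp/alice/homo_xgb.dump',
    bob: '/tmp/bob/homo_xgb.dump',
    charlie: '/tmp/charlie/homo_xgb.dump',
})
```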

eval(model_path: Union[str, Dict[PYU, str]], hdata: HDataFrame, params: Dict)[source]#

Federated XGBoost evaluation interface

Parameters
  • model_path – path where the model is stored, either a single path or a dict mapping each PYU to its local path

  • hdata – horizontal dataframe to be evaluated

  • params – XGBoost params

Returns

evaluation result

Return type

Dict
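A hedged sketch continuing the example above; the metric parameters are illustrative assumptions:

```python
# Evaluate the stored model on a horizontal dataframe. model_path may be
# a single path or a per-PYU dict, matching save_model above.
result = bst.eval(
    model_path={alice: '/tmp/alice/homo_xgb.json',
                bob: '/tmp/bob/homo_xgb.json'},
    hdata=train_hdf,
    params={'objective': 'binary:logistic', 'eval_metric': 'logloss'},
)
print(result)  # dict of evaluation results
```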

secretflow.ml.boost.homo_boost.homo_booster_worker module#

Homo Booster

Classes:

HomoBooster

alias of ActorProxy(HomoBooster)

secretflow.ml.boost.homo_boost.homo_booster_worker.HomoBooster[source]#

alias of ActorProxy(HomoBooster)

Methods:

__init__(*args, **kwargs)

Abstract base class for device objects.

set_split_point(bin_split_points, *[, ...])

gen_mock_data([data_num, columns, ...])

Generate mock data with the same schema so that the SERVER can synchronize the training process

homo_train(train_hdf, valid_hdf[, params, ...])

Entry point for federated XGBoost training

homo_eval(eval_hdf, params, model_path, *[, ...])

save_model(model_path, *[, _ray_trace_ctx])

dump_model(model_path, *[, _ray_trace_ctx])

initialize(comm, *[, _ray_trace_ctx])

Initialize networking

recv(name, src_device[, step_id, _ray_trace_ctx])

Receive messages from the source device.

recv_message(key, value, *[, _ray_trace_ctx])

Receive message

send(name, value, dst_device[, step_id, ...])

Send message to target device.

secretflow.ml.boost.homo_boost.homo_decision_tree module#

Homo Decision Tree

Classes:

HomoDecisionTree([tree_param, data, ...])

Class for the federated version of the decision tree

class secretflow.ml.boost.homo_boost.homo_decision_tree.HomoDecisionTree(tree_param: Optional[TreeParam] = None, data: Optional[HDataFrame] = None, bin_split_points: Optional[ndarray] = None, group_id: Optional[int] = None, tree_id: Optional[int] = None, iter_round: Optional[int] = None, hess_key: str = 'hess', grad_key: str = 'grad', label_key: str = 'label')[source]#

Bases: DecisionTree

Class for the federated version of the decision tree

tree_param#

params for tree build

data#

training data, HDataFrame

bin_split_points#

global binning infos

tree_id#

tree id

group_id#

id of the group this tree belongs to (used in multi-class training)

iter_round#

iteration round within the overall XGBoost training process

hess_key#

unique column name for hess value

grad_key#

unique column name for grad value

label_key#

unique column name for the label

Methods:

__init__([tree_param, data, ...])

key(name)

get_valid_features_by_tree()

get_valid_features_by_level()

cal_root_node()

cal_local_hist_bags(cur_to_split, ...)

cal_split_info_list(agg_histograms)

fit()

Entry point for fitting the homo decision tree

__init__(tree_param: Optional[TreeParam] = None, data: Optional[HDataFrame] = None, bin_split_points: Optional[ndarray] = None, group_id: Optional[int] = None, tree_id: Optional[int] = None, iter_round: Optional[int] = None, hess_key: str = 'hess', grad_key: str = 'grad', label_key: str = 'label')[source]#
key(name: str) → str[source]#
get_valid_features_by_tree()[source]#
get_valid_features_by_level()[source]#
cal_root_node()[source]#
static cal_local_hist_bags(cur_to_split, cur_data_frame, bin_split_points, valid_features, use_missing, grad_key, hess_key, thread_pool)[source]#
cal_split_info_list(agg_histograms)[source]#
fit()[source]#

Entry point for fitting the homo decision tree
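HomoDecisionTree is normally constructed and driven by the booster worker rather than used directly. The sketch below only shows how the constructor arguments fit together; the prepared frame with grad/hess/label columns and the bin split points are assumptions standing in for what the worker computes, and whether a plain pandas frame is accepted outside the worker is also an assumption:

```python
import numpy as np
import pandas as pd
from secretflow.ml.boost.homo_boost.homo_decision_tree import HomoDecisionTree
from secretflow.ml.boost.homo_boost.tree_param import TreeParam

# Assumed local partition: two features plus per-sample gradient,
# hessian and label columns named to match the *_key arguments.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'x1': rng.normal(size=100),
    'x2': rng.normal(size=100),
    'grad': rng.normal(size=100),
    'hess': np.ones(100),
    'label': rng.integers(0, 2, size=100).astype(float),
})

# Assumed global binning info: one row of candidate split points per feature.
bin_split_points = np.stack([
    np.quantile(df['x1'].to_numpy(), np.linspace(0.1, 0.9, 9)),
    np.quantile(df['x2'].to_numpy(), np.linspace(0.1, 0.9, 9)),
])

tree = HomoDecisionTree(
    tree_param=TreeParam(max_depth=3, eta=0.3),
    data=df,
    bin_split_points=bin_split_points,
    group_id=0,    # group (class) index within the boosting round
    tree_id=0,     # index of this tree
    iter_round=0,  # current boosting iteration
    hess_key='hess',
    grad_key='grad',
    label_key='label',
)
# tree.fit() runs the federated split-finding loop; it needs the
# networking context set up by the booster worker, so it is not called here.
```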

secretflow.ml.boost.homo_boost.tree_param module#

Classes:

TreeParam([max_depth, eta, verbosity, ...])

Parameter class, the externally exposed interface

class secretflow.ml.boost.homo_boost.tree_param.TreeParam(max_depth: int = 3, eta: float = 0.3, verbosity: int = 0, objective: Optional[Union[callable, str]] = None, tree_method: str = 'hist', criterion_method: str = 'xgboost', gamma: float = 0.0001, min_child_weight: float = 1, subsample: float = 1, colsample_bytree: float = 1, colsample_byleval: float = 1, reg_alpha: float = 0.0, reg_lambda: float = 0.1, base_score: float = 0.5, random_state: int = 1234, num_parallel: Optional[int] = None, importance_type: str = 'split', use_missing: bool = False, min_sample_split: int = 2, max_split_nodes: int = 20, min_leaf_node: int = 1, decimal: int = 10, num_class: int = 0)[source]#

Bases: object

Parameter class, the externally exposed interface

max_depth#

the maximum depth of a decision tree

Type

int

eta#

learning rate, same as xgb's "eta"

Type

float

verbosity#

level of log printing; valid values are 0 (silent) to 3 (debug)

Type

int

objective#

objective function, default 'squareloss'

Type

Optional[Union[callable, str]]

tree_method#

tree type; only 'hist' is supported

Type

str

criterion_method#

split criterion method, default 'xgboost'

Type

str

gamma#

same as min_impurity_split, the minimum gain required to make a split

Type

float

min_child_weight#

minimum sum of hessian needed in a child node

Type

float

subsample#

subsample rate for rows

Type

float

colsample_bytree#

subsample rate for columns (by tree)

Type

float

colsample_byleval#

subsample rate for columns (by level)

Type

float

reg_alpha#

L1 regularization term on weights (xgb's alpha)

Type

float

reg_lambda#

L2 regularization term on weights (xgb's lambda)

Type

float

base_score#

base score, the global bias

Type

float

random_state#

random number seed

Type

int

num_parallel#

number of parallel threads used when building trees

Type

Optional[int]

importance_type#

importance type, one of ['gain', 'split']

Type

str

use_missing#

whether missing values participate in training

Type

bool

min_sample_split#

minimum number of samples required to split a node, default 2

Type

int

max_split_nodes#

maximum number of nodes whose splits are searched in parallel per batch

Type

int

min_leaf_node#

minimum number of samples required on a node to split

Type

int

decimal#

number of decimal places reserved for the gain

Type

int

num_class#

number of classes

Type

int

Attributes:

max_depth

eta

verbosity

objective

tree_method

criterion_method

gamma

min_child_weight

subsample

colsample_bytree

colsample_byleval

reg_alpha

reg_lambda

base_score

random_state

num_parallel

importance_type

use_missing

min_sample_split

max_split_nodes

min_leaf_node

decimal

num_class

Methods:

__init__([max_depth, eta, verbosity, ...])

max_depth: int = 3#
eta: float = 0.3#
verbosity: int = 0#
objective: Union[callable, str] = None#
tree_method: str = 'hist'#
criterion_method: str = 'xgboost'#
gamma: float = 0.0001#
min_child_weight: float = 1#
subsample: float = 1#
colsample_bytree: float = 1#
colsample_byleval: float = 1#
reg_alpha: float = 0.0#
reg_lambda: float = 0.1#
base_score: float = 0.5#
random_state: int = 1234#
num_parallel: int = None#
importance_type: str = 'split'#
use_missing: bool = False#
min_sample_split: int = 2#
max_split_nodes: int = 20#
min_leaf_node: int = 1#
decimal: int = 10#
num_class: int = 0#
__init__(max_depth: int = 3, eta: float = 0.3, verbosity: int = 0, objective: Optional[Union[callable, str]] = None, tree_method: str = 'hist', criterion_method: str = 'xgboost', gamma: float = 0.0001, min_child_weight: float = 1, subsample: float = 1, colsample_bytree: float = 1, colsample_byleval: float = 1, reg_alpha: float = 0.0, reg_lambda: float = 0.1, base_score: float = 0.5, random_state: int = 1234, num_parallel: Optional[int] = None, importance_type: str = 'split', use_missing: bool = False, min_sample_split: int = 2, max_split_nodes: int = 20, min_leaf_node: int = 1, decimal: int = 10, num_class: int = 0) → None#
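TreeParam is a plain parameter container whose fields mirror the xgb-style keys accepted in SFXgboost.train's params dict. A minimal sketch; the chosen values are illustrative:

```python
from secretflow.ml.boost.homo_boost.tree_param import TreeParam

# Construct tree-building parameters; unspecified fields keep the
# defaults listed above.
tp = TreeParam(
    max_depth=4,
    eta=0.3,
    objective='squareloss',  # default objective per the docstring
    tree_method='hist',      # the only supported tree type
    reg_lambda=0.1,
    importance_type='split',
)
print(tp.max_depth, tp.gamma)  # -> 4 0.0001
```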

Module contents#

Classes:

SFXgboost(server, clients)

class secretflow.ml.boost.homo_boost.SFXgboost(server, clients)[source]#

Bases: object

Methods:

__init__(server, clients)

check_params(params)

train(train_hdf, valid_hdf[, params, ...])

Federated XGBoost training interface

save_model(model_path)

Federated XGBoost model saving interface

dump_model(model_path)

Federated XGBoost model dumping interface

eval(model_path, hdata, params)

Federated XGBoost evaluation interface

__init__(server, clients)[source]#
check_params(params)[source]#
train(train_hdf: HDataFrame, valid_hdf: HDataFrame, params: Optional[Dict] = None, num_boost_round: int = 10, obj=None, feval=None, maximize: Optional[bool] = None, early_stopping_rounds: Optional[int] = None, evals_result: Optional[Dict] = None, verbose_eval: Union[int, bool] = True, xgb_model: Optional[Dict] = None, callbacks: Optional[List[Callable]] = None) → SFXgboost[source]#

Federated XGBoost training interface

Parameters
  • train_hdf – horizontal federated table used for training

  • valid_hdf – horizontal federated table used for validation

  • params – dictionary of XGBoost training parameters

  • num_boost_round – number of boosting rounds, i.e. the number of trees to build

  • obj – custom objective function (the default objective type is squared_error)

  • feval – custom evaluation function

  • maximize – whether feval should be maximized

  • early_stopping_rounds – same as the xgboost early_stopping_rounds option

  • evals_result – container for storing evaluation results

  • verbose_eval – same as the xgboost verbose_eval option

  • xgb_model – path of an existing xgb model file, used to continue training from a checkpoint

  • callbacks – list of callback functions

save_model(model_path: Dict)[source]#

Federated XGBoost model saving interface

Parameters

model_path – path where the model is stored, as a dict mapping each device to its local path

dump_model(model_path: Dict)[source]#

Federated XGBoost model dumping interface

Parameters

model_path – path where the model dump is stored, as a dict mapping each device to its local path

eval(model_path: Union[str, Dict[PYU, str]], hdata: HDataFrame, params: Dict)[source]#

Federated XGBoost evaluation interface

Parameters
  • model_path – path where the model is stored, either a single path or a dict mapping each PYU to its local path

  • hdata – horizontal dataframe to be evaluated

  • params – XGBoost params

Returns

evaluation result

Return type

Dict