secretflow.ml.boost.ss_xgb_v.core package#

Submodules#

secretflow.ml.boost.ss_xgb_v.core.node_split module#

Classes:

RegType(value)

An enumeration.

Functions:

sigmoid(pred)

compute_obj(G, H, reg_lambda)

compute objective values of input buckets.

compute_weight(G, H, reg_lambda, learning_rate)

compute weight values of tree leaf nodes.

get_weight(context, s)

compute weight values of tree leaf nodes.

compute_gh(y, pred, objective)

compute first and second order gradient of each sample.

global_setup(buckets_map, y, seed, ...)

Set up global context.

tree_setup(context, pred, col_choices, ...)

Set up per-tree context.

find_best_split_bucket(context, nodes_s, ...)

compute the gradient sums of the instances contained in each split bucket and, for each node, find the split bucket with the maximum split gain.

init_pred(base, samples)

root_select(samples)

get_child_select(nodes_s, lchilds_ss)

compute the next level's select indexes.

predict_tree_weight(selects, weights)

get final pred for this tree.

do_leaf(context, ss)

class secretflow.ml.boost.ss_xgb_v.core.node_split.RegType(value)[source]#

Bases: Enum

An enumeration.

Attributes:

Linear

Logistic

Linear = 'linear'#
Logistic = 'logistic'#
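Because the members carry string values, a member can be looked up by value, for example when an objective name arrives as a configuration string. The sketch below uses a local stand-in enum with the same two members instead of importing the library, so it runs without SecretFlow installed.

```python
from enum import Enum


class RegType(Enum):
    """Local stand-in that mirrors the two documented members."""

    Linear = "linear"
    Logistic = "logistic"


# Look a member up by its string value, e.g. when parsing a config entry.
objective = RegType("logistic")
print(objective is RegType.Logistic)
```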
secretflow.ml.boost.ss_xgb_v.core.node_split.sigmoid(pred: ndarray) ndarray[source]#
secretflow.ml.boost.ss_xgb_v.core.node_split.compute_obj(G: ndarray, H: ndarray, reg_lambda: float) ndarray[source]#

compute objective values of input buckets.

Parameters
  • G/H – sum of first and second order gradient in each bucket.

  • reg_lambda – L2 regularization term

Returns

objective values.
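As an illustration of what an objective value per bucket can look like, here is a minimal plaintext sketch that assumes the usual XGBoost structure score G^2 / (H + reg_lambda). The helper name `bucket_objective` is hypothetical, and the exact scaling or sign used by `compute_obj` is not stated on this page and may differ.

```python
import numpy as np


def bucket_objective(G: np.ndarray, H: np.ndarray, reg_lambda: float) -> np.ndarray:
    """Hypothetical helper: score each bucket as G^2 / (H + reg_lambda)."""
    return G**2 / (H + reg_lambda)


# Gradient sums of two buckets (made-up numbers for illustration).
G = np.array([2.0, -4.0])
H = np.array([3.0, 7.0])
scores = bucket_objective(G, H, reg_lambda=1.0)
print(scores)
```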

secretflow.ml.boost.ss_xgb_v.core.node_split.compute_weight(G: float, H: float, reg_lambda: float, learning_rate: float) ndarray[source]#

compute weight values of tree leaf nodes.

Parameters
  • G/H – sum of first and second order gradient in each node.

  • reg_lambda – L2 regularization term

  • learning_rate – Step size shrinkage used in updates to prevent overfitting.

Returns

weight values.
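For intuition, a plaintext sketch of a leaf weight under the standard XGBoost formula, w = -G / (H + reg_lambda), scaled by the learning rate. This formula is an assumption taken from the general XGBoost derivation, not from this page, and `leaf_weight` is a hypothetical name.

```python
import numpy as np


def leaf_weight(G, H, reg_lambda: float, learning_rate: float):
    """Hypothetical helper: shrunken Newton step -G / (H + lambda)."""
    return -G / (H + reg_lambda) * learning_rate


# Gradient sums of two leaf nodes (made-up numbers for illustration).
G = np.array([2.0, -4.0])
H = np.array([3.0, 7.0])
weights = leaf_weight(G, H, reg_lambda=1.0, learning_rate=0.5)
print(weights)
```

A larger `reg_lambda` pulls the weights toward zero, and a smaller `learning_rate` shrinks each tree's contribution.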

secretflow.ml.boost.ss_xgb_v.core.node_split.get_weight(context: Dict[str, Any], s: ndarray) ndarray[source]#

compute weight values of tree leaf nodes.

Parameters
  • context – comparison context.

  • s – sample selects in each leaf node.

Returns

weight values.

secretflow.ml.boost.ss_xgb_v.core.node_split.compute_gh(y: ndarray, pred: ndarray, objective: RegType) Tuple[ndarray, ndarray][source]#

compute first and second order gradient of each sample.

Parameters
  • y – true label of each sample.

  • pred – prediction of each sample.

  • objective – regression learning objective.

Returns

first and second order gradients of each sample.
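The sketch below shows one common way to obtain per-sample gradients for the two objectives, assuming squared error for the linear objective and log loss for the logistic one. These loss choices, the exact logistic function, and the names `Objective` and `gradients` are assumptions for illustration; the library may, for instance, use an approximation of the sigmoid that is cheaper under secret sharing.

```python
from enum import Enum
from typing import Tuple

import numpy as np


class Objective(Enum):
    """Local stand-in for the documented RegType enum."""

    Linear = "linear"
    Logistic = "logistic"


def sigmoid(pred: np.ndarray) -> np.ndarray:
    # Exact logistic function; an approximation could be substituted here.
    return 1.0 / (1.0 + np.exp(-pred))


def gradients(
    y: np.ndarray, pred: np.ndarray, objective: Objective
) -> Tuple[np.ndarray, np.ndarray]:
    """Return (g, h): first and second order gradient of each sample."""
    if objective is Objective.Logistic:
        p = sigmoid(pred)
        return p - y, p * (1.0 - p)  # log loss
    return pred - y, np.ones_like(pred)  # squared error


y = np.array([1.0, 0.0])
g_log, h_log = gradients(y, np.array([0.0, 0.0]), Objective.Logistic)
g_lin, h_lin = gradients(y, np.array([0.5, 0.5]), Objective.Linear)
```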

secretflow.ml.boost.ss_xgb_v.core.node_split.global_setup(buckets_map: List[ndarray], y: ndarray, seed: int, reg_lambda: float, learning_rate: float) Dict[str, Any][source]#

Set up global context.

secretflow.ml.boost.ss_xgb_v.core.node_split.tree_setup(context: Dict[str, Any], pred: ndarray, col_choices: List[ndarray], objective: RegType, samples: int, subsample: float) Dict[str, Any][source]#

Set up per-tree context.

secretflow.ml.boost.ss_xgb_v.core.node_split.find_best_split_bucket(context: Dict[str, Any], nodes_s: List[ndarray], last_level: bool) Tuple[ndarray, Dict[str, Any]][source]#

compute the gradient sums of the instances contained in each split bucket and, for each node, find the split bucket with the maximum split gain.

Parameters
  • context – comparison context.

  • nodes_s – sample select indexes of each node from same tree level.

  • last_level – whether this split is at the last level, in which case the next level consists of leaf nodes.

Returns

index of the best split bucket for each node.
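To make the idea of "max split gain" concrete, here is a plaintext sketch for a single node and a single feature with ordered buckets. It assumes the standard XGBoost gain, GL^2/(HL+λ) + GR^2/(HR+λ) - G^2/(H+λ). The helper `best_split` and its inputs are hypothetical; the documented function works on the context dictionary and handles many nodes at once.

```python
import numpy as np


def best_split(g_buckets: np.ndarray, h_buckets: np.ndarray, reg_lambda: float):
    """Hypothetical helper: pick the split position with the largest gain.

    g_buckets[i] and h_buckets[i] are the gradient sums of the samples
    that fall into ordered bucket i of one feature.
    """
    G_total, H_total = g_buckets.sum(), h_buckets.sum()
    # Left-side sums when splitting after bucket i; the last position is
    # dropped because it would leave the right child empty.
    GL = np.cumsum(g_buckets)[:-1]
    HL = np.cumsum(h_buckets)[:-1]
    GR, HR = G_total - GL, H_total - HL
    gain = (
        GL**2 / (HL + reg_lambda)
        + GR**2 / (HR + reg_lambda)
        - G_total**2 / (H_total + reg_lambda)
    )
    idx = int(np.argmax(gain))
    return idx, float(gain[idx])


g = np.array([-3.0, -1.0, 2.0, 2.0])
h = np.array([1.0, 1.0, 1.0, 1.0])
split_idx, split_gain = best_split(g, h, reg_lambda=1.0)
print(split_idx, split_gain)
```

Here the best position separates the two buckets with negative gradient sums from the two with positive sums.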

secretflow.ml.boost.ss_xgb_v.core.node_split.init_pred(base: float, samples: int)[source]#
secretflow.ml.boost.ss_xgb_v.core.node_split.root_select(samples: int) List[ndarray][source]#
secretflow.ml.boost.ss_xgb_v.core.node_split.get_child_select(nodes_s: List[ndarray], lchilds_ss: List[ndarray]) List[ndarray][source]#

compute the next level’s select indexes.

Parameters
  • nodes_s – sample select indexes of each node from current level’s nodes.

  • lchilds_ss – left children’s sample select idx for current level’s nodes.

Returns

sample select indexes for nodes in next tree level.
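A minimal sketch of how the next level's selects could be derived, assuming each select is a 0/1 indicator vector over samples: the left child keeps the samples that are in the node and go left, and the right child keeps the rest of the node's samples. The helper name `child_selects` and the interleaved left/right ordering are assumptions for illustration.

```python
from typing import List

import numpy as np


def child_selects(
    nodes_s: List[np.ndarray], lchilds_ss: List[np.ndarray]
) -> List[np.ndarray]:
    """Hypothetical helper: build [left0, right0, left1, right1, ...]."""
    children = []
    for node, goes_left in zip(nodes_s, lchilds_ss):
        left = node * goes_left  # in this node AND goes left
        right = node - left      # in this node but not in the left child
        children.extend([left, right])
    return children


root = np.array([1, 1, 1, 1])       # all four samples start at the root
goes_left = np.array([1, 0, 1, 0])  # samples 0 and 2 go left at the root
level1 = child_selects([root], [goes_left])
print(level1)
```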

secretflow.ml.boost.ss_xgb_v.core.node_split.predict_tree_weight(selects: List[ndarray], weights: ndarray) ndarray[source]#

get final pred for this tree.

Parameters
  • selects – leaf nodes’ sample selects from each model handler.

  • weights – leaf weights in secure share.

Returns

pred
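One way to picture this step in plaintext: each handler contributes a 0/1 matrix (samples × leaves) marking the leaves a sample could still reach given the splits that handler knows, the matrices are multiplied element-wise so that exactly one leaf remains per sample, and the result is multiplied by the leaf weights. This combination rule and the name `tree_prediction` are assumptions for illustration; the documented function operates on weights held in secret-shared form.

```python
from typing import List

import numpy as np


def tree_prediction(selects: List[np.ndarray], weights: np.ndarray) -> np.ndarray:
    """Hypothetical helper: intersect the selects, then pick leaf weights."""
    combined = selects[0]
    for s in selects[1:]:
        combined = combined * s  # a leaf survives only if every party allows it
    return combined @ weights


# Three samples, four leaves, two parties (made-up data).
party_a = np.array([[1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 0, 0]])
party_b = np.array([[1, 0, 1, 0], [1, 0, 1, 0], [0, 1, 0, 1]])
leaf_weights = np.array([0.1, -0.2, 0.3, 0.4])
pred = tree_prediction([party_a, party_b], leaf_weights)
print(pred)
```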

secretflow.ml.boost.ss_xgb_v.core.node_split.do_leaf(context: Dict[str, Any], ss: List[ndarray]) ndarray[source]#

secretflow.ml.boost.ss_xgb_v.core.tree_worker module#

Classes:

XgbTreeWorker

alias of ActorProxy(XgbTreeWorker)

secretflow.ml.boost.ss_xgb_v.core.tree_worker.XgbTreeWorker[source]#

alias of ActorProxy(XgbTreeWorker)

Methods:

__init__(*args, **kwargs)

Abstract device object base class.

predict_weight_select(x, tree, *[, ...])

compute leaf nodes' sample selects known by this partition.

build_maps(x, *[, _ray_trace_ctx])

split features into buckets and build the maps used in training.

global_setup(x, buckets, seed, *[, ...])

Set up global context.

update_buckets_count(buckets_count, *[, ...])

save the number of buckets across all features of each partition.

tree_setup(colsample, *[, _ray_trace_ctx])

Set up tree context and perform column sampling if colsample < 1.

tree_finish(*[, _ray_trace_ctx])

do_split(split_buckets, *[, _ray_trace_ctx])

record split info and generate next level's left children select.
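To illustrate what "split features into buckets and build maps" can mean for a single feature column, the sketch below assumes quantile-based split points and a 0/1 map with one row per bucket. Both the bucketing rule and the map layout are assumptions; this page does not describe what `build_maps` does internally, and `bucket_feature` is a hypothetical name.

```python
import numpy as np


def bucket_feature(column: np.ndarray, n_buckets: int):
    """Hypothetical helper: quantile split points plus a bucket x sample map."""
    # Interior quantiles serve as split points (assumed bucketing rule).
    qs = np.linspace(0.0, 1.0, n_buckets + 1)[1:-1]
    split_points = np.unique(np.quantile(column, qs))
    # Bucket index of each sample = number of split points <= its value.
    bucket_idx = np.searchsorted(split_points, column, side="right")
    # Row b of the map marks the samples that fall into bucket b.
    bucket_ids = np.arange(len(split_points) + 1)
    bucket_map = (bucket_idx[None, :] == bucket_ids[:, None]).astype(np.int8)
    return split_points, bucket_map


column = np.arange(1, 9)  # eight samples with feature values 1..8
split_points, bucket_map = bucket_feature(column, n_buckets=4)
print(split_points)
print(bucket_map)
```

With such a map, the gradient sum of every bucket is a single matrix-vector product, `bucket_map @ g`.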

secretflow.ml.boost.ss_xgb_v.core.utils module#

Functions:

prepare_dataset(ds)

check data setting and get total shape.

secretflow.ml.boost.ss_xgb_v.core.utils.prepare_dataset(ds: Union[FedNdarray, VDataFrame]) Tuple[FedNdarray, Tuple[int, int]][source]#

check data setting and get total shape.

Parameters

ds – input dataset

Returns

First: dataset in unified type. Second: shape concatenated across all partitions.
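As a rough picture of the second return value, the sketch below combines per-partition shapes into a total shape, assuming vertical partitioning: every party holds different feature columns for the same samples, so row counts must match and column counts add up. The helper `total_shape` and the party names are hypothetical, and no SecretFlow objects are involved.

```python
from typing import Dict, Tuple


def total_shape(partition_shapes: Dict[str, Tuple[int, int]]) -> Tuple[int, int]:
    """Hypothetical helper: combine (rows, cols) of vertical partitions."""
    rows = {r for r, _ in partition_shapes.values()}
    # Vertical partitions must describe the same set of samples.
    assert len(rows) == 1, "all partitions must have the same number of rows"
    cols = sum(c for _, c in partition_shapes.values())
    return rows.pop(), cols


shape = total_shape({"alice": (100, 3), "bob": (100, 5)})
print(shape)
```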

secretflow.ml.boost.ss_xgb_v.core.xgb_tree module#

Classes:

XgbTree()

class secretflow.ml.boost.ss_xgb_v.core.xgb_tree.XgbTree[source]#

Bases: object

Methods:

__init__()

insert_split_node(feature, value)

__init__() None[source]#
insert_split_node(feature: int, value: float) None[source]#
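This page does not describe how `XgbTree` stores its nodes. As a purely illustrative sketch, a tree with the same `insert_split_node(feature, value)` signature could append each split to two parallel lists in insertion order. The class `SketchTree` and its attribute names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchTree:
    """Hypothetical stand-in: splits stored in insertion order."""

    split_features: List[int] = field(default_factory=list)
    split_values: List[float] = field(default_factory=list)

    def insert_split_node(self, feature: int, value: float) -> None:
        # Record which feature the node splits on and at what threshold.
        self.split_features.append(feature)
        self.split_values.append(value)


tree = SketchTree()
tree.insert_split_node(feature=2, value=0.5)   # e.g. root split
tree.insert_split_node(feature=0, value=1.25)  # e.g. next node
print(tree.split_features, tree.split_values)
```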

Module contents#