secretflow.ml.boost.ss_xgb_v.core package#

Submodules#

secretflow.ml.boost.ss_xgb_v.core.node_split module#

Classes:

RegType(value)

An enumeration.

Functions:

`sigmoid`(pred)
`compute_obj`(G, H, reg_lambda)	compute objective values of input buckets.
`compute_weight`(G, H, reg_lambda, learning_rate)	compute weight values of tree leaf nodes.
`get_weight`(context, s)	compute weight values of tree leaf nodes.
`compute_gh`(y, pred, objective)	compute first and second order gradient of each sample.
`global_setup`(buckets_map, y, seed, ...)	Set up global context.
`tree_setup`(context, pred, col_choices, ...)	Set up pre-tree context.
`find_best_split_bucket`(context, nodes_s, ...)	compute the gradient sums of the instances contained in each split bucket and, for each node, find the split bucket with the maximum split gain.
`init_pred`(base, samples)
`root_select`(samples)
`get_child_select`(nodes_s, lchilds_ss)	compute the next level's select indexes.
`predict_tree_weight`(selects, weights)	get final pred for this tree.
`do_leaf`(context, ss)

class secretflow.ml.boost.ss_xgb_v.core.node_split.RegType(value)[source]#

Bases: Enum

An enumeration.

Attributes:

`Linear`
`Logistic`

Linear = 'linear'#

Logistic = 'logistic'#

secretflow.ml.boost.ss_xgb_v.core.node_split.sigmoid(pred: ndarray) → ndarray[source]#

secretflow.ml.boost.ss_xgb_v.core.node_split.compute_obj(G: ndarray, H: ndarray, reg_lambda: float) → ndarray[source]#

compute objective values of input buckets.

Parameters

G/H – sum of first and second order gradient in each bucket.
reg_lambda – L2 regularization term

Returns

objective values.
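As a rough illustration of what an objective value per bucket can look like, the sketch below uses the standard XGBoost structure score G² / (H + λ). The formula is an assumption about the computation, not a copy of the SecretFlow implementation, and `objective_sketch` is a hypothetical name used only here.

```python
import numpy as np


def objective_sketch(G: np.ndarray, H: np.ndarray, reg_lambda: float) -> np.ndarray:
    """Hypothetical helper: assumed structure score G^2 / (H + lambda) per bucket."""
    return G**2 / (H + reg_lambda)


# Gradient sums of two buckets (made-up numbers for illustration).
G = np.array([2.0, -4.0])
H = np.array([3.0, 7.0])
scores = objective_sketch(G, H, reg_lambda=1.0)
```

A larger score means the bucket's gradient sum is large relative to its regularized Hessian sum, which is what makes a split at that bucket attractive.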

secretflow.ml.boost.ss_xgb_v.core.node_split.compute_weight(G: float, H: float, reg_lambda: float, learning_rate: float) → ndarray[source]#

compute weight values of tree leaf nodes.

Parameters

G/H – sum of first and second order gradient in each node.
reg_lambda – L2 regularization term
learning_rate – step size shrinkage used in updates to prevent overfitting.

Returns

weight values.
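For orientation, a leaf weight in XGBoost-style boosting is commonly computed as -G / (H + λ), scaled by the learning rate. The sketch below assumes that formula; it is not taken from the SecretFlow source, and `leaf_weight_sketch` is a hypothetical name.

```python
def leaf_weight_sketch(G: float, H: float, reg_lambda: float, learning_rate: float) -> float:
    """Hypothetical helper: assumed leaf weight -G / (H + lambda), shrunk by the learning rate."""
    return -G / (H + reg_lambda) * learning_rate


# Made-up gradient sums for a single leaf node.
weight = leaf_weight_sketch(G=2.0, H=3.0, reg_lambda=1.0, learning_rate=0.5)
```

A larger `reg_lambda` or a smaller `learning_rate` both shrink the weight toward zero, which is how these two parameters limit the contribution of any single tree.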

secretflow.ml.boost.ss_xgb_v.core.node_split.get_weight(context: Dict[str, Any], s: ndarray) → ndarray[source]#

compute weight values of tree leaf nodes.

Parameters

context – comparison context.
s – sample selects in each leaf node.

Returns

weight values.

secretflow.ml.boost.ss_xgb_v.core.node_split.compute_gh(y: ndarray, pred: ndarray, objective: RegType) → Tuple[ndarray, ndarray][source]#

compute first and second order gradient of each sample.

Parameters

y – true label of each sample.
pred – prediction of each sample.
objective – regression learning objective (a RegType value).

Returns

first and second order gradient of each sample.
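The two objectives can be illustrated with the textbook gradients for squared error and log loss. This is a sketch under assumptions: the `Objective` enum is a hypothetical stand-in for `RegType`, `gh_sketch` is a hypothetical name, and `pred` is assumed to be a raw score that is passed through a sigmoid for the logistic case. The library may treat `pred` differently.

```python
from enum import Enum

import numpy as np


class Objective(Enum):
    """Hypothetical stand-in for RegType, used only in this sketch."""

    LINEAR = "linear"
    LOGISTIC = "logistic"


def gh_sketch(y: np.ndarray, pred: np.ndarray, objective: Objective):
    """Return (g, h): assumed first and second order gradients per sample."""
    if objective is Objective.LOGISTIC:
        # Log loss on a raw score: g = p - y, h = p * (1 - p).
        p = 1.0 / (1.0 + np.exp(-pred))
        return p - y, p * (1.0 - p)
    # Squared error 0.5 * (pred - y)^2: g = pred - y, h = 1.
    return pred - y, np.ones_like(pred)


g_lin, h_lin = gh_sketch(np.array([2.0, 2.0]), np.array([1.0, 3.0]), Objective.LINEAR)
g_log, h_log = gh_sketch(np.array([1.0]), np.array([0.0]), Objective.LOGISTIC)
```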

secretflow.ml.boost.ss_xgb_v.core.node_split.global_setup(buckets_map: List[ndarray], y: ndarray, seed: int, reg_lambda: float, learning_rate: float) → Dict[str, Any][source]#

Set up global context.

secretflow.ml.boost.ss_xgb_v.core.node_split.tree_setup(context: Dict[str, Any], pred: ndarray, col_choices: List[ndarray], objective: RegType, samples: int, subsample: float) → Dict[str, Any][source]#

Set up pre-tree context.

secretflow.ml.boost.ss_xgb_v.core.node_split.find_best_split_bucket(context: Dict[str, Any], nodes_s: List[ndarray], last_level: bool) → Tuple[ndarray, Dict[str, Any]][source]#

compute the gradient sums of the instances contained in each split bucket and, for each node, find the split bucket with the maximum split gain.

Parameters

context – comparison context.
nodes_s – sample select indexes of each node from same tree level.
last_level – whether this split is at the last level, i.e. the next level consists of leaf nodes.

Returns

index of the split bucket for each node.
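The idea of picking the bucket with the maximum split gain can be sketched in plaintext for a single node. The gain formula below is the standard XGBoost one and is an assumption here. The real function works on a comparison context and on several nodes at once, none of which this sketch reproduces. `best_split_sketch`, `bucket_g` and `bucket_h` are hypothetical names.

```python
import numpy as np


def best_split_sketch(bucket_g: np.ndarray, bucket_h: np.ndarray, reg_lambda: float):
    """Hypothetical helper: return (best split index, gains) for one node.

    bucket_g / bucket_h hold per-bucket gradient sums, ordered by split value.
    Splitting at index i sends buckets 0..i to the left child.
    """
    G, H = bucket_g.sum(), bucket_h.sum()
    # Cumulative sums give the left-child totals for every candidate split;
    # the last entry is dropped because it would leave the right child empty.
    GL = np.cumsum(bucket_g)[:-1]
    HL = np.cumsum(bucket_h)[:-1]
    GR, HR = G - GL, H - HL
    gains = (
        GL**2 / (HL + reg_lambda)
        + GR**2 / (HR + reg_lambda)
        - G**2 / (H + reg_lambda)
    )
    return int(np.argmax(gains)), gains


bucket_g = np.array([-1.0, -1.0, 1.0, 1.0])
bucket_h = np.ones(4)
best, gains = best_split_sketch(bucket_g, bucket_h, reg_lambda=1.0)
```

In this toy input the negative and positive gradients separate cleanly after the second bucket, so that candidate has the highest gain.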

secretflow.ml.boost.ss_xgb_v.core.node_split.init_pred(base: float, samples: int)[source]#

secretflow.ml.boost.ss_xgb_v.core.node_split.root_select(samples: int) → List[ndarray][source]#

secretflow.ml.boost.ss_xgb_v.core.node_split.get_child_select(nodes_s: List[ndarray], lchilds_ss: List[ndarray]) → List[ndarray][source]#

compute the next level’s select indexes.

Parameters

nodes_s – sample select indexes of each node from current level’s nodes.
lchilds_ss – left children’s sample select idx for current level’s nodes.

Returns

sample select indexes for nodes in next tree level.
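One way to picture this step is with 0/1 indicator vectors: a sample belongs to the left child if it is in the parent node and selected for the left side, and to the right child if it is in the parent but not in the left child. The sketch below assumes that representation and an output order of left child followed by right child for each node. Both are assumptions, and `child_select_sketch` is a hypothetical name.

```python
from typing import List

import numpy as np


def child_select_sketch(nodes_s: List[np.ndarray], lchilds_ss: List[np.ndarray]) -> List[np.ndarray]:
    """Hypothetical helper: derive next-level selects from 0/1 indicator vectors."""
    children = []
    for node_s, lchild_s in zip(nodes_s, lchilds_ss):
        left = node_s * lchild_s  # in the parent and sent to the left
        right = node_s - left     # in the parent but not in the left child
        children.extend([left, right])
    return children


# One parent node holding samples 0, 1 and 2; samples 0 and 2 go left.
nodes = [np.array([1, 1, 1, 0])]
lefts = [np.array([1, 0, 1, 0])]
children = child_select_sketch(nodes, lefts)
```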

secretflow.ml.boost.ss_xgb_v.core.node_split.predict_tree_weight(selects: List[ndarray], weights: ndarray) → ndarray[source]#

get final pred for this tree.

Parameters

selects – leaf nodes’ sample selects from each model handler.
weights – leaf weights in secure share.

Returns

prediction of this tree for each sample.
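A plaintext sketch of how per-party leaf selects and leaf weights could combine into a prediction: each party contributes a samples × leaves 0/1 matrix marking the leaves a sample may still reach according to that party's features, the matrices are multiplied element-wise, and the result is multiplied by the weight vector. This layout is an assumption, the real weights are secret-shared rather than plaintext, and `tree_pred_sketch` is a hypothetical name.

```python
from typing import List

import numpy as np


def tree_pred_sketch(selects: List[np.ndarray], weights: np.ndarray) -> np.ndarray:
    """Hypothetical helper: combine per-party leaf selects, then apply leaf weights."""
    combined = np.ones_like(selects[0])
    for party_select in selects:
        # A sample reaches a leaf only if every party's select allows it.
        combined = combined * party_select
    return combined @ weights


party_a = np.array([[1, 1], [1, 0], [0, 1]])
party_b = np.array([[1, 0], [1, 1], [0, 1]])
leaf_weights = np.array([0.5, -0.25])
pred = tree_pred_sketch([party_a, party_b], leaf_weights)
```

After the element-wise product each row has exactly one non-zero entry, so the matrix product simply picks the weight of the leaf each sample falls into.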

secretflow.ml.boost.ss_xgb_v.core.node_split.do_leaf(context: Dict[str, Any], ss: List[ndarray]) → ndarray[source]#

secretflow.ml.boost.ss_xgb_v.core.tree_worker module#

Classes:

XgbTreeWorker

alias of ActorProxy(XgbTreeWorker)

secretflow.ml.boost.ss_xgb_v.core.tree_worker.XgbTreeWorker[source]#

alias of ActorProxy(XgbTreeWorker)

Methods:

`__init__`(*args, **kwargs)	Abstraction device object base class.
`predict_weight_select`(x, tree, *[, ...])	compute leaf nodes' sample selects known by this partition.
`build_maps`(x, *[, _ray_trace_ctx])	split features into buckets and build the maps used in training.
`global_setup`(x, buckets, seed, *[, ...])	Set up global context.
`update_buckets_count`(buckets_count, *[, ...])	save the number of buckets across all features of each partition.
`tree_setup`(colsample, *[, _ray_trace_ctx])	Set up tree context and perform column sampling if colsample < 1.
`tree_finish`(*[, _ray_trace_ctx])
`do_split`(split_buckets, *[, _ray_trace_ctx])	record split info and generate next level's left children select.

secretflow.ml.boost.ss_xgb_v.core.utils module#

Functions:

prepare_dataset(ds)

check data setting and get total shape.

secretflow.ml.boost.ss_xgb_v.core.utils.prepare_dataset(ds: Union[FedNdarray, VDataFrame]) → Tuple[FedNdarray, Tuple[int, int]][source]#

check data setting and get total shape.

Parameters: ds – input dataset
Returns: a tuple of two items. First: the dataset in a unified type. Second: the total shape after concatenating all partitions.

secretflow.ml.boost.ss_xgb_v.core.xgb_tree module#

Classes:

XgbTree()

class secretflow.ml.boost.ss_xgb_v.core.xgb_tree.XgbTree[source]#

Bases: object

Methods:

`__init__`()
`insert_split_node`(feature, value)

__init__() → None[source]#

insert_split_node(feature: int, value: float) → None[source]#
