secretflow.ml.nn package#

Subpackages#

Module contents#

Classes:

FLModel([server, device_list, model, ...])

SLModel([base_model_dict, device_y, ...])

class secretflow.ml.nn.FLModel(server=None, device_list: List[PYU] = [], model: Union[TorchModel, Callable[[], tensorflow.keras.Model]] = None, aggregator=None, strategy='fed_avg_w', consensus_num=1, backend='tensorflow', **kwargs)[source]#

Bases: object

Methods:

__init__([server, device_list, model, ...])

init_workers(model, device_list, strategy, ...)

initialize_weights()

handle_file(train_dict, label[, batch_size, ...])

handle_data(train_x[, train_y, batch_size, ...])

fit(x, y[, batch_size, batch_sampling_rate, ...])

Horizontal federated training interface

predict(x[, batch_size, label_decoder, ...])

Horizontal federated offline prediction interface

evaluate(x[, y, batch_size, sample_weight, ...])

Horizontal federated offline evaluation interface

save_model(model_path[, is_test])

Horizontal federated save model interface

load_model(model_path[, is_test])

Horizontal federated load model interface

__init__(server=None, device_list: List[PYU] = [], model: Union[TorchModel, Callable[[], tensorflow.keras.Model]] = None, aggregator=None, strategy='fed_avg_w', consensus_num=1, backend='tensorflow', **kwargs)[source]#
init_workers(model, device_list, strategy, backend)[source]#
initialize_weights()[source]#
handle_file(train_dict: Dict[PYU, str], label: str, batch_size: Union[int, Dict[PYU, int]] = 32, sampling_rate=None, shuffle=False, random_seed=1234, epochs=1, stage='train', label_decoder=None, max_batch_size=20000, prefetch_buffer_size=None)[source]#
handle_data(train_x: Union[HDataFrame, FedNdarray], train_y: Optional[Union[HDataFrame, FedNdarray]] = None, batch_size: Union[int, Dict[PYU, int]] = 32, sampling_rate=None, shuffle=False, random_seed=1234, epochs=1, sample_weight: Optional[Union[HDataFrame, FedNdarray]] = None, sampler_method='batch', stage='train')[source]#
fit(x: Union[HDataFrame, FedNdarray, Dict[PYU, str]], y: Union[HDataFrame, FedNdarray, str], batch_size: Union[int, Dict[PYU, int]] = 32, batch_sampling_rate: Optional[float] = None, epochs: int = 1, verbose: int = 1, callbacks=None, validation_data=None, shuffle=False, class_weight=None, sample_weight=None, validation_freq=1, aggregate_freq=1, label_decoder=None, max_batch_size=20000, prefetch_buffer_size=None, sampler_method='batch', random_seed=None, dp_spent_step_freq=None, audit_log_dir=None) History[source]#

Horizontal federated training interface

Parameters
  • x – Feature data: FedNdarray, HDataFrame, or Dict {PYU: data_path} for CSV reading

  • y – Label: FedNdarray, HDataFrame, or str (column name of the label)

  • batch_size – Number of samples per gradient update, int or Dict; 64 or more is recommended for safety

  • batch_sampling_rate – Ratio of samples per batch, float

  • epochs – Number of epochs to train the model

  • verbose – Verbosity mode: 0 = silent, 1 = progress bar

  • callbacks – List of keras.callbacks.Callback instances.

  • validation_data – Data on which to evaluate

  • shuffle – whether to shuffle the training data

  • class_weight – Dict mapping class indices (integers) to a weight (float)

  • sample_weight – weights for the training samples

  • validation_freq – specifies how many training epochs to run before a new validation run is performed

  • aggregate_freq – Interval, in steps, between weight aggregations (aggregate every aggregate_freq steps)

  • label_decoder – Callable for label preprocessing; only used for CSV reading

  • max_batch_size – Max limit of batch size

  • prefetch_buffer_size – An int specifying the number of feature batches to prefetch for performance improvement. Only for the CSV reader

  • sampler_method – The name of sampler method

  • random_seed – PRNG seed for shuffling

  • dp_spent_step_freq – Interval, in training steps, at which the DP privacy budget is checked

  • audit_log_dir – Path of the audit log directory; a checkpoint will be saved if audit_log_dir is not None

Returns

A history object. Its history.global_history attribute is an aggregated record of training loss values and metrics, while its history.local_history attribute records the training loss values and metrics of each party.

predict(x: Union[HDataFrame, FedNdarray, Dict], batch_size=None, label_decoder=None, sampler_method='batch', random_seed=1234) Dict[PYU, PYUObject][source]#

Horizontal federated offline prediction interface

Parameters
  • x – feature, FedNdArray or HDataFrame

  • batch_size – Number of samples per batch of computation, int or Dict

  • label_decoder – Callable for label preprocessing; only used for CSV reading

  • sampler_method – The name of sampler method

  • random_seed – PRNG seed for shuffling

Returns

Prediction results: a Dict mapping each PYU to a PYUObject holding its numpy.ndarray of predictions

evaluate(x: Union[HDataFrame, FedNdarray, Dict], y: Optional[Union[HDataFrame, FedNdarray, str]] = None, batch_size: Union[int, Dict[PYU, int]] = 32, sample_weight: Optional[Union[HDataFrame, FedNdarray]] = None, label_decoder=None, return_dict=False, sampler_method='batch', random_seed=None) Tuple[Union[List[Metric], Dict[str, Metric]], Union[Dict[str, List[Metric]], Dict[str, Dict[str, Metric]]]][source]#

Horizontal federated offline evaluation interface

Parameters
  • x – Input data. It could be a FedNdarray, an HDataFrame, or a Dict {PYU: data_path}

  • y – Label. It could be a FedNdarray, an HDataFrame, or a str (the CSV label column name)

  • batch_size – Integer or Dict. Number of samples per batch of computation. If unspecified, batch_size will default to 32.

  • sample_weight – Optional Numpy array of weights for the test samples, used for weighting the loss function.

  • label_decoder – User-defined callable that handles the label column when using the CSV reader

  • sampler_method – The name of sampler method

  • return_dict – If True, loss and metric results are returned as a dict, with each key being the name of the metric. If False, they are returned as a list.

Returns

A tuple of two objects. The first is an aggregated record of metrics, and the second is a record of the training loss values and metrics of each party.

save_model(model_path: Union[str, Dict[PYU, str]], is_test=False)[source]#

Horizontal federated save model interface

Parameters
  • model_path – Model path; only formats like ‘a/b/c’ are supported, where c is the model name

  • is_test – Whether to run in test mode
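As a sanity check of the required ‘a/b/c’ layout: the final path component is the model name and everything before it is the directory. A small stdlib illustration (the path itself is hypothetical):

```python
from pathlib import PurePosixPath

model_path = "checkpoints/fl/my_model"  # hypothetical 'a/b/c'-style path
p = PurePosixPath(model_path)
directory = str(p.parent)  # 'checkpoints/fl', where files are written
model_name = p.name        # 'my_model', the model name (the 'c' part)
```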

load_model(model_path: Union[str, Dict[PYU, str]], is_test=False)[source]#

Horizontal federated load model interface

Parameters
  • model_path – model path

  • is_test – Whether to run in test mode

class secretflow.ml.nn.SLModel(base_model_dict: Dict[Device, Callable[[], tensorflow.keras.Model]] = {}, device_y: PYU = None, model_fuse: Callable[[], tensorflow.keras.Model] = None, compressor: Compressor = None, dp_strategy_dict: Dict[Device, DPStrategy] = None, **kwargs)[source]#

Bases: object

Methods:

__init__([base_model_dict, device_y, ...])

handle_data(x[, y, sample_weight, ...])

fit(x, y[, batch_size, epochs, verbose, ...])

Vertical split learning training interface

predict(x[, batch_size, verbose, ...])

Vertical split learning offline prediction interface

evaluate(x, y[, batch_size, sample_weight, ...])

Vertical split learning evaluate interface

save_model([base_model_path, ...])

Vertical split learning save model interface

load_model([base_model_path, ...])

Vertical split learning load model interface

__init__(base_model_dict: Dict[Device, Callable[[], tensorflow.keras.Model]] = {}, device_y: PYU = None, model_fuse: Callable[[], tensorflow.keras.Model] = None, compressor: Compressor = None, dp_strategy_dict: Dict[Device, DPStrategy] = None, **kwargs)[source]#
handle_data(x: Union[VDataFrame, FedNdarray, List[Union[HDataFrame, VDataFrame, FedNdarray]]], y: Optional[Union[FedNdarray, VDataFrame, PYUObject]] = None, sample_weight: Optional[Union[FedNdarray, VDataFrame]] = None, batch_size=32, shuffle=False, epochs=1, stage='train', random_seed=1234, dataset_builder: Optional[Callable] = None)[source]#
fit(x: Union[VDataFrame, FedNdarray, List[Union[HDataFrame, VDataFrame, FedNdarray]]], y: Union[VDataFrame, FedNdarray, PYUObject], batch_size=32, epochs=1, verbose=1, callbacks=None, validation_data=None, shuffle=False, sample_weight=None, validation_freq=1, dp_spent_step_freq=None, dataset_builder: Optional[Callable[[List], Tuple[int, Iterable]]] = None, audit_log_dir: Optional[str] = None, random_seed: Optional[int] = None)[source]#

Vertical split learning training interface

Parameters
  • x – Input data. It could be a VDataFrame (a vertically aligned dataframe), a FedNdarray (a vertically aligned ndarray), or a List[Union[HDataFrame, VDataFrame, FedNdarray]] (a list of dataframes or ndarrays).

  • y – Target data. It could be a VDataFrame or FedNdarray which has only one partition, or a PYUObject.

  • batch_size – Number of samples per gradient update.

  • epochs – Number of epochs to train the model

  • verbose – Verbosity mode: 0 = silent, 1 = progress bar

  • callbacks – List of keras.callbacks.Callback instances.

  • validation_data – Data on which to validate

  • shuffle – Whether to shuffle the dataset

  • validation_freq – specifies how many training epochs to run before a new validation run is performed

  • sample_weight – weights for the training samples

  • dp_spent_step_freq – Interval, in training steps, at which the DP privacy budget is checked

  • dataset_builder – Callable whose input is x (or [x, y] if y is set); it should return a tuple of steps_per_epoch and an iterable dataset. Mainly used for building graph datasets.
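Matching the documented signature Callable[[List], Tuple[int, Iterable]], a dataset_builder receives a party's data partitions and returns steps_per_epoch together with an iterable of batches. A stdlib-only sketch of that contract (a real builder would typically construct a tf.data or graph dataset; the batch_size default is an assumed illustration):

```python
import math

# Hypothetical dataset_builder: receives the party's data (a plain list
# stands in for its partition of x, or [x, y]) and returns
# (steps_per_epoch, iterable-of-batches), as the documented signature
# Callable[[List], Tuple[int, Iterable]] requires.
def dataset_builder(data, batch_size=32):
    samples = data[0]  # first partition in the input list
    steps_per_epoch = math.ceil(len(samples) / batch_size)

    def batches():
        for i in range(0, len(samples), batch_size):
            yield samples[i:i + batch_size]

    return steps_per_epoch, batches()

# 100 samples with batch_size=32 yields 4 steps per epoch.
steps, it = dataset_builder([list(range(100))])
```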

predict(x: Union[VDataFrame, FedNdarray, List[Union[HDataFrame, VDataFrame, FedNdarray]]], batch_size=32, verbose=0, dataset_builder: Optional[Callable[[List], Tuple[int, Iterable]]] = None, compress: bool = False)[source]#

Vertical split learning offline prediction interface

Parameters
  • x – Input data. It could be a VDataFrame (a vertically aligned dataframe), a FedNdarray (a vertically aligned ndarray), or a List[Union[HDataFrame, VDataFrame, FedNdarray]] (a list of dataframes or ndarrays).

  • batch_size – Number of samples per batch of computation, int

  • verbose – Verbosity mode: 0 = silent, 1 = progress bar

  • dataset_builder – Callable whose input is x (or [x, y] if y is set); it should return steps_per_epoch and an iterable dataset. Mainly used for building graph datasets.

  • compress – Whether to use compressor to compress cross device data.

evaluate(x: Union[VDataFrame, FedNdarray, List[Union[HDataFrame, VDataFrame, FedNdarray]]], y: Union[VDataFrame, FedNdarray, PYUObject], batch_size: int = 32, sample_weight=None, verbose=1, dataset_builder: Callable[[List], Tuple[int, Iterable]] = None, random_seed: int = None, compress: bool = False)[source]#

Vertical split learning evaluate interface

Parameters
  • x – Input data. It could be a VDataFrame (a vertically aligned dataframe), a FedNdarray (a vertically aligned ndarray), or a List[Union[HDataFrame, VDataFrame, FedNdarray]] (a list of dataframes or ndarrays).

  • y – Target data. It could be a VDataFrame or FedNdarray which has only one partition, or a PYUObject.

  • batch_size – Integer or Dict. Number of samples per batch of computation. If unspecified, batch_size will default to 32.

  • sample_weight – Optional Numpy array of weights for the test samples, used for weighting the loss function.

  • verbose – Verbosity mode. 0 = silent, 1 = progress bar.

  • dataset_builder – Callable whose input is x (or [x, y] if y is set); it should return steps_per_epoch and an iterable dataset. Mainly used for building graph datasets.

  • compress – Whether to use compressor to compress cross device data.

Returns

Federated evaluation result

Return type

metrics

save_model(base_model_path: Optional[Union[str, Dict[PYU, str]]] = None, fuse_model_path: Optional[str] = None, is_test=False, save_traces=True)[source]#

Vertical split learning save model interface

Parameters
  • base_model_path – Base model path; only formats like ‘a/b/c’ are supported, where c is the model name

  • fuse_model_path – fuse model path

  • is_test – Whether to run in test mode

  • save_traces – (only applies to SavedModel format) When enabled, the SavedModel will store the function traces for each layer.

load_model(base_model_path: Optional[Union[str, Dict[PYU, str]]] = None, fuse_model_path: Optional[str] = None, is_test=False, base_custom_objects=None, fuse_custom_objects=None)[source]#

Vertical split learning load model interface

Parameters
  • base_model_path – base model path

  • fuse_model_path – fuse model path

  • is_test – Whether to run in test mode

  • base_custom_objects – Optional dictionary mapping names (strings) to custom classes or functions of the base model to be considered during deserialization.

  • fuse_custom_objects – Optional dictionary mapping names (strings) to custom classes or functions of the fuse model to be considered during deserialization.