secretflow package
Subpackages
- secretflow.data package
- secretflow.device package
- secretflow.ml package
- secretflow.preprocessing package
- secretflow.security package
- secretflow.stats package
  - Subpackages
  - Submodules
    - secretflow.stats.biclassification_eval module
    - secretflow.stats.psi_eval module
    - secretflow.stats.pva_eval module
    - secretflow.stats.regression_eval module
    - secretflow.stats.score_card module
    - secretflow.stats.ss_pearsonr_v module
    - secretflow.stats.ss_pvalue_v module
    - secretflow.stats.ss_vif_v module
    - secretflow.stats.table_statistics module
  - Module contents
- secretflow.utils package
  - Subpackages
  - Submodules
    - secretflow.utils.compressor module
    - secretflow.utils.errors module
    - secretflow.utils.hash module
    - secretflow.utils.io module
    - secretflow.utils.ndarray_bigint module
    - secretflow.utils.ndarray_encoding module
    - secretflow.utils.sigmoid module
    - secretflow.utils.testing module
  - Module contents
Module contents
Classes:
HEU – Homomorphic encryption device
PYU – PYU is the device doing computation in single domain.
SPU
Device
DeviceObject
HEUObject – HEU Object
PYUObject – PYU device object.
SPUObject
Functions:
init – Connect to an existing Ray cluster or start one and connect to it.
proxy – Define a device class which should accept DeviceObject as method parameters and return DeviceObject.
reveal – Get plaintext data from device.
shutdown – Disconnect the worker, and terminate processes started by secretflow.init().
to – Device object conversion.
wait – Wait for device objects until all are ready or an error occurs.
- class secretflow.HEU(config: dict, spu_field_type)[source]
Bases: Device
Homomorphic encryption device
Methods:
__init__(config, spu_field_type) – Initialize HEU
init()
get_participant(party) – Get ray actor by name
has_party(party)
- __init__(config: dict, spu_field_type)[source]
Initialize HEU
- Parameters
config – HEU init config, for example:
    {
        'sk_keeper': {
            'party': 'alice'
        },
        'evaluators': [{
            'party': 'bob'
        }],
        # The HEU working mode, choose from PHEU / LHEU / FHEU_ROUGH / FHEU
        'mode': 'PHEU',
        # TODO: cleartext_type should be migrated to HeObject.
        'encoding': {
            # DT_I1, DT_I8, DT_I16, DT_I32, DT_I64 or DT_FXP (default)
            'cleartext_type': "DT_FXP",
            # see https://heu.readthedocs.io/en/latest/getting_started/quick_start.html#id3 for detail
            # available encoders:
            #   - IntegerEncoder: Plaintext = Cleartext * scale
            #   - FloatEncoder (default): Plaintext = Cleartext * scale
            #   - BigintEncoder: Plaintext = Cleartext
            #   - BatchEncoder: Plaintext = Pack[Cleartext, Cleartext]
            'encoder': 'FloatEncoder'
        },
        'he_parameters': {
            'schema': 'paillier',
            'key_pair': {
                'generate': {
                    'bit_size': 2048,
                },
            }
        }
    }
spu_field_type – Field type in SPU. The Device.to operation requires the data scale of HEU to be aligned with SPU.
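The config layout above can be assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper `make_heu_config` (not part of SecretFlow) that fills in the same keys and defaults as the documented example; the non-empty check on evaluators is an assumption of this sketch, not a documented rule.

```python
from typing import Dict, List


def make_heu_config(sk_keeper: str, evaluators: List[str],
                    key_bits: int = 2048) -> Dict:
    """Build an HEU config dict shaped like the documented example.

    Hypothetical helper for illustration; it only builds a plain dict.
    """
    # Assumption of this sketch: at least one evaluator party is given.
    if not evaluators:
        raise ValueError("at least one evaluator party is expected")
    return {
        'sk_keeper': {'party': sk_keeper},
        'evaluators': [{'party': party} for party in evaluators],
        'mode': 'PHEU',
        'encoding': {
            'cleartext_type': 'DT_FXP',
            'encoder': 'FloatEncoder',
        },
        'he_parameters': {
            'schema': 'paillier',
            'key_pair': {'generate': {'bit_size': key_bits}},
        },
    }


config = make_heu_config('alice', ['bob'])
```

The resulting dict would then be passed as the `config` argument of `secretflow.HEU`, together with an `spu_field_type` chosen to match the SPU device the HEU exchanges data with.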
- class secretflow.PYU(party: str, node: str = '')[source]
Bases: Device
PYU is the device doing computation in single domain.
Essentially, PYU is a Python worker that can execute any Python code.
Methods:
__init__(party[, node]) – PYU constructor.
- class secretflow.SPU(cluster_def: Dict, link_desc: Optional[Dict] = None, name: str = 'SPU')[source]
Bases: Device
Methods:
__init__(cluster_def[, link_desc, name]) – SPU device constructor.
init() – Init SPU runtime in each party
reset() – Reset SPU to clear corrupted internal state, for test only
psi_df(key, dfs, receiver[, protocol, ...]) – Private set intersection with DataFrame.
psi_csv(key, input_path, output_path, receiver) – Private set intersection with csv file.
psi_join_df(key, dfs, receiver, join_party) – Private set intersection join with DataFrame.
psi_join_csv(key, input_path, output_path, ...) – Private set intersection join with csv file.
- __init__(cluster_def: Dict, link_desc: Optional[Dict] = None, name: str = 'SPU')[source]
SPU device constructor.
- Parameters
cluster_def – SPU cluster definition. For more details, refer to the SPU runtime config. For example:
    {
        'nodes': [
            {
                'party': 'alice',
                'id': 'local:0',
                # The address for other peers.
                'address': '127.0.0.1:9001',
                # The listen address of this node.
                # Optional. Address will be used if listen_address is empty.
                'listen_address': ''
            },
            {
                'party': 'bob',
                'id': 'local:1',
                'address': '127.0.0.1:9002',
                'listen_address': ''
            },
        ],
        'runtime_config': {
            'protocol': spu.spu_pb2.SEMI2K,
            'field': spu.spu_pb2.FM128,
            'sigmoid_mode': spu.spu_pb2.RuntimeConfig.SIGMOID_REAL,
        }
    }
link_desc – Optional. A dict that specifies the link parameters. Available parameters are:
    connect_retry_times
    connect_retry_interval_ms
    recv_timeout_ms
    http_max_payload_size
    http_timeout_ms
    throttle_window_size
    brpc_channel_protocol, refer to https://github.com/apache/incubator-brpc/blob/master/docs/en/client.md#protocols
    brpc_channel_connection_type, refer to https://github.com/apache/incubator-brpc/blob/master/docs/en/client.md#connection-type
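As an illustration of the two arguments above, here is a minimal sketch that builds a `cluster_def` from a party-to-address mapping and rejects `link_desc` keys that are not in the documented list. The helpers `make_cluster_def` and `check_link_desc` are hypothetical, the `local:<index>` id scheme simply copies the example, and the string values in `runtime_config` are placeholders for the `spu.spu_pb2` enum values a real deployment would use.

```python
from typing import Any, Dict, Optional

# Link parameter names listed in the documentation above.
KNOWN_LINK_KEYS = {
    'connect_retry_times', 'connect_retry_interval_ms', 'recv_timeout_ms',
    'http_max_payload_size', 'http_timeout_ms', 'throttle_window_size',
    'brpc_channel_protocol', 'brpc_channel_connection_type',
}


def make_cluster_def(addresses: Dict[str, str],
                     runtime_config: Dict[str, Any]) -> Dict[str, Any]:
    """Build a cluster_def with one node per party (hypothetical helper)."""
    nodes = [
        {
            'party': party,
            'id': f'local:{index}',
            'address': address,
            # Empty listen_address: the node listens on `address`.
            'listen_address': '',
        }
        for index, (party, address) in enumerate(addresses.items())
    ]
    return {'nodes': nodes, 'runtime_config': runtime_config}


def check_link_desc(link_desc: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of link_desc, rejecting undocumented keys."""
    link_desc = dict(link_desc or {})
    unknown = set(link_desc) - KNOWN_LINK_KEYS
    if unknown:
        raise ValueError(f"unknown link parameters: {sorted(unknown)}")
    return link_desc


cluster_def = make_cluster_def(
    {'alice': '127.0.0.1:9001', 'bob': '127.0.0.1:9002'},
    # Placeholders; real code passes spu.spu_pb2.SEMI2K, spu.spu_pb2.FM128.
    runtime_config={'protocol': 'SEMI2K', 'field': 'FM128'},
)
# 120000 is an illustrative timeout, not a recommended value.
link_desc = check_link_desc({'recv_timeout_ms': 120000})
```

Both results would then be passed to the constructor as `SPU(cluster_def, link_desc)`.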
- psi_df(key: Union[str, List[str], Dict[Device, List[str]]], dfs: List[PYUObject], receiver: str, protocol='KKRT_PSI_2PC', precheck_input=True, sort=True, broadcast_result=True, bucket_size=1048576, curve_type='CURVE_25519')[source]
Private set intersection with DataFrame.
- Parameters
key (str, List[str], Dict[Device, List[str]]) – Column(s) used to join.
dfs (List[PYUObject]) – DataFrames to be joined, which should be colocated with SPU runtimes.
receiver (str) – The party that can get the joined data; others will get None.
protocol (str) – PSI protocol.
precheck_input (bool) – Whether to check input data before joining.
sort (bool) – Whether to sort data by key after joining.
broadcast_result (bool) – Whether to broadcast joined data to all parties.
bucket_size (int) – Specifies the hash bucket size used in PSI. Larger values consume more memory.
curve_type (str) – Curve type for ECDH PSI.
- Returns
Joined DataFrames with order preserved.
- Return type
List[PYUObject]
- psi_csv(key: Union[str, List[str], Dict[Device, List[str]]], input_path: Union[str, Dict[Device, str]], output_path: Union[str, Dict[Device, str]], receiver: str, protocol='KKRT_PSI_2PC', precheck_input=True, sort=True, broadcast_result=True, bucket_size=1048576, curve_type='CURVE_25519')[source]
Private set intersection with csv file.
- Parameters
key (str, List[str], Dict[Device, List[str]]) – Column(s) used to join.
input_path – CSV files to be joined, comma separated and containing a header.
output_path – Joined CSV files, comma separated and containing a header.
receiver (str) – The party that can get the joined data. Others won't generate an output file and get an intersection_count of -1.
protocol (str) – PSI protocol.
precheck_input (bool) – Whether to check input data before joining; for now, it checks whether the key contains duplicates.
sort (bool) – Whether to sort data by key after joining.
broadcast_result (bool) – Whether to broadcast joined data to all parties.
bucket_size (int) – Specifies the hash bucket size used in PSI. Larger values consume more memory.
- Returns
PSI reports output by SPU with order preserved.
- Return type
List[Dict]
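To make the expected result concrete, the sketch below computes in plaintext what an intersection on a key column yields. It is only a reference model with no cryptography: a real PSI protocol such as the default KKRT_PSI_2PC reaches the same set without parties exchanging their non-matching keys. The function name `plaintext_intersection` and the sample data are hypothetical, and the string-based sort is an assumption about ordering.

```python
import csv
import io
from typing import List


def plaintext_intersection(csv_a: str, csv_b: str, key: str,
                           sort: bool = True) -> List[str]:
    """Return key values present in both CSV texts (no privacy protection)."""

    def read_keys(text: str) -> List[str]:
        keys = [row[key] for row in csv.DictReader(io.StringIO(text))]
        # Mirrors the documented precheck: reject duplicate keys.
        if len(keys) != len(set(keys)):
            raise ValueError(f"duplicate values in key column {key!r}")
        return keys

    keys_a, keys_b = read_keys(csv_a), read_keys(csv_b)
    common = set(keys_a) & set(keys_b)
    # Keep the first party's order unless sorting is requested.
    result = [k for k in keys_a if k in common]
    return sorted(result) if sort else result


alice_csv = "uid,age\n3,41\n1,25\n2,37\n"
bob_csv = "uid,income\n2,5000\n4,7000\n3,6100\n"
common = plaintext_intersection(alice_csv, bob_csv, key="uid")
unsorted_common = plaintext_intersection(alice_csv, bob_csv, key="uid",
                                         sort=False)
```

Here `common` holds the shared uid values in sorted order, while `unsorted_common` keeps the order in which they appear in the first file.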
- psi_join_df(key: Union[str, List[str], Dict[Device, List[str]]], dfs: List[PYUObject], receiver: str, join_party: str, protocol='KKRT_PSI_2PC', precheck_input=True, bucket_size=1048576, curve_type='CURVE_25519')[source]
Private set intersection join with DataFrame.
- Parameters
key (str, List[str], Dict[Device, List[str]]) – Column(s) used to join.
dfs (List[PYUObject]) – DataFrames to be joined, which should be colocated with SPU runtimes.
receiver (str) – The party that can get the joined data. Others won't generate an output file and get an intersection_count of -1.
join_party (str) – The party that can get the joined data.
protocol (str) – PSI protocol.
precheck_input (bool) – Whether to check input data before joining; for now, it checks whether the key contains duplicates.
bucket_size (int) – Specifies the hash bucket size used in PSI. Larger values consume more memory.
curve_type (str) – Curve type for ECDH PSI.
- Returns
Joined DataFrames with order preserved.
- Return type
List[PYUObject]
- psi_join_csv(key: Union[str, List[str], Dict[Device, List[str]]], input_path: Union[str, Dict[Device, str]], output_path: Union[str, Dict[Device, str]], receiver: str, join_party: str, protocol='KKRT_PSI_2PC', precheck_input=True, bucket_size=1048576, curve_type='CURVE_25519')[source]
Private set intersection join with csv file.
- Parameters
key (str, List[str], Dict[Device, List[str]]) – Column(s) used to join.
input_path – CSV files to be joined, comma separated and containing a header.
output_path – Joined CSV files, comma separated and containing a header.
receiver (str) – The party that can get the joined data. Others won't generate an output file and get an intersection_count of -1.
join_party (str) – The party that can get the joined data.
protocol (str) – PSI protocol.
precheck_input (bool) – Whether to check input data before joining; for now, it checks whether the key contains duplicates.
bucket_size (int) – Specifies the hash bucket size used in PSI. Larger values consume more memory.
curve_type (str) – Curve type for ECDH PSI.
- Returns
PSI reports output by SPU with order preserved.
- Return type
List[Dict]
- class secretflow.Device(device_type: DeviceType)[source]
Bases: ABC
Methods:
__init__(device_type) – Abstract device base class.
Attributes:
device_type – Get underlying device type
- __init__(device_type: DeviceType)[source]
Abstract device base class.
- Parameters
device_type (DeviceType) – underlying device type
- property device_type
Get underlying device type
- class secretflow.DeviceObject(device: Device)[source]
Bases: ABC
Methods:
__init__(device) – Abstract device object base class.
to(device[, config]) – Device object conversion.
Attributes:
device_type – Get underlying device type
- __init__(device: Device)[source]
Abstract device object base class.
- Parameters
device (Device) – Device where this object is located.
- property device_type
Get underlying device type
- to(device: Device, config: Optional[MoveConfig] = None)[source]
Device object conversion.
- Parameters
device (Device) – Target device
config – configuration of this data movement
- Returns
Target device object.
- Return type
DeviceObject
- class secretflow.HEUObject(device, data: ObjectRef, location_party: str, is_plain: bool = False)[source]
Bases: DeviceObject
HEU Object
- data
The data held by this HEU object
- location
The party where the data actually resides
- is_plain
Whether the data is plaintext rather than encrypted
Methods:
__init__(device, data, location_party[, ...]) – Abstract device object base class.
encrypt([heu_audit_log]) – Force encrypt if data is plaintext
sum() – Sum of HeObject elements over a given axis.
dump(path) – Dump ciphertext into files.
- class secretflow.PYUObject(device: PYU, data: ObjectRef)[source]
Bases: DeviceObject
PYU device object.
- data
Reference to underlying data.
- Type
ray.ObjectRef
Methods:
__init__(device, data) – Abstract device object base class.
- class secretflow.SPUObject(device: Device, meta: ObjectRef, shares: Sequence[ObjectRef])[source]
Bases: DeviceObject
Methods:
__init__(device, meta, shares) – SPUObject refers to a Python object which could be flattened to a list of SPU values.
- __init__(device: Device, meta: ObjectRef, shares: Sequence[ObjectRef])[source]
SPUObject refers to a Python object which could be flattened to a list of SPU values. An SPU value is a Numpy array or equivalent. For example:
1. If the referred Python object is [1, 2, 3], then meta refers to a single SPUValueMeta, and shares is a list of references to the pieces of shares of [1, 2, 3].
2. If the referred Python object is {'a': 1, 'b': [3, np.array(…)]}, then meta refers to something like {'a': SPUValueMeta1, 'b': [SPUValueMeta2, SPUValueMeta3]}, and each element of shares refers to something like {'a': share1, 'b': [share2, share3]}.
3. shares is a list of ObjectRef to share slices, and these share slices are not necessarily located at the SPU device. Data transfer only happens when the SPU device consumes SPU objects.
- Parameters
meta – Ref to the metadata.
shares (Sequence[ray.ObjectRef]) – Refs to shares of data.
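The relationship between a nested Python object and its per-party share trees can be sketched with a toy additive sharing scheme. This is an illustration only: the ring size `MODULUS`, the integer leaves, and the helper names are assumptions of the sketch, whereas real SPU values are arrays and the sharing scheme depends on the configured protocol.

```python
import secrets
from typing import Any, List

MODULUS = 2 ** 64  # illustrative ring size, an assumption of this sketch


def split_leaf(value: int, n_parties: int) -> List[int]:
    """Additively share one integer: shares sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def share_tree(obj: Any, n_parties: int) -> List[Any]:
    """Return one tree per party, each mirroring the structure of obj."""
    if isinstance(obj, dict):
        per_key = {k: share_tree(v, n_parties) for k, v in obj.items()}
        return [{k: per_key[k][i] for k in obj} for i in range(n_parties)]
    if isinstance(obj, list):
        per_item = [share_tree(v, n_parties) for v in obj]
        return [[item[i] for item in per_item] for i in range(n_parties)]
    return split_leaf(obj, n_parties)


def reconstruct(trees: List[Any]) -> Any:
    """Recombine per-party trees into the original object."""
    first = trees[0]
    if isinstance(first, dict):
        return {k: reconstruct([t[k] for t in trees]) for k in first}
    if isinstance(first, list):
        return [reconstruct([t[i] for t in trees])
                for i in range(len(first))]
    return sum(trees) % MODULUS


secret = {'a': 1, 'b': [3, 7]}
trees = share_tree(secret, n_parties=2)
recovered = reconstruct(trees)
```

Each element of `trees` has the shape {'a': share1, 'b': [share2, share3]}, matching the second case described above, and no single tree reveals the original values.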
- secretflow.init(parties: Optional[Union[List[str], str]] = None, address: Optional[str] = None, num_cpus: Optional[int] = None, log_to_driver=False, omp_num_threads: Optional[int] = None, **kwargs)[source]
Connect to an existing Ray cluster or start one and connect to it.
- Parameters
parties – Parties this node represents, e.g. 'alice' or ['alice', 'bob', 'carol'].
address – The address of the Ray cluster to connect to. If this address is not provided, then a raylet, a plasma store, a plasma manager, and some workers will be started.
num_cpus – Number of CPUs the user wishes to assign to each raylet.
log_to_driver – Whether to direct the output of worker processes on all nodes to the driver.
omp_num_threads – Sets the environment variable OMP_NUM_THREADS. It works only when address is None.
**kwargs – See ray.init() parameters.
- secretflow.proxy(device_object_type: Type[DeviceObject], max_concurrency=None)[source]
Define a device class which should accept DeviceObject as method parameters and return DeviceObject.
This proxy function mainly does the following work:
1. Add an additional parameter device: Device to the __init__ method.
2. Wrap class methods to allow passing DeviceObject as parameters, which must be on the same device as the class instance.
3. Return the corresponding number of DeviceObject according to the return annotation of each class method.
    from typing import Tuple

    import numpy as np
    from secretflow import PYU, PYUObject, proxy

    @proxy(PYUObject)
    class Model:
        def __init__(self, builder):
            self.weights = builder()

        def build_dataset(self, x, y):
            self.dataset_x = x
            self.dataset_y = y

        def get_weights(self) -> np.ndarray:
            return self.weights

        def train_step(self, step) -> Tuple[np.ndarray, int]:
            return self.weights, 100

    # builder and load_data are user-defined callables.
    alice = PYU('alice')
    model = Model(builder, device=alice)
    x, y = alice(load_data)()
    model.build_dataset(x, y)
    w = model.get_weights()
    w, n = model.train_step(10)
- Parameters
device_object_type (Type[DeviceObject]) – DeviceObject type, e.g. PYUObject.
max_concurrency (int) – Actor threadpool size.
- Returns
Wrapper function.
- Return type
Callable
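The three steps performed by proxy can be modelled in plain Python. The following is a toy model, not SecretFlow's implementation: `ToyObject`, `toy_proxy`, and `count_returns` are hypothetical names, devices are plain strings, everything runs locally instead of in an actor, and unlike the real decorator `toy_proxy` takes no device object type argument.

```python
import typing
from typing import Tuple


class ToyObject:
    """Stand-in for a DeviceObject: a value tagged with its device."""

    def __init__(self, device: str, data):
        self.device = device
        self.data = data


def count_returns(method) -> int:
    """Number of results implied by a method's return annotation."""
    ret = typing.get_type_hints(method).get('return')
    if ret is None or ret is type(None):
        return 0
    if typing.get_origin(ret) is tuple:
        return len(typing.get_args(ret))
    return 1


def toy_proxy(cls):
    class Wrapped:
        def __init__(self, *args, device: str, **kwargs):
            # Step 1: the constructor gains an extra `device` parameter.
            self.device = device
            self._inner = cls(*args, **kwargs)

        def __getattr__(self, name):
            method = getattr(self._inner, name)
            n_out = count_returns(method)

            def call(*args):
                # Step 2: unwrap arguments, which must be on our device.
                plain = []
                for arg in args:
                    if isinstance(arg, ToyObject):
                        if arg.device != self.device:
                            raise ValueError("device mismatch")
                        arg = arg.data
                    plain.append(arg)
                result = method(*plain)
                # Step 3: wrap as many results as the annotation declares.
                if n_out == 0:
                    return None
                if n_out == 1:
                    return ToyObject(self.device, result)
                return tuple(ToyObject(self.device, r) for r in result)

            return call

    return Wrapped


@toy_proxy
class Counter:
    def __init__(self, start: int):
        self.total = start

    def add(self, x) -> int:
        self.total += x
        return self.total

    def stats(self) -> Tuple[int, int]:
        return self.total, self.total * 2


counter = Counter(10, device='alice')
total = counter.add(ToyObject('alice', 5))
low, high = counter.stats()
```

Because `stats` is annotated with a two-element Tuple, the wrapper returns two separate objects, which is why the return annotations matter in the documented example.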
- secretflow.reveal(func_or_object)[source]
Get plaintext data from device.
NOTE: Use this function with extreme caution, as it may cause privacy leaks. In SecretFlow, we recommend that data flow between different devices and rarely be revealed to the driver. Only use this function when data-dependent control flow occurs.
- Parameters
func_or_object – May be a callable or any Python object which contains Device objects.
- secretflow.shutdown()[source]
Disconnect the worker, and terminate processes started by secretflow.init().
This will automatically run at the end when a Python process that uses Ray exits. It is OK to run this twice in a row. The primary use case for this function is to clean up state between tests.
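For the test-cleanup use case, pairing init with shutdown in a context manager keeps state from leaking between tests. The sketch below assumes a hypothetical helper `secretflow_session` that takes the module as a parameter, so it can be exercised here with a stand-in object instead of a real cluster.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def secretflow_session(sf, parties, **init_kwargs):
    """Call sf.init(...) on entry and sf.shutdown() on exit, even on error.

    `sf` is expected to be the imported secretflow module; any object
    exposing `init` and `shutdown` works, which keeps this sketch testable.
    """
    sf.init(parties, **init_kwargs)
    try:
        yield sf
    finally:
        sf.shutdown()


# Stand-in module that records calls, so the sketch runs without a cluster.
calls = []
fake_sf = SimpleNamespace(
    init=lambda parties, **kw: calls.append(('init', parties, kw)),
    shutdown=lambda: calls.append(('shutdown',)),
)

with secretflow_session(fake_sf, ['alice', 'bob'], num_cpus=2):
    calls.append(('work',))
```

With the real module, the same pattern would read `with secretflow_session(secretflow, ['alice', 'bob']):` around the body of a test.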
- secretflow.to(device: Device, data: Any, spu_vis: str = 'secret')[source]
Device object conversion.
- Parameters
device (Device) – Target device.
data (Any) – DeviceObject or plaintext data.
spu_vis (str) – Device object visibility, SPU device only.
    secret: Secret sharing with protocol spdz-2k, aby3, etc.
    public: Public sharing, which means data will be replicated to each node.
- Returns
Target device object.
- Return type