secretflow package#

Subpackages#

Module contents#

Classes:

HEU(config, spu_field_type)

Homomorphic encryption device

PYU(party[, node])

PYU is the device doing computation in single domain.

SPU(cluster_def[, link_desc, name])

Device(device_type)

DeviceObject(device)

HEUObject(device, data, location_party[, ...])

HEU Object

PYUObject(device, data)

PYU device object.

SPUObject(device, meta, shares)

Functions:

init([parties, address, num_cpus, ...])

Connect to an existing Ray cluster or start one and connect to it.

proxy(device_object_type[, max_concurrency])

Define a device class which should accept DeviceObject as method parameters and return DeviceObject.

reveal(func_or_object)

Get plaintext data from device.

shutdown()

Disconnect the worker, and terminate processes started by secretflow.init().

to(device, data[, spu_vis])

Device object conversion.

wait(objects)

Wait for device objects until all are ready or an error occurs.

class secretflow.HEU(config: dict, spu_field_type)[source]#

Bases: Device

Homomorphic encryption device

Methods:

__init__(config, spu_field_type)

Initialize HEU

init()

sk_keeper_name()

evaluator_names()

get_participant(party)

Get ray actor by name

has_party(party)

__init__(config: dict, spu_field_type)[source]#

Initialize HEU

Parameters
  • config

    HEU init config, for example

    {
        'sk_keeper': {
            'party': 'alice'
        },
        'evaluators': [{
            'party': 'bob'
        }],
        # The HEU working mode, choose from PHEU / LHEU / FHEU_ROUGH / FHEU
        'mode': 'PHEU',
        # TODO: cleartext_type should be migrated to HeObject.
        'encoding': {
            # DT_I1, DT_I8, DT_I16, DT_I32, DT_I64 or DT_FXP (default)
            'cleartext_type': "DT_FXP",
            # see https://heu.readthedocs.io/en/latest/getting_started/quick_start.html#id3 for detail
            # available encoders:
            #     - IntegerEncoder: Plaintext = Cleartext * scale
            #     - FloatEncoder (default): Plaintext = Cleartext * scale
            #     - BigintEncoder: Plaintext = Cleartext
            #     - BatchEncoder: Plaintext = Pack[Cleartext, Cleartext]
            'encoder': 'FloatEncoder'
        },
        'he_parameters': {
            'schema': 'paillier',
            'key_pair': {
                'generate': {
                    'bit_size': 2048,
                },
            }
        }
    }
    

  • spu_field_type – Field type in spu, Device.to operation requires the data scale of HEU to be aligned with SPU
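The config layout above can be assembled and sanity-checked in plain Python before it is handed to HEU. The following is a minimal sketch, not part of SecretFlow: `make_heu_config` and the three constant sets are hypothetical names, and the allowed values are simply copied from the comments in the example config.

```python
from typing import Any, Dict, List

# Allowed values copied from the comments in the example config above.
HEU_MODES = {"PHEU", "LHEU", "FHEU_ROUGH", "FHEU"}
CLEARTEXT_TYPES = {"DT_I1", "DT_I8", "DT_I16", "DT_I32", "DT_I64", "DT_FXP"}
ENCODERS = {"IntegerEncoder", "FloatEncoder", "BigintEncoder", "BatchEncoder"}


def make_heu_config(
    sk_keeper: str,
    evaluators: List[str],
    mode: str = "PHEU",
    cleartext_type: str = "DT_FXP",
    encoder: str = "FloatEncoder",
    schema: str = "paillier",
    bit_size: int = 2048,
) -> Dict[str, Any]:
    """Hypothetical helper: build a dict shaped like the example config."""
    if mode not in HEU_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if cleartext_type not in CLEARTEXT_TYPES:
        raise ValueError(f"unknown cleartext_type: {cleartext_type!r}")
    if encoder not in ENCODERS:
        raise ValueError(f"unknown encoder: {encoder!r}")
    return {
        "sk_keeper": {"party": sk_keeper},
        "evaluators": [{"party": party} for party in evaluators],
        "mode": mode,
        "encoding": {"cleartext_type": cleartext_type, "encoder": encoder},
        "he_parameters": {
            "schema": schema,
            "key_pair": {"generate": {"bit_size": bit_size}},
        },
    }


config = make_heu_config("alice", ["bob"])
```

The resulting `config` has the same nesting as the example, so a typo in `mode` or `encoder` fails early on the driver instead of inside a remote worker.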

init()[source]#

sk_keeper_name()[source]#

evaluator_names()[source]#

get_participant(party: str)[source]#

Get ray actor by name

has_party(party: str)[source]#

class secretflow.PYU(party: str, node: str = '')[source]#

Bases: Device

PYU is the device doing computation in single domain.

Essentially PYU is a python worker who can execute any python code.

Methods:

__init__(party[, node])

PYU constructor.

__init__(party: str, node: str = '')[source]#

PYU constructor.

Parameters
  • party (str) – Party name where this device is located.

  • node (str, optional) – Node name where the device is located. Defaults to “”.

class secretflow.SPU(cluster_def: Dict, link_desc: Optional[Dict] = None, name: str = 'SPU')[source]#

Bases: Device

Methods:

__init__(cluster_def[, link_desc, name])

SPU device constructor.

init()

Init SPU runtime in each party

reset()

Reset spu to clear corrupted internal state, for test only

psi_df(key, dfs, receiver[, protocol, ...])

Private set intersection with DataFrame.

psi_csv(key, input_path, output_path, receiver)

Private set intersection with csv file.

psi_join_df(key, dfs, receiver, join_party)

Private set intersection with DataFrame.

psi_join_csv(key, input_path, output_path, ...)

Private set intersection with csv file.

__init__(cluster_def: Dict, link_desc: Optional[Dict] = None, name: str = 'SPU')[source]#

SPU device constructor.

Parameters
  • cluster_def

    SPU cluster definition. For more details, refer to the SPU runtime config.

    For example

    {
        'nodes': [
            {
                'party': 'alice',
                'id': 'local:0',
                # The address for other peers.
                'address': '127.0.0.1:9001',
                # The listen address of this node.
                # Optional. Address will be used if listen_address is empty.
                'listen_address': ''
            },
            {
                'party': 'bob',
                'id': 'local:1',
                'address': '127.0.0.1:9002',
                'listen_address': ''
            },
        ],
        'runtime_config': {
            'protocol': spu.spu_pb2.SEMI2K,
            'field': spu.spu_pb2.FM128,
            'sigmoid_mode': spu.spu_pb2.RuntimeConfig.SIGMOID_REAL,
        }
    }
    

  • link_desc

    optional. A dict specifies the link parameters. Available parameters are:

    1. connect_retry_times

    2. connect_retry_interval_ms

    3. recv_timeout_ms

    4. http_max_payload_size

    5. http_timeout_ms

    6. throttle_window_size

    7. brpc_channel_protocol refer to https://github.com/apache/incubator-brpc/blob/master/docs/en/client.md#protocols

    8. brpc_channel_connection_type refer to https://github.com/apache/incubator-brpc/blob/master/docs/en/client.md#connection-type
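The node list in `cluster_def` follows a regular pattern, so it can be generated from a party-to-address mapping. The sketch below is a hypothetical helper (`make_cluster_def` is not a SecretFlow function). The `runtime_config` strings are stand-ins for the `spu.spu_pb2` enum values shown in the example, and the `link_desc` numbers are illustrative values, not documented defaults.

```python
from typing import Any, Dict, List


def make_cluster_def(
    addresses: Dict[str, str], runtime_config: Dict[str, Any]
) -> Dict[str, Any]:
    """Hypothetical helper: build a dict shaped like the cluster_def example.

    addresses maps a party name to the 'host:port' that other peers connect to.
    """
    nodes: List[Dict[str, str]] = []
    for index, (party, address) in enumerate(addresses.items()):
        nodes.append(
            {
                "party": party,
                "id": f"local:{index}",
                "address": address,
                # Left empty so that address is used as the listen address.
                "listen_address": "",
            }
        )
    return {"nodes": nodes, "runtime_config": runtime_config}


# Stand-in strings; real code would pass the spu.spu_pb2 enum values instead.
runtime_config = {"protocol": "SEMI2K", "field": "FM128"}
cluster_def = make_cluster_def(
    {"alice": "127.0.0.1:9001", "bob": "127.0.0.1:9002"}, runtime_config
)

# Illustrative link parameters; the numbers are assumptions, not defaults.
link_desc = {
    "connect_retry_times": 60,
    "connect_retry_interval_ms": 1000,
    "recv_timeout_ms": 120000,
}
```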

init()[source]#

Init SPU runtime in each party

reset()[source]#

Reset spu to clear corrupted internal state, for test only

psi_df(key: Union[str, List[str], Dict[Device, List[str]]], dfs: List[PYUObject], receiver: str, protocol='KKRT_PSI_2PC', precheck_input=True, sort=True, broadcast_result=True, bucket_size=1048576, curve_type='CURVE_25519')[source]#

Private set intersection with DataFrame.

Parameters
  • key (str, List[str], Dict[Device, List[str]]) – Column(s) used to join.

  • dfs (List[PYUObject]) – DataFrames to be joined, which should be colocated with SPU runtimes.

  • receiver (str) – Which party can get joined data, others will get None.

  • protocol (str) – PSI protocol.

  • precheck_input (bool) – Whether to check input data before join.

  • sort (bool) – Whether to sort data by key after join.

  • broadcast_result (bool) – Whether to broadcast joined data to all parties.

  • bucket_size (int) – Hash bucket size used in PSI. Larger values consume more memory.

  • curve_type (str) – curve for ecdh psi

Returns

Joined DataFrames with order preserved.

Return type

List[PYUObject]
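To make the input and output contract of `psi_df` concrete, here is a plaintext stand-in that involves no cryptography and no SecretFlow calls. It only mirrors the documented behaviour: rows are kept when their key appears in every party's table, `sort` orders them by key, and non-receiver parties get `None`. The function name `plaintext_intersection`, the table representation, and the way `receiver` interacts with `broadcast_result` are assumptions of this sketch.

```python
from typing import Dict, List, Optional

Row = Dict[str, object]
Table = List[Row]


def plaintext_intersection(
    tables: Dict[str, Table],
    key: List[str],
    receiver: str,
    sort: bool = True,
    broadcast_result: bool = True,
) -> Dict[str, Optional[Table]]:
    """Plaintext stand-in for the psi_df contract; not a private protocol."""

    def key_of(row: Row) -> tuple:
        return tuple(row[column] for column in key)

    # Keys that are present in every party's table.
    common = set.intersection(
        *({key_of(row) for row in table} for table in tables.values())
    )

    result: Dict[str, Optional[Table]] = {}
    for party, table in tables.items():
        if party != receiver and not broadcast_result:
            # Assumption: without broadcasting, only the receiver gets data.
            result[party] = None
            continue
        rows = [row for row in table if key_of(row) in common]
        if sort:
            rows.sort(key=key_of)
        result[party] = rows
    return result


alice = [{"id": 3, "age": 30}, {"id": 1, "age": 25}, {"id": 2, "age": 41}]
bob = [{"id": 2, "income": 10}, {"id": 4, "income": 20}, {"id": 3, "income": 30}]
out = plaintext_intersection(
    {"alice": alice, "bob": bob}, key=["id"], receiver="alice", broadcast_result=False
)
```

In the real API the tables live on PYU devices and the intersection is computed by the selected PSI protocol, so no party sees the other party's non-matching keys.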

psi_csv(key: Union[str, List[str], Dict[Device, List[str]]], input_path: Union[str, Dict[Device, str]], output_path: Union[str, Dict[Device, str]], receiver: str, protocol='KKRT_PSI_2PC', precheck_input=True, sort=True, broadcast_result=True, bucket_size=1048576, curve_type='CURVE_25519')[source]#

Private set intersection with csv file.

Parameters
  • key (str, List[str], Dict[Device, List[str]]) – Column(s) used to join.

  • input_path – CSV files to be joined, comma separated and containing a header.

  • output_path – Joined CSV files, comma separated and containing a header.

  • receiver (str) – Which party can get joined data. Other parties do not generate an output file, and their intersection_count is -1.

  • protocol (str) – PSI protocol.

  • precheck_input (bool) – Whether to check input data before joining. For now, it checks whether the key contains duplicates.

  • sort (bool) – Whether to sort data by key after joining.

  • broadcast_result (bool) – Whether to broadcast joined data to all parties.

  • bucket_size (int) – Hash bucket size used in PSI. Larger values consume more memory.

Returns

PSI reports output by SPU with order preserved.

Return type

List[Dict]
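The `precheck_input` option is described as a duplicate-key check on the input. A comparable check can be run locally on a CSV file before calling `psi_csv`. This is a sketch using only the standard library; `duplicate_keys` is a hypothetical helper, and it is not claimed to match the check SPU performs internally.

```python
import csv
import io
from collections import Counter
from typing import List, Tuple


def duplicate_keys(csv_text: str, key: List[str]) -> List[Tuple[str, ...]]:
    """Return key tuples occurring more than once in a CSV with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(tuple(row[column] for column in key) for row in reader)
    return sorted(k for k, n in counts.items() if n > 1)


sample = "id,name\n1,ann\n2,bob\n1,cat\n"
dupes = duplicate_keys(sample, ["id"])
```

For a file on disk, the same function can be applied to the text read from `input_path`. Values are compared as strings, since `csv.DictReader` does not convert types.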

psi_join_df(key: Union[str, List[str], Dict[Device, List[str]]], dfs: List[PYUObject], receiver: str, join_party: str, protocol='KKRT_PSI_2PC', precheck_input=True, bucket_size=1048576, curve_type='CURVE_25519')[source]#

Private set intersection with DataFrame.

Parameters
  • key (str, List[str], Dict[Device, List[str]]) – Column(s) used to join.

  • dfs (List[PYUObject]) – DataFrames to be joined, which should be colocated with SPU runtimes.

  • receiver (str) – Which party can get joined data. Other parties do not generate an output file, and their intersection_count is -1.

  • join_party (str) – party can get joined data

  • protocol (str) – PSI protocol.

  • precheck_input (bool) – Whether to check input data before joining. For now, it checks whether the key contains duplicates.

  • bucket_size (int) – Hash bucket size used in PSI. Larger values consume more memory.

  • curve_type (str) – curve for ecdh psi

Returns

Joined DataFrames with order preserved.

Return type

List[PYUObject]

psi_join_csv(key: Union[str, List[str], Dict[Device, List[str]]], input_path: Union[str, Dict[Device, str]], output_path: Union[str, Dict[Device, str]], receiver: str, join_party: str, protocol='KKRT_PSI_2PC', precheck_input=True, bucket_size=1048576, curve_type='CURVE_25519')[source]#

Private set intersection with csv file.

Parameters
  • key (str, List[str], Dict[Device, List[str]]) – Column(s) used to join.

  • input_path – CSV files to be joined, comma separated and containing a header.

  • output_path – Joined CSV files, comma separated and containing a header.

  • receiver (str) – Which party can get joined data. Other parties do not generate an output file, and their intersection_count is -1.

  • join_party (str) – party can get joined data

  • protocol (str) – PSI protocol.

  • precheck_input (bool) – Whether to check input data before joining. For now, it checks whether the key contains duplicates.

  • bucket_size (int) – Hash bucket size used in PSI. Larger values consume more memory.

  • curve_type (str) – curve for ecdh psi

Returns

PSI reports output by SPU with order preserved.

Return type

List[Dict]

class secretflow.Device(device_type: DeviceType)[source]#

Bases: ABC

Methods:

__init__(device_type)

Abstract device base class.

Attributes:

device_type

Get underlying device type

__init__(device_type: DeviceType)[source]#

Abstract device base class.

Parameters

device_type (DeviceType) – underlying device type

property device_type#

Get underlying device type

class secretflow.DeviceObject(device: Device)[source]#

Bases: ABC

Methods:

__init__(device)

Abstract device object base class.

to(device[, config])

Device object conversion.

Attributes:

device_type

Get underlying device type

__init__(device: Device)[source]#

Abstract device object base class.

Parameters

device (Device) – Device where this object is located.

property device_type#

Get underlying device type

to(device: Device, config: Optional[MoveConfig] = None)[source]#

Device object conversion.

Parameters
  • device (Device) – Target device

  • config – configuration of this data movement

Returns

Target device object.

Return type

DeviceObject

class secretflow.HEUObject(device, data: ObjectRef, location_party: str, is_plain: bool = False)[source]#

Bases: DeviceObject

HEU Object

data#

The data held by this HEU object

location#

The party where the data actually resides

is_plain#

Is the data encrypted or not

Methods:

__init__(device, data, location_party[, ...])

Abstract device object base class.

encrypt([heu_audit_log])

Force encrypt if data is plaintext

sum()

Sum of HeObject elements over a given axis.

dump(path)

Dump ciphertext into files.

__init__(device, data: ObjectRef, location_party: str, is_plain: bool = False)[source]#

Abstract device object base class.

Parameters

device (Device) – Device where this object is located.

encrypt(heu_audit_log: Optional[str] = None)[source]#

Force encrypt if data is plaintext

sum()[source]#

Sum of HeObject elements over a given axis.

Returns

sum_along_axis

dump(path)[source]#

Dump ciphertext into files.

class secretflow.PYUObject(device: PYU, data: ObjectRef)[source]#

Bases: DeviceObject

PYU device object.

data#

Reference to underlying data.

Type

ray.ObjectRef

Methods:

__init__(device, data)

Abstract device object base class.

__init__(device: PYU, data: ObjectRef)[source]#

Abstract device object base class.

Parameters

device (Device) – Device where this object is located.

class secretflow.SPUObject(device: Device, meta: ObjectRef, shares: Sequence[ObjectRef])[source]#

Bases: DeviceObject

Methods:

__init__(device, meta, shares)

SPUObject refers to a Python Object which could be flattened to a list of SPU Values.

__init__(device: Device, meta: ObjectRef, shares: Sequence[ObjectRef])[source]#

SPUObject refers to a Python Object which could be flattened to a list of SPU Values. A SPU value is a Numpy array or equivalent. e.g.

1. If the referred Python object is [1,2,3], then meta refers to a single SPUValueMeta, and shares is a list of references to the pieces of shares of [1,2,3].

2. If the referred Python object is {'a': 1, 'b': [3, np.array(…)]}, then meta refers to something like {'a': SPUValueMeta1, 'b': [SPUValueMeta2, SPUValueMeta3]}, and each element of shares refers to something like {'a': share1, 'b': [share2, share3]}.

3. shares is a list of ObjectRef to share slices while these share slices are not necessarily located at SPU device. The data transfer would only happen when SPU device consumes SPU objects.

Parameters
  • meta – Ref to the metadata.

  • shares (Sequence[ray.ObjectRef]) – Refs to shares of data.
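The idea in item 2 above, a nested Python object split into a flat list of values plus a structure that remembers where each value belongs, can be sketched in plain Python. `flatten` and `unflatten` are hypothetical helpers and not how SPUObject is implemented. The sketch treats every list as a container, so it does not reproduce the array-like handling of [1,2,3] in item 1.

```python
from typing import Any, List, Tuple


def flatten(obj: Any) -> Tuple[List[Any], Any]:
    """Split nested dicts/lists/tuples into leaves and an index skeleton."""
    leaves: List[Any] = []

    def visit(node: Any) -> Any:
        if isinstance(node, dict):
            return {k: visit(v) for k, v in node.items()}
        if isinstance(node, (list, tuple)):
            return type(node)(visit(v) for v in node)
        # Anything else is a leaf; record it and keep its position.
        leaves.append(node)
        return len(leaves) - 1

    skeleton = visit(obj)
    return leaves, skeleton


def unflatten(leaves: List[Any], skeleton: Any) -> Any:
    """Rebuild the nested object from leaves and the index skeleton."""
    if isinstance(skeleton, dict):
        return {k: unflatten(leaves, v) for k, v in skeleton.items()}
    if isinstance(skeleton, (list, tuple)):
        return type(skeleton)(unflatten(leaves, v) for v in skeleton)
    return leaves[skeleton]


original = {"a": 1, "b": [3, 4.5]}
leaves, skeleton = flatten(original)
restored = unflatten(leaves, skeleton)
```

Here `skeleton` plays a role loosely similar to `meta`, and `leaves` is the flat list that would each be split into shares.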

secretflow.init(parties: Optional[Union[str, List[str]]] = None, address: Optional[str] = None, num_cpus: Optional[int] = None, log_to_driver=False, omp_num_threads: Optional[int] = None, **kwargs)[source]#

Connect to an existing Ray cluster or start one and connect to it.

Parameters
  • parties – parties this node represents, e.g: ‘alice’, [‘alice’, ‘bob’, ‘carol’].

  • address – The address of the Ray cluster to connect to. If this address is not provided, then a raylet, a plasma store, a plasma manager, and some workers will be started.

  • num_cpus – Number of CPUs the user wishes to assign to each raylet.

  • log_to_driver – Whether to direct the output of worker processes on all nodes to the driver.

  • omp_num_threads – set environment variable OMP_NUM_THREADS. It works only when address is None.

  • **kwargs – see ray.init() parameters.

secretflow.proxy(device_object_type: Type[DeviceObject], max_concurrency=None)[source]#

Define a device class which should accept DeviceObject as method parameters and return DeviceObject.

This proxy function mainly does the following work:

1. Add an additional parameter device: Device to the init method __init__.

2. Wrap class methods to allow passing DeviceObject as parameters, which must be on the same device as the class instance.

3. According to the return annotation of each class method, return the corresponding number of DeviceObject.

from typing import Tuple

import numpy as np

from secretflow import PYU, PYUObject, proxy

@proxy(PYUObject)
class Model:
    def __init__(self, builder):
        self.weights = builder()

    def build_dataset(self, x, y):
        self.dataset_x = x
        self.dataset_y = y

    def get_weights(self) -> np.ndarray:
        return self.weights

    def train_step(self, step) -> Tuple[np.ndarray, int]:
        return self.weights, 100

# builder and load_data are user-supplied callables.
alice = PYU('alice')
model = Model(builder, device=alice)
x, y = alice(load_data)()
model.build_dataset(x, y)
w = model.get_weights()
w, n = model.train_step(10)

Parameters
  • device_object_type (Type[DeviceObject]) – DeviceObject type, eg. PYUObject.

  • max_concurrency (int) – Actor threadpool size.

Returns

Wrapper function.

Return type

Callable
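Two of the steps described for `proxy` can be imitated without SecretFlow: injecting a `device` argument into `__init__`, and deriving the number of returned objects from a method's return annotation. The sketch below is a toy under stated assumptions. `toy_proxy` and `count_returns` are hypothetical names, no remote actor is created, and the rule "no annotation means zero returns" is an assumption rather than documented behaviour.

```python
import inspect
import typing
from typing import Tuple


def count_returns(method) -> int:
    """Infer how many objects a method returns from its return annotation."""
    annotation = inspect.signature(method).return_annotation
    if annotation is inspect.Signature.empty or annotation is None:
        return 0  # assumption: unannotated methods return nothing
    if typing.get_origin(annotation) is tuple:
        return len(typing.get_args(annotation))
    return 1


def toy_proxy(cls):
    """Class decorator: add a required keyword-only device argument to __init__."""
    original_init = cls.__init__

    def __init__(self, *args, device, **kwargs):
        self.device = device
        original_init(self, *args, **kwargs)

    cls.__init__ = __init__
    return cls


@toy_proxy
class Model:
    def __init__(self, weights):
        self.weights = weights

    def build_dataset(self, x, y):
        self.dataset = (x, y)

    def get_weights(self) -> list:
        return self.weights

    def train_step(self, step) -> Tuple[list, int]:
        return self.weights, 100


model = Model([0.0, 1.0], device="alice")
```

With these definitions, `train_step` is counted as returning two objects and `get_weights` one, which matches how the documented example unpacks `w, n = model.train_step(10)`.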

secretflow.reveal(func_or_object)[source]#

Get plaintext data from device.

NOTE: Use this function with extreme caution, as it may cause privacy leaks. In SecretFlow, we recommend that data flow between devices and rarely be revealed to the driver. Only use this function when data-dependent control flow occurs.

Parameters

func_or_object – May be a callable or any Python object that contains device objects.
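The phrase "any Python object that contains device objects" suggests a recursive walk over nested containers. The following toy shows that walk with a stand-in class. `ToyDeviceObject` and `toy_reveal` are hypothetical, nothing is fetched from a remote device, and the callable case of `func_or_object` is not covered.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToyDeviceObject:
    """Stand-in for a device object; value plays the role of the remote data."""

    party: str
    value: Any


def toy_reveal(obj: Any) -> Any:
    """Replace every ToyDeviceObject inside a nested structure by its value."""
    if isinstance(obj, ToyDeviceObject):
        return obj.value
    if isinstance(obj, dict):
        return {k: toy_reveal(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(toy_reveal(v) for v in obj)
    # Plain Python values pass through unchanged.
    return obj


nested = {
    "w": ToyDeviceObject("alice", [1, 2]),
    "n": (ToyDeviceObject("bob", 3), 4),
}
plain = toy_reveal(nested)
```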

secretflow.shutdown()[source]#

Disconnect the worker, and terminate processes started by secretflow.init().

This will automatically run at the end when a Python process that uses Ray exits. It is ok to run this twice in a row. The primary use case for this function is to cleanup state between tests.

secretflow.to(device: Device, data: Any, spu_vis: str = 'secret')[source]#

Device object conversion.

Parameters
  • device (Device) – Target device.

  • data (Any) – DeviceObject or plaintext data.

  • spu_vis (str) – Device object visibility, SPU device only. secret: Secret sharing with protocol spdz-2k, aby3, etc. public: Public sharing, which means data will be replicated to each node.

Returns

Target device object.

Return type

DeviceObject
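The difference between the two `spu_vis` values can be illustrated with a toy. The sketch below uses plain additive sharing over the ring of integers modulo 2^64, which is a simplification and not the spdz-2k or aby3 protocol. `share_secret`, `share_public`, and `reconstruct` are hypothetical names.

```python
import secrets
from typing import List

RING = 2 ** 64  # assumption: a 64-bit ring, chosen only for illustration


def share_secret(value: int, n_parties: int) -> List[int]:
    """Split value into n additive shares; any n-1 shares look random."""
    shares = [secrets.randbelow(RING) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % RING)
    return shares


def share_public(value: int, n_parties: int) -> List[int]:
    """Public visibility: every party holds the value itself."""
    return [value] * n_parties


def reconstruct(shares: List[int]) -> int:
    """Recover the value by summing all additive shares modulo the ring size."""
    return sum(shares) % RING


secret_shares = share_secret(42, 3)
public_copies = share_public(42, 3)
recovered = reconstruct(secret_shares)
```

With `secret`, no single party can read the value from its own share. With `public`, each node holds the value directly, which is cheaper but offers no confidentiality.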

secretflow.wait(objects: Any)[source]#

Wait for device objects until all are ready or an error occurs.

Parameters

objects – A structure of device objects.