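Using a Custom DataBuilder in SecretFlow (TensorFlow)#
The following code is for demonstration only. Do not use it directly in production.
This tutorial shows how to load data with a custom DataBuilder and train a model in SecretFlow's multi-party secure environment. It uses an image classification task on the Flower dataset to walk through federated learning with a custom DataBuilder.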
Environment setup#
[1]:
%load_ext autoreload
%autoreload 2
[2]:
import secretflow as sf
# Check the version of your SecretFlow
print('The version of SecretFlow: {}'.format(sf.__version__))
# In case you have a running secretflow runtime already.
sf.shutdown()
sf.init(['alice', 'bob', 'charlie'], address="local", log_to_driver=False)
alice, bob, charlie = sf.PYU('alice'), sf.PYU('bob'), sf.PYU('charlie')
2023-04-17 15:18:33,602 INFO worker.py:1538 -- Started a local Ray instance.
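Interface overview#
FLModel in SecretFlow supports reading data through a custom DataBuilder, which lets users handle data input more flexibly. The example below shows how to train a federated model with a custom DataBuilder.
Steps for using a DataBuilder:
1. Develop with a single-machine engine (TensorFlow or PyTorch) and obtain a builder function for the Dataset.
2. Wrap each party's builder function to obtain create_dataset_builder. Note: the dataset_builder function must accept a stage parameter. Then build data_builder_dict as a mapping {PYU: dataset_builder}.
3. Pass the resulting data_builder_dict to the dataset_builder argument of fit. In the x position, pass the input that dataset_builder needs (in this example, the path of the images to use).
Using a DataBuilder in FLModel requires a predefined data_builder_dict. Each builder must return a tf.data.Dataset and steps_per_epoch, and the steps_per_epoch returned by all parties must be identical.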
data_builder_dict = {
    alice: create_alice_dataset_builder(
        batch_size=32,
    ),  # the builder returned by create_alice_dataset_builder must return (Dataset, steps_per_epoch)
    bob: create_bob_dataset_builder(
        batch_size=32,
    ),  # the builder returned by create_bob_dataset_builder must return (Dataset, steps_per_epoch)
}
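The requirement that every party report the same steps_per_epoch can be checked before training. The following is a minimal sketch in plain Python, not part of the SecretFlow API: the helper names `steps_per_epoch` and `check_steps_consistent`, the party names used as dict keys, and the sample counts are all hypothetical.

```python
import math


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of batches needed to see num_samples once (the last batch may be partial)."""
    if num_samples <= 0 or batch_size <= 0:
        raise ValueError("num_samples and batch_size must be positive")
    return math.ceil(num_samples / batch_size)


def check_steps_consistent(steps_by_party: dict) -> int:
    """Return the shared steps_per_epoch, or raise if the parties disagree."""
    distinct = set(steps_by_party.values())
    if len(distinct) != 1:
        raise ValueError(f"steps_per_epoch differs across parties: {steps_by_party}")
    return distinct.pop()


# Hypothetical sample counts per party (assumption, not taken from the tutorial).
samples = {"alice": 1000, "bob": 1000}
steps = {party: steps_per_epoch(n, batch_size=32) for party, n in samples.items()}
print(check_steps_consistent(steps))
```

If the parties hold different numbers of samples, one option is to agree on a common steps_per_epoch (for example the minimum across parties) and return that value from every builder.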
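Download the data#
About the Flower dataset: it contains 3,670 color images of five kinds of flowers (daisy, dandelion, rose, sunflower, tulip). Each kind is photographed from several angles and under different lighting, and image sizes vary. The dataset is commonly used to train and test image classification and machine learning algorithms. The number of images per class is: daisy (633), dandelion (898), rose (641), sunflower (699), tulip (799).
Download URL: http://download.tensorflow.org/example_images/flower_photos.tgz

Download and extract the data#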
[3]:
import tempfile
import tensorflow as tf
_temp_dir = tempfile.mkdtemp()
path_to_flower_dataset = tf.keras.utils.get_file(
    "flower_photos",
    "https://secretflow-data.oss-accelerate.aliyuncs.com/datasets/tf_flowers/flower_photos.tgz",
    untar=True,
    cache_dir=_temp_dir,
)
Downloading data from https://secretflow-data.oss-accelerate.aliyuncs.com/datasets/tf_flowers/flower_photos.tgz
67588319/67588319 [==============================] - 1s 0us/step
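Next, we build the custom DataBuilder.
1. Develop the DataBuilder with a single-machine engine#
When developing the DataBuilder, you can follow ordinary single-machine development logic. The only goal is to build a tf.data.Dataset object.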
[4]:
import math
import tensorflow as tf
img_height = 180
img_width = 180
batch_size = 32
# In this example, we use the TensorFlow interface for development.
data_set = tf.keras.utils.image_dataset_from_directory(
    path_to_flower_dataset,
    validation_split=0.2,
    subset="both",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size,
)
Found 436 files belonging to 5 classes.
Using 349 files for training.
Using 87 files for validation.
2023-04-10 13:16:34.492390: E tensorflow/stream_executor/cuda/cuda_driver.cc:265] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
[5]:
train_set = data_set[0]
test_set = data_set[1]
[6]:
print(type(train_set), type(test_set))
<class 'tensorflow.python.data.ops.dataset_ops.BatchDataset'> <class 'tensorflow.python.data.ops.dataset_ops.BatchDataset'>
[7]:
x, y = next(iter(train_set))
print(f"x.shape = {x.shape}")
print(f"y.shape = {y.shape}")
x.shape = (32, 180, 180, 3)
y.shape = (32,)
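2. Wrap the finished DataBuilder#
At run time, the DataBuilder is distributed to each party's machine and executed there. To make it serializable, we need to wrap it. Note that FLModel requires the DataBuilder to return two results: (data_set, steps_per_epoch).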
[8]:
def create_dataset_builder(
    batch_size=32,
):
    def dataset_builder(folder_path, stage="train"):
        # Import inside the function so the builder is self-contained when it runs on each party.
        import math

        import tensorflow as tf

        img_height = 180
        img_width = 180
        data_set = tf.keras.utils.image_dataset_from_directory(
            folder_path,
            validation_split=0.2,
            subset="both",
            seed=123,
            image_size=(img_height, img_width),
            batch_size=batch_size,
        )
        if stage == "train":
            train_dataset = data_set[0]
            train_step_per_epoch = math.ceil(len(data_set[0].file_paths) / batch_size)
            return train_dataset, train_step_per_epoch
        elif stage == "eval":
            eval_dataset = data_set[1]
            eval_step_per_epoch = math.ceil(len(data_set[1].file_paths) / batch_size)
            return eval_dataset, eval_step_per_epoch
        else:
            # Fail loudly instead of silently returning None for an unknown stage.
            raise ValueError(f"Unsupported stage: {stage!r}")

    return dataset_builder
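Before handing a builder to FLModel, it can help to dry-run it locally and confirm that it returns the (data_set, steps_per_epoch) pair for each stage. The sketch below is only an illustration: `validate_builder_output` and `fake_dataset_builder` are hypothetical names, and the fake builder returns a plain list in place of a tf.data.Dataset so the example runs without TensorFlow.

```python
def validate_builder_output(result):
    """Check that a builder returned a (dataset, steps_per_epoch) pair."""
    if not isinstance(result, tuple) or len(result) != 2:
        raise TypeError("builder must return a (dataset, steps_per_epoch) tuple")
    dataset, steps = result
    if not isinstance(steps, int) or steps <= 0:
        raise ValueError(f"steps_per_epoch must be a positive int, got {steps!r}")
    return dataset, steps


def fake_dataset_builder(folder_path, stage="train"):
    """Stand-in builder: a list of batches plays the role of the dataset."""
    batches = [[0, 1, 2, 3] for _ in range(3)]  # pretend there are 3 batches
    if stage == "train":
        return batches, len(batches)
    if stage == "eval":
        return batches[:1], 1
    raise ValueError(f"Unsupported stage: {stage!r}")


for stage in ("train", "eval"):
    _, steps = validate_builder_output(fake_dataset_builder("unused/path", stage=stage))
    print(stage, steps)
```

The same check could be applied to the real dataset_builder by calling it with a local folder path for both stages.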
3. Build dataset_builder_dict#
In the horizontal scenario, every party processes its data with the same logic, so a single wrapped DataBuilder constructor is enough. Next, we build dataset_builder_dict.
[9]:
data_builder_dict = {
    alice: create_dataset_builder(
        batch_size=32,
    ),
    bob: create_dataset_builder(
        batch_size=32,
    ),
}
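Because all parties share one constructor in the horizontal scenario, the dict can also be produced with a comprehension when there are many parties. This is a small sketch under stated assumptions: `build_builder_dict` and `placeholder_factory` are hypothetical helpers, and strings stand in for the PYU devices so the snippet runs on its own.

```python
def build_builder_dict(parties, builder_factory, **factory_kwargs):
    """Create one builder per party by calling the same factory with the same arguments."""
    return {party: builder_factory(**factory_kwargs) for party in parties}


def placeholder_factory(batch_size=32):
    """Stand-in for the wrapped builder constructor (assumption for this sketch)."""

    def dataset_builder(folder_path, stage="train"):
        return [], 1  # placeholder (dataset, steps_per_epoch)

    return dataset_builder


parties = ["alice", "bob"]  # stand-ins for the PYU devices
builders = build_builder_dict(parties, placeholder_factory, batch_size=32)
print(sorted(builders))
```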
4. Pass dataset_builder_dict to the model#
Next, we define the model and train it with the custom data built above.
[10]:
def create_conv_flower_model(input_shape, num_classes, name='model'):
    def create_model():
        # Import inside the function so the model definition is self-contained on each party.
        import tensorflow as tf
        from tensorflow import keras

        # Create model
        model = keras.Sequential(
            [
                keras.Input(shape=input_shape),
                tf.keras.layers.Rescaling(1.0 / 255),
                tf.keras.layers.Conv2D(32, 3, activation='relu'),
                tf.keras.layers.MaxPooling2D(),
                tf.keras.layers.Conv2D(32, 3, activation='relu'),
                tf.keras.layers.MaxPooling2D(),
                tf.keras.layers.Conv2D(32, 3, activation='relu'),
                tf.keras.layers.MaxPooling2D(),
                tf.keras.layers.Flatten(),
                tf.keras.layers.Dense(128, activation='relu'),
                tf.keras.layers.Dense(num_classes),
            ]
        )
        # Compile model
        model.compile(
            loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
            optimizer='adam',
            metrics=["accuracy"],
        )
        return model

    return create_model
[11]:
from secretflow.ml.nn import FLModel
from secretflow.security.aggregation import SecureAggregator
[12]:
device_list = [alice, bob]
aggregator = SecureAggregator(charlie, [alice, bob])
# prepare model
num_classes = 5
input_shape = (180, 180, 3)
# keras model
model = create_conv_flower_model(input_shape, num_classes)
fed_model = FLModel(
    device_list=device_list,
    model=model,
    aggregator=aggregator,
    backend="tensorflow",
    strategy="fed_avg_w",
    random_seed=1234,
)
INFO:root:Create proxy actor <class 'secretflow.security.aggregation.secure_aggregator._Masker'> with party alice.
INFO:root:Create proxy actor <class 'secretflow.security.aggregation.secure_aggregator._Masker'> with party bob.
INFO:root:Create proxy actor <class 'secretflow.ml.nn.fl.backend.tensorflow.strategy.fed_avg_w.PYUFedAvgW'> with party alice.
INFO:root:Create proxy actor <class 'secretflow.ml.nn.fl.backend.tensorflow.strategy.fed_avg_w.PYUFedAvgW'> with party bob.
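The dataset builder we constructed takes the path of an image dataset as input, so the input data must be provided as a dict that maps each party to its folder path.
data = {
    alice: folder_path_of_alice,
    bob: folder_path_of_bob,
}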
[13]:
data = {
    alice: path_to_flower_dataset,
    bob: path_to_flower_dataset,
}
history = fed_model.fit(
    data,
    None,
    validation_data=data,
    epochs=5,
    batch_size=32,
    aggregate_freq=2,
    sampler_method="batch",
    random_seed=1234,
    dp_spent_step_freq=1,
    dataset_builder=data_builder_dict,
)
INFO:root:FL Train Params: {'self': <secretflow.ml.nn.fl.fl_model.FLModel object at 0x7f7b7a28b8e0>, 'x': {alice: '../../public_dataset/datasets/flower_photos', bob: '../../public_dataset/datasets/flower_photos'}, 'y': None, 'batch_size': 32, 'batch_sampling_rate': None, 'epochs': 5, 'verbose': 1, 'callbacks': None, 'validation_data': {alice: '../../public_dataset/datasets/flower_photos', bob: '../../public_dataset/datasets/flower_photos'}, 'shuffle': False, 'class_weight': None, 'sample_weight': None, 'validation_freq': 1, 'aggregate_freq': 2, 'label_decoder': None, 'max_batch_size': 20000, 'prefetch_buffer_size': None, 'sampler_method': 'batch', 'random_seed': 1234, 'dp_spent_step_freq': 1, 'audit_log_dir': None, 'dataset_builder': {alice: <function create_dataset_builder.<locals>.dataset_builder at 0x7f7b7a2bb1f0>, bob: <function create_dataset_builder.<locals>.dataset_builder at 0x7f7b7a2bb0d0>}}
32it [00:18, 1.71it/s, epoch: 1/5 - loss:1.5339548587799072 accuracy:0.3142559826374054 val_loss:1.582740068435669 val_accuracy:0.2874999940395355 ]
100%|██████████| 8/8 [00:05<00:00, 1.51it/s, epoch: 2/5 - loss:1.4520319700241089 accuracy:0.36693549156188965 val_loss:1.3319271802902222 val_accuracy:0.40416666865348816 ]
100%|██████████| 8/8 [00:05<00:00, 1.54it/s, epoch: 3/5 - loss:1.2720597982406616 accuracy:0.45766130089759827 val_loss:1.3382091522216797 val_accuracy:0.47083333134651184 ]
100%|██████████| 8/8 [00:05<00:00, 1.50it/s, epoch: 4/5 - loss:1.229131817817688 accuracy:0.5040322542190552 val_loss:1.3033963441848755 val_accuracy:0.4375 ]
100%|██████████| 8/8 [00:05<00:00, 1.59it/s, epoch: 5/5 - loss:1.3306885957717896 accuracy:0.4301075339317322 val_loss:2.1492652893066406 val_accuracy:0.25833332538604736 ]
Next, you can try this with your own dataset.
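When preparing your own image folder, note that tf.keras.utils.image_dataset_from_directory infers labels from one sub-folder per class. The sketch below is one possible way to inspect such a folder before training. It assumes a flat layout (images directly inside each class folder); `count_images_per_class` and the demo folder names are hypothetical.

```python
import tempfile
from pathlib import Path

# Image formats accepted by image_dataset_from_directory.
IMAGE_SUFFIXES = {".jpeg", ".jpg", ".png", ".bmp", ".gif"}


def count_images_per_class(root):
    """Map each class sub-folder name to the number of image files directly inside it."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


# Demo on a throwaway directory filled with empty placeholder files.
with tempfile.TemporaryDirectory() as demo_root:
    for class_name, n in {"daisy": 3, "rose": 2}.items():
        class_dir = Path(demo_root) / class_name
        class_dir.mkdir()
        for i in range(n):
            (class_dir / f"img_{i}.jpg").touch()
    print(count_images_per_class(demo_root))
```

A class with a count of zero, or very different counts across parties, is worth checking before training, since every party must report the same steps_per_epoch.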