Problem description
I tried to use TensorFlow Privacy with TFF on my own dataset, following the two examples provided here. Before adding the DP process with clipping and noise, I made sure the samples and labels were formatted correctly, and everything worked. Unfortunately, in every run that uses DP the model diverges instead of converging: both the training and validation loss increase every round.
Round 0,68.89s per round in average.
Train: loss=5.172,accuracy=0.222
Validation: loss=6.181,accuracy=0.002
Round 1,61.52s per round in average.
Train: loss=4.087,accuracy=0.328
Validation: loss=6.747,accuracy=0.002
Round 2,57.98s per round in average.
Train: loss=4.659,accuracy=0.227
Validation: loss=7.475,accuracy=0.002
Round 3,56.62s per round in average.
Train: loss=5.354,accuracy=0.198
Validation: loss=8.409,accuracy=0.002
Updating the best state...
Round 4,55.25s per round in average.
Train: loss=6.181,accuracy=0.172
Validation: loss=9.330,accuracy=0.004
Round 5,54.36s per round in average.
Train: loss=7.739,accuracy=0.095
Validation: loss=10.311,accuracy=0.006
Round 6,53.83s per round in average.
Train: loss=9.188,accuracy=0.037
Validation: loss=11.243,accuracy=0.006
Round 7,53.63s per round in average.
Train: loss=9.581,accuracy=0.080
Validation: loss=12.214,accuracy=0.009
I tried different combinations of clip and noise_multiplier, without any success. Here is an example:
'clients_per_round': 20,
'client_epochs_per_round': 2,
'uniform_weighting': True,
'server_optimizer': 'adam',
'client_optimizer': 'adam',
'clip': 0.05,  # L2 norm
'noise_multiplier': 1.0,
'adaptive_clip_learning_rate': 0,
'target_unclipped_quantile': 0.5,
'clipped_count_budget_allocation': 0.1,
'per_vector_clipping': False,
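A back-of-the-envelope check (my own sketch, not part of the question) shows why a configuration like this can diverge. Assuming the usual Gaussian average query semantics, where noise with stddev clip * noise_multiplier is added to the sum of clipped updates before dividing by clients_per_round, the noise on the averaged update can dwarf the signal. The model size d below is a hypothetical value used only for illustration:

```python
import math

# Flag values from the question.
clip = 0.05
noise_multiplier = 1.0
clients_per_round = 20

# Per-coordinate stddev of the noise on the averaged model delta
# (noise with stddev clip * noise_multiplier on the sum, divided by
# clients_per_round under uniform weighting).
noise_std = clip * noise_multiplier / clients_per_round  # 0.0025

# Each clipped client update has L2 norm at most `clip`, so the
# averaged signal also has norm at most `clip`. The noise vector has
# expected L2 norm around noise_std * sqrt(d) for d model parameters.
d = 1_000_000  # hypothetical model size
noise_norm = noise_std * math.sqrt(d)  # 2.5

print(f"signal norm  <= {clip}")
print(f"noise norm   ~  {noise_norm:.2f}")
print(f"noise/signal >= {noise_norm / clip:.0f}x")
```

With these flags the noise is roughly 50 times larger than the largest possible averaged update, which matches the steadily increasing losses in the log above.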
Any idea what the problem could be? With 'noise_multiplier': False, everything works fine. The definition of the DP query and the averaging process are essentially the same as in the examples:
dp_query = tff.utils.build_dp_query(
    clip=FLAGS.clip,
    noise_multiplier=FLAGS.noise_multiplier,
    expected_total_weight=FLAGS.clients_per_round,
    adaptive_clip_learning_rate=FLAGS.adaptive_clip_learning_rate,
    target_unclipped_quantile=FLAGS.target_unclipped_quantile,
    clipped_count_budget_allocation=FLAGS.clipped_count_budget_allocation,
    expected_clients_per_round=FLAGS.clients_per_round,
    per_vector_clipping=FLAGS.per_vector_clipping,
    model=model_fn())

weights_type = tff.learning.framework.weights_type_from_model(model_fn)

aggregation_process = tff.utils.build_dp_aggregate_process(
    weights_type.trainable, dp_query)
Thanks!
Solution
Your noise_multiplier is too high for your number of clients_per_round. Following the approach of "Learning Differentially Private Language Models", you should first find the largest noise_multiplier with which training still works well, then scale up noise_multiplier and clients_per_round proportionally to train the final model with good privacy.
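The scaling recipe can be sketched numerically. The noise_multiplier values below are hypothetical, and I again assume the noise stddev on the averaged update is clip * noise_multiplier / clients_per_round: multiplying noise_multiplier and clients_per_round by the same factor leaves that stddev, and hence the training dynamics, unchanged, while the larger noise_multiplier yields a stronger privacy guarantee.

```python
def avg_noise_std(clip, noise_multiplier, clients_per_round):
    """Per-coordinate noise stddev on the averaged model update."""
    return clip * noise_multiplier / clients_per_round

# Step 1: find the largest noise_multiplier that still trains well
# at a small cohort size (0.3 here is a hypothetical example).
base = avg_noise_std(clip=0.05, noise_multiplier=0.3, clients_per_round=20)

# Step 2: scale noise_multiplier and clients_per_round together
# (10x here) for the final, more private training run.
scaled = avg_noise_std(clip=0.05, noise_multiplier=3.0, clients_per_round=200)

# The effective noise seen by the optimizer is identical.
print(base, scaled)  # same value
```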