
[Paper Reading] Pisces: Efficient Federated Learning via Guided Asynchronous Training

论文网址:Pisces | Proceedings of the 13th Symposium on Cloud Computing

论文代码:GitHub - SamuelGong/Pisces: [ACM SoCC'22] Pisces: Efficient Federated Learning via Guided Asynchronous Training

The English is typed entirely by hand! It is a summary and paraphrase of the original paper, so unavoidable spelling and grammar mistakes may appear; if you spot any, corrections in the comments are welcome! This post is closer to study notes, so read with that in mind.

目录

1. Takeaways

2. Section-by-Section Close Reading

2.1. Abstract

2.2. Introduction

2.3. Background and motivation

2.3.1. Synchronous federated learning

2.3.2. Participant selection

2.4. Pisces overview

2.5. Participant selection

2.5.1. Controlled concurrency

2.5.2. Utility profiling

2.6. Model aggregation

2.6.1. Bounded staleness

2.6.2. Adaptive pace control

2.7. Theoretical analysis

2.7.1. Convergence guarantee

2.7.2. Performance analysis

2.8. Implementation

2.9. Evaluation

2.9.1. Methodology

2.9.2. End-to-end performance

2.9.3. Sensitivity analysis

2.10. Discussion and future work

2.11. Related work

2.12. Conclusion

3. Supplementary Knowledge

3.1. Oort

3.2. FedBuff

4. Reference


1. Takeaways

(1) At least at first glance... this paper... is much friendlier to read than some other foundational papers...

(2) Is this really CCF-B??? I think it is quite well written, or did I look it up wrong?

2. Section-by-Section Close Reading

2.1. Abstract

        ①Slow clients can slow down the overall server updating speed

stale  adj. not fresh; musty; (of air) stuffy; outdated; bored or tired of; (of a smell) unpleasant; lacking novelty    n. urine (of horses, cattle, camels)    v. to make or become stale; to lose flavor

2.2. Introduction

        ①An extreme scenario: clients with little data transmit extremely fast, while data-rich clients transmit very slowly

        ②Pisces prioritizes clients with high data quality, estimated from their training loss

        ③Platform: Plato

orchestrate  vt. to arrange or compose for orchestra; to organize or plan carefully; to mastermind

2.3. Background and motivation

2.3.1. Synchronous federated learning

        ①In both synchronous and asynchronous FL, the participating subset of clients may be chosen at random:

        ②Solutions to the synchronization (time-cost) bottleneck: a) periodic aggregation, which sets a deadline for client updates; b) over-selection, which invites more clients than needed and keeps only the fastest. However, both discard the rich information held by slow clients

straggler  n. one who falls behind; a laggard; the last person (or animal) in a group

2.3.2. Participant selection

        ①Goal: balance time and data quality

        ②Related work on this trade-off: FedCS, TiFL, Oort, and AutoFL

        ③The score calculated in Oort:

U_{i}^{Oort}=\underbrace{|B_{i}|\sqrt{\frac{1}{|B_{i}|}\sum_{k\in B_{i}}Loss(k)^{2}}}_{Data \, quality}\times\underbrace{(\frac{T}{t_{i}})^{\mathbf{1}(T<t_{i})\times\alpha}}_{System \, speed}

where the second term penalizes slow clients with exponent \alpha > 0: the indicator \mathbf{1}(T<t_{i}) activates the penalty only when client i's completion time t_i exceeds the preferred round duration T
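To make the scoring concrete, here is a minimal Python sketch of this formula (an illustrative helper of my own, not Oort's actual code); `losses` stands for the per-sample losses over client i's local samples B_i, `t_i` for the client's estimated completion time, and `T` for the preferred round duration:

```python
import math

def oort_utility(losses, t_i, T, alpha=2.0):
    """Illustrative sketch of the Oort utility score (not the official implementation).

    losses: per-sample training losses over client i's local samples B_i
    t_i:    client i's estimated completion time
    T:      developer-preferred round duration
    alpha:  speed penalty exponent (> 0)
    """
    # Data quality term: |B_i| * sqrt(mean of squared losses)
    data_quality = len(losses) * math.sqrt(sum(l ** 2 for l in losses) / len(losses))
    # System speed term: penalize only clients slower than the deadline T
    speed = (T / t_i) ** alpha if T < t_i else 1.0
    return data_quality * speed
```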

        ④The authors ran experiments and found that this penalty is too heavy: slow clients are selected far less often than fast ones

        ⑤Time to reach 95% accuracy under different penalty coefficients \alpha:

2.4. Pisces overview

        ①The workflow of Pisces:

        ②Coordinator’s control loop algorithm:

2.5. Participant selection

2.5.1. Controlled concurrency

        ①Time to reach the target accuracy, and the cumulative number of client updates, under different maximum concurrency levels:

In short, the authors argue that the maximum concurrency must be capped: too low a limit means very few model updates and a long time to reach good accuracy, while too high a limit causes overly frequent updates and makes the model harder to train well.
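A toy sketch of what such a concurrency cap could look like in the coordinator (the helper and data structures here are my own assumptions, not Pisces's implementation):

```python
def select_new_clients(running, idle_clients, utilities, C=20):
    """Toy sketch: launch new clients only while fewer than C are training concurrently.

    running:      ids of clients currently training
    idle_clients: ids of clients available for selection
    utilities:    client id -> utility score
    C:            maximum concurrency
    """
    launched = []
    # Greedily pick the highest-utility idle clients until the cap is reached.
    for cid in sorted(idle_clients, key=lambda c: utilities.get(c, 0.0), reverse=True):
        if len(running) + len(launched) >= C:
            break
        launched.append(cid)
    return launched
```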

2.5.2. Utility profiling

        ①The authors argue that strictly penalizing slow clients does more harm than good

        ②Stale updates may be of limited use even when they come from high-quality data

        ③The utility score used by Pisces:

U_{i}^{Pisces}=\underbrace{|B_{i}|\sqrt{\frac{1}{|B_{i}|}\sum_{k\in B_{i}}Loss(k)^{2}}}_{Data \, quality}\times\underbrace{\frac{1}{(\tilde{\tau_{i}}+1)^{\beta}}}_{Staleness}

where \tilde{\tau}_i \geq 0 denotes the estimated staleness of client i's updates and \beta > 0 is the staleness penalty factor

        ④The authors blacklist clients whose reported losses are repeatedly flagged as outliers: each client starts with r reliability credits. Over the recent window of global models \left \{ w_{t-k},w_{t-k+1},...,w_t \right \}, k> 0, DBSCAN clusters clients by their associated losses; each time a client is flagged as an outlier it loses one credit, and once its credits are exhausted the client is removed from the selection pool (though I'm not sure how exactly an outlier is determined here?)
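A rough sketch of how such credit-based blacklisting could be wired up with scikit-learn's DBSCAN (parameter values and helper names are illustrative assumptions, not taken from the paper):

```python
import numpy as np
from sklearn.cluster import DBSCAN

def update_reliability_credits(recent_losses, credits, eps=0.5, min_samples=3):
    """Sketch of credit-based blacklisting; eps/min_samples are illustrative.

    recent_losses: client id -> recent average training loss
    credits:       client id -> remaining reliability credits (initialized to r)
    Returns the set of clients whose credits are exhausted.
    """
    ids = list(recent_losses)
    X = np.array([[recent_losses[c]] for c in ids])
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    blacklisted = set()
    for cid, label in zip(ids, labels):
        if label == -1:          # DBSCAN marks noise points (outliers) with label -1
            credits[cid] -= 1    # one detected outlier costs one credit
        if credits[cid] <= 0:
            blacklisted.add(cid)
    return blacklisted
```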

        ⑤The staleness estimate, averaged over the last k aggregation steps:

\tilde{\tau_i}=\frac{1}{k}\sum_{j=t-k+1}^t\tau_{i,j}
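Combining the utility formula in ③ with this staleness estimate, a minimal sketch (function names are mine, not from the Pisces codebase):

```python
import math

def estimated_staleness(recent_staleness):
    """Average staleness over the client's last k observed aggregation steps (formula ⑤)."""
    return sum(recent_staleness) / len(recent_staleness)

def pisces_utility(losses, recent_staleness, beta=0.5):
    """Illustrative sketch of the Pisces utility score (formula ③).

    losses:           per-sample losses over client i's local samples B_i
    recent_staleness: client i's staleness values over the last k aggregations
    beta:             staleness penalty factor (> 0)
    """
    data_quality = len(losses) * math.sqrt(sum(l ** 2 for l in losses) / len(losses))
    tau = estimated_staleness(recent_staleness)
    # Discount data quality by estimated staleness rather than by raw speed.
    return data_quality / (tau + 1) ** beta
```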

        ⑥The staleness figure:

2.6. Model aggregation

2.6.1. Bounded staleness

        ①Buffered aggregation (BA) and Pisces's methods:

where FedBuff aggregates once K buffered updates (the aggregation goal) have arrived, but this ignores the slowest clients

2.6.2. Adaptive pace control

        ①Unlike FedBuff, Pisces takes the most delayed client into account (shown with grey shading in the figure) and adjusts the aggregation interval accordingly

        ②Between any two consecutive aggregations, all running clients observe the same staleness bound

        ③How aggregation is triggered in the control loop:

at the beginning time T_l of loop step l, Pisces obtains the profiled latency L_{max} of the slowest running client. It then defines the aggregation interval I=\frac{L_{max}}{b} (with staleness bound b) and computes the elapsed time e=T_{l}-t_{\tau_{j-1}} since the last aggregation. Pisces aggregates only when e> I. (I think this design is quite neat, if it is indeed proposed here for the first time.)
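A minimal sketch of this pace-control check, assuming the coordinator tracks the profiled latency of the slowest running client (names are illustrative, not from the Pisces code):

```python
def should_aggregate(now, last_aggregation_time, slowest_latency, b):
    """Sketch of the pace-control check in the coordinator's control loop.

    now:                   current loop time T_l
    last_aggregation_time: wall-clock time of the previous aggregation
    slowest_latency:       profiled latency L_max of the slowest running client
    b:                     staleness bound
    """
    interval = slowest_latency / b          # I = L_max / b
    elapsed = now - last_aggregation_time   # e = T_l - (time of last aggregation)
    return elapsed > interval               # aggregate only when e > I
```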

2.7. Theoretical analysis

2.7.1. Convergence guarantee

(1)Notation

T_l: the start time of loop step l
I_l: aggregation interval at step l
n: number of clients
b: staleness bound
\nabla F_i(w): gradient of client i's local loss F_i with respect to the model w
f(w)=\frac{1}{n}\sum_{i=1}^{n}p_{i}F_{i}(w): global learning objective, with p_i > 0 weighting each client's contribution
f^*: the optimum of f(w)
g_i(w;\xi_i): stochastic gradient on client i with randomness \xi_i
Q: number of local SGD steps

(2)Why does Pisces achieve bounded staleness? 

(3)What is the convergence guarantee? 

        ①...Please refer to the original paper; I will not reproduce the proof here

2.7.2. Performance analysis

        ①Computation cost: O(n^2), mostly taken by DBSCAN

        ②Communication cost: O(n), collected from n clients

        ③Storage cost: O(n^2), again due to DBSCAN (so everything comes from DBSCAN? why not switch to a different clustering algorithm?)

2.8. Implementation

        ①Platform: Plato

2.9. Evaluation

2.9.1. Methodology

(1)Datasets and models

        ①Image classification datasets: CIFAR-10 with 60k color images, MNIST with 60k grayscale images in 10 classes, and FEMNIST with 805k grayscale images in 63 classes

        ②Model on images: LeNet5

        ③Language modeling dataset: StackOverflow, with 0.3 million clients

        ④Model for the text task: ALBERT

(2)Experimental Setup

        ①Simulated latency: based on Zipf distribution with a=1.2
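For intuition, client latencies with this kind of skew could be simulated as follows (an assumption about the setup, not necessarily how the paper's artifact does it), using NumPy's Zipf sampler:

```python
import numpy as np

rng = np.random.default_rng(0)
# Draw per-client slowdown factors from a Zipf distribution with a = 1.2,
# then scale by a base round latency (the 10-second base is illustrative).
slowdown = rng.zipf(a=1.2, size=100)
latencies = 10.0 * slowdown  # simulated per-client round latency in seconds
```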

(3)Hyperparameters

        ①Concurrency limit C=20

        ②Staleness penalty factor \beta = 0.5

        ③Optimizer: SGD with momentum 0.9, except Adam for StackOverflow

        ④Other hyperparameters:

(4)Baseline Algorithms

        ①Compared baselines: Oort and FedBuff

(5)Metrics

        ①Evaluation metrics: time-to-accuracy, number of model aggregations, and network footprint

2.9.2. End-to-end performance

        ①Time-to-accuracy comparison table:

        ②Number of model aggregations:

        ③Performance across dataset scales:

        ④Convergence behavior curves:

        ⑤Optimizations on participant selection are effective:

        ⑥Optimizations on aggregation adapt to various settings:

        ⑦Competitive network overhead (total network footprint in GB):

2.9.3. Sensitivity analysis

        ①Impact of participation scale:

        ②Impact of training loss outliers:

they manually invert all labels on a subset of clients, which produces abnormally high losses; the corruption rate in (a) is set to 5%

        ③Impact of staleness penalty factor:

        ④Impact of optimizers:

        ⑤Impact of model architectures:

2.10. Discussion and future work

        ①Privacy compliance: client data might be leaked through the reported average training loss and local updates

        ②They plan to further introduce differential privacy (DP) to protect client data

2.11. Related work

        ①Related work includes regularized local-optimization methods such as FedAsync and ASO-Fed; aggregation-optimization methods such as HySync, Port, and FedBuff; and principled participant-selection methods such as FedAR and the work by Chen et al.

2.12. Conclusion

        Pisces is fast, robust, and flexible

3. Supplementary Knowledge

3.1. Oort

(1) Further reading 1: 【文献阅读】Oort:通过引导参与者选择实现高效联合学习_oort联邦学习-CSDN博客

(2) Further reading 2: 一种新型的联邦分布式架构(Oort)及未来研究方向 - 知乎

3.2. FedBuff

(1) Further reading 1: 【论文笔记 | 异步联邦】 FedBuff-CSDN博客

4. Reference

Jiang, Z. et al. (2022) 'Pisces: Efficient Federated Learning via Guided Asynchronous Training', in Proceedings of the 13th Symposium on Cloud Computing (ACM SoCC '22).

