Commit 23c48e8d authored by aurelien.bellet

cosmit

parent cc91426a
...@@ -5,6 +5,12 @@
year={2020}
}
@inproceedings{Lian2018,
author = {Xiangru Lian and Wei Zhang and Ce Zhang and Ji Liu},
booktitle = {ICML},
title = {{Asynchronous Decentralized Parallel Stochastic Gradient Descent}},
year = {2018}
}
@inproceedings{fedprox,
author = {Tian Li and Anit Kumar Sahu and Manzil Zaheer and Maziar Sanjabi and Ameet Talwalkar and Virginia Smith},
title = {{Federated Optimization in Heterogeneous Networks}},
...
...@@ -137,8 +137,11 @@ unbiased with respect to the class distribution.
We empirically evaluate our approach on the MNIST and CIFAR10 datasets using
logistic regression and deep convolutional models with up to 1000
participants. This is in contrast to most previous work on fully decentralized
algorithms, which considers only a few tens of participants
\cite{tang18a,more_refs} and thus falls short of giving a realistic view of
the performance of these algorithms in actual applications.
\aurelien{TODO: complete above paragraph with more details and highlighting
other contributions as needed}
...@@ -614,7 +617,8 @@ network, see for instance \cite{Duchi2012a,lian2017d-psgd,Nedic18}.
% papers using multiple averaging steps
% also our personalized papers
D2 \cite{tang18a}: numerically unstable when $W_{ij}$ rows and columns do not
exactly sum to $1$, as the small differences are amplified in a positive
feedback loop. More work is therefore required on the algorithm to make it
usable with a wider variety of topologies. In comparison, D-Cliques do not
modify the SGD algorithm and instead simply remove some neighbor contributions
that would otherwise bias the direction of the gradient. D-Cliques with D-PSGD
are therefore as tolerant to ill-conditioned $W_{ij}$ matrices as regular
D-PSGD in an IID setting.
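For reference, a minimal sketch of the kind of update step D-PSGD performs
(one common form; the notation is introduced here only for illustration:
$x_i^{(t)}$ is node $i$'s local model, $\gamma$ the step size, $\xi_i^{(t)}$ a
mini-batch drawn from node $i$'s local data, and $\nabla F_i$ the corresponding
stochastic gradient):
% Sketch for illustration only: one common form of the D-PSGD step, combining
% a local stochastic gradient update with neighborhood averaging weighted by
% the mixing matrix $W$.
\begin{equation*}
x_i^{(t+1)} \;=\; \sum_{j} W_{ij}\, x_j^{(t)} \;-\; \gamma\, \nabla F_i\big(x_i^{(t)}; \xi_i^{(t)}\big).
\end{equation*}
In these terms, D-Cliques keeps the update itself unchanged and only removes
some neighbor contributions from the averaging, which is why it is as tolerant
to an ill-conditioned $W$ as regular D-PSGD.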
An originality of our approach is to focus on the effect of topology
level without significantly changing the original simple and efficient D-SGD
...