\title{D-Cliques: Compensating NonIIDness in Decentralized Federated Learning
with Topology}
%
\titlerunning{D-Cliques}
% If the paper title is too long for the running head, you can set
...
...
\maketitle% typeset the header of the contribution
%
\begin{abstract}
The convergence speed of machine learning models trained with Federated
Learning is significantly affected by non-independent and identically
distributed (non-IID) data partitions, even more so in a fully decentralized
setting without a central server. In this paper, we show that the impact of
\textit{local class bias} can be significantly reduced by carefully designing
the underlying communication topology. We present D-Cliques, a novel topology
that reduces gradient bias by grouping nodes in interconnected cliques such
that the local joint distribution in a clique is representative of the global
class distribution. We also show how to adapt the updates of decentralized SGD
to obtain unbiased gradients and effective momentum with D-Cliques. Our
empirical evaluation on MNIST and CIFAR10 demonstrates that our approach
provides a convergence speed similar to that of a fully-connected topology
with a significant reduction in the number of edges and messages. In a
1000-node topology, D-Cliques requires 98\% fewer edges and 96\% fewer total
messages to achieve a similar accuracy, with further possible gains using a
small-world topology across cliques.