From 61ee6167a411f88c3d3db99295493c9e12083c65 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Aur=C3=A9lien?= <aurelien.bellet@inria.fr>
Date: Fri, 2 Apr 2021 10:42:02 +0200
Subject: [PATCH] abstract

---
 main.tex | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/main.tex b/main.tex
index e619c7e..83330db 100644
--- a/main.tex
+++ b/main.tex
@@ -27,7 +27,8 @@
 \begin{document}
 %
 %\title{D-Cliques: Topology can compensate NonIIDness in Decentralized Federated Learning}
-\title{D-Cliques: Compensating NonIIDness in Decentralized Federated Learning with topology}
+\title{D-Cliques: Compensating NonIIDness in Decentralized Federated Learning
+with Topology}
 %
 \titlerunning{D-Cliques}
 % If the paper title is too long for the running head, you can set
@@ -50,7 +51,21 @@
 \maketitle % typeset the header of the contribution
 %
 \begin{abstract}
-The convergence speed of machine learning models trained with Federated Learning is significantly affected by non-independent and identically distributed (non-IID) data partitions, even more so in a fully decentralized setting without a central server. In this paper, we show that the impact \textit{local class bias} can be significantly reduced by carefully designing the underlying communication topology. We present D-Cliques, a novel topology that reduces gradient bias by grouping nodes in cliques such that their local joint distribution is representative of the global class distribution. We refine D-Cliques with Clique Averaging and unbiased momentum, tested on MNIST and CIFAR10, and demonstrate that D-Cliques provide similar convergence speed as a fully-connected topology with a significant reduction in the number of required edges and messages. In a 1000-node topology, D-Cliques requires 98\% less edges and 96\% less total messages to achieve a similar accuracy, with further possible gains using a small-world topology.
+The convergence speed of machine learning models trained with Federated
+Learning is significantly affected by non-independent and identically
+distributed (non-IID) data partitions, even more so in a fully decentralized
+setting without a central server. In this paper, we show that the impact of
+\textit{local class bias} can be significantly reduced by carefully designing
+the underlying communication topology. We present D-Cliques, a novel topology
+that reduces gradient bias by grouping nodes in interconnected cliques such
+that the local joint distribution in a clique is representative of the global
+class distribution. We also show how to adapt the updates of decentralized SGD
+to obtain unbiased gradients and effective momentum with D-Cliques. Our
+empirical evaluation on MNIST and CIFAR10 demonstrates that our approach
+provides a convergence speed similar to that of a fully-connected topology,
+with a significant reduction in the number of edges and messages. In a
+1000-node topology, D-Cliques requires 98\% fewer edges and 96\% fewer total
+messages, with further possible gains using a small-world topology across cliques.
 \keywords{Decentralized Learning \and Federated Learning \and Topology \and
 Non-IID Data \and Stochastic Gradient Descent}
--
GitLab
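
Note (not part of the patch): the abstract's core construction is grouping nodes into interconnected cliques whose pooled label distribution is representative of the global class distribution. The following Python sketch illustrates that idea under stated assumptions; the function name `build_d_cliques`, the per-node label histograms, and the greedy L1-skew criterion are choices made here for illustration only and are not claimed to be the paper's exact algorithm.

```python
import numpy as np

def build_d_cliques(node_label_hists, clique_size, seed=0):
    """Illustrative sketch (not the paper's algorithm): greedily group nodes
    into cliques whose pooled label histogram stays close to the global
    class distribution.

    node_label_hists: (n_nodes, n_classes) array of per-node label counts.
    Returns a list of cliques, each a list of node indices.
    """
    hists = np.asarray(node_label_hists, dtype=float)
    global_dist = hists.sum(axis=0)
    global_dist /= global_dist.sum()

    rng = np.random.default_rng(seed)
    unassigned = list(rng.permutation(len(hists)))
    cliques = []
    while unassigned:
        clique = [unassigned.pop()]
        while len(clique) < clique_size and unassigned:
            pooled = hists[clique].sum(axis=0)

            def skew(candidate):
                # L1 distance between the clique's pooled class distribution
                # (if `candidate` joined) and the global class distribution.
                cand = pooled + hists[candidate]
                return np.abs(cand / cand.sum() - global_dist).sum()

            # Add the unassigned node that keeps the clique least skewed.
            best = min(unassigned, key=skew)
            unassigned.remove(best)
            clique.append(best)
        cliques.append(clique)
    return cliques
```

Since every clique built this way is fully connected internally and approximately covers the global class distribution, averaging gradients within a clique gives a less biased estimate than any single node's local gradient, which is the intuition behind the unbiased-gradient adaptation of decentralized SGD mentioned in the abstract.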