From e1059e5de1423562793889c852e3335b4d44159e Mon Sep 17 00:00:00 2001
From: Erick Lavoie <erick.lavoie@epfl.ch>
Date: Fri, 2 Apr 2021 21:37:01 +0200
Subject: [PATCH] Fixed typos

---
 main.tex | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/main.tex b/main.tex
index 76bf550..987f5b7 100644
--- a/main.tex
+++ b/main.tex
@@ -62,8 +62,7 @@ that reduces gradient bias by grouping nodes in interconnected cliques such
 that the local joint distribution in a clique is representative of the global
 class distribution. We also show how to adapt the updates of decentralized SGD
 to obtain unbiased gradients and implement an effective momentum with
-D-Cliques. Our
-empirical evaluation on MNIST and CIFAR10 demonstrates that our approach
+D-Cliques. Our empirical evaluation on MNIST and CIFAR10 demonstrates that our approach
 provides similar convergence speed as a fully-connected topology with a
 significant reduction in the number of edges and messages. In a 1000-node
 topology, D-Cliques requires 98\% less edges and 96\% less total messages,
@@ -357,7 +356,7 @@ prediction accuracy.
 We use a logistic regression classifier for MNIST, which
-provides up to 92.5\% percent accuracy in the centralized setting.
+provides up to 92.5\% accuracy in the centralized setting.
 % compared to
 % $99\%$ for the state-of-the-art~\cite{mnistWebsite}.
 For CIFAR10, we use a Group-Normalized variant of LeNet~\cite{quagmire}, a
@@ -454,7 +453,7 @@ In D-Cliques, we address the issues of non-iidness by carefully designing a
 network topology composed of \textit{cliques} and \textit{inter-clique
 connections}:
 \begin{itemize}
- \item D-Cliques recovers a balanced representation of classes, similar to
+ \item D-Cliques recover a balanced representation of classes, similar to
  that of the IID case, by constructing a topology such that each node is
  part of a \textit{clique} with neighbors representing all classes.
  \item To ensure a global consensus and convergence,
@@ -462,7 +461,7 @@ connections}:
  are introduced by connecting a small number of node pairs that are
  part of different cliques.
 \end{itemize}
-In the following, we introduce one inter-clique connection per node such that each clique has exactly one
+In the following, we introduce up to one inter-clique connection per node such that each clique has exactly one
 edge with all other cliques, see Figure~\ref{fig:d-cliques-figure} for the
 corresponding D-Cliques network in the case of $n=100$ nodes and $c=10$
 classes. We will explore sparser inter-clique topologies in Section~\ref{section:interclique-topologies}.
@@ -484,7 +483,7 @@ topology, namely:
 We refer to Algorithm~\ref{Algorithm:D-Clique-Construction} in the appendix
 for a formal account of D-Cliques construction. We note that it only requires
 the knowledge of the local class distribution at each node. For the sake of
-simplicity, we assume that D-Cliques is constructed from the global
+simplicity, we assume that D-Cliques are constructed from the global
 knowledge of these distributions, which can easily be obtained by
 decentralized averaging in a pre-processing step.
@@ -524,7 +523,7 @@ speed on MNIST.}
 \end{figure}
 Figure~\ref{fig:d-cliques-example-convergence-speed} illustrates the
-performance D-Cliques on MNIST with $n=100$ nodes. Observe that the
+performance of D-Cliques on MNIST with $n=100$ nodes. Observe that the
 convergence speed is very close to that of a fully-connected topology, and
 significantly better than with
@@ -967,7 +966,7 @@ has gone into designing efficient topologies to optimize the use of
 network resources (see e.g., \cite{marfoq}), but the topology is chosen
 independently of how data is distributed across nodes. In summary, the role
 of topology in the non-IID data scenario is not well understood and we are not
-aware of prior work focusing on this question. Our work shows is the first
+aware of prior work focusing on this question. Our work is the first
 to show that an appropriate choice of data-dependent topology can effectively
 compensate for non-IID data.
--
GitLab