Commit 710a0505 authored by Erick Lavoie

Updated experiment parameters

parent 803c11fc
...@@ -131,8 +131,10 @@ Because the data distribution within each clique is representative of the global ...@@ -131,8 +131,10 @@ Because the data distribution within each clique is representative of the global
As a summary, we make the following contributions: As a summary, we make the following contributions:
\begin{itemize} \begin{itemize}
\item significant impact of topology on non-IID data
\item we propose the D-Cliques topology to remove the impact of non-IID data on convergence speed, similar to a fully-connected topology, with a reduced number of edges and required messages \item we propose the D-Cliques topology to remove the impact of non-IID data on convergence speed, similar to a fully-connected topology, with a reduced number of edges and required messages
\item we show how to leverage D-Cliques to implement momentum in a distributed non-IID setting, which would otherwise be detrimental to the convergence speed of convolutional networks \item we show how to leverage D-Cliques to implement momentum in a distributed non-IID setting, which would otherwise be detrimental to the convergence speed of convolutional networks
\item scale ($>$16 nodes)
\end{itemize} \end{itemize}
The rest of the paper is organized as follows. \dots The rest of the paper is organized as follows. \dots
...@@ -276,6 +278,8 @@ We solve this problem by decoupling the gradient averaging from the weight avera ...@@ -276,6 +278,8 @@ We solve this problem by decoupling the gradient averaging from the weight avera
\caption{\label{fig:d-cliques-mnist-linear} D-Cliques with Linear Model on MNIST.} \caption{\label{fig:d-cliques-mnist-linear} D-Cliques with Linear Model on MNIST.}
\end{figure} \end{figure}
With and without clique averaging.
TODO: Update figure with actual Clique-Ring results TODO: Update figure with actual Clique-Ring results
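The comparison "with and without clique averaging" refers to decoupling gradient averaging from weight averaging. The following is a minimal Python sketch of one such round, not the authors' implementation. It assumes that gradients are averaged only over the members of a node's clique, that weights are then averaged over all neighbours (clique members plus inter-clique edges), and that both averages are uniform. The function names and the dictionary-based data layout are hypothetical.

```python
def average(vectors):
    """Element-wise uniform average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(values) / n for values in zip(*vectors)]


def clique_averaging_round(weights, grads, cliques, neighbours, lr):
    """One synchronous round of the sketched scheme.

    weights, grads: dict node -> list of floats
    cliques:        dict node -> nodes of its clique (including itself)
    neighbours:     dict node -> all neighbours (including itself)
    Uniform averaging is an assumption made for illustration.
    """
    stepped = {}
    for node in weights:
        # Gradient averaging: restricted to the node's clique.
        g = average([grads[j] for j in cliques[node]])
        # Local SGD step with the clique-averaged gradient.
        stepped[node] = [w - lr * gi for w, gi in zip(weights[node], g)]
    # Weight averaging: over all neighbours, inter-clique edges included.
    return {node: average([stepped[j] for j in neighbours[node]])
            for node in weights}


# Toy example: two cliques {0, 1} and {2, 3} joined by the edge 1-2.
weights = {i: [0.0] for i in range(4)}
grads = {0: [1.0], 1: [3.0], 2: [5.0], 3: [7.0]}
cliques = {0: [0, 1], 1: [0, 1], 2: [2, 3], 3: [2, 3]}
neighbours = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2, 3], 3: [2, 3]}
new_weights = clique_averaging_round(weights, grads, cliques, neighbours, lr=0.5)
```

In this toy example the inter-clique edge 1-2 affects only the weight average, never the gradient average, which is the decoupling the sketch is meant to illustrate.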
\subsection{CIFAR10 and Convolutional Model} \subsection{CIFAR10 and Convolutional Model}
...@@ -330,13 +334,23 @@ Similar number of maximum hops but no or less clustering than D-Cliques (and no ...@@ -330,13 +334,23 @@ Similar number of maximum hops but no or less clustering than D-Cliques (and no
% To regenerate the figure, from directory results/scaling % To regenerate the figure, from directory results/scaling
% python ../../../learn-topology/tools/plot_convergence.py 10/mnist/fully-connected-cliques/all/2021-03-10-14:40:35-CET ../mnist/fully-connected-cliques/all/2021-03-10-10:19:44-CET 1000/mnist/fully-connected-cliques/all/2021-03-10-16:44:35-CET --labels '10 nodes bsz=128' '100 nodes bsz=128' '1000 nodes bsz=128 (45)' --legend 'lower right' --yaxis test-accuracy --save-figure ../../figures/d-cliques-mnist-scaling-fully-connected.png --ymin 80 --add-min-max % python ../../../learn-topology/tools/plot_convergence.py 10/mnist/fully-connected-cliques/all/2021-03-10-14:40:35-CET ../mnist/fully-connected-cliques/all/2021-03-10-10:19:44-CET 1000/mnist/fully-connected-cliques/all/2021-03-10-16:44:35-CET --labels '10 nodes bsz=128' '100 nodes bsz=128' '1000 nodes bsz=128 (45)' --legend 'lower right' --yaxis test-accuracy --save-figure ../../figures/d-cliques-mnist-scaling-fully-connected.png --ymin 80 --add-min-max
\begin{figure}[htbp] \begin{figure}[htbp]
\centering
\begin{subfigure}[b]{\textwidth}
\centering \centering
\includegraphics[width=0.7\textwidth]{figures/d-cliques-mnist-scaling-fully-connected} \includegraphics[width=0.7\textwidth]{figures/d-cliques-mnist-scaling-fully-connected}
\caption{\label{fig:d-cliques-mnist-scaling-fully-connected} Scaling Behaviour of Fully-Connected D-Clique} \caption{Constant Batch-Size}
\end{subfigure}
\begin{subfigure}[b]{\textwidth}
\centering
    \caption{Constant Number of Updates per Epoch}
\end{subfigure}
\caption{\label{fig:d-cliques-mnist-scaling-fully-connected} Scaling Behaviour of Fully-Connected D-Clique}
\end{figure} \end{figure}
Show scaling effect for 10, 100, 1000 nodes (with decreasing sample sizes) for Clique Ring, Hierarchical, Fully-Connected. Show scaling effect for 10, 100, 1000 nodes (with decreasing sample sizes) for Clique Ring, Hierarchical, Fully-Connected.
(Small-world?)
Robustness to not having fully-connected cliques (static and dynamic subsets). Robustness to not having fully-connected cliques (static and dynamic subsets).
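The batch sizes changed in the experiment scripts of this commit (128 to 1280, and 128 to 13) are consistent with keeping the number of updates per epoch constant as the number of nodes varies. The sketch below reproduces that arithmetic under the assumptions that the dataset is split evenly across nodes and that the reference configuration is 100 nodes with a batch size of 128; which script corresponds to which node count is inferred, not stated in the diff. The function name is hypothetical.

```python
def batch_size_for_constant_updates(ref_nodes, ref_batch_size, nodes):
    """Batch size that keeps updates per epoch equal to the reference setup.

    With an even split, each node holds total / nodes samples, so updates per
    epoch are (total / nodes) / batch_size. Holding that constant gives
    batch_size = ref_batch_size * ref_nodes / nodes, rounded to an integer.
    """
    return max(1, round(ref_batch_size * ref_nodes / nodes))


# Assumed reference: 100 nodes with batch size 128.
sizes = {n: batch_size_for_constant_updates(100, 128, n) for n in (10, 100, 1000)}
print(sizes)
```

For 1000 nodes the exact value is 12.8, so rounding to 13 makes the number of updates per epoch only approximately constant.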
\section{Related Work} \section{Related Work}
......
#!/usr/bin/env bash #!/usr/bin/env bash
TOOLS=../../../../../../learn-topology/tools; CWD="$(pwd)"; cd $TOOLS TOOLS=../../../../../../learn-topology/tools; CWD="$(pwd)"; cd $TOOLS
BSZS=' BSZS='
128 1280
' '
LRS=' LRS='
0.1 0.1
......
#!/usr/bin/env bash #!/usr/bin/env bash
TOOLS=../../../../../../learn-topology/tools; CWD="$(pwd)"; cd $TOOLS TOOLS=../../../../../../learn-topology/tools; CWD="$(pwd)"; cd $TOOLS
BSZS=' BSZS='
128 13
' '
LRS=' LRS='
0.1 0.1
......