%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% EPFL report package, main thesis file
% Goal: provide formatting for theses and project reports
% Author: Mathias Payer <mathias.payer@epfl.ch>
%
% This work may be distributed and/or modified under the
% conditions of the LaTeX Project Public License, either version 1.3
% of this license or (at your option) any later version.
% The latest version of this license is in
% http://www.latex-project.org/lppl.txt
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\documentclass[a4paper,11pt,oneside]{article}
% Options: MScThesis, BScThesis, MScProject, BScProject
\usepackage[BScProject,lablogo]{EPFLreport}
\usepackage{xspace}
\usepackage{listings}
\usepackage{caption}
\usepackage{subcaption}
\renewcommand{\lstlistingname}{Configuration}
\lstset{
basicstyle=\footnotesize\ttfamily,
columns=flexible,
literate={-}{-}1,
breaklines=true,
}
\lstset{captionpos=b}
\AtBeginDocument{\def\chapterautorefname{Chapter}}
\AtBeginDocument{\def\sectionautorefname{Section}}%
\AtBeginDocument{\def\subsectionautorefname{Subsection}}%
%
\author{Paulette Vazquez and Guilhem Niot}
\supervisor{Dr. Erick Lavoie}
\adviser{Professor Anne-Marie Kermarrec}
\newcommand{\sysname}{FooSystem\xspace}
\begin{document}
\maketitle
\section{Another application of Smallworld, the Dat Protocol}
The Dat Protocol\footnote{\url{https://www.datprotocol.com/}} is a peer-to-peer protocol for sharing files in a distributed network. Like Secure-Scuttlebutt (SSB) [ref], it is based on append-only logs, but it provides more mature core libraries.
As a distributed file-sharing protocol, it differs from BitTorrent [ref] in that content can be updated by its author.
This report aims to show that Dat can be used with a Smallworld setup to easily share frequently updated files among peers, much like Dropbox. We explain how to configure Dat to replicate files, and we cover the case where files are replicated between two clients A and B, with the Raspberry Pi acting as an intermediary when only one of the clients is online at a given time.
\subsection{Real time synchronization with dat-store}
Dat-store is a command-line tool that provides commands for downloading and syncing Dat archives, together with a service that can run in the background and be controlled remotely.\footnote{The command-line tool usually recommended for downloading and syncing Dat archives is \emph{dat-cli}, but it cannot run in the background or be controlled remotely, and is thus not practical for this application.} \\
Dat-store introduces the concept of a provider. Providers are instances of dat-store running either locally or remotely. The default provider is your local instance, but you can also configure other providers, such as the Raspberry Pi.
This notion is particularly useful for controlling other providers remotely. \\
Dat-store should be installed on both the client and the Raspberry Pi using
\begin{lstlisting}
npm install -g dat-store
\end{lstlisting}
\textbf{Note:} If you get permission errors, you may need to fix the permissions of your npm installation; see the npm guide\footnote{\url{https://docs.npmjs.com/resolving-eacces-permissions-errors-when-installing-packages-globally}}. Using sudo instead won't fix the problem. \\
You should then configure a systemd service to run the store in the background, on both your device and the Raspberry Pi. For the user device, follow Configuration~\ref{cmd:dat_store_systemd_user} below. For the Raspberry Pi, follow Configuration~\ref{cmd:dat_store_systemd_rasp}.
\begin{lstlisting}[label=cmd:dat_store_systemd_user, caption=Configure dat-store systemd service on the user device]
# This will create the service file.
cat << EOF | sudo tee /etc/systemd/system/dat-store.service > /dev/null
[Unit]
Description=Dat storage provider, keeps dats alive in the background.
[Service]
Type=simple
# Check that dat-store is present at this location
# If it's not, replace the path with its location
ExecStart=$(which dat-store) run-service
Restart=always
[Install]
WantedBy=multi-user.target
EOF
sudo chmod 644 /etc/systemd/system/dat-store.service
sudo systemctl daemon-reload
sudo systemctl enable dat-store
sudo systemctl start dat-store
sudo systemctl status dat-store
\end{lstlisting}
\begin{lstlisting}[label=cmd:dat_store_systemd_rasp, caption=Configure dat-store systemd service on the Raspberry Pi]
# This will create the service file.
cat << EOF | sudo tee /etc/systemd/system/dat-store.service > /dev/null
[Unit]
Description=Dat storage provider exposed to the internet (for the raspberry pi).
[Service]
Type=simple
# Check that dat-store is present at this location
# If it's not, replace the path with its location
ExecStart=$(which dat-store) run-service --expose-to-internet
Restart=always
[Install]
WantedBy=multi-user.target
EOF
sudo chmod 644 /etc/systemd/system/dat-store.service
sudo systemctl daemon-reload
sudo systemctl enable dat-store
sudo systemctl start dat-store
sudo systemctl status dat-store
\end{lstlisting}
Dat-store provides several commands that we use in our demo\footnote{See the documentation at \url{https://github.com/datproject/dat-store}.}:
\begin{itemize}
\item \texttt{dat-store add <url|path> [provider]}: Adds a folder or a Dat URL to the dat-store of the specified provider.
\item \texttt{dat-store set-provider <url> [provider]}: Sets the URL of the specified provider.
\item \texttt{dat-store list [provider]}: Retrieves the list of Dats available in the specified provider.
\item \texttt{dat-store clone <path> <url> [provider]}: Clones \texttt{<url>} into the local folder \texttt{<path>}.
\end{itemize}
\subsection{Demonstration with two clients, and a Raspberry Pi}
Consider two clients, A and B, and a Raspberry Pi used as a permanent store to synchronize a folder from A to B.
First, client A executes the commands from \autoref{cmd:dat_store_example_A}.
Then, client B executes the commands from \autoref{cmd:dat_store_example_B}.
\begin{lstlisting}[label=cmd:dat_store_example_A, caption=Commands executed by client A]
dat-store set-provider http://raspberrypi.local:3472 raspberry
# Add ./mydat to the local store first, so that the list below shows its url
dat-store add ./mydat
dat-store list
dat-store add <url from previous command corresponding to ./mydat> raspberry
# You can check that the content is actually replicated by accessing
# http://raspberrypi.local/gateway/<token from hyper url corresponding to ./mydat>/
\end{lstlisting}
\begin{lstlisting}[label=cmd:dat_store_example_B, caption=Commands executed by client B]
dat-store set-provider http://raspberrypi.local:3472 raspberry
dat-store list raspberry
dat-store clone ./mydat <url obtained from the previous command>
\end{lstlisting}
Changes made by A to the folder \emph{mydat} will be replicated to B, even if A is not online at the same time as B.
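One way to check that the copy on B matches the folder on A is to compare a digest of the folder contents on both machines. The following is only a minimal sketch and is not part of dat-store: the helper name \verb|dir_digest| is hypothetical, it assumes that \verb|find|, \verb|sort| and \verb|sha256sum| are available, and it skips a top-level \verb|.dat| entry on the assumption that Dat keeps its local metadata there.
\begin{lstlisting}
#!/bin/sh
# Hypothetical helper (not part of dat-store): prints one SHA-256 digest
# summarizing the relative paths and contents of all files under a folder.
dir_digest() {
    (
        cd "$1" || exit 1
        # Skip ./.dat, assuming this is where Dat keeps its local metadata.
        find . -type f ! -path './.dat' ! -path './.dat/*' -exec sha256sum {} + |
            LC_ALL=C sort -k 2 |
            sha256sum |
            cut -d ' ' -f 1
    )
}

# Run on A and on B, then compare the two printed values:
# dir_digest ./mydat
\end{lstlisting}
If both machines print the same value, the two folders hold the same files with the same contents at that moment.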
We also experimented with Pushpin\footnote{\url{https://github.com/automerge/pushpin}}, a collaborative corkboard app, to showcase a peer-to-peer application with a real-time interface. It is based on hypermerge\footnote{\url{https://github.com/automerge/hypermerge}}, a library providing a JSON-like structure that can be edited concurrently without worrying about conflicts. Hypermerge is itself based on the Dat protocol.
However, Pushpin was quite unstable: Electron crashed from time to time, and synchronization sometimes stopped until the window was reloaded. \\
On top of that, we were unable to install Pushpin on the Raspberry Pi to have a replicating node, as Pushpin requires compiling Node 14 and the Raspberry Pi ran out of memory during compilation.
We have shown how to use dat-store to replicate files between devices.
While implementing this application, we noticed a major need shared by SSB and Dat: a way to determine when two devices have finished synchronizing.
SSB provides a Node.js client\footnote{\url{https://github.com/ssbc/ssb-client}} that can be used to determine when new posts are received. A proof of concept is showcased at \url{https://github.com/GuilhemN/ssb-copy-follows/tree/POSTS}. This script could be adapted, for instance, to make an LED blink to notify the user that the devices have synchronized.
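As an illustration of the notification idea, the sketch below blinks an LED from a shell script by writing to a sysfs brightness file. It rests on several assumptions: the function name \verb|blink_led| is hypothetical, the path \verb|/sys/class/leds/led0/brightness| is an assumed location for the on-board LED that varies between models and OS releases, the LED trigger is assumed to have been set to \verb|none| beforehand, and writing to these files usually requires root privileges.
\begin{lstlisting}
#!/bin/sh
# Hypothetical helper: switches the LED on and off COUNT times.
# $1: brightness file (assumed: /sys/class/leds/led0/brightness)
# $2: number of blinks (default 3)
# $3: delay in seconds between toggles (default 1)
blink_led() {
    led_file=$1
    count=${2:-3}
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        echo 1 > "$led_file"
        sleep "$delay"
        echo 0 > "$led_file"
        sleep "$delay"
        i=$((i + 1))
    done
}

# Example (as root): blink_led /sys/class/leds/led0/brightness 5
\end{lstlisting}
The call would be triggered by whatever detects the end of synchronization, for example the proof of concept mentioned above.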
Another possible improvement for SSB is to refactor AutoFollow to use \emph{ssb-client} to copy the follows of its owner. A proof of concept is likewise available at \url{https://github.com/GuilhemN/ssb-copy-follows/blob/master/index.js}. It could be integrated into a web page exposed at \verb|http://raspberrypi.local/|.
\newpage
\appendix
\section{Dat vs Hypercore protocol}
The Hypercore protocol\footnote{\url{https://hypercore-protocol.org/}} was created in 2016 to give Dat more abstract foundations that could be reused across several applications. Since then, the Hyper project has grown, and in 2020 it was decided to separate Dat from Hypercore and to give each its own governance.\footnote{An explanatory video is available at \url{https://www.youtube.com/watch?v=mx52uO5SP7A}.}
The Hypercore protocol is now incompatible with the Dat protocol and provides its own CLI tools and libraries.
The main entry point to Hyper is the Hyperspace CLI\footnote{\url{https://github.com/hypercore-protocol/cli}}. It can create Hyperdrives, which are folders distributed on the Hyper network, and Bees, which are distributed key-value tables.
Hypercore provides a truly decentralized implementation of remote peer-to-peer connections based on a DHT (distributed hash table), while the Dat protocol only supports a rendezvous approach with a centralized server. \\
While we did not do anything new with Hyper compared to what we did with Dat, we wanted to explain its installation procedure on Raspberry Pis as it is quite complex, and as Hyper will likely be preferred over Dat in the future. \\
Hyper libraries require Node 14, but Node 14 does not officially support the Raspberry Pi Zero, as it ships with an ARMv6 CPU.
The Node.js team still provides unofficial (and ``experimental'') builds at \url{https://github.com/nodejs/unofficial-builds/}, and those are usable on the Raspberry Pi. You can install them using the scripts provided by \url{https://github.com/sdesalas/node-pi-zero}. Install the latest version of Node 14 to get a version of Node compatible with Hyper. \\
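Before installing the Hyper tools, it can help to confirm which CPU architecture and Node version the Raspberry Pi reports. The following is a minimal sketch: the helper name \verb|node_major| is hypothetical, and treating every major version below 14 as too old is an assumption drawn from the requirement stated above.
\begin{lstlisting}
#!/bin/sh
# Hypothetical helper: extracts the major version from a string such as
# "v14.17.0", which is the format printed by `node --version`.
node_major() {
    version=${1#v}          # drop the leading "v"
    echo "${version%%.*}"   # keep everything before the first dot
}

# Report the architecture (a Raspberry Pi Zero reports armv6l).
uname -m

# Warn if node is missing or older than 14.
if command -v node > /dev/null 2>&1; then
    major=$(node_major "$(node --version)")
    if [ "$major" -lt 14 ]; then
        echo "Node $major found, but Hyper libraries require Node 14" >&2
    fi
else
    echo "node is not installed" >&2
fi
\end{lstlisting}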
You will then be able to install the Hyperspace CLI using \verb|npm install -g @hyperspace/cli|. \\
You should now be able to interact with Hyperdrives using the \emph{hyp} command. Its documentation is available at \url{https://hypercore-protocol.org/guides/hyp/}.
\textbf{Note:} If the \emph{hyp} command is not found, you may need to add the following line to your \verb|~/.bashrc|: \verb|export PATH=$PATH:$(npm bin -g)|, which adds the directory of globally installed npm executables to your PATH.
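Because \verb|~/.bashrc| may be sourced several times, a line of this kind can add the same directory to PATH repeatedly. A small guard avoids this; the sketch below is only illustrative, and the helper name \verb|path_append| is hypothetical.
\begin{lstlisting}
# Hypothetical helper: appends a directory to PATH only if it is not
# already listed there.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

# Usage in ~/.bashrc, with the same directory as in the note above:
# path_append "$(npm bin -g)"
# export PATH
\end{lstlisting}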
\cleardoublepage
\phantomsection
\addcontentsline{toc}{chapter}{Bibliography}
\printbibliography