introduce label distribution skew in sec 2

Commit a257e5b6 authored 3 years ago by aurelien.bellet
--- a/mlsys2022style/setting.tex
+++ b/mlsys2022style/setting.tex
@@ -11,8 +11,17 @@ labeled data point by a tuple $(x,y)$ where $x$ represents the data point
 Each
 node has
 access to a local dataset that
- follows its own local distribution $D_i$. The goal is to find the parameters
- $\theta$ of a global model that performs well on the union of the local
+ follows its own local distribution $D_i$ which may differ from that of other
+ nodes.
+In this work, we focus on label distribution skew: denoting by $p_i(x,y)=p_i
+(x|y)p_i(y)$ the
+probability of $(x,y)$ under the local distribution $D_i$ of node $i$, we
+assume that $p_i(y)$ varies across nodes. We refer to 
+\cite{kairouz2019advances,quagmire} for concrete examples of problems
+with label distribution skew.
+The objective is to find the parameters
+$\theta$ of a global model that performs well on the union of the local
 distributions by
 minimizing
 the average training loss:
@@ -26,8 +35,10 @@ function
 on node $i$. Therefore, $\mathds{E}_{(x_i,y_i) \sim D_i} F_i(\theta;x_i,y_i)$
 denotes 
 the
-expected loss of model $\theta$ over the local data distribution
-$D_i$.
+expected loss of model $\theta$ over $D_i$.
 To collaboratively solve Problem \eqref{eq:dist-optimization-problem}, each
 node can exchange messages with its neighbors in an undirected network graph