From 83a38b5deb843e476f9b8ab0315a820583c6d3e7 Mon Sep 17 00:00:00 2001
From: Pauline Isabela Conti <pauline.conti@epfl.ch>
Date: Mon, 28 Feb 2022 17:08:59 +0000
Subject: [PATCH] Update README.md

---
 README.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index e418b00..a9f1daa 100644
--- a/README.md
+++ b/README.md
@@ -174,7 +174,12 @@ before running on cluster.
 spark-submit --class distributed.DistributedBaseline --master yarn --num-executors 1 m1_yourid-assembly-1.0.jar  --train TRAIN --test TEST --separator , --json distributed-25m-1.json --num_measurements 1
 ````
 
-See [config.sh](./config.sh) for HDFS paths to pre-uploaded TRAIN and TEST datasets. You can vary the number of executors with ````--num-executors X````, and number of measurements with ````--num_measurements Y````.
+See [config.sh](./config.sh) for the HDFS paths to the pre-uploaded train and test datasets; substitute them for TRAIN and TEST in the command. For instance, to run on ML-25m, first run [config.sh](./config.sh), then adapt the above command as follows:
+````
+spark-submit --class distributed.DistributedBaseline --master yarn --num-executors 1 m1_yourid-assembly-1.0.jar  --train $ML25Mr2train --test $ML25Mr2test --separator , --json distributed-25m-1.json --num_measurements 1
+````
+
+You can vary the number of executors with ````--num-executors X```` and the number of measurements with ````--num_measurements Y````.
 
 ## Grading scripts
 
-- 
GitLab