From 1efd878db8eef4e3929cf9cf8fbb4f6874ba83bc Mon Sep 17 00:00:00 2001
From: Pauline Isabela Conti <pauline.conti@epfl.ch>
Date: Thu, 14 Apr 2022 12:26:11 +0000
Subject: [PATCH] Update README.md

---
 README.md | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 4752d19..28ce7f4 100644
--- a/README.md
+++ b/README.md
@@ -136,14 +136,23 @@ before running on cluster.
 
 ### Run on Cluster
 
-See [config.sh](./config.sh) for HDFS paths to pre-uploaded train and test datasets to replace TRAIN and TEST with in the command in the example command below:
+See [config.sh](./config.sh) for HDFS paths to pre-uploaded train and test datasets to replace TRAIN and TEST, like in the example commands below:
 
+#### When using ML-100k
 ````
-spark-submit --class distributed.Exact --master yarn --conf "spark.dynamicAllocation.enabled=false" --num-executors 1 m2_yourid-assembly-1.0.jar --train TRAIN --test TEST
+spark-submit --class distributed.Exact --master yarn --conf "spark.dynamicAllocation.enabled=false" --num-executors 1 m2_yourid-assembly-1.0.jar --json exact-100k-1.json --train $ML100Ku2base --test $ML100Ku2test
+````
+#### When using ML-1m
+````
+spark-submit --class distributed.Exact --master yarn --conf "spark.dynamicAllocation.enabled=false" --num-executors 1 m2_yourid-assembly-1.0.jar --json exact-1m-1.json --train $ML1Mrbtrain --test $ML1Mrbtest --separator :: --k 300 --users 6040 --movies 4000
 ````
 
+In order to keep results obtained with different parameters in different .json files, simply modify the corresponding parameter ("--json") passed and the values. For instance, with ```--num-executors 4``` : ```--json exact-1m-4.json```.
+Note that when changing from ML-100k to ML-1M, the parameter ```--separator ::``` should be added, and the number of users and movies should be modified.
+
 ## Grading scripts
 
+
 We will use the following scripts to grade your submission:
 
 1. ````./test.sh````: Run all unit tests.
-- 
GitLab
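
The naming convention described in the patch (one .json file per dataset and executor count) could be scripted along these lines. This is only a sketch: `json_name` is a hypothetical helper that is not part of the repository, the executor counts 1, 2 and 4 are illustrative, and `$ML1Mrbtrain` and `$ML1Mrbtest` are assumed to be defined by config.sh. The loop only prints the commands (a dry run) and does not call spark-submit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): builds the --json file name
# following the exact-<dataset>-<executors>.json pattern from the README.
json_name() {
  local dataset="$1" executors="$2"
  printf 'exact-%s-%s.json\n' "$dataset" "$executors"
}

# Dry run: print the ML-1m command for a few illustrative executor counts.
# The variable names are escaped so they are printed literally; they are
# assumed to be set once config.sh has been sourced.
for n in 1 2 4; do
  echo spark-submit --class distributed.Exact --master yarn \
    --conf "spark.dynamicAllocation.enabled=false" --num-executors "$n" \
    m2_yourid-assembly-1.0.jar --json "$(json_name 1m "$n")" \
    --train "\$ML1Mrbtrain" --test "\$ML1Mrbtest" \
    --separator :: --k 300 --users 6040 --movies 4000
done
```

Dropping the leading `echo` would submit the jobs instead of printing them. For ML-100k, the `--separator`, `--k`, `--users` and `--movies` arguments would be left out, as in the first example command of the patch.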