Dependencies
sbt >= 1.4.7
Should be available by default on the IC Cluster. Otherwise, refer to each project installation instructions.
Dataset
Download the ml-100k.zip
dataset in the data/
folder:
> mkdir data
> cd data
> wget http://files.grouplens.org/datasets/movielens/ml-100k.zip
Check the integrity of the file with (it should give the same number as below):
> md5 -q ml-100k.zip
0e33842e24a9c977be4e0107933c0723
Unzip:
> unzip ml-100k.zip
Personal Ratings
Add your ratings in the 'data/personal.csv' file, by providing a numerical rating between [1,5] for at least 20 movies. For example, to rate the 'Toy Story' movie with '5', modify this line:
1,Toy Story (1995),
to this:
1,Toy Story (1995),5
Do include your own ratings in your final submission so we can check your answers against those provided in your report.
Usage
Compute predictions
> sbt "runMain predict.Predictor --train data/ml-100k/u1.base --test data/ml-100k/u1.test --json answers.json"
Compute recommendations
> sbt 'runMain recommend.Recommender'
Package for submission
Steps:
1. Update the name
, maintainer
fields of build.sbt
, with the correct Milestone number, your ID, and your email.
2. Ensure you only used the dependencies listed in build.sbt
in this template, and did not add any other.
3. Remove project/project
, project/target
, and target/
.
4. Test that all previous commands for generating statistics, predictions, and recommendations correctly produce a JSON file (after downloading/reinstalling dependencies).
5. Remove the ml-100k dataset (data/ml-100k.zip
, and data/ml-100k
), as well as theproject/project
, project/target
, and target/
.
6. Add your report and any other necessary files listed in the Milestone description (see Deliverables
).
7. Zip the archive.
8. Submit to the TA for grading.
References
Essential sbt: https://www.scalawilliam.com/essential-sbt/