Use uid instead of rank for the Femnist data partition
For the Femnist dataset, the data is partitioned on every machine based on the local rank of the process, which lies in `[0, procs_per_machine)`. If the data is not already pre-partitioned, every machine uses the entire dataset, i.e., each sample is used `machines` times.
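
To make the duplication concrete, here is a minimal sketch. It is not the project's code: `rank_partition` and `usage_counts` are hypothetical helpers, and the strided split is only an assumption about how a rank-based partition might look.

```python
from collections import Counter


def rank_partition(n_samples, procs_per_machine, rank):
    """Hypothetical helper: sample indices for one local rank.

    The machine id is deliberately ignored, mirroring the old behaviour.
    """
    return list(range(rank, n_samples, procs_per_machine))


def usage_counts(n_samples, machines, procs_per_machine):
    """Count how many processes (over all machines) receive each sample."""
    counts = Counter()
    for _machine in range(machines):
        for rank in range(procs_per_machine):
            counts.update(rank_partition(n_samples, procs_per_machine, rank))
    return counts


counts = usage_counts(n_samples=12, machines=3, procs_per_machine=2)
print(set(counts.values()))  # {3}: every sample is used once per machine
```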
This PR changes the Femnist dataset to use the uid for partitioning, as is already done for Celeba.
TODO:
[x] Either change all Femnist config files so that `n_procs` equals `procs_per_machine * machines`, or set `n_procs` in Dataset.py to `mapping.n_machines * mapping.procs_per_machine`.
Update:
Removed the `n_procs` config option. For both the Femnist and Celeba datasets, the data is now split across all global processes.
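
A minimal sketch of a split over the global process count follows. It assumes the uid is computed as `machine_id * procs_per_machine + rank`. The helper names `get_uid` and `uid_partition` and the strided split are illustrative only, not the code in Dataset.py or the dataset classes.

```python
def get_uid(rank, machine_id, procs_per_machine):
    """Assumed linear mapping from (machine, local rank) to a global uid."""
    return machine_id * procs_per_machine + rank


def uid_partition(n_samples, n_machines, procs_per_machine, uid):
    """Hypothetical helper: sample indices for one global process."""
    n_procs = n_machines * procs_per_machine  # derived, no config option needed
    return list(range(uid, n_samples, n_procs))


n_machines, procs_per_machine, n_samples = 3, 2, 12
shards = [
    uid_partition(n_samples, n_machines, procs_per_machine,
                  get_uid(rank, machine_id, procs_per_machine))
    for machine_id in range(n_machines)
    for rank in range(procs_per_machine)
]

# Every sample lands in exactly one shard.
all_indices = sorted(i for shard in shards for i in shard)
assert all_indices == list(range(n_samples))
```

With this mapping, each of the six processes receives a disjoint shard, so no sample is reused across machines.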
Edited by Jeffrey Wigger