using uid instead of rank for the femnist data partition (!1) · Merge requests · SaCS / decentralizepy

Jeffrey Wigger requested to merge wigger/decentralizepy:femnist_uid_change into main Feb 17, 2022

For the Femnist dataset, the data is partitioned on every machine based on the local rank of the process, which lies in [0, procs_per_machine). If the data is not already pre-partitioned per machine, every machine partitions the entire dataset, so each sample is used once per machine, i.e., machines times in total.

This PR changes the Femnist dataset to use the uid for partitioning, as is already done for Celeba.

TODO:
[x] Either change all femnist config files such that n_procs now equals procs_per_machine * machines, or set n_procs in Dataset.py to mapping.n_machines * mapping.procs_per_machine

Update:
Removed the n_procs config option. For both the Femnist and Celeba datasets, the data is now split across all global processes (machines * procs_per_machine).
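
The difference between rank-based and uid-based partitioning can be illustrated with a minimal sketch. It is not the code from this PR: the helpers `partition_indices` and `global_uid` are hypothetical, and the uid formula `machine_id * procs_per_machine + rank` is an assumption about how a linear mapping would number processes.

```python
def partition_indices(n_samples, n_parts, part_id):
    """Return the contiguous block of sample indices owned by part_id.

    Hypothetical helper: splits n_samples into n_parts nearly equal blocks.
    """
    base, extra = divmod(n_samples, n_parts)
    start = part_id * base + min(part_id, extra)
    end = start + base + (1 if part_id < extra else 0)
    return list(range(start, end))


def global_uid(machine_id, rank, procs_per_machine):
    """Assumed linear mapping from (machine, local rank) to a global uid."""
    return machine_id * procs_per_machine + rank


machines, procs_per_machine, n_samples = 2, 2, 8

by_rank = {}  # old behaviour: partition by local rank on every machine
by_uid = {}   # new behaviour: partition by global uid
for machine_id in range(machines):
    for rank in range(procs_per_machine):
        by_rank[(machine_id, rank)] = partition_indices(
            n_samples, procs_per_machine, rank
        )
        uid = global_uid(machine_id, rank, procs_per_machine)
        by_uid[(machine_id, rank)] = partition_indices(
            n_samples, machines * procs_per_machine, uid
        )

# Count how many processes hold each sample under both schemes.
uses_by_rank = [sum(i in part for part in by_rank.values()) for i in range(n_samples)]
uses_by_uid = [sum(i in part for part in by_uid.values()) for i in range(n_samples)]
```

With two machines, `uses_by_rank` shows every sample being held by two processes (one per machine), while `uses_by_uid` shows each sample held by exactly one process.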

Edited Feb 17, 2022 by Jeffrey Wigger

