
using uid instead of rank for the femnist data partition

Jeffrey Wigger requested to merge wigger/decentralizepy:femnist_uid_change into main

For the Femnist dataset, every machine partitions the data based on the local rank of the process, which runs from 0 to procs_per_machine - 1. If the data is not already pre-partitioned per machine, every machine therefore partitions the entire dataset, so each sample is used once per machine, i.e., machines times in total.

This PR changes the Femnist dataset to use the uid for partitioning, as is already done for Celeba.
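The difference between the two schemes can be illustrated with a minimal sketch. It is not the decentralizepy implementation: the helpers `shard` and `usage_counts` are hypothetical, and the formula `uid = machine_id * procs_per_machine + rank` is an assumption about how a globally unique id is derived from the machine id and the local rank.

```python
from collections import Counter


def shard(samples, index, num_shards):
    """Return contiguous shard `index` of `samples` split into `num_shards` parts."""
    per_shard, remainder = divmod(len(samples), num_shards)
    # The first `remainder` shards each receive one extra sample.
    start = index * per_shard + min(index, remainder)
    end = start + per_shard + (1 if index < remainder else 0)
    return samples[start:end]


def usage_counts(samples, machines, procs_per_machine, by_uid):
    """Count how often each sample is assigned across all processes."""
    counts = Counter()
    for machine_id in range(machines):
        for rank in range(procs_per_machine):
            if by_uid:
                # Assumed uid formula: unique across all machines.
                uid = machine_id * procs_per_machine + rank
                part = shard(samples, uid, machines * procs_per_machine)
            else:
                # Local rank only: every machine re-partitions the full dataset.
                part = shard(samples, rank, procs_per_machine)
            counts.update(part)
    return counts


samples = list(range(12))
by_rank = usage_counts(samples, machines=3, procs_per_machine=2, by_uid=False)
by_uid = usage_counts(samples, machines=3, procs_per_machine=2, by_uid=True)
print(set(by_rank.values()), set(by_uid.values()))
```

With 3 machines and 2 processes per machine, rank-based partitioning assigns every sample to 3 processes, while uid-based partitioning assigns every sample to exactly one.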

TODO:
[x] Either change all femnist config files such that n_procs now equals procs_per_machine * machines, or set n_procs in Dataset.py to mapping.n_machines * mapping.procs_per_machine

Update:
Removed the n_procs config option. For both the Femnist and Celeba datasets, the data is now split across all processes globally, i.e., machines * procs_per_machine partitions.

Edited by Jeffrey Wigger
