A Basic Data Manager
- class BasicDataManager(root='data', dataset='mnist', num_partitions=500, rule='iid', sample_balance=0.0, label_balance=1.0, local_test_portion=0.0, global_valid_portion=0.0, seed=10, save_dir='partitions')
A basic data manager for partitioning the data. Currently, three partitioning rules are supported:
- iid:
Same label distribution among clients. sample_balance determines each client's sample quota, drawn from a log-normal distribution.
- dir:
A Dirichlet distribution with concentration parameter label_balance determines the label balance of each client. sample_balance determines each client's sample quota, drawn from a log-normal distribution.
- exclusive:
Samples corresponding to each label are randomly split among k clients, where k = total_sample_size * label_balance. sample_balance determines how this split happens (quota). This rule is also known as "shards splitting".
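The two random ingredients named in these rules, log-normal sample quotas and Dirichlet label proportions, can be illustrated with a minimal standard-library sketch. This is not the class's actual implementation: the helper names `lognormal_quotas` and `dirichlet_proportions` are hypothetical, and treating sample_balance as the sigma of the log-normal distribution is an assumption made for illustration.

```python
import random


def lognormal_quotas(num_samples, num_clients, sample_balance, rng):
    """Split num_samples into per-client quotas (hypothetical helper).

    Assumption: sample_balance plays the role of the log-normal sigma,
    so 0.0 yields equal quotas and larger values yield more skew.
    """
    if sample_balance == 0.0:
        weights = [1.0] * num_clients
    else:
        weights = [rng.lognormvariate(0.0, sample_balance)
                   for _ in range(num_clients)]
    total = sum(weights)
    quotas = [int(num_samples * w / total) for w in weights]
    # Hand out the samples lost to rounding down, one per client.
    for i in range(num_samples - sum(quotas)):
        quotas[i % num_clients] += 1
    return quotas


def dirichlet_proportions(num_labels, concentration, rng):
    """Draw label proportions for one client from a symmetric Dirichlet.

    A Dirichlet sample is obtained by normalizing independent
    Gamma(concentration, 1) draws.
    """
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(num_labels)]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(10)
equal_quotas = lognormal_quotas(1000, 4, 0.0, rng)
skewed_quotas = lognormal_quotas(1000, 4, 1.0, rng)
label_mix = dirichlet_proportions(10, 0.5, rng)
```

In this sketch, a small concentration value tends to put most of a client's mass on a few labels, while a large value pushes every client toward the same uniform label mix.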
- Parameters
root (str) -- root dir of the dataset to partition
dataset (str) -- name of the dataset
num_partitions (int) -- number of partitions or clients
rule (str) -- rule of partitioning
sample_balance (float) -- balance of number of samples among clients
label_balance (float) -- balance of the labels on each client
local_test_portion (float) -- portion of the local test set taken from train
global_valid_portion (float) -- portion of global valid split. What remains from global samples goes to the test split.
seed (int) -- random seed of partitioning
save_dir (str, optional) -- dir to save partitioned indices.
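To make the two portion parameters concrete, here is a minimal sketch of how a portion could carve a held-out split out of a list of sample indices. The helper `split_by_portion` is hypothetical, and the shuffling and round-down behavior are assumptions rather than the library's documented behavior.

```python
import random


def split_by_portion(indices, portion, seed):
    """Return (held_out, remainder) index lists (hypothetical helper).

    Assumption: indices are shuffled with the given seed and the
    held-out size is rounded down.
    """
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    num_held_out = int(len(shuffled) * portion)
    return shuffled[:num_held_out], shuffled[num_held_out:]


# Global samples: global_valid_portion goes to the valid split,
# what remains goes to the test split.
global_valid, global_test = split_by_portion(range(100), 0.2, seed=10)

# One client's samples: local_test_portion is held out from train.
local_test, local_train = split_by_portion(range(50), 0.0, seed=10)
```

With the default value of 0.0 for both portions, the held-out split is empty in this sketch: all global samples end up in the test split and all local samples stay in train.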
- get_identifiers()
Returns identifiers to be used for saving the partition info.
- Returns
Sequence[str] -- a sequence of str identifying the class instance
- make_datasets(root)
Makes and returns local and global dataset objects. The created datasets do not need a transform, since datasets are recompiled on the fly with separately provided transforms.
- Parameters
dataset_name (str) -- name of the dataset.
root (str) -- directory to download and manipulate data.
- Returns
Tuple[object, object] -- local and global dataset
- make_transforms()
Makes and returns the dataset transformations for the local and global splits.
- Returns
Tuple[Dict[str, Callable], Dict[str, Callable]] -- a tuple of two dictionaries: first the local transform mapping, second the global transform mapping.
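The shape of this return value can be illustrated with a small sketch. The function name `make_transforms_sketch`, the split keys ("train", "valid", "test"), and the toy transforms are all assumptions chosen for illustration; the keys and transforms used by the actual class may differ.

```python
from typing import Callable, Dict, Tuple


def make_transforms_sketch() -> Tuple[Dict[str, Callable], Dict[str, Callable]]:
    """Return (local, global) mappings from split name to transform."""

    def scale(pixels):
        # Map 0..255 integer values into the range 0.0..1.0.
        return [v / 255.0 for v in pixels]

    def flip(pixels):
        # Toy augmentation: reverse the order of the values.
        return list(reversed(pixels))

    def train_transform(pixels):
        # Augment first, then normalize.
        return scale(flip(pixels))

    local_transforms = {"train": train_transform, "test": scale}
    global_transforms = {"valid": scale, "test": scale}
    return local_transforms, global_transforms


local_tf, global_tf = make_transforms_sketch()
augmented = local_tf["train"]([0, 255])
plain = global_tf["test"]([0, 255])
```

Keeping the transforms in separate mappings, rather than baking them into the datasets, matches the description above of datasets that are created without a transform and receive one later.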