FedAvg

class FedAvg(data_manager, metric_logger, num_clients, sample_scheme, sample_rate, model_def, epochs, criterion_def, optimizer_def=functools.partial(<class 'torch.optim.sgd.SGD'>, lr=1.0), local_optimizer_def=functools.partial(<class 'torch.optim.sgd.SGD'>, lr=0.1), lr_scheduler_def=None, local_lr_scheduler_def=None, r2r_local_lr_scheduler_def=None, batch_size=32, test_batch_size=64, device='cpu', *args, **kwargs)

Implements the FedAvg algorithm for centralized federated learning (FL). For further details on the algorithm, see Communication-Efficient Learning of Deep Networks from Decentralized Data.
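As a rough illustration of the averaging step at the heart of FedAvg, the sketch below computes a mean of client parameter vectors weighted by local sample counts, in plain Python. The function name fedavg_average and the list-of-floats representation are assumptions made for illustration; they are not part of fedsim.

```python
from typing import List, Sequence


def fedavg_average(client_params: Sequence[Sequence[float]],
                   num_samples: Sequence[int]) -> List[float]:
    """Weighted mean of client parameter vectors (weights = local sample counts)."""
    if len(client_params) != len(num_samples) or not client_params:
        raise ValueError("need one sample count per client and at least one client")
    total = sum(num_samples)
    dim = len(client_params[0])
    averaged = [0.0] * dim
    for params, n in zip(client_params, num_samples):
        for i, value in enumerate(params):
            # Each client contributes in proportion to its share of the data.
            averaged[i] += (n / total) * value
    return averaged


# Two hypothetical clients: the second holds three times as much data.
global_params = fedavg_average([[0.0, 4.0], [4.0, 0.0]], [1, 3])
```

Here the result is pulled toward the second client's parameters because it holds 75% of the samples.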

Parameters

data_manager (distributed.data_management.DataManager) -- data manager
metric_logger (logall.Logger) -- metric logger for tracking.
num_clients (int) -- number of clients
sample_scheme (str) -- mode of sampling clients. Options are 'uniform' and 'sequential'
sample_rate (float) -- rate of sampling clients
model_def (torch.nn.Module) -- definition for constructing the model
epochs (int) -- number of local epochs
criterion_def (Callable) -- loss function defining local objective
optimizer_def (Callable) -- definition of server optimizer
local_optimizer_def (Callable) -- definition of local optimizer
lr_scheduler_def (Callable) -- definition of lr scheduler of server optimizer.
local_lr_scheduler_def (Callable) -- definition of lr scheduler of local optimizer
r2r_local_lr_scheduler_def (Callable) -- definition of the scheduler for the lr that is delivered to the clients at each round (it determines the initial lr of the client optimizer)
batch_size (int) -- batch size of the local training
test_batch_size (int) -- inference time batch size
device (str) -- cpu, cuda, or gpu number
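To make the interplay of num_clients, sample_scheme and sample_rate concrete, here is a minimal sketch of one plausible reading of the two schemes: 'uniform' draws clients at random without replacement, while 'sequential' walks through client ids in consecutive windows. The function sample_clients, its seed argument, and the exact rounding are assumptions for illustration, not fedsim's implementation.

```python
import random
from typing import List


def sample_clients(num_clients: int, sample_rate: float, scheme: str,
                   round_idx: int, seed: int = 0) -> List[int]:
    """Pick client ids for one round (illustrative, not fedsim's code)."""
    # Number of clients per round; at least one client is always selected.
    k = max(1, round(sample_rate * num_clients))
    if scheme == "uniform":
        rng = random.Random(seed + round_idx)
        return sorted(rng.sample(range(num_clients), k))
    if scheme == "sequential":
        start = (round_idx * k) % num_clients
        return [(start + i) % num_clients for i in range(k)]
    raise ValueError(f"unknown sample scheme: {scheme!r}")


first = sample_clients(10, 0.3, "sequential", round_idx=0)
second = sample_clients(10, 0.3, "sequential", round_idx=1)
uniform = sample_clients(10, 0.3, "uniform", round_idx=0)
```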

Note

The definition of

learning rate schedulers can be any of the ones defined in torch.optim.lr_scheduler, or any other that implements the step and get_last_lr methods.
optimizers can be any torch.optim.Optimizer.
the model can be any torch.nn.Module.
the criterion can be any fedsim.scores.Score.
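The note implies two things that a small example may clarify: a "definition" is a callable with some arguments pre-bound (as the functools.partial defaults in the signature suggest), and a custom scheduler only needs step and get_last_lr. The sketch below uses stand-in classes, ToyOptimizer and HalvingScheduler, which are hypothetical and written in plain Python so the example runs without torch.

```python
import functools
from typing import List


class ToyOptimizer:
    """Stand-in for an optimizer; it only stores a learning rate."""

    def __init__(self, lr: float) -> None:
        self.lr = lr


class HalvingScheduler:
    """Halves the learning rate every `period` calls to step()."""

    def __init__(self, optimizer: ToyOptimizer, period: int = 1) -> None:
        self.optimizer = optimizer
        self.period = period
        self._steps = 0

    def step(self) -> None:
        self._steps += 1
        if self._steps % self.period == 0:
            self.optimizer.lr *= 0.5

    def get_last_lr(self) -> List[float]:
        return [self.optimizer.lr]


# Definitions: callables with some arguments pre-bound. The remaining
# arguments are supplied later by whoever constructs the object.
optimizer_def = functools.partial(ToyOptimizer, lr=0.1)
scheduler_def = functools.partial(HalvingScheduler, period=2)

optimizer = optimizer_def()
scheduler = scheduler_def(optimizer)
for _ in range(4):
    scheduler.step()
last_lr = scheduler.get_last_lr()
```

Passing definitions instead of instances lets the algorithm build a fresh optimizer and scheduler wherever it needs one, for example once per client.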

deploy()

return a mapping of name -> parameters_set to test the model

Parameters: server_storage (Storage) -- server storage object.

init()

This method is executed only once, when the algorithm object is instantiated. Define your model and anything else needed during training here. Remember to write the outcome of your processing to server_storage so that other methods can access it.

Note

*args and **kwargs are directly passed through from algorithm constructor.

Parameters: server_storage (Storage) -- server storage object

optimize(serial_aggregator, appendix_aggregator)

optimize server model(s) and return scores to be reported

Parameters

server_storage (Storage) -- server storage object.
serial_aggregator (SerialAggregator) -- serial aggregator instance of current round.
appendix_aggregator (AppendixAggregator) -- appendix aggregator instance of current round.

Raises

NotImplementedError -- abstract class to be implemented by child

Returns

Mapping[Hashable, Any] -- context to be reported
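One common way to phrase the server update, assumed here rather than taken from this page, is to treat the difference between the current global parameters and the aggregated client parameters as a pseudo-gradient and apply a plain SGD step to it. Under that assumption, the default server learning rate of 1.0 shown in the class signature amounts to replacing the global model with the client average. The function server_sgd_step is hypothetical.

```python
from typing import List, Sequence


def server_sgd_step(global_params: Sequence[float],
                    averaged_client_params: Sequence[float],
                    lr: float = 1.0) -> List[float]:
    """One SGD step on the pseudo-gradient (global minus client average)."""
    pseudo_grad = [g - a for g, a in zip(global_params, averaged_client_params)]
    return [g - lr * d for g, d in zip(global_params, pseudo_grad)]


# lr=1.0 lands exactly on the client average; lr=0.5 moves halfway there.
full_step = server_sgd_step([1.0, 2.0], [0.0, 4.0], lr=1.0)
half_step = server_sgd_step([1.0, 2.0], [0.0, 4.0], lr=0.5)
```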

receive_from_client(client_id, client_msg, train_split_name, serial_aggregator, appendix_aggregator)

receive and aggregate info from selected clients

Parameters

server_storage (Storage) -- server storage object.
client_id (int) -- id of the sender (client)
client_msg (Mapping[Hashable, Any]) -- client context that is sent.
train_split_name (str) -- name of the training split on clients.
aggregator (SerialAggregator) -- aggregator instance to collect info.

Returns

bool -- success of the aggregation.

Raises

NotImplementedError -- abstract class to be implemented by child
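The idea of folding each client message into an aggregator as it arrives, and returning a success flag, can be sketched as follows. RunningWeightedMean is a hypothetical stand-in and does not reproduce the interface of fedsim's SerialAggregator; the message keys "loss" and "num_samples" are likewise assumptions.

```python
from typing import Dict


class RunningWeightedMean:
    """Accumulates weighted values per key without storing every message."""

    def __init__(self) -> None:
        self._weighted_sum: Dict[str, float] = {}
        self._weight: Dict[str, float] = {}

    def add(self, key: str, value: float, weight: float) -> None:
        self._weighted_sum[key] = self._weighted_sum.get(key, 0.0) + weight * value
        self._weight[key] = self._weight.get(key, 0.0) + weight

    def get(self, key: str) -> float:
        return self._weighted_sum[key] / self._weight[key]


def receive(aggregator: RunningWeightedMean, client_msg: dict) -> bool:
    """Fold one client message into the aggregator and report success."""
    if client_msg.get("num_samples", 0) <= 0:
        return False  # nothing to aggregate from an empty client
    aggregator.add("loss", client_msg["loss"], client_msg["num_samples"])
    return True


agg = RunningWeightedMean()
ok_a = receive(agg, {"loss": 1.0, "num_samples": 10})
ok_b = receive(agg, {"loss": 3.0, "num_samples": 30})
ok_c = receive(agg, {"loss": 9.0, "num_samples": 0})
mean_loss = agg.get("loss")
```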

report(dataloaders, rounds, scores, metric_logger, device, optimize_reports, deployment_points=None)

test on global data and report info. If a flat dict of str: Union[int, float] is returned from this function, its content is automatically logged using the metric logger (e.g., logall.TensorboardLogger). metric_logger is also passed as an input argument for extra logging operations (non-scalar).

Parameters

server_storage (Storage) -- server storage object.
dataloaders (Any) -- dict of data loaders to test the global model(s)
round_scores (Dict[str, Dict[str, fedsim.scores.Score]]) -- dictionary of form {'split_name':{'score_name': score_def}} for global scores to evaluate at the current round.
metric_logger (Any, optional) -- the logging object (e.g., logall.TensorboardLogger)
device (str) -- 'cuda', 'cpu' or gpu number
optimize_reports (Mapping[Hashable, Any]) -- dict returned by optimizer
deployment_points (Mapping[Hashable, torch.Tensor], optional) -- output of deploy method

Raises

NotImplementedError -- abstract class to be implemented by child
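Since only a flat dict of scalar values is logged automatically, nested results need flattening before they are returned. The helper below is a minimal sketch of that step; flatten_scores and the "." separator are assumptions for illustration and are not part of fedsim or logall.

```python
from typing import Dict, Mapping, Union

Scalar = Union[int, float]


def flatten_scores(nested: Mapping[str, Mapping[str, Scalar]],
                   sep: str = ".") -> Dict[str, Scalar]:
    """Turn {'split': {'score': value}} into {'split.score': value}."""
    return {f"{split}{sep}{name}": value
            for split, scores in nested.items()
            for name, value in scores.items()}


report = flatten_scores({"test": {"accuracy": 0.91, "loss": 0.3},
                         "valid": {"accuracy": 0.89}})
```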

send_to_client(client_id)

returns context to send to the client corresponding to client_id.

Warning

Do not send shared objects, such as the server model, to a client without deep-copying them first.

Parameters

server_storage (Storage) -- server storage object.
client_id (int) -- id of the receiving client

Raises

NotImplementedError -- abstract class to be implemented by child

Returns

Mapping[Hashable, Any] -- the context to be sent in form of a Mapping
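The warning about shared objects is easiest to see with a small aliasing example. The sketch below uses a plain dict as a stand-in for a server model (an assumption for illustration) and shows how an in-place client update leaks into server state unless the object is deep-copied first.

```python
import copy

server_state = {"weights": [0.0, 0.0]}

# Sharing the object: a client's in-place update leaks into the server state.
shared_ctx = {"model": server_state}
shared_ctx["model"]["weights"][0] = 1.0
leaked = server_state["weights"][0]

# Deep copy: the client works on its own independent copy.
server_state = {"weights": [0.0, 0.0]}
safe_ctx = {"model": copy.deepcopy(server_state)}
safe_ctx["model"]["weights"][0] = 1.0
preserved = server_state["weights"][0]
```

In a simulation where all clients live in the same process, this kind of aliasing would let one client's training silently change what the next client receives.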

send_to_server(rounds, storage, datasets, train_split_name, scores, epochs, criterion, train_batch_size, inference_batch_size, optimizer_def, lr_scheduler_def=None, device='cuda', ctx=None, step_closure=None)

client operation on the received information.

Parameters

id (int) -- id of the client
rounds (int) -- global round number
storage (Storage) -- storage object of the client
datasets (Dict[str, Iterable]) -- this comes from Data Manager
train_split_name (str) -- string containing name of the training split
scores (Dict[str, Dict[str, Score]]) -- dictionary of form {'split_name':{'score_name': Score}} for global scores to evaluate at the current round.
epochs (int) -- number of epochs to train
criterion (Score) -- criterion, should be a differentiable fedsim.scores.Score
train_batch_size (int) -- training batch_size
inference_batch_size (int) -- inference batch_size
optimizer_def (Callable) -- class for constructing the local optimizer
lr_scheduler_def (Callable) -- class for constructing the local lr scheduler
device (Union[int, str], optional) -- Defaults to 'cuda'.
ctx (Optional[Dict[Hashable, Any]], optional) -- context received.

Returns

Mapping[str, Any] -- client context to be sent to the server
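The client-side step can be pictured as a few epochs of mini-batch training followed by packing the result into a mapping. The sketch below does this for a one-parameter least-squares model in plain Python. The function local_train and the keys "local_params", "num_samples" and "loss" are hypothetical; the actual contents of the context returned by fedsim are not specified on this page.

```python
from typing import Any, Dict, List, Tuple


def local_train(w: float, data: List[Tuple[float, float]], epochs: int,
                batch_size: int, lr: float) -> Dict[str, Any]:
    """Run mini-batch SGD on 0.5 * (w * x - y) ** 2 and return a client context."""
    for _ in range(epochs):
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Mean gradient of the squared error over the mini-batch.
            grad = sum((w * x - y) * x for x, y in batch) / len(batch)
            w -= lr * grad
    loss = sum(0.5 * (w * x - y) ** 2 for x, y in data) / len(data)
    return {"local_params": w, "num_samples": len(data), "loss": loss}


# Hypothetical client data generated by y = 2 * x.
client_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
ctx = local_train(w=0.0, data=client_data, epochs=20, batch_size=2, lr=0.05)
```

Returning the sample count alongside the parameters lets the server weight each client's contribution by the amount of data it trained on.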