Federated Learning

Federated learning requires defining an aggregation strategy, i.e. a method for combining the local models coming from the clients into a single global one. Below are some common variants of federated learning:

Federated averaging (FedAvg)

The standard and simplest aggregation strategy is federated averaging (FedAvg).

The learning is performed in rounds. At each round, the server samples a set of 𝑚 clients (out of the total 𝐾 clients) which will be considered for the current iteration and sends them the current global model. These clients update the parameters of their local copy of the model by optimizing the loss 𝐹𝑘 on their local training data using SGD for 𝐸 epochs. At the end of the round, the local parameters are sent to the server, which aggregates them by performing a weighted average. The aggregated parameters define the global model for the next round.
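As a minimal sketch of the server-side aggregation step described above, the snippet below computes the weighted average of the client updates, assuming each selected client returns its updated weights as a list of NumPy arrays together with its local sample count 𝑛𝑘. The function name and data layout are illustrative rather than taken from any particular framework.

import numpy as np

def fedavg_aggregate(client_params, client_sizes):
    """Weighted average of client parameters (FedAvg aggregation step).

    client_params: list over clients, each a list of np.ndarray (one per layer)
    client_sizes:  list of local training-set sizes n_k, used as weights
    """
    total = float(sum(client_sizes))
    num_layers = len(client_params[0])
    global_params = []
    for layer in range(num_layers):
        # sum_k (n_k / n) * w_k for this layer
        weighted = sum(
            (n_k / total) * params[layer]
            for params, n_k in zip(client_params, client_sizes)
        )
        global_params.append(weighted)
    return global_params

Weighting each client by its sample count 𝑛𝑘 means the aggregate optimizes the global objective that weights each local loss 𝐹𝑘 by the fraction of data held by client 𝑘.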

Federated learning (FL) is a privacy-preserving distributed machine learning (ML) paradigm. In FedAvg, a central server coordinates a large number of clients (e.g., mobile phones, tablets, etc.); the clients keep their data local and never share it with the server. In each communication round, a small fraction of the clients is selected; these clients receive the current global model from the server and update it by running stochastic gradient descent (SGD) for multiple iterations on their local data. The central server then aggregates these updated parameters to obtain the new global model. In particular, if the clients are homogeneous, FedAvg is equivalent to local SGD. FedAvg performs multiple local SGD updates and only one server aggregation per communication round, which significantly reduces the communication cost between server and clients compared to conventional distributed training, where every single SGD update is followed by a communication step.
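Complementing the aggregation sketch above, here is a rough sketch of the client-side step: several local SGD updates on private data before anything is sent back to the server. The linear model, squared-error loss, and the hyperparameters (epochs, lr, batch_size) are placeholders chosen only to keep the example self-contained.

import numpy as np

def local_update(w_global, X, y, epochs=5, lr=0.01, batch_size=32):
    """Run E epochs of minibatch SGD on the client's local data (X, y),
    starting from the current global parameters w_global.
    A plain linear model with squared-error loss stands in for the
    client's real model."""
    w = w_global.copy()
    n = X.shape[0]
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            # Gradient of the mean squared error for the linear model Xb @ w
            grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(batch)
            w -= lr * grad
    # Only the final local parameters are communicated back to the server,
    # after many local gradient steps.
    return w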

FedProx

Another strategy is FedProx, a generalization of FedAvg designed to address heterogeneity in both data (non-IID local distributions) and systems (clients with varying compute power and availability). Its main modification is a proximal term added to each client's local objective, which penalizes the distance between the local model and the current global model and keeps local updates from drifting too far during the round.
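A minimal sketch of the idea, reusing the same toy linear model as above: the proximal term (𝜇/2)‖w − w_global‖² added to the local objective contributes an extra 𝜇(w − w_global) to every gradient step. The value of 𝜇 and the rest of the setup are illustrative.

import numpy as np

def fedprox_local_update(w_global, X, y, mu=0.1, epochs=5, lr=0.01, batch_size=32):
    """Local update with a FedProx-style proximal term.

    The client minimizes F_k(w) + (mu / 2) * ||w - w_global||^2,
    which keeps the local model close to the current global model."""
    w = w_global.copy()
    n = X.shape[0]
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(batch)  # gradient of F_k
            grad += mu * (w - w_global)                      # proximal term
            w -= lr * grad
    return w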