Personalized Federated Learning with Robust and Sparse Regressions
Federated learning is an emerging topic due to its advantage in collaborative learning with distributed data.
the paper proposes a personalized federated learning method to address the robust regression problem:
- learn the regression weights by minimizing a Huber loss with a sparse fused penalty (see the Huber loss reminder after this list)
- design the personalized federated learning for robust and sparse regression (PerFL-RSR) algorithm to solve the estimation problem in the federated system efficiently
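for reference, the Huber loss with threshold $\delta$ (standard definition; the symbol $\delta$ is my notation):

\[H_\delta(r) = \begin{cases} \frac{1}{2}r^2 & \vert r\vert \le \delta\\ \delta\vert r\vert - \frac{1}{2}\delta^2 & \vert r\vert > \delta\end{cases}\]

it is quadratic near zero (efficient under light-tailed noise) and linear in the tails, which bounds the influence of outliers.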
In an FL system, a large number of clients collaboratively train a machine learning model coordinated by a central server.
Rather than transferring all the data to the central server as in a traditional distributed learning system, FL allows the clients to keep their local data and thus gain a basic level of data privacy.
However, due to the heterogeneous data-generating mechanisms among the clients, two new challenges emerge that might cause performance deterioration in FL:
- statistical heterogeneity
- system heterogeneity
many recent studies address data heterogeneity:
- feature normalization: Huang and Belongie (2017), Choi et al. (2021)
- model weight regularization: Karimireddy et al. (2020)
- improving aggregation: Reddi et al. (2020), Wang et al. (2020)
however, the above methods focus on training a globally shared model, i.e., they assume a common global model for all clients. this assumption might not be appropriate, especially when data distributions differ significantly among clients
traditional FL with the global model-sharing assumption would thus sacrifice the generalizability of the model
to balance generalization and personalization, personalized federated learning (PFL) is proposed
Rather than aggregating updates into one global model as in FL, in PFL the server needs to learn the relationship or similarity between local models and then generate a personalized model for each client.
many recent studies for PFL:
- fine-tuning: Fallah, Mokhtari, and Ozdaglar (2020)
- multi-task learning (MTL): Smith et al. (2017), Huang et al. (2021)
- clustered federated learning: Ghosh et al. (2020), Sattler, Muller, and Samek (2020)
- parameter decoupling: Arivazhagan et al. (2019)
- knowledge distillation: Li and Wang (2019)
this work combines the idea of MTL and clustered FL
- MTL achieves personalization by studying the similarity between local client models
- clustered FL achieves personalization through inherent partitions of all local client models
system heterogeneity becomes a considerable bottleneck in FL, especially when there are a large number of local clients in the network, such as learning over mobile phones, wearable devices, and autonomous vehicles.
- because of the different conditions of the clients, e.g., the network connection and power status of the devices, it may be impractical to involve all the clients in each communication round
- thus, the paper considers designing a federated learning algorithm that allows a low client participation rate per communication round
in the paper, they focus on personalized federated learning for robust regression problems
- the study is motivated by the fact that many real-world datasets are contaminated with heavy-tailed noise and abnormal values, especially in federated learning systems with massive data sources
- they also consider a high-dimensional regression setting, where the high dimensionality necessitates sparsity recovery in the model
in this work, the objective is to develop a personalized federated learning method for robust and sparse regression.
main contributions:
- balance the tradeoff of personalization and generalization for the robust sparse regression problem under the federated learning paradigm
- propose a novel learning loss, which consists of the Huber loss for robustness, the client-wise fusion regularizer for personalization, and the sparse regularizer for sparsity recovery
- develop an alternating direction method of multipliers (ADMM) based algorithm in the federated server-client system to solve the proposed loss, called PerFL-RSR
- it addresses system heterogeneity and communication efficiency by randomly sampling clients for each server update (a schematic sketch follows this list)
- establish the convergence theory for the proposed PerFL-RSR algorithm
- establish the consistency properties for the proposed estimator
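a minimal schematic of the partial-participation server loop (my sketch, not the paper's PerFL-RSR updates: the actual algorithm solves ADMM subproblems and fuses the client models through the penalty, which is elided here):

```python
import numpy as np

def huber_grad(r, delta=1.345):
    """Gradient of the Huber loss w.r.t. the residuals r (elementwise)."""
    return np.where(np.abs(r) <= delta, r, delta * np.sign(r))

def local_update(X, y, beta, lr=0.1):
    """One local Huber-gradient step on a client; a stand-in for the
    paper's local ADMM subproblem."""
    r = X @ beta - y
    return beta - lr * X.T @ huber_grad(r) / len(y)

def partial_participation_loop(data, n_rounds=50, participation=0.2, seed=0):
    """Each round the server samples only a fraction of the M clients,
    mimicking the low participation rate the paper allows; `data` is a
    list of (X_m, y_m) pairs, one per client."""
    rng = np.random.default_rng(seed)
    M, p = len(data), data[0][0].shape[1]
    betas = [np.zeros(p) for _ in range(M)]
    for _ in range(n_rounds):
        sampled = rng.choice(M, size=max(1, int(participation * M)), replace=False)
        for m in sampled:
            X, y = data[m]
            betas[m] = local_update(X, y, betas[m])
        # the server-side fusion/sparsity step of PerFL-RSR would go here
    return betas
```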
Federated MTL:
- MOCHA algorithm (Smith et al., 2017)
- FedAMP (Huang et al., 2021): propose a message-passing mechanism to solve loss with pairwise regularization terms among all clients
- both are designed for convex objectives and are not applicable to the proposed non-convex loss
clustered FL assumes that there are homogeneous groups of clients in terms of local data distributions
- IFCA (Ghosh et al., 2020) learns $K$ global models on the server, and then each client chooses the model with the smallest local loss (a sketch of this selection step follows this list)
- the server needs to communicate $K$ times more information
- prior knowledge of the number of groups $K$ is required
- Sattler, Muller, and Samek (2020):
- authors proposed an algorithm with a post-processing step of clustering
- use recursive bi-partitioning clustering, which increases computation and communication costs
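the IFCA selection step is simple enough to sketch (my illustration of the idea with a squared-error local loss, not the reference implementation):

```python
import numpy as np

def choose_model(X, y, global_models):
    """IFCA-style cluster assignment: the client evaluates all K global
    models on its local data and keeps the one with the smallest loss."""
    losses = [np.mean((X @ beta - y) ** 2) for beta in global_models]
    return int(np.argmin(losses))
```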
robust regressions:
- median-of-means (sketch after this list)
- quantile regression
- Huber loss
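e.g., the standard median-of-means construction for a univariate mean:

```python
import numpy as np

def median_of_means(x, n_blocks=10, seed=0):
    """Split the sample into blocks, average each block, and take the
    median of the block means; the median step bounds the influence of
    heavy-tailed draws that would wreck the plain sample mean."""
    rng = np.random.default_rng(seed)
    blocks = np.array_split(rng.permutation(x), n_blocks)
    return np.median([b.mean() for b in blocks])
```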
consider a federation of $M$ clients
- each client owns a local dataset $\cD_m = \{x_{mi}, y_{mi}\}_{i=1}^{n_m}$, $m=1,\ldots,M$
- assume client-specific linear models: $y_{mi} = x_{mi}^\top \beta_m^\star + \epsilon_{mi}$
adopt a regularization term on the scalar-wise (coordinate-by-coordinate) pairwise differences $\vert \beta_{mj} - \beta_{m'j}\vert$ for $j\in [p]$.
then the PFL solution is obtained by solving
\[\argmin_{\beta_1,\ldots,\beta_M\in \IR^p} \frac 1M\sum_{m=1}^M l_m(\beta_m) + \sum_{j=1}^p\sum_{m\le m'} p_\lambda(\vert \beta_{mj} - \beta_{m'j}\vert)\]
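a direct (unoptimized) evaluation of this objective, taking $l_m$ to be the average Huber loss on client $m$ and MCP as a stand-in for the non-convex $p_\lambda$ (the paper's exact penalty choice is an assumption here):

```python
import numpy as np

def huber(r, delta=1.345):
    """Huber loss: quadratic for |r| <= delta, linear beyond."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r ** 2, delta * a - 0.5 * delta ** 2)

def mcp(t, lam=0.5, gamma=3.0):
    """MCP penalty p_lambda(|t|), a common non-convex choice:
    lam*|t| - t^2/(2*gamma) for |t| <= gamma*lam, constant after."""
    a = np.abs(t)
    return np.where(a <= gamma * lam, lam * a - a ** 2 / (2 * gamma),
                    0.5 * gamma * lam ** 2)

def pfl_objective(betas, data, lam=0.5):
    """(1/M) * sum_m l_m(beta_m) + sum_j sum_{m<m'} p_lambda(|beta_mj - beta_m'j|),
    with l_m the average Huber loss on client m's local data."""
    M = len(betas)
    fit = np.mean([np.mean(huber(y - X @ b)) for (X, y), b in zip(data, betas)])
    fusion = sum(mcp(betas[m] - betas[m2], lam=lam).sum()
                 for m in range(M) for m2 in range(m + 1, M))
    return fit + fusion
```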