Decentralized Mutual Learning for Multi-Class Federated 3D Organ-at-Risk Segmentation
Abstract
Purpose
To generalize decentralized mutual learning to multi-class 3D organ-at-risk (OAR) segmentation and evaluate whether gossip-based peer-to-peer training improves federated radiotherapy auto-segmentation across heterogeneous clinical sites.
Methods
Multi-institutional radiotherapy segmentation is challenged by domain shift, class imbalance, and restrictions on data sharing. We developed a decentralized mutual learning framework in which each site trains a local 3D segmentation network and intermittently exchanges models with another site through gossip-based peer-to-peer communication. The key methodological development is the extension of mutual-learning-based distillation, commonly demonstrated in binary settings, to multi-OAR segmentation: class-wise peer distillation with region-focused agreement is added while standard supervised training is preserved for all classes. The method was evaluated on a private head-and-neck radiotherapy dataset (400 CT scans in total) with 15 OARs, partitioned into six sites defined by treating physician. We compared (i) site-specific individual training, (ii) standard federated learning (FedAvg), and (iii) decentralized mutual learning, using the Dice similarity coefficient (DSC) on held-out test sets.
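The two mechanisms described above, gossip pairing of sites and class-wise peer distillation, can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the random pairing rule, the per-class binary KL divergence, and the definition of each class's "region" (voxels labeled as that class or predicted as that class by the peer) are all assumptions made for illustration. Voxels are flattened into a list, and each model output is a list of per-voxel class probabilities.

```python
import math
import random


def gossip_pairs(sites, rng):
    """Randomly pair sites for one gossip round (assumed pairing rule).

    With an odd number of sites, one site sits out the round.
    """
    shuffled = list(sites)
    rng.shuffle(shuffled)
    return [(shuffled[i], shuffled[i + 1]) for i in range(0, len(shuffled) - 1, 2)]


def binary_kl(p, q, eps=1e-8):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped for stability."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def classwise_peer_distillation(local_probs, peer_probs, labels, num_classes):
    """Per-class distillation term between a local model and a received peer model.

    Assumption: the region for class c is every voxel whose label is c or
    whose peer argmax prediction is c. Classes with an empty region are skipped.
    """
    per_class = {}
    for c in range(num_classes):
        region = [
            i
            for i, (peer, y) in enumerate(zip(peer_probs, labels))
            if y == c or max(range(num_classes), key=peer.__getitem__) == c
        ]
        if not region:
            continue
        per_class[c] = sum(
            binary_kl(peer_probs[i][c], local_probs[i][c]) for i in region
        ) / len(region)
    return per_class


# Example: six hypothetical sites, one gossip round.
pairs = gossip_pairs(["A", "B", "C", "D", "E", "F"], random.Random(0))

# Example: two voxels, two classes; the local model is less confident than the peer.
peer = [[0.9, 0.1], [0.2, 0.8]]
local = [[0.5, 0.5], [0.5, 0.5]]
distill = classwise_peer_distillation(local, peer, labels=[0, 1], num_classes=2)
```

In a full training loop, each site's objective would presumably combine its usual supervised segmentation loss with a weighted average of these per-class terms; the weighting is not specified here.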
Results
On the private head-and-neck dataset, mean DSC was 0.7625 for site-specific individual models, 0.7786 for FedAvg, and 0.7804 for decentralized mutual learning. Compared with FedAvg, decentralized mutual learning achieved a slightly higher mean DSC (+0.0018) while requiring substantially lower communication cost. Decentralized mutual learning consistently improved performance for several OARs and mitigated performance degradation under non-IID inter-site heterogeneity, with particular robustness gains observed for select small structures at certain sites.
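For reference, the DSC for one class is 2|P ∩ T| / (|P| + |T|), where P and T are the predicted and ground-truth voxel sets. The sketch below computes it per class on flattened label maps and averages over foreground classes. The helper names are hypothetical, and the choices to treat label 0 as background and to skip classes absent from both prediction and ground truth are assumptions; how the reported means were aggregated across OARs and sites is not specified here.

```python
def dice_per_class(pred, target, num_classes):
    """Dice similarity coefficient for each foreground class.

    pred and target are flat sequences of integer labels of equal length.
    Label 0 is treated as background and excluded (assumption).
    """
    scores = {}
    for c in range(1, num_classes):
        pred_c = sum(1 for p in pred if p == c)
        target_c = sum(1 for t in target if t == c)
        if pred_c + target_c == 0:
            continue  # class absent from both maps: DSC undefined, skip it
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        scores[c] = 2.0 * inter / (pred_c + target_c)
    return scores


def mean_dice(scores):
    """Unweighted mean of the per-class DSC values."""
    return sum(scores.values()) / len(scores)


# Toy example with two foreground classes on four voxels.
scores = dice_per_class(pred=[0, 1, 1, 2], target=[0, 1, 2, 2], num_classes=3)
overall = mean_dice(scores)
```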
Conclusion
Decentralized mutual learning can be extended to practical multi-class radiotherapy OAR segmentation and can improve federated performance, particularly under inter-site heterogeneity in head-and-neck organ segmentation. These preliminary results support gossip-based peer-to-peer mutual learning as a viable alternative to standard federated averaging for multi-institution radiotherapy auto-segmentation.