FedNLP: A research platform for federated learning in natural language processing

Abstract

Increasing concerns and regulations about data privacy necessitate the study of privacy-preserving methods for natural language processing (NLP) applications. Federated learning (FL) provides promising methods for a large number of clients (i.e., personal devices or organizations) to collaboratively learn a shared global model that benefits all clients, while allowing users to keep their data locally. To facilitate FL research in NLP, we present FedNLP, a research platform for federated learning in NLP. FedNLP supports popular task formulations in NLP such as text classification, sequence tagging, question answering, seq2seq generation, and language modeling. We also implement an interface between Transformer language models (e.g., BERT) and FL methods (e.g., FedAvg, FedOpt) for distributed training. The evaluation protocol of this interface supports a comprehensive collection of non-IID partitioning strategies. Our preliminary experiments with FedNLP reveal that there exists a large performance gap between learning on decentralized and centralized datasets, opening intriguing and exciting future research directions aimed at developing FL methods suited to NLP tasks.
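
To make the two technical ingredients of the abstract concrete, the Python sketch below illustrates (a) FedAvg-style aggregation, where client model weights are averaged in proportion to local dataset size, and (b) a Dirichlet-based label-skew partition, a common way to produce non-IID client splits. This is a minimal sketch under stated assumptions: the function names, the `alpha` skew parameter, and the seed handling are illustrative and are not FedNLP's actual API.

```python
from collections import OrderedDict

import numpy as np


def fedavg_aggregate(client_states, client_sizes):
    """FedAvg: average client weights, weighted by local dataset size.

    `client_states` is a list of model state dicts (e.g., torch state_dicts
    or dicts of numpy arrays); `client_sizes` gives each client's example count.
    """
    total = float(sum(client_sizes))
    global_state = OrderedDict()
    for key in client_states[0]:
        global_state[key] = sum(
            state[key] * (size / total)
            for state, size in zip(client_states, client_sizes)
        )
    return global_state


def dirichlet_partition(labels, num_clients, alpha=0.5, seed=0):
    """Split example indices across clients with label skew drawn from Dir(alpha).

    For each class, per-client proportions are sampled from a Dirichlet
    distribution; smaller `alpha` yields more skewed (more non-IID) splits.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        cls_idx = rng.permutation(np.where(labels == cls)[0])
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        split_points = (np.cumsum(proportions)[:-1] * len(cls_idx)).astype(int)
        for client, shard in enumerate(np.split(cls_idx, split_points)):
            client_indices[client].extend(shard.tolist())
    return client_indices
```

In setups like this, `alpha` is the usual knob for how non-IID a synthetic partition is: `alpha → ∞` approaches an IID split, while small values concentrate each class on a few clients.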

Publication
In NAACL Conference