A Transformer-Based Approach to Multilingual Fake News Detection in Low-Resource Languages
De, Arkadipta and Bandyopadhyay, Dibyanayan and Gain, Baban and Ekbal, Asif (2022) A Transformer-Based Approach to Multilingual Fake News Detection in Low-Resource Languages. ACM Transactions on Asian and Low-Resource Language Information Processing, 21 (1). pp. 1-20. ISSN 2375-4699
Abstract
Fake news classification has attracted considerable attention from researchers in artificial intelligence, natural language processing, and machine learning (ML). Most current work on fake news detection targets the English language, which limits its usability outside the English-literate population. Although multilingual web content has grown, fake news classification in low-resource languages remains a challenge because annotated corpora and tools are unavailable. This article proposes an effective neural model based on multilingual Bidirectional Encoder Representations from Transformers (BERT) for domain-agnostic multilingual fake news classification. A wide variety of experiments, including language-specific and domain-specific settings, are conducted. The proposed model achieves high accuracy in domain-specific and domain-agnostic experiments, and it also outperforms the current state-of-the-art models. We perform experiments in zero-shot settings to assess the effectiveness of language-agnostic feature transfer across different languages, showing encouraging results. Cross-domain transfer experiments are also performed to assess language-independent feature transfer of the model. We also offer a multilingual multidomain fake news detection dataset of five languages and seven different domains that could be useful for research and development in resource-scarce scenarios. © 2021 Association for Computing Machinery.
Item Type: | Article
---|---
Uncontrolled Keywords: | Fake news detection; Hindi; Indonesian; low-resource languages; multilingual; Swahili; Vietnamese
Subjects: | Computer science
Divisions: | Department of Computer Science & Engineering
Depositing User: | . LibTrainee 2021
Date Deposited: | 26 Jul 2022 04:05
Last Modified: | 26 Jul 2022 04:05
URI: | http://raiithold.iith.ac.in/id/eprint/9921
Publisher URL: | http://doi.org/10.1145/3472619
OA policy: | https://v2.sherpa.ac.uk/id/publication/10669