Data augmentation approach for language identification in imbalanced bilingual code-mixed social media datasets
Addressing the problem of language identification in code-mixed datasets poses notable challenges due to data scarcity and high confusability in bilingual contexts. These challenges are further amplified by the associated imbalance and noise characteristic of social media data, complicating efforts...
| Main Authors: | |
|---|---|
| Format: | Conference or Workshop Item |
| Language: | English |
| Published: | Institute of Electrical and Electronics Engineers Inc., 2023 |
| Subjects: | |
| Online Access: | http://umpir.ump.edu.my/id/eprint/40378/ http://umpir.ump.edu.my/id/eprint/40378/1/Data%20augmentation%20approach%20for%20language%20identification.pdf http://umpir.ump.edu.my/id/eprint/40378/2/Data%20augmentation%20approach%20for%20language%20identification%20in%20imbalanced%20bilingual%20code-mixed%20social%20media%20datasets_ABS.pdf |