News Story
Sentiment Analysis for Arabizi: A Multilingual Jargon on Social Media
Monday 29 May 2017
After almost two years at the lab, I am pleased to finally share my research with my friends. Last Thursday, 25 May, I presented my work in NLP, titled "Sentiment Analysis for Arabizi: A Multilingual Jargon". Arabizi is a transcription of naturally dialectal Arabic in Latin script, quite common in mobile texting and social media. Like Modern Standard Arabic (MSA), or Al-Fus·ha, it is rich in morphology. Unlike MSA, it has no standard orthography, since it transcribes a spoken language; it is often code-switched with English or French and found within multilingual streams of social data; and it is under-resourced, lacking classical NLP tools such as lexicons, stemmers, parsers, and labelled datasets. I talked about a pilot case study that we conducted last year to analyse the usage of Arabizi in Twitter data, and then moved on to the challenges of processing Arabizi and extracting sentiment from it. I presented some preliminary results, along with my current and future lines of research. I thank Rania Islambouli, Omar Farhat, and Omar Osman for their contributions in annotating a Lebanese-dialect Arabizi dataset, which will be prepared and released publicly soon on project-rbz.com. You can watch the talk (20 min) via the link below.
Related Links: