A Supervised Machine Learning Approach for Events Extraction out of Arabic Tweets

Mohammad ALSmadi
Omar Qawasmeh

Tweets hold a rich amount of information about daily events, but they are noisy, personalized, and difficult for machines to understand. This research therefore proposes a state-of-the-art supervised machine learning approach for extracting events from Arabic tweets. The proposed approach covers four research tasks: Task 1 (T1): event trigger extraction, Task 2 (T2): event time expression extraction, Task 3 (T3): event type identification, and Task 4 (T4): temporal resolution for ontology population. The event arguments extracted by these tasks are used to populate an event ontology designed for this purpose. This ontology in turn feeds a visualization tool (e.g., a calendar) that presents the extracted events live. The proposed approach was evaluated on a dataset of 2k Arabic tweets, and the evaluation results were promising. Its performance was compared, on the same dataset, with an unsupervised rule-based approach from previous work. The results show that the proposed approach outperforms the unsupervised rule-based one in T1: event trigger extraction (F-1 = 92.6 vs. F-1 = 78.7) and T2: event time expression extraction (F-1 = 92.8 vs. F-1 = 88.35), but performs worse in T3: event type identification (accuracy = 80.1 vs. accuracy = 95.9).
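
For readers unfamiliar with the F-1 measure reported for T1 and T2, the following is a minimal sketch of the standard definition (the harmonic mean of precision and recall). The abstract does not state how matches were counted, so the function name `f1_score` and its count-based inputs are assumptions made for illustration only, not the authors' evaluation code.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Standard F-1 measure: harmonic mean of precision and recall.

    Hypothetical helper; the counts would come from comparing extracted
    items (e.g. event triggers) against gold annotations.
    """
    predicted = true_positives + false_positives  # items the system extracted
    actual = true_positives + false_negatives     # items in the gold annotation
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

The scores in the abstract appear to be this quantity expressed as a percentage, so a returned value would be multiplied by 100 for comparison.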
