Hybrid Model For Amharic Sentiment Classification



Amharic is a less-resourced language that lacks a standard dictionary, a stemmer, a language detector, tools for subjectivity detection and negation handling, an Amharic sentiment lexicon and annotated Amharic corpora for sentiment classification of social media texts. This research focuses on sentiment analysis of Amharic texts. The first part of the research generates most of these required resources, and the second part enhances the performance of sentiment classification using the proposed approaches.

In this research, four categories of corpora are prepared: general corpora (category I), annotated corpora (category II), lexical resources (category III) and pre-trained models (category IV). The annotated corpora, namely Facebook comments of GCAO (2,871), PMO (6,637) and EBC (2,444) and Zemen YouTube comments (1,440), are used for building and evaluating Amharic sentiment classification approaches. To remedy the problems of sentiment analysis for an under-resourced language (Amharic in this case), Amharic sentiment lexicons are generated using dictionary-based and corpus-based approaches. With the dictionary-based approach, SO-CAL (5,681) and SWN (13,677) are generated from English sentiment lexicons (using category III). With the corpus-based approach, Amharic sentiment seeds are expanded to generate an Amharic sentiment lexicon from Amharic corpora (using category I). At the threshold of 500, the generated lexicon has a size of 8,132. The generated lexicons are evaluated in terms of subjectivity detection, coverage, agreement and accuracy by comparison with the manual lexicon (baseline), and they are used for subjectivity detection and negation handling. For sentiment classification (SC) of text on a topic, supervised methods, ensemble methods and BERT transfer learning are proposed, built, tuned and evaluated under small labeled observations (using category II). Finally, to enhance the performance of Amharic sentiment classification, a hybrid model (i.e. voting, averaging and blender) is developed that combines the top-performing classifiers of the earlier approaches. Experiments on the proposed hybrid models were done using the category II annotated sentiment data sets. The results show that the proposed hybrid model (blender) achieves a performance gain over the SVM model (baseline) on these data sets. The complete sentiment classification system is demonstrated in real-time and offline applications for detecting the language, topics and subjectivity of comments and for predicting their sentiment.
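
The three hybrid strategies named above (voting, averaging and blender) can be illustrated with a minimal Python sketch. It is not the thesis implementation: it assumes a binary positive/negative task, base classifiers that each output a positive-class probability, and a logistic-regression meta-learner for the blender. All function names and the toy numbers are hypothetical.

```python
import math

def hard_vote(labels):
    """Voting: return the label predicted by most base classifiers."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return max(counts, key=counts.get)

def average_probs(probs):
    """Averaging: mean of the base classifiers' positive-class probabilities."""
    return sum(probs) / len(probs)

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_blender(holdout_probs, holdout_labels, epochs=500, lr=0.5):
    """Blender: fit a logistic-regression meta-learner (an assumed choice)
    on held-out base-classifier probabilities using batch gradient descent.

    holdout_probs  -- rows with one positive-class probability per base model
    holdout_labels -- gold labels, 1 = positive, 0 = negative
    """
    n_models = len(holdout_probs[0])
    n_rows = len(holdout_labels)
    weights = [0.0] * n_models
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, y in zip(holdout_probs, holdout_labels):
            err = _sigmoid(bias + sum(w * p for w, p in zip(weights, row))) - y
            for j, p in enumerate(row):
                grad_w[j] += err * p
            grad_b += err
        weights = [w - lr * g / n_rows for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_rows
    return weights, bias

def blend(weights, bias, row):
    """Positive-class probability from the trained meta-learner."""
    return _sigmoid(bias + sum(w * p for w, p in zip(weights, row)))

# Toy held-out predictions from two hypothetical base classifiers.
holdout_probs = [[0.9, 0.4], [0.8, 0.6], [0.2, 0.5],
                 [0.1, 0.6], [0.7, 0.3], [0.3, 0.7]]
holdout_labels = [1, 1, 0, 0, 1, 0]
weights, bias = train_blender(holdout_probs, holdout_labels)
```

Voting and averaging treat every base classifier equally, while the blender learns from held-out data how much to trust each one, which is one plausible reason a blender can outperform a single baseline model.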
