Named Entity Recognition For Amharic Language



Named Entity Recognition (NER) is the process of identifying all named entities in a document and categorizing them into predefined classes such as person, organization, location, time, and numeral expressions. This identification and classification of proper names in text has recently come to be considered of major importance in natural language processing (NLP), as it plays a significant role in various types of NLP applications, especially information extraction, information retrieval, machine translation, and question answering. This paper reports on the development of a NER system for Amharic using Conditional Random Fields (CRFs). Though this state-of-the-art machine learning method has been widely applied to NER in several well-studied languages, this is the first attempt to apply it to the Amharic language.

The system makes use of different features such as word and tag context features, part-of-speech (POS) tags of tokens, prefixes, and suffixes. Since feature selection plays a crucial role in the CRF framework, experiments were carried out to find the most suitable features for the Amharic NE tagging task. Four scenarios were considered, based on different combinations of features. In the first scenario, all the features were used. In the second scenario, all the features except the POS tags of tokens were used. In the third and fourth scenarios, all the features except prefix and suffix, respectively, were used.

The experimental results show that different combinations of features yield different results. The scenario one experiment yielded Precision, Recall, and F-measure of 72%, 75%, and 73.47%, respectively. Taking this as a baseline, we ran the remaining experiments. Scenarios two, three, and four yielded F-measures of 69.70%, 74.61%, and 70.65%, respectively. Removing POS tags or suffixes therefore lowered the F-measure relative to the baseline, whereas removing prefixes raised it slightly.

From these results, it is possible to conclude that word context features, POS tags of tokens, and suffixes are important features in NE recognition and classification for Amharic text.

Keywords: Named Entity Recognition, Conditional Random Fields, Named Entities, Amharic Named Entity Recognition.
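
The four feature scenarios described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`token_features`, `sentence_features`, `f_measure`), the affix length of two characters, the one-token context window, and the sample sentence with its POS tags are all assumptions made for illustration. The tag context (the previous NE label) is not built here, since a linear-chain CRF typically models it through its label transitions. The resulting feature dictionaries are in the general shape that CRF toolkits accept as per-token observations.

```python
from typing import Dict, List, Tuple, Union

Token = Tuple[str, str]  # (word, POS tag); the POS tag set is hypothetical
Features = Dict[str, Union[str, bool]]

# Scenario number -> which feature groups are switched off (per the abstract).
SCENARIOS = {
    1: {},                      # all features
    2: {"use_pos": False},      # all except POS tags
    3: {"use_prefix": False},   # all except prefix
    4: {"use_suffix": False},   # all except suffix
}


def token_features(sent: List[Token], i: int, use_pos: bool = True,
                   use_prefix: bool = True, use_suffix: bool = True,
                   affix_len: int = 2) -> Features:
    """Build the feature dict for token i. affix_len=2 is an assumption."""
    word, pos = sent[i]
    feats: Features = {"word": word}
    if use_pos:
        feats["pos"] = pos
    if use_prefix:
        feats["prefix"] = word[:affix_len]
    if use_suffix:
        feats["suffix"] = word[-affix_len:]

    # Word (and POS) context: one token to the left and right (assumed window).
    if i > 0:
        prev_word, prev_pos = sent[i - 1]
        feats["-1:word"] = prev_word
        if use_pos:
            feats["-1:pos"] = prev_pos
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(sent) - 1:
        next_word, next_pos = sent[i + 1]
        feats["+1:word"] = next_word
        if use_pos:
            feats["+1:pos"] = next_pos
    else:
        feats["EOS"] = True  # end of sentence
    return feats


def sentence_features(sent: List[Token], scenario: int) -> List[Features]:
    """Feature dicts for every token under one of the four scenarios."""
    return [token_features(sent, i, **SCENARIOS[scenario])
            for i in range(len(sent))]


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Illustrative sentence ("Abebe went to Addis Ababa"); POS tags are made up.
SAMPLE: List[Token] = [("አበበ", "N"), ("ወደ", "PREP"), ("አዲስ", "N"),
                       ("አበባ", "N"), ("ሄደ", "V")]

if __name__ == "__main__":
    for scenario in SCENARIOS:
        print(scenario, sentence_features(SAMPLE, scenario)[0])
    # Scenario one figures from the abstract: P = 72%, R = 75%.
    print(round(f_measure(0.72, 0.75) * 100, 2))  # prints 73.47
```

The last line reproduces the reported scenario one F-measure from its precision and recall, which confirms that the F-measure in the abstract is the balanced harmonic mean of the two.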
