An Integrated Approach to Automatic Complex Sentence Parsing for Amharic Text



Natural language processing (NLP) is a research area that is becoming more popular each day for both academic and commercial reasons. Higher-level NLP systems (e.g., machine translation) can be realized only when the lower-level ones (e.g., a part-of-speech tagger or a syntactic parser) have been successfully built. This dependency exists even among the lower-level NLP systems. A morphological analyzer can be an important component of a part-of-speech (POS) tagger, particularly in dealing with unknown words. A POS tagger, a system that uses various sources of information to assign possibly unique POS tags to words, can in turn supply the input to a syntactic parser. Writers in the area of NLP argue that this method is an excellent one if the POS tagger is accurate. This thesis can be taken as an attempt to integrate the ideas and outputs of previously attempted Amharic NLP prototypes towards solving a further problem in the NLP of the language, i.e., automatic Amharic complex sentence parsing.

Syntactic parsing underlies most applications in natural language processing. Parsers are already used extensively in a number of disciplines, such as computer science (for compiler construction, database interfaces, artificial intelligence, etc.) and linguistics (for text analysis, corpora analysis, machine translation, etc.). Although there have been some comprehensive studies of Amharic syntax from a linguistic perspective, attempts to investigate it from a computational point of view are very recent. This thesis presents Amharic word and phrase classes, sentence formalism, the morphological properties peculiar to complex sentence formation in the language, and attempts to extract the features that enable the implementation of an automatic Amharic complex sentence parser.

The sample data used in this study has been taken from references that are widely used in the teaching-learning process of the language. This data has also been manually analyzed, tagged, and parsed, and then used as a corpus to extract the grammar rules and to assign probabilities. Algorithms that can use the morphological, lexical, and syntactic properties of the language have been customized and modified.

Experiments have been conducted in this study using the training set and the test set. The first experiment was conducted on the part-of-speech tagger to see the state of its performance when morphological analysis is embedded in it. The result of this experiment showed that the tagger attained accuracies of 98.7% and 94% on the training set and the test set, respectively. The experiments on complex sentence parsing showed 89.6% accuracy on the training set and 81.6% accuracy on the test set prepared for this purpose.
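
The step of extracting grammar rules from a hand-parsed corpus and assigning probabilities to them can be illustrated with a minimal sketch. The abstract does not say how the probabilities were estimated, so the relative-frequency estimate used here is an assumption, and the tree encoding, the category labels, and the placeholder words (`w1`, `w2`, `w3`) are hypothetical rather than taken from the thesis data.

```python
from collections import Counter

# Toy hand-parsed trees as nested tuples: (label, child, child, ...).
# Leaves are plain strings. Labels and words are placeholders, not Amharic data.
TREEBANK = [
    ("S", ("NP", ("N", "w1")), ("VP", ("V", "w2"))),
    ("S", ("NP", ("ADJ", "w3"), ("N", "w1")), ("VP", ("V", "w2"))),
]


def extract_rules(tree, counts):
    """Count every rule LHS -> RHS that occurs in one parse tree."""
    label, *children = tree
    # The right-hand side is the sequence of child labels (or words at leaves).
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[(label, rhs)] += 1
    for child in children:
        if not isinstance(child, str):
            extract_rules(child, counts)


def estimate_pcfg(treebank):
    """Assumed estimator: P(LHS -> RHS) = count(rule) / count(LHS)."""
    rule_counts = Counter()
    for tree in treebank:
        extract_rules(tree, rule_counts)
    lhs_totals = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in rule_counts.items()}


grammar = estimate_pcfg(TREEBANK)
for (lhs, rhs), p in sorted(grammar.items()):
    print(f"{lhs} -> {' '.join(rhs)}  [{p:.2f}]")
```

In this toy corpus the two noun-phrase expansions each occur once, so each receives probability 0.5, while every other rule is the only expansion of its left-hand side and receives 1.0. A probabilistic parser could then use such weights to rank competing analyses of a sentence.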
