Automatic Sentence Parsing for Amharic Text: An Experiment Using Probabilistic Context Free Grammar



Natural language processing, as a field of scientific inquiry, plays an important role in increasing computers' capability to understand natural languages, the languages in which most human knowledge is recorded. Work in natural language processing tries to design and implement computer programs that can understand natural language and act appropriately on the information contained in a text or utterance. Enabling computers to understand natural language involves extracting meaning from natural language sentences, and one of the steps in this process is sentence parsing.

Sentence parsing, also called syntactic parsing, is the process of identifying how words can be put together to form correct sentences, and of determining what structural role each word plays in the sentence and which phrases are subparts of which other phrases. A sentence parser outputs a parse structure that can be used as a component in many applications, including semantic analysis, machine translation, and information storage and retrieval of textual data.

Today, parsers of different kinds (e.g. probabilistic, rule-based) have been developed for languages that are widely used nationally and/or internationally (e.g. English, German, Chinese). The same is not true for Amharic, the working language of the Federal Government of Ethiopia and one of the major languages of Ethiopia (Bender et al., 1976): to the best of my knowledge, there are no sentence parsers of any sort that process this language. This study therefore attempted to develop a simple automatic parser for Amharic sentences, to address the need for systems that automatically process the Amharic language.

In the study, the Inside-Outside algorithm with a bottom-up chart parsing strategy was used. Probabilistic context-free grammar was used as the grammatical formalism to represent the phrase structure rules of the language. A small sample corpus was selected from sentences in the language and served as both training and test set. The sample was hand parsed, automatically tagged, and then used as the corpus from which the grammar rules were extracted and their probabilities assigned.

The thesis, in short, describes the process of automatic sentence parsing using a combination of probabilistic and rule-based reasoning. It covers the whole process, from manually parsing simple sentences to developing a prototype and conducting an experiment with it. The results obtained using the small manually parsed corpus seem to encourage further research, especially with the aim of developing a full-fledged Amharic sentence parser.
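To make the step of extracting grammar rules from a hand-parsed corpus and assigning probabilities more concrete, the following is a minimal sketch in Python. It is not the thesis's implementation: it uses plain relative-frequency counting (rule count divided by the count of its left-hand side) rather than the Inside-Outside algorithm, and the tree encoding, the function names `rules` and `estimate_pcfg`, and the English toy treebank are all assumptions made for illustration.

```python
from collections import Counter

# Assumed encoding: a parse tree is a nested tuple (label, child, child, ...),
# and a leaf (a word) is a plain string. These toy English trees are
# placeholders, not data from the thesis corpus.
TREEBANK = [
    ("S", ("NP", ("N", "dogs")), ("VP", ("V", "bark"))),
    ("S", ("NP", ("N", "cats")), ("VP", ("V", "sleep"))),
    ("S", ("NP", ("ADJ", "big"), ("N", "dogs")), ("VP", ("V", "sleep"))),
]


def rules(tree):
    """Yield one (lhs, rhs) pair for every internal node of a tree."""
    label, *children = tree
    # The right-hand side lists child labels, or the word itself for a leaf.
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield label, rhs
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)


def estimate_pcfg(treebank):
    """Relative-frequency estimate: P(lhs -> rhs) = count(rule) / count(lhs)."""
    rule_counts = Counter(r for tree in treebank for r in rules(tree))
    lhs_counts = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_counts[lhs] += n
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}


pcfg = estimate_pcfg(TREEBANK)

# Print the grammar, grouped by left-hand side.
for (lhs, rhs), p in sorted(pcfg.items()):
    print(f"{lhs} -> {' '.join(rhs)}  [{p:.3f}]")
```

In this toy treebank, two of the three NP nodes expand to a bare noun, so the rule NP -> N receives probability 2/3 and NP -> ADJ N receives 1/3. The probabilities of all rules sharing a left-hand side sum to one, which is the defining property of a probabilistic context-free grammar.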
