Concept-based Automatic Amharic Document Categorization



As the volume of available information continues to grow, there is growing interest in better solutions for finding, filtering, and organizing these resources. Automatic text categorization can play an important role in a wide variety of more flexible, dynamic, and personalized information management tasks.

The process of automatic text categorization involves calculating similarities between documents and categories using information extracted from the documents. In recent years, ontology-based document categorization methods have been introduced to address the shortcomings of existing document classifiers. Previous work on keyword-based document categorization does not consider the semantic relationships between words. To resolve this problem, this study proposes a framework that automatically categorizes Amharic documents into predefined categories using knowledge represented in a News ontology. At the heart of the classification system is a knowledge base that enables the representation of different domain concepts.

During the classification process, every document passes through pre-processing stages. Index terms are then extracted from the document and mapped onto their corresponding concepts in the ontology. Finally, the document is classified into a predefined category based on the weighted concepts.

With the help of the News domain ontology, this study categorizes a given Amharic document into a specific predefined category. The study shows that the use of concepts for Amharic document categorization results in 92.9% accuracy, which is a promising outcome.

Keywords: Ontology, Keyword-based, Concept-based text categorization, Knowledge representation.
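The classification steps described in the abstract (pre-processing, index term extraction, mapping terms to ontology concepts, and choosing a category from weighted concepts) can be illustrated with a minimal Python sketch. This is not the study's implementation. The toy ontology, the stop-word list, the category names, and the frequency-based concept weighting are all assumptions made for illustration. A real system would also need Amharic-specific normalization and stemming, which are omitted here.

```python
import re
from collections import Counter

# Hypothetical toy "News ontology": index term -> concept, concept -> category.
# The study's actual ontology is not given in the abstract.
TERM_TO_CONCEPT = {
    "ቡድን": "Team",
    "ስፖርት": "Sport",
    "ምርጫ": "Election",
    "ፓርላማ": "Parliament",
    "ባንክ": "Bank",
    "ገበያ": "Market",
}
CONCEPT_TO_CATEGORY = {
    "Team": "Sports",
    "Sport": "Sports",
    "Election": "Politics",
    "Parliament": "Politics",
    "Bank": "Economy",
    "Market": "Economy",
}

# Assumed stop-word list (illustrative only).
STOPWORDS = {"እና", "ነው"}


def preprocess(text):
    """Tokenize on Unicode word characters and drop stop words."""
    tokens = re.findall(r"\w+", text)
    return [t for t in tokens if t not in STOPWORDS]


def concept_weights(tokens):
    """Map index terms to concepts; weight each concept by its frequency."""
    return Counter(TERM_TO_CONCEPT[t] for t in tokens if t in TERM_TO_CONCEPT)


def categorize(text):
    """Return the category with the highest total concept weight, or None."""
    scores = Counter()
    for concept, weight in concept_weights(preprocess(text)).items():
        scores[CONCEPT_TO_CATEGORY[concept]] += weight
    if not scores:
        return None  # no index term matched any concept in the ontology
    return scores.most_common(1)[0][0]
```

Calling `categorize` on a document whose terms mostly map to sports concepts selects the assumed "Sports" category, while a document with no matching terms yields `None`. Simple frequency counts stand in for the concept weighting here because the abstract does not specify the weighting scheme.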
