Amharic Document Image Retrieval Using Linguistic Features



Modern computers play an important role in processing and managing electronic information in the form of text, images, audio, and video. With the rapid development of computer technology, digital documents have become a popular option for storage, access, and transmission. To meet the needs of fast-evolving digital libraries, an increasing number of historical documents, newspapers, books, and similar materials are being digitized for easy archiving and dissemination. Optical Character Recognition (OCR) and Document Image Retrieval (DIR), both part of the information retrieval (IR) paradigm, are the two means of accessing document images that have received attention in the IR community.

Amharic has been the official language of Ethiopia since the 19th century, and as a result many religious and government documents are written in Amharic. Huge collections of machine-printed Amharic documents are found in almost every institution in the country, and accessing those documents has become increasingly difficult. Only a few research works have recently attempted to address this problem using OCR and DIR methods. The aim of this research is to develop a system model that enables users to find relevant Amharic document images in a corpus of digitized documents easily, accurately, and quickly. This work therefore presents the architecture of an Amharic DIR system that allows users to search scanned Amharic documents without the need for OCR. The proposed model was designed after a detailed analysis of the specific nature of the Amharic language. Amharic is a Semitic language and is morphologically rich: surface word formation involves prefixation, suffixation, infixation, circumfixation, and reduplication.

In this work, a model for searching Amharic document images is proposed, and word image features are systematically extracted for automatic indexing, retrieval, and ranking of document images stored in a database. A new approach is incorporated into the proposed system model: an NLP tool, namely an Amharic word generator. Given an Amharic root word, this Amharic-specific surface word synthesizer produces a number of possible surface words. The descriptions of these surface word images are then used for indexing and searching. The system itself passes through several phases: noise removal, binarization, text line and word boundary identification, word segmentation, resizing to normalize different font types, sizes, and styles, feature extraction, and finally matching the query word image against document word images. The proposed method was tested on real-world Amharic documents from sources such as magazines, textbooks, and newspapers, with various font styles, types, and sizes. Precision and recall were measured for sample queries on sample document images, and promising results were achieved.
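The image-processing phases listed in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify the thresholds, the segmentation method, or the word image features that were actually used, so everything below is an assumption made for illustration: a fixed global threshold for binarization, projection profiles for line and word boundaries, nearest-neighbour resizing, row and column ink densities as the feature vector, and Euclidean distance for matching. All function names and parameters are hypothetical. Noise removal and the Amharic word generator are not modelled.

```python
import numpy as np

def binarize(gray, threshold=128):
    """Return a boolean array where True marks ink (dark) pixels.
    The fixed threshold is an assumption, not the thesis's method."""
    return np.asarray(gray) < threshold

def runs(mask, min_gap=1):
    """Return (start, end) spans of True values (end exclusive),
    bridging gaps that are shorter than min_gap."""
    spans, start, gap = [], None, 0
    for i, on in enumerate(mask):
        if on:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                spans.append((start, i - gap + 1))
                start, gap = None, 0
    if start is not None:
        spans.append((start, len(mask) - gap))
    return spans

def segment_words(ink, word_gap=3):
    """Split a binarized page into word images using projection profiles:
    rows containing ink form text lines, and column gaps of at least
    word_gap pixels separate words (word_gap is an assumed value)."""
    words = []
    for top, bottom in runs(ink.any(axis=1)):
        line = ink[top:bottom]
        for left, right in runs(line.any(axis=0), min_gap=word_gap):
            words.append(line[:, left:right])
    return words

def resize(word, height=16, width=32):
    """Nearest-neighbour resize to a fixed size, so that words printed in
    different font sizes become comparable."""
    rows = np.arange(height) * word.shape[0] // height
    cols = np.arange(width) * word.shape[1] // width
    return word[np.ix_(rows, cols)]

def features(word):
    """Assumed descriptor: column and row ink densities of the resized word."""
    norm = resize(word).astype(float)
    return np.concatenate([norm.mean(axis=0), norm.mean(axis=1)])

def rank(query, candidates):
    """Return candidate indices ordered by Euclidean distance to the query."""
    q = features(query)
    dists = [float(np.linalg.norm(q - features(c))) for c in candidates]
    return sorted(range(len(candidates)), key=dists.__getitem__)

def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved list against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

In a system like the one described, the query would presumably be a rendered image of each surface word produced by the word generator, and `rank` would be run once per generated form. That wiring is not shown here because the abstract gives no detail on it.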
