Amharic-English Bilingual Search Engine

As non-English content grows exponentially with the expansion of the multilingual World Wide Web, the number of online non-English speakers who realize the importance of finding information in different languages is growing enormously. However, the major general-purpose search engines such as Google and Yahoo have been lagging behind in providing indexes and search features to handle non-English languages. Hence, documents published in non-English languages are more likely to be missed or improperly indexed by major search engines. Amharic, which belongs to the Semitic language family and is the official working language of the federal government of Ethiopia, is one of these languages with rapidly growing content on the Web. As a result, the need for a bilingual search engine that handles the specific characteristics of queries in the users’ native language (Amharic) and retrieves documents in both Amharic and English has become more apparent.

In this research work, we designed a model for an Amharic-English search engine and, based on that model, developed a bilingual Web search engine that enables Web users to find the information they need in Amharic and English. In doing so, we identified different language-dependent query preprocessing components for query translation. We also developed a bidirectional dictionary-based translation system that incorporates a transliteration component to handle proper names, which are often missing from bilingual lexicons. We used an Amharic search engine and an open-source English search engine (Nutch) as our underlying search engines for Web document crawling, indexing, searching, ranking and retrieving.

To evaluate the effectiveness of our Amharic-English bilingual search engine, precision was measured on the top 10 retrieved Web documents. The experimental results showed that the Amharic-English cross-lingual retrieval engine achieved 74.12% of the performance of its corresponding English monolingual retrieval engine, and the English-Amharic cross-lingual retrieval engine achieved 78.82% of the performance of its corresponding Amharic monolingual retrieval engine. The bilingual advantage of the system was also evaluated by comparing its results with those of general-purpose search engines. The overall evaluation results of the system were found to be promising.

Key Words: Bilingual search engines, cross-lingual information retrieval, query preprocessing, query translation, transliteration.
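As an illustration of the evaluation described in the abstract, the following is a minimal sketch of how precision over the top 10 results, and a cross-lingual score expressed as a percentage of a monolingual baseline, might be computed. It is not the authors' code: the function names `precision_at_k` and `relative_effectiveness`, the document identifiers, and the relevance judgments are all hypothetical, and the abstract does not state how per-query scores were aggregated.

```python
from typing import Sequence, Set


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Fraction of the top-k retrieved documents that are judged relevant.

    Divides by k even when fewer than k documents were retrieved,
    which is one common convention (an assumption here).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


def relative_effectiveness(cross_lingual: float, monolingual: float) -> float:
    """Cross-lingual precision as a percentage of the monolingual baseline."""
    if monolingual == 0:
        raise ValueError("monolingual baseline precision must be non-zero")
    return 100.0 * cross_lingual / monolingual


# Made-up example data: ten ranked results and a set of relevant documents.
ranked = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
relevant = {"d1", "d2", "d4", "d7", "d9"}

p10 = precision_at_k(ranked, relevant, k=10)
ratio = relative_effectiveness(cross_lingual=0.25, monolingual=0.5)
```

Read this way, a figure such as 74.12% corresponds to the ratio of the cross-lingual engine's precision to that of the monolingual engine on the same evaluation set.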
