Spoken language is the primary means of human-to-human communication. Technologies such as telephony and radio have extended its reach, and these advances reflect a human preference for spoken communication.

Spoken language is also a preferred mode of human-machine interaction. A spoken language system needs both speech recognition and speech synthesis capabilities. This thesis, however, addresses only speech recognition (speech to text, STT), specifically for the Amharic language. Amharic has more than 200 characters, whereas the standard keyboard is designed for the English alphabet. Because of this limited number of keys, writing a single Amharic letter requires two to four keystrokes.

The practical project of this thesis is to develop functional software with speech-to-text capabilities for Amharic. The software by no means covers all Ethiopic characters: the algorithms and models developed are tested on a small subset of the Ethiopic characters, with the aim of keeping the error rate as low as possible.

There are different approaches to speech recognition. The statistical approach appears to be the industry's current favorite, as it delivers better performance and is also easier to implement, so it is used in developing the software. This approach requires acoustic models and language models to be built. An acoustic model represents knowledge about acoustics, phonetics, and so on, whereas a language model represents the system's knowledge of what constitutes a possible word, which words are likely to co-occur, and in what sequence.

This thesis is an attempt to build STT conversion for the Amharic language using the statistical approach. An inventory of speech files is created by recording, and appropriate models are built from these data.
The purpose is to test the performance of the models built and to show that statistical models are well suited to modeling speech signals.