Triple Point Geometric Hashing Based Audio Fingerprinting


Audio fingerprinting is a technique for exact identification of an audio recording by extracting perceptually relevant audio features and transforming them into a condensed, reproducible format. Several approaches to building audio fingerprinting systems have been proposed. Based on their baseline assumptions, these approaches can be grouped into three categories: the Philips, image processing, and Shazam approaches. These systems, however, are usually not effective when the audio is distorted. Distortion can come from modifications such as additive noise, speed change, pitch shifting, and time stretching. Of these modifications, this thesis focuses on linear speed change in Shazam-based audio fingerprinting systems. Linear speed change is a common audio modification that occurs when the audio is played faster or slower at a constant rate. This thesis proposes a Shazam-based audio fingerprinting system that is robust to linear speed change. The proposed approach employs triple point geometric hashing to handle the effect of linear speed change on audio fingerprints.

The proposed approach is evaluated using 29,600 query audios and compared with the baseline work, Shazam, and a recent Shazam-based work, Panako. Evaluation results show that the proposed approach is robust to linear speed change in a range from -30% to 22%. This is a significant improvement over Panako, which is robust to linear speed change between -12% and 6%, and Shazam, which failed to handle a 2% linear speed change. In addition to speed change, the proposed approach is evaluated for robustness to additive noise, time stretching, and pitch shifting. The results show that the proposed approach is robust to: i) additive noise in a range from -5 dB to 20 dB, with comparable robustness exhibited by Shazam and Panako; ii) time stretching in a range from -10% to 8%, an improvement over Shazam and Panako, which are robust to time stretching between -4% and 4%; and iii) pitch shifting in a range from -4% to 4%, which is comparable to Panako, whereas Shazam failed to handle 2% pitch shifting.
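
The abstract does not spell out how the triple point hash is built, so the following is only a minimal sketch of the general idea, not the thesis's actual construction. It assumes that a linear speed change by a factor s divides every time stamp by s and multiplies every frequency by s. Under that assumption, the ratio of time gaps among three spectral peaks and the ratios of their frequencies are unchanged. The function names, the peak representation, and the quantization parameters (`time_bins`, `bins_per_octave`) are all hypothetical.

```python
import math
from typing import Tuple

Peak = Tuple[float, float]  # (time in seconds, frequency in Hz)


def apply_speed_change(peak: Peak, factor: float) -> Peak:
    """Map a spectral peak through a linear speed change.

    Playing audio `factor` times faster compresses time by `factor`
    and raises every frequency by the same factor.
    """
    t, f = peak
    return (t / factor, f * factor)


def triple_hash(p1: Peak, p2: Peak, p3: Peak,
                time_bins: int = 64,
                bins_per_octave: int = 12) -> Tuple[int, int, int]:
    """Hypothetical speed-invariant hash of three peaks with t1 < t2 < t3.

    Uses only ratios, which cancel a common scale factor:
      - (t2 - t1) / (t3 - t1): time scaling cancels
      - f2 / f1 and f3 / f1:   frequency scaling cancels
    """
    (t1, f1), (t2, f2), (t3, f3) = p1, p2, p3
    time_ratio = (t2 - t1) / (t3 - t1)  # lies in (0, 1) for ordered peaks
    # Quantize so that small measurement errors fall into the same bin.
    q_time = int(time_ratio * time_bins)
    q_f12 = round(math.log2(f2 / f1) * bins_per_octave)
    q_f13 = round(math.log2(f3 / f1) * bins_per_octave)
    return (q_time, q_f12, q_f13)


if __name__ == "__main__":
    peaks = [(1.0, 440.0), (1.5, 660.0), (2.5, 550.0)]
    original = triple_hash(*peaks)
    for factor in (0.7, 1.02, 1.22):  # slower, slightly faster, faster
        sped = [apply_speed_change(p, factor) for p in peaks]
        print(factor, triple_hash(*sped) == original)
```

By contrast, a hash built from two peaks that stores absolute frequencies and an absolute time difference would change under the same transformation, which is consistent with the abstract's report that the baseline fails at a 2% speed change.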
