Emerging technologies such as smart phones and wireless Internet have transformed the Web from a static data publishing platform into a collaborative information sharing environment. Yet attaining the next stage in Web engineering, i.e., the Intelligent Web, which allows meaningful human-machine and machine-machine collaboration, requires another breakthrough: allowing the sharing and organizing of Collective Knowledge (CK), where CK denotes the combination of all known data, information, and meta-data concerning a given concept or event. In this context, various methods have been put forward to perform automatic event detection using the shared multimedia collections’ meta-data. However, most of these methods do not capture the semantic meaning embedded in Web-based multimedia data, which are usually highly heterogeneous and unstructured. Taking this into account, the main goal of this research is the identification of meaningful events from Web-based multimedia data resources, considering the heterogeneous nature of their meta-data. To achieve this goal, an Event Extraction framework for Web-based multimedia data is introduced. The framework consists of a Multimedia Representation Space Model (MRSM), designed for multimedia data and multimedia-based event representation, in order to allow event detection based on multimedia CK. The MRSM, its dimensions, their coordinates, and the associated distance (similarity) metrics and properties are formally defined. A dedicated event detection algorithm, built upon the MRSM and geared toward effective CK management, is then provided. The proposed theoretical model provides a means of event detection from heterogeneous multimedia data without any prior knowledge about event-related clues. The MediaEval 2013 benchmark dataset is used, and both real and synthetic datasets are extracted from it, in order to validate and test the quality and practicality of the proposed approach. 
Results highlight our method’s effectiveness, achieving an average NMI of 0.9887 and an F-score of 0.9465. The experimental results show that the temporal, spatial, and semantic dimensions are all important in detecting meaningful events, such that the best results were observed with close weight values assigned to every dimension. Moreover, comparative tests highlight the performance of the MRSM-based event detection approach compared with alternative solutions.
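
To make the role of the dimension weights concrete, the following is a minimal sketch of how a weighted distance over temporal, spatial, and semantic dimensions could be combined. It is not the MRSM's formal definition, which this abstract does not give: the `MediaItem` type, the per-dimension metrics (a squashed time gap, haversine distance, and Jaccard distance over tags), the scale parameters, and the equal default weights are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaItem:
    """Hypothetical multimedia item described only by its meta-data."""
    timestamp: float   # capture time, in seconds since the epoch
    lat: float         # latitude, in degrees
    lon: float         # longitude, in degrees
    tags: frozenset    # user-supplied textual tags


def temporal_distance(a, b, scale=86400.0):
    """Time gap mapped into [0, 1); `scale` (one day) is an assumed constant."""
    gap = abs(a.timestamp - b.timestamp)
    return gap / (gap + scale)


def spatial_distance(a, b, scale_km=10.0):
    """Haversine distance mapped into [0, 1); `scale_km` is an assumed constant."""
    earth_radius_km = 6371.0
    phi_a, phi_b = math.radians(a.lat), math.radians(b.lat)
    d_phi = phi_b - phi_a
    d_lambda = math.radians(b.lon - a.lon)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi_a) * math.cos(phi_b) * math.sin(d_lambda / 2) ** 2)
    km = 2 * earth_radius_km * math.asin(min(1.0, math.sqrt(h)))
    return km / (km + scale_km)


def semantic_distance(a, b):
    """Jaccard distance over tag sets, a simple stand-in for a semantic metric."""
    union = a.tags | b.tags
    if not union:
        return 0.0
    return 1.0 - len(a.tags & b.tags) / len(union)


def combined_distance(a, b, weights=(1.0, 1.0, 1.0)):
    """Weighted average of the three per-dimension distances, in [0, 1]."""
    w_time, w_space, w_sem = weights
    total = w_time + w_space + w_sem
    return (w_time * temporal_distance(a, b)
            + w_space * spatial_distance(a, b)
            + w_sem * semantic_distance(a, b)) / total


# Two photos from the same place and hour, and one unrelated photo.
concert_a = MediaItem(1_370_000_000, 48.8584, 2.2945, frozenset({"concert", "paris"}))
concert_b = MediaItem(1_370_001_800, 48.8590, 2.2950, frozenset({"concert", "live"}))
unrelated = MediaItem(1_380_000_000, 40.7128, -74.0060, frozenset({"marathon"}))

near = combined_distance(concert_a, concert_b)
far = combined_distance(concert_a, unrelated)
```

With equal weights, `near` comes out smaller than `far`, so a clustering step run on this distance would tend to group the two concert photos into one event. Setting one weight much higher than the others lets a single dimension dominate, which is the situation the reported results suggest avoiding.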