Machine Learning Based Traffic Classification Algorithm For Fixed Network Traffic



Traffic classification is the task of associating network flows with the applications that generate them. For internet service providers (ISPs), it is a fundamental building block for traffic management, for traffic pricing and treatment (e.g., policing and shaping), and for security activities. Several methods are used for network traffic classification. Port-based and payload-based methods are widely used to identify applications in traffic, but in recent years they have not worked well in practice: the number of applications that use random or non-standard ports has increased dramatically, and payload encryption is required for security purposes. Machine learning techniques have therefore been proposed in the recent literature as a solution.

In this study, a machine learning method is used to identify applications from fixed network traffic data collected from ethio telecom access layer devices. To build the model, two supervised machine learning algorithms, Random Forest and C4.5, are selected from the state of the art. Flow-level network features are extracted from the collected data to train the models. This study differs from existing network traffic classification studies in that it uses two additional features to train the models: the flow index and the flow state. The performance of the models is analyzed before and after the addition of the new features. Finally, application dominance in terms of flow, packet, and byte composition in fixed network traffic is studied.

The experimental results show that, using the flow features available in the state of the art, Random Forest achieves an overall accuracy of 90.8% and C4.5 achieves 88%. After the flow state and flow index features are added to build the models, the overall classification accuracy is 95.2% for Random Forest and 94.8% for C4.5. The reported increase in overall classification accuracy is 5.1% for Random Forest and 6% for C4.5.

Finally, this study shows application dominance in terms of flow, packet, and byte composition. On the fixed data network, web applications account for approximately 35.5 percent of bytes, 21.5 percent of packets, and 56.6 percent of flows, making them the dominant application.
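To make the flow, packet, and byte composition concrete, here is a minimal Python sketch of how such shares could be computed. It is not the study's implementation. The packet tuple layout, the unidirectional 5-tuple flow key, the function names `aggregate_flows` and `composition`, and the sample data and application labels are all assumptions made for illustration. In the study, the labels would come from the trained classifier.

```python
from collections import defaultdict

def aggregate_flows(packets):
    """Group packets into unidirectional flows keyed by the 5-tuple
    and compute simple flow-level features (assumed feature set)."""
    flows = {}
    for src, dst, sport, dport, proto, ts, size in packets:
        key = (src, dst, sport, dport, proto)
        f = flows.get(key)
        if f is None:
            flows[key] = {"packets": 1, "bytes": size, "first": ts, "last": ts}
        else:
            f["packets"] += 1
            f["bytes"] += size
            f["first"] = min(f["first"], ts)
            f["last"] = max(f["last"], ts)
    for f in flows.values():
        f["duration"] = f["last"] - f["first"]
    return flows

def composition(flows, labels):
    """Return each application's percentage share of flows, packets, and bytes."""
    totals = defaultdict(lambda: {"flows": 0, "packets": 0, "bytes": 0})
    for key, f in flows.items():
        app = labels[key]
        totals[app]["flows"] += 1
        totals[app]["packets"] += f["packets"]
        totals[app]["bytes"] += f["bytes"]
    metrics = ("flows", "packets", "bytes")
    grand = {m: sum(t[m] for t in totals.values()) for m in metrics}
    return {app: {m: 100.0 * t[m] / grand[m] for m in metrics}
            for app, t in totals.items()}

# Hypothetical packets: (src_ip, dst_ip, src_port, dst_port, protocol, time_s, size_bytes)
packets = [
    ("10.0.0.1", "93.184.216.34", 50000, 443, "TCP", 0.0, 600),
    ("10.0.0.1", "93.184.216.34", 50000, 443, "TCP", 0.5, 1400),
    ("10.0.0.2", "93.184.216.34", 50001, 80, "TCP", 1.0, 500),
    ("10.0.0.3", "8.8.8.8", 50002, 53, "UDP", 1.2, 100),
    ("10.0.0.3", "8.8.8.8", 50002, 53, "UDP", 1.3, 400),
]

# Hypothetical per-flow application labels (stand-in for classifier output).
labels = {
    ("10.0.0.1", "93.184.216.34", 50000, 443, "TCP"): "web",
    ("10.0.0.2", "93.184.216.34", 50001, 80, "TCP"): "web",
    ("10.0.0.3", "8.8.8.8", 50002, 53, "UDP"): "dns",
}

flows = aggregate_flows(packets)
shares = composition(flows, labels)
for app, s in sorted(shares.items()):
    print(app, {m: round(v, 1) for m, v in s.items()})
```

An application can dominate one measure and not another, which is why the three shares are reported separately: in the abstract's figures, web traffic accounts for more than half of all flows but a much smaller fraction of packets.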
