Towards Data-Driven Network Intrusion Detection Systems: Features Dimensionality Reduction and Machine Learning
Keywords: Cyberattacks, machine learning, deep learning, ML models
Cyberattacks have increased in tandem with the exponential expansion
of computer networks and network applications throughout the world. In this study,
we evaluate and compare four feature selection methods, seven classical machine
learning algorithms, and a deep learning algorithm on one million random instances
of the large CSE-CIC-IDS2018 network intrusion dataset. The dataset was
preprocessed and cleaned and all learning algorithms were trained on the original
values of the features. The feature selection methods highlighted the importance of
features related to the forward direction (FWD) and two flow measures (FLOW) in
predicting the binary traffic type: benign or attack. Furthermore, the results revealed
that there is no significant difference in model performance whether models are
trained on all features or on the top 30 features selected by any of the four feature
selection techniques used in this experiment. Moreover, we may be able to train ML models on
only four features and have them perform similarly to models trained on all features,
which may result in preferable models in terms of complexity, explainability, and
scalability for deployment. Furthermore, by choosing the four features selected
unanimously by the feature selection methods instead of all traffic features, training
time may be reduced by 10% to 50% of the training time on all features.
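As an illustration of how a small set of unanimity features could be derived, the sketch below keeps only the features that appear in the top-k list of every selection method. It assumes that "unanimity" means the intersection of the methods' top-ranked sets; the abstract does not spell out the exact procedure. The function name `unanimous_features`, the method names, and the feature names are all hypothetical placeholders, not the study's actual methods or CSE-CIC-IDS2018 columns.

```python
from typing import Dict, List


def unanimous_features(rankings: Dict[str, List[str]], top_k: int) -> List[str]:
    """Return the features that appear in the top_k of every ranking.

    Assumption: "unanimity" is modelled as a plain set intersection.
    The result follows the order of the first method's ranking so that
    the output is deterministic.
    """
    if not rankings:
        return []
    # One set of top_k feature names per selection method.
    top_sets = [set(ranked[:top_k]) for ranked in rankings.values()]
    common = set.intersection(*top_sets)
    first = next(iter(rankings.values()))
    return [name for name in first[:top_k] if name in common]


# Hypothetical rankings (best first) from four hypothetical selection methods.
rankings = {
    "method_a": ["fwd_1", "flow_1", "fwd_2", "flow_2", "bwd_1"],
    "method_b": ["flow_1", "fwd_1", "flow_2", "fwd_2", "pkt_1"],
    "method_c": ["fwd_2", "fwd_1", "flow_1", "flow_2", "idle_1"],
    "method_d": ["flow_2", "flow_1", "fwd_1", "fwd_2", "bwd_1"],
}

selected = unanimous_features(rankings, top_k=4)
print(selected)
```

With real data, each ranking would come from one feature selection method applied to the preprocessed traffic features, and the resulting short list would define the reduced training set whose accuracy and training time are compared against the full feature set.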
Copyright (c) 2022 Ibrahim Obeidat, Majdi Maabreh, Esraa Abu Elsoud, Asma Alnajjar, Rahaf Alzyoud, Omar Darwish
This work is licensed under a Creative Commons Attribution 4.0 International License.