Date of Award

8-2024

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Degree Discipline

Electrical Engineering

Abstract

Traditional signature-based network intrusion detection systems, which capture network attributes, are inadequate against zero-day attacks. Because attacks are far less frequent than normal traffic, intrusion datasets are imbalanced, which is the major problem in anomaly detection. Machine Learning (ML) and Deep Learning (DL) approaches are promising for network anomaly detection because they can efficiently analyze large volumes of network traffic data for malicious activity and detect zero-day attacks. Appropriate selection of the ML/DL algorithm, hyperparameter tuning, and techniques such as sampling methods, ensemble methods, and reduction of the number of classes can enhance anomaly detection performance on an imbalanced network intrusion dataset. The efficacy of various traditional ML models, such as Random Forest (RF), J48, Naïve Bayes, Bayesian Network, Bagging, AdaBoost, and Support Vector Machine (SVM), is examined. Different combinations of deep learning models, including convolutional neural networks and bidirectional long short-term memory (LSTM) models, together with ensemble techniques, sampling techniques, and class reduction approaches, are applied to different network-based intrusion datasets (KDD99, UNSW-NB15, CIC-IDS2017). These experiments are conducted using different tools (WEKA, Jupyter Notebook) on the Anaconda platform. The results reveal that traditional ML models are suitable for smaller datasets and low computational power. Deep learning models perform better on huge datasets with large numbers of features but require significantly more computational power. The proposed heterogeneous ensemble method, which combines a number of different models with a careful selection of hyperparameters and class size reduction techniques, is demonstrated to significantly enhance anomaly detection performance on communication network-based intrusion datasets. Implementing different sampling techniques on different combinations of training and testing datasets provided insight into applying sampling techniques to imbalanced network intrusion datasets. Sampling is effective only for a single set of working data, whereas class reduction, as a way of handling class imbalance, performs better whether the training and testing data come from a single set or from different sets. Taken together, the results and conclusions provide a comprehensive study of artificial intelligence techniques to enhance network anomaly detection in communication networks.

Index Terms— ADASYN, Bi-LSTM, CIC-IDS2017, class reductions, CNN-BLSTM, deep learning, heterogeneous ensemble learning, imbalance dataset, KDD99, LSTM, machine learning, network intrusion detection system, NSL-KDD, Random Over Sampling (ROS), Random Under Sampling (RUS), SMOTE, SMOTEENN, UNSW-NB15.

Committee Chair/Advisor

Annamalai Annamalai

Committee Co-Chair

Mohamed F. Chouikha

Committee Member

Xiangfang Li

Committee Member

Xishuang Dong

Committee Member

Ahmed A. Ahmed

Publisher

Prairie View A&M University

Rights

© 2021 Prairie View A & M University

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Date of Digitization

9/13/2024

Contributing Institution

John B Coleman Library

City of Publication

Prairie View

MIME Type

Application/PDF
