Published January 1, 2023 | Version v1
Publication

Botnet attacks detection in IoT environment using machine learning techniques

  • 1. University of Jordan
  • 2. Saudi Electronic University
  • 3. Applied Science Private University
  • 4. Middle East University

Description

IoT devices with weak security designs are a serious threat to organizations. They are the building blocks of Botnets, platforms that launch organized attacks capable of shutting down an entire infrastructure. Researchers have been developing intrusion detection system (IDS) solutions to counter such threats, often by drawing on other disciplines such as artificial intelligence and machine learning. One issue that arises when machine learning is used is dataset purity. Because datasets are not captured in perfect environments, they may contain noisy data that negatively affects the learning process. Algorithms already exist for this problem: Repeated Edited Nearest Neighbor (RENN), Encoding Length (Explore), and Decremental Reduction Optimization Procedure 5 (DROP5) can filter noise out of datasets. They also provide instance reduction, which can shrink large Botnet datasets without sacrificing their quality. Three datasets were chosen in this study to construct an IDS: IoTID20, N-BaIoT, and MedBIoT. The filtering algorithms RENN, Explore, and DROP5 were applied to them to filter noise and reduce instances. Noise was also injected and filtered again to assess the resilience of these filters. Feature optimization was then used to reduce the number of dataset features. Finally, machine learning was applied to the processed datasets, and the resulting IDS was evaluated with the standard supervised learning metrics: Accuracy, Precision, Recall, Specificity, F-Score, and G-Mean. RENN and DROP5 filtering delivered excellent results. DROP5, in particular, reduced the dataset substantially without sacrificing accuracy. However, when noise was injected, the accuracy obtained with DROP5 declined and could not keep up. Of the three datasets, N-BaIoT delivered the best overall accuracy across the learning techniques.
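
To make the filtering and evaluation steps named in the description more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the textbook form of RENN (repeating Wilson's Edited Nearest Neighbor pass until nothing more is removed), Euclidean distance, k = 3, and binary labels; the G-Mean is assumed to be the geometric mean of recall and specificity, which may differ from the definition used in the paper. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import dist, sqrt


def enn_pass(X, y, k=3):
    """One Edited Nearest Neighbor pass: return indices of instances to keep.

    An instance is kept only if its label matches the majority label of its
    k nearest neighbors (ties go to the label seen first among the neighbors).
    """
    keep = []
    for i, xi in enumerate(X):
        others = (j for j in range(len(X)) if j != i)
        neighbors = sorted(others, key=lambda j: dist(xi, X[j]))[:k]
        majority = Counter(y[j] for j in neighbors).most_common(1)[0][0]
        if majority == y[i]:
            keep.append(i)
    return keep


def renn_filter(X, y, k=3):
    """Repeat ENN passes until a pass removes no instance."""
    X, y = list(X), list(y)
    while len(X) > k:
        keep = enn_pass(X, y, k)
        if len(keep) == len(X):
            break
        X = [X[i] for i in keep]
        y = [y[i] for i in keep]
    return X, y


def binary_metrics(y_true, y_pred, positive=1):
    """Compute the six metrics listed in the description from a confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_score": ratio(2 * precision * recall, precision + recall),
        # Assumed definition: geometric mean of recall and specificity.
        "g_mean": sqrt(recall * specificity),
    }


# Toy data (hypothetical): two clusters plus one mislabeled point at (0.5, 0.5).
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5, 5), (5, 6), (6, 5), (6, 6)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]
X_clean, y_clean = renn_filter(X, y, k=3)

metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```

On the toy data, the mislabeled point sits inside the class-0 cluster, so the first pass removes it and the second pass removes nothing. The brute-force neighbor search is quadratic in the number of instances, so a dataset the size of N-BaIoT would need an indexed nearest-neighbor search instead.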

⚠️ This is an automatic machine translation with an accuracy of 90-95%

Translated Description (Arabic)

تشكل أجهزة إنترنت الأشياء ذات التصميمات الأمنية الضعيفة تهديدًا خطيرًا للمؤسسات. إنها اللبنات الأساسية لـ Botnets، وهي المنصات التي تشن هجمات منظمة قادرة على إغلاق بنية تحتية بأكملها. يعمل الباحثون على تطوير حلول IDS التي يمكنها مواجهة مثل هذه التهديدات، غالبًا من خلال توظيف الابتكار من تخصصات أخرى مثل الذكاء الاصطناعي والتعلم الآلي. إحدى المشكلات التي يمكن مواجهتها عند استخدام التعلم الآلي هي نقاء مجموعة البيانات. نظرًا لعدم التقاطها من بيئات مثالية، فقد تحتوي مجموعات البيانات على بيانات يمكن أن تؤثر سلبًا على عملية التعلم الآلي. الخوارزميات موجودة بالفعل لمثل هذه المشاكل. يمكن لخوارزمية التعديل المتكرر لأقرب جار (REN)، وطول الترميز (الاستكشاف)، وإجراء تحسين التخفيض التنازلي 5 (DROP5) تصفية الضوضاء من مجموعات البيانات. كما أنها توفر فوائد أخرى مثل تقليل الحالات التي يمكن أن تساعد في تقليل مجموعات بيانات الروبوتات الأكبر، دون التضحية بجودتها. تم اختيار ثلاث مجموعات بيانات في هذه الدراسة لإنشاء معرفات: IoTID20 و N - BaIoT و MedBIoT. تم استخدام خوارزميات التصفية و RENN و EXPLORE و DROP5 عليها لتصفية الضوضاء وتقليل الحالات. كما تم حقن الضوضاء وتصفيتها مرة أخرى لتقييم مرونة هذه المرشحات. ثم تم استخدام تحسينات الميزات لتقليص ميزات مجموعة البيانات. أخيرًا، تم تطبيق التعلم الآلي على مجموعة البيانات المعالجة وتم تقييم المعرفات الناتجة باستخدام مقاييس التعلم القياسية الخاضعة للإشراف: الدقة والدقة والاستدعاء والخصوصية و F - Score و G - Mean. أظهرت النتائج أن تصفية RENN و DROP5 حققت نتائج ممتازة. تمكنت DROP5، على وجه الخصوص، من تقليل مجموعة البيانات بشكل كبير دون التضحية بالدقة. ومع ذلك، عندما تم حقن الضوضاء، انخفض DROP5 بدقة ولم يتمكن من مواكبة ذلك. من بين مجموعة البيانات الثلاث، توفر N - BaIoT أفضل دقة بشكل عام عبر تقنيات التعلم.

Additional details

Additional titles

Translated title (Arabic)
الكشف عن هجمات الروبوتات في بيئة إنترنت الأشياء باستخدام تقنيات التعلم الآلي

Identifiers

Other
https://openalex.org/W4386010817
DOI
10.5267/j.ijdns.2023.7.021

GreSIS Basics Section

Is Global South Knowledge
Yes
Country
Jordan
