Awesome-Malware-Benign-Datasets

A curated list of malware and benign datasets for security researchers.

Table of Contents

Datasets

| Dataset | Description | Link | Public/Private |
| --- | --- | --- | --- |
| MALNET-IMAGE | A large-scale dataset of 1,262,024 malware images across 696 families for research in malware classification. | Link | Public |
| Virus-MNIST | A dataset of 51,880 grayscale images of malware, designed for malware classification tasks, with 10 classes. | Link | Public |
| Malimg | A dataset of 9,458 images of PE malware, categorized into 25 different families. | Link | Public |
| Stamina | A dataset containing 782,224 binary sequences converted to images, designed for malware classification. | Link | Public |
| McAfee | A dataset of 367,183 malware samples analyzed by McAfee, categorized into two main types. | Link | Private |
| Kancherla | A smaller dataset with 27,000 samples focused on binary classification of malware and benign files. | Link | Private |
| Choi | A dataset of 12,000 samples, split evenly between malware and benign, for binary classification tasks. | Link | Private |
| Fu | A dataset of 7,087 samples from 15 different malware families, designed for multi-class classification. | Link | Private |
| Han | A dataset of 1,000 samples across 50 malware families, intended for fine-grained malware classification. | Link | Private |
| IoT DDoS | A small dataset containing 365 samples for IoT Distributed Denial of Service (DDoS) attack detection, with 3 distinct attack types. | Link | Public |
| DikeDataset | Binaries of PE malware and benign samples. | Link | Public |
| Benign-NET | Binaries of PE benign samples. | Link | Public |
| Ember | Features of PE malware. | Link | Public |
| VirusShare | Binaries of PE malware samples (requires permission for access). | Link | Private |
| Microsoft Malware Prediction | PE malware features in CSV format. | Link | Public |
| Microsoft Malware Classification Challenge (BIG 2015) | Binaries of PE malware. | Link | Public |
| malware_benign_file | Binaries of PE malware and benign samples. | Link | Public |
| dumpware 10e | 4,294 RGB images from 3,686 malware samples and 608 benign samples, with images rendered in various width schemes. | Link | Public |
| CICIDS 2017 Dataset | Network traffic data including benign and malicious samples, with detailed labels for various types of attacks. | Link | Public |
| Kaspersky Malware Dataset | Malware samples collected and analyzed by Kaspersky, useful for classification and behavioral analysis. | Link | Private |
| CICIDS 2018 Dataset | Network traffic data including benign and malicious samples, with detailed attack labels and features. | Link | Public |
| AILab Malware Dataset | Malware samples for various research purposes, including behavioral analysis and classification. | Link | Private |
| MalNet Dataset | A dataset of malware samples collected from various sources, useful for malware detection and analysis. | Link | Public |
| Contagio Malware Dump | A variety of malware samples used for malware research and analysis. | Link | Public |
| The Microsoft Malware Classification Challenge (BIG 2018) | Malware samples and features with labels for various malware types. | Link | Public |
| MalMem2021 Dataset | A dataset of memory dumps containing both benign and malicious processes, useful for memory forensics. | Link | Public |
| CICIDS 2019 Dataset | Network traffic data including benign and malicious samples, with comprehensive attack labels. | Link | Public |
| Malware Bazaar | A collection of malware samples shared by the community for research purposes. | Link | Public |
| BODMAS | 57,293 malware and 77,142 benign Windows PE files, including binaries (disarmed malware only), feature vectors, and metadata. | Link | Public |
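
Several entries above (for example Malimg, Stamina, and dumpware 10e) describe binaries rendered as images. The following is a minimal sketch of one common way such a rendering can be done, assuming one byte maps to one 8-bit grayscale pixel and rows have a fixed width. The function names `bytes_to_grayscale_rows` and `rows_to_pgm` and the default width of 256 are hypothetical choices for illustration; they are not taken from any listed dataset's tooling, and each dataset may use a different width scheme or color mapping.

```python
import math


def bytes_to_grayscale_rows(data: bytes, width: int = 256) -> list[list[int]]:
    """Lay out raw bytes as rows of 8-bit grayscale pixels (one byte per pixel).

    Assumption: fixed row width; the last row is zero-padded to full width.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if not data:
        raise ValueError("data must not be empty")
    height = math.ceil(len(data) / width)
    # Pad with zero bytes so every row has exactly `width` pixels.
    padded = data.ljust(height * width, b"\x00")
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def rows_to_pgm(rows: list[list[int]]) -> bytes:
    """Serialize pixel rows as a binary PGM (P5) image with max value 255."""
    height = len(rows)
    width = len(rows[0])
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    body = b"".join(bytes(row) for row in rows)
    return header + body


if __name__ == "__main__":
    # Hypothetical input: ten bytes rendered at width 4 give a 4x3 image.
    sample = bytes(range(10))
    image = rows_to_pgm(bytes_to_grayscale_rows(sample, width=4))
    with open("sample.pgm", "wb") as fh:
        fh.write(image)
```

The PGM output is used here only because it needs no third-party library; the same rows could be handed to an imaging library to write PNG files instead.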

Contribute

Contributions are welcome! Please follow the contribution guidelines for submitting new datasets or updates.

License

This repository is licensed under a Creative Commons Attribution 4.0 International License.

Topic: Malware Dataset