This is a central metadata index of all of the data available in IMPACT from our federation of Providers.
If you were hoping to find specific data but didn't, please contact us at Contact@ImpactCyberTrust.org and we will see if we can make it available to you.
Note: You must log in to request data.
Scripts for IP Hitlist generation ... We have developed a set of map/reduce processing scripts that run in Hadoop to consume our Internet address censuses an...
A dataset of DNS traffic data collected during 10 separate days. ... Campus DNS network traffic consisting of more than 4000 active users (in peak load hours...
The data sets contain traffic in and out of the web server of the Student Union for Electrical Engineering (Fachbereichsvertretung Elektrotechnik) at Ulm Uni...
ADFA IDS is an intrusion detection system dataset made publicly available in 2013, intended as representative of modern attack structure and methodology to r...
This collection contains labeled network traffic data in ARFF format. The original purpose was to train ransomware detection in the Aktaion IDS. ... Data was...
A dataset containing both normal and malware-infected Android applications. ... This dataset contains 18,850 normal Android application packages and 10,000 m...
We collected more than 10,854 samples (4,354 malware and 6,500 benign) from several sources. We have collected over six thousand benign apps from Google Play ...
In this project, we focus on the Android platform and aim to systematize or characterize existing Android malware. ... This project has managed to collect mo...
The Android PRAGuard Dataset is a collection of obfuscated malware from Android devices. ... The dataset contains 10479 samples, obtained by obfuscating the ...
This project developed a systematic approach to generate diverse and comprehensive benchmark datasets for intrusion detection resulting in a dataset containi...
This is a corpus of auto-labeled cyber security domain text which was used for automatically extracting security-related entities using machine learning. Thi...
Manage babarchives, checksummed directory trees that can be validated ... Babarchive is a system to manage babarchives, checksummed directory trees that can be...
16,800 clean and 11,960 malicious files for signature testing and research. ... Contagio is a collection of the latest malware samples, threats, observations...
source code for content reuse detection paper ... This repository contains the code and pointers to datasets used in the paper "Precise Detection of Content ...
A dictionary containing every wordlist, dictionary, and password database leak publicly accessible on the internet ... The format of the list is a standard t...
The dataset contains transactions made by credit cards in September 2013 by European cardholders. This dataset presents transactions that occurred in two da...
C/C++ Library/tool for IP address anonymization ... CryptopANT is a C library for IP address anonymization using crypto-PAn algorithm, originally def...
This dataset is composed of a selection of Windows API/System-Call trace files, intended for testing classifiers that deal with sequences. ... Malware call...
Cyber Threat Intelligence Repository expressed in STIX 2.0 ... The Cyber Threat Intelligence Repository of ATT&CK and CAPEC catalogs expressed in STIX 2.0 JS...
Tool for scrubbing packet traces ... Dag Scrubber is our tool for scrubbing packets of user data and optionally doing IP address anonymization. It supports b...
DreamMarket Dark Net Market is an online platform for exchanging illegal goods by cybercriminals. This dataset has information about products and sellers. .....
A free, community-sourced, machine-readable knowledge base of digital forensic artifacts that the world can use both as an information source and within othe...
extract DNS traffic from pcap to text with optional anonymization ... Dnsanon reads pcap files, extracts DNS traffic, and writes it to several plain-text t...
Dnsanon_rssac is an implementation of RSSAC-002v2 processing for DNS statistics ... Dnsanon_rssac is an implementation of RSSAC-002v2 processing for DNS stat...
Ether is a malware analysis framework which leverages hardware virtualization extensions (specifically Intel VT) to remain transparent to malicious software....
Extending and consolidating hosts files from several well-curated sources like adaway.org, mvps.org, malwaredomainlist.com, someonewhocares.org, and potentia...
eMews is a collection of PCAP data captured from an in-lab emulated network, using the CORE network emulator and the eMews framework developed to generate pa...
Hadoop reader/format for census/survey data ... A plugin for Hadoop that parses icmptrain output from our ipv4 censuses and surveys.
30 days of EMS logs in a large anonymized log file from an Energy Management System (EMS). ... The data in the file Event_Export_082217.csv includes 30 days ...
Multiple datasets containing cyber attacks against 2 laboratory scale industrial control systems; a gas pipeline and water storage tank. ... The data sets in...
This dataset is a collection of labeled RTU telemetry streams from a gas pipeline system in Mississippi State University's Critical Infrastructure Protection...
This dataset is split into three smaller datasets, which include measurements related to electric transmission system normal, disturbance, control, cyber atta...
This repository includes a series of PCAP captures generated for cybersecurity research purposes. Each capture set is provided as a release, namely: modbus T...
The Insider Threat Test Dataset is a collection of synthetic insider threat test datasets that provide both background and malicious actor synthetic data. .....
UDP scan and measurement of public UDP services that could be used in relation to Amplified DDoS attacks. ... The dataset consists of 20 UDP Services and 21...
The Kharon dataset is a collection of Android malware totally reversed and documented. ... This collection gives as much as possible a representation of the ...
This dataset contains measurements of the latencies between a set of DNS servers. It was used as the basis for evaluating the Vivaldi network coordinate syst...
IP Network Traffic Flows Labeled with 75 Apps ... The data presented here was collected in a network section from Universidad Del Cauca, Popayán, Colombia by...
LANDER Trace Capture software handles for packet capture, scrubbing, and triggering user-provided scripts ... LANDER Trace Capture software handles for packe...
The traces released here contain all incoming anonymous FTP connections (i.e. to port 21) to public FTP servers at the Lawrence Berkeley National Laboratory ...
LDplayer component ... Change DNS queries in a network trace file and generate binary input for dns-replay-{controller,client}.
Replay DNS queries against a DNS server with correct timing and optionally log timing or latency. ... dns-replay-client replays DNS queries against a real DN...
LDplayer component ... Distribute the DNS query stream to queriers (dns-replay-client).
A proxy that helps to emulate DNS hierarchy in DNS trace replay. ... dns-replay-proxy manipulates packet addresses to emulate DNS hierarchy in LDplayer. Spe...
LDplayer component ... A set of scripts that set up port-based routing and dns-replay-proxy for replaying queries against a recursive server in LDplayer.
LDplayer component ... A set of scripts that generate zone files in order to replay queries against a recursive server in LDplayer.
This dataset consists of system logs from a Red Hat Linux 7.1 system deployed in a honeynet. ... The data has no sanitization or anonymization; the data is pr...
A tiny utility to convert coordinates to color ... For geolocation of IP address maps we needed to convert (lon, lat) to color in HSL and RGB color schemes. ...
This dataset contains signatures generated from many Android APKs, and can be used separately from the detection engine. ... This dataset comes bundled with ...
A public malware dataset generated by Cuckoo Sandbox based on Windows OS API ... The dataset contains malware samples from eight different families: 832 spyw...
A new dataset of 66,301 malware recordings collected over a two-year period using Malrec. ... Malrec, a malware sandbox system, uses PANDA's whole-system det...
A collection of malware samples caught by several honeypots. ... All of the malware samples contained in this repository have been collected by several honey...
The aim of the project is to provide a useful, classified dataset to researchers who want to investigate malware analysis more deeply by using Machine Learning...
This dataset was generated using the IDA disassembler tool. The task is to develop the best mechanism for classifying files in the test set into their respec...
packet capture tool ... A utility for capturing packets concurrently on several network devices and saving output in a single file while making an effort to ...
Machine Learning based Intrusion Detection Systems are difficult to evaluate due to a shortage of datasets accurately representing network traffic and their ...
tool for decoding census/survey data ... A command-line tool that prints icmptrain output from our ipv4 censuses and surveys.
Project Sonar is a security research project by Rapid7 that conducts internet-wide surveys across different services and protocols to gain insights into glob...
Pwned Passwords are 555,278,657 real world passwords previously exposed in data breaches. This exposure makes them unsuitable for ongoing use as they're at m...
These malware samples are uploaded by users or by Rampart Research themselves. These datasets may be useful as training datasets to validate anti-virus en...
Ransomware Tracker offers various types of blocklists that allow you to block Ransomware botnet C&C traffic. ... The update interval for the available block...
This is a collection of malware datasets containing a mix of virus and benign samples amounting to 2TB from SecureAge. ... Researchers will find this colle...
A labeled dataset with billions of records covering a wide variety of low-privileged monitorable smartphone features collected from 50 volunteers over a few ...
The Software Assurance Reference Dataset (SARD) is a growing collection of over 170 000 programs with precisely located bugs. ... The programs are in C, C++,...
This dataset consists of alert logs from the Enterasys Dragon NIDS 4.x intrusion detection system. ... Date range of data: 2006-2007, 590 days of continuous ...
traffic stream merger ... Stream merger is a tool to merge multiple traffic streams by feeding them through a FIFO/Drop tail queue and adjusting packet timi...
A catalog of malware used in the Syrian civil war. ... Each sample lists its respective MD5 hash, filename, links to any media sources or technical details w...
Tdns-server-proxy is a server-side proxy for DNS. It listens to incoming private T-DNS (with TCP and TLS) and turns it back into UDP queries to a local DNS ...
The Drebin dataset contains 5,560 applications from 179 different malware families. The samples have been collected in the period of August 2010 to October 2...
The VERIS Community Database aims to collect and disseminate data breach information for all publicly disclosed data breaches ... VERIS and its accompanying ...
Software to handle indexing and selection of multiple network data types based on a given time range. ... Software to handle indexing and selection of multip...
URLhaus offers an API to both receive (download) malware URLs from and submit them to the URLhaus database. ... The URLhaus database dump is a simple CSV feed that ...
A packet capturer and forwarder for active measurement of anycast catchments. ... Verfploeter is a set of tools: Pinger, Packetcapr, and Pingextract. Pinger...
A pinger for active measurement of anycast catchments. ... Verfploeter is a set of tools: Pinger, Packetcapr, and Pingextract. Pinger sends pings to a hitli...
VirusShare is a collection of malware used for malware analysis and machine learning. ... The VirusShare dataset is a repository of malware samples to provid...
A github repository that contains a collection of web attack payloads from various sources. ... Requests extracted from either packet captures or log files o...
A repository of over 35,000 phrases, patterns, and keywords commonly used by spammers and comment bots in usernames, email addresses, link text, and URIs. .....
This dataset includes sanitized password frequency lists collected from Yahoo in May 2011. ... Each of the 51 .txt files represents one subset of all users' ...