This is a non-IMPACT record, meaning that access to the data is not controlled by IMPACT. For access, see the directions below.

Disclaimer:
This Resource is offered and provided outside of the IMPACT mediation framework. IMPACT and the IMPACT Coordination Council/Blackfire Technology, Inc. expressly disclaim all conditions, representations and warranties including but not limited to Resource availability, quality, accuracy, non-infringement, and non-interference. All Resource information and access is controlled by entities and under terms that are external to the IMPACT legal framework.

Summary

DS-1274
Microsoft Malware Classification Challenge (BIG 2015)
External Dataset
External Data Source
Microsoft
Unknown
Unknown
56 (lowest rank is 56)

Category & Restrictions

Other
cyber attack, malware, cyber defense competitions, cyber defense
Unrestricted
true

Description


This dataset was generated using the IDA disassembler tool. The task is to develop the best mechanism for classifying files in the test set into their respective family affiliations.

For each file, the raw data contains the hexadecimal representation of the file's binary content, without the PE header (to ensure sterility). Also provided is a metadata manifest, a log of metadata extracted from the binary, such as function calls and strings.
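As a rough illustration of working with the hexadecimal representation described above, the sketch below turns hex-dump lines into a 256-bin byte histogram, a common starting feature for classification. The line layout is an assumption not stated in this record: each line is taken to start with an address token, followed by two-digit hex byte tokens, with any non-hex placeholder such as `??` skipped. The function names `parse_hex_line` and `byte_histogram` are hypothetical.

```python
import string
from collections import Counter

HEX_DIGITS = set(string.hexdigits)


def parse_hex_line(line):
    """Return the byte values found on one hex-dump line.

    Assumption: the first token is an address and is ignored; every
    later token that is exactly two hex digits is one byte. Any other
    token (for example a '??' placeholder) is skipped.
    """
    values = []
    for token in line.split()[1:]:
        if len(token) == 2 and all(ch in HEX_DIGITS for ch in token):
            values.append(int(token, 16))
    return values


def byte_histogram(lines):
    """Count how often each byte value 0..255 occurs across all lines."""
    counts = Counter()
    for line in lines:
        counts.update(parse_hex_line(line))
    return [counts.get(value, 0) for value in range(256)]


# Made-up sample lines in the assumed layout, not taken from the dataset.
sample = [
    "00401000 56 8D 44 24 08 50 8B F1",
    "00401008 E8 1C 1B 00 00 ?? ?? 08",
]
hist = byte_histogram(sample)
```

For a real file, the same function could be fed an open file handle instead of the sample list, since it only iterates over lines.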

The dataset consists of known malware files representing a mix of 9 different families. Each malware file has an Id, a 20-character hash value that uniquely identifies the file, and a Class, an integer representing one of the 9 family names to which the malware may belong:

1. Ramnit
2. Lollipop
3. Kelihos_ver3
4. Vundo
5. Simda
6. Tracur
7. Kelihos_ver1
8. Obfuscator.ACY
9. Gatak
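A minimal sketch of turning a Class integer back into a family name is shown here. It assumes the Class values run from 1 to 9 in the order the families are listed above; this record does not state the numbering explicitly, so verify it against the label file before relying on it. The names `FAMILIES`, `FAMILY_BY_CLASS`, and `family_name` are hypothetical.

```python
# Family names in the order they appear in the dataset description.
FAMILIES = [
    "Ramnit",
    "Lollipop",
    "Kelihos_ver3",
    "Vundo",
    "Simda",
    "Tracur",
    "Kelihos_ver1",
    "Obfuscator.ACY",
    "Gatak",
]

# Assumption: Class labels are 1-based and follow the listed order.
FAMILY_BY_CLASS = {
    class_id: name for class_id, name in enumerate(FAMILIES, start=1)
}


def family_name(class_id):
    """Look up the family name for a Class integer, rejecting unknown ids."""
    try:
        return FAMILY_BY_CLASS[class_id]
    except KeyError:
        raise ValueError(f"unknown Class value: {class_id!r}") from None
```

Raising on an unknown value, rather than returning a default, makes a mislabeled or off-by-one Class column fail early.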

Additional Details

465.7GB
false
Unknown