My Two Cents On ThreatIntel Sharing

So, now that I’ve got a real job I can actually write stuff! This is surely madness.

I work in Threat Intelligence and I’ve spent the last 4 months creating a system to convert Threat Intelligence data between two specific formats, MISP and STIX.

These two formats are supposedly meant for the same thing, sharing Threat Intelligence data, but someone decided it was a good idea to make them not at all compatible.

I can’t say I’ve actually used either system extensively, but I know their formats.

MISP, an open-source project funded in part by NATO, is meant for storing specific data about threats. It can handle most things you’d want to store: IPs, domains, emails, WHOIS info and so on. The problem is that there’s very little structure.

MISP uses an almost flat-file approach, going for something akin to

{
  "event": {
    "attributes": [
      { "type": "ip-dst", "value": "192.168.1.1" },
      { "type": "domain", "value": "google.com" }
    ]
  }
}

Which is exceptionally easy to work with. MISP can be used almost as a hub, with the ability to write custom export modules in Python. Anything that allows Python interaction is immediately good.
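To give a rough idea of how little code the flat layout needs, here is a minimal sketch that groups attribute values by type. It assumes the simplified event shape shown above, not the full MISP schema, and `RAW_EVENT` and `group_attributes` are names made up for this illustration.

```python
import json
from collections import defaultdict

# Simplified event in the shape shown above (an assumption, not the full MISP schema).
RAW_EVENT = """
{
  "event": {
    "attributes": [
      { "type": "ip-dst", "value": "192.168.1.1" },
      { "type": "domain", "value": "google.com" }
    ]
  }
}
"""


def group_attributes(raw):
    """Group attribute values by their type (hypothetical helper)."""
    event = json.loads(raw)["event"]
    grouped = defaultdict(list)
    # Everything lives in one flat list, so a single loop is enough.
    for attribute in event["attributes"]:
        grouped[attribute["type"]].append(attribute["value"])
    return dict(grouped)


if __name__ == "__main__":
    print(group_attributes(RAW_EVENT))
```

One loop, no recursion, no object model to learn. That’s really all “almost flat” buys you, but it’s a lot.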

STIX attempts to do the complete opposite by being intensely structured. I say intensely, I MEAN intensely. You need maybe 3-4 layers of objects just to store a threat actor name. There is an upside to this, however, in that you can capture every little tidbit of information in the leviathan of a STIXPackage, from individual members of a threat, to the diameter of the left nipple of the person who cleans the office.

As great as that is, threats move fast. This raises the dilemma of “is it worth capturing everything about a threat when it’ll all change by the end of the week?”. As much as I’d love to capture everything in a neat little bundle with a bow on top, it’s just not practical.

Nonetheless, I did write a converter between the two. And I can say that it’s SOOO much easier to go MISP -> STIX than to go STIX -> MISP. This is due to the highly embedded nature of STIX, where an observable might be behind 7 or 8 layers, making extracting everything an exercise in futility.
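To illustrate why the nesting hurts, here is a rough sketch of the sort of recursive walk you end up writing to dig values out. The nested dictionary is entirely made up for the example: the key names (`package`, `indicator`, `composite`, `wrapper`, `observable`) are assumptions and not the real STIX schema, and this is not how the actual converter works.

```python
# Hypothetical nested structure; the key names are invented for illustration
# and do not follow the real STIX schema.
NESTED = {
    "package": {
        "indicators": [
            {
                "indicator": {
                    "composite": {
                        "items": [
                            {"observable": "192.168.1.1"},
                            {"wrapper": {"observable": "google.com"}},
                        ]
                    }
                }
            }
        ]
    }
}


def find_observables(node, depth=0):
    """Yield (depth, value) for every 'observable' key found under node."""
    if isinstance(node, dict):
        for key, child in node.items():
            if key == "observable":
                yield depth, child
            else:
                # Not what we want yet, so keep digging one layer down.
                yield from find_observables(child, depth + 1)
    elif isinstance(node, list):
        for child in node:
            yield from find_observables(child, depth + 1)


if __name__ == "__main__":
    for depth, value in find_observables(NESTED):
        print(depth, value)
```

In this toy structure the two values turn up at depths 7 and 8. A generic walk like this only works because the example uses a single key name; with hundreds of possible object types, each one needs its own handling.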

This, in addition to the use of CybOX (a format specifically for Observables), makes the number of possible objects to decode skyrocket into the hundreds. This isn’t necessarily a bad thing, but it does make using both formats rather a pain. Maybe this is the idea, “neither will live while the other survives”. The great duel of formats can most keenly be seen when looking at who uses which.

From what I’ve seen, most European agencies use MISP, and big American banks use Soltra (which uses STIX). Following the money leads companies to use STIX when they otherwise wouldn’t bother, and this just might be the deciding factor in this war of storage. I hope it isn’t. KISS dictates that we should keep it simple, and STIX is very much NOT simple. (Nor does it work. Fun fact: vX.Y.a is not compatible with vX.Y.b under STIX’s versioning system.)

In short, I hope STIX dies soon. Either that, or it starts using JSON exclusively. That might work.

I doubt a proper standard will ever emerge that can encompass all use cases (see this), so for now I guess we’ll have to make do with two standards at opposite ends of the complexity scale.

If you’re some sort of masochist and want to see this mystical converter, it’s here. But don’t blame me if you go insane as well.

Hannah Ward
