Tuesday, December 22, 2015

Open-Source BinNavi ... and fREedom

One of the cool things that my former zynamics colleagues (now at Google) did was the open-sourcing of BinNavi - a tool that I used to blog about quite frequently in the old days (here for example when it came to debugging old ScreenOS devices, or here for much more - kernel debugging, REIL etc.).

BinNavi is a GUI / IDE for performing multi-user reverse engineering, debugging, and code analysis. BinNavi allows the interactive exploration and annotation of disassemblies, displayed as browsable, clickable, and searchable graphs - based on a disassembly read from a PostgreSQL database, which can, in theory, be written by any other engine.

Writing UIs is hard work, and while there are many very impressive open-source reverse engineering tools around (Radare comes to mind first, but there are many others), the UI is often not very pretty - or convenient. My hope is that BinNavi can become the "default UI" to a plethora of open-source reverse engineering tools, and grow to realize it's full potential as "the open-source reverse engineering IDE".

One of the biggest obstacles to BinNavi becoming more widely adopted is the fact that IDA is the only "data source" for BinNavi - e.g. while BinNavi is FOSS, somebody that wishes to start reverse engineering still needs IDA to fill the Postgres database with disassembly.

To remedy this situation, Dave Aitel put up a contest: Anybody that either builds a Capstone-to-BinNavi-SQL-bridge or that adds decompilation as a feature to BinNavi gets free tickets to INFILTRATE 2016.

Last week Chris Eagle published fREedom, a Python-based tool to disassemble x86 and x86_64 programs in the form of PE32, PE32+, and ELF files. This is pretty awesome - because it means that BinNavi moves much closer to being usable without any non-free tools.

In this blog post, I will post some first impressions, observations, and screenshots of fREedom in action.

My first test file is putty.exe (91b21fffe934d856c43e35a388c78fccce7471ea) - a relatively small Win32 PE file, with about ~1800 functions when disassembled in IDA.

Let's look at the first function:
Left: IDA's disassembly output. Right: fREedom's disassembly output
So disassembly, CFG building etc. has worked nicely. Multi-user commenting works as expected, as does translation to REIL. Callgraph browsing works, too:

The great thing about having fREedom to start from is that further improvements can be incremental and layered - people have something good to work from now :-) So what is missing / needs to come next? 
  1. fREedom: Function entry point recognition is still relatively poor - out of the ~1800 functions that IDA recognizes in putty, only 430 or so are found. This seems like an excellent target for one of those classical "using Python and some machine learning to do XYZ" blog posts.
  2. fREedom: The CFG reconstruction and disassembly needs to be put through it's paces on big and harder executables.
  3. BinNavi: Stack frame information should be reconstructed - but not by fREedom, but within BinNavi (and via REIL). This will require digging into (and documenting) the powerful-but-obscure type system design.
  4. BinNavi: There has been some bitrot in many areas of BinNavi since 2011 - platforms change, systems change, and there are quite some areas that are somewhat broken or need updating (for example debugging on x64 etc.). Time to brush off the dust :-)
Personally, I am both super happy and pretty psyched about fREedom + BinNavi, and I hope that the two can be fully integrated so that BinNavi always has fREedom as default disassembly backend.

Wednesday, December 16, 2015

A decisionmaker's guide to buying security appliances and gateways

With the prevalence of targeted "APT-style" attacks and the business risks of data breaches reaching the board level, the market for "security appliances" is as hot as it has ever been. Many organisations feel the need to beef up their security - and vendors of security appliances offer a plethora of content-inspection / email-security / anti-APT appliances, along with glossy marketing brochures full of impressive-sounding claims.

Decisionmakers often compare the offerings on criteria such as easy integration with existing systems, manageability, false-positive-rate etc. Unfortunately, they often don't have enough data to answer the question "will installing this appliance make my network more or less secure?".

Most security appliances are Linux-based, and use a rather large number of open-source libraries to parse the untrusted data stream which they are inspecting. These libraries, along with the proprietary code by the vendor, form the "attack surface" of the appliance, e.g. the code that is exposed to an outside attacker looking to attack the appliance. All security appliances require a privileged position on the network - a position where all or most incoming and outgoing traffic can be seen. This means that vulnerabilities within security appliances give an attacker a particularly privileged position - and implies that the security of the appliance itself is rather important.

Installing an insecure appliance will make your network less secure instead of safer. If best engineering practices are not followed by the vendor, a mistake in any of the libraries parsing the incoming data will compromise the entire appliance.

How can you decide whether an appliance is secure or not? Performing an in-depth third-party security assessment of the appliance may be impractical for financial, legal, and organisational reasons.

Five questions to ask the vendor of a security appliance

In the absence of such an assessment, there are a few questions you should ask the vendor prior to making a purchasing decision:

  1. What third-party libraries interact directly with the incoming data, and what are the processes to react to security issues published in these libraries?
  2. Are all these third-party libraries sandboxed in a sandbox that is recognized as industry-standard? The sandbox Google uses in Chrome and Adobe uses in Acrobat Reader is open-source and has undergone a lot of scrutiny, so have the isolation features of KVM and qemu. Are any third-party libraries running outside of a sandbox or an internal virtualization environment? If so, why, and what is the timeline to address this?
  3. How much of the proprietary code which directly interacts with the incoming data runs outside of a sandbox? To what extent has this code been security-reviewed?
  4. Is the vendor willing to provide a hard disk image for a basic assessment by a third-party security consultancy? Misconfigured permissions that allow privilege escalation happen all-too often, so basic permissions lockdown should have happened on the appliance.
  5. In the case of a breach in your company, what is the process through which your forensics team can acquire memory images and hard disk images from the appliance?
A vendor that takes their product quality (and hence your data security) seriously will be able to answer these questions, and will be able to confidently state that all third-party parsers and a large fraction of their proprietary code runs sandboxed or virtualized, and that the configuration of the machine has been reasonably locked down - and will be willing to provide evidence for this (for example a disk image or virtual appliance along with permission to inspect).

Why am I qualified to write this?

From 2004 to 2011 I was CEO of a security company called zynamics that was acquired by Google in 2011. Among other things, we used to sell a security appliance that inspected untrusted malware. I know the technical side involved with building such an appliance, and I understand the business needs of both customers and vendors. I also know quite a bit about the process of finding and exploiting vulnerabilities, having worked in that area since 2000.

Our appliance at the time was Debian-based - and the complex processing of incoming malware happened inside either memory-safe languages or inside a locked-down virtualized environment (emulator), inside a reasonably locked-down Linux machine. This does not mean that we never had security issues (we had XSS problems at one point where strings extracted from the malware could be used to inject into the Web UI etc.) - but we made a reasonable effort to adhere to best engineering practices available to keep the box secure. Security problems happen, but mitigating their impact is not rocket science - good, robust, and free software exists that can sandbox code, and the engineering effort to implement such mitigations is not excessive.

Bonus questions for particularly good vendors

If your vendor can answer the 5 questions above in a satisfactory way, his performance is already head-and-shoulders above the industry average. If you wish to further encourage the vendor to be proactive about your data security, you can ask the following "bonus questions":
  1. Has the vendor considered moving the Linux on their appliance to GRSec in order to make privilege escalations harder?
  2. Does the vendor publish hashes of the packages they install on the appliance so in case of a forensic investigation it is easy to verify that the attacker has not replaced some?