Tuesday, August 22, 2017

A quick post on Wikipedia-scrubbing and a historical document on binary diffing

I am a huge fan of Wikipedia -- I sometimes browse Wikipedia like other people watch TV, skipping from topic to topic and - on average - being impressed by the quality of the articles.

One thing I have noticed in recent years, though, is that the grassroots-democratic principles of Wikipedia open it up to manipulation and whitewashing - Wikipedia's guidelines are strict, and a person can get a lot of negative information removed just by cleverly using the guidelines to challenge entries. This is no fault of Wikipedia -- in fact, I think the guidelines are good and useful -- but it is often instructive to read the history of a particular page.

I recently stumbled over a particularly amusing example of this, and feel compelled to write about it.

More than twelve years ago, when BinDiff was brand-new and wingraph32.exe was still the graph visualization tool of choice, there was a controversy surrounding a product called "CherryOS" - which purported to be an Apple emulator. A student had alleged on his website that "CherryOS" had misappropriated source code from an open-source project called "PearPC", and the founder of the company selling CherryOS (somebody by the name of Arben Kryeziu) had threatened the student with legal action over this claim.

In order to help a good cause, we did a quick analysis of the code similarities between CherryOS and PearPC, and found that approximately half of the code in CherryOS was verbatim copy & paste from PearPC. We wrote a small report, provided it to the student's lawyer, and the entire kerfuffle died down quickly. For a few years thereafter, Wikipedia had a page that detailed some of the drama.

When I looked at the current Wikipedia page of CherryOS, I was impressed: The page had been cleaned of any information that supported the code-theft claims, and offered a narrative in which there had never been a conclusive consensus that CherryOS was full of misappropriated code. This is not a reflection of what happened back then at all.

Anyhow, in a twist of fate, I also found an old USB stick which still contained a draft of the 2005 note we wrote. For the sake of history, here it is :-)

I had forgotten how painful it was to look at disassembly CFGs in wingraph32. Sometimes, when I am frustrated at how slowly RE tools have improved during my professional life, it is useful to be reminded what the dark ages looked like.