-atlas wandering-
   


Bloggorama for breaking things


atlasutils-2.2.17.tgz


disass-3.04.tgz

       
Fri, 16 Dec 2011

canyoucrackit?


hacking is prime time. whether you are good at throwing metasploit payloads, cracking codes, finding bugs, or writing 0-day there are many interested organizations that would like your acquaintance. such groups include organized crime, nation-states, mega-corporations, terrorists, and boutique consultancies. jobs (legitimate ones) range from penetration testing to exploit-development, and -- apparently -- spy?

- intro to challenge
the gchq (aka british intelligence) recently released a challenge, arguably to generate interest and find smart people to fill the role of "spy-related hacker/code-breaker". i usually gloss over things that don't promise good vulnerability research/exploitation. there's only so many hours in a day, and i have a family to protect (from my own neglect). however, this challenge, hosted at http://canyoucrackit.co.uk, piqued my interest, mostly because of the good hex-pr0n.

stage1


the challenge (in case the site is down) seemed to consist of the following png, the message "Can you crack it?" and a web field to enter a Key.



the hex bytes were enough to get me to overcome the time/value dilemma and dive in. intelligence and code-breaking isn't really my strong suit, but i enjoy learning new aspects of my PRECIOUSSSS binary space.

my first thought was 'it looks like x86 instructions'. EB always looks interesting (as it's the first byte of the short 'jmp' instruction, something also common in videogame-cheat identification). in fact, EA, EB, E8, and E9 are all interesting, being a combination of 'call's and 'jmp's. also of interest are the various 41 41 41 41 and 42 42 42 42 bytes, which most hackers will identify as the most common bytes to use in fuzzing (although i'm partial to the 40 byte, the @ symbol). a few other things jumped out at me, including the part that solidified my confidence that it was x86 machine code: cd 80 ("int 0x80", the instruction used by user-land code to communicate with the linux or bsd kernel).
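that kind of eyeballing can be roughed out in code. a minimal sketch, assuming a naive byte scan is good enough for a first look; the sample bytes are loosely modeled on the challenge (a short jmp over four bytes, then an int 0x80) and are not the real dump:

```python
# one-byte opcodes called out in the text
INTERESTING = {
    0xE8: "call rel32",
    0xE9: "jmp rel32",
    0xEA: "jmp far ptr16:32",
    0xEB: "jmp rel8",
}


def scan(blob):
    """naive scan: ignores instruction boundaries, so expect false positives."""
    hits = []
    for off, byte in enumerate(blob):
        if byte in INTERESTING:
            hits.append((off, INTERESTING[byte]))
        if blob[off:off + 2] == b"\xcd\x80":
            hits.append((off, "int 0x80"))
    return hits


# illustrative sample only: eb 04, four skipped bytes, cd 80
sample = bytes.fromhex("eb04afc2bfa3cd80")
for off, what in scan(sample):
    print(hex(off), what)
```

a real disassembler tracks instruction lengths, so anything this flags is only a hint about where to start looking.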

i've been working with disassemblers/emulators for a while, mostly based on invisigoth's ENVI disassembly/emulation framework. so i saw this as a great opportunity to play with an emulator. i've done similar things in the past, but it's always been a one-off solution, slammed together in a frustrating/frantic rush. also, for some time i've been using a hacked-together cli tool which gives me command-line access to x86 disassembly. i decided to take some time and extend the cli disassembly tool to include some rapid emulator-spinup helper code. the result is the new (currently slightly ugly) envi_x86.py addition to atlasutils. note: many of my tools are designed to be used from bash or ipython. note: most of the code is ugly.

the twist


slapping this in the emulator produced no identifiable results, so i loaded the bytes into a disassembler and did some analysis. the code reads like shellcode, with a jmp/callback method of finding EIP and relative in-memory resources/gchq.

analyzing the code and watching it in the emulator, it became obvious pretty quickly that it was some form of crypto. at first i thought it might be a one-way hash, but it seemed to use a key (included in the code, based on the number 0xdeadbeef) to mutate some data. after some wikipedia searching on the various common crypto-algorithms, it looked very much like RC4.

problem is, the "data" that gets mutated (decrypted) was missing! in the emulator, it first gave me a SegFault because it was trying to read past the bytes i gave it. at first, i worked to give it what it wanted... which started off with 0x42424242, followed by a length. i tried to hand it 256 NULLs, hoping the mutation would turn them into interesting data. i then handed in 256 A's (41) followed by 256 B's (42), followed by 0xa0 of each of those, followed by the code itself. there is a 4-byte segment in the bytes which is jumped over at the beginning of the code, and never referenced, so i thought perhaps they used that to true up the instruction bytes to spit out some code (yes, i realize how hard that would have been for them... but i tried it anyway).

then i started looking at the web site, searching around for more data... the web page itself had no distinguishable oddities... i checked out the CSS and javascript looking for some hints. the image had no comments i could find, but running strings on the png shows the following base64 string:

QkJCQjIAAACR2PFtcCA6q2eaC8SR+8dmD/zNzLQC+td3tFQ4qx8O447TDeuZw5P+0SsbEcYR
78jKLw==



decode that:
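a minimal python sketch of that decode, using only the stdlib base64 and struct modules. the variable names are just for illustration, and reading the length as a little-endian dword is an assumption based on this being x86:

```python
import base64
import struct

# the string pulled out of the png with `strings`
B64 = (
    "QkJCQjIAAACR2PFtcCA6q2eaC8SR+8dmD/zNzLQC+td3tFQ4qx8O447TDeuZw5P+0SsbEcYR"
    "78jKLw=="
)

blob = base64.b64decode(B64)

tag = blob[:4]                              # marker bytes
(length,) = struct.unpack("<I", blob[4:8])  # little-endian dword
payload = blob[8:]                          # whatever follows the header

print(tag, hex(length), len(payload))
print(payload.hex())
```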



the first four bytes of this decode to "BBBB" (42424242). the next dword makes the number 0x32, which is the length of the remaining bytes. i think we found our encrypted portion.



and the output:



added challenge: the bug!


so i plugged all this into the emulator, called emu.runmap:() and when it terminated with the "INT 0x80" instruction (calling SYS_EXIT), the memory location held a whole lotta nothing. i kept plugging away, changing the way things were laid out, debugging, and eventually gave up. something was terribly wrong.

vdb to the rescue


eventually, i decided to use the real machine. to do this, i loaded up a binary in my favorite debugger, VDB. the idea is simple: run a program, attach and stop it with the debugger, copy the bytes into an executable location, set the Program Counter to the start of the bytes, and run it. since i run 64-bit linux, i had to find a 32-bit binary, since these were 32-bit x86 instructions. running "file /usr/bin/* |grep 32" found several usable binaries, and i chose to run "wineserver" which worked great. the last executed instruction of these bytes would terminate the program (and wipe memory) by asking the kernel to SYS_EXIT, so we want to place a breakpoint before that is executed. since VDB is a programmable debugger, i used the following script to load up the bytes and prepare it to run.



and the resulting memory looks like this.


emulator fixed!


since emulation is so important to me, i wrote a tool that locked the emulator and a debugger in step and compared the registers and memory after each instruction.

side note


if you are only interested in the challenge, move to the next section.

during lock-step emulation, i discovered a bug in the emulation handler of the ROR instruction. ROR is intended to rotate right (bits shift right, but instead of just falling off the low end, those bits get placed back at the most significant end). the handler was written exactly as described in the IA32 manuals, but whereas the manual dealt with size in terms of bits, the handler was written using the size in terms of bytes. fixing this allowed the emulator to work correctly as well.
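to make the bits-vs-bytes mixup concrete, here is a small sketch that paraphrases the manual-style loop. this is not the actual ENVI handler code; the function name and the sample value are assumptions for illustration:

```python
def ror_manual(dest, count, size):
    """rotate right, one bit per loop pass; `size` must be the width in BITS."""
    temp_count = count % size
    while temp_count:
        temp_cf = dest & 1                            # bit about to fall off the low end
        dest = (dest >> 1) + (temp_cf << (size - 1))  # put it back on the high end
        temp_count -= 1
    return dest


value = 0xdeadbeef                 # arbitrary 32-bit sample
good = ror_manual(value, 8, 32)    # size given in bits
bad = ror_manual(value, 8, 4)      # same operand, size mistakenly given in bytes

print(hex(good), hex(bad))
```

with the byte count passed in, the rotate count gets reduced modulo 4 instead of 32 and the wrapped bit lands at bit 3 instead of bit 31, so the result goes wrong in more than one way.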



and the lock-step emulator test script:
download the emulator testing script here

so the answer is GET /15b436de1f9107f3778aad525e5d0b20.js HTTP/1.1, which many of you will recognize as part of an HTTP GET request. without having any host information, i used the www.canyoucrackit.co.uk name and sure enough, it downloads a javascript file.
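for reference, the kind of mutation stage 1 performs can be sketched with textbook RC4. this is the generic algorithm, not a transcription of the challenge bytes (i only said it "looked very much like" RC4), and the key/data below are the well-known "Key"/"Plaintext" test pair rather than anything from the challenge:

```python
def rc4(key, data):
    """textbook RC4: key scheduling, then xor the data with the keystream."""
    # key scheduling: start from the identity permutation and shuffle it
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]

    # keystream generation, xor'd into the data byte by byte
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())
print(rc4(b"Key", ciphertext))   # same call decrypts
```

the identity permutation in the first loop is the same "each byte equals its offset" setup that shows up again in stage 2.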

stage2


the javascript file turns out to be, as it states, "stage 2 of 3". crap. i thought i'd be done with this thing so i could get back to my life... no such luck. thankfully, this stage proved just as interesting and fun!

download 15b436de1f9107f3778aad525e5d0b20.js here and open it in vim... oh, and set your color scheme to green on black ;)

so they give us some setup for a virtual machine, describe how it is supposed to work, and give us an empty "exec" function for us to implement. sadly, i couldn't turn away from this either. i like writing disassemblers and emulators, and this instruction-set was really simple, so i did it. took an hour to write the emulator, four hours to debug, and i got a little pissed off at the gchq for forgetting to mention one small vital detail, which i'll share in a minute. javascript schmavascript, i wrote mine in python. the first bug i found was truly my fault.

added challenge: the bug!


so the emulator i wrote worked for the most part. the codeflow for an emulator is simple, and similar from platform to platform. read in the byte(s) necessary to determine the instruction type and decoding pattern, read in any additional required bytes, decode the instruction and its operands, then perform the emulation steps as close to the specification as possible. the emulator was great, but i noticed that the code they provided in the memory variable 'mem' wrote the number of the destination address (offset into the data segment) into the location. this didn't really stick out to me as odd, since the identity permutation (each byte is equal to its offset from the start) is the first thing that happens in stage1! turns out, i copied and pasted wrong... so i was using the same operand for data and destination... doh!

yeah, but now i'm pissed.


the thing they didn't tell us... the memory architecture is a 16-byte segmented address-space, with a code-segment register and a data segment register. all data references are assumed to be offsets from 16*DS (data segment register). all code references (ie. the next instruction and jmps) are either inferred to be offsets from 16*CS (code segment register), or explicitly given in a second operand (in the case of 'jmp r2:r1').
however, when using the 'jmp' instruction variant that provides a code segment, not only are we to recalculate the new instruction pointer based on the explicit code-segment operand, but undocumentedly, we have to update the CS register with the new value. wtf?
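a minimal sketch of that addressing scheme, assuming a bare-bones cpu object. the class, method names, and segment values are mine for illustration and are not lifted from stage2.py:

```python
class Cpu:
    """just enough state to show the segment math."""

    def __init__(self, cs=0, ds=0):
        self.cs = cs    # code segment register
        self.ds = ds    # data segment register
        self.ip = 0     # offset within the code segment

    def data_addr(self, offset):
        # all data references are offsets from 16*DS
        return 16 * self.ds + offset

    def code_addr(self):
        # the next instruction lives at 16*CS + ip
        return 16 * self.cs + self.ip

    def jmp_near(self, offset):
        # plain jmp: CS stays put, only the offset changes
        self.ip = offset

    def jmp_far(self, segment, offset):
        # 'jmp r2:r1' style: the undocumented part is that CS itself
        # has to be updated, not just used for this one calculation
        self.cs = segment
        self.ip = offset


cpu = Cpu(cs=0, ds=0x10)            # arbitrary sample segment values
print(hex(cpu.data_addr(5)))
cpu.jmp_far(0x10, 0)
cpu.jmp_near(4)                     # later near jmps now resolve against the new CS
print(hex(cpu.code_addr()))
```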

however, once we make the changes, everything works out ok. the output looks like this:


download the stage2.py emulator code here. is your screen color set up correctly? this code may not work on systems with other screen schemes.

stage3


so now we get to some fun stuff... finally, a PE binary (run file on it if you doubt). i prefer ELF, but PE is fun. let's see what this thing does.... threw it into the disassembler, and discovered what it does. the executable opens 'license.txt' in the current directory and reads the first line. it first checks the first four bytes against "gchq" (the number 0x71686367 in LSB is 67 63 68 71... or 'g' 'c' 'h' 'q'),



then runs the next 8 bytes through posix crypt() function using a set SALT of 'hq' (see wikipedia for info on SALTs in cryptography).





if they match, the next three dwords are used to create a URL and call back to a provided host. handing in www.canyoucrackit.co.uk, the executable connects and pulls the following url:

http://www.canyoucrackit.co.uk/hqDTK7b8K2rvw/dword0/dword1/dword2/key.txt
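the license layout described above can be sketched like this. assumptions: the fields sit back to back with no padding and the dwords are little-endian (it's an x86 binary). the crypt() comparison is left out of the sketch, and the function name is hypothetical:

```python
import struct

MAGIC = 0x71686367   # 'gchq' when stored least-significant byte first


def parse_license(line):
    """split the first line of license.txt into (8 crypt input bytes, 3 dwords)."""
    magic, secret, d0, d1, d2 = struct.unpack_from("<I8s3I", line)
    if magic != MAGIC:
        raise ValueError("license does not start with 'gchq'")
    return secret, (d0, d1, d2)


# made-up license contents, just to exercise the layout
sample = b"gchq" + b"AAAAAAAA" + struct.pack("<3I", 1, 2, 3)
print(struct.pack("<I", MAGIC))
print(parse_license(sample))
```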




two problems. what's the crypted thing??? and what's the next three dwords?!?

problem 1: the crypted 8-bytes... so i started with all sorts of cracking techniques, including wrapping several 'john --stdout' setups to permute different password attacks. after further analysis, however, it became pretty clear that the 8 bytes were simply used for "client-side auth". ie. i just NOP'd out the comparison and conditional jmp instruction, and skipped the whole mess. insert my favorite 8 bytes, and on to the next problem...
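NOP'ing out a check boils down to overwriting the instruction bytes with 0x90. a minimal sketch; the offsets and the sample bytes (test eax,eax / jnz +5) are made up and are not taken from the real executable:

```python
NOP = 0x90   # single-byte x86 no-op


def nop_out(image, start, length):
    """return a copy of `image` with `length` bytes at `start` replaced by NOPs."""
    patched = bytearray(image)
    patched[start:start + length] = bytes([NOP]) * length
    return bytes(patched)


# illustrative bytes only: 85 c0 = test eax,eax ; 75 05 = jnz +5
code = bytes.fromhex("85c07505")
print(nop_out(code, 2, 2).hex())   # knock out just the conditional jmp
```

the patched image keeps its length, so every other offset in the file stays valid.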




problem 2: the three dwords to be used in the url. i looked around for any hints as to what could be those 3 dwords, finding nothing (except that they compiled it all in cygwin, and had dependencies on a couple cygwin libraries). i tried a few oddities, including nulls and other numbers. the trick was remembering unnecessary dwords from previous stages. the first instruction of stage 1 is "jmp 04", skipping over four bytes (af c2 bf a3) that are never again referenced. stage2 has two dwords in the firmware variable.

'firmware': [0xd2ab1f05, 0xda13f110]

so i suspected that if i did an HTTP GET on some permutation of the following URL, i'd finish the challenge:

http://www.canyoucrackit.co.uk/hqDTK7b8K2rvw/a3bfc2af/d2ab1f05/da13f110/key.txt
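that url is just the leftover bytes read as dwords. a small sketch of how it falls out, assuming the four skipped stage 1 bytes are read little-endian and that each dword is written as 8 lowercase hex digits (the format string is my assumption, not pulled from the binary):

```python
import struct

BASE = "http://www.canyoucrackit.co.uk/hqDTK7b8K2rvw"

stage1_skipped = bytes.fromhex("afc2bfa3")     # the bytes jumped over in stage 1
firmware = [0xd2ab1f05, 0xda13f110]            # from the stage 2 javascript

(d0,) = struct.unpack("<I", stage1_skipped)    # little-endian read
url = "%s/%08x/%08x/%08x/key.txt" % (BASE, d0, firmware[0], firmware[1])
print(url)
```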

added challenge: the time!
the twist... it never worked. i tried every permutation of this URL, swapping byte-arrangement for the numbers, and eventually wrote a script that randomly selected from the valid bytes to form a new url. i only ever got to determine that i had the right answer by looking up a cheat writeup that detailed the last section.
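the systematic part of that search (every ordering of the three dwords, each in either byte order) is small enough to enumerate. a sketch under the same assumed url format of 8 lowercase hex digits per dword; this is not the script i actually ran, and the helper names are hypothetical:

```python
from itertools import permutations, product

BASE = "http://www.canyoucrackit.co.uk/hqDTK7b8K2rvw"
DWORDS = [0xa3bfc2af, 0xd2ab1f05, 0xda13f110]


def byteswap(dword):
    """flip a 32-bit value between little- and big-endian byte order."""
    return int.from_bytes(dword.to_bytes(4, "little"), "big")


def candidates(dwords):
    """yield one url per ordering and per byte-order choice, without repeats."""
    seen = set()
    for order in permutations(dwords):
        for swaps in product((False, True), repeat=len(order)):
            parts = tuple(byteswap(d) if s else d for d, s in zip(order, swaps))
            if parts not in seen:
                seen.add(parts)
                yield "%s/%08x/%08x/%08x/key.txt" % ((BASE,) + parts)


urls = list(candidates(DWORDS))
print(len(urls))
print(urls[0])
```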

if i'm not mistaken, the gchq, interested in only british citizens, failed to consider other time-zones in their web app. i thought it was quite considerate that they gave me until midnight in my own time zone... but probably not. as i'm several hours behind GMT, depending on the time of year, when it showed that i had 50 minutes left (when i got around to the answer), the challenge had already been over for hours. oops.

oh well, so i didn't get it. but i grew some, had a *lot* of fun, and "sighted in" my toolset. and since i'm not a british citizen, i'm doubting they'd want to hire me anyway :) especially since i got a lotta irish in my veins.

however... just to recap some lessons learned, and other thoughts i want to leave you with...

* hacking is not a good way to make money.
it is a passion you pursue. money just sometimes ensues...

* never give up

* take good notes along the way... and review when you are stuck. you never know when tidbits of odd info may help you.

* identify your weak points
- fix them

* understanding... grok the planet!

* unit testing! holy crap. both stage 1 and stage 2 in my work would have been improved had i spent the time to write unit tests for each instruction emulated.

if you found this interesting enough to walk through the steps on your own machine, or otherwise enjoy breaking software/hardware, drop me an email at atlas@r4780y.com.

good. now i can get back to writing my talk for shmoocon and truing up some code.

gloria a Dios, and merry Christmas all.