Automating RE Using r2pipe

09 July 2018

By Jacob Pimental

In this article we will go over Radare2’s r2pipe and its uses. R2pipe is the API for Radare2 that allows you to automate Radare2 and interact with a session from outside of Radare2. This can be used to simplify certain tasks, emulate a certain section of code, decrypt strings, or even reverse engineer multiple binaries with ease. In this specific example we will revisit a malware sample that I have detailed in a previous article titled Linux Malware Analysis—Why Homebrew Encryption is Bad. We will use r2pipe and Python to automate the process of deobfuscating strings within the binary.

The first thing we will do is create a function in Python that will decrypt the string at our current location in Radare2. I will paste the code here and then go over it in detail.

import r2pipe
from decrypt_string import decrypt_string

def decrypt_string_at_location(location):
    # Disassemble the single instruction at `location` as JSON
    info = r2.cmdj('pdj 1 @ ' + location)[0]
    if 'val' not in info:
        print('Not valid command, exiting...')
        exit()
    val_offset = info['val']

    # Make sure the value points at readable memory and is not just a number
    val_info = r2.cmdj('aij ' + hex(val_offset))
    if not val_info or not val_info.get('read'):
        print('Not a string, exiting...')
        exit()

    # Read the obfuscated string and collapse stray whitespace and newlines
    value = ' '.join(r2.cmd('pj 1 @ ' + hex(val_offset)).split())
    decrypted_string = decrypt_string(value, len(value)).strip()

    # Attach the decrypted string as a comment at `location`
    r2.cmd('CCa ' + location + ' ' + decrypted_string)

if __name__ == '__main__':
    r2 = r2pipe.open()
    location = hex(r2.cmdj('pdj 1')[0]['offset'])
    decrypt_string_at_location(location)

First, the script imports r2pipe with import r2pipe, then imports our decrypt_string function from the previous article (you can get that script on my GitHub). Execution then drops down to the main block, which opens a pipe with r2 = r2pipe.open(). The open function can also take the name of the file we wish to open in Radare2 as an argument, but since we will be running this script from inside Radare2 there is no need to specify one.

The location variable is then set using r2pipe’s cmdj function. Both cmd and cmdj let us run any Radare2 command we want: cmd returns the raw output as text, while cmdj parses the command’s JSON output for us. The command we are running is pdj 1, which disassembles the one instruction at our current address in Radare2 and outputs it in JSON format. We grab the first object in the list it returns and read its offset value, which is our current location in Radare2. The program then converts that value into a hexadecimal string and passes it to the decrypt_string_at_location function.
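
To make the difference between cmd and cmdj concrete, here is a minimal sketch that does not need Radare2 installed. FakePipe is a hypothetical stand-in for the object returned by r2pipe.open(), and its canned pdj output is invented for illustration. The only point is that cmdj behaves roughly like json.loads applied to the text that cmd returns.

```python
import json


class FakePipe(object):
    """Hypothetical stand-in for an r2pipe session (not part of r2pipe)."""

    # Canned, made-up output imitating what `pdj 1` might print.
    CANNED = {
        'pdj 1': '[{"offset": 4196000, "fcn_addr": 4195900, "val": 4825576}]',
    }

    def cmd(self, command):
        # Like cmd: return the raw text of the command output.
        return self.CANNED.get(command, '')

    def cmdj(self, command):
        # Like cmdj: parse the output as JSON, or give None if it is not JSON.
        try:
            return json.loads(self.cmd(command))
        except ValueError:
            return None


r2 = FakePipe()
raw = r2.cmd('pdj 1')        # a plain string
parsed = r2.cmdj('pdj 1')    # a list of dicts
location = hex(parsed[0]['offset'])
print(type(raw).__name__, type(parsed).__name__, location)
```

With a real session, the same two calls would go to Radare2 instead of the canned dictionary.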

The decrypt_string_at_location function uses the pdj command to get information about the instruction at the location passed in as an argument. It then checks whether the result has a “val” key, which represents the location of the value the program is putting into a register. If it does, the script uses the Radare2 aij command to check whether that location is readable. This comes in handy when someone runs the script on an instruction that puts a plain numerical value into the register: a number is not the address of anything, so the memory it "points" to is not readable and the script exits. The function then reads the value at the memory location using the pj command and collapses its whitespace into single spaces (I was having some issues where certain strings would randomly have new-line characters in them and the output looked weird). The cleaned-up value is passed to decrypt_string, and finally the function uses the CCa command to add a comment at our location in Radare2 with the deobfuscated string.
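
The two guards and the whitespace cleanup can be pulled into small pure functions, which makes them easy to try without a Radare2 session. This is only a sketch under assumptions: the helper names string_address and normalize are mine, and the sample dictionaries merely imitate the val and read keys that the script relies on.

```python
def string_address(op, addr_info):
    """Return the address referenced by an instruction, or None.

    op        -- one decoded instruction as a dict (the shape pdj returns)
    addr_info -- address information for op['val'] (the shape aij returns)
    """
    if 'val' not in op:
        return None    # the instruction has no immediate value
    if not addr_info or not addr_info.get('read'):
        return None    # the value is a number, not a readable address
    return op['val']


def normalize(text):
    # Collapse newlines and repeated whitespace into single spaces.
    return ' '.join(text.split())


# Made-up sample data, for illustration only.
mov_string = {'offset': 0x4006a0, 'val': 0x49a1e8}
mov_number = {'offset': 0x4006a5, 'val': 5}
no_value = {'offset': 0x4006aa}

print(string_address(mov_string, {'read': True}))    # the string's address
print(string_address(mov_number, {'read': False}))   # None
print(string_address(no_value, None))                # None
print(normalize('ab cd\n ef\n'))                     # ab cd ef
```

Returning None instead of calling exit() is a design choice for the sketch: it lets the caller decide whether a non-string operand should stop the script or simply be skipped.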

To run this script inside Radare2, first make sure the script is in the working directory you are analyzing the sample from. Then call the script with the command #!pipe python2 decrypt.py.

Figure: Output from R2pipe Automation

We can now deobfuscate strings on a single line of code in Radare2. This is great, but what if we had multiple references to the same string in a function? Well, we can take this script one step further and add a comment to every line that references this string. Here is the new code:

import r2pipe
from decrypt_string import decrypt_string

def decrypt_string_at_location(location):
    # The caller has already checked that this instruction has a 'val' key
    info = r2.cmdj('pdj 1 @ ' + location)[0]
    val_offset = info['val']

    # Make sure the value points at readable memory and is not just a number
    val_info = r2.cmdj('aij ' + hex(val_offset))
    if not val_info or not val_info.get('read'):
        print('Not a string, exiting...')
        exit()

    # Read the obfuscated string and collapse stray whitespace and newlines
    value = ' '.join(r2.cmd('pj 1 @ ' + hex(val_offset)).split())
    decrypted_string = decrypt_string(value, len(value)).strip()

    # Attach the decrypted string as a comment at `location`
    r2.cmd('CCa ' + location + ' ' + decrypted_string)

def decrypt_all_strings_in_function(func_location, current_location):
    # Find the value referenced by the instruction we are currently on
    info = r2.cmdj('pdj 1 @ ' + current_location)[0]
    if 'val' not in info:
        print('Not valid command, exiting...')
        exit()
    val_offset = info['val']

    # Disassemble the whole function and comment every matching reference
    func_info = r2.cmdj('pdfj @ ' + func_location)
    for line in func_info['ops']:
        if line.get('val') == val_offset:
            decrypt_string_at_location(hex(line['offset']))

if __name__ == '__main__':
    r2 = r2pipe.open()
    current = r2.cmdj('pdj 1')[0]
    func_location = hex(current['fcn_addr'])
    location = hex(current['offset'])
    decrypt_all_strings_in_function(func_location, location)

This new script uses the pdj 1 command to find the address of the function we are in, via the fcn_addr key in the JSON output. We then pass that, along with our current location, to the decrypt_all_strings_in_function function. This function grabs the value at our current location and loops through every instruction in the function, calling decrypt_string_at_location on each one that references the same string. If we run the script in Radare2, we can see that every reference to the string now has a comment with its deobfuscated version.

Figure: R2pipe deobfuscator
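
The matching step at the heart of decrypt_all_strings_in_function can also be sketched on its own. The function name find_references and the sample ops list are hypothetical; the list only imitates the ops array that the script reads from the pdfj output.

```python
def find_references(ops, target_val):
    """Return the hex offsets of every instruction whose 'val' equals target_val."""
    return [hex(op['offset']) for op in ops if op.get('val') == target_val]


# Made-up instructions standing in for func_info['ops'].
ops = [
    {'offset': 0x4006a0, 'val': 0x49a1e8},   # first reference to the string
    {'offset': 0x4006a5},                    # no immediate value at all
    {'offset': 0x4006aa, 'val': 0x49a200},   # a different string
    {'offset': 0x4006b1, 'val': 0x49a1e8},   # second reference
]

refs = find_references(ops, 0x49a1e8)
print(refs)
```

Each offset in the returned list is what the script would hand to decrypt_string_at_location.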

Using this script, we can now analyze the malware sample without having to worry about the obfuscated strings. If we run into one, we can just run our script on it to deobfuscate it everywhere it is referenced in the function! R2pipe is an amazing tool that can come in handy in any reverse engineering situation. I hope this article was helpful in covering the basics of r2pipe and scripting for Radare2. I am still learning as I go, so if I did something wrong or could have done something in a better way, please reach out to me on Twitter or LinkedIn and tell me. I always welcome criticism!

Thanks for reading and happy reversing!
