Reverse-engineering with strace

Jan 12, 2017 by joonas.fi

Let’s learn some reverse-engineering with strace by inspecting how Docker interacts with its server API (Docker uses client-server model).

Docker has API documentation so it’s not like this is necessary, but this serves as a good example on how to reverse-engineer black boxes.

Install strace

First, make sure you have strace, curl and jq installed:

$ apt-get install -y strace curl jq

System calls? What and why?

strace stands for “system trace”. With strace we can trace system calls. System calls are calls that programs make to communicate with the Kernel.

Kernel calls are needed for any I/O such as disk, network etc. Why are they needed? Because the only way for a program to read/write to/from disk or to/from network is to utilize disk/network drivers, and it would be stupid for each program to implement those drivers themselves. Instead, the drivers are implemented only once and they are hosted in the Kernel. Userspace programs (= “normal programs”) communicate with those drivers by calling functions in the Kernel (“syscall”).

Why would it be stupid to implement disk/network/etc drivers ourselves? If you used to play DOS games in ye oldie (but golden) times, you probably recall that you had to choose and configure a sound card driver before you could play that game. That meant that each game had to contain sound card drivers for each type of device. That was inefficient in both time and money, as each game developer had to develop drivers for each sound card (or even video card) and also probably purchase those cards and test the drivers against them. It also slowed the pace of innovation and competition, as it took some time for new games to support newer sound cards that came to the market.

So in short: system calls are awesome because they let your program just abstractly ask the Kernel to read a file, and not care about whether:

The disk is a HDD, SSD, CD-ROM or Blu-ray.
It’s attached via SATA, IDE, USB or even as a network drive.
The filesystem is FAT32, NTFS, ext4 etc.
The block level on the disk is encrypted or not.
Etc etc.

Quick introduction to strace

Okay, let’s run a short program first without using strace:

$ cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.1 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.1 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
UBUNTU_CODENAME=xenial

cat is a program that prints a file to stdout - usually which is connected eventually to your screen.

Now, let’s run that very same program but with strace:

$ strace cat /etc/os-release
execve("/bin/cat", ["cat", "/etc/os-release"], [/* 21 vars */]) = 0
... snipped ...
open("/etc/os-release", O_RDONLY)       = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=274, ...}) = 0
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
mmap(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb6f4c4e000
read(3, "NAME=\"Ubuntu\"\nVERSION=\"16.04.1 L"..., 131072) = 274
write(1, "NAME=\"Ubuntu\"\nVERSION=\"16.04.1 L"..., 274NAME="Ubuntu"
VERSION="16.04.1 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.1 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
UBUNTU_CODENAME=xenial
) = 274
read(3, "", 131072)                     = 0
munmap(0x7fb6f4c4e000, 139264)          = 0
close(3)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++

strace dumps its tracing output to stderr, while stdout is directly connected to the running program.

That means that:

By running strace cat /etc/os-release we see both the output from strace and the original program (as pictured above).
By running strace cat /etc/os-release > /dev/null we ignore stdout => we only see output from strace.
By running strace cat /etc/os-release 2> /dev/null we ignore stderr => we only see output from the original program (this is effectively useless).

Demonstration:

$ strace cat /etc/os-release 2> /dev/null
NAME="Ubuntu"
VERSION="16.04.1 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.1 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
UBUNTU_CODENAME=xenial

$ strace cat /etc/os-release 1> /dev/null
execve("/bin/cat", ["cat", "/etc/os-release"], [/* 21 vars */]) = 0
... snipped ...
open("/etc/os-release", O_RDONLY)       = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=274, ...}) = 0
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
mmap(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f171a572000
read(3, "NAME=\"Ubuntu\"\nVERSION=\"16.04.1 L"..., 131072) = 274
write(1, "NAME=\"Ubuntu\"\nVERSION=\"16.04.1 L"..., 274) = 274
read(3, "", 131072)                     = 0

The “ignore stdout” is relevant because if your original program produces much output, we might not be interested in that at all.

So, from the above output we can see that cat program:

Opens a file (open) /etc/os-release as readonly (O_RDONLY).
Issues read calls to file descriptor #3, receiving the file contents (NAME="Ubuntu"\n...).
Other file descriptors would be #0 for stdin, #1 for stdout, #2 for stderr and 3..N are other open file descriptors.
In this case the file that was specified in the only argument to cat ended up as fd #3.
Immediately after reading the content, cat outputs the same content into its own stdout: write(1, same content as what was read...).

Okay, now that we know the basics of strace, let’s move on.

Using strace with real-life programs

Okay, we have this docker command, that we want to reverse engineer:

$ docker service ps whoami
ID                         NAME      IMAGE                     NODE    DESIRED STATE  CURRENT STATE              ERROR
aps1k88b6rv0dcxnh6tstgjc4  whoami.1  emilevauge/whoami:latest  master  Running        Running about an hour ago

That command interrogates the Docker API to list tasks (whoami.1, identified by aps1k88b6rv0dcxnh6tstgjc4) running for a specific service (whoami).

Okay, since we know that docker is a client for a client-server model, therefore it accesses the server API, probably over TCP or a Unix socket. Both of those we can crudely inspect with strace.

We know that Docker uses HTTP protocol to communicate with the API, so we’ll search for a string GET (= start of most HTTP requests) to validate that assumption:

$ strace docker service ps whoami | grep GET
execve("/usr/bin/docker", ["docker", "service", "ps", "whoami"], [/* 21 vars */]) = 0
uname({sysname="Linux", nodename="master", ...}) = 0
brk(NULL)                               = 0x2360000
brk(0x23611c0)                          = 0x23611c0
... snipped ...

Okay, we made our first mistake. So we grepped for GET but none of the output lines contain that keyword, so what’s going on?

Remember that strace (as each program) outputs two streams: stdout and stderr. The log output we wanted to grep from is in stderr, but the shell | operator pipes stdout to the next program. So we were grepping from the wrong stream. This drawing illustrates the problem:

+------------------------------+      +-------------+                  +-------------------+
|                         +------+    |NAME="Ubuntu"|     +------+     |                   |
|                         |stdout+---->...          +----->$ grep+----->                   |
|                         +------+    +-------------+     +------+     |                   |
| $ strace cat /etc/os-release |                                       |    Your screen    |
|                         +------+    +-------------+                  |                   |
|                         |stderr+---->execve()     +------------------>                   |
|                         +------+    |...          |                  |                   |
+------------------------------+      +-------------+                  +-------------------+

So, knowing that, let’s redirect strace’s stderr to stdout so we can grep it (2>&1):

$ strace docker service ps whoami 2>&1 | grep GET

Nothing, so we were expecting a write call with a string GET, but there was no matches. Let’s try just listing the write calls:

$ strace docker service ps whoami 2>&1 | grep write

Nothing, that’s weird! When running again:

$ strace docker service ps whoami 2>&1 | grep write
write(3, "GET /v1.24/services/whoami HTTP/"..., 95) = 95

Why are those write calls randomly showing up and sometimes not? Well, that relates to the underlying programming language used by Docker, the Go language. Go takes advantage of processors with multiple cores by spawning multiple processes, and sometimes that write call shows up in the main process that we launched, and sometimes not.

There’s an option to strace that we can use:

-f      follow forks

This means that when Go spawns child processes, strace monitors those as well. We probably should’ve used that switch by default, so we don’t have to take guesses about the inner workings of each program to debug from now on. Continuing with that:

$ strace -f docker service ps whoami 2>&1 | grep write
[pid  9337] <... write resumed> )       = 95
[pid  9337] write(3, "GET /v1.24/tasks?filters=%7B%22s"..., 160 <unfinished ...>
[pid  9337] <... write resumed> )       = 160
[pid  9336] write(3, "GET /v1.24/services/7g1xbqf95ly6"..., 114 <unfinished ...>
[pid  9336] <... write resumed> )       = 114
[pid  9338] write(3, "GET /v1.24/nodes/7xynabmkp3r117p"..., 111 <unfinished ...>
... snipped ...

Observation: each outbound HTTP request had its own process (-f option prefixes each line with a process id). So they must take heavy use of Go’s goroutines for easy parallelization of computation. :)

Now, each time we run that same command we consistently see all the writes, awesome!

Now moving forward with our initial plan of only scanning for the GET strings to identify only the calls relating to outbound HTTP GET requests:

$ strace -f docker service ps whoami 2>&1 | grep GET
[pid  9361] ioctl(2, TCGETSstrace: Process 9365 attached
[pid  9361] ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0
[pid  9361] ioctl(1, TCGETS, 0x7ffecdce30c0) = -1 ENOTTY (Inappropriate ioctl for device)
[pid  9364] write(3, "GET /v1.24/services/whoami HTTP/"..., 95 <unfinished ...>
[pid  9363] write(3, "GET /v1.24/tasks?filters=%7B%22s"..., 160 <unfinished ...>
[pid  9364] write(3, "GET /v1.24/services/7g1xbqf95ly6"..., 114) = 114
[pid  9364] write(3, "GET /v1.24/nodes/7xynabmkp3r117p"..., 111) = 111

Okay, strace seems to truncate the write command, because we’re not seeing the write argument in its entirety - we are not seeing the entire URLs which we are interested about. There’s a switch for that:

-s strsize     limit length of print strings to STRSIZE chars (default 32)

So:

$ strace -s 64 -f docker service ps whoami 2>&1 | grep GET
[pid  9377] ioctl(2, TCGETS <unfinished ...>
[pid  9377] ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0
[pid  9377] ioctl(1, TCGETS <unfinished ...>
[pid  9381] write(3, "GET /v1.24/services/whoami HTTP/1.1\r\nHost: docker\r\nUser-Agent: D"..., 95) = 95
[pid  9381] write(3, "GET /v1.24/tasks?filters=%7B%22service%22%3A%7B%227g1xbqf95ly6bt"..., 160 <unfinished ...>
[pid  9380] write(3, "GET /v1.24/services/7g1xbqf95ly6btrwyglymrigt HTTP/1.1\r\nHost: do"..., 114) = 114
[pid  9380] write(3, "GET /v1.24/nodes/7xynabmkp3r117pj9rxhln9k9 HTTP/1.1\r\nHost: docke"..., 111) = 111

# better. let's also:
# - filter only to the `write` lines that contain keyword GET
# - increase string length to 96 because we were not seeing the /tasks?filters=... URL fully

$ strace -s 96 -f docker service ps whoami 2>&1 | grep write | grep GET
[pid  9442] write(3, "GET /v1.24/services/whoami HTTP/1.1\r\nHost: docker\r\nUser-Agent: Docker-Client/1.12.3 (linux)\r\n\r\n", 95 <unfinished ...>
[pid  9439] write(3, "GET /v1.24/tasks?filters=%7B%22service%22%3A%7B%227g1xbqf95ly6btrwyglymrigt%22%3Atrue%7D%7D HTTP"..., 160 <unfinished ...>
[pid  9442] write(3, "GET /v1.24/services/7g1xbqf95ly6btrwyglymrigt HTTP/1.1\r\nHost: docker\r\nUser-Agent: Docker-Client/"..., 114 <unfinished ...>
[pid  9442] write(3, "GET /v1.24/nodes/7xynabmkp3r117pj9rxhln9k9 HTTP/1.1\r\nHost: docker\r\nUser-Agent: Docker-Client/1.1"..., 111 <unfinished ...>

Okay, we know that Docker makes a HTTP request to path /v1.24/services/whoami. Does it use HTTP or a Unix socket?

Searching from the full output of strace just before the first write we find:

$ strace -s 64 -f docker service ps whoami 1>/dev/null | less -S
... snip ...
[pid  9401] setsockopt(3, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0
[pid  9401] connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/docker.sock"}, 23) = 0
... snip ...
[pid  9401] write(3, "GET /v1.24/services/whoami HTTP/1.1\r\nHost: docker\r\nUser-Agent: D"..., 95 <unfinished ...>
... snip ...

Okay, we saw that it uses a Unix socket and its location in the filesystem is /var/run/docker.sock.

Lets try connecting to Docker API ourself without the official Docker client:

$ curl --unix-socket /var/run/docker.sock http:/v1.24/services/whoami
# snipped: non-pretty JSON output in one line

# if you have jq installed, you can pipe JSON into it and have it pretty-printed
$ curl --unix-socket /var/run/docker.sock http:/v1.24/services/whoami | jq .
{
  "ID": "7g1xbqf95ly6btrwyglymrigt",
  "Version": {
    "Index": 808
  },
  "CreatedAt": "2017-01-11T12:37:55.171414273Z",
  "UpdatedAt": "2017-01-11T13:19:20.515932494Z",
  "Spec": {
    "Name": "whoami",
    "Labels": {
      "traefik.enable": "true",
... snipped
}

Now, checking out the /tasks?filters... endpoint:

$ curl --unix-socket /var/run/docker.sock http:/v1.24/tasks?filters=%7B%22service%22%3A%7B%227g1xbqf95ly6btrwyglymrigt%22%3Atrue%7D%7D | jq .
[
  {
    "ID": "aps1k88b6rv0dcxnh6tstgjc4",
    "Version": {
      "Index": 728
    },
    "CreatedAt": "2017-01-11T12:37:55.17994955Z",
    "UpdatedAt": "2017-01-11T12:38:02.142646711Z",
    "Spec": {
      "ContainerSpec": {
        "Image": "emilevauge/whoami:latest",
        "Env": [
          "VIRTUAL_HOST=whoami._CLUSTER_.fn61.net"
        ]
      },
... snipped ...
    },
    "ServiceID": "7g1xbqf95ly6btrwyglymrigt",
    "Slot": 1,
    "NodeID": "7xynabmkp3r117pj9rxhln9k9",
    "Status": {
      "Timestamp": "2017-01-11T12:38:02.097532639Z",
      "State": "running",
      "Message": "started",
      "ContainerStatus": {
        "ContainerID": "a748e970d01cf55fea3550c29f769f2ccd81f018e12b002b4cfbedb9f60a1f0c",
        "PID": 27839
      }
    },
    "DesiredState": "running",
... snipped ...
  }
]

And /nodes/7xynabmkp3r117pj9rxhln9k9:

curl --unix-socket /var/run/docker.sock http:/v1.24/nodes/7xynabmkp3r117pj9rxhln9k9 | jq .
{
  "ID": "7xynabmkp3r117pj9rxhln9k9",
... snipped ...
  "Description": {
    "Hostname": "master",
    "Platform": {
      "Architecture": "x86_64",
      "OS": "linux"
    },
... snipped ...
}

Okay cool, so we managed to make the same requests as the Docker client made, with only the curl tool. HTTP is awesome! :)

So, recapping from the output of strace we reverse-engineered the logic for getting the output of $ docker service ps whoami:

Make request to /services/whoami.
Find out that the service ID is 7g1xbqf95ly6btrwyglymrigt.
Make request to /tasks?filters={"service":{"7g1xbqf95ly6btrwyglymrigt":true}} (for clarity I urldecoded the garbage that looked like this: %7B%22service%22..).
Make request to /nodes/7xynabmkp3r117pj9rxhln9k9 to find out the details of the node the task is running on.

So from above HTTP requests we found the data from the JSON responses that go into the final ps output:

$ docker service ps whoami
ID                         NAME      IMAGE                     NODE    DESIRED STATE  CURRENT STATE        ERROR
aps1k88b6rv0dcxnh6tstgjc4  whoami.1  emilevauge/whoami:latest  master  Running        Running 3 hours ago

The task ID aps1k88b6rv0dcxnh6tstgjc4 comes from response from /tasks (Task.ID).
Task name is combination of service.Name and task.Slot (whoami + 1 => whoami.1).
Image we know from /tasks (Task.Spec.ContainerSpec.Image).
Node name was from /nodes/7xynabmkp3r117pj9rxhln9k9 (Node.Description.Hostname).
Desired state was from /tasks (Task.DesiredState).
Current state was from /tasks (Task.Status.State).
Error would be from /tasks as well (would be Task.Status.State=failed, Task.Status.Err=task: non-zero exit (1)).

Closing words

strace is awesome, as it helps you understand how a program works behind the scenes (reverse engineering), without needing access to the source code. And it doesn’t matter whether the program you are inspecting is open or closed source, or written in Python, C++ or Go, because in the end almost everything happens via syscalls.

function61.com