A researcher found that Meta’s popular open source PyTorch framework used self-hosted runners in its GitHub repository, against best practice, and was able to exploit this to steal secrets that would enable compromise of the release code.
Security engineer John Stawinski IV described the attack in a post last week, explaining that it “resulted in the ability to upload malicious PyTorch releases to GitHub, upload releases to AWS, potentially add code to the main repository branch, backdoor PyTorch dependencies” and more.
The key weakness in this attack is the use of self-hosted runners. A runner is a virtual machine used to execute processes in GitHub Actions, widely used as part of CI/CD (continuous integration/continuous delivery) processes. Actions can be triggered when code is committed to a repository, or by a variety of other events including pull requests. A pull request might come from anyone who has forked the repository.
Most runners are hosted by GitHub, in which case they are ephemeral, fired up once for a job and then deleted. Self-hosted runners, as the name implies, are located outside GitHub and have more flexibility, since they can be customized by the organization using them. Unlike GitHub-hosted runners, a clean instance is not required for each job. Instead, the same instance may be used for many jobs.
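The distinction shows up directly in a workflow file's `runs-on` key. A minimal sketch (the workflow layout, job names, and scripts here are illustrative, not taken from the PyTorch repository):

```yaml
# .github/workflows/build.yml -- illustrative example only
name: build
on:
  pull_request:        # fires for pull requests, including ones from forks

jobs:
  test-hosted:
    runs-on: ubuntu-latest       # GitHub-hosted: fresh VM per job, deleted afterwards
    steps:
      - uses: actions/checkout@v4
      - run: ./run_tests.sh      # hypothetical test script

  test-self-hosted:
    runs-on: [self-hosted, linux]  # organization's own machine; state can
                                   # persist between jobs on the same runner
    steps:
      - uses: actions/checkout@v4
      - run: ./run_gpu_tests.sh    # a common reason for self-hosting, e.g. GPU hardware
```

The flexibility that motivates self-hosting (custom hardware, preinstalled toolchains) is exactly what makes the runner non-ephemeral: anything a job writes to the machine can still be there when the next job arrives.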
GitHub warns that “untrusted workflows running on your self-hosted runner pose significant security risks for your machine and network environment, especially if your machine persists its environment between jobs.” The company recommends that self-hosted runners be used only with private repositories.
Stawinski worked with colleague Adnan Khan to develop an attack on the PyTorch repository, which despite GitHub’s advice does use self-hosted runners. The researchers also discovered that the ability to execute workflows via pull requests was restricted to previous contributors. They submitted a trivial fix for a typo in a markdown file to gain contributor status, and then made a pull request that triggered a workflow containing malicious scripts on the self-hosted runners, granting them root access. Then, “all we had to do was wait until a non-PR workflow ran on the runner with a privileged GITHUB_TOKEN,” which they could grab for further escalation.
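The general shape of the technique is worth seeing, because the malicious step can look routine in a diff. The following is a hedged sketch of the pattern only; the step name, script, and payload are hypothetical, and the researchers' actual payload is described in their write-up, not reproduced here:

```yaml
# Illustrative only: the shape of a poisoned step in a fork's pull request.
jobs:
  lint:
    runs-on: [self-hosted, linux]
    steps:
      - uses: actions/checkout@v4
      - name: Run lint               # reads as a routine CI step in review
        run: |
          # On a non-ephemeral runner, a background process like this can
          # outlive the PR job and wait for a later, privileged workflow
          # run on the same machine.
          nohup ./innocuous-looking-script.sh &
```

Because the same runner instance later executes trusted, non-PR workflows, whatever the attacker left behind runs alongside them, with access to their credentials.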
The researchers submitted the issue to Meta’s bug bounty program; Meta reported last month that it had been mitigated.
GitHub also warns about the risks of actions triggered by pull requests, stating that “since, by definition, a PR supplies code to any build or test logic in place for your project, attackers can achieve arbitrary code execution in a workflow runner operating on a malicious PR in a variety of ways.” However, last month Khan reported on an attack on one of GitHub’s own repositories using a similar technique, demonstrating that security best practices are often not observed.
The kinds of problems developers face are illustrated by a comment on Hacker News. “Ideally the builds for external PRs should not gain any access to any secret. But it is not practical. For example, you may want to use docker containers. Then you will need to download docker images from somewhere. For example, downloading ubuntu:20.04 from docker hub. That one requires a token, otherwise your request may get throttled. Even accessing most Github services requires a token. I agree with you that these things require a lot of care. In reality most dev teams are not willing to put enough time on securing their build system. That’s why supply chain attacks are so common today.”
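One partial mitigation the comment gestures at is to keep fork PR builds on ephemeral GitHub-hosted runners with a read-only token, and accept the trade-offs (such as throttled anonymous registry pulls) that come with withholding secrets. A sketch, assuming standard GitHub Actions settings (the job layout and `make test` target are illustrative):

```yaml
# Illustrative hardening sketch for builds triggered by external PRs.
on:
  pull_request:          # fork PRs do not receive repository secrets

permissions:
  contents: read         # restrict the job's GITHUB_TOKEN to read-only

jobs:
  pr-build:
    runs-on: ubuntu-latest          # ephemeral GitHub-hosted runner
    steps:
      - uses: actions/checkout@v4
      - run: docker pull ubuntu:20.04   # unauthenticated pull; may be rate-limited,
                                        # the inconvenience the comment describes
      - run: make test
```

This does not solve the problem the commenter raises, but it confines what a malicious PR can steal: no persistent machine to camp on, and no privileged token to wait for.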
Supply chain attacks are powerful because they affect any code that pulls in the compromised dependency, even in private repositories.