Phylum report: password-stealing PyPi packages discovered, downloaded over 5,700 times

Supply chain security specialist Phylum has reported on malware in “dozens of newly published PyPi packages.”

PyPi (Python Package Index) is the default package repository for pip, the package management tool built into Python. PyPi publishes its download statistics, and Phylum reckons that the malicious packages have been downloaded over 5,700 times. The malware attempts to install W4SP, a script that exfiltrates passwords and cookies from browser sessions.

The attack began around October 12th and has been evolving as, apparently, the criminals refined their code. Phylum researchers identified around 30 packages, with innocent-sounding names such as typesutil, typestring, colorwin and pyhints.

Typically, the attackers copy an existing package with “a few slight modifications in an effort to make the text consistent with the phony package name it was published under,” the researchers said. Typesutil, for example, is based on the genuine package datetime2. Some of the packages exploit “typosquatting”, where a developer mistypes a popular package name and inadvertently installs malware. For example, a package called twyne is a possible mistyping of the genuine package called twine. The real twine is downloaded nearly 150,000 times a day, according to stats, so even a small hit rate can be effective.
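
One way to catch the typosquatting case is to compare a requested package name against a list of names the team actually intends to use. The following is a minimal sketch, not a method from the Phylum report: the allowlist, the function name possible_typosquat and the 0.75 similarity cutoff are all assumptions made for illustration, and it uses only Python's standard difflib module.

```python
import difflib

# Hypothetical allowlist of packages a team actually intends to install.
KNOWN_PACKAGES = ["twine", "requests", "numpy", "datetime2"]


def possible_typosquat(name, known=KNOWN_PACKAGES, cutoff=0.75):
    """Return the known package that `name` closely resembles, or None.

    An exact match is treated as fine; only a near miss is flagged.
    The cutoff is an assumed value and would need tuning in practice.
    """
    name = name.lower()
    if name in known:
        return None
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for candidate in ["twine", "twyne"]:
        lookalike = possible_typosquat(candidate)
        if lookalike:
            print(f"{candidate}: looks like a mistyping of {lookalike}")
        else:
            print(f"{candidate}: no near match found")
```

A check like this only flags near misses of names on the list. A package such as typesutil, which copies an unrelated project rather than imitating a popular name, would pass unnoticed.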

A developer PC is an attractive target because it can result in malicious packages propagating to other developer PCs or even to production software, and developers are also likely to have valuable credentials used for software testing and deployment.

The attackers, according to Phylum’s analysis, made considerable efforts to hide their work. Techniques include putting malicious code far to the right, so that it is invisible unless the coder scrolls horizontally or turns on word wrap in their editor. Malicious code is also obfuscated, compressed, or disguised by Base64 encoding. The code in the package itself is just the starting point, since it creates temporary files and downloads additional code in order to deliver its payload.
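
Two of these hiding techniques, code pushed far to the right and long Base64 strings, leave traces that a simple scan can surface for manual review. The sketch below is illustrative only and is not Phylum's tooling: the function name suspicious_lines and the thresholds (200 columns, a gap of 40 spaces, 80 Base64-like characters) are assumptions, and a match is a reason to look more closely, not proof of malice.

```python
import re

# Assumed thresholds; tune them for the code being reviewed.
MAX_COLUMN = 200       # lines longer than this likely need horizontal scrolling
MIN_BASE64_LEN = 80    # long unbroken Base64-like runs deserve a second look

# Non-whitespace, then a long run of spaces or tabs, then more non-whitespace.
WHITESPACE_GAP = re.compile(r"\S[ \t]{40,}\S")
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{%d,}={0,2}" % MIN_BASE64_LEN)


def suspicious_lines(source):
    """Return a list of (line_number, reason) pairs worth manual review."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        if WHITESPACE_GAP.search(line):
            findings.append((number, "code after a long run of whitespace"))
        if len(line) > MAX_COLUMN:
            findings.append((number, "line extends past column %d" % MAX_COLUMN))
        if BASE64_RUN.search(line):
            findings.append((number, "long Base64-like string"))
    return findings


if __name__ == "__main__":
    # Harmless stand-in for a line with a statement hidden far to the right.
    sample = "import os" + " " * 300 + "; print('hidden')\nprint('hi')\n"
    for number, reason in suspicious_lines(sample):
        print(f"line {number}: {reason}")
```

Because the package code is only the starting point and further code is downloaded at run time, a static scan of this kind can at best point a reviewer at the first stage.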

The researchers also found bugs, concluding at one point that “we couldn’t get it to do anything other than produce syntax errors and tacked it to our wall of ‘malware that doesn’t work’.” Unfortunately the criminals continued to evolve the code and may still be posting new malware to PyPi.

This kind of analysis makes it clear that developers face real risk if they install packages from public package repositories without taking extreme care to avoid typos.

Awareness of the risks around software supply chain security has grown in recent years, and steps are being taken to mitigate them. Late last month, the OpenSSF (Open Source Security Foundation) announced general availability of Sigstore, a set of tools that make it easier to sign and verify software packages. Steps like package signing and mandating two-factor authentication (2FA) for package maintainers help. PyPi is in the process of requiring 2FA for all projects in the top 1% of downloads over a 6-month period, and has given free hardware security keys to eligible maintainers.

PyPi maintainer Dustin Ingram spoke about PyPi security plans earlier this year.

Another approach is to use smaller, curated package repositories. Google’s Assured Open Source Software, a paid service currently in preview and part of its Software Delivery Shield, contains packages built and pre-checked by Google. The downside here, cost aside, is that it may not contain a package that is needed, though there is provision for developers to download direct from PyPi as a secondary option.

Safeguarding the most popular packages is a valuable step, but something more comprehensive is necessary, and reports like this demonstrate that the problem is not solved yet.