AWS CodeWhisperer enters preview stage, early testers encounter snags

AWS has commenced the invitation-only preview of CodeWhisperer, its AI-powered automatic coder, but early testers have hit problems including the mysterious appearance of other people’s S3 buckets in their code, and questions about ownership of the generated code.

CodeWhisperer is similar to GitHub’s Copilot, in that it uses machine learning (ML) models trained on existing code, which the FAQ describes as “including Amazon and open-source code.” Developers install it as part of the AWS Toolkit extension for either JetBrains IDEs such as PyCharm, or the open source Visual Studio Code. It is free during the preview.

CodeWhisperer generates simplistic Python code to test for a prime number. The grey text means it is a suggestion, and pressing Tab will insert it into the editor.
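For readers without access to the preview, a simplistic prime test of this general kind might look like the sketch below. It is an illustration only, not the suggestion CodeWhisperer actually produced; the function name `is_prime` and the naive trial-division approach are assumptions.

```python
def is_prime(n: int) -> bool:
    """Return True if n is prime, using naive trial division.

    Illustrative sketch only: this is not the code CodeWhisperer
    generated, and the function name is an assumption.
    """
    # Numbers below 2 are not prime by definition.
    if n < 2:
        return False
    # Try every candidate divisor from 2 up to n - 1.
    for divisor in range(2, n):
        if n % divisor == 0:
            return False
    return True
```

Checking every divisor up to n - 1 is slow for large inputs, which is why code of this sort is fairly called simplistic; stopping at the square root of n would be the usual first improvement.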

CodeWhisperer is easy to use. Suggestions may pop up automatically, or be requested by pressing Alt-C or, on a Mac, Option-C. Suggestions are accepted using Tab, and coders can cycle through multiple suggestions, if present, using the arrow keys. Additional features include a security scan and a code reference panel.

A security issue detected by CodeWhisperer

The code reference panel is an intriguing detail, since it allows developers to identify the source of some code suggestions. The implication is that CodeWhisperer will on occasion do something close to copy and paste from a project used to train it, in which case it will tell you where to find the original.

Who owns such code, though? The FAQ states that, regarding code generated by CodeWhisperer, “developers own the code and are responsible for it.” However, the terms and conditions for the preview seem to take a different view, stating that “The CodeWhisperer Beta Service may produce computational results that are attributed to AWS or our affiliates and we retain the underlying intellectual property rights in that content.” These terms are visible in this official AWS video.

A developer on Twitter with access to the preview was surprised to find that: “While playing around I got a suggestion including an existing S3 bucket … should we start worrying about security and PII [personally identifiable information]?” The screenshot showed code to post data to a specific S3 bucket, which we were told was not familiar to the developer, though it did not include any secrets that would allow access. It is, however, another example of artificial intelligence looking a lot like copy and paste.

Supported languages are Java, Python and JavaScript, but not TypeScript, to the disappointment of one previewer. Another said that it is a “great tool to be able to get a running start in AWS.” That is consistent with the idea that training on AWS internal code makes it likely that CodeWhisperer will suggest a working utility function for an AWS service, such as, in this example, code to detect labels in images using the Rekognition service.
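To give a sense of what such a utility function involves, here is a minimal sketch built around the Rekognition `detect_labels` call as exposed by the boto3 SDK. It is not the code from the previewer's example; the helper name `detect_image_labels`, its default values, and the bucket and object names are all hypothetical.

```python
def detect_image_labels(client, bucket, key, max_labels=10, min_confidence=75.0):
    """Return (label name, confidence) pairs for an image stored in S3.

    `client` is expected to be a Rekognition client, for example one
    created with boto3.client("rekognition"). The helper name and the
    default values are assumptions made for illustration.
    """
    response = client.detect_labels(
        # Point Rekognition at an object that already sits in S3.
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    # Each entry in "Labels" carries a name and a confidence score.
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]
```

With boto3 installed and AWS credentials configured, a call such as `detect_image_labels(boto3.client("rekognition"), "example-bucket", "photo.jpg")` would return the labels found in that image. Passing the client in as a parameter keeps the function easy to exercise without touching a real AWS account.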

There is a big difference, though, between a handy way to get snippets of code for AWS services and a generic AI that understands a developer’s intent and generates working code accordingly.

We have asked AWS to comment on these issues and will update this post when we receive a response.