AWS previews CodeWhisperer trained on internal code repositories


AWS has previewed a new capability for its AI-driven coding assistant, CodeWhisperer, enabling it to be trained on internal code repositories which are kept private from other customers.

Coding assistants like CodeWhisperer or GitHub’s Copilot can in some circumstances output code suggestions that are close copies of the code on which their AI has been trained, though vendors have made efforts to mitigate the issue. Training the AI on private code repositories means that developers can benefit from AI that better understands the organization’s internal services and coding practices, while also removing the possibility of this code being leaked externally. “Each customization is isolated from other customers, and none of the customizations built with this new capability will be used to train the foundation model underlying CodeWhisperer,” the company states.

According to AWS Developer Advocate Donnie Prakoso, “With this feature, developers who are part of Amazon CodeWhisperer Professional tier can now receive real-time code recommendations that include their internal libraries, APIs, packages, classes, and methods.”

Administrators configure customizations by connecting them to one or more code repositories in GitHub, GitLab or Atlassian Bitbucket, or by manually uploading code to Amazon S3 (Simple Storage Service). The code will be processed in the AWS US East 1 region during the preview. Once training is complete, the customization can be made available to selected team members. Developers can then configure CodeWhisperer in their IDE to connect to that customization.

Supported IDEs are Visual Studio Code, IntelliJ JetBrains, Visual Studio, and AWS Cloud9. Supported programming languages include Python, Java, JavaScript, TypeScript, and C#. Pricing has not yet been announced, but there is no extra cost during the preview.

Microsoft-owned GitHub does not yet have this feature for Copilot, though it is a common request, and in January a team member stated that “we don’t have this capability today, but we’re actively planning/working on it.” Microsoft also has a feature marked “work in progress” that would let Copilot examine the entire current repository, whereas currently “GitHub Copilot only knows about the contents of your current file and possibly a few other open tabs, rendering it blind to important type definitions, patterns and greater connections in your codebase.”

Despite concerns about the reliability and licensing of AI-generated code, the momentum behind the trend remains strong. According to recent GitHub research, “85% of developers felt more confident in their code quality when authoring code with GitHub Copilot and Copilot Chat.”

How does CodeWhisperer compare to Copilot? Unlike Copilot, the AWS service is free for individuals, which is an advantage for trialling the service. CodeWhisperer is trained on AWS internal code, which makes it suitable for use with AWS services. Copilot is trained on public repositories, making it good for general purposes. Copilot also has chat features, which CodeWhisperer lacks. Training on private repositories, though, is a strong feature, particularly for developers embedded in the AWS ecosystem.