
Graphite has introduced Diamond, a code review agent based on its existing Graphite Reviewer, but also insists that “AI will never replace human code review.”
The biz, which specialises in code review tools, was founded in 2020 by Tomas Reimers (ex-Facebook), Greg Foster (ex-Google) and Meril Lutsky (ex-Posmetrics, a market survey firm).
The core problem Graphite tackles relates to large pull requests (PRs) that are too big to review properly. The proposed solution is stacking: breaking up code branches into multiple pull requests that are reviewed separately. Stacking can introduce tricky problems with merging upstream changes into current work, which Graphite (among other tools) addresses with its Graphite CLI (command line interface) and an extension for Visual Studio Code (VS Code).
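To make the upstream-merge problem concrete, here is a minimal Python sketch of a stack of branches, each cut from the one below it. It is an illustration only and does not reflect how the Graphite CLI is implemented; the `Branch` class, the `stale_branches` function, the branch names and the commit IDs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    """One branch in a stack (hypothetical model, not Graphite's)."""
    name: str
    parent: str    # branch this one was cut from; its PR targets that branch
    based_on: str  # the parent's tip commit when this branch was last rebased


def stale_branches(stack: list[Branch], tips: dict[str, str]) -> list[str]:
    """Return the branches that need rebasing, ordered bottom to top.

    `stack` must list branches from the bottom of the stack upwards.
    A branch is stale if its parent's tip has moved since the branch was
    last rebased, or if the parent is itself stale (rebasing the parent
    will give it a new tip).
    """
    stale: list[str] = []
    for branch in stack:
        parent_moved = tips[branch.parent] != branch.based_on
        if parent_moved or branch.parent in stale:
            stale.append(branch.name)
    return stale


# Three small PRs instead of one large one: api -> ui -> tests.
stack = [
    Branch("api", parent="main", based_on="m1"),
    Branch("ui", parent="api", based_on="a1"),
    Branch("tests", parent="ui", based_on="u1"),
]

# Someone else merges to main, moving its tip from m1 to m2.
tips = {"main": "m2", "api": "a1", "ui": "u1", "tests": "t1"}
print(stale_branches(stack, tips))
```

In this model a single new commit on `main` leaves every branch in the stack needing a rebase, in order, which is the kind of bookkeeping that stacking tools aim to automate.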
Code review is an obvious target for AI assistance and is now a feature of many tools, including GitHub Copilot, Amazon Q Developer and Google Gemini. Without AI, Graphite would fall behind, so in October last year the company introduced Graphite Reviewer, the idea being to allow developers to catch issues with their PRs before submitting them to other team members for review.
Graphite has now made Graphite Reviewer into a standalone product, called Diamond, as well as announcing further venture capital funding. Diamond is designed to detect bugs, style inconsistencies, security vulnerabilities, performance issues, documentation issues, and code that may have been committed by accident (such as a function hard-coded to return true for convenience when writing other code). Other features include customizable rules and context awareness based on the entire codebase of an application. Diamond will also suggest fixes.
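As an illustration of the accidentally committed code mentioned above, the Python sketch below shows a placeholder function hard-coded to return true, plus a deliberately naive check that flags it. This is not how Diamond works; it is a toy rule built on Python's standard `ast` module, and `SOURCE`, `is_authorized`, `is_adult` and `constant_true_functions` are hypothetical names invented for the example.

```python
import ast

# Hypothetical code under review: the first function is a leftover stub.
SOURCE = '''
def is_authorized(user):
    return True  # temporary shortcut while building the UI

def is_adult(age):
    return age >= 18
'''


def constant_true_functions(source: str) -> list[str]:
    """Return names of functions whose whole body is `return True`.

    A leading docstring is ignored. This is a toy heuristic: it knows
    nothing about intent, so it will also flag legitimate functions
    that are meant to always return True.
    """
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        body = node.body
        # Skip a docstring if it is the first statement.
        first = body[0]
        if (isinstance(first, ast.Expr)
                and isinstance(first.value, ast.Constant)
                and isinstance(first.value.value, str)):
            body = body[1:]
        if (len(body) == 1
                and isinstance(body[0], ast.Return)
                and isinstance(body[0].value, ast.Constant)
                and body[0].value.value is True):
            flagged.append(node.name)
    return flagged


print(constant_true_functions(SOURCE))
```

A fixed rule like this cannot tell a forgotten stub from a function that is genuinely meant to return true, which is the sort of judgment that calls for context about the rest of the codebase.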
Limitations of Diamond include that it currently only integrates with GitHub organizations, and that IDE integration is only for VS Code.
Despite Graphite’s lurch towards AI, Foster says that “AI can’t fully replace human code review … I don’t ever see them becoming a stand-in for an actual human engineer signing off on a pull request.”
Foster describes the importance of context, by which he means not only the code, but also the business context. “[AI] probably doesn’t know how your product roadmap shifted after a big meeting with the customer,” he writes. Other aspects include personal factors within a team, such as a bias for or against particular coding practices, which might cause sub-optimal code. “Real code review demands domain expertise.”
Another issue he highlights is that AI is not accountable. If there is a vulnerability that results in a security incident, the AI cannot be held responsible, especially given the caveat attached to all AI assistance that it is sometimes wrong.
Anthropic CEO Dario Amodei predicted recently that AI will be writing 90 percent of all code within three to six months, but even if this proves to be the case, Foster says “if we’re shipping code that’s never actually read or understood by a fellow human, we’re running a huge risk.”
Foster’s words will resonate with developers fearful that AI-generated code which nobody fully understands may make it into production with errors that cause unforeseen consequences.
The puzzle is how businesses can strike a balance between taking advantage of the productivity of AI-powered tools such as Diamond and ensuring that skilled humans are never taken out of the loop. That problem may become more severe if developers let AI write code for them to the extent that their own skills are undermined.