Nothing to hide? Then add these to your ML repo, Papers with Code says

In a bid to make advances in machine learning more reproducible, ML resource and Facebook AI Research (FAIR) appendage Papers with Code has introduced a code completeness checklist for machine learning papers.

It is based on the best practices the Papers with Code team has seen in popular research repositories, on the Machine Learning Reproducibility Checklist that FAIR Managing Director Joelle Pineau introduced in 2019, and on additional work Pineau and other researchers have done since then.

Papers with Code was started in 2018 as a hub for newly published machine learning papers that come with source code, offering researchers an easy-to-monitor platform for keeping up with the current state of the art. In late 2019 it became part of FAIR “to further accelerate our growth”, as founders Robert Stojnic and Ross Taylor put it at the time.

As part of FAIR, the project will get a bit of a visibility push, since the new checklist will also be used in the submission process for the 2020 edition of NeurIPS, the popular conference on neural information processing systems.

The ML code completeness checklist is used to “assess code repositories based on the scripts and artefacts that have been provided within it” to enhance reproducibility and “enable others to more easily build upon published work”. It checks for five elements: a specification of dependencies, so that those looking to replicate a paper’s results have some idea of what is needed to succeed; training scripts; evaluation scripts; pre-trained models; and results.
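
To make the five items a little more concrete, here is a minimal sketch of how a repository could be scanned for them. It is not Papers with Code’s official tooling, and the file names it looks for (requirements.txt, train.py, evaluate.py and so on) are merely common conventions assumed for illustration; the real checklist is applied by looking at the repository and its README.

```python
# A rough sketch, not the official Papers with Code tooling: check a local
# repository for files that commonly satisfy the five checklist items.
# The candidate file names below are assumptions for illustration only.
from pathlib import Path

CHECKS = {
    "dependencies": ["requirements.txt", "environment.yml", "setup.py", "Dockerfile"],
    "training script": ["train.py", "train.sh"],
    "evaluation script": ["evaluate.py", "eval.py", "eval.sh"],
    "pre-trained models": ["checkpoints", "MODELS.md"],   # or download links in the README
    "results": ["RESULTS.md"],                             # or a results table in the README
}

def checklist_score(repo: Path) -> int:
    """Count how many of the five checklist items a repository appears to provide."""
    score = 0
    for item, candidates in CHECKS.items():
        found = any((repo / name).exists() for name in candidates)
        print(f"[{'x' if found else ' '}] {item}")
        score += found  # bool counts as 0 or 1
    return score

if __name__ == "__main__":
    print("score:", checklist_score(Path(".")), "/ 5")
```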

While all of these seem like useful things to have, Papers with Code also took a somewhat more scientific approach to check that they really are indicators of a useful repository: the team looked for a correlation between the number of fulfilled checklist items and the number of GitHub stars a repository had collected.

Their analysis showed that repositories ticking all the boxes received more stars, which the team reads as implying that the “checklist score is indicative of higher quality submissions” and should encourage researchers to comply in order to produce useful resources. However, they also admitted that marketing and the state of a repository’s documentation might play into its popularity.
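
The kind of comparison described above can be pictured with a small sketch: group repositories by the number of fulfilled checklist items and compare the median star count per group. The figures below are made-up placeholders, not Papers with Code’s actual measurements.

```python
# Illustrative sketch only: compare median GitHub stars per checklist score.
# The (score, stars) pairs are invented data, not real measurements.
from collections import defaultdict
from statistics import median

repos = [(0, 3), (1, 5), (2, 12), (3, 25), (3, 40), (4, 80), (5, 150), (5, 310)]

stars_by_score = defaultdict(list)
for score, stars in repos:
    stars_by_score[score].append(stars)

for score in sorted(stars_by_score):
    print(f"{score} items fulfilled -> median stars: {median(stars_by_score[score])}")
```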

They nevertheless went on to recommend laying out the five elements mentioned and linking to external resources, which is always a good idea. Additional tips for publishing research code can be found in the project’s GitHub repository and in the report on the NeurIPS reproducibility program.