Netflix puts its own spin on notebooks and makes it Scala

AI/ML

By Julia Schmidt

October 24, 2019

Netflix’s Personalization Infrastructure team has open sourced its experimental Polynote notebook in a bid to bring reproducibility and polyglotism to machine learning researchers and data scientists.

Polynote was developed to let the company’s machine learning researchers concentrate on their work. It aims to remove the frustrations they had with the limited code-editing tooling in notebooks, and to give them a way to use Scala (and Spark), SQL, and Python in the same environment.

Unlike other notebook environments such as Jupyter, Polynote isn’t built on a read-eval-print loop (REPL). According to the project’s creators, this should help reduce surprises in notebook results. In the REPL approach, an expression and its result are immutable once evaluated, and the result is added to a global state that the next expression can use. Notebook users, however, can execute cells in any order. Each execution changes the state that other cells see, and since that state isn’t really visible, results can be difficult to reproduce.

This is why the team decided to let Polynote construct the input state for a given cell based on the cells that have run above, making the position of a cell important and helping to make the result more predictable.
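The difference can be illustrated with a toy model in plain Python. This is only a sketch of the idea and not Polynote's implementation: the `cells` list and the functions `run_repl_style` and `run_positional` are hypothetical names invented for this example, and the positional variant simply replays the cells above the target, which is a simplification of building an input state from cells that have already run.

```python
# Toy model only: cells are plain Python source strings, not real notebook cells.
cells = [
    "x = 1",      # cell 0
    "y = x + 1",  # cell 1
    "x = 10",     # cell 2
]


def run_repl_style(cells, order):
    """Execute cells in an arbitrary order against one shared global state."""
    state = {}
    for index in order:
        exec(cells[index], state)  # every cell reads and mutates the same dict
    return state


def run_positional(cells, target):
    """Run cell `target` with an input state built only from the cells above it."""
    state = {}
    for source in cells[: target + 1]:
        exec(source, state)  # cells below `target` never influence the result
    return state


# Top-to-bottom execution and a re-run of cell 1 after cell 2 disagree on y.
in_order = run_repl_style(cells, [0, 1, 2])
re_run = run_repl_style(cells, [0, 1, 2, 1])
print(in_order["y"], re_run["y"])

# With position-based state, cell 1 only ever sees x from cell 0.
print(run_positional(cells, 1)["y"])
```

In the shared-state variant, the value of `y` depends on the execution history. In the positional variant, it depends only on where the cell sits in the notebook.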

Reproducibility is also meant to benefit from the level of insight Polynote offers. The UI includes a number of helpers that show which cell is currently running, which statement it is executing, and which other jobs are active at any given time. It also displays the status of the kernel (idle and connected, busy, disconnected, or not started) and the values resulting from a cell’s execution.

Dependencies for each notebook can be set in a configuration section and are stored directly within the notebook. Exploring and visualising data is facilitated by the inclusion of open source libraries Vega and matplotlib, and features like a plot constructor, a data schema view, or a table inspector.

To make working with the new environment easier, Polynote provides interactive code completion, error highlighting, and a rich text editor for text cells that also handles LaTeX equations. It also allows users to write each cell in a different language and share variables between them, which helps if, for example, a data set is generated in one language and the computations are done in another.

For now, the project works with Scala, Python, and SQL, since those are the languages Netflix’s researchers use most. More might follow if the open source community sees the need to add any.

More details can be found in the introductory blog post. Polynote itself is available on GitHub and licensed under the Apache License 2.0.
