Scikit-learn celebrates the big 1.0 with keyword arguments and online one-class SVMs

After a good 14 years of development, the team behind scikit-learn has released version 1.0 of the Python machine learning library, signaling the open-source project's stability and cleaning up the API to make it more straightforward to use.

For its first major release, the maintainers' focus was mainly on stabilisation, along with some enhancements meant to help in more complex scenarios. When preprocessing data, for instance, users can now generate spline-based features with the new SplineTransformer, a more flexible (and often more numerically stable) alternative to the pure polynomial features that were previously the only option.
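A minimal sketch of what that looks like, assuming scikit-learn 1.0 and NumPy are installed (the knot count and degree are illustrative, not recommendations):

```python
import numpy as np
from sklearn.preprocessing import SplineTransformer

# A single numeric feature with ten observations.
X = np.arange(10).reshape(-1, 1)

# Expand the feature into a B-spline basis instead of raw polynomial powers.
spline = SplineTransformer(n_knots=3, degree=3)
X_spline = spline.fit_transform(X)
print(X_spline.shape)  # one column per spline basis function
```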

The library also gained an online one-class SVM implementation based on stochastic gradient descent, whose fitting cost scales linearly with the number of training samples, making it useful when that number is large. Data scientists interested in predicting intervals instead of single points should meanwhile take a look at the newly added quantile regressor.
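A minimal sketch of both additions; the data here is synthetic and the hyperparameters (nu, quantile, alpha) are illustrative only:

```python
import numpy as np
from sklearn.linear_model import QuantileRegressor, SGDOneClassSVM

rng = np.random.RandomState(42)
X = rng.randn(500, 2)

# Online one-class SVM: fitted with stochastic gradient descent,
# so training cost grows linearly with the number of samples.
ocsvm = SGDOneClassSVM(nu=0.1, random_state=42).fit(X)
labels = ocsvm.predict(X)  # +1 for inliers, -1 for outliers

# Quantile regression: predict the 90th percentile instead of the mean.
y = 3 * X[:, 0] + rng.randn(500)
qreg = QuantileRegressor(quantile=0.9, alpha=0.0).fit(X, y)
upper = qreg.predict(X)
```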

Another new addition means all estimators now set a feature_names_in_ attribute holding the column names when fitted on a pandas DataFrame. If the feature names seen by non-fit methods such as predict are inconsistent with those seen during fit, scikit-learn will raise a warning.
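For example (the column names and values here are made up for illustration):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({"age": [25, 32, 47, 51], "income": [40, 60, 80, 100]})
y = [0.0, 0.5, 1.0, 1.5]

reg = LinearRegression().fit(df, y)
print(reg.feature_names_in_)  # ['age' 'income']

# Passing columns in a different order than during fit triggers a warning:
reg.predict(df[["income", "age"]])
```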

To make metrics easier to visualise, scikit-learn includes a plotting API that now sports additional class methods from_estimator and from_predictions on metrics.ConfusionMatrixDisplay, metrics.PrecisionRecallDisplay, and metrics.DetCurveDisplay (inspection.PartialDependenceDisplay gains from_estimator only) for creating plots based on estimators and predictions.
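A minimal sketch using the confusion matrix display (matplotlib is required for plotting; the dataset is synthetic):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import ConfusionMatrixDisplay
from sklearn.model_selection import train_test_split

X, y = make_classification(random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression().fit(X_train, y_train)

# Build the plot directly from a fitted estimator...
ConfusionMatrixDisplay.from_estimator(clf, X_test, y_test)

# ...or from precomputed predictions, e.g. when the model lives elsewhere.
ConfusionMatrixDisplay.from_predictions(y_test, clf.predict(X_test))
```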

Since the number of parameters available for many functions made scikit-learn code tricky to read (and hard to write without consulting the documentation) at times, its developers decided to deprecate positional arguments in version 0.23. Starting with v1.0, the library raises a TypeError if constructor and function parameters aren't passed by name. The release also saw the histogram-based gradient boosting models graduate from experimental status, meaning they can now be imported and used like regular models.
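A minimal sketch of both changes (the parameter values shown are arbitrary):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.svm import SVC

X, y = make_classification(random_state=0)

# Histogram-based gradient boosting no longer needs the experimental
# import (`from sklearn.experimental import enable_hist_gradient_boosting`).
clf = HistGradientBoostingClassifier(max_iter=100, random_state=0)
clf.fit(X, y)

# Most parameters are now keyword-only:
SVC(C=1.0, kernel="rbf")   # fine: passed by name
# SVC(1.0, "rbf")          # raises TypeError in 1.0
```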

Developers wary of the breaking changes major releases often bring can breathe easy, as v1.0 promises to be comparatively straightforward to upgrade to. However, changes in manifold.TSNE, manifold.Isomap, and the splitting criterion of tree.DecisionTreeClassifier and tree.DecisionTreeRegressor may lead to slightly different models than before, and the switch to keyword-only arguments might make code modifications necessary.

Long-term planners should also check out the deprecations in v1.0 so they have enough time to prepare for their removal in version 1.2. The scikit-learn team for instance worked hard to unify the handling of squared and absolute errors across criterion and loss parameters, making squared_error and absolute_error the canonical names and deprecating the old ones such as mse and mae.
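For example, the tree-based regressors now take the new criterion name; the old mse spelling still works in 1.0 but emits a deprecation warning:

```python
from sklearn.tree import DecisionTreeRegressor

# "squared_error" replaces "mse"; "absolute_error" replaces "mae".
reg = DecisionTreeRegressor(criterion="squared_error")
reg.fit([[0], [1], [2]], [0.0, 1.0, 4.0])
```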

Other deprecations include np.matrix inputs, get_feature_names in the transformer API, the cluster.Birch attributes fit_ and partial_fit_, grid_scores_ in feature_selection.RFECV, the normalize parameter of linear_model.LinearRegression, as well as utils._testing.assert_warns and utils._testing.assert_warns_message. Details are available via the project's changelog.

Scikit-learn is licensed under the BSD-3-Clause License and can be found on GitHub. The project started out as a Google Summer of Code project in 2007 and has been driven by a group of INRIA scientists since 2010, which makes it stand out among other popular machine learning projects that are mainly developed by large corporations.