What’s the point: Google AI, JetBrains’ big data tools, Subversion, Docker <3 Azure

Google’s AI teams used the last days of May to share their advances in natural language generation evaluation and demonstrate how the large-scale pre-training approach currently hot in the language domain could be used in the field of computer vision. 

The idea behind the latter boils down to pre-training general features with a variety of datasets and fine-tuning the resulting model with less data for a task of interest in a second step. The exact methodology is explained in “Big Transfer (BiT): General Visual Representation Learning” with models and notebooks available via a project repository.
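For readers who want to see the two-step idea in code, the following is a toy sketch rather than BiT's actual procedure. It assumes a tiny feature extractor whose weights are hand-picked stand-ins for a pre-trained backbone, keeps that extractor frozen, and fits only a small logistic-regression head on four labelled examples. All names (`PRETRAINED_WEIGHTS`, `extract_features`, `fine_tune_head`) are hypothetical and not taken from the project repository; the real recipe is described in the paper.

```python
import math

# Hypothetical stand-in for a backbone pre-trained on large, varied data.
# The weights are hand-picked for illustration, not learned.
PRETRAINED_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def extract_features(x):
    """Frozen feature extractor: one linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in PRETRAINED_WEIGHTS]


def score(x, weights, bias):
    """Raw score (logit) of the task head on top of the frozen features."""
    features = extract_features(x)
    return sum(w * f for w, f in zip(weights, features)) + bias


def fine_tune_head(samples, labels, epochs=500, lr=0.1):
    """Fit a logistic-regression head with plain SGD; the backbone stays fixed."""
    weights = [0.0] * len(PRETRAINED_WEIGHTS)
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            features = extract_features(x)
            prob = 1.0 / (1.0 + math.exp(-score(x, weights, bias)))
            error = prob - y  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * error * f for w, f in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def predict(x, weights, bias):
    """Return class 1 if the head's score is positive, else class 0."""
    return 1 if score(x, weights, bias) > 0 else 0


# Small labelled set for the downstream task: is the first value larger?
SAMPLES = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
LABELS = [1, 1, 0, 0]

weights, bias = fine_tune_head(SAMPLES, LABELS)
print([predict(x, weights, bias) for x in SAMPLES])
```

The point of the sketch is the division of labour: the general-purpose features are reused as they are, so the task-specific step needs only a handful of labelled examples.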

Meanwhile the language team has been busy using pre-training to find a way of measuring the quality of systems generating natural language. The result is called BLEURT, a “learned evaluation metric based on BERT that can model human judgments with a few thousand possibly biased training examples”. The examples mentioned stem from public rating collections and additional user input.

BLEURT is meant to combine the advantages of human evaluation and automatic metrics while performing better than competing approaches. The first results seem to compare well enough to warrant further investigation, with the team planning to look into multilingualism and multimodality next.

JetBrains opens Big Data Tools to those who know a thing or two about data

Python and SQL users working with JetBrains’ PyCharm or DataGrip can now take the Big Data Tools for a spin. The plugin was previously meant for IntelliJ IDEA users only, but now also allows those less used to Java to work with tools such as Apache Spark, Hadoop’s HDFS, and AWS S3 from the comfort of their IDE.

Not dead yet: Subversion gets another long-term support release

Subversion, the version control system fostered by the Apache Software Foundation, has just received another update, making 1.14.0-LTS available for download. Although the tool lost some of its fan base to arch rival Git in the last couple of years, it is still very much relevant in many large enterprises. To make sure it stays that way, Subversion’s Python bindings got overhauled to work with Python 3 as well as its legacy predecessors.

Other than that, a duplication feature moved into the software along with some experimental additions like Viewspec, Shelving and Checkpointing. Viewspec allows users to save view layouts and recreate them later, while Shelving and Checkpointing let them “save, restore, and roll back snapshots of their work, without making commits to the central repository”.

Docker expands strategic Microsoft partnership

That Microsoft is quite essential for Docker’s enterprise clientele is something the company has kept repeating for a while. Now the container bedrock is expanding its partnership with Redmond to the cloud, thanks to an integration with Azure Container Instances. While getting Docker to run on Azure wasn’t too hard to begin with, the process is said to have become easier, with users being “able to log into Azure directly from the Docker CLI so you can connect to your Azure account”. 

Currently, a beta is planned for the second half of 2020. The announcement also states that users can suggest which provider should be onboarded next. AWS comes to mind, being the big player in the cloud.