ONNX runtime tries hand at WinML, changes compatibility pattern


Microsoft has updated ONNX Runtime, its inference engine for Open Neural Network Exchange (ONNX) models, to v1.2, fitting the tool with WinML API support, featurizer operators, and changes to the forward-compatibility pattern.

The latter consists of an added check of a model’s opset number and IR version, which should “guarantee correctness of model prediction and remove behavior ambiguity due to missing opset information”. Since the runtime won’t support models that use an opset higher than the one implemented in that specific version, users might have to look into custom operators should they need an operator from a newer opset.
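To make the idea of such a check more concrete, here is a minimal sketch in plain Python. It is not the runtime's actual implementation: the ModelInfo class, the check_compatibility function and the version limits are all hypothetical and only illustrate how a loader might compare a model's declared IR version and opsets against what it implements.

```python
from dataclasses import dataclass, field


@dataclass
class ModelInfo:
    """Hypothetical stand-in for the metadata a runtime reads from a model file."""
    ir_version: int
    opset_imports: dict = field(default_factory=dict)  # operator domain -> opset version


def check_compatibility(model, max_ir_version, max_opsets):
    """Return a list of problems; an empty list means the model passes the check."""
    problems = []
    if not model.opset_imports:
        problems.append("model declares no opset; behaviour would be ambiguous")
    if model.ir_version > max_ir_version:
        problems.append(f"IR version {model.ir_version} > supported {max_ir_version}")
    for domain, version in model.opset_imports.items():
        supported = max_opsets.get(domain)
        if supported is None:
            problems.append(f"unknown operator domain {domain!r}")
        elif version > supported:
            problems.append(
                f"opset {version} for domain {domain!r} > supported {supported}"
            )
    return problems


# Illustrative limits only, not the values shipped in any particular release.
RUNTIME_MAX_IR = 6
RUNTIME_MAX_OPSETS = {"ai.onnx": 11}

# A model exported with a newer opset than this imaginary runtime implements.
newer = ModelInfo(ir_version=6, opset_imports={"ai.onnx": 12})
for problem in check_compatibility(newer, RUNTIME_MAX_IR, RUNTIME_MAX_OPSETS):
    print(problem)
```

Collecting every problem rather than stopping at the first one is a design choice of this sketch; it lets a user see at once whether both the IR version and the opset need attention.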

The ONNX RT team also made some additions to various APIs. In the SessionOptions API, for example, the default maximum number of graph transformation steps has been increased to 10, and the graph optimisation level is set to ORT_ENABLE_ALL (99) out of the box. The C API now includes a slew of new functions such as GetDenotationFromTypeInfo, GetMapValueType, SessionEndProfiling, ModelMetadataGetProducerName, and ReleaseModelMetadata. The Java API is now available on Android, though Gradle must be available when building onnxruntime.
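Users who would rather not rely on the default can set the optimisation level themselves. The following sketch uses the onnxruntime Python package and assumes it is installed; the make_session_options helper and the "model.onnx" path are placeholders invented for illustration, and the import is guarded only so the snippet stays loadable on machines without the package.

```python
try:
    import onnxruntime as ort
except ImportError:  # keep the snippet loadable where the package is absent
    ort = None


def make_session_options(level_name="ORT_ENABLE_ALL"):
    """Return SessionOptions with an explicitly chosen graph optimisation level.

    level_name is one of ORT_DISABLE_ALL, ORT_ENABLE_BASIC,
    ORT_ENABLE_EXTENDED or ORT_ENABLE_ALL.
    """
    if ort is None:
        raise RuntimeError("onnxruntime is not installed")
    opts = ort.SessionOptions()
    opts.graph_optimization_level = getattr(ort.GraphOptimizationLevel, level_name)
    return opts


if ort is not None:
    # Opt out of the full optimisation set and apply only the basic rewrites.
    basic_opts = make_session_options("ORT_ENABLE_BASIC")
    # "model.onnx" is a placeholder; point this at a real model to create a session:
    # session = ort.InferenceSession("model.onnx", basic_opts)
```

Pinning the level explicitly can be useful when comparing results or start-up times across runtime versions, since a changed default would otherwise alter behaviour silently.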

To give users something to experiment with until the next release, version 1.2 also comes with a couple of experimental features the ONNX Runtime team would appreciate feedback on. These include featurizer operators, which are meant as an expansion of the Contrib operators, and a preview of “Windows Machine Learning (WinML) APIs in Windows builds of ONNX Runtime, with DirectML for GPU acceleration”.

The WinML API is a WinRT API designed for Windows devs. It is compatible with Windows 8.1 for CPU usage and Windows 10 1709 for GPU usage. A getting started guide can be found in the project’s documentation, with source code available on GitHub and pre-built packages on NuGet.

Speaking of NuGet, the associated package structure got an update: there is now a Microsoft.ML.OnnxRuntime.Managed assembly, which is shared between the CPU and GPU packages. This release also adds the capability to generate onnxruntime Android Archive files from source, so that they can be easily imported into Android Studio.

More details about the release, which include information about component updates such as the TensorRT Execution Provider and CUDA, can be found in the ONNX RT release notes.

ONNX Runtime was open sourced in 2018 in an effort to “drive product innovation in AI”. Microsoft describes the project as a way to “accelerate machine learning inferencing across all of your deployment targets using a single set of APIs”. It can be used to parse models from the supported projects, which range from TensorFlow to PyTorch and scikit-learn, trimming and consolidating nodes for optimisation purposes and taking advantage of available hardware acceleration along the way.