The Khronos Group – a graphics, media, and parallel computation standards consortium – has declared that the OpenCL 3.0 Specification is officially available.
The update is expected to make the framework for cross-platform parallel programming easier to use and more open to new extensions. To achieve this, the OpenCL working group made all functionality beyond the 1.2 release optional. An OpenCL 2.x application that needs those features can simply query for them, while 1.2 applications will run unchanged.
The change is supposed to make OpenCL better suited for embedded use cases and open it up to more devices, as features like shared virtual memory (SVM) – something Nvidia had problems with, for example – have become optional. A reworked, unified specification describing all versions of the standard is meant to help developers understand its evolution without needing to switch between documents.
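To illustrate what querying an optional feature amounts to, here is a minimal Python sketch of the decision an application might make for SVM. It does not call an OpenCL runtime: the version string and the capability bitfield are passed in as plain values, on the assumption that they were obtained from the device beforehand. The bit values mirror the SVM capability flags in the OpenCL headers, while the helper names `parse_version`, `svm_features`, and `can_use_svm` are hypothetical.

```python
import re

# Bit values of the SVM capability bitfield, mirroring the OpenCL headers.
CL_DEVICE_SVM_COARSE_GRAIN_BUFFER = 1 << 0
CL_DEVICE_SVM_FINE_GRAIN_BUFFER = 1 << 1
CL_DEVICE_SVM_FINE_GRAIN_SYSTEM = 1 << 2
CL_DEVICE_SVM_ATOMICS = 1 << 3

_SVM_NAMES = {
    CL_DEVICE_SVM_COARSE_GRAIN_BUFFER: "coarse-grain buffer",
    CL_DEVICE_SVM_FINE_GRAIN_BUFFER: "fine-grain buffer",
    CL_DEVICE_SVM_FINE_GRAIN_SYSTEM: "fine-grain system",
    CL_DEVICE_SVM_ATOMICS: "atomics",
}


def parse_version(version_string):
    """Extract (major, minor) from a string such as 'OpenCL 3.0 <vendor info>'."""
    match = re.match(r"OpenCL (\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unexpected version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def svm_features(svm_capabilities):
    """List the names of the SVM features set in the capability bitfield."""
    return [name for bit, name in _SVM_NAMES.items() if svm_capabilities & bit]


def can_use_svm(version_string, svm_capabilities):
    """Hypothetical helper: SVM needs a 2.0+ device that reports some capability."""
    major, _minor = parse_version(version_string)
    return major >= 2 and svm_capabilities != 0
```

Under this sketch, an OpenCL 3.0 device that reports an empty bitfield is simply treated as having no SVM, so the application can fall back to ordinary buffers instead of failing.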
Language-wise, the OpenCL working group now recommends using the community-built C++ for OpenCL compiler project, which combines OpenCL C with most C++17 features and replaces the formerly used OpenCL C++ kernel language. A corresponding extension is available and also adds functionality to check which language version is supported by the device compiler.
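Version checks of this kind typically work on packed version numbers. The following Python sketch assumes the 32-bit layout used by the OpenCL 3.0 headers (10 bits major, 10 bits minor, 12 bits patch) and shows how a reported list of language versions could be tested for a required one. The helper names are hypothetical, and the list of reported versions is made up for illustration.

```python
# Bit widths of the packed version number, as assumed from the OpenCL 3.0 headers.
CL_VERSION_MAJOR_BITS = 10
CL_VERSION_MINOR_BITS = 10
CL_VERSION_PATCH_BITS = 12

_MAJOR_MASK = (1 << CL_VERSION_MAJOR_BITS) - 1
_MINOR_MASK = (1 << CL_VERSION_MINOR_BITS) - 1
_PATCH_MASK = (1 << CL_VERSION_PATCH_BITS) - 1


def make_version(major, minor, patch=0):
    """Pack major.minor.patch into one 32-bit integer."""
    return (
        ((major & _MAJOR_MASK) << (CL_VERSION_MINOR_BITS + CL_VERSION_PATCH_BITS))
        | ((minor & _MINOR_MASK) << CL_VERSION_PATCH_BITS)
        | (patch & _PATCH_MASK)
    )


def decode_version(version):
    """Unpack a 32-bit version number into (major, minor, patch)."""
    major = version >> (CL_VERSION_MINOR_BITS + CL_VERSION_PATCH_BITS)
    minor = (version >> CL_VERSION_PATCH_BITS) & _MINOR_MASK
    patch = version & _PATCH_MASK
    return major, minor, patch


def supports_language_version(reported_versions, major, minor):
    """Hypothetical helper: true if exactly major.minor is among the reported versions."""
    return any(decode_version(v)[:2] == (major, minor) for v in reported_versions)
```

The check uses exact membership rather than "greater than or equal" on purpose: since everything beyond 1.2 is optional, a device compiler that accepts 3.0 does not necessarily accept 2.0.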
OpenCL 3.0 does, however, also come with a few enhancements. For instance, it now includes a query to “return a universally unique identifier (UUID) for an OpenCL driver and device”, which helps identify hardware consistently across devices and APIs. An asynchronous direct memory access (DMA) extension enables ordered DMA transactions and is supposed to be the first in a line of additions to make OpenCL more useful in an embedded context.
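As a rough illustration of why such an identifier is useful, the Python sketch below pairs devices seen through OpenCL with devices seen through a second API (Vulkan, for example) by comparing their UUIDs. It assumes the identifiers are 16 bytes long and have already been retrieved; the device names, the sample bytes, and the helper names `format_uuid` and `match_devices` are hypothetical.

```python
import uuid

UUID_SIZE = 16  # assumed size in bytes of a device or driver UUID


def format_uuid(raw):
    """Render a 16-byte identifier in the usual 8-4-4-4-12 hex form."""
    if len(raw) != UUID_SIZE:
        raise ValueError(f"expected {UUID_SIZE} bytes, got {len(raw)}")
    return str(uuid.UUID(bytes=bytes(raw)))


def match_devices(opencl_devices, other_api_devices):
    """Pair device names from two APIs that report the same UUID.

    Both arguments map a device name to its 16-byte UUID.
    """
    by_uuid = {bytes(raw): name for name, raw in other_api_devices.items()}
    return {
        name: by_uuid[bytes(raw)]
        for name, raw in opencl_devices.items()
        if bytes(raw) in by_uuid
    }


if __name__ == "__main__":
    gpu = bytes(range(16))  # made-up identifier for illustration
    print(format_uuid(gpu))
    print(match_devices({"cl_gpu": gpu}, {"vk_gpu": gpu, "vk_other": bytes(16)}))
```

An application that shares memory or work between two APIs could use a mapping like this to make sure both sides are talking to the same physical device.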
Getting started with OpenCL isn’t the easiest of things, so its originators have decided to set up a dedicated SDK. Though still in development, it already provides a few code samples. Other components likely to land in the repository soon include the OpenCL headers and C++ bindings needed to write an application, along with some additional documentation.
In the coming months, the OpenCL working group plans to move ahead with development of extended subgroups, debugging information, external memory sharing, and interoperability with the 3D API Vulkan. Longer-term goals include machine learning primitives, recordable command buffers, device topology, and unified shared memory.
Those who can’t wait that long to get better debugging capabilities can take a look at Intel’s Intercept Layer for OpenCL Applications. The project just saw its v3.0 release, which is the first to officially support OpenCL 3.0. Improvements in the latest version include proper handling of extension APIs from multiple platforms as well as tracing for OpenCL 3.0 APIs and more vendor-specific extensions.