The team behind resource scheduler YuniKorn just celebrated their first release as an Apache Incubator project, fitting the tool with dynamic queue management and a framework for better integration with Kubernetes operators.
Version 0.8 lets users set placement rules to delegate queue management, so that, for instance, Kubernetes namespaces can be mapped to YuniKorn queues without the queues having to be created beforehand.
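To make the placement idea more concrete, here is a minimal sketch of a rule that derives a queue name from a namespace and creates the queue on first use. It is an illustration only: `QueueRegistry`, `place` and the `root.` prefix are hypothetical names chosen for this example, not YuniKorn's actual configuration format or API.

```python
class QueueRegistry:
    """Toy stand-in for a scheduler's queue hierarchy (hypothetical, not YuniKorn's API)."""

    def __init__(self):
        self.queues = set()

    def place(self, namespace, parent="root"):
        # Placement rule: derive the queue name from the application's namespace.
        name = f"{parent}.{namespace}"
        # Dynamic queue management: create the queue the first time it is needed.
        if name not in self.queues:
            self.queues.add(name)
        return name


registry = QueueRegistry()
queue_a = registry.place("team-a")  # queue is created on demand
queue_b = registry.place("team-a")  # second call reuses the existing queue
```

The point of the sketch is that nobody had to declare a queue for `team-a` up front; the rule both names and creates it.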
Another new addition is node-sorting policies. Ops folks can use these to gain better control over how allocations are distributed across nodes, and if the built-in FAIR and BinPacking policies aren’t to their taste, custom ones can be created and plugged into the scheduler.
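The difference between the two built-in policies can be sketched roughly as follows, under the simplifying assumption that FAIR prefers the node with the most free capacity (spreading load) while BinPacking prefers the node with the least (filling nodes up). The registry and function names below are hypothetical and only mimic the idea of a pluggable policy; they are not YuniKorn's interfaces.

```python
from typing import Callable, Dict, List

# Each node is described by the share of its resources that is still free (0.0 to 1.0).
Nodes = Dict[str, float]
Policy = Callable[[Nodes], List[str]]


def fair(nodes: Nodes) -> List[str]:
    # Spread allocations: try the emptiest node first.
    return sorted(nodes, key=lambda name: nodes[name], reverse=True)


def bin_packing(nodes: Nodes) -> List[str]:
    # Pack allocations: try the fullest node first.
    return sorted(nodes, key=lambda name: nodes[name])


# Hypothetical plug-in registry for sorting policies.
POLICIES: Dict[str, Policy] = {"fair": fair, "binpacking": bin_packing}


def register_policy(name: str, policy: Policy) -> None:
    POLICIES[name] = policy


def sort_nodes(policy_name: str, nodes: Nodes) -> List[str]:
    return POLICIES[policy_name](nodes)


nodes = {"node-1": 0.2, "node-2": 0.9, "node-3": 0.5}

# A custom policy plugged in next to the built-in ones: plain alphabetical order.
register_policy("alphabetical", lambda candidates: sorted(candidates))
```

With these sample nodes, `sort_nodes("fair", nodes)` puts `node-2` first, while `sort_nodes("binpacking", nodes)` starts with `node-1`.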
Since batch workloads were one of the main concerns when setting up YuniKorn, the project team also worked on improving scheduling performance in such scenarios. However, large resource requests that are not part of a batch should still be able to succeed, which is why nodes can now be reserved for specific application requests, so that they are no longer available to other apps.
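A reservation of this kind could look something like the toy model below, where a reserved node turns away every application except the one holding the reservation. The `Node` class, its methods and the single numeric `free` value are assumptions made for illustration and do not reflect how YuniKorn implements reservations internally.

```python
class Node:
    """Toy node with a single resource dimension (hypothetical model)."""

    def __init__(self, name, free):
        self.name = name
        self.free = free
        self.reserved_for = None  # application currently holding the reservation

    def reserve(self, app_id):
        # Only one application can hold the reservation at a time.
        if self.reserved_for is None:
            self.reserved_for = app_id
        return self.reserved_for == app_id

    def try_allocate(self, app_id, amount):
        # A reserved node is unavailable to every other application.
        if self.reserved_for not in (None, app_id):
            return False
        if amount > self.free:
            return False
        self.free -= amount
        # The reservation has served its purpose once the request is placed.
        if self.reserved_for == app_id:
            self.reserved_for = None
        return True


node = Node("node-1", free=4)
node.reserve("large-app")                    # large request does not fit yet
blocked = node.try_allocate("batch-app", 2)  # small batch request is turned away
node.free = 8                                # running work finishes, capacity frees up
granted = node.try_allocate("large-app", 8)  # the reserved request finally fits
```

Without the reservation, the stream of small batch requests would keep eating the freed capacity and the large request might never be placed.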
YuniKorn is built to play nicely with other big data tools; until now, however, integration with certain resources could be a bit tricky. To facilitate better cooperation with Apache Spark, Flink and the like, the project now includes a pluggable app management framework. It allows the use of Kubernetes operators and provides the scheduler with additional information through custom resource definitions.
Apart from that, the YuniKorn project fixed 60 issues, a list of which can be found in its JIRA.
YuniKorn was introduced to the public by big data company Cloudera in 2019. Its developers built the new project to address the perceived lack of a tool that could handle both stateless batch processing and long-running stateful services. The team tackled the issue by coming up with a common scheduler interface that decouples the scheduler from underlying resource management platforms.
The approach seemed interesting enough to make YuniKorn worth admitting into the Apache Incubator in January 2020. The new home appears to bode well for the stand-alone scheduler, since the 0.8 release notes list new supporters from companies such as Microsoft, Apple, and Nvidia.
And support is needed to get the word out there, given that Kubernetes and YARN both come with their own schedulers and users are more tempted to just go with those for the sake of simplicity. Better performance for some scenarios alone probably won’t be enough to win regular users over, but we’ll see what the next updates will bring to the table for more specialist use cases.