GitHub engineers band together on Elasticsearch library project, Vulcanizer bounces out

GitHub engineers have open-sourced a library for operating Elasticsearch that grew out of a frustrated effort to build a packaged chat app to administer their clusters.

Nick Canzoneri and Jess Breckenridge said GitHub used Elasticsearch to underpin its search services, and administered clusters using ChatOps via Hubot, and that “As of 2017, those commands were a collection of Bash and Ruby-based scripts.”

The scripts lacked composability and reusability, they continue, and it was “difficult to contribute back to the community by open sourcing any of these scripts due to the fact they are specific to bespoke GitHub infrastructure.”

So, they resolved to create “a high-level API that corresponded to the common operations we took on a [Elasticsearch] cluster, such as disabling allocation or draining the shards from a node. Our goal was a library that focused on these administrative operations and that our existing tooling could easily use.”

“We initially scoped the project out to be a packaged chat app and planned to open source only what we were using internally,” they explained, adding that they settled on Go to build it.

But they explain that they hit a wall. GitHub uses a protocol called ChatOps RPC, which is not widely adopted outside GitHub, and the internal REST library their ChatOps commands used was not open sourced. It is currently being open sourced, but this will take some time. Lastly, they relied on Consul for service discovery, “which not everyone uses.”

So they decided to break out the core of their library that could be open sourced: the part that could access the REST endpoints on a single host, perform an action, and provide the results of that action.

The result is Vulcanizer, which they described as a Go library for interacting with an Elasticsearch cluster, providing a high-level API to help with common tasks such as “querying health status of the cluster, migrating data off of nodes, updating cluster settings, and more.”
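To give a flavour of the "single host, perform an action, return the result" pattern, here is a minimal sketch in Go using only the standard library. It is not Vulcanizer's own API: the names ClusterHealth, healthURL, parseHealth and fetchHealth are hypothetical, chosen for illustration. The only thing taken from Elasticsearch is its _cluster/health REST endpoint and a handful of the fields it returns.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// ClusterHealth holds a few of the fields returned by Elasticsearch's
// _cluster/health endpoint. The struct itself is illustrative, not
// Vulcanizer's type.
type ClusterHealth struct {
	ClusterName      string `json:"cluster_name"`
	Status           string `json:"status"`
	NumberOfNodes    int    `json:"number_of_nodes"`
	UnassignedShards int    `json:"unassigned_shards"`
}

// healthURL builds the endpoint for one known host and port;
// no service discovery is involved.
func healthURL(host string, port int) string {
	return fmt.Sprintf("http://%s:%d/_cluster/health", host, port)
}

// parseHealth decodes the JSON body returned by the health endpoint.
func parseHealth(body []byte) (ClusterHealth, error) {
	var h ClusterHealth
	if err := json.Unmarshal(body, &h); err != nil {
		return ClusterHealth{}, fmt.Errorf("decoding cluster health: %w", err)
	}
	return h, nil
}

// fetchHealth performs the action against a single host and hands the
// result back to the caller, which could be a chat bot, a CLI or a script.
func fetchHealth(client *http.Client, host string, port int) (ClusterHealth, error) {
	resp, err := client.Get(healthURL(host, port))
	if err != nil {
		return ClusterHealth{}, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return ClusterHealth{}, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return ClusterHealth{}, err
	}
	return parseHealth(body)
}
```

A caller with a node listening locally could invoke fetchHealth(http.DefaultClient, "localhost", 9200) and inspect the Status field. Keeping the URL building and JSON decoding separate from the HTTP call means the logic can be checked without a running cluster.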

The library is at v0.4.0. Version 0.1.0 introduced the basic functionality, subsequent versions added more functions around repositories and snapshots, and the most recent added functionality for managing indices.

Proposed additions include more work around shard allocation and recovery, and more index-related cases.

A list of Go examples is available in the engineers’ blog post, while the project itself is hosted on GitHub.