Database company PingCAP has open sourced two tools to help anyone interested in its distributed SQL database TiDB make the transition away from MySQL.
Ti stands for Titanium. If you’ve never heard of the project before, this might be because it is mostly known in China (though the company is registered in San Mateo, California, with offices in Beijing). It is widely used in FinTech, gaming, and e-commerce businesses there; prominent users include Lenovo, consumer electronics company Xiaomi, and US streaming service Hulu.
TiDB is supposed to be able to handle online transactional processing (OLTP) as well as online analytical processing (OLAP) workloads, which is why it is also dubbed a hybrid transactional and analytical processing (HTAP) database. It is also compatible with MySQL but promises better performance for write-heavy workloads, since it doesn’t use replicas with a complete copy of the data for scaling out.
Instead, query execution is handled by a layer of stateless TiDB servers (a separate storage layer takes care of persistence), which can be scaled out using Kubernetes ReplicaSets. Meanwhile, table data is split into shards and distributed among the TiKV servers of the storage layer, with several copies of each shard in a cluster but no server holding a full copy of the data. TiKV is a distributed transactional key-value database that implements the Raft consensus algorithm and stores the consensus state in RocksDB.
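To make the sharding idea concrete, here is a minimal sketch in Python. It is not TiKV's actual placement logic: the store names, split keys, replica count, and round-robin placement are all assumptions chosen for illustration. It only shows how key ranges can be replicated across servers without any single server holding every shard.

```python
from bisect import bisect_right
from typing import Dict, List

# Hypothetical cluster layout: names, split points and replica count are illustrative only.
STORES = ["tikv-0", "tikv-1", "tikv-2", "tikv-3", "tikv-4"]
SPLIT_KEYS = ["g", "n", "t"]  # boundaries give 4 shards: [..g), [g..n), [n..t), [t..)
REPLICAS = 3


def shard_for_key(key: str) -> int:
    """Return the index of the key range (shard) that contains `key`."""
    return bisect_right(SPLIT_KEYS, key)


def stores_for_shard(shard: int) -> List[str]:
    """Place REPLICAS copies of a shard on consecutive stores (simple round-robin)."""
    return [STORES[(shard + i) % len(STORES)] for i in range(REPLICAS)]


# Map every shard to the stores that hold one of its copies.
placement: Dict[int, List[str]] = {
    shard: stores_for_shard(shard) for shard in range(len(SPLIT_KEYS) + 1)
}
```

With this layout every shard lives on three different stores, yet each store holds only some of the four shards, which is the property the article describes.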
Queries across shards are supported, with metadata about shard locations maintained by a so-called placement driver. Operations are ACID compliant, and those modifying data that spans several shards use a two-phase commit. On top of that, TiDB ships metrics to Prometheus and Grafana, and implements online data definition language (DDL) changes to help with rolling out schema changes across multiple nodes.
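For readers unfamiliar with two-phase commit, the following Python sketch shows the textbook form of the protocol. It is a generic illustration, not TiDB's implementation, which is more involved; the `Shard` class and `two_phase_commit` function are hypothetical names introduced here.

```python
class Shard:
    """A toy participant holding committed data plus staged (prepared) writes."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.data = {}
        self.staged = {}
        self.fail_on_prepare = fail_on_prepare  # simulate a shard that votes "no"

    def prepare(self, writes):
        # Phase 1: stage the writes and vote; nothing is visible to readers yet.
        if self.fail_on_prepare:
            return False
        self.staged = dict(writes)
        return True

    def commit(self):
        # Phase 2 (success path): make the staged writes visible.
        self.data.update(self.staged)
        self.staged = {}

    def rollback(self):
        # Phase 2 (failure path): discard the staged writes.
        self.staged = {}


def two_phase_commit(writes_by_shard):
    """Apply writes to all shards or to none. Takes a list of (shard, writes) pairs."""
    prepared = []
    for shard, writes in writes_by_shard:
        if shard.prepare(writes):
            prepared.append(shard)
        else:
            # One "no" vote aborts the whole transaction.
            for participant in prepared:
                participant.rollback()
            return False
    for shard in prepared:
        shard.commit()
    return True
```

The point of the two phases is that a write touching several shards either becomes visible on all of them or on none, even if one shard refuses during the prepare step.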
If this makes you want to consider the open source project, PingCAP now offers its data migration platform DM under the same Apache License 2.0 as its database. It is meant to help users migrate data either partly or completely from MySQL versions 5.5 through 5.7, as well as MariaDB 10.1.2 and above, to TiDB.
DM consists of a master, a worker, and the command line tool dmctl to control a DM cluster. While the master stores the topology information of a DM cluster, monitors processes, and manages and schedules data synchronisation tasks, the worker component executes the latter.
For those already using TiDB, there is now a tool to import large amounts of data more quickly, or to back up and restore all data. At the moment, however, TiDB-Lightning is only able to read SQL dumps exported with the mydumper tool. Its front end component reads the dump, imports the database structure into the TiDB cluster, transforms the data into key-value pairs, and sends them to the tikv-importer. The importer then combines and sorts the pairs and imports them into a TiKV cluster.
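The transform, sort, and import steps can be pictured with a small Python sketch. This is not TiDB-Lightning's code or its real key encoding: the key layout, the function names `rows_to_kv` and `sort_and_batch`, and the batch size are assumptions made up for illustration.

```python
def rows_to_kv(table_id, rows):
    """Turn (row_id, payload) rows into (key, value) byte pairs.

    The key layout used here is invented for the example.
    """
    for row_id, payload in rows:
        key = f"t{table_id:04d}_r{row_id:08d}".encode()
        yield key, payload.encode()


def sort_and_batch(pairs, batch_size):
    """Sort pairs by key and cut them into fixed-size batches for bulk import."""
    ordered = sorted(pairs)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


# Rows arrive in dump order, which need not be key order.
rows = [(3, "carol"), (1, "alice"), (2, "bob")]
batches = sort_and_batch(rows_to_kv(7, rows), batch_size=2)
```

Sorting before import matters because a key-value store can ingest pre-sorted batches far more cheaply than it can process the same rows as individual inserts.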