Announcing KantoDB

2025 is the year of Rust in the database!

KantoDB is an OLTP-oriented hybrid SQL database server and programming library. It is written in the Rust programming language, using DataFusion & Arrow for data wrangling and (for now) RocksDB for underlying data storage.

You can use it as a library like SQLite, or as a database server like Postgres.

KantoDB is still very early in development. Keep reading, or see the development roadmap for our plans.

Vision

A robust, safe-to-operate database that is close enough to Postgres that apps think it’s Postgres, but also runnable as a library like SQLite: the best of both worlds!

And if you’re willing to not limit yourself to Postgres compatibility, you also get fancy new technologies like unsigned integers in a database! (The old SQL ecosystem is sometimes ridiculous.)

Origin story

My personal journey goes something like this:

I’ve always suffered from both SQLite and Postgres idiosyncrasies, and almost always when deploying I’ve wanted to start out small, not have a big dedicated database server, and have meaningful tests that don’t have multi-gigabyte dependencies and runtime assumptions. The idea of having a “close enough to Postgres to not have to learn much new” database with the low-end abilities of SQLite is something I’ve been wanting for roughly as long as I’ve known about SQLite — even more so if it could also scale up like Postgres and remove the fear of differences between dev/early-stage vs later.

Much later, I learned about the newly-fashionable OLAP-over-object-store architectures, and I learned about Parquet. That led to discovering Arrow and DataFusion. Arrow is an in-memory data format intended to be a standard interchange layer. It’s basically an array per column, which isn’t exactly point-query oriented but helps make modern-day CPUs happy; it is quite well aligned with SIMD processing. DataFusion is a Rust framework that’s essentially a query engine, and it has a decent query planner (arguably the hardest part of writing a database). RocksDB supports transactions and does MVCC, which is probably the second hardest part of writing a database.

The rest just fell into place: sqlparser-rs is a Rust SQL syntax parser with Postgres etc. compatibility nicely worked out. pgwire implements the Postgres wire protocol. Non-legacy clients can use FlightSQL and Arrow IPC for faster data transfer (the Postgres wire protocol has much higher overhead for data transfer; it’s that old). In-process use from Rust is darn trivial with DataFusion, and other languages can be dealt with by writing a C bridge. Once again, Arrow is an inter-language standard already, so all we need to do is to shove the result data buffers over to a more native “dataframe” library.

It looks like I can actually glue these things together with less than a decade of effort!
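To make the “array per column” idea a bit more concrete, here is a minimal sketch in plain Rust using only the standard library. It is not Arrow’s actual API or memory format, and the names UserRow and UserColumns are made up for illustration. It only shows why a column scan is a tight loop over one contiguous array, while a point query has to gather one value from every column.

```rust
// Illustrative only: a hand-rolled columnar table, NOT Arrow's real API.
// `UserRow` and `UserColumns` are hypothetical names for this sketch.

/// Row-oriented layout: one struct per row.
#[derive(Debug, Clone, PartialEq)]
pub struct UserRow {
    pub id: u64,
    pub age: u32,
    pub name: String,
}

/// Column-oriented layout: one contiguous array per column.
#[derive(Debug, Default)]
pub struct UserColumns {
    pub id: Vec<u64>,
    pub age: Vec<u32>,
    pub name: Vec<String>,
}

impl UserColumns {
    /// Transpose a slice of rows into per-column arrays.
    pub fn from_rows(rows: &[UserRow]) -> Self {
        let mut cols = UserColumns::default();
        for row in rows {
            cols.id.push(row.id);
            cols.age.push(row.age);
            cols.name.push(row.name.clone());
        }
        cols
    }

    /// Column scan: reads a single contiguous array, which is the
    /// access pattern that caches and SIMD units tend to like.
    pub fn total_age(&self) -> u64 {
        self.age.iter().map(|&a| u64::from(a)).sum()
    }

    /// Point query: has to gather one value from every column,
    /// which is why this layout is not point-query oriented.
    pub fn get(&self, index: usize) -> Option<UserRow> {
        Some(UserRow {
            id: *self.id.get(index)?,
            age: *self.age.get(index)?,
            name: self.name.get(index)?.clone(),
        })
    }
}
```

Calling total_age() walks only the age array, whereas get(i) touches all three arrays to rebuild a single row. A real Arrow array adds things this sketch leaves out, such as validity (null) tracking and a standardized buffer layout shared across languages.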

Status

KantoDB is in very, very early development.

There are still huge gaps of missing functionality. We need to put a lot more effort into stress tests, fault injection, and such.

Next up, we are looking to identify significant open source projects with automated integration test suites, and classify our success based on how many of those won’t even realize they are not running on Postgres. Stay tuned for that.

Do not store anything important in KantoDB yet.

Future

There’s still a lot to worry about, but I’m feeling pretty positive about the project. And later, if and when I get to replace RocksDB with a pure-Rust data store that has all the right bells and whistles (in-house or not), the end result will be pure Rust, and aligned with the modern world of NVMe, io_uring, and whatnot. One database for your app’s full lifecycle: development, cheap cloud VM, and scale-up. That’s a world I definitely want to live in.

For developers

For the curious developers reading this: The darn thing already works as a SQL database — largely because it’s just DataFusion’s query engine and me feeding it table scans. I wrote a SQL database without ever debugging a JOIN! The shortcuts I’ve been able to take due to help from preexisting projects are huge. For someone who grew up in the world of “every C project has to write basic data structures for themselves because C isn’t very modular”, it’s downright amazing!

If you know Rust and/or robust systems programming, please join me on the community forum for this exciting journey into something that could become a new fundamental building block of software systems.

Source is at https://git.kantodb.com/kantodb/kantodb.

—Tommi Virtanen